PMML 4.1 - General RegressionModel XSD and Tag Description

<xs:element name="GeneralRegressionModel">
  <xs:complexType>
    <xs:sequence>
      <xs:element minOccurs="0" maxOccurs="unbounded" ref="Extension"/>
      <xs:element ref="MiningSchema"/>
      <xs:element minOccurs="0" ref="Output"/>
      <xs:element minOccurs="0" ref="ModelStats"/>
      <xs:element ref="ModelExplanation" minOccurs="0"/>
      <xs:element minOccurs="0" ref="Targets"/>
      <xs:element minOccurs="0" ref="LocalTransformations"/>
      <xs:element ref="ParameterList"/>
      <xs:element minOccurs="0" ref="FactorList"/>
      <xs:element minOccurs="0" ref="CovariateList"/>
      <xs:element ref="PPMatrix"/>
      <xs:element minOccurs="0" ref="PCovMatrix"/>
      <xs:element ref="ParamMatrix"/>
      <xs:element minOccurs="0" ref="EventValues"/>
      <xs:element minOccurs="0" ref="BaseCumHazardTables"/>
      <xs:element ref="ModelVerification" minOccurs="0"/>
      <xs:element ref="Extension" minOccurs="0" maxOccurs="unbounded"/>
    </xs:sequence>
    <xs:attribute name="targetVariableName" type="FIELD-NAME"/>
    <xs:attribute name="modelType" use="required">
      <xs:simpleType>
        <xs:restriction base="xs:string">
          <xs:enumeration value="regression"/>
          <xs:enumeration value="generalLinear"/>
          <xs:enumeration value="multinomialLogistic"/>
          <xs:enumeration value="ordinalMultinomial"/>
          <xs:enumeration value="generalizedLinear"/>
          <xs:enumeration value="CoxRegression"/>
        </xs:restriction>
      </xs:simpleType>
    </xs:attribute>
    <xs:attribute name="modelName" type="xs:string"/>
    <xs:attribute name="functionName" type="MINING-FUNCTION" use="required"/>
    <xs:attribute name="algorithmName" type="xs:string"/>
    <xs:attribute name="targetReferenceCategory" type="xs:string"/>
    <xs:attribute name="cumulativeLink" type="CUMULATIVE-LINK-FUNCTION"/>
    <xs:attribute name="linkFunction" type="LINK-FUNCTION"/>
    <xs:attribute name="linkParameter" type="REAL-NUMBER"/>
    <xs:attribute name="trialsVariable" type="FIELD-NAME"/>
    <xs:attribute name="trialsValue" type="INT-NUMBER"/>
    <xs:attribute name="distribution">
      <xs:simpleType>
        <xs:restriction base="xs:string">
          <xs:enumeration value="binomial"/>
          <xs:enumeration value="gamma"/>
          <xs:enumeration value="igauss"/>
          <xs:enumeration value="negbin"/>
          <xs:enumeration value="normal"/>
          <xs:enumeration value="poisson"/>
          <xs:enumeration value="tweedie"/>
        </xs:restriction>
      </xs:simpleType>
    </xs:attribute>
    <xs:attribute name="distParameter" type="REAL-NUMBER"/>
    <xs:attribute name="offsetVariable" type="FIELD-NAME"/>
    <xs:attribute name="offsetValue" type="REAL-NUMBER"/>
    <xs:attribute name="modelDF" type="REAL-NUMBER"/>
    <xs:attribute name="endTimeVariable" type="FIELD-NAME"/>
    <xs:attribute name="startTimeVariable" type="FIELD-NAME"/>
    <xs:attribute name="subjectIDVariable" type="FIELD-NAME"/>
    <xs:attribute name="statusVariable" type="FIELD-NAME"/>
    <xs:attribute name="baselineStrataVariable" type="FIELD-NAME"/>
    <xs:attribute name="isScorable" type="xs:boolean" default="true"/>
  </xs:complexType>
</xs:element>

<xs:element name="ParameterList">
  <xs:complexType>
    <xs:sequence>
      <xs:element ref="Extension" minOccurs="0" maxOccurs="unbounded"/>
      <xs:element ref="Parameter" minOccurs="0" maxOccurs="unbounded"/>
    </xs:sequence>
  </xs:complexType>
</xs:element>

<xs:element name="Parameter">
  <xs:complexType>
    <xs:sequence>
      <xs:element ref="Extension" minOccurs="0" maxOccurs="unbounded"/>
    </xs:sequence>
    <xs:attribute name="name" type="xs:string" use="required"/>
    <xs:attribute name="label" type="xs:string"/>
    <xs:attribute name="referencePoint" type="REAL-NUMBER" default="0"/>
  </xs:complexType>
</xs:element>

<xs:element name="FactorList">
  <xs:complexType>
    <xs:sequence>
      <xs:element ref="Extension" minOccurs="0" maxOccurs="unbounded"/>
      <xs:element minOccurs="0" maxOccurs="unbounded" ref="Predictor"/>
    </xs:sequence>
  </xs:complexType>
</xs:element>

<xs:element name="CovariateList">
  <xs:complexType>
    <xs:sequence>
      <xs:element ref="Extension" minOccurs="0" maxOccurs="unbounded"/>
      <xs:element minOccurs="0" maxOccurs="unbounded" ref="Predictor"/>
    </xs:sequence>
  </xs:complexType>
</xs:element>

<xs:element name="Predictor">
  <xs:complexType>
    <xs:sequence>
      <xs:element ref="Extension" minOccurs="0" maxOccurs="unbounded"/>
      <xs:element ref="Categories" minOccurs="0" maxOccurs="1"/>
      <xs:element ref="Matrix" minOccurs="0"/>
    </xs:sequence>
    <xs:attribute name="name" type="FIELD-NAME" use="required"/>
    <xs:attribute name="contrastMatrixType" type="xs:string"/>
  </xs:complexType>
</xs:element>

<xs:element name="Categories">
  <xs:complexType>
    <xs:sequence>
      <xs:element ref="Extension" minOccurs="0" maxOccurs="unbounded"/>
      <xs:element ref="Category" minOccurs="1" maxOccurs="unbounded"/>
    </xs:sequence>
  </xs:complexType>
</xs:element>

<xs:element name="Category">
  <xs:complexType>
    <xs:sequence>
      <xs:element ref="Extension" minOccurs="0" maxOccurs="unbounded"/>
    </xs:sequence>
    <xs:attribute name="value" type="xs:string" use="required"/>
  </xs:complexType>
</xs:element>

<xs:element name="PPMatrix">
  <xs:complexType>
    <xs:sequence>
      <xs:element ref="Extension" minOccurs="0" maxOccurs="unbounded"/>
      <xs:element ref="PPCell" minOccurs="0" maxOccurs="unbounded"/>
    </xs:sequence>
  </xs:complexType>
</xs:element>

<xs:element name="PPCell">
  <xs:complexType>
    <xs:sequence>
      <xs:element ref="Extension" minOccurs="0" maxOccurs="unbounded"/>
    </xs:sequence>
    <xs:attribute name="value" type="xs:string" use="required"/>
    <xs:attribute name="predictorName" type="FIELD-NAME" use="required"/>
    <xs:attribute name="parameterName" type="xs:string" use="required"/>
    <xs:attribute name="targetCategory" type="xs:string"/>
  </xs:complexType>
</xs:element>

<xs:element name="PCovMatrix">
  <xs:complexType>
    <xs:sequence>
      <xs:element ref="Extension" minOccurs="0" maxOccurs="unbounded"/>
      <xs:element maxOccurs="unbounded" ref="PCovCell"/>
    </xs:sequence>
    <xs:attribute name="type">
      <xs:simpleType>
        <xs:restriction base="xs:string">
          <xs:enumeration value="model"/>
          <xs:enumeration value="robust"/>
        </xs:restriction>
      </xs:simpleType>
    </xs:attribute>
  </xs:complexType>
</xs:element>

<xs:element name="PCovCell">
  <xs:complexType>
    <xs:sequence>
      <xs:element ref="Extension" minOccurs="0" maxOccurs="unbounded"/>
    </xs:sequence>
    <xs:attribute name="pRow" type="xs:string" use="required"/>
    <xs:attribute name="pCol" type="xs:string" use="required"/>
    <xs:attribute name="tRow" type="xs:string"/>
    <xs:attribute name="tCol" type="xs:string"/>
    <xs:attribute name="value" type="REAL-NUMBER" use="required"/>
    <xs:attribute name="targetCategory" type="xs:string"/>
  </xs:complexType>
</xs:element>

<xs:element name="ParamMatrix">
  <xs:complexType>
    <xs:sequence>
      <xs:element ref="Extension" minOccurs="0" maxOccurs="unbounded"/>
      <xs:element ref="PCell" minOccurs="0" maxOccurs="unbounded"/>
    </xs:sequence>
  </xs:complexType>
</xs:element>

<xs:element name="PCell">
  <xs:complexType>
    <xs:sequence>
      <xs:element ref="Extension" minOccurs="0" maxOccurs="unbounded"/>
    </xs:sequence>
    <xs:attribute name="targetCategory" type="xs:string"/>
    <xs:attribute name="parameterName" type="xs:string" use="required"/>
    <xs:attribute name="beta" type="REAL-NUMBER" use="required"/>
    <xs:attribute name="df" type="INT-NUMBER"/>
  </xs:complexType>
</xs:element>

<xs:element name="BaseCumHazardTables">
  <xs:complexType>
    <xs:sequence>
      <xs:element ref="Extension" minOccurs="0" maxOccurs="unbounded"/>
      <xs:choice>
        <xs:element maxOccurs="unbounded" ref="BaselineStratum"/>
        <xs:element maxOccurs="unbounded" ref="BaselineCell"/>
      </xs:choice>
    </xs:sequence>
    <xs:attribute name="maxTime" type="REAL-NUMBER" use="optional"/>
  </xs:complexType>
</xs:element>

<xs:element name="BaselineStratum">
  <xs:complexType>
    <xs:sequence>
      <xs:element ref="Extension" minOccurs="0" maxOccurs="unbounded"/>
      <xs:element minOccurs="0" maxOccurs="unbounded" ref="BaselineCell"/>
    </xs:sequence>
    <xs:attribute name="value" type="xs:string" use="required"/>
    <xs:attribute name="label" type="xs:string"/>
    <xs:attribute name="maxTime" type="REAL-NUMBER" use="required"/>
  </xs:complexType>
</xs:element>

<xs:element name="BaselineCell">
  <xs:complexType>
    <xs:sequence>
      <xs:element ref="Extension" minOccurs="0" maxOccurs="unbounded"/>
    </xs:sequence>
    <xs:attribute name="time" type="REAL-NUMBER" use="required"/>
    <xs:attribute name="cumHazard" type="REAL-NUMBER" use="required"/>
  </xs:complexType>
</xs:element>

<xs:element name="EventValues">
  <xs:complexType>
    <xs:sequence>
      <xs:element ref="Extension" minOccurs="0" maxOccurs="unbounded"/>
      <xs:element minOccurs="0" maxOccurs="unbounded" ref="Value"/>
      <xs:element minOccurs="0" maxOccurs="unbounded" ref="Interval"/>
    </xs:sequence>
  </xs:complexType>
</xs:element>
General Regression Samples: Multinomial Logistic Example

Here is the information about the variables:
The Parameter estimates are displayed as follows:
The PPMatrix is:
This mapping of Predictor-to-Parameter combinations is the same for each target variable category. The corresponding XML model is:

<PMML xmlns="https://www.dmg.org/PMML-4_1" version="4.1">
  <Header copyright="dmg.org"/>
  <DataDictionary numberOfFields="5">
    <DataField name="jobcat" optype="categorical" dataType="double"/>
    <DataField name="minority" optype="categorical" dataType="double"/>
    <DataField name="sex" optype="categorical" dataType="double"/>
    <DataField name="age" optype="continuous" dataType="double"/>
    <DataField name="work" optype="continuous" dataType="double"/>
  </DataDictionary>
  <GeneralRegressionModel targetVariableName="jobcat" modelType="multinomialLogistic" functionName="classification">
    <MiningSchema>
      <MiningField name="jobcat" usageType="predicted"/>
      <MiningField name="minority" usageType="active"/>
      <MiningField name="sex" usageType="active"/>
      <MiningField name="age" usageType="active"/>
      <MiningField name="work" usageType="active"/>
    </MiningSchema>
    <ParameterList>
      <Parameter name="p0" label="Intercept"/>
      <Parameter name="p1" label="[SEX=0]"/>
      <Parameter name="p2" label="[SEX=1]"/>
      <Parameter name="p3" label="[MINORITY=0]([SEX=0])"/>
      <Parameter name="p4" label="[MINORITY=1]([SEX=0])"/>
      <Parameter name="p5" label="[MINORITY=0]([SEX=1])"/>
      <Parameter name="p6" label="[MINORITY=1]([SEX=1])"/>
      <Parameter name="p7" label="age"/>
      <Parameter name="p8" label="work"/>
    </ParameterList>
    <FactorList>
      <Predictor name="sex"/>
      <Predictor name="minority"/>
    </FactorList>
    <CovariateList>
      <Predictor name="age"/>
      <Predictor name="work"/>
    </CovariateList>
    <PPMatrix>
      <PPCell value="0" predictorName="sex" parameterName="p1"/>
      <PPCell value="1" predictorName="sex" parameterName="p2"/>
      <PPCell value="0" predictorName="sex" parameterName="p3"/>
      <PPCell value="0" predictorName="sex" parameterName="p4"/>
      <PPCell value="1" predictorName="sex" parameterName="p5"/>
      <PPCell value="1" predictorName="sex" parameterName="p6"/>
      <PPCell value="0" predictorName="minority" parameterName="p3"/>
      <PPCell value="1" predictorName="minority" parameterName="p4"/>
      <PPCell value="0" predictorName="minority" parameterName="p5"/>
      <PPCell value="1" predictorName="minority" parameterName="p6"/>
      <PPCell value="1" predictorName="age" parameterName="p7"/>
      <PPCell value="1" predictorName="work" parameterName="p8"/>
    </PPMatrix>
    <ParamMatrix>
      <PCell targetCategory="1" parameterName="p0" beta="26.836" df="1"/>
      <PCell targetCategory="1" parameterName="p1" beta="-.719" df="1"/>
      <PCell targetCategory="1" parameterName="p3" beta="-19.214" df="1"/>
      <PCell targetCategory="1" parameterName="p5" beta="-.114" df="1"/>
      <PCell targetCategory="1" parameterName="p7" beta="-.133" df="1"/>
      <PCell targetCategory="1" parameterName="p8" beta="7.885E-02" df="1"/>
      <PCell targetCategory="2" parameterName="p0" beta="31.077" df="1"/>
      <PCell targetCategory="2" parameterName="p1" beta="-.869" df="1"/>
      <PCell targetCategory="2" parameterName="p3" beta="-18.99" df="1"/>
      <PCell targetCategory="2" parameterName="p5" beta="1.01" df="1"/>
      <PCell targetCategory="2" parameterName="p7" beta="-.3" df="1"/>
      <PCell targetCategory="2" parameterName="p8" beta=".152" df="1"/>
      <PCell targetCategory="3" parameterName="p0" beta="6.836" df="1"/>
      <PCell targetCategory="3" parameterName="p1" beta="16.305" df="1"/>
      <PCell targetCategory="3" parameterName="p3" beta="-20.041" df="1"/>
      <PCell targetCategory="3" parameterName="p5" beta="-.73" df="1"/>
      <PCell targetCategory="3" parameterName="p7" beta="-.156" df="1"/>
      <PCell targetCategory="3" parameterName="p8" beta=".267" df="1"/>
      <PCell targetCategory="4" parameterName="p0" beta="8.816" df="1"/>
      <PCell targetCategory="4" parameterName="p1" beta="15.264" df="1"/>
      <PCell targetCategory="4" parameterName="p3" beta="-16.799" df="1"/>
      <PCell targetCategory="4" parameterName="p5" beta="16.48" df="1"/>
      <PCell targetCategory="4" parameterName="p7" beta="-.133" df="1"/>
      <PCell targetCategory="4" parameterName="p8" beta="-.16" df="1"/>
      <PCell targetCategory="5" parameterName="p0" beta="5.862" df="1"/>
      <PCell targetCategory="5" parameterName="p1" beta="16.437" df="1"/>
      <PCell targetCategory="5" parameterName="p3" beta="-17.309" df="1"/>
      <PCell targetCategory="5" parameterName="p5" beta="15.888" df="1"/>
      <PCell targetCategory="5" parameterName="p7" beta="-.105" df="1"/>
      <PCell targetCategory="5" parameterName="p8" beta="6.914E-02" df="1"/>
      <PCell targetCategory="6" parameterName="p0" beta="6.495" df="1"/>
      <PCell targetCategory="6" parameterName="p1" beta="17.297" df="1"/>
      <PCell targetCategory="6" parameterName="p3" beta="-19.098" df="1"/>
      <PCell targetCategory="6" parameterName="p5" beta="16.841" df="1"/>
      <PCell targetCategory="6" parameterName="p7" beta="-.141" df="1"/>
      <PCell targetCategory="6" parameterName="p8" beta="-5.058E-02" df="1"/>
    </ParamMatrix>
  </GeneralRegressionModel>
</PMML>

Scoring Algorithm

We use the example above to illustrate the steps of the scoring process. Suppose the following case (observation) must be scored:

obs = (sex=1 minority=0 age=25 work=4)
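
The scoring steps for this case can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the normative procedure: it assumes that a factor PPCell contributes an indicator (1 when the observed value equals the cell value, otherwise 0), that a covariate PPCell raises the observed value to the power given by the cell value, that a parameter with no PPCell (here p0) is the intercept, and that the one target category without PCell rows has a linear predictor of 0. The label `"reference"` and all function names are hypothetical.

```python
import math

# PPCell rows copied from the PPMatrix above: (parameterName, predictorName, value).
PP_CELLS = [
    ("p1", "sex", 0), ("p2", "sex", 1), ("p3", "sex", 0), ("p4", "sex", 0),
    ("p5", "sex", 1), ("p6", "sex", 1), ("p3", "minority", 0),
    ("p4", "minority", 1), ("p5", "minority", 0), ("p6", "minority", 1),
    ("p7", "age", 1), ("p8", "work", 1),
]
FACTORS = {"sex", "minority"}               # predictors named in FactorList
PARAMETERS = [f"p{i}" for i in range(9)]    # p0..p8 from ParameterList

# ParamMatrix betas per target category; a parameter without a PCell counts as 0.
BETAS = {
    "1": {"p0": 26.836, "p1": -0.719, "p3": -19.214, "p5": -0.114, "p7": -0.133, "p8": 7.885e-02},
    "2": {"p0": 31.077, "p1": -0.869, "p3": -18.99, "p5": 1.01, "p7": -0.3, "p8": 0.152},
    "3": {"p0": 6.836, "p1": 16.305, "p3": -20.041, "p5": -0.73, "p7": -0.156, "p8": 0.267},
    "4": {"p0": 8.816, "p1": 15.264, "p3": -16.799, "p5": 16.48, "p7": -0.133, "p8": -0.16},
    "5": {"p0": 5.862, "p1": 16.437, "p3": -17.309, "p5": 15.888, "p7": -0.105, "p8": 6.914e-02},
    "6": {"p0": 6.495, "p1": 17.297, "p3": -19.098, "p5": 16.841, "p7": -0.141, "p8": -5.058e-02},
}
REFERENCE = "reference"  # hypothetical label for the category that has no PCell rows


def design_vector(obs):
    """Build x with one entry per parameter (assumed reading of the PPMatrix)."""
    x = {p: 1.0 for p in PARAMETERS}  # p0 has no PPCell, so it stays 1 (intercept)
    for param, predictor, value in PP_CELLS:
        if predictor in FACTORS:
            x[param] *= 1.0 if obs[predictor] == value else 0.0
        else:
            x[param] *= obs[predictor] ** value  # covariate raised to the cell value
    return x


def linear_predictors(obs):
    """Inner product <x, beta_j> for each category; 0 for the reference category."""
    x = design_vector(obs)
    r = {cat: sum(b * x[p] for p, b in betas.items()) for cat, betas in BETAS.items()}
    r[REFERENCE] = 0.0
    return r


def probabilities(obs):
    """Softmax over the linear predictors, shifted by the maximum for stability."""
    r = linear_predictors(obs)
    top = max(r.values())
    weights = {cat: math.exp(v - top) for cat, v in r.items()}
    total = sum(weights.values())
    return {cat: w / total for cat, w in weights.items()}


obs = {"sex": 1, "minority": 0, "age": 25, "work": 4}
probs = probabilities(obs)
predicted = max(probs, key=probs.get)  # category with the highest probability
```

Under these assumptions only p0, p2, p5, p7 and p8 are non-zero for this case, and the predicted category is the one with the largest linear predictor.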
General Regression Samples: General Linear Example

The information about the variables is the same as in the previous example, but now the target variable JOBCAT is considered to be continuous. The Predictor-to-Parameter combinations mapping is the same as above. The corresponding XML model is:

<PMML xmlns="https://www.dmg.org/PMML-4_1" version="4.1">
  <Header copyright="dmg.org"/>
  <DataDictionary numberOfFields="5">
    <DataField name="jobcat" optype="continuous" dataType="double"/>
    <DataField name="minority" optype="categorical" dataType="double"/>
    <DataField name="sex" optype="categorical" dataType="double"/>
    <DataField name="age" optype="continuous" dataType="double"/>
    <DataField name="work" optype="continuous" dataType="double"/>
  </DataDictionary>
  <GeneralRegressionModel targetVariableName="jobcat" modelType="generalLinear" functionName="regression">
    <MiningSchema>
      <MiningField name="jobcat" usageType="predicted"/>
      <MiningField name="minority" usageType="active"/>
      <MiningField name="sex" usageType="active"/>
      <MiningField name="age" usageType="active"/>
      <MiningField name="work" usageType="active"/>
    </MiningSchema>
    <ParameterList>
      <Parameter name="p0" label="Intercept"/>
      <Parameter name="p1" label="[SEX=0]"/>
      <Parameter name="p2" label="[SEX=1]"/>
      <Parameter name="p3" label="[MINORITY=0]([SEX=0])"/>
      <Parameter name="p4" label="[MINORITY=1]([SEX=0])"/>
      <Parameter name="p5" label="[MINORITY=0]([SEX=1])"/>
      <Parameter name="p6" label="[MINORITY=1]([SEX=1])"/>
      <Parameter name="p7" label="age"/>
      <Parameter name="p8" label="work"/>
    </ParameterList>
    <FactorList>
      <Predictor name="sex"/>
      <Predictor name="minority"/>
    </FactorList>
    <CovariateList>
      <Predictor name="age"/>
      <Predictor name="work"/>
    </CovariateList>
    <PPMatrix>
      <PPCell value="0" predictorName="sex" parameterName="p1"/>
      <PPCell value="1" predictorName="sex" parameterName="p2"/>
      <PPCell value="0" predictorName="sex" parameterName="p3"/>
      <PPCell value="0" predictorName="sex" parameterName="p4"/>
      <PPCell value="1" predictorName="sex" parameterName="p5"/>
      <PPCell value="1" predictorName="sex" parameterName="p6"/>
      <PPCell value="0" predictorName="minority" parameterName="p3"/>
      <PPCell value="1" predictorName="minority" parameterName="p4"/>
      <PPCell value="0" predictorName="minority" parameterName="p5"/>
      <PPCell value="1" predictorName="minority" parameterName="p6"/>
      <PPCell value="1" predictorName="age" parameterName="p7"/>
      <PPCell value="1" predictorName="work" parameterName="p8"/>
    </PPMatrix>
    <ParamMatrix>
      <PCell parameterName="p0" beta="1.602" df="1"/>
      <PCell parameterName="p1" beta="0.580" df="1"/>
      <PCell parameterName="p3" beta="0.831" df="1"/>
      <PCell parameterName="p5" beta="0.429" df="1"/>
      <PCell parameterName="p7" beta="-0.012" df="1"/>
      <PCell parameterName="p8" beta="0.010" df="1"/>
    </ParamMatrix>
  </GeneralRegressionModel>
</PMML>

Scoring Algorithm

The scoring steps for this example are similar to those of the previous example, but there are fewer of them. Suppose the following case (observation) must be scored:

obs = (sex=1 minority=0 age=25 work=4)
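
For the general linear model, scoring this case reduces to a single inner product. The Python sketch below assumes that the predicted value is the sum of beta times the corresponding design-vector entry, with the design vector written out by hand from the parameter labels rather than derived from the PPMatrix. The function names are hypothetical.

```python
# Betas copied from the ParamMatrix above; parameters without a PCell count as 0.
BETAS = {"p0": 1.602, "p1": 0.580, "p3": 0.831, "p5": 0.429, "p7": -0.012, "p8": 0.010}


def design_vector(obs):
    """Design vector written out from the parameter labels (p0 is the intercept)."""
    sex, minority = obs["sex"], obs["minority"]
    return {
        "p0": 1.0,
        "p1": float(sex == 0),
        "p2": float(sex == 1),
        "p3": float(minority == 0 and sex == 0),
        "p4": float(minority == 1 and sex == 0),
        "p5": float(minority == 0 and sex == 1),
        "p6": float(minority == 1 and sex == 1),
        "p7": float(obs["age"]),
        "p8": float(obs["work"]),
    }


def predict(obs):
    """Predicted value as the inner product <x, beta> (assumed identity link)."""
    x = design_vector(obs)
    return sum(beta * x[name] for name, beta in BETAS.items())


obs = {"sex": 1, "minority": 0, "age": 25, "work": 4}
prediction = predict(obs)  # 1.602 + 0.429 - 0.012 * 25 + 0.010 * 4
```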
General Regression Samples: Ordinal Multinomial Example

The information about the variables is the same as in the previous examples, but now the target variable JOBCAT is considered to be ordinal. The Predictor-to-Parameter combinations mapping is the same as above. The corresponding XML model is:

<PMML xmlns="https://www.dmg.org/PMML-4_1" version="4.1">
  <Header copyright="dmg.org"/>
  <DataDictionary numberOfFields="5">
    <DataField name="jobcat" optype="ordinal" dataType="double"/>
    <DataField name="minority" optype="categorical" dataType="double"/>
    <DataField name="sex" optype="categorical" dataType="double"/>
    <DataField name="age" optype="continuous" dataType="double"/>
    <DataField name="work" optype="continuous" dataType="double"/>
  </DataDictionary>
  <GeneralRegressionModel targetVariableName="jobcat" modelType="ordinalMultinomial" functionName="classification" cumulativeLink="logit">
    <MiningSchema>
      <MiningField name="jobcat" usageType="predicted"/>
      <MiningField name="minority" usageType="active"/>
      <MiningField name="sex" usageType="active"/>
      <MiningField name="age" usageType="active"/>
      <MiningField name="work" usageType="active"/>
    </MiningSchema>
    <ParameterList>
      <Parameter name="p0" label="Intercept"/>
      <Parameter name="p1" label="[SEX=0]"/>
      <Parameter name="p2" label="[SEX=1]"/>
      <Parameter name="p3" label="[MINORITY=0]([SEX=0])"/>
      <Parameter name="p4" label="[MINORITY=1]([SEX=0])"/>
      <Parameter name="p5" label="[MINORITY=0]([SEX=1])"/>
      <Parameter name="p6" label="[MINORITY=1]([SEX=1])"/>
      <Parameter name="p7" label="age"/>
      <Parameter name="p8" label="work"/>
    </ParameterList>
    <FactorList>
      <Predictor name="sex"/>
      <Predictor name="minority"/>
    </FactorList>
    <CovariateList>
      <Predictor name="age"/>
      <Predictor name="work"/>
    </CovariateList>
    <PPMatrix>
      <PPCell value="0" predictorName="sex" parameterName="p1"/>
      <PPCell value="1" predictorName="sex" parameterName="p2"/>
      <PPCell value="0" predictorName="sex" parameterName="p3"/>
      <PPCell value="0" predictorName="sex" parameterName="p4"/>
      <PPCell value="1" predictorName="sex" parameterName="p5"/>
      <PPCell value="1" predictorName="sex" parameterName="p6"/>
      <PPCell value="0" predictorName="minority" parameterName="p3"/>
      <PPCell value="1" predictorName="minority" parameterName="p4"/>
      <PPCell value="0" predictorName="minority" parameterName="p5"/>
      <PPCell value="1" predictorName="minority" parameterName="p6"/>
      <PPCell value="1" predictorName="age" parameterName="p7"/>
      <PPCell value="1" predictorName="work" parameterName="p8"/>
    </PPMatrix>
    <ParamMatrix>
      <PCell targetCategory="1" parameterName="p0" beta="-0.683" df="1"/>
      <PCell targetCategory="2" parameterName="p0" beta="0.723" df="1"/>
      <PCell targetCategory="3" parameterName="p0" beta="1.104" df="1"/>
      <PCell targetCategory="4" parameterName="p0" beta="1.922" df="1"/>
      <PCell targetCategory="5" parameterName="p0" beta="3.386" df="1"/>
      <PCell targetCategory="6" parameterName="p0" beta="4.006" df="1"/>
      <PCell parameterName="p1" beta="1.096" df="1"/>
      <PCell parameterName="p3" beta="0.957" df="1"/>
      <PCell parameterName="p5" beta="1.149" df="1"/>
      <PCell parameterName="p7" beta="-0.067" df="1"/>
      <PCell parameterName="p8" beta="0.060" df="1"/>
    </ParamMatrix>
  </GeneralRegressionModel>
</PMML>

Scoring Algorithm

The scoring steps for this example resemble those of the first example, except that the cumulative link function is also applied. Suppose the following case (observation) must be scored:

obs = (sex=1 minority=0 age=25 work=4)
How to compute p_j := probability of target = Value_j

For each response category (value of the target variable) j, let β_j be the vector of Parameter estimates for that response category. (If k is the last response category, β_k is not specified.) For the given case, let <x,β_j> be the result of evaluating the inner product, just as in the multinomialLogistic model, and let y_j = <x,β_j> + a. The predicted probability for each category is then computed according to the following formulas:

p_1 = F(y_1)

The function F is the inverse of the specified link function:
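
The ordinal computation for the case above can be sketched in Python as follows. Only p_1 = F(y_1) is given in the text, so the remaining probabilities are an assumption here: the sketch uses the usual cumulative-link construction, p_j = F(y_j) - F(y_(j-1)) for the intermediate categories and p_k = 1 - F(y_(k-1)) for the last one. It also assumes that F for cumulativeLink="logit" is the logistic function 1 / (1 + exp(-y)), that the term a is 0 because the model sets no offset, and that the category-specific p0 values are added to the shared coefficients. The label `"last"` and all function names are hypothetical.

```python
import math

# Category-specific intercepts (p0) and shared coefficients from the ParamMatrix.
INTERCEPTS = {"1": -0.683, "2": 0.723, "3": 1.104, "4": 1.922, "5": 3.386, "6": 4.006}
SHARED = {"p1": 1.096, "p3": 0.957, "p5": 1.149, "p7": -0.067, "p8": 0.060}
OFFSET = 0.0     # the term a; assumed 0 because the model sets no offset
LAST = "last"    # hypothetical label for the final category, which has no PCell


def design_vector(obs):
    """Design vector written out from the parameter labels (intercept excluded)."""
    sex, minority = obs["sex"], obs["minority"]
    return {
        "p1": float(sex == 0),
        "p3": float(minority == 0 and sex == 0),
        "p5": float(minority == 0 and sex == 1),
        "p7": float(obs["age"]),
        "p8": float(obs["work"]),
    }


def inverse_logit(y):
    """Assumed inverse of the logit cumulative link."""
    return 1.0 / (1.0 + math.exp(-y))


def probabilities(obs):
    x = design_vector(obs)
    shared = sum(beta * x[name] for name, beta in SHARED.items())
    probs, previous = {}, 0.0
    for category, intercept in INTERCEPTS.items():  # dicts keep insertion order
        cumulative = inverse_logit(intercept + shared + OFFSET)  # F(y_j)
        probs[category] = cumulative - previous   # F(y_j) - F(y_(j-1)); p_1 = F(y_1)
        previous = cumulative
    probs[LAST] = 1.0 - previous                  # p_k = 1 - F(y_(k-1))
    return probs


obs = {"sex": 1, "minority": 0, "age": 25, "work": 4}
probs = probabilities(obs)
predicted = max(probs, key=probs.get)
```

Because the intercepts increase with the category, the cumulative values F(y_j) are non-decreasing and every difference is a valid probability.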
General Regression Samples: Simple Regression Example

Only two continuous predictors are used in this example, and the target variable JOBCAT is considered to be continuous. The Predictor-to-Parameter combinations mapping is trivial. The corresponding XML model is:

<PMML xmlns="https://www.dmg.org/PMML-4_1" version="4.1">
  <Header copyright="dmg.org"/>
  <DataDictionary numberOfFields="5">
    <DataField name="jobcat" optype="continuous" dataType="double"/>
    <DataField name="minority" optype="continuous" dataType="double"/>
    <DataField name="sex" optype="continuous" dataType="double"/>
    <DataField name="age" optype="continuous" dataType="double"/>
    <DataField name="work" optype="continuous" dataType="double"/>
  </DataDictionary>
  <GeneralRegressionModel targetVariableName="jobcat" modelType="regression" functionName="regression">
    <MiningSchema>
      <MiningField name="jobcat" usageType="predicted"/>
      <MiningField name="age" usageType="active"/>
      <MiningField name="work" usageType="active"/>
    </MiningSchema>
    <ParameterList>
      <Parameter name="p0" label="Intercept"/>
      <Parameter name="p1" label="age"/>
      <Parameter name="p2" label="work"/>
    </ParameterList>
    <CovariateList>
      <Predictor name="age"/>
      <Predictor name="work"/>
    </CovariateList>
    <PPMatrix>
      <PPCell value="1" predictorName="age" parameterName="p1"/>
      <PPCell value="1" predictorName="work" parameterName="p2"/>
    </PPMatrix>
    <ParamMatrix>
      <PCell parameterName="p0" beta="2.922" df="1"/>
      <PCell parameterName="p1" beta="-0.031" df="1"/>
      <PCell parameterName="p2" beta="0.034" df="1"/>
    </ParamMatrix>
  </GeneralRegressionModel>
</PMML>

Scoring Algorithm

The scoring steps for this example are similar to those of the general linear example, but simpler still. Suppose the following case (observation) must be scored:

obs = (age=25 work=4)
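
With only covariates and an intercept, the score for this case is a plain linear combination. A minimal Python sketch, assuming that each PPCell with value="1" passes the covariate through unchanged; the function name is hypothetical:

```python
# Betas copied from the ParamMatrix above.
INTERCEPT = 2.922        # p0
BETA_AGE = -0.031        # p1
BETA_WORK = 0.034        # p2


def predict(obs):
    """Intercept plus beta times covariate for age and work."""
    return INTERCEPT + BETA_AGE * obs["age"] + BETA_WORK * obs["work"]


obs = {"age": 25, "work": 4}
prediction = predict(obs)  # 2.922 - 0.031 * 25 + 0.034 * 4
```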
General Regression Samples: Generalized Linear Model Example

The information about the variables is the same as in the previous examples, but now the target variable JOBCAT is considered to be continuous. The Predictor-to-Parameter combinations mapping is the same as above. The corresponding XML model is:

<PMML xmlns="https://www.dmg.org/PMML-4_1" version="4.1">
  <Header copyright="dmg.org"/>
  <DataDictionary numberOfFields="5">
    <DataField name="jobcat" optype="continuous" dataType="double"/>
    <DataField name="minority" optype="categorical" dataType="double"/>
    <DataField name="sex" optype="categorical" dataType="double"/>
    <DataField name="age" optype="continuous" dataType="double"/>
    <DataField name="work" optype="continuous" dataType="double"/>
  </DataDictionary>
  <GeneralRegressionModel targetVariableName="jobcat" modelType="generalLinear" modelName="GZLM" functionName="regression" distribution="gamma" linkFunction="power" linkParameter="-1" offsetValue="3">
    <MiningSchema>
      <MiningField name="jobcat" usageType="predicted"/>
      <MiningField name="minority" usageType="active"/>
      <MiningField name="sex" usageType="active"/>
      <MiningField name="age" usageType="active"/>
      <MiningField name="work" usageType="active"/>
    </MiningSchema>
    <ParameterList>
      <Parameter name="p0" label="Intercept"/>
      <Parameter name="p1" label="[SEX=0]"/>
      <Parameter name="p2" label="[SEX=1]"/>
      <Parameter name="p3" label="[MINORITY=0]([SEX=0])"/>
      <Parameter name="p4" label="[MINORITY=1]([SEX=0])"/>
      <Parameter name="p5" label="[MINORITY=0]([SEX=1])"/>
      <Parameter name="p6" label="[MINORITY=1]([SEX=1])"/>
      <Parameter name="p7" label="age"/>
      <Parameter name="p8" label="work"/>
    </ParameterList>
    <FactorList>
      <Predictor name="sex"/>
      <Predictor name="minority"/>
    </FactorList>
    <CovariateList>
      <Predictor name="age"/>
      <Predictor name="work"/>
    </CovariateList>
    <PPMatrix>
      <PPCell value="0" predictorName="sex" parameterName="p1"/>
      <PPCell value="1" predictorName="sex" parameterName="p2"/>
      <PPCell value="0" predictorName="sex" parameterName="p3"/>
      <PPCell value="0" predictorName="sex" parameterName="p4"/>
      <PPCell value="1" predictorName="sex" parameterName="p5"/>
      <PPCell value="1" predictorName="sex" parameterName="p6"/>
      <PPCell value="0" predictorName="minority" parameterName="p3"/>
      <PPCell value="1" predictorName="minority" parameterName="p4"/>
      <PPCell value="0" predictorName="minority" parameterName="p5"/>
      <PPCell value="1" predictorName="minority" parameterName="p6"/>
      <PPCell value="1" predictorName="age" parameterName="p7"/>
      <PPCell value="1" predictorName="work" parameterName="p8"/>
    </PPMatrix>
    <ParamMatrix>
      <PCell parameterName="p0" beta="-2.30824444845005" df="1"/>
      <PCell parameterName="p1" beta="-0.268177596945098" df="1"/>
      <PCell parameterName="p3" beta="-0.169104566719988" df="1"/>
      <PCell parameterName="p5" beta="-0.219215962160056" df="1"/>
      <PCell parameterName="p7" beta="0.00427629446211706" df="1"/>
      <PCell parameterName="p8" beta="-0.00397117497757107" df="1"/>
    </ParamMatrix>
  </GeneralRegressionModel>
</PMML>

Scoring Algorithm

The scoring steps for this example resemble those of the second (general linear) example, except that the link function is also applied. Suppose the following case (observation) must be scored:

obs = (sex=1 minority=0 age=25 work=4)
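
The generalized linear computation for this case can be sketched in Python as follows. The sketch assumes that the linear predictor is <x,β> plus offsetValue, and that the power link with linkParameter λ is g(μ) = μ^λ, so that the predicted value is μ = η^(1/λ); with λ = -1 this is simply 1/η. It applies the link and offset attributes as written in the model element and does not handle λ = 0 or non-positive η. The function names are hypothetical.

```python
# Betas copied from the ParamMatrix above; parameters without a PCell count as 0.
BETAS = {
    "p0": -2.30824444845005,
    "p1": -0.268177596945098,
    "p3": -0.169104566719988,
    "p5": -0.219215962160056,
    "p7": 0.00427629446211706,
    "p8": -0.00397117497757107,
}
LINK_PARAMETER = -1.0   # linkParameter attribute (power link)
OFFSET_VALUE = 3.0      # offsetValue attribute


def design_vector(obs):
    """Design vector written out from the parameter labels (p0 is the intercept)."""
    sex, minority = obs["sex"], obs["minority"]
    return {
        "p0": 1.0,
        "p1": float(sex == 0),
        "p3": float(minority == 0 and sex == 0),
        "p5": float(minority == 0 and sex == 1),
        "p7": float(obs["age"]),
        "p8": float(obs["work"]),
    }


def linear_predictor(obs):
    """eta = <x, beta> + offset (assumed placement of the offset)."""
    x = design_vector(obs)
    return sum(beta * x[name] for name, beta in BETAS.items()) + OFFSET_VALUE


def predict(obs):
    """Invert the assumed power link g(mu) = mu ** lambda."""
    eta = linear_predictor(obs)
    return eta ** (1.0 / LINK_PARAMETER)


obs = {"sex": 1, "minority": 0, "age": 25, "work": 4}
eta = linear_predictor(obs)
prediction = predict(obs)
```

Under these assumptions the result lands close to the value produced by the general linear example for the same case, which is a useful sanity check on the reading of the link and offset.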