PMML 4.1 - Model Verification

Providers and consumers of PMML models need a mechanism to ensure that a model deployed to a new environment generates results consistent with the environment in which the model was developed. Since differences in operating systems, data precision, and algorithm implementations can affect a model's results, the ModelVerification schema provides a dataset of model inputs and known results that can be used to verify that accurate results are generated, regardless of the environment.

To use Model Verification, the producer of a PMML model adds a set of verification records to the model. These should include a meaningful sample of the training dataset, covering normal cases as well as exception cases such as missing data, outliers, and other extreme situations. A consumer may ignore this part when scoring the model, but it may also provide a feature that verifies these records and reports the results to the user. Verification results can take many forms, such as a result for each verification record, a list of records that fail to verify, the maximum deviation, etc.

<xs:element name="ModelVerification">
  <xs:complexType>
    <xs:sequence>
      <xs:element ref="Extension" minOccurs="0" maxOccurs="unbounded"/>
      <xs:element ref="VerificationFields"/>
      <xs:element ref="InlineTable"/>
    </xs:sequence>
    <xs:attribute name="recordCount" type="INT-NUMBER" use="optional"/>
    <xs:attribute name="fieldCount" type="INT-NUMBER" use="optional"/>
  </xs:complexType>
</xs:element>

<xs:element name="VerificationFields">
  <xs:complexType>
    <xs:sequence>
      <xs:element ref="Extension" minOccurs="0" maxOccurs="unbounded"/>
      <xs:element maxOccurs="unbounded" ref="VerificationField"/>
    </xs:sequence>
  </xs:complexType>
</xs:element>

<xs:element name="VerificationField">
  <xs:complexType>
    <xs:sequence>
      <xs:element ref="Extension" minOccurs="0" maxOccurs="unbounded"/>
    </xs:sequence>
    <xs:attribute name="field" type="xs:string" use="required"/>
    <xs:attribute name="column" type="xs:string" use="optional"/>
    <xs:attribute name="precision" type="xs:double" default="1E-6"/>
    <xs:attribute name="zeroThreshold" type="xs:double" default="1E-16"/>
  </xs:complexType>
</xs:element>

ModelVerification consists of two parts. First, the VerificationFields element contains the fields that will appear in the verification records. These include the inputs for each record and one or more outputs. The inputs must relate to fields in the MiningSchema. In particular, these inputs must be MiningFields that have usageType="active" or usageType="group".

The VerificationFields for outputs ideally should refer to OutputField elements. This allows more than one output to be verified. For example, including two OutputField references, one with feature="predictedValue" and one with feature="probability", means that both these results can be verified.

Oftentimes, a VerificationField will refer to the MiningField whose usageType="predicted". This can cause confusion, since some will interpret this field as the dependent variable used to train the model, while others may consider it the expected result of the model. Therefore, by convention, when there exist VerificationField elements that refer to OutputFields, any VerificationField that refers to a MiningField whose usageType="predicted" should be considered to represent a dependent variable from the training dataset, not an expected output. On the other hand, if no VerificationField elements refer to an OutputField, then any VerificationField that refers to a MiningField whose usageType="predicted" should be considered to represent an expected output. The former is the preferred approach, but the latter allows Model Verification to be used by producers that have yet to implement OutputFields. Below is an example of some OutputFields:

<Output>
  <OutputField name="Iris-setosa Prob" optype="continuous" dataType="double" targetField="response" feature="probability" value="YES"/> 
  <OutputField name="Iris-setosa Pred" optype="categorical" dataType="string" targetField="response" feature="predictedValue"/> 
  <OutputField name="Iris-setosa Display" optype="categorical" dataType="string" targetField="response" feature="predictedDisplayValue"/> 
</Output>

For more information, see the Output specification.
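The convention described above reduces to a simple presence check. The following sketch illustrates it; the function name and data structures are hypothetical, not part of the PMML schema:

```python
def predicted_field_is_expected_output(verification_field_names, output_field_names):
    """Decide how to interpret a VerificationField that refers to a MiningField
    with usageType="predicted": if any VerificationField refers to an
    OutputField, the 'predicted' field carries the training dependent variable;
    otherwise it is treated as an expected output."""
    refers_to_output = any(name in output_field_names
                           for name in verification_field_names)
    return not refers_to_output


# With OutputField references present, "species" is a training target:
predicted_field_is_expected_output(["species", "PredictClass"], {"PredictClass"})
# Without any OutputField references, "species" is the expected output:
predicted_field_is_expected_output(["species"], {"PredictClass"})
```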

The second part contains the actual verification records. These records are contained in an InlineTable. Each entry in an InlineTable row must match the fields specified in the VerificationFields section. Since field names may contain spaces, the column attribute can be used to specify the element name that will be used in the in-line table for this VerificationField. That name can be freely chosen and need not be related to the respective field name.
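As a sketch of how a consumer might resolve this mapping, the following Python snippet reads the VerificationField/column pairs and extracts the records from an InlineTable. The helper name and the embedded document are illustrative, and XML namespace handling is omitted for brevity:

```python
import xml.etree.ElementTree as ET

PMML = """<ModelVerification>
  <VerificationFields>
    <VerificationField field="petal length" column="petal_x0020_length" precision="0.01"/>
    <VerificationField field="species"/>
  </VerificationFields>
  <InlineTable>
    <row>
      <petal_x0020_length>1.4</petal_x0020_length>
      <species>Iris-setosa</species>
    </row>
  </InlineTable>
</ModelVerification>"""

def read_verification_records(xml_text):
    root = ET.fromstring(xml_text)
    # Map each field name to its InlineTable element name: the column
    # attribute if given, otherwise the field name itself.
    columns = {vf.get("field"): vf.get("column") or vf.get("field")
               for vf in root.iter("VerificationField")}
    records = []
    for row in root.iter("row"):
        record = {}
        for field, column in columns.items():
            cell = row.find(column)
            # An absent element means a missing value; an empty element
            # is a zero-length string (valid only for string fields).
            record[field] = None if cell is None else (cell.text or "")
        records.append(record)
    return records
```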

How to determine if results match expectations

Categorical and ordinal results have a single specific value and therefore the actual results should match the expected results exactly. However, an exact numeric match may be too strict for continuous results like regression values and probabilities. For example, 0.9999999 may be considered a correct result for an expected value of 1, or 1E-12 (0.000000000001) may be a correct result for an expected value of 0. Hence each VerificationField has a precision attribute to indicate an acceptable range of continuous results for a particular field. The precision specifies an acceptable range given as a proportion of the reference value, including its boundaries:

(expectedValue * (1 - precision)) ≤ resultValue ≤ (expectedValue * (1 + precision))

For example, if the expected result is 0.95 and the precision is 1E-3 (0.001) then the acceptable range for the result is 0.94905 to 0.95095 (including the limits):

(0.95 * (1 - 0.001)) ≤ resultValue ≤ (0.95 * (1 + 0.001))
or
(0.95 * 0.999) ≤ resultValue ≤ (0.95 * 1.001)
or
0.94905 ≤ resultValue ≤ 0.95095

By using a proportion of the expected value, the precision can handle a large range of values. However, this approach breaks down when the expected value is near zero. To avoid problems with reference values close or identical to zero, all numbers whose absolute value is smaller than zeroThreshold are considered to be zero. If the absolute value of the expected result is greater than zeroThreshold, then the precision attribute is used and zeroThreshold has no effect on verifying results. But if the absolute value of the expected result is smaller than or equal to zeroThreshold, then the result should simply be compared against the zeroThreshold range. The idea behind zeroThreshold is that there is a range in which rounding errors might outweigh any calculation by far; zeroThreshold specifies that range. For example, if zeroThreshold is 0.01 and the absolute value of the expected result is less than or equal to 0.01, then the expected value is considered zero (regardless of the expected value in the PMML).

if (-zeroThreshold ≤ expectedValue ≤ zeroThreshold) and (-zeroThreshold ≤ resultValue ≤ zeroThreshold), then the result is verified to be correct; otherwise, it is not.
if (expectedValue < -zeroThreshold) or (expectedValue > zeroThreshold), then use the precision method described above.
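These two rules can be captured in a few lines. The sketch below is an illustrative helper, not part of the PMML schema; it applies the zeroThreshold rule first and falls back to the proportional precision window:

```python
def verify_value(expected, result, precision=1e-6, zero_threshold=1e-16):
    """Check one continuous result against its expected value using the
    zeroThreshold and precision rules; defaults match the schema defaults."""
    if -zero_threshold <= expected <= zero_threshold:
        # Expected value is effectively zero: the result must also fall
        # within [-zeroThreshold, zeroThreshold], boundaries included.
        return -zero_threshold <= result <= zero_threshold
    # Otherwise use the proportional precision window, boundaries included.
    # sorted() keeps the window oriented when the expected value is negative.
    low, high = sorted((expected * (1 - precision), expected * (1 + precision)))
    return low <= result <= high


# Record 1 from the table below: expected 0.001, result 0.00102 -> not verified
verify_value(0.001, 0.00102, precision=0.01, zero_threshold=0.001)
# Record 9: expected 0.999, result 0.99898 -> verified
verify_value(0.999, 0.99898, precision=0.01, zero_threshold=0.001)
```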

The following table illustrates various scoring situations, where:

  Precision            = value of VerificationField's precision attribute
  zeroThreshold        = value of VerificationField's zeroThreshold attribute
  expectedValue        = expected value of the record being verified
  Method               = procedure to use to verify the result
  Low                  = minimum value of verification range (dependent on verification method)
  High                 = maximum value of verification range (dependent on verification method)
  resultValue          = actual result value of the record being verified
  Verified             = TRUE means the record is verified; FALSE means it is not verified.

Note that for each entry precision is 0.01 and zeroThreshold is 0.001.

Record  expectedValue  Method         Low        High      resultValue  Verified  Comment
1       0.001000       zeroThreshold  -0.001000  0.001000   0.001020    FALSE     -zeroThreshold ≤ expectedValue ≤ zeroThreshold, so use the zeroThreshold method. Not verified since resultValue > zeroThreshold
2       0.001000       zeroThreshold  -0.001000  0.001000   0.001010    FALSE     -zeroThreshold ≤ expectedValue ≤ zeroThreshold, so use the zeroThreshold method. Not verified since resultValue > zeroThreshold
3       0.001000       zeroThreshold  -0.001000  0.001000   0.001000    TRUE      -zeroThreshold ≤ expectedValue ≤ zeroThreshold, so use the zeroThreshold method. Verified since -zeroThreshold ≤ resultValue ≤ zeroThreshold
4       0.001000       zeroThreshold  -0.001000  0.001000   0.000990    TRUE      -zeroThreshold ≤ expectedValue ≤ zeroThreshold, so use the zeroThreshold method. Verified since -zeroThreshold ≤ resultValue ≤ zeroThreshold
5       0.001000       zeroThreshold  -0.001000  0.001000   0.000000    TRUE      -zeroThreshold ≤ expectedValue ≤ zeroThreshold, so use the zeroThreshold method. Verified since -zeroThreshold ≤ resultValue ≤ zeroThreshold
6       0.001000       zeroThreshold  -0.001000  0.001000  -0.000999    TRUE      -zeroThreshold ≤ expectedValue ≤ zeroThreshold, so use the zeroThreshold method. Verified since -zeroThreshold ≤ resultValue ≤ zeroThreshold
7       0.001000       zeroThreshold  -0.001000  0.001000  -0.001000    TRUE      -zeroThreshold ≤ expectedValue ≤ zeroThreshold, so use the zeroThreshold method. Verified since -zeroThreshold ≤ resultValue ≤ zeroThreshold
8       0.001000       zeroThreshold  -0.001000  0.001000  -0.001001    FALSE     -zeroThreshold ≤ expectedValue ≤ zeroThreshold, so use the zeroThreshold method. Not verified since resultValue < -zeroThreshold
9       0.999000       precision       0.989010  1.008990   0.998980    TRUE      expectedValue > zeroThreshold, so use the precision method. Verified since Low ≤ resultValue ≤ High
10      0.999000       precision       0.989010  1.008990   0.989010    TRUE      expectedValue > zeroThreshold, so use the precision method. Verified since Low ≤ resultValue ≤ High
11      0.999000       precision       0.989010  1.008990   0.989000    FALSE     expectedValue > zeroThreshold, so use the precision method. Not verified since resultValue < Low
12      0.999000       precision       0.989010  1.008990   1.009000    FALSE     expectedValue > zeroThreshold, so use the precision method. Not verified since resultValue > High

It is recommended that producers of PMML models include not only a random sample from the training data set, but also artificial records that illustrate the behaviour of the model. For example, it is a good idea to include records that set each individual field to a missing value, extending to combinations of missing fields, up to a record where the input values for all fields are missing. Records where categorical fields have values not present in the training data, or where numeric fields have values outside their regular value range, can also be of interest.

Here is an example of ModelVerification for the Iris dataset using an InlineTable. The field species is actually a MiningField with usageType="predicted"; it therefore represents the dependent variable from the training data set. Per the convention above, since OutputFields are present, this field is not considered an expected output. Note also that missing values are indicated by omitting the respective column from the InlineTable row, while zero-length string values are indicated by including the column element without a value. Zero-length strings are only valid for string fields; if a zero-length string is specified for a non-string field, the value is considered invalid. Finally, field names containing a space use a column name that replaces the space with _x0020_; any other naming scheme, such as assigning numbers, would work as well:

<ModelVerification recordCount="4" fieldCount="10">
  <VerificationFields>
    <!-- the following six fields are listed in the MiningSchema element -->
    <VerificationField field="petal length" column="petal_x0020_length" precision="0.01"/>
    <VerificationField field="petal width" column="petal_x0020_width" precision="0.01"/>
    <VerificationField field="sepal length" column="sepal_x0020_length" precision="0.01"/>
    <VerificationField field="sepal width" column="sepal_x0020_width" precision="0.01"/>
    <VerificationField field="continent"/>
    <VerificationField field="species"/>
    <!-- the following four fields are listed in the Output element -->
    <VerificationField field="PredictClass"/>
    <VerificationField field="Iris-setosa Prob" column="Iris-setosa_x0020_Prob" precision="0.005"/>
    <VerificationField field="Iris-versicolor Prob" column="Iris-versicolor_x0020_Prob"/>
    <VerificationField field="Iris-virginica Prob" column="Iris-virginica_x0020_Prob" zeroThreshold="0.002"/>
  </VerificationFields>
  <InlineTable>
    <row>
      <petal_x0020_length>1.4</petal_x0020_length>
      <petal_x0020_width>0.2</petal_x0020_width>
      <sepal_x0020_length>1.4</sepal_x0020_length>
      <sepal_x0020_width>0.2</sepal_x0020_width>
      <continent>africa</continent>
      <species>Iris-setosa</species>
      <PredictClass>Iris-setosa</PredictClass>
      <Iris-setosa_x0020_Prob>0.62</Iris-setosa_x0020_Prob>
      <Iris-versicolor_x0020_Prob>0.30</Iris-versicolor_x0020_Prob>
      <Iris-virginica_x0020_Prob>0.08</Iris-virginica_x0020_Prob>
    </row>
    <row>
      <petal_x0020_length>4.7</petal_x0020_length>
      <petal_x0020_width>1.4</petal_x0020_width>
      <sepal_x0020_length>7.0</sepal_x0020_length>
      <!-- The sepal width value is missing -->
      <!-- The continent value is a zero-length string -->
      <continent/>
      <species>Iris-versicolor</species>
      <PredictClass>Iris-setosa</PredictClass>
      <Iris-setosa_x0020_Prob>0.43</Iris-setosa_x0020_Prob>
      <Iris-versicolor_x0020_Prob>0.39</Iris-versicolor_x0020_Prob>
      <Iris-virginica_x0020_Prob>0.18</Iris-virginica_x0020_Prob>
    </row>
    <row>
      <petal_x0020_length>4.7</petal_x0020_length>
      <petal_x0020_width>1.4</petal_x0020_width>
      <!-- The sepal length value is missing -->
      <sepal_x0020_width>0.2</sepal_x0020_width>
      <!-- The continent value is missing -->
      <species>Iris-versicolor</species>
      <PredictClass>Iris-setosa</PredictClass>
      <Iris-setosa_x0020_Prob>0.43</Iris-setosa_x0020_Prob>
      <Iris-versicolor_x0020_Prob>0.39</Iris-versicolor_x0020_Prob>
      <Iris-virginica_x0020_Prob>0.18</Iris-virginica_x0020_Prob>
    </row>
    <row>
      <petal_x0020_length>4.7</petal_x0020_length>
      <petal_x0020_width>1.4</petal_x0020_width>
      <sepal_x0020_length>7.0</sepal_x0020_length>
      <sepal_x0020_width>0.2</sepal_x0020_width>
      <continent>asia</continent>
      <species>Iris-versicolor</species>
      <!-- The PredictClass result is missing -->
      <Iris-setosa_x0020_Prob>0.609</Iris-setosa_x0020_Prob>
      <Iris-versicolor_x0020_Prob>0.39</Iris-versicolor_x0020_Prob>
      <Iris-virginica_x0020_Prob>0.001</Iris-virginica_x0020_Prob>
    </row>
  </InlineTable>    
</ModelVerification> 

Here is the same example, except this time, the model has no outputs defined. Instead, the mining schema field species represents the predicted value. Per above, since OutputFields are not present, this field will be considered an expected output.

<ModelVerification recordCount="4" fieldCount="6">
  <VerificationFields>
    <!-- the following six fields are listed in the MiningSchema element -->
    <VerificationField field="petal length" column="petal_x0020_length" precision="0.01"/>
    <VerificationField field="petal width" column="petal_x0020_width" precision="0.01"/>
    <VerificationField field="sepal length" column="sepal_x0020_length" precision="0.01"/>
    <VerificationField field="sepal width" column="sepal_x0020_width" precision="0.01"/>
    <VerificationField field="continent"/>
    <VerificationField field="species"/>
  </VerificationFields>
  <InlineTable>
    <row>
      <petal_x0020_length>1.4</petal_x0020_length>
      <petal_x0020_width>0.2</petal_x0020_width>
      <sepal_x0020_length>1.4</sepal_x0020_length>
      <sepal_x0020_width>0.2</sepal_x0020_width>
      <continent>africa</continent>
      <species>Iris-setosa</species>
    </row>
    <row>
      <petal_x0020_length>4.7</petal_x0020_length>
      <petal_x0020_width>1.4</petal_x0020_width>
      <sepal_x0020_length>7.0</sepal_x0020_length>
      <!-- The sepal width value is missing -->
      <!-- The continent value is a zero-length string -->
      <continent/>
      <species>Iris-versicolor</species>
    </row>
    <row>
      <petal_x0020_length>4.7</petal_x0020_length>
      <petal_x0020_width>1.4</petal_x0020_width>
      <!-- The sepal length value is missing -->
      <sepal_x0020_width>0.2</sepal_x0020_width>
      <!-- The continent value is missing -->
      <species>Iris-versicolor</species>
    </row>
    <row>
      <petal_x0020_length>4.7</petal_x0020_length>
      <petal_x0020_width>1.4</petal_x0020_width>
      <sepal_x0020_length>7.0</sepal_x0020_length>
      <sepal_x0020_width>0.2</sepal_x0020_width>
      <continent>asia</continent>
      <species>Iris-versicolor</species>
    </row>
  </InlineTable>    
</ModelVerification> 

Models that rely on a "group" input require special handling with regard to Model Verification. Inputs to such models are typically a group of Items referred to as an Itemset. Therefore, multiple verification records are required to represent a single model input. The following example illustrates how verification records can be used by Association Rules models. When scoring the verification records, the records are grouped based on the field corresponding to usageType="group" which, in this example, is "OrderID". Therefore, the verification records below represent two input Itemsets, "Cracker, Coke, Water" and "Cracker, Banana".

Note that the outputs of the model need to be handled in the same manner. Since the consequent output is an itemset, the prediction may also require multiple verification records. In this example, the verification records for order #1 predict a consequent of "Nachos, Banana", while the records for order #2 predict "Pear". The number of records required is determined by the larger of the two itemsets. For output types consisting of a single value (e.g. ruleId, support) the prediction must appear in only one of the records.

<ModelVerification fieldCount="4" recordCount="5">
  <VerificationFields>
    <VerificationField column="MVField1" field="OrderID"/>
    <VerificationField column="MVField2" field="Product"/>
    <VerificationField column="MVField3" field="Rule Id"/>
    <VerificationField column="MVField4" field="Consequent"/>
  </VerificationFields>
  <InlineTable>
    <row>
      <MVField1>1</MVField1>
      <MVField2>Cracker</MVField2>
      <MVField3>1</MVField3>
      <MVField4>Nachos</MVField4>
    </row>
    <row>
      <MVField1>1</MVField1>
      <MVField2>Coke</MVField2>
      <MVField3/>
      <MVField4>Banana</MVField4>
    </row>
    <row>
      <MVField1>1</MVField1>
      <MVField2>Water</MVField2>
      <MVField3/>
      <MVField4/>
    </row>
    <row>
      <MVField1>2</MVField1>
      <MVField2>Cracker</MVField2>
      <MVField3>3</MVField3>
      <MVField4/>
    </row>
    <row>
      <MVField1>2</MVField1>
      <MVField2>Banana</MVField2>
      <MVField3/>
      <MVField4>Pear</MVField4>
    </row>
  </InlineTable>
</ModelVerification>
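A consumer could assemble the itemsets from such records with a simple grouping pass. The helper below is an illustrative sketch (the function and variable names are hypothetical), applied to the five verification records above with the field names resolved back from their MVField* columns:

```python
def group_verification_records(records, group_field, item_field):
    """Group InlineTable rows on the usageType="group" field so that each
    group forms one model input itemset. Rows with a missing item value
    are skipped."""
    groups = {}
    for record in records:
        item = record.get(item_field)
        if item:
            groups.setdefault(record[group_field], []).append(item)
    return groups


# The five verification records from the Association Rules example above:
records = [
    {"OrderID": "1", "Product": "Cracker"},
    {"OrderID": "1", "Product": "Coke"},
    {"OrderID": "1", "Product": "Water"},
    {"OrderID": "2", "Product": "Cracker"},
    {"OrderID": "2", "Product": "Banana"},
]

# Yields the two input itemsets: {"1": ["Cracker", "Coke", "Water"],
#                                 "2": ["Cracker", "Banana"]}
group_verification_records(records, "OrderID", "Product")
```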