PMML 4.0 - Model Verification
Providers and consumers of PMML models need a mechanism to ensure that a model
deployed to a new environment generates results consistent with the environment
where the model was developed. Since differences in operating systems, data
precision, and algorithm implementation can affect a model's results, the
ModelVerification schema provides a dataset of model inputs and known
results that can be used to verify that accurate results are generated, regardless
of the environment.
To use Model Verification, the producer of a PMML model adds a set of
verification records to the model. These should include a meaningful sample of
the training dataset, covering normal cases as well as exception cases such
as missing data, outliers, and other extreme situations. A consumer may
ignore this part when scoring the model, but it may also provide a
feature that verifies these records and reports the
results to the user.
These verification results can take many forms: a result for
each verification record, a list of records that fail to verify, the maximum
deviation observed, and so on.
<xs:element name="ModelVerification">
<xs:complexType>
<xs:sequence>
<xs:element ref="Extension" minOccurs="0" maxOccurs="unbounded" />
<xs:element ref="VerificationFields" />
<xs:element ref="InlineTable" />
</xs:sequence>
<xs:attribute name="recordCount" type="INT-NUMBER" use="optional"/>
<xs:attribute name="fieldCount" type="INT-NUMBER" use="optional"/>
</xs:complexType>
</xs:element>
<xs:element name="VerificationFields">
<xs:complexType>
<xs:sequence>
<xs:element ref="Extension" minOccurs="0" maxOccurs="unbounded" />
<xs:element maxOccurs="unbounded" ref="VerificationField" />
</xs:sequence>
</xs:complexType>
</xs:element>
<xs:element name="VerificationField">
<xs:complexType>
<xs:sequence>
<xs:element ref="Extension" minOccurs="0" maxOccurs="unbounded" />
</xs:sequence>
<xs:attribute name="field" type="xs:string" use="required" />
<xs:attribute name="column" type="xs:string" use="optional" />
<xs:attribute name="precision" type="xs:double" default="1E-6"/>
<xs:attribute name="zeroThreshold" type="xs:double" default="1E-16"/>
</xs:complexType>
</xs:element>
ModelVerification consists of two parts. First, the
VerificationFields element contains the fields that will
appear in the verification records. These include the inputs for each record
and one or more outputs. The inputs must relate
to fields in the MiningSchema. In particular, these inputs must be
MiningFields that have
usageType="active".
The VerificationFields for outputs ideally should refer to
OutputField elements. This
allows more than one output to be verified. For example, including two
OutputField references, one with
feature="predictedValue" and one with feature="probability",
means that both these results can be verified.
Oftentimes, a VerificationField will refer to the
MiningField whose usageType="predicted". This can cause
confusion, since some will interpret this field as the dependent
variable used to train the model, while others may consider
it to be the expected result of the model. Therefore, by convention, when
there exist VerificationField elements that refer to
OutputFields, any VerificationField that refers to a
MiningField whose usageType="predicted" should be
considered to represent a dependent variable from the training dataset,
not an expected output. On the other hand, if there are no
VerificationField elements that refer to an OutputField,
then any VerificationField that refers to a
MiningField whose usageType="predicted" should be
considered to represent an expected output. The former is the
preferred approach, but the latter allows Model Verification to be used by
producers that have yet to implement OutputFields.
Below is an example of some OutputFields:
<Output>
<OutputField name="Iris-setosa Prob" optype="continuous" dataType="double"
targetField="response" feature="probability" value="Iris-setosa" />
<OutputField name="Iris-setosa Pred" optype="categorical" dataType="string"
targetField="response" feature="predictedValue" />
<OutputField name="Iris-setosa Display" optype="categorical" dataType="string"
targetField="response" feature="predictedDisplayValue" />
</Output>
For more information, see the Output specification.
The second part contains the actual verification records. These records are contained in an
InlineTable. Each entry in an InlineTable row must match the fields specified in the
VerificationFields section. Since field names may contain spaces or other characters
not allowed in XML element names, the column attribute can be used to specify the element
name that will be used in the in-line table for this VerificationField. That name can be freely chosen and need not
be related to the respective field name.
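The examples below use one common scheme, replacing each space with the XML character escape _x0020_; the helper below is a minimal illustrative sketch (the function name is not part of the specification):

```python
def field_to_column(field_name: str) -> str:
    """Map a PMML field name to a legal XML element name for InlineTable rows.

    Spaces are not allowed in XML element names; the examples in this section
    escape them as "_x0020_", but any unique naming scheme works equally well.
    """
    return field_name.replace(" ", "_x0020_")

print(field_to_column("petal length"))  # petal_x0020_length
```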
How to determine if results match expectations
Categorical and ordinal results have a single specific value, and
therefore the actual results should match the expected results exactly. However, an
exact match may be too strict for continuous results such as regression
values and probabilities. For example, 0.9999999 may be considered a correct
result for an expected value of 1, or 1E-12 (0.000000000001) may be a correct
result for an expected value of 0. Hence each VerificationField has a
precision attribute to indicate an acceptable range of continuous results
for a particular field. The precision specifies an acceptable range given
as a proportion of the expected value, including its boundaries:
(expectedValue * (1 - precision)) ≤ resultValue ≤ (expectedValue *
(1 + precision))
For example, if the expected result is 0.95 and the precision is 1E-3 (0.001),
then the acceptable range for the result is 0.94905 to 0.95095 (including the
limits):
(0.95 * (1 - 0.001)) ≤ resultValue ≤ (0.95 * (1 + 0.001))
or
(0.95 * 0.999) ≤ resultValue ≤ (0.95 * 1.001)
or
0.94905 ≤ resultValue ≤ 0.95095
By using a proportion of the expected value, the precision can handle
a large range of values. However, this approach breaks down when the expected
value is near zero. To avoid problems with expected values close or identical
to zero, all numbers whose absolute value is smaller than or equal to zeroThreshold
are considered to be zero. If the absolute value of the expected
result is greater than zeroThreshold, the precision attribute is used and
zeroThreshold has no effect on verifying results. But if the absolute value of
the expected result is smaller than or equal to zeroThreshold, then the result is simply
compared against zeroThreshold. The idea behind zeroThreshold
is that there is a range in which rounding errors may far outweigh any
calculated differences; zeroThreshold specifies that range. For example, if
zeroThreshold is 0.01 and the absolute value of the expected result is less than or equal to
0.01, then the expected value is considered zero (regardless of the expected
value in the PMML). To summarize:
if (-zeroThreshold ≤ expectedValue ≤ zeroThreshold) AND
(-zeroThreshold ≤ resultValue ≤ zeroThreshold) then the result is verified
to be correct; otherwise, it is not;
if (expectedValue < -zeroThreshold) or (expectedValue >
zeroThreshold) then use the precision method described above.
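These two comparison rules can be sketched in Python. This is an illustration, not normative; the function name is made up, and the defaults are taken from the VerificationField schema above:

```python
def verify_result(expected: float, actual: float,
                  precision: float = 1e-6,
                  zero_threshold: float = 1e-16) -> bool:
    """Check a continuous result against its expected value.

    Values whose absolute value is within zero_threshold are treated as zero;
    otherwise the result must fall in the proportional band
    expected*(1 - precision) .. expected*(1 + precision), inclusive.
    """
    if -zero_threshold <= expected <= zero_threshold:
        # zeroThreshold method: both values must count as "zero"
        return -zero_threshold <= actual <= zero_threshold
    # precision method: proportional band around the expected value
    low = expected * (1 - precision)
    high = expected * (1 + precision)
    if low > high:  # a negative expected value flips the bounds
        low, high = high, low
    return low <= actual <= high

# e.g. with precision=0.01 and zeroThreshold=0.001:
print(verify_result(0.001, 0.001020, 0.01, 0.001))  # False (beyond zeroThreshold)
print(verify_result(0.999, 0.998980, 0.01, 0.001))  # True (inside precision band)
```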
The following table illustrates various scoring situations, where:
Precision = value of VerificationField's precision attribute
zeroThreshold = value of VerificationField's zeroThreshold attribute
expectedValue = expected value of the record being verified
Method = procedure to use to verify the result
Low = minimum value of verification range (dependent on verification method)
High = maximum value of verification range (dependent on verification method)
resultValue = actual result produced for the record being verified
Verified = TRUE means the record is verified; FALSE means it is not verified.
Note that for each entry precision is 0.01 and zeroThreshold
is 0.001.
| Record | expectedValue | Method | Low | High | resultValue | Verified | Comment |
|--------|---------------|--------|-----|------|-------------|----------|---------|
| 1 | 0.001000 | zeroThreshold | -0.001000 | 0.001000 | 0.001020 | FALSE | -zeroThreshold ≤ expectedValue ≤ zeroThreshold, so use the zeroThreshold method. Not verified since resultValue > zeroThreshold |
| 2 | 0.001000 | zeroThreshold | -0.001000 | 0.001000 | 0.001010 | FALSE | -zeroThreshold ≤ expectedValue ≤ zeroThreshold, so use the zeroThreshold method. Not verified since resultValue > zeroThreshold |
| 3 | 0.001000 | zeroThreshold | -0.001000 | 0.001000 | 0.001000 | TRUE | -zeroThreshold ≤ expectedValue ≤ zeroThreshold, so use the zeroThreshold method. Verified since -zeroThreshold ≤ resultValue ≤ zeroThreshold |
| 4 | 0.001000 | zeroThreshold | -0.001000 | 0.001000 | 0.000990 | TRUE | -zeroThreshold ≤ expectedValue ≤ zeroThreshold, so use the zeroThreshold method. Verified since -zeroThreshold ≤ resultValue ≤ zeroThreshold |
| 5 | 0.001000 | zeroThreshold | -0.001000 | 0.001000 | 0.000000 | TRUE | -zeroThreshold ≤ expectedValue ≤ zeroThreshold, so use the zeroThreshold method. Verified since -zeroThreshold ≤ resultValue ≤ zeroThreshold |
| 6 | 0.001000 | zeroThreshold | -0.001000 | 0.001000 | -0.000999 | TRUE | -zeroThreshold ≤ expectedValue ≤ zeroThreshold, so use the zeroThreshold method. Verified since -zeroThreshold ≤ resultValue ≤ zeroThreshold |
| 7 | 0.001000 | zeroThreshold | -0.001000 | 0.001000 | -0.001000 | TRUE | -zeroThreshold ≤ expectedValue ≤ zeroThreshold, so use the zeroThreshold method. Verified since -zeroThreshold ≤ resultValue ≤ zeroThreshold |
| 8 | 0.001000 | zeroThreshold | -0.001000 | 0.001000 | -0.001001 | FALSE | -zeroThreshold ≤ expectedValue ≤ zeroThreshold, so use the zeroThreshold method. Not verified since resultValue < -zeroThreshold |
| 9 | 0.999000 | precision | 0.989010 | 1.008990 | 0.998980 | TRUE | expectedValue > zeroThreshold, so use the precision method. Verified since Low ≤ resultValue ≤ High |
| 10 | 0.999000 | precision | 0.989010 | 1.008990 | 0.989010 | TRUE | expectedValue > zeroThreshold, so use the precision method. Verified since Low ≤ resultValue ≤ High |
| 11 | 0.999000 | precision | 0.989010 | 1.008990 | 0.989000 | FALSE | expectedValue > zeroThreshold, so use the precision method. Not verified since resultValue < Low |
| 12 | 0.999000 | precision | 0.989010 | 1.008990 | 1.009000 | FALSE | expectedValue > zeroThreshold, so use the precision method. Not verified since resultValue > High |
It is recommended that producers of PMML models include not only a random
sample from the training dataset, but also artificial records illustrating the
behaviour of the model. For example, it is a good idea to have records that set each
individual field to a missing value; this can be extended to combinations of
missing fields, up to a record in which all input values are missing.
Records where categorical fields take values not present in the
training data, or where numeric fields take values outside their regular
range, can also be of interest.
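Such per-field and combined missing-value records can be generated mechanically. A hypothetical sketch (the function name and record layout are illustrative, not part of the specification):

```python
from itertools import combinations

def missing_value_variants(record: dict, max_missing: int = 1):
    """Yield copies of a complete input record with fields removed.

    One variant per combination of missing fields, from single fields up to
    max_missing fields at once (set max_missing to len(record) to include
    the all-fields-missing record).
    """
    fields = list(record)
    for n in range(1, max_missing + 1):
        for gone in combinations(fields, n):
            yield {f: v for f, v in record.items() if f not in gone}

base = {"petal length": 1.4, "petal width": 0.2,
        "sepal length": 1.4, "sepal width": 0.2}
variants = list(missing_value_variants(base, max_missing=1))
print(len(variants))  # 4 single-field-missing records
```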
Here is an example of ModelVerification for the Iris dataset using an InlineTable.
The field species is actually a MiningField with usageType="predicted", thus
it represents the dependent variable from the training dataset. Per the convention above, since
OutputFields are present, this field is not considered an expected output.
Note also that missing values are indicated by omitting the respective element in the
InlineTable row, or by leaving it empty. Field names containing
a space use a column that replaces the space with _x0020_; any other naming
scheme, such as assigning numbers, would work as well:
<ModelVerification recordCount="4" fieldCount="9">
<VerificationFields>
<!-- The following five fields are listed in the MiningSchema element -->
<VerificationField field="petal length" column="petal_x0020_length" />
<VerificationField field="petal width" column="petal_x0020_width" />
<VerificationField field="sepal length" column="sepal_x0020_length" />
<VerificationField field="sepal width" column="sepal_x0020_width" />
<VerificationField field="species"/>
<!-- Each of the following four fields is defined in an OutputField element in the Output element -->
<VerificationField field="PredictClass" />
<VerificationField field="Iris-setosa Prob" column="Iris-setosa_x0020_Prob" precision="0.005"/>
<VerificationField field="Iris-versicolor Prob" column="Iris-versicolor_x0020_Prob" />
<VerificationField field="Iris-virginica Prob" column="Iris-virginica_x0020_Prob" zeroThreshold="0.002" />
</VerificationFields>
<InlineTable>
<row>
<petal_x0020_length>1.4</petal_x0020_length>
<petal_x0020_width>0.2</petal_x0020_width>
<sepal_x0020_length>1.4</sepal_x0020_length>
<sepal_x0020_width>0.2</sepal_x0020_width>
<species>Iris-setosa</species>
<PredictClass>Iris-setosa</PredictClass>
<Iris-setosa_x0020_Prob>0.62</Iris-setosa_x0020_Prob>
<Iris-versicolor_x0020_Prob>0.30</Iris-versicolor_x0020_Prob>
<Iris-virginica_x0020_Prob>0.08</Iris-virginica_x0020_Prob>
</row>
<row>
<petal_x0020_length>4.7</petal_x0020_length>
<petal_x0020_width>1.4</petal_x0020_width>
<sepal_x0020_length>7.0</sepal_x0020_length>
<!-- The sepal width value is missing -->
<species>Iris-versicolor</species>
<PredictClass>Iris-setosa</PredictClass>
<Iris-setosa_x0020_Prob>0.43</Iris-setosa_x0020_Prob>
<Iris-versicolor_x0020_Prob>0.39</Iris-versicolor_x0020_Prob>
<Iris-virginica_x0020_Prob>0.18</Iris-virginica_x0020_Prob>
</row>
<row>
<petal_x0020_length>4.7</petal_x0020_length>
<petal_x0020_width>1.4</petal_x0020_width>
<!-- The sepal length value is missing -->
<sepal_x0020_length></sepal_x0020_length>
<sepal_x0020_width>0.2</sepal_x0020_width>
<species>Iris-versicolor</species>
<PredictClass>Iris-setosa</PredictClass>
<Iris-setosa_x0020_Prob>0.43</Iris-setosa_x0020_Prob>
<Iris-versicolor_x0020_Prob>0.39</Iris-versicolor_x0020_Prob>
<Iris-virginica_x0020_Prob>0.18</Iris-virginica_x0020_Prob>
</row>
<row>
<petal_x0020_length>4.7</petal_x0020_length>
<petal_x0020_width>1.4</petal_x0020_width>
<sepal_x0020_length>7.0</sepal_x0020_length>
<species>Iris-versicolor</species>
<!-- The PredictClass result is missing -->
<Iris-setosa_x0020_Prob>0.609</Iris-setosa_x0020_Prob>
<Iris-versicolor_x0020_Prob>0.39</Iris-versicolor_x0020_Prob>
<Iris-virginica_x0020_Prob>0.001</Iris-virginica_x0020_Prob>
</row>
</InlineTable>
</ModelVerification>
Finally, here is the same example, except this time the model has no outputs defined.
Instead, the mining schema field species represents the predicted value. Per the convention above, since
no OutputFields are present, this field is considered an expected output.
<ModelVerification recordCount="4" fieldCount="5">
<VerificationFields>
<!-- The following five fields are listed in the MiningSchema element -->
<VerificationField field="petal length" column="petal_x0020_length" />
<VerificationField field="petal width" column="petal_x0020_width" />
<VerificationField field="sepal length" column="sepal_x0020_length" />
<VerificationField field="sepal width" column="sepal_x0020_width" />
<VerificationField field="species"/>
</VerificationFields>
<InlineTable>
<row>
<petal_x0020_length>1.4</petal_x0020_length>
<petal_x0020_width>0.2</petal_x0020_width>
<sepal_x0020_length>1.4</sepal_x0020_length>
<sepal_x0020_width>0.2</sepal_x0020_width>
<species>Iris-setosa</species>
</row>
<row>
<petal_x0020_length>4.7</petal_x0020_length>
<petal_x0020_width>1.4</petal_x0020_width>
<sepal_x0020_length>7.0</sepal_x0020_length>
<!-- The sepal width value is missing -->
<species>Iris-versicolor</species>
</row>
<row>
<petal_x0020_length>4.7</petal_x0020_length>
<petal_x0020_width>1.4</petal_x0020_width>
<!-- The sepal length value is missing -->
<sepal_x0020_length></sepal_x0020_length>
<sepal_x0020_width>0.2</sepal_x0020_width>
<species>Iris-versicolor</species>
</row>
<row>
<petal_x0020_length>4.7</petal_x0020_length>
<petal_x0020_width>1.4</petal_x0020_width>
<sepal_x0020_length>7.0</sepal_x0020_length>
<species>Iris-versicolor</species>
</row>
</InlineTable>
</ModelVerification>