PMML 3.2 - PMML Conformance
PMML Document Validity
PMML is a standard for XML documents that express trained instances of analytic models. The following classes of model are addressed:
- Association Rules
- Decision Trees
- Center-Based & Distribution-Based Clustering
- General Regression
- Neural Networks
- Naive Bayes
- Support Vector Machine
A valid PMML 3.2 document must be an XML document that is valid with respect to the reference XML Schema found at http://www.dmg.org/PMML3.2/pmml-3-2.xsd, after removing additional attributes that have a name with the prefix "x-". Additional elements with a name prefix "X-" can be defined in an internal Schema. A valid PMML 3.2 document must obey the restrictions expressed in the model specifications found at http://www.dmg.org/PMML3.2/.
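The attribute-removal step that precedes schema validation can be sketched in Python using only the standard library. This is a minimal illustration, not a reference implementation: the function name strip_extension_attributes and the sample document are hypothetical, and the schema validation itself (which needs an XSD-capable validator) is not shown.

```python
import xml.etree.ElementTree as ET


def strip_extension_attributes(root):
    """Remove, in place, every attribute whose name starts with "x-".

    Returns the number of attributes removed. The function name is
    hypothetical; it is not part of PMML or of any PMML library.
    """
    removed = 0
    for elem in root.iter():
        for name in list(elem.attrib):
            # Namespaced attributes appear as "{uri}local"; test the local part.
            local = name.rsplit("}", 1)[-1]
            if local.startswith("x-"):
                del elem.attrib[name]
                removed += 1
    return removed


# Abbreviated, made-up document carrying two vendor extension attributes.
doc = ET.fromstring(
    '<PMML version="3.2" x-vendor="acme">'
    '<Header copyright="example" x-note="internal"/>'
    '</PMML>'
)
removed_count = strip_extension_attributes(doc)
cleaned = ET.tostring(doc, encoding="unicode")
```

The cleaned tree is what would then be handed to a schema validator together with pmml-3-2.xsd.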
The model specifications define various restrictions that are not implemented in the XML Schema. Examples are:
"The sequence of LinearNorm elements defines a sequence of points for a stepwise linear interpolation function."
The order of these elements is critical, but it cannot be enforced using the XSD.
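A check of this kind has to be done outside the schema. The sketch below assumes that the restriction amounts to the orig values of the LinearNorm elements appearing in strictly ascending order, with at least two points; that reading, the helper names, and the sample fragments are assumptions made for illustration.

```python
import xml.etree.ElementTree as ET


def local_name(tag):
    """Drop a "{namespace}" prefix from an ElementTree tag, if present."""
    return tag.rsplit("}", 1)[-1]


def linear_norms_sorted(norm_continuous):
    """Return True if the LinearNorm children are ordered by ascending orig.

    Assumption: at least two points are required and orig must be
    strictly increasing in document order.
    """
    origs = [
        float(child.get("orig"))
        for child in norm_continuous
        if local_name(child.tag) == "LinearNorm"
    ]
    return len(origs) >= 2 and all(a < b for a, b in zip(origs, origs[1:]))


# Both fragments have the same structure; only the element order differs.
good = ET.fromstring(
    '<NormContinuous field="age">'
    '<LinearNorm orig="0" norm="0"/>'
    '<LinearNorm orig="50" norm="0.5"/>'
    '<LinearNorm orig="100" norm="1"/>'
    '</NormContinuous>'
)
bad = ET.fromstring(
    '<NormContinuous field="age">'
    '<LinearNorm orig="50" norm="0.5"/>'
    '<LinearNorm orig="0" norm="0"/>'
    '<LinearNorm orig="100" norm="1"/>'
    '</NormContinuous>'
)
good_ok = linear_norms_sorted(good)
bad_ok = linear_norms_sorted(bad)
```

Because the two fragments differ only in element order, a structural schema check treats them alike, while the function above tells them apart.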
"In most cases Score is required but for
model composition where a tree model is used to select a regression model,
the regression equation will provide the Score and, in this case,
Score is not required.
In all other cases, if the winning node in the tree evaluation does not
have a Score, the result is invalid."
Likewise, the XSD cannot enforce that tree nodes carry a Score attribute in models that are not part of a model composition.
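A producer or consumer could test this restriction with a small tree walk. The sketch below is a simplification under stated assumptions: it inspects only leaf nodes (in practice any node can end up as the winning node, depending on how evaluation terminates), it assumes the attribute is spelled score on the Node element, and the function name and the abbreviated TreeModel fragment are hypothetical.

```python
import xml.etree.ElementTree as ET


def local_name(tag):
    """Drop a "{namespace}" prefix from an ElementTree tag, if present."""
    return tag.rsplit("}", 1)[-1]


def leaves_without_score(tree_model):
    """Return the ids of leaf Node elements that carry no score attribute.

    Simplification: only leaves are inspected, and the model is assumed
    not to be part of a model composition.
    """
    missing = []
    for node in tree_model.iter():
        if local_name(node.tag) != "Node":
            continue
        has_child_node = any(local_name(c.tag) == "Node" for c in node)
        if not has_child_node and node.get("score") is None:
            missing.append(node.get("id"))
    return missing


# Abbreviated fragment (MiningSchema omitted); the "right" leaf has no score.
tree = ET.fromstring(
    '<TreeModel functionName="classification">'
    '<Node id="root" score="no"><True/>'
    '<Node id="left" score="yes">'
    '<SimplePredicate field="x" operator="lessThan" value="1"/>'
    '</Node>'
    '<Node id="right">'
    '<SimplePredicate field="x" operator="greaterOrEqual" value="1"/>'
    '</Node>'
    '</Node>'
    '</TreeModel>'
)
missing = leaves_without_score(tree)
```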
PMML is intended to enable application portability, sharing, and reuse of analytic models produced by a variety of tools. Conformance must therefore be specified from both the producer and the consumer perspective. Applications need ways to specify what kinds of analytic models they can use, and modeling tools need ways to specify what kinds of analytic models they produce.
A PMML document is what a modeling tool produces to specify a trained analytic model, and what an application uses to deploy that model. Satisfying the producer conformance rules ensures that a model definition document is syntactically correct and defines a model instance that is consistent with the semantic criteria defined in the model specifications. Satisfying the consumer conformance rules ensures that such a model will be applied in ways that are valid.
A tool or application is producer conforming if it generates valid PMML
documents for at least one type of model; a producer conformance statement must
include which types of models are supported in this way. Producer conformance
is a "contract" between a data mining tool/application vendor and users and
application suppliers which states that the PMML documents defining models of
those types it supplies can be integrated and used.
An application is consumer conforming if it will accept valid PMML documents
for at least one type of model. A consumer conformance statement must indicate
which types of models are accepted. Application conformance is a "contract"
between the application suppliers and both users and data mining
tool/application vendors which states that PMML documents defining models of
those types can be integrated and used.
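One way to picture the two kinds of conformance statement is as a role plus a non-empty set of supported model types; the model types a producer and a consumer have in common are then the ones whose documents can be exchanged between them. The class, function, and tool setup below are hypothetical and serve only to illustrate that idea; the specification does not prescribe any machine-readable form for conformance statements.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ConformanceStatement:
    """Hypothetical record of a conformance statement."""

    role: str                # "producer" or "consumer"
    model_types: frozenset   # model types covered by the statement

    def __post_init__(self):
        if self.role not in ("producer", "consumer"):
            raise ValueError("role must be 'producer' or 'consumer'")
        # Conformance requires support for at least one type of model.
        if not self.model_types:
            raise ValueError("at least one model type must be listed")


def exchangeable_types(producer, consumer):
    """Model types that the producer generates and the consumer accepts."""
    return producer.model_types & consumer.model_types


tool = ConformanceStatement("producer", frozenset({"TreeModel", "RegressionModel"}))
app = ConformanceStatement("consumer", frozenset({"TreeModel", "NeuralNetwork"}))
shared = exchangeable_types(tool, app)
```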
An essential example is that if an application is consumer conforming for
Polynomial-Regression-model types, then valid PMML documents defining models of
this type produced by different producers would be interchangeable in the
Core and Non-Core Features
For a given class of model, the corresponding XML Schema and specifications identify core features which all conforming producers and consumers must support and non-core features which may optionally be supported.