Changes from PMML 3.0

Overview of most significant Changes

  • more built-in functions
  • specification of the dataType is now required in all places where applicable
  • new link functions loglog and cauchit
  • MiningSchema can now specify how to treat invalid input values
  • Sequence models have been reworked:
    • SetPredicate is now deprecated
    • clarified distinction between informal attributes and actual constraints from the model building phase
    • these changes break compatibility with PMML 3.0
  • more post-processing capabilities in Targets element
  • various clarifications in TreeModel:
    • added mechanism to specify confidences
    • missing value treatment within the tree is now configurable
    • behaviour when no child node applies is now configurable





List of all Changes

Associations

  • typo: (Paragraph 1): The PMML spec says "a certain product" is associated with a set of other products. Text changed to say "a certain product or set of products"
  • typo: In "Here is a description of the attributes in an item", changed "item" to "itemset".
  • added scoring procedure description.

BuiltinFunctions

  • added further functions exp, pow, threshold, floor, ceil and round
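
As a rough illustration of what the added functions compute, the following Python sketch maps each name to an equivalent. The dispatch table and the helper name apply_builtin are hypothetical. It assumes threshold(x, y) yields 1 when x > y and 0 otherwise, and that round sends halves upward; check both against the Built-in Functions section before relying on them.

```python
import math


def pmml_threshold(x, y):
    """Return 1 if x is strictly greater than y, otherwise 0 (assumed semantics)."""
    return 1 if x > y else 0


def pmml_round(x):
    """Round to the nearest integer, with halves going up (assumed tie rule)."""
    return math.floor(x + 0.5)


# Hypothetical dispatch table from function name to a Python equivalent.
BUILTINS = {
    "exp": math.exp,
    "pow": math.pow,
    "threshold": pmml_threshold,
    "floor": math.floor,
    "ceil": math.ceil,
    "round": pmml_round,
}


def apply_builtin(function, *args):
    """Evaluate one of the functions listed above by name."""
    return BUILTINS[function](*args)
```

For example, apply_builtin("threshold", 3, 2) selects the threshold function by name in the same way an Apply element names its function.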

ClusteringModel

  • introduce default for compareFunction in element ComparisonMeasure
  • add default 1 for fieldWeight (already present in wording)
  • typo: Reworded awkward sentence from "In particular this allows to map categorical input fields..." to "In particular this allows the mapping of categorical input fields..."
  • typo: Changed "NUM-ARRAY;." to "NUM-ARRAY."

Conformance

  • typo: Added 3.0 models (Text, Rules and SVM) to bullet list

DataDictionary

  • dataType is required for DataField
  • continuous fields can have an unlimited number of value range intervals
  • removed dataType: boolean

Functions

  • make attribute function in element Apply required
  • optype is required for DefineFunction
  • dataType clarification, i.e. casts

GeneralRegression

  • typo: Factor List: Changed text from "Each name in the list must match a name from the dictionary..." to "Each name in the list must match a DataField name or a DerivedField name."
  • typo: targetVariableName: Changed text from "Each name in the list must match a name from the dictionary..." to "Each name in the list must match a DataField name or a DerivedField name."
  • typo: Covariate List: The text "Each name in the list must match a name from the dictionary..." did not state which dictionary. Changed it to "Each name in the list must match a DataField name or a DerivedField name."
  • added new link functions loglog and cauchit
  • added scoring procedure
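
For orientation, here is a minimal Python sketch of the two new link functions and their inverses. It assumes the usual textbook definitions (loglog: g(p) = -ln(-ln p); cauchit: g(p) = tan(π(p - 0.5))). The function names are illustrative, and the GeneralRegression section remains the authority on the exact formulas.

```python
import math


def loglog_link(p):
    """Log-log link: maps a probability in (0, 1) to the real line."""
    return -math.log(-math.log(p))


def loglog_inverse(x):
    """Inverse log-log link: exp(-exp(-x))."""
    return math.exp(-math.exp(-x))


def cauchit_link(p):
    """Cauchit link: quantile function of the standard Cauchy distribution."""
    return math.tan(math.pi * (p - 0.5))


def cauchit_inverse(x):
    """Inverse cauchit link: 0.5 + arctan(x) / pi."""
    return 0.5 + math.atan(x) / math.pi
```

Each inverse maps a linear predictor back to a probability, so loglog_inverse(loglog_link(p)) and cauchit_inverse(cauchit_link(p)) should both return p up to floating-point error.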

GeneralStructure

  • clarified thousand separator
  • typo: Changed xmlns="https://www.dmg.org/PMML-3_0" to xmlns="https://www.dmg.org/PMML-3_1" in three places [Two items]
  • typo: Extension Mechanism: Changed attribute to an x- attribute, "<X-DataFieldSource x-sourceKnown="yes" >"
  • typo: x- attributes have been deprecated so removed the "x-author" attribute in example

MiningSchema

  • added attribute invalidValueTreatment for invalid value handling
  • clarified missingValueReplacement strategy asMean
  • typo: highValue and lowValue: Changed text to say these attributes are required, "...used in conjunction with, and are required"
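
The effect of invalidValueTreatment can be pictured with a small Python sketch. It assumes the three options returnInvalid, asIs and asMissing, represents a missing value as None, and uses a hypothetical exception to stand for an invalid result. The function and parameter names are illustrative and not part of PMML.

```python
class InvalidValueError(Exception):
    """Hypothetical signal that scoring should return an invalid result."""


def treat_invalid(value, valid_values, treatment="returnInvalid"):
    """Apply an invalid-value treatment to a single input value.

    Missing values (None) and valid values pass through unchanged.
    """
    if value is None or value in valid_values:
        return value
    if treatment == "asIs":
        return value  # use the invalid value as if it were valid
    if treatment == "asMissing":
        return None  # handle the value like a missing value
    if treatment == "returnInvalid":
        raise InvalidValueError(f"invalid input value: {value!r}")
    raise ValueError(f"unknown treatment: {treatment!r}")
```

With asMissing, the value would afterwards be subject to whatever missing value handling the MiningField defines.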

ModelComposition

  • removed final example
  • clarified where final prediction comes from
  • added new attributes from tree to DecisionTree (missingValueStrategy, missingValuePenalty, noTrueChildStrategy)

NaiveBayes

  • re-added example model (gone since PMML 2.1 for unknown reasons)

NeuralNetwork

  • clarify usage of attribute width
  • define what to do in case of ties with classification

Output

  • made attribute targetField in OutputField optional
  • corrected definition of attribute targetField

Regression

  • correct formulas for softmax and logit functions
  • add new link functions loglog and cauchit in accordance with GeneralRegression
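
As a reminder of what the two corrected normalizations compute, the following Python sketch uses the standard definitions of softmax and the inverse logit. It is an assumption that these match the corrected formulas; the Regression section has the authoritative text. The function names are illustrative.

```python
import math


def softmax(ys):
    """Softmax: p_j = exp(y_j) / sum_i exp(y_i)."""
    shift = max(ys)  # subtracting the maximum avoids overflow, result is unchanged
    exps = [math.exp(y - shift) for y in ys]
    total = sum(exps)
    return [e / total for e in exps]


def logit_normalization(y):
    """Inverse logit: p = 1 / (1 + exp(-y))."""
    return 1.0 / (1.0 + math.exp(-y))
```

Softmax applies to one raw value per target category and yields probabilities that sum to 1. The logit form maps a single raw value to a probability.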

Sequence

  • reworked; the most important changes are the distinction between information and constraints, and the deprecation of SetPredicate

SupportVectorMachine

  • detail how to support categorical input variables, binary and non-binary classification

Targets

  • TargetValue is required only for classification models
  • added further post-processing capabilities
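
A sketch of how such post-processing of a predicted numeric value might look in Python follows. It assumes the Target attributes min, max, rescaleFactor, rescaleConstant and castInteger (with values round, ceiling and floor), applied in that order. Both the attribute set and the order are assumptions to verify against the Targets section. The function and parameter names are illustrative.

```python
import math


def postprocess(value, minimum=None, maximum=None,
                rescale_factor=1.0, rescale_constant=0.0, cast_integer=None):
    """Clamp, rescale and optionally cast a predicted value (assumed order)."""
    if minimum is not None:
        value = max(value, minimum)
    if maximum is not None:
        value = min(value, maximum)
    # Multiply first, then add the constant.
    value = value * rescale_factor + rescale_constant
    if cast_integer is None:
        return value
    if cast_integer == "round":
        return math.floor(value + 0.5)  # halves go up (assumed tie rule)
    if cast_integer == "ceiling":
        return math.ceil(value)
    if cast_integer == "floor":
        return math.floor(value)
    raise ValueError(f"unknown castInteger value: {cast_integer!r}")
```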

Transformations

  • make InlineTable/TableLocator in MapValues optional to allow indicator variables for missing values
  • added example for multi-dimensional case with MapValues
  • DerivedFields can only have a list of valid values to define the order of ordinal fields. Value ranges for categorical or continuous fields are no longer possible.
  • dataType clarification
  • Clarify that Discretization is a mapping to discrete values, not strings like previously stated
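
To show why a MapValues without a table is useful, the Python sketch below emulates the lookup. It assumes that the attributes mapMissingTo and defaultValue supply the result for missing and unmatched inputs respectively, and represents a missing value as None. The function and parameter names are illustrative.

```python
def map_values(value, table=None, map_missing_to=None, default_value=None):
    """Emulate a single-column MapValues lookup.

    table is an optional dict from input value to output value.
    """
    if value is None:
        return map_missing_to  # input is missing
    if table is not None and value in table:
        return table[value]  # matching row found
    return default_value  # no table or no matching row


def missing_indicator(value):
    """Indicator variable: 1 if the input is missing, 0 otherwise."""
    return map_values(value, table=None, map_missing_to=1, default_value=0)
```

With no table present, every non-missing input falls through to the default, which gives exactly the 0/1 indicator.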

TreeModel

  • Confidences and missing value handling: added attributes defaultChild (to Node), missingValueStrategy and missingValuePenalty (both to TreeModel), and confidence (to ScoreDistribution). Note that missingValuePenalty applies only when surrogate rules or the missingValueStrategy defaultChild are used.
  • Clarify evaluation of xor operator
  • Correction: specify that field can also be a DerivedField in the LocalTransformations or TransformationDictionary
  • allow Nodes with only a single child
  • define noTrueChildStrategy to catch cases where no children can be chosen
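
How the new attributes interact during scoring can be sketched in Python. This is a simplified traversal under stated assumptions: predicates return True, False, or None for UNKNOWN; only the missingValueStrategy values none, lastPrediction, nullPrediction and defaultChild are handled; noTrueChildStrategy takes returnNullPrediction or returnLastPrediction; confidences and missingValuePenalty are left out. The Node class and helper names are hypothetical.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Optional


@dataclass
class Node:
    node_id: str
    predicate: Callable[[dict], Optional[bool]]  # None means UNKNOWN
    score: Optional[str] = None
    default_child: Optional[str] = None
    children: List["Node"] = field(default_factory=list)


def less_than(name, limit):
    """Build a predicate that is UNKNOWN when the field is missing."""
    def predicate(record):
        value = record.get(name)
        return None if value is None else value < limit
    return predicate


def score_tree(root, record, missing_value_strategy="none",
               no_true_child_strategy="returnNullPrediction"):
    """Walk the tree; the root predicate is assumed to be true."""
    node = root
    while node.children:
        chosen = None
        for child in node.children:
            result = child.predicate(record)
            if result is None:
                if missing_value_strategy == "lastPrediction":
                    return node.score
                if missing_value_strategy == "nullPrediction":
                    return None
                if missing_value_strategy == "defaultChild":
                    chosen = next((c for c in node.children
                                   if c.node_id == node.default_child), None)
                    break
                result = False  # strategy "none": treat UNKNOWN as false
            if result:
                chosen = child
                break
        if chosen is None:  # no child applies
            if no_true_child_strategy == "returnLastPrediction":
                return node.score
            return None
        node = chosen
    return node.score
```

A node with a single child is handled like any other: if that child's predicate is false, noTrueChildStrategy decides the result.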
