PMML Data Mining Schema
PMML 1.1 -- Mining Schema

Each model contains one mining schema, which lists the fields used in that model. These fields are a subset of the fields defined in the data dictionary. The mining schema holds information that is specific to one model, whereas the data dictionary holds data definitions that do not vary from model to model.
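The subset relationship can be checked mechanically. The following is a minimal Python sketch, not part of the PMML specification; the function name undefined_fields and the sample field names are hypothetical and chosen only for illustration.

```python
def undefined_fields(mining_fields, dictionary_fields):
    """Return mining-schema field names that are missing from the data dictionary."""
    known = set(dictionary_fields)
    # Preserve the order in which the mining schema lists its fields.
    return [name for name in mining_fields if name not in known]


# Hypothetical field names, for illustration only.
dictionary = ["age", "income", "region", "churn"]
schema = ["age", "income", "churn"]

missing = undefined_fields(schema, dictionary)
print(missing)  # an empty list means every schema field is in the dictionary
```

A non-empty result names the mining fields that have no definition in the data dictionary.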

 
<!ELEMENT MiningSchema (Extension*, MiningField+) >
<!ENTITY  % FIELD-USAGE-TYPE "(active | predicted | supplementary)" > 
		 
<!ENTITY  % OUTLIER-TREATMENT-METHOD "( asIs | asMissingValues | asExtremeValues ) " >

usageType

    active: field used as input (independent field)

    predicted: field whose value is predicted by the model

    supplementary: field holding additional descriptive information

Supplementary fields are not required to apply a model; they are provided as additional information for explanatory purposes. When a field has gone through preprocessing transformations before the model is built, an additional supplementary field is typically used to describe the statistics of the original field values.
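As an illustration of how a consumer might separate the three usage types, here is a small Python sketch. It is an assumption about how an application could organize the fields, not something the specification prescribes; the function group_by_usage and the field names are hypothetical. An omitted usageType is treated as "active", which is the default declared in the MiningField attribute list.

```python
from collections import defaultdict

USAGE_TYPES = ("active", "predicted", "supplementary")


def group_by_usage(fields):
    """Group (name, usageType) pairs by usage type.

    A usageType of None stands for an omitted attribute and falls back
    to the DTD default, "active".
    """
    groups = defaultdict(list)
    for name, usage in fields:
        usage = usage or "active"
        if usage not in USAGE_TYPES:
            raise ValueError(f"unknown usageType {usage!r} for field {name!r}")
        groups[usage].append(name)
    return {u: groups[u] for u in USAGE_TYPES}


# Hypothetical fields: log_income is a transformed input, and the original
# income is kept as a supplementary field for explanatory purposes.
fields = [
    ("age", None),
    ("log_income", "active"),
    ("income", "supplementary"),
    ("churn", "predicted"),
]
groups = group_by_usage(fields)
print(groups["active"])  # fields needed as input when applying the model
```

Only the fields in the "active" group have to be supplied to apply the model; the "supplementary" group can be ignored at scoring time.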

outliers

    asIs: field values treated at face value

    asMissingValues: outlier values are treated as if they were missing

    asExtremeValues: outlier values are changed to a specific high or low value defined in MiningField
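The three treatments can be sketched in Python as follows. This is a hedged illustration, not a reference implementation. It assumes that a value is an outlier when it lies outside the range given by the lowValue and highValue attributes of MiningField; the specification text ties those attributes to asExtremeValues only, so using the same bounds for asMissingValues is an assumption of this sketch. The function name treat_outlier is hypothetical, and None stands for a missing value.

```python
def treat_outlier(x, outliers="asIs", low_value=None, high_value=None):
    """Apply a MiningField outlier treatment to one numeric value.

    Assumption (not stated by the specification): x is an outlier when it
    lies below low_value or above high_value. A bound of None is not checked.
    """
    below = low_value is not None and x < low_value
    above = high_value is not None and x > high_value

    if outliers == "asIs":
        return x                      # value taken at face value
    if outliers == "asMissingValues":
        return None if (below or above) else x
    if outliers == "asExtremeValues":
        if below:
            return low_value          # if x < lowValue then x = lowValue
        if above:
            return high_value         # mirrored rule for the upper bound
        return x
    raise ValueError(f"unknown outlier treatment {outliers!r}")


print(treat_outlier(150, "asExtremeValues", low_value=0, high_value=100))
print(treat_outlier(150, "asMissingValues", low_value=0, high_value=100))
print(treat_outlier(150))
```

With bounds 0 and 100, the value 150 is clipped to 100 under asExtremeValues, becomes missing under asMissingValues, and is left unchanged under the default asIs.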

					
	<!ELEMENT MiningField (Extension*) > 
				 
	<!ATTLIST MiningField 
	     name                   %FIELD-NAME;                    #REQUIRED
	     usageType              %FIELD-USAGE-TYPE;              "active" 
	     outliers               %OUTLIER-TREATMENT-METHOD;      "asIs" 
	     lowValue               %NUMBER;                        #IMPLIED 
	     highValue              %NUMBER;                        #IMPLIED
	> 

    name: symbolic name of field, same as the name of some field in the data dictionary

    highValue and lowValue: used in conjunction with the %OUTLIER-TREATMENT-METHOD; value "asExtremeValues" as replacement values for records with outliers in this field: if x < lowValue then x = lowValue, and if x > highValue then x = highValue
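The following Python sketch reads MiningField elements with the standard library module xml.etree.ElementTree and fills in the attribute defaults declared above. ElementTree does not read the DTD, so the defaults are applied by hand. The XML fragment, its field names and bounds, and the function name read_mining_fields are hypothetical and serve only as an illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical fragment; field names and bounds are illustrative only.
FRAGMENT = """
<MiningSchema>
  <MiningField name="age" outliers="asExtremeValues" lowValue="0" highValue="100"/>
  <MiningField name="income" usageType="supplementary"/>
  <MiningField name="churn" usageType="predicted"/>
</MiningSchema>
"""

# Defaults taken from the MiningField attribute list.
DEFAULTS = {"usageType": "active", "outliers": "asIs"}


def read_mining_fields(xml_text):
    """Return one dict per MiningField, with DTD defaults filled in."""
    root = ET.fromstring(xml_text.strip())
    fields = []
    for elem in root.findall("MiningField"):
        field = dict(DEFAULTS)
        field.update(elem.attrib)     # explicit attributes override defaults
        # lowValue and highValue are #IMPLIED: absent unless written out.
        for key in ("lowValue", "highValue"):
            field[key] = float(field[key]) if key in field else None
        fields.append(field)
    return fields


for field in read_mining_fields(FRAGMENT):
    print(field["name"], field["usageType"], field["outliers"])
```

The name attribute is #REQUIRED, so the sketch reads it directly and does not supply a default for it.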


Conformance

  • outlier treatment 'asIs', i.e. the default value of the attribute outliers in MiningField, is in core; other options are not in core.