Data Mining Group - PMML Data Mining Schema

PMML 1.1 -- Mining Schema

Each model contains one mining schema, which lists the fields used in that model. These fields are a subset of the fields defined in the data dictionary. The mining schema contains information that is specific to a certain model, whereas the data dictionary contains data definitions that do not vary per model.
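The subset relationship can be checked mechanically. The following is a minimal sketch, not part of the specification: it assumes the names of the data dictionary fields have already been collected into a Python set, and it uses the MiningSchema and MiningField elements defined below. The function name undefined_fields and all field names are hypothetical.

```python
import xml.etree.ElementTree as ET


def undefined_fields(schema_xml, dictionary_names):
    """Return MiningField names that do not appear in the data dictionary."""
    schema = ET.fromstring(schema_xml.strip())
    return [
        field.get("name")
        for field in schema.findall("MiningField")
        if field.get("name") not in dictionary_names
    ]


# Hypothetical data dictionary names and mining schema, for illustration only.
dictionary_names = {"age", "income", "risk", "zip_code"}
schema_xml = """
<MiningSchema>
  <MiningField name="age"/>
  <MiningField name="income"/>
  <MiningField name="shoe_size"/>
</MiningSchema>
"""

missing = undefined_fields(schema_xml, dictionary_names)
```

An empty result means every mining field names a field in the data dictionary; here the made-up field shoe_size would be reported.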

 
<!ELEMENT MiningSchema (Extension*, MiningField+) >
<!ENTITY  % FIELD-USAGE-TYPE "(active | predicted | supplementary)" > 
		 
<!ENTITY  % OUTLIER-TREATMENT-METHOD "( asIs | asMissingValues | asExtremeValues ) " >

usageType

active: field used as input (independent field)

predicted: field whose value is predicted by the model

supplementary: field holding additional descriptive information

Supplementary fields are not required to apply a model; they are provided as additional information for explanatory purposes. When a field has gone through preprocessing transformations before a model is built, an additional supplementary field is typically used to describe the statistics of the original field values.
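As an illustration of how a consumer might read usageType, the sketch below groups the fields of a mining schema by usage. It is only an example under stated assumptions: the schema and its field names are made up, fields_by_usage is a hypothetical helper, and the parser used here does not read the DTD, so the default "active" is applied by hand.

```python
import xml.etree.ElementTree as ET

# Hypothetical mining schema; the field names are invented for illustration.
SCHEMA_XML = """
<MiningSchema>
  <MiningField name="age"/>
  <MiningField name="income" usageType="active"/>
  <MiningField name="risk" usageType="predicted"/>
  <MiningField name="income_raw" usageType="supplementary"/>
</MiningSchema>
"""


def fields_by_usage(schema_xml):
    """Group MiningField names by usageType, defaulting to "active"."""
    groups = {"active": [], "predicted": [], "supplementary": []}
    for field in ET.fromstring(schema_xml.strip()).findall("MiningField"):
        # ElementTree ignores DTD defaults, so the default is supplied here.
        usage = field.get("usageType", "active")
        groups[usage].append(field.get("name"))
    return groups


groups = fields_by_usage(SCHEMA_XML)
```

Only the fields in groups["active"] would be needed as input when applying the model; the supplementary ones can be skipped.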

outliers

asIs: field values are treated at face value

asMissingValues: outlier values are treated as if they were missing

asExtremeValues: outlier values are changed to a specific high or low value defined by the highValue and lowValue attributes of MiningField

					
	<!ELEMENT MiningField (Extension*) > 
				 
	<!ATTLIST MiningField 
	     name                   %FIELD-NAME;                    #REQUIRED
	     usageType              %FIELD-USAGE-TYPE;              "active" 
	     outliers               %OUTLIER-TREATMENT-METHOD;      "asIs" 
	     lowValue               %NUMBER;                        #IMPLIED 
	     highValue              %NUMBER;                        #IMPLIED
	>

name: symbolic name of field, same as the name of some field in the data dictionary

highValue and lowValue: used in conjunction with the %OUTLIER-TREATMENT-METHOD "asExtremeValues" as replacement values for records with outliers in this field: if x < lowValue then x = lowValue, and likewise if x > highValue then x = highValue
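The three outlier treatments can be sketched as a small function. This is an illustration rather than a normative algorithm, and it rests on assumptions that are not spelled out in the text above: a value counts as an outlier when it lies below lowValue or above highValue, the same bounds are used for "asMissingValues", a missing value is represented by None, and values above highValue are replaced by highValue. The name treat_outlier and the example field are hypothetical.

```python
import xml.etree.ElementTree as ET

METHODS = ("asIs", "asMissingValues", "asExtremeValues")


def treat_outlier(x, field):
    """Apply the outliers attribute of one MiningField element to value x.

    Returns None when the value is to be treated as missing.
    """
    method = field.get("outliers", "asIs")  # DTD default, applied by hand
    if method not in METHODS:
        raise ValueError(f"unknown outlier treatment: {method}")

    low = field.get("lowValue")
    high = field.get("highValue")
    low = float(low) if low is not None else None
    high = float(high) if high is not None else None

    # Assumption: an outlier is a value outside [lowValue, highValue].
    below = low is not None and x < low
    above = high is not None and x > high

    if method == "asIs" or not (below or above):
        return x
    if method == "asMissingValues":
        return None
    # asExtremeValues: replace the value with the nearest bound.
    return low if below else high


# Hypothetical field definition, for illustration only.
clamp = ET.fromstring(
    '<MiningField name="income" outliers="asExtremeValues" '
    'lowValue="0" highValue="100000"/>'
)
```

With this field, treat_outlier(-5.0, clamp) yields the lower bound and treat_outlier(42.0, clamp) leaves the value unchanged. A bound that is absent is simply not checked, which matches the #IMPLIED declaration of both attributes.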

Conformance

Outlier treatment 'asIs', i.e. the default value of the attribute outliers in MiningField, is in core; the other options are not in core.