|
NormContinuous defines how to normalize an input field. The attribute field must refer to a field in the data dictionary. If no LinearNorm elements are present, the input field is not normalized. |
|
The sequence of LinearNorm elements (LinearNorm*) defines the points of a piecewise linear interpolation function. The sequence must contain at least two elements. To simplify processing, the LinearNorm elements within NormContinuous must be strictly sorted by ascending value of 'orig'. Given two points (a1, b1) and (a2, b2) such that there is no other point (a3, b3) with a1 < a3 < a2, the normalized value is b1 + (x-a1)/(a2-a1)*(b2-b1) for a1 <= x <= a2. |
Missing input values are mapped to missing output. If the input value is not within the range [a1..an], it is treated as an outlier. The specific method for outlier treatment must be provided by the caller; e.g., an outlier could be mapped to a missing value, or it could be mapped to the minimal or maximal value. |
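The interpolation and outlier rules above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not part of PMML: the function name normalize_continuous, the representation of LinearNorm points as (orig, norm) tuples, the use of None for a missing value, and the outlier option names "as_missing" and "as_extreme" are all hypothetical choices made for this example.

```python
from bisect import bisect_right
from typing import Optional, Sequence, Tuple


def normalize_continuous(
    x: Optional[float],
    points: Sequence[Tuple[float, float]],
    outliers: str = "as_missing",
) -> Optional[float]:
    """Piecewise linear normalization over (orig, norm) points.

    `outliers` is a hypothetical caller-supplied policy:
    "as_missing" maps outliers to None, "as_extreme" maps them to the
    normalized value of the first or last point.
    """
    # Missing input is mapped to missing output.
    if x is None:
        return None
    if len(points) < 2:
        raise ValueError("at least two points are required")
    origs = [a for a, _ in points]
    # The points must be strictly sorted by ascending orig.
    if any(a2 <= a1 for a1, a2 in zip(origs, origs[1:])):
        raise ValueError("points must be strictly ascending in orig")

    lo, hi = origs[0], origs[-1]
    if x < lo or x > hi:
        # Outlier: the treatment is chosen by the caller.
        if outliers == "as_missing":
            return None
        if outliers == "as_extreme":
            return points[0][1] if x < lo else points[-1][1]
        raise ValueError(f"unknown outlier policy: {outliers}")

    # Find the segment [a1, a2] containing x; clamp so x == an uses the last segment.
    i = min(bisect_right(origs, x) - 1, len(points) - 2)
    (a1, b1), (a2, b2) = points[i], points[i + 1]
    return b1 + (x - a1) / (a2 - a1) * (b2 - b1)
```

For example, with the points (0, 0), (10, 1), (20, 3), the input 15 falls in the second segment and yields 1 + (15-10)/(20-10)*(3-1) = 2.0, while the input 25 is an outlier and yields either a missing value or 3.0, depending on the policy the caller selects.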
|
PMML 1.1 supports only one kind of discrete normalization; future versions could support other techniques such as thermometer encoding. Thermometer encoding can be used for ordinal values: for a given value v, the output is 1.0 if the value of input field f is greater than or equal to v, and 0.0 otherwise. Furthermore, there could also be a linear index mapping for ordinal values: given an ordering (a1, a2, ..., an), the normalized value for ai is the number i. |
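Because neither technique is defined in PMML 1.1, the following Python sketch only illustrates the idea described above. The function names thermometer_encode and linear_index, the choice to emit one output per value v of the ordering, and the use of a plain list to represent the ordering are assumptions made for this example.

```python
from typing import List, Sequence


def thermometer_encode(value: str, ordering: Sequence[str]) -> List[float]:
    """Return one output per value v in the ordering:
    1.0 if `value` is greater than or equal to v, otherwise 0.0."""
    rank = ordering.index(value)  # raises ValueError for unknown values
    return [1.0 if rank >= i else 0.0 for i in range(len(ordering))]


def linear_index(value: str, ordering: Sequence[str]) -> int:
    """Map the value ai of the ordering (a1, ..., an) to its 1-based index i."""
    return ordering.index(value) + 1
```

With the ordering (low, medium, high), the value medium is greater than or equal to low and medium but not high, so its thermometer encoding is 1.0, 1.0, 0.0, and its linear index is 2.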
|