PMML 3.1 - Built-in functions
Almost all programming languages come with a set of predefined functions that perform low-level operations. PMML has a similar set of functions.
- +, -, * and /
- min, max, sum and avg
- log10, ln, sqrt, abs, exp, pow, threshold, floor, ceil, round
- uppercase
- substring
- trimBlanks
- formatNumber
- formatDatetime
- dateDaysSinceYear
- dateSecondsSinceYear
- dateSecondsSinceMidnight
The definitions of functions in PMML generally follow the design of functions and operators in XQuery. Further ideas are taken from MathML, XPath, and Java date formats.
+, -, * and /
Functions for simple arithmetic.
Pseudo-declaration of PMML built-in function +:
<DefineFunction name="+" optype="continuous"> <ParameterField name="a" optype="continuous"/> <ParameterField name="b" optype="continuous"/> ... implementation built-in ... </DefineFunction>
Example: Return the difference between the input fields named A and B.
<Apply function="-"> <FieldRef field="A"/> <FieldRef field="B"/> </Apply>
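Arithmetic functions can be nested, because an Apply element may itself serve as an argument of another Apply. The following is a minimal sketch, not taken from the specification, that computes the mean of the two fields A and B as (A + B) / 2. The constant is declared as a double here on the assumption that this avoids any integer-division behaviour in a consumer.

```xml
<!-- (A + B) / 2: the inner Apply supplies the first argument of "/" -->
<Apply function="/">
  <Apply function="+">
    <FieldRef field="A"/>
    <FieldRef field="B"/>
  </Apply>
  <Constant dataType="double">2.0</Constant>
</Apply>
```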
min, max, sum and avg
Returns an aggregation of a variable number of input fields.
Pseudo-declaration of PMML built-in function min:
<DefineFunction name="min" optype="continuous"> The function takes a variable number of <FieldRef/> as parameters ... implementation built-in ... </DefineFunction>
Example: Return the minimum value of input fields named A, B, and C.
<Apply function="min"> <FieldRef field="A"/> <FieldRef field="B"/> <FieldRef field="C"/> </Apply>
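Aggregations combine with the arithmetic functions in the same way. As an illustrative sketch, the range of the three fields A, B, and C (largest value minus smallest value) could be written as follows. The DerivedField name ValueRange is hypothetical.

```xml
<DerivedField name="ValueRange" optype="continuous">
  <!-- max(A, B, C) - min(A, B, C) -->
  <Apply function="-">
    <Apply function="max">
      <FieldRef field="A"/>
      <FieldRef field="B"/>
      <FieldRef field="C"/>
    </Apply>
    <Apply function="min">
      <FieldRef field="A"/>
      <FieldRef field="B"/>
      <FieldRef field="C"/>
    </Apply>
  </Apply>
</DerivedField>
```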
log10, ln, sqrt, abs, exp, pow, threshold, floor, ceil, round
Further mathematical functions.
Pseudo-declaration of PMML built-in function log10:
<DefineFunction name="log10" optype="continuous"> <ParameterField name="x" optype="continuous"/> ... implementation built-in ... </DefineFunction>
Example: Return the logarithm to the base 10 of an input field A.
<Apply function="log10"> <FieldRef field="A"/> </Apply>
Pseudo-declaration of PMML built-in functions pow and floor:
<DefineFunction name="pow" optype="continuous"> <ParameterField name="x" optype="continuous"/> <ParameterField name="y" optype="continuous"/> ... implementation built-in ... </DefineFunction>
<DefineFunction name="floor" dataType="integer"> <ParameterField name="x" optype="continuous"/> ... implementation built-in ... </DefineFunction>
Example: Return the cube of an input field A.
<Apply function="pow"> <FieldRef field="A"/> <Constant dataType="integer">3</Constant> </Apply>
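The mathematical functions compose with the arithmetic ones. The sketch below, which is an illustration rather than part of the specification, computes the Euclidean length sqrt(A^2 + B^2) of two input fields A and B using pow, + and sqrt.

```xml
<!-- sqrt(pow(A, 2) + pow(B, 2)) -->
<Apply function="sqrt">
  <Apply function="+">
    <Apply function="pow">
      <FieldRef field="A"/>
      <Constant dataType="integer">2</Constant>
    </Apply>
    <Apply function="pow">
      <FieldRef field="B"/>
      <Constant dataType="integer">2</Constant>
    </Apply>
  </Apply>
</Apply>
```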
uppercase
Returns a string where all lowercase characters in the input string are replaced by their uppercase variants.
Pseudo-declaration of PMML built-in function uppercase:
<DefineFunction name="uppercase" dataType="string"> <ParameterField name="input" dataType="string"/> ... implementation built-in ... </DefineFunction>
Example: Return the field Str with all characters in upper case.
<Apply function="uppercase"> <FieldRef field="Str"/> </Apply>
substring
Extracts a substring from an input string.
Pseudo-declaration of PMML built-in function substring:
<DefineFunction name="substring" dataType="string"> <ParameterField name="input" dataType="string"/> <ParameterField name="startPos" dataType="integer"/> <ParameterField name="length" dataType="integer"/> ... See XQuery fn:substring ... </DefineFunction>
startPos and length must be positive integers.
The first character of a string is located at position 1 (not position 0).
Example: Return the 3 characters of field Str beginning at position 2.
<Apply function="substring"> <FieldRef field="Str"/> <Constant dataType="integer">2</Constant> <Constant dataType="integer">3</Constant> </Apply>
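Because positions are counted from 1, extracting the first character of a string uses startPos 1 and length 1. A minimal sketch follows; the DerivedField name StrInitial is hypothetical.

```xml
<DerivedField name="StrInitial" optype="categorical">
  <!-- first character of Str: startPos = 1, length = 1 -->
  <Apply function="substring">
    <FieldRef field="Str"/>
    <Constant dataType="integer">1</Constant>
    <Constant dataType="integer">1</Constant>
  </Apply>
</DerivedField>
```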
trimBlanks
Returns a string where leading and trailing blanks in the input string are removed. Note that trailing blanks in PMML, by definition, are not significant when strings are compared.
Pseudo-declaration of PMML built-in function trimBlanks:
<DefineFunction name="trimBlanks" dataType="string"> <ParameterField name="input" dataType="string"/> ... implementation built-in ... </DefineFunction>
Example: Trim blanks of field Str.
<Apply function="trimBlanks"> <FieldRef field="Str"/> </Apply>
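String functions can be chained like any other Apply expressions. As a sketch of one common use, the following normalizes the field Str by trimming blanks and then converting the result to upper case. The DerivedField name StrNormalized is hypothetical.

```xml
<DerivedField name="StrNormalized" optype="categorical">
  <!-- uppercase(trimBlanks(Str)) -->
  <Apply function="uppercase">
    <Apply function="trimBlanks">
      <FieldRef field="Str"/>
    </Apply>
  </Apply>
</DerivedField>
```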
formatNumber
Formats numbers according to a pattern. The pattern uses POSIX conversion specifiers as used, e.g., in the C function printf.
Pseudo-declaration of PMML built-in function formatNumber:
<DefineFunction name="formatNumber" dataType="string"> <ParameterField name="input" optype="continuous"/> <ParameterField name="pattern" dataType="string"/> ... implementation built-in ... </DefineFunction>
Example: Convert a number in the field Num into a string of length 3 with leading blanks.
<Apply function="formatNumber"> <FieldRef field="Num"/> <Constant>%3d</Constant> </Apply>
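Other printf-style patterns can be passed in the same way. The sketch below assumes that the consuming application honours the precision specifier as C printf does, in which case %.2f renders a value with two decimal places. The field name Price and the DerivedField name PriceText are hypothetical.

```xml
<DerivedField name="PriceText" optype="categorical">
  <!-- pattern %.2f: two digits after the decimal point in printf terms -->
  <Apply function="formatNumber">
    <FieldRef field="Price"/>
    <Constant>%.2f</Constant>
  </Apply>
</DerivedField>
```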
formatDatetime
Formats a date and time value according to a pattern. The pattern uses POSIX descriptors as used, e.g., in the C function strftime or the Unix command date. See, e.g., the POSIX datetime descriptors.
Pseudo-declaration of PMML built-in function formatDatetime:
<DefineFunction name="formatDatetime" optype="categorical"> <ParameterField name="input" optype="ordinal"/> <ParameterField name="pattern" dataType="string"/> ... implementation built-in ... </DefineFunction>
input must be of datatype date, time, or dateTime.
Example: Format a date value as 'Month/Day/Year'.
<DerivedField name="StartDateUS" optype="categorical"> <Apply function="formatDatetime"> <FieldRef field="StartDate"/> <Constant>%m/%d/%y</Constant> </Apply> </DerivedField>
With StartDate being the date August 20th, 2004, the result is StartDateUS="08/20/04".
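Changing the pattern changes the layout of the result. As a sketch, assuming the consumer follows strftime semantics, the pattern %Y-%m-%d uses a four-digit year and would render August 20th, 2004 as 2004-08-20. The DerivedField name StartDateISO is hypothetical.

```xml
<DerivedField name="StartDateISO" optype="categorical">
  <!-- %Y = four-digit year, %m = month (01-12), %d = day of month (01-31) -->
  <Apply function="formatDatetime">
    <FieldRef field="StartDate"/>
    <Constant>%Y-%m-%d</Constant>
  </Apply>
</DerivedField>
```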
dateDaysSinceYear
Function for transforming dates into integers.
Pseudo-declaration of PMML built-in function dateDaysSinceYear:
<DefineFunction name="dateDaysSinceYear" optype="continuous"> <ParameterField name="input" optype="ordinal"/> <ParameterField name="referenceYear" optype="continuous"/> </DefineFunction>
input must be of datatype date or dateTime.
Example: Calculate days since 1970.
<DerivedField name="PurchaseDateDays" optype="continuous"> <Apply function="dateDaysSinceYear"> <FieldRef field="PurchaseDate"/> <Constant>1970</Constant> </Apply> </DerivedField>
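Once dates are expressed as day counts against the same reference year, they can be subtracted. The following sketch computes the number of days between two dates. The field names OrderDate and ShipDate and the DerivedField name DaysToShip are hypothetical, and any common reference year would serve equally well.

```xml
<DerivedField name="DaysToShip" optype="continuous">
  <!-- days(ShipDate) - days(OrderDate), both counted from the same year -->
  <Apply function="-">
    <Apply function="dateDaysSinceYear">
      <FieldRef field="ShipDate"/>
      <Constant dataType="integer">1970</Constant>
    </Apply>
    <Apply function="dateDaysSinceYear">
      <FieldRef field="OrderDate"/>
      <Constant dataType="integer">1970</Constant>
    </Apply>
  </Apply>
</DerivedField>
```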
dateSecondsSinceYear
Function for transforming dates into integers.
Pseudo-declaration of PMML built-in function dateSecondsSinceYear:
<DefineFunction name="dateSecondsSinceYear" optype="continuous"> <ParameterField name="input" optype="ordinal"/> <ParameterField name="referenceYear" optype="continuous"/> </DefineFunction>
input must be of datatype date or dateTime. If input is of datatype date, the time is assumed to be 00:00:00 on that date.
Example: Create a new field PurchaseDateSeconds from the PurchaseDate attribute relative to the year 1970.
<DerivedField name="PurchaseDateSeconds" optype="continuous"> <Apply function="dateSecondsSinceYear"> <FieldRef field="PurchaseDate"/> <Constant>1970</Constant> </Apply> </DerivedField>
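The second count can be rescaled with the arithmetic functions. As an illustrative sketch, dividing by 3600 expresses the same offset in hours. The DerivedField name PurchaseDateHours is hypothetical.

```xml
<DerivedField name="PurchaseDateHours" optype="continuous">
  <!-- seconds since the start of 1970, divided by 3600 seconds per hour -->
  <Apply function="/">
    <Apply function="dateSecondsSinceYear">
      <FieldRef field="PurchaseDate"/>
      <Constant dataType="integer">1970</Constant>
    </Apply>
    <Constant dataType="double">3600.0</Constant>
  </Apply>
</DerivedField>
```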
dateSecondsSinceMidnight
Function for transforming time values into integers.
Pseudo-declaration of PMML built-in function dateSecondsSinceMidnight:
<DefineFunction name="dateSecondsSinceMidnight" optype="continuous"> <ParameterField name="input" optype="ordinal"/> </DefineFunction>
input must be of datatype time or dateTime.
Example: Create a new field PurchaseDateSeconds from the PurchaseDate attribute relative to midnight.
<DerivedField name="PurchaseDateSeconds" optype="continuous"> <Apply function="dateSecondsSinceMidnight"> <FieldRef field="PurchaseDate"/> </Apply> </DerivedField>
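A typical use of the seconds-since-midnight value is to derive the hour of the day. The sketch below divides by 3600 and applies floor, which is declared above as returning an integer. It assumes PurchaseDate is of datatype dateTime, and the DerivedField name PurchaseHour is hypothetical.

```xml
<DerivedField name="PurchaseHour" optype="continuous">
  <!-- floor(secondsSinceMidnight / 3600) gives the hour of day, 0 to 23 -->
  <Apply function="floor">
    <Apply function="/">
      <Apply function="dateSecondsSinceMidnight">
        <FieldRef field="PurchaseDate"/>
      </Apply>
      <Constant dataType="double">3600.0</Constant>
    </Apply>
  </Apply>
</DerivedField>
```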