PMML 4.4.1 - Built-in functions

Almost all programming languages come with a set of predefined functions that perform low-level operations. PMML has a similar set of functions.

  1. +, -, * and /
  2. min, max, sum, avg, median, product
  3. log10, ln, sqrt, abs, exp, pow, threshold, floor, ceil, round, modulo
  4. isMissing, isNotMissing, isValid, isNotValid
  5. equal, notEqual, lessThan, lessOrEqual, greaterThan, greaterOrEqual
  6. and, or
  7. not
  8. isIn, isNotIn
  9. if
  10. uppercase
  11. lowercase
  12. stringLength
  13. substring
  14. trimBlanks
  15. concat
  16. replace
  17. matches
  18. formatNumber
  19. formatDatetime
  20. dateDaysSinceYear
  21. dateSecondsSinceYear
  22. dateSecondsSinceMidnight
  23. normalCDF, normalPDF, stdNormalCDF, stdNormalPDF, erf, normalIDF, stdNormalIDF
  24. expm1, hypot, ln1p, rint
  25. sin, asin, sinh, cos, acos, cosh, tan, atan, tanh

The definitions of functions in PMML generally follow the design of functions and operators in XQuery. Further ideas are taken from MathML, XPath, and Java date formats.

Reference is made herein to the constants NaN, INF, and -INF. They are defined in Transformations.html. Except as noted below, any missing inputs to a built-in function will result in a missing value being returned.

Arithmetic Functions

+, -, * and /

Functions for simple arithmetic.

Pseudo-declaration of PMML built-in function +

  <DefineFunction name="+" optype="continuous">
    <ParameterField name="a" optype="continuous"/>
    <ParameterField name="b" optype="continuous"/>
    ... implementation built-in ...
  </DefineFunction>

The functions -, *, and / are defined in the same way and have two parameters. Functions for adding or multiplying more than two parameters are defined later in this document.

Pseudo-declaration of PMML built-in function -

  <DefineFunction name="-" optype="continuous">
    <ParameterField name="a" optype="continuous"/>
    <ParameterField name="b" optype="continuous"/>
    ... implementation built-in ...
  </DefineFunction>

Example: Return the difference between input fields named A and B.

<Apply function="-">
  <FieldRef field="A"/>
  <FieldRef field="B"/>
</Apply>

Assuming A=2.5 and B=4 the result corresponding to this Apply element is -1.5.

Division by Zero

  • If a positive number is divided by zero, INF will be returned.
  • If zero is divided by zero, the result is indeterminate and a missing value will be returned.
  • If a negative number is divided by zero, -INF will be returned.
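
The division-by-zero rules above can be illustrated with a minimal sketch. The field name A is reused from the earlier examples, and the constant divisor is chosen purely for illustration.

```xml
<!-- Illustrative only: divide field A by a constant zero -->
<Apply function="/">
  <FieldRef field="A"/>
  <Constant dataType="double">0</Constant>
</Apply>
```

Under the rules listed above, A=2.5 yields INF, A=-2.5 yields -INF, and A=0 yields a missing value.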

min, max, sum, avg, median, product

Return an aggregation of a variable number of input fields.

Pseudo-declaration of PMML built-in function min

  <DefineFunction name="min" optype="continuous">
    The function takes a variable number
    of <FieldRef/> as parameters
    ... implementation built-in ...
  </DefineFunction>

The aggregation functions max, sum, avg, median, and product are defined in the same way. Note that the number of input parameters is variable but these functions do not aggregate values coming from multiple input records.

Example: Return the minimum value of input fields named A, B, and C.

<Apply function="min">
  <FieldRef field="A"/>
  <FieldRef field="B"/>
  <FieldRef field="C"/>
</Apply>

Assuming A=2.5, B=4, and C=1.5, the result corresponding to this Apply element is 1.5.

Missing values in the input to an aggregate function are simply ignored. This affects how avg is computed in particular: in the above example, if B is a missing value, the result of applying avg to A, B, and C is 2, the mean of the two remaining values. If all inputs are missing, the result is a missing value.
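
As a sketch of the avg behavior just described, the same three fields can be passed to avg instead of min. Nothing here goes beyond the rules stated in this section; the field values are the ones assumed above.

```xml
<!-- Illustrative only: average of A, B and C; missing inputs are ignored -->
<Apply function="avg">
  <FieldRef field="A"/>
  <FieldRef field="B"/>
  <FieldRef field="C"/>
</Apply>
```

With A=2.5, B=4, and C=1.5 the result is approximately 2.667. If B is missing, only A and C are averaged and the result is 2.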

log10, ln, sqrt, abs, exp, pow, threshold, floor, ceil, round, modulo

Further mathematical functions.

Pseudo-declaration of PMML built-in function log10

  <DefineFunction name="log10" optype="continuous">
    <ParameterField name="x" optype="continuous"/>
    ... implementation built-in ...
  </DefineFunction>

The function log10 returns the logarithm to the base 10. The functions ln (natural log), sqrt (square root), abs (absolute value), exp (exponential) are defined in the same way. Semantics are as usual. See also MathML.

The logarithm of a negative number is undefined, so NaN should be returned in such a case. The logarithm of INF is to be taken as INF. The logarithm of 0 is to be taken as -INF. The square root of INF is to be taken as INF. The square root of a negative number is to be taken as NaN since complex numbers are not supported by PMML. The exponential of INF is to be taken as INF. The exponential of -INF is to be taken as 0.

Example: Return the logarithm to the base 10 of an input field A.

<Apply function="log10">
  <FieldRef field="A"/>
</Apply>

Assuming A=2.5 the result corresponding to this Apply element is approximately 0.397940008672038.

Pseudo-declaration of PMML built-in functions pow and floor

  <DefineFunction name="pow" optype="continuous">
    <ParameterField name="x" optype="continuous"/>
    <ParameterField name="y" optype="continuous"/>
    ... implementation built-in ...
  </DefineFunction>

  <DefineFunction name="floor" dataType="integer">
    <ParameterField name="x" optype="continuous"/>
    ... implementation built-in ...
  </DefineFunction>

The function pow(x,y) returns the number x raised to the power y; if x and y are both zero then pow will return one. The function threshold(x,y) returns one if x>y and zero otherwise. Functions floor, ceil, and round return an integer obtained by rounding the numeric argument down, up, and to the closest integer respectively. See section on respective functionality in Targets for examples.

The behavior of pow with regard to infinite numbers is as detailed in Transformations.html. Since INF and -INF can only be floating point numbers, they will cause floor, ceil, round, and modulo to return an invalid value.
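
The threshold function described above has no example of its own in this section, so here is a minimal sketch. The field name A and the cut-off value 0.5 are assumptions made for illustration.

```xml
<!-- Illustrative only: returns 1 if A > 0.5, otherwise 0 -->
<Apply function="threshold">
  <FieldRef field="A"/>
  <Constant dataType="double">0.5</Constant>
</Apply>
```

Assuming A=0.7 the result is 1. Assuming A=0.5 the result is 0, because the comparison x>y is strict.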

Example: Return the cube of an input field A.

<Apply function="pow">
  <FieldRef field="A"/>
  <Constant dataType="integer">3</Constant>
</Apply>

Assuming A=5.0 the result corresponding to this Apply element is 125.0.

Function modulo(x,y) takes two numbers x and y as inputs and returns the remainder after x is divided by y, computed according to XQuery rules described in https://www.w3.org/TR/xquery-operators/#func-numeric-mod. Operations with negative numbers yield diverging results in different programming languages. The PMML standard follows the semantics of Python when using the % operator. The formula

result = x - Math.floor(x/y)*y

can be used to obtain these results for positive and negative numbers. Thus, modulo(11,3) returns 2, modulo(-17.2,0.5) returns 0.3, modulo(9,-7) returns -5 and modulo(-4,-9) returns -4.
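
One of the cases listed above, modulo(9,-7), can be written as an Apply element. This is a sketch only; the field name A is an assumption.

```xml
<!-- Illustrative only: remainder of A divided by -7 -->
<Apply function="modulo">
  <FieldRef field="A"/>
  <Constant dataType="integer">-7</Constant>
</Apply>
```

Assuming A=9, the floor-based formula gives 9 - floor(9/-7)*(-7) = 9 - (-2)*(-7) = -5, so the result takes the sign of the divisor.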

Boolean Functions

Return true or false. The result depends on applying one of these functions to a single input parameter.

isMissing, isNotMissing, isValid, isNotValid

In PMML, a field value can have one of three states:

  • Valid
  • Invalid
  • Missing

The functions in this section detect these three states.

Pseudo-declaration of PMML built-in function isMissing

  <DefineFunction name="isMissing" optype="categorical" dataType="boolean">
    <ParameterField name="input"/>
    ... implementation built-in ...
  </DefineFunction>

Example: Check if field Str is missing. If so, return true, else false.

<Apply function="isMissing">
  <FieldRef field="Str"/>
</Apply>

Pseudo-declaration of PMML built-in function isNotMissing

  <DefineFunction name="isNotMissing" optype="categorical" dataType="boolean">
    <ParameterField name="input"/>
    ... implementation built-in ...
  </DefineFunction>

Example: Check if field Str is missing. If so, return false, else true.

<Apply function="isNotMissing">
  <FieldRef field="Str"/>
</Apply>

An invalid value is not missing. Therefore, if Str is invalid, the above will return true.

Pseudo-declaration of PMML built-in function isValid

  <DefineFunction name="isValid" optype="categorical" dataType="boolean">
    <ParameterField name="field"/>
    ... implementation built-in ...
  </DefineFunction>

Example: Check if the value of field A is valid. If so, return true, otherwise false.

<Apply function="isValid">
  <FieldRef field="A"/>
</Apply>

In the above example, a missing value will cause false to be returned.

Pseudo-declaration of PMML built-in function isNotValid

  <DefineFunction name="isNotValid" optype="categorical" dataType="boolean">
    <ParameterField name="field"/>
    ... implementation built-in ...
  </DefineFunction>

Example: Check if the value of field A is invalid. If so, return true, otherwise false.

<Apply function="isNotValid">
  <FieldRef field="A"/>
</Apply>

In the above example, a missing value will cause false to be returned.

If the invalidValueTreatment specified in the mining schema for a particular field is set to asMissing, these functions will treat the affected invalid values as missing rather than as invalid. Thus, in such cases, isMissing will return true and isNotValid will return false.

The behavior of the above described functions is summarized in the following table:

Function      Missing  Valid  Invalid
isMissing     true     false  false
isNotMissing  false    true   true
isValid       false    true   false
isNotValid    false    false  true

equal, notEqual, lessThan, lessOrEqual, greaterThan, greaterOrEqual

Further boolean functions.

Return true or false. The result depends on applying one of these functions to two input parameters.

Pseudo-declaration of PMML built-in function lessThan

  <DefineFunction name="lessThan" optype="categorical" dataType="boolean">
    <ParameterField name="x"/>
    <ParameterField name="y"/>
    ... implementation built-in ...
  </DefineFunction>

Example: Check if field A is less than field B. If so, return true, else false.

<Apply function="lessThan">
  <FieldRef field="A"/>
  <FieldRef field="B"/>
</Apply>

By definition, INF is taken as greater than any finite number or -INF, and -INF is taken as less than any finite number or INF. However, any attempt to compare either with itself is indeterminate and will result in a missing value being returned.

and, or

Further boolean functions.

Evaluate the results of two or more boolean values.

and
true only if all input values are true, false otherwise.
or
true if at least one input value is true, false only if all input values are false.

Pseudo-declaration of PMML built-in function and

  <DefineFunction name="and" optype="categorical" dataType="boolean">
    The function takes a variable number
    of fields as parameters
    ... implementation built-in ...
  </DefineFunction>

Example: Check if field A is less than 3 and field B is less than 4. If so, return true, else false.

<Apply function="and">
  <Apply function="lessThan">
    <FieldRef field="A"/>
    <Constant dataType="integer">3</Constant>
  </Apply>
  <Apply function="lessThan">
    <FieldRef field="B"/>
    <Constant dataType="integer">4</Constant>
  </Apply>
</Apply>

not

Further boolean function.

Negates input boolean value.

Pseudo-declaration of PMML built-in function not

  <DefineFunction name="not" optype="categorical" dataType="boolean">
    <ParameterField name="x" dataType="boolean"/>
    ... implementation built-in ...
  </DefineFunction>

Example: Check if field A is not less than B (i.e., greater than or equal to B). If so, return true, else false.

<Apply function="not">
  <Apply function="lessThan">
    <FieldRef field="A"/>
    <FieldRef field="B"/>
  </Apply>
</Apply>

isIn, isNotIn

Further boolean functions.

Evaluates if a field value is contained in a given list of values.

isIn
True if the field value is contained in list of values.
isNotIn
True if the field value is not contained in list of values.

Pseudo-declaration of PMML built-in function isIn

  <DefineFunction name="isIn" optype="categorical" dataType="boolean">
    <ParameterField name="x"/>
    The list takes a variable number
    of fields as parameters
    ... implementation built-in ...
  </DefineFunction>

Example: Check if field color is in (red, green, blue). If so, return true, else false.

<Apply function="isIn">
  <FieldRef field="color"/>
  <Constant dataType="string">red</Constant>
  <Constant dataType="string">green</Constant>
  <Constant dataType="string">blue</Constant>
</Apply>

if

Implements IF-THEN-ELSE logic. The ELSE part is optional. If the ELSE part is absent and the boolean value is false then a missing value is returned.

Pseudo-declaration of PMML built-in function if

  <DefineFunction name="if">
    <ParameterField name="x" dataType="boolean"/>
    <ParameterField name="A"/>  <!-- THEN part is required -->
    <ParameterField name="B"/>  <!-- ELSE part is optional -->
    ... implementation built-in ...
  </DefineFunction>

Example: Check if field color is in (red, green, blue). If so, return "primary", else "other".

<Apply function="if">
  <Apply function="isIn">
    <FieldRef field="color"/>
    <Constant dataType="string">red</Constant>
    <Constant dataType="string">green</Constant>
    <Constant dataType="string">blue</Constant>
  </Apply>
  <Constant dataType="string">primary</Constant>
  <Constant dataType="string">other</Constant>
</Apply>
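
The optional ELSE part can be sketched in the same way. In the following illustrative fragment the ELSE part is omitted; the field name A and the choice of ln as the THEN expression are assumptions.

```xml
<!-- Illustrative only: IF A > 0 THEN ln(A); there is no ELSE part -->
<Apply function="if">
  <Apply function="greaterThan">
    <FieldRef field="A"/>
    <Constant dataType="integer">0</Constant>
  </Apply>
  <Apply function="ln">
    <FieldRef field="A"/>
  </Apply>
</Apply>
```

Assuming A=1 the result is 0. Assuming A=-2 the condition is false and, because the ELSE part is absent, a missing value is returned.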

String Functions

uppercase

Returns a string where all lowercase characters in the input string are replaced by their uppercase variants.

Pseudo-declaration of PMML built-in function uppercase

  <DefineFunction name="uppercase" optype="categorical" dataType="string">
    <ParameterField name="input" dataType="string"/>
    ... implementation built-in ...
  </DefineFunction>

The function uppercase uses the Unicode definitions for classifying characters as uppercase / lowercase. See XQuery fn:upper-case.

Example: Return the field Str with all characters in upper case.

<Apply function="uppercase">
  <FieldRef field="Str"/>
</Apply>

Assuming Str="aBc9" the result corresponding to this Apply element is "ABC9".

lowercase

Returns a string where all uppercase characters in the input string are replaced by their lowercase variants.

Pseudo-declaration of PMML built-in function lowercase:

  <DefineFunction name="lowercase" optype="categorical" dataType="string">
    <ParameterField name="input" dataType="string"/>
    ... implementation built-in ...
  </DefineFunction>

The function lowercase uses the Unicode definitions for classifying characters as uppercase / lowercase. See XQuery fn:lower-case.

Example: Return the field Str with all characters in lower case.

<Apply function="lowercase">
  <FieldRef field="Str"/>
</Apply>

Assuming Str="aBc9" the result corresponding to this Apply element is "abc9".

stringLength

Returns the string length for an input string.

Pseudo-declaration of PMML built-in function stringLength

  <DefineFunction name="stringLength" optype="continuous" dataType="integer">
    <ParameterField name="input" dataType="string"/>
    ...
    See XQuery fn:string-length
    ...
  </DefineFunction>

Example: Return the length of string in the field Str.

<Apply function="stringLength">
  <FieldRef field="Str"/>
</Apply>

Assuming Str="aBc9x" the result corresponding to this Apply element is 5.

substring

Extracts a substring from an input string.

Pseudo-declaration of PMML built-in function substring

  <DefineFunction name="substring" optype="categorical" dataType="string">
    <ParameterField name="input" dataType="string"/>
    <ParameterField name="startPos" dataType="integer"/> 
    <ParameterField name="length" dataType="integer"/>  
    ...
    See XQuery fn:substring
    ...
  </DefineFunction>

startPos and length must be positive integers. The first character of a string is located at position 1 (not position 0).

Example: Return the 3 characters of field Str beginning at position 2.

<Apply function="substring">
  <FieldRef field="Str"/>
  <Constant dataType="integer">2</Constant>
  <Constant dataType="integer">3</Constant>
</Apply>

Assuming Str="aBc9x" the result corresponding to this Apply element is "Bc9".

trimBlanks

Returns a string where leading and trailing blanks in the input string are removed. Note that trailing blanks in PMML, by definition, are not significant when strings are compared.

Pseudo-declaration of PMML built-in function trimBlanks:

  <DefineFunction name="trimBlanks" optype="categorical" dataType="string">
    <ParameterField name="input" dataType="string"/>
    ... implementation built-in ...
  </DefineFunction>

Blanks include tab and newline characters. Use definitions according to Unicode.

Example: Trim blanks of field Str.

<Apply function="trimBlanks">
  <FieldRef field="Str"/>
</Apply>

Assuming Str=" aBc9x " the result corresponding to this Apply element is "aBc9x".

concat

Returns a string as a result of the concatenation of two or more parameters.

Pseudo-declaration of PMML built-in function concat:

  <DefineFunction name="concat" optype="categorical" dataType="string">
    <ParameterField name="x"/>
    <ParameterField name="y"/>
    ...
    See XQuery fn:concat
    ...
  </DefineFunction>

Example: Concatenate field month, constant value "-", and field year.

<Apply function="concat">
  <FieldRef field="month"/>
  <Constant>-</Constant>
  <FieldRef field="year"/>
</Apply>

Assuming month=2 and year=2000 the result corresponding to this Apply element is "2-2000".

replace

Replaces each substring in a given input string that matches a given pattern or regular expression by another string. Note that for regular expressions, PMML follows the specification implemented in the PCRE (Perl Compatible Regular Expressions) library.

Pseudo-declaration of PMML built-in function replace

  <DefineFunction name="replace" optype="categorical" dataType="string">
    <ParameterField name="input" dataType="string"/>
    <ParameterField name="pattern" dataType="string"/>
    <ParameterField name="replacement" dataType="string"/>
    ...
    See XQuery fn:replace
    ...
  </DefineFunction>

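Example: Replace a sequence of "B" letters with the letter "c". For the input "BBBB" used here, the pattern B+ matches the whole string, so the result is "c".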

<Apply function="replace">
  <Constant>BBBB</Constant>
  <Constant>B+</Constant>
  <Constant>c</Constant>
</Apply>

matches

Attempts to match a pattern or regular expression against a given string. It returns a Boolean: true if a match is found or false if not. Note that for regular expressions, PMML follows the specification implemented in the PCRE (Perl Compatible Regular Expressions) library.

Pseudo-declaration of PMML built-in function matches

  <DefineFunction name="matches" optype="categorical" dataType="boolean">
    <ParameterField name="input" dataType="string"/>
    <ParameterField name="pattern" dataType="string"/>
    ...
    See XQuery fn:matches
    ...
  </DefineFunction>

Example: Attempt to match pattern "ar?y" against the value of field month.

<Apply function="matches">
  <FieldRef field="month"/>
  <Constant>ar?y</Constant>
</Apply>

If month is "January", "February", or "May", the result corresponding to this Apply element is true. For any other month, the result is false.

formatNumber

Formats numbers according to a pattern. The pattern uses the POSIX descriptors as used, e.g., in the C function printf.

Pseudo-declaration of PMML built-in function formatNumber

  <DefineFunction name="formatNumber" optype="categorical" dataType="string">
    <ParameterField name="input" optype="continuous"/>
    <ParameterField name="pattern" dataType="string"/>
    ... implementation built-in ...
  </DefineFunction>

Example: Convert a number in the field Num into a string of length 3 with leading blanks.

<Apply function="formatNumber">
  <FieldRef field="Num"/>
  <Constant>%3d</Constant>
</Apply>

Assuming Num=2 the result corresponding to this Apply element is the string "  2" (two leading blanks followed by the digit 2).

formatDatetime

Formats a date and time value according to a pattern. The pattern uses POSIX descriptors as used, e.g., in the C function strftime or the Unix command date.

Pseudo-declaration of PMML built-in function formatDatetime

  <DefineFunction name="formatDatetime" optype="categorical" dataType="string"> 
    <ParameterField name="input" optype="ordinal"/> 
    <ParameterField name="pattern" dataType="string"/>
    ... implementation built-in ...
  </DefineFunction>

input must be a date or time or dateTime.

Example: Format a date value as 'Month/Day/Year'.

<DerivedField name="StartDateUS" dataType="string" optype="categorical">
  <Apply function="formatDatetime">
    <FieldRef field="StartDate"/>
    <Constant>%m/%d/%y</Constant>
  </Apply>
</DerivedField>

With StartDate being the date August 20th, 2004 the result is StartDateUS="08/20/04".

Date Functions

dateDaysSinceYear

Function for transforming dates into integers.

The type dateDaysSinceYear is a variant of the type date where the values are represented as the number of days since Year-01-01. The date January 1 of Year is represented by the number 0. January 2 of Year is represented by 1, February 1 of Year is represented by 31, etc. Dates before January 1 of Year are represented as negative numbers. For example, values of type dateDaysSince[1960] are the number of days since January 1, 1960. The date January 1, 1960 is represented by the number 0.

For example, the date April 1, 2003 can be converted to the value 15796 of type dateDaysSince[1960].

Pseudo-declaration of PMML built-in function dateDaysSinceYear

  <DefineFunction name="dateDaysSinceYear" optype="continuous">
    <ParameterField name="input" optype="ordinal"/>
    <ParameterField name="referenceYear" optype="continuous"/>
  </DefineFunction>

input must be of datatype date or dateTime.

Example: Calculate days since 1970.

<DerivedField name="PurchaseDateDays" dataType="integer" optype="continuous">
  <Apply function="dateDaysSinceYear">
    <FieldRef field="PurchaseDate"/>
    <Constant>1970</Constant>
  </Apply>
</DerivedField>
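
The conversion quoted earlier in this section (April 1, 2003 becomes 15796 days since 1960) can be sketched as follows. The field name EventDate is hypothetical and is assumed to be of datatype date.

```xml
<!-- Illustrative only: days elapsed since January 1, 1960 -->
<Apply function="dateDaysSinceYear">
  <FieldRef field="EventDate"/>
  <Constant dataType="integer">1960</Constant>
</Apply>
```

Assuming EventDate is April 1, 2003, the result is 15796: 43 full years with 11 leap days give 15706 days up to January 1, 2003, plus 90 days for January through March.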

dateSecondsSinceYear

Function for transforming dates into integers.

The type dateSecondsSinceYear is a variant of the type date where the values are represented as the number of seconds since midnight starting the first day of Year (which is represented by 0). 1 minute after midnight on January 1 of Year is represented by 60, 1 hour after midnight on January 1 of Year is represented by 3600, etc. Times before January 1 of Year are represented as negative numbers.

For example, values of type dateSecondsSince[1960] are the number of seconds since the midnight starting January 1, 1960. 30 minutes and 3 seconds after 3 o'clock in the morning of January 3, 1960 can be converted to the value 185403 of type dateSecondsSince[1960].

Pseudo-declaration of PMML built-in function dateSecondsSinceYear

  <DefineFunction name="dateSecondsSinceYear" optype="continuous">
    <ParameterField name="input" optype="ordinal"/>
    <ParameterField name="referenceYear" optype="continuous"/>
  </DefineFunction>

input must be of datatype date or dateTime. If input is of datatype date, it is assumed that the time is 00:00:00 at this date.

Example: Create a new field PurchaseDateSeconds from the PurchaseDate attribute relative to the year 1970.

<DerivedField name="PurchaseDateSeconds" dataType="integer" optype="continuous">
  <Apply function="dateSecondsSinceYear">
    <FieldRef field="PurchaseDate"/>
    <Constant>1970</Constant>
  </Apply>
</DerivedField>
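
The worked value given above for 1960 can be reproduced with a similar sketch. The field name EventTime is hypothetical and is assumed to be of datatype dateTime.

```xml
<!-- Illustrative only: seconds elapsed since midnight starting January 1, 1960 -->
<Apply function="dateSecondsSinceYear">
  <FieldRef field="EventTime"/>
  <Constant dataType="integer">1960</Constant>
</Apply>
```

Assuming EventTime is 03:30:03 on January 3, 1960, the result is 2*86400 + 3*3600 + 30*60 + 3 = 185403.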

dateSecondsSinceMidnight

Function for transforming times into integers: the number of seconds elapsed since midnight.

For example, midnight returns a value of 0; 1 second after midnight (00:00:01) would return a value of 1; one minute after midnight would return a value of 60; etc. 23 minutes and 30 seconds after 5 o'clock in the morning should return 19410.

Pseudo-declaration of PMML built-in function dateSecondsSinceMidnight

  <DefineFunction name="dateSecondsSinceMidnight" optype="continuous">
    <ParameterField name="input" optype="ordinal"/>
  </DefineFunction>

input must be of datatype time or dateTime.

Example: Create a new field PurchaseDateSeconds from the PurchaseDate attribute relative to midnight.

<DerivedField name="PurchaseDateSeconds" dataType="integer" optype="continuous">
  <Apply function="dateSecondsSinceMidnight">
    <FieldRef field="PurchaseDate"/>
  </Apply>
</DerivedField>

Distribution Functions

normalCDF, normalPDF, stdNormalCDF, stdNormalPDF, erf, normalIDF, stdNormalIDF

Functions for normal distribution are widely used in statistical applications. Wikipedia has the following information at https://en.wikipedia.org/wiki/Normal_distribution:

In probability theory, the normal (or Gaussian) distribution is a very common continuous probability distribution. Normal distributions are important in statistics and are often used in the natural and social sciences to represent real-valued random variables.

The probability density function (PDF) of the normal distribution with mean μ and standard deviation σ is:

f(x,μ,σ) = exp(-(x-μ)^2/(2 σ^2))/(σ sqrt(2π))

If μ = 0 and σ = 1, the distribution is called the standard normal distribution or the unit normal distribution.

The cumulative distribution function (CDF) of the standard normal distribution, usually denoted with the capital Greek letter Φ (phi), is the integral

\Phi(x)\; = \;\frac{1}{\sqrt{2\pi}} \int_{-\infty}^x e^{-t^2/2} \, dt

In statistics one often uses the related error function, or erf(x), defined as the probability of a random variable with normal distribution of mean 0 and variance 1/2 falling in the range [-x, x], that is:

\operatorname{erf}(x)\; =\; \frac{1}{\sqrt{\pi}} \int_{-x}^x e^{-t^2} \, dt

These integrals cannot be expressed in terms of elementary functions, and are often said to be special functions. However, many numerical approximations are known.

The two functions are closely related, namely

\Phi(x)\; =\; \frac12\left[1 + \operatorname{erf}\left(\frac{x}{\sqrt{2}}\right)\right]

For a generic normal distribution f with mean μ and standard deviation σ, the cumulative distribution function is:

F(x)\;=\;\Phi\left(\frac{x-\mu}{\sigma}\right)\;=\; \frac12\left[1 + \operatorname{erf}\left(\frac{x-\mu}{\sigma\sqrt{2}}\right)\right]

The inverse of normal CDF is called the quantile function. The quantile function of the standard normal distribution is called the probit function, and can be expressed in terms of the inverse error function:

\Phi^{-1}(p)\; =\; \sqrt2\;\operatorname{erf}^{-1}(2p - 1), \quad p\in(0,1).

For a normal random variable with mean μ and variance σ^2, the quantile function is:

F^{-1}(p) = \mu + \sigma\Phi^{-1}(p) = \mu + \sigma\sqrt2\,\operatorname{erf}^{-1}(2p - 1), \quad p\in(0,1).

PMML defines the following built-in functions related to the normal distribution: normalCDF, normalPDF, normalIDF, stdNormalCDF, stdNormalPDF, stdNormalIDF, and erf.

Pseudo-declaration of PMML built-in function normalCDF

  <DefineFunction name="normalCDF" optype="continuous" dataType="double">
    <ParameterField name="x" optype="continuous" dataType="double"/>
    <ParameterField name="mu" optype="continuous" dataType="double"/>
    <ParameterField name="sigma" optype="continuous" dataType="double"/>
    ... implementation built-in ...
  </DefineFunction>

The function normalCDF(x, mu, sigma) returns the value F(x) defined above for a normal distribution with mean μ = mu and standard deviation σ = sigma. Note that σ here must be positive. The function stdNormalCDF(x) returns the cumulative distribution function value of x for the standard normal distribution. Its pseudo-declaration is:

  <DefineFunction name="stdNormalCDF" optype="continuous" dataType="double">
    <ParameterField name="x" optype="continuous" dataType="double"/>
    ... implementation built-in ...
  </DefineFunction>

Functions normalPDF(x, mu, sigma), normalIDF(p, mu, sigma), stdNormalPDF(x), and stdNormalIDF(p) have pseudo-declarations similar to the above. They compute the probability density function and the inverse distribution function (quantile function) of the normal distribution with mean μ and positive standard deviation σ, and of the standard normal distribution, respectively.

The PMML function erf is defined similarly to stdNormalCDF and computes erf(x) as described above.
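
A minimal sketch of calling normalCDF follows. The field name X and the parameter values (mean 10, standard deviation 2) are assumptions chosen for illustration.

```xml
<!-- Illustrative only: CDF of a normal distribution with mean 10 and standard deviation 2 -->
<Apply function="normalCDF">
  <FieldRef field="X"/>
  <Constant dataType="double">10</Constant>
  <Constant dataType="double">2</Constant>
</Apply>
```

Assuming X=10 (the mean), the result is 0.5. Assuming X=12, one standard deviation above the mean, the result is approximately 0.8413, which is the same value stdNormalCDF would return for the input 1.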

Further Mathematical Functions

expm1, hypot, ln1p, rint

Pseudo-declaration of PMML built-in function expm1:

  <DefineFunction name="expm1" optype="continuous" dataType="double">
    <ParameterField name="x" optype="continuous" dataType="double"/>
    ... implementation built-in ...
  </DefineFunction>

The expm1 function returns e^x - 1, where x is the argument, and e is the base of the natural logarithms. The domain of this function is the whole real line. If the input is INF then INF is returned. If the input is -INF then -1 is returned.

Pseudo-declaration of PMML built-in function hypot:

  <DefineFunction name="hypot" optype="continuous" dataType="double">
    <ParameterField name="x" optype="continuous" dataType="double"/>
    <ParameterField name="y" optype="continuous" dataType="double"/>
    ... implementation built-in ...
  </DefineFunction>

The hypot function computes the square root of the sum of the squares of x and y; that is, hypot(x,y) returns sqrt(x^2 + y^2). This function is defined for all real numbers x and y.
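
As a sketch, hypot can be applied to two input fields. The field names A and B are reused from earlier examples and the values are assumptions.

```xml
<!-- Illustrative only: length of the hypotenuse for legs A and B -->
<Apply function="hypot">
  <FieldRef field="A"/>
  <FieldRef field="B"/>
</Apply>
```

Assuming A=3 and B=4 the result is 5.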

Pseudo-declaration of PMML built-in function ln1p:

  <DefineFunction name="ln1p" optype="continuous" dataType="double">
    <ParameterField name="x" optype="continuous" dataType="double"/>
    ... implementation built-in ...
  </DefineFunction>

The ln1p function returns ln(x+1), where x > -1. If x = -1, the result is -INF, and for x < -1 the result is NaN.

Pseudo-declaration of PMML built-in function rint

  <DefineFunction name="rint" optype="continuous" dataType="double">
    <ParameterField name="x" optype="continuous" dataType="double"/>
    ... implementation built-in ...
  </DefineFunction>

The rint function returns the closest whole number to x, rounding toward the nearest even number if the fractional part is exactly one-half. If x is NaN, a NaN shall be returned.
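
The tie-breaking rule of rint can be seen in a small sketch. The field name A is an assumption.

```xml
<!-- Illustrative only: round A to the closest whole number, ties go to the even neighbor -->
<Apply function="rint">
  <FieldRef field="A"/>
</Apply>
```

Assuming A=2.5 the result is 2, and assuming A=3.5 the result is 4, since in both cases the fractional part is exactly one-half and the even neighbor is chosen. Assuming A=2.4 the result is 2.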

sin, asin, sinh, cos, acos, cosh, tan, atan, tanh

Pseudo-declaration of PMML built-in function sin

<DefineFunction name="sin" optype="continuous" dataType="double">
  <ParameterField name="x" optype="continuous" dataType="double"/>
  ... implementation built-in ...
</DefineFunction>

The function sin(x) returns the trigonometric sine of x, which is assumed to be in radians. The domain of this function is the whole real line. The range is [-1, 1].
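
Because sin expects its argument in radians, an angle stored in degrees has to be converted first. The following sketch does this by multiplying by π/180; the field name AngleDeg is hypothetical, and the constant is π/180 written out as a double.

```xml
<!-- Illustrative only: sine of an angle given in degrees -->
<Apply function="sin">
  <Apply function="*">
    <FieldRef field="AngleDeg"/>
    <!-- pi/180, converts degrees to radians -->
    <Constant dataType="double">0.017453292519943295</Constant>
  </Apply>
</Apply>
```

Assuming AngleDeg=90 the result is approximately 1, and assuming AngleDeg=30 the result is approximately 0.5.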

Pseudo-declaration of PMML built-in function asin

<DefineFunction name="asin" optype="continuous" dataType="double">
  <ParameterField name="x" optype="continuous" dataType="double"/>
  ... implementation built-in ...
</DefineFunction>

The function asin(x) is an inverse trigonometric function and returns the arc-sine of x as an angle in radians between −π/2 and π/2. The domain of this function is [-1, 1]. Beyond this domain, the result is NaN.

Pseudo-declaration of PMML built-in function sinh

<DefineFunction name="sinh" optype="continuous" dataType="double">
  <ParameterField name="x" optype="continuous" dataType="double"/>
  ... implementation built-in ...
</DefineFunction>

The function sinh(x) returns the hyperbolic sine of x, which is equal to (e^x - e^(-x))/2. The domain of this function is the whole real line.

Pseudo-declaration of PMML built-in function cos

<DefineFunction name="cos" optype="continuous" dataType="double">
  <ParameterField name="x" optype="continuous" dataType="double"/>
  ... implementation built-in ...
</DefineFunction>

The function cos(x) returns the trigonometric cosine of x, which is assumed to be in radians. The domain of this function is the whole real line. The range is [-1, 1].

Pseudo-declaration of PMML built-in function acos

<DefineFunction name="acos" optype="continuous" dataType="double">
  <ParameterField name="x" optype="continuous" dataType="double"/>
  ... implementation built-in ...
</DefineFunction>

The function acos(x) is an inverse trigonometric function and returns the arc-cosine of x as an angle in radians between 0 and π. The domain of this function is [-1, 1]. Beyond this domain, the result is NaN.

Pseudo-declaration of PMML built-in function cosh

<DefineFunction name="cosh" optype="continuous" dataType="double">
  <ParameterField name="x" optype="continuous" dataType="double"/>
  ... implementation built-in ...
</DefineFunction>

The function cosh(x) returns the hyperbolic cosine of x, which is equal to (e^x + e^(-x))/2. The domain of this function is the whole real line.

Pseudo-declaration of PMML built-in function tan

<DefineFunction name="tan" optype="continuous" dataType="double">
  <ParameterField name="x" optype="continuous" dataType="double"/>
  ... implementation built-in ...
</DefineFunction>

The function tan(x) returns the trigonometric tangent of x, which is assumed to be in radians. The domain is all real numbers except ±π/2, ±3π/2, ±5π/2, …, where the tan function is undefined. The range of this function is the whole real line.

Pseudo-declaration of PMML built-in function tanh

<DefineFunction name="tanh" optype="continuous" dataType="double">
  <ParameterField name="x" optype="continuous" dataType="double"/>
  ... implementation built-in ...
</DefineFunction>

The function tanh(x) returns the hyperbolic tangent of x, which is equal to (e^x - e^(-x))/(e^x + e^(-x)). The domain of this function is the whole real line.

Pseudo-declaration of PMML built-in function atan

<DefineFunction name="atan" optype="continuous" dataType="double">
  <ParameterField name="x" optype="continuous" dataType="double"/>
  ... implementation built-in ...
</DefineFunction>

The function atan(x) is an inverse trigonometric function and returns the arc-tangent of x as an angle in radians between -π/2 and π/2. The domain of this function is the whole real line.
