DMG logo PMML 4.4 - Time Series Models
PMML4.4 Menu

Home

Changes

XML Schema

Conformance

Interoperability

General Structure

Field Scope

Header

Data
Dictionary


Mining
Schema


Transformations

Statistics

Taxomony

Targets

Output

Functions

Built-in Functions

Model Verification

Model Explanation

Multiple Models

Anomaly Detection
Models


Association Rules

Baseline Models

Bayesian Network

Cluster
Models


Gaussian
Process


General
Regression


k-Nearest
Neighbors


Naive
Bayes


Neural
Network


Regression

Ruleset

Scorecard

Sequences

Text Models

Time Series

Trees

Vector Machine

PMML 4.4 - Time Series Models

For Time Series models, please see the Notice of Essential Claims.

A Time Series is a sequence of data points, measured at points in time, usually, but not necessarily, forming equidistant intervals. Time series analysis strives to understand such time series, often with the goal of making forecasts (predictions) or of filling in missing values between known data points. Time series prediction is the use of a model to predict future events based on known past events before they are measured. Interpolation is the use of a model to complement or amend values between known data points.

The model must contain information on the general trend, a description of periodic behavior and an overall fitting function that can be used for forecasting and/or interpolation. It may also contain detailed information on various aspects of the time series and the expected forecasting accuracy.

In addition to the entries common to all models, a TimeSeriesModel contains results of at least one time series algorithm, for example SpectralAnalysis, ARIMA, ExponentialSmoothing or SeasonalTrendDecomposition. In PMML 4.4, only Exponential Smoothing, ARIMA, GARCH, and State Space models are defined, the other algorithms are planned for later versions. TimeSeries elements can hold original or predicted time series values.

<xs:element name="TimeSeriesModel">
  <xs:complexType>
    <xs:sequence>
      <xs:element ref="Extension" minOccurs="0" maxOccurs="unbounded"/>
      <xs:element ref="MiningSchema"/>
      <xs:element ref="Output" minOccurs="0"/>
      <xs:element ref="ModelStats" minOccurs="0"/>
      <xs:element ref="ModelExplanation" minOccurs="0"/>
      <xs:element ref="LocalTransformations" minOccurs="0"/>
      <xs:element ref="TimeSeries" minOccurs="0" maxOccurs="unbounded"/>
      <xs:element ref="SpectralAnalysis" minOccurs="0"/>
      <xs:element ref="ARIMA" minOccurs="0"/>
      <xs:element ref="ExponentialSmoothing" minOccurs="0"/>
      <xs:element ref="SeasonalTrendDecomposition" minOccurs="0"/>
      <xs:element ref="StateSpaceModel" minOccurs="0"/>
      <xs:element ref="GARCH" minOccurs="0"/>
      <xs:element ref="ModelVerification" minOccurs="0"/>
      <xs:element ref="Extension" minOccurs="0" maxOccurs="unbounded"/>
    </xs:sequence>
    <xs:attribute name="modelName" type="xs:string" use="optional"/>
    <xs:attribute name="functionName" type="MINING-FUNCTION" use="required"/>
    <xs:attribute name="algorithmName" type="xs:string" use="optional"/>
    <xs:attribute name="bestFit" type="TIMESERIES-ALGORITHM" use="required"/>
    <xs:attribute name="isScorable" type="xs:boolean" default="true"/>
  </xs:complexType>
</xs:element>
 

<xs:simpleType name="TIMESERIES-ALGORITHM">
  <xs:restriction base="xs:string">
    <xs:enumeration value="ARIMA"/>
    <xs:enumeration value="ExponentialSmoothing"/>
    <xs:enumeration value="SeasonalTrendDecomposition"/>
    <xs:enumeration value="SpectralAnalysis"/>
    <xs:enumeration value="StateSpaceModel"/>
    <xs:enumeration value="GARCH"/>
  </xs:restriction>
</xs:simpleType>  

modelName and algorithmName can have arbitrary strings describing the specific model and algorithm.

functionName must be timeSeries.

The isScorable attribute indicates whether the model is valid for scoring. If this attribute is true or if it is missing, then the model should be processed normally. However, if the attribute is false, then the model producer has indicated that this model is intended for information purposes only and should not be used to generate results. In order to be valid PMML, all required elements and attributes must be present, even for non-scoring models. For more details, see General Structure.

The attribute bestFit is required, and indicates which of the time series algorithms provided by the model results in the best-fitting model. This algorithm should be used when scoring the model.
<xs:element name="TimeSeries">
  <xs:complexType>
    <xs:sequence>
      <xs:element ref="TimeAnchor" minOccurs="0" maxOccurs="1"/>
      <xs:element ref="TimeValue" minOccurs="1" maxOccurs="unbounded"/>
    </xs:sequence>
    <xs:attribute name="usage" type="TIMESERIES-USAGE" default="original"/>
    <xs:attribute name="startTime" type="REAL-NUMBER"/>
    <xs:attribute name="endTime" type="REAL-NUMBER"/>
    <xs:attribute name="interpolationMethod" type="INTERPOLATION-METHOD" default="none"/>
    <xs:attribute name="field" type="FIELD-NAME"/>
  </xs:complexType>
</xs:element>

<xs:simpleType name="TIMESERIES-USAGE">
  <xs:restriction base="xs:string">
    <xs:enumeration value="original"/>
    <xs:enumeration value="logical"/>
    <xs:enumeration value="prediction"/>
  </xs:restriction>
</xs:simpleType>

<xs:element name="TimeValue">
  <xs:complexType>
    <xs:sequence>
      <xs:element ref="Timestamp" minOccurs="0" maxOccurs="1"/>
    </xs:sequence>
    <xs:attribute name="index" type="INT-NUMBER" use="optional"/>
    <xs:attribute name="time" type="NUMBER" use="optional"/>
    <xs:attribute name="value" type="REAL-NUMBER" use="required"/>
    <xs:attribute name="standardError" type="REAL-NUMBER" use="optional"/>
  </xs:complexType>
</xs:element>

<xs:element name="TimeAnchor">
  <xs:complexType>
    <xs:sequence>
      <xs:element ref="TimeCycle" minOccurs="0" maxOccurs="unbounded"/>
      <xs:element ref="TimeException" minOccurs="0" maxOccurs="2"/>
    </xs:sequence>
    <xs:attribute name="type" type="TIME-ANCHOR"/>
    <xs:attribute name="offset" type="INT-NUMBER"/>
    <xs:attribute name="stepsize" type="INT-NUMBER"/>
    <xs:attribute name="displayName" type="xs:string" use="optional"/>
  </xs:complexType>
</xs:element>

<xs:element name="TimeCycle">
  <xs:complexType>
    <xs:sequence>
      <xs:group ref="INT-ARRAY" minOccurs="0" maxOccurs="1"/>
    </xs:sequence>
    <xs:attribute name="length" type="INT-NUMBER"/>
    <xs:attribute name="type" type="VALID-TIME-SPEC"/>
    <xs:attribute name="displayName" type="xs:string" use="optional"/>
  </xs:complexType>
</xs:element>

<xs:simpleType name="TIME-ANCHOR">
  <xs:restriction base="xs:string">
    <xs:enumeration value="dateTimeMillisecondsSince[0]"/>
    <xs:enumeration value="dateTimeMillisecondsSince[1960]"/>
    <xs:enumeration value="dateTimeMillisecondsSince[1970]"/>
    <xs:enumeration value="dateTimeMillisecondsSince[1980]"/>
    <xs:enumeration value="dateTimeSecondsSince[0]"/>
    <xs:enumeration value="dateTimeSecondsSince[1960]"/>
    <xs:enumeration value="dateTimeSecondsSince[1970]"/>
    <xs:enumeration value="dateTimeSecondsSince[1980]"/>
    <xs:enumeration value="dateDaysSince[0]"/>
    <xs:enumeration value="dateDaysSince[1960]"/>
    <xs:enumeration value="dateDaysSince[1970]"/>
    <xs:enumeration value="dateDaysSince[1980]"/>
    <xs:enumeration value="dateMonthsSince[0]"/>
    <xs:enumeration value="dateMonthsSince[1960]"/>
    <xs:enumeration value="dateMonthsSince[1970]"/>
    <xs:enumeration value="dateMonthsSince[1980]"/>
    <xs:enumeration value="dateYearsSince[0]"/>
  </xs:restriction>
</xs:simpleType>

<xs:simpleType name="VALID-TIME-SPEC">
  <xs:restriction base="xs:string">
    <xs:enumeration value="includeAll"/>
    <xs:enumeration value="includeFromTo"/>
    <xs:enumeration value="excludeFromTo"/>
    <xs:enumeration value="includeSet"/>
    <xs:enumeration value="excludeSet"/>
  </xs:restriction>
</xs:simpleType>

<xs:element name="TimeException">
  <xs:complexType>
    <xs:sequence>
      <xs:group ref="INT-ARRAY" minOccurs="1"/>
    </xs:sequence>
    <xs:attribute name="type" type="TIME-EXCEPTION-TYPE"/>
    <xs:attribute name="count" type="INT-NUMBER"/>
  </xs:complexType>
</xs:element>

<xs:simpleType name="TIME-EXCEPTION-TYPE">
  <xs:restriction base="xs:string">
    <xs:enumeration value="exclude"/>
    <xs:enumeration value="include"/>
  </xs:restriction>
</xs:simpleType>

<xs:simpleType name="INTERPOLATION-METHOD">
  <xs:restriction base="xs:string">
    <xs:enumeration value="none"/>
    <xs:enumeration value="linear"/>
    <xs:enumeration value="exponentialSpline"/>
    <xs:enumeration value="cubicSpline"/>
  </xs:restriction>
</xs:simpleType>

The element TimeSeries contains a time series consisting of one or more TimeValue objects. The time series can be an original time series as read from the input data; in this case the attribute interpolationMethod is 'none'. Or it can be a pre-processed and interpolated time series; pre-processing and interpolation may be necessary to produce a logical time series, for most time series algorithms require a sequence of logically equidistant time steps. If a logical time series is present, it was used as input into the algorithm. Finally, the time series (usage = 'prediction') may hold values predicted by the best-fitting model.

StartTime and endTime refer to points in an input time series between which the points were used for fitting. They can be integers indicating the index into a logical time series or real numbers indicating original points in time. These values are required when the usage type is original or logical.
The attribute interpolationMethod names the interpolation method used to compute values between existing (or predicted) data points. It is one of {'none', 'linear', 'exponentialSpline', 'cubicSpline'}.

The attribute field provides the field name for the time series variable and is required if there are several time series fields in the context.

TimeValue contains one single point of a time series. The point can either be a known point from the past; in this case, only the attribute value is required. In addition, time or index must be used. In case of a logical TimeSeries, index values must be present. Or the time point is a predicted future value; in this case, the attribute standardError can contain the incertitude (predicted standard error) of the prediction standard based on the empirically determined error.
Note: TimeAnchor and TimeCycle define the correlation to calendar times. Optionally, a contained element Timestamp may hold a string describing the time for presentation purposes, see Header.

TimeAnchor optionally defines the relationship between time points in a time series and a calendar. It is not used for computing predictions, but it may be used by applications or visualization tools that want to come up with predictions based on points of time in a calendar as opposed to just a look-ahead index. Time is anchored at an offset with respect to a specified calendar point given by type. And the flow of time is defined in smallest steps of size stepsize. Both offset and stepsize are (long) integer values in the units specified in type. An optional displayName, e.g. "day" can be provided as a name for the time step.

TimeCycle allows to express the situation where time steps are not contiguous on a calendar. As an example, consider hourly revenue data of a store that opens Mo-Sat from 7am to 9pm. One has to represent hours as being the step size of the data, but one also wants to be able to specify that Sundays and night-shifts should be disregarded, i.e. for the time series prediction, the value for Monday 8am (aggregated revenue between 7am and 8am) immediately follows that of Saturday 9pm.
Each TimeCycle divides the sequence of time steps defined by the previous TimeCycle (or the TimeAnchor) into cycles of equal length, each cycle consisting of length steps. Index values for these steps run from 0 to length - 1, and are used in the specification of valid steps. Type defines whether this definition is by interval or enumeration and whether by inclusion or exclusion, and the contained Array element provides the interval boundaries or enumerates the values. The following is the specification of the shop hours in the example:

<TimeAnchor type="dateTimeSecondsSince[1960]" offset="1530543600" stepsize="3600" displayName="hour">
  <TimeCycle length="24" type="includeFromTo" displayName="day">
    <Array type="int" n="2">7 20</Array>
  </TimeCycle>
  <TimeCycle length="7" type="excludeSet" displayName="week">
    <Array type="int" n="1">6</Array>
  </TimeCycle>
</TimeAnchor>

Calendar entries can now be described as a sequence of values. The 15th hour of the 6th day in the 30th week since the time anchor (1530543600 seconds after the beginning of 1960) would become <29, 5, 14>.

In addition to the regular behavior, there may be exceptions to the TimeStep specification. The store may, for example, be closed on July 4th, but exceptionally open late because of an event on some other day. This is captured by up to two TimeExceptions, which contain lists of unsystematic exclusions or inclusions as arrays of index values. All index values of a certain TimeCycle can be specified by using the length value instead of a valid index; -1 is used for the regular indexes..

The following TimeExceptions specify additional shop closure and opening hours.

<TimeExceptions type="exclude" count="2">
  <!-- closed in the 5th week on the 6th day at the 8th hour -->
  <Array type="int">4 5 7</Array>
  <!-- closed in the 33rd week, throughout the 1st day -->
  <Array type="int">32 0 24</Array>
</TimeExceptions%gt;
<TimeExceptions type="include" count="2">
  <!-- open in the 1st week on the 7th day at regular hours -->
  <Array type="int">0 6 -1</Array>
  <!-- open in the 34th week on the 6th day at the 20th hour -->
  <Array type="int">33 5 19</Array>
</TimeExceptions>

ExponentialSmoothing contains an exponential smoothing model for the time series. It is one out of the 15 possible model type combinations (no trend N, additive trend A, damped additive trend DA, multiplicative trend M, damped multiplicative trend DM) * (no seasonality N, additive seasonality A, multiplicative seasonality M). If the model contains a seasonality, the seasonality info is captured in the Seasonality sub-element. Each TimeValue sub-element contains one predicted time point. The predicted time points are calculated from Gardner, Jr., E.S., "Exponential smoothing: The state of the art - Part II", 3 Jun. 2005 for the given Trend combined with the Seasonality type. The number of predicted time points contained in the model may be determined by the modeling kernel, for example by using the incertitude ranges of each prediction.

This model also supports the multiple smoothing techniques described in the Multiple Smoothing for Higher-order Polynomials found in Brown, R.G., Smoothing, Forecasting and Prediction of Discrete Time Series, Mineola, New York: Dover Publications, Inc., 2004.

<xs:element name="ExponentialSmoothing">
  <xs:complexType>
    <xs:sequence>
      <xs:element ref="Level" minOccurs="1" maxOccurs="1"/>
      <xs:element ref="Trend_ExpoSmooth" minOccurs="0" maxOccurs="1"/>
      <xs:element ref="Seasonality_ExpoSmooth" minOccurs="0" maxOccurs="1"/>
      <xs:element ref="TimeValue" minOccurs="0" maxOccurs="unbounded"/>
    </xs:sequence>
    <xs:attribute name="RMSE" type="REAL-NUMBER"/>
    <xs:attribute name="transformation" default="none">
      <xs:simpleType>
        <xs:restriction base="xs:NMTOKEN">
          <xs:enumeration value="none"/>
          <xs:enumeration value="logarithmic"/>
          <xs:enumeration value="squareroot"/>
        </xs:restriction>
      </xs:simpleType>
    </xs:attribute>
  </xs:complexType>
</xs:element>

RMSE is the root mean squared error of the predictions.

Transformation specifies what transformation has been applied to the time series prior to executing the algorithm. Possible values are "none," "logarithmic" and "squareroot." This attribute is informational only, and does not affect scoring.

Level specifies smoothedValue the smoothed value of the time series at the last known point of the history. The optional attribute alpha is the optimal smoothing parameter for the level. It can be used to continue the fitting process if more data become known, but it is not needed for scoring. However, it may be used to compute theoretical confidence intervals.

<xs:element name="Level">
  <xs:complexType>
    <xs:attribute name="alpha" type="REAL-NUMBER" use="optional"/>
    <xs:attribute name="smoothedValue" type="REAL-NUMBER"/>
  </xs:complexType>
</xs:element>

Trend_ExpoSmooth specifies the smoothed value or coefficients of the trend at the last known point of the history. For the Gardner models, the smoothed value can be found in the smoothedValue attribute; for Brown's polynomial models, the smoothed coefficients can be found in the array sub-element. The smoothed coefficients are required if the specified trend is "polynomial_exponential;" otherwise, the smoothed value is required. The optional attribute gamma is the optimal smoothing parameter for the trend. It can be used to continue the fitting process if more data become known, but it is not needed for scoring. The damping parameter phi is needed when scoring the damped versions of the Gardner models.

<xs:element name="Trend_ExpoSmooth">
  <xs:complexType>
    <xs:sequence>
      <xs:group ref="REAL-ARRAY" minOccurs="0"/>
    </xs:sequence>
    <xs:attribute name="trend" default="additive">
      <xs:simpleType>
        <xs:restriction base="xs:NMTOKEN">
          <xs:enumeration value="additive"/>
          <xs:enumeration value="damped_additive"/>
          <xs:enumeration value="multiplicative"/>
          <xs:enumeration value="damped_multiplicative"/>
          <xs:enumeration value="polynomial_exponential"/>
        </xs:restriction>
      </xs:simpleType>
    </xs:attribute> 
    <xs:attribute name="gamma" type="REAL-NUMBER" use="optional"/>
    <xs:attribute name="phi" type="REAL-NUMBER" use="optional" default="1"/>
    <xs:attribute name="smoothedValue" type="REAL-NUMBER" use="optional"/>
  </xs:complexType>
</xs:element>

Seasonality_ExpoSmooth describes a periodic oscillation cycle with a length of period time units, where period must be a positive integer. The phase indicates the season index of the last known data point; it defaults to period. The oscillation can be additive, that means of the form 'trend + oscillation' or multiplicative, that means of the form 'trend * oscillation'. Unit is a string used for naming the cycles, such as "week" or "year." It is optional and serves only for explanatory purposes. delta is seasonal smoothing weight. The sub-element RealArray (of size period) contains floating point numbers which describe the local values of the oscillation at each of the season indices. In the additive case, the sum of all these numbers may be normalized to 0. In the multiplicative case, the product of all these numbers may be normalized to 1.

<xs:element name="Seasonality_ExpoSmooth">
  <xs:complexType>
    <xs:sequence>
      <xs:group ref="REAL-ARRAY"/>
    </xs:sequence>
    <xs:attribute name="type" use="required">
      <xs:simpleType>
        <xs:restriction base="xs:NMTOKEN">
          <xs:enumeration value="additive"/>
          <xs:enumeration value="multiplicative"/>
        </xs:restriction>
      </xs:simpleType>
    </xs:attribute>
    <xs:attribute name="period" type="INT-NUMBER" use="required"/>
    <xs:attribute name="unit" type="xs:string" use="optional"/>
    <xs:attribute name="phase" type="INT-NUMBER" use="optional"/>
    <xs:attribute name="delta" type="REAL-NUMBER" use="optional"/>
  </xs:complexType>
</xs:element>

Scoring Procedure

Predictions of future events in a time series are based on a set of well-defined formulae. The formula used to score a particular model is based on the specified trend and seasonality.

The formulae use the following notation:

Symbol Definition PMML equivalent
m Number of periods in the forecast lead-time Input to the model
Xt(m) Forecast for m periods ahead from origin t Output of the model
p Number of periods in the seasonal cycle period attribute within the Seasonal_ExpoSmooth element
α Smoothing parameter for the level of the series alpha attribute within the Level element
φ Autoregressive or damping parameter phi attribute within the Trend_ExpoSmooth element
St Smoothed level of the series, computed after Xt is observed. Also the expected value of the data at the end of period t in some models smoothedValue attribute within the Level element
Tt Smoothed additive trend at the end of period t smoothedValue attribute within the Trend_ExpoSmooth element
Rt Smoothed multiplicative trend at the end of period t smoothedValue attribute within the Trend_ExpoSmooth element
It Smoothed seasonal index at the end of period t. The reference It-p+m found in the formulae below identifies a specific seasonal index, found by cycling through the list of seasonal indexes Array element within the Seasonality_ExpoSmooth element
a0,a1...an Smoothed coefficients used by Brown's multiple smoothing formulae Array element within the Trend_ExpoSmooth element

The complete set of formulae are listed below. The first 15 formulae are identified by the combination of trend and seasonality specified by the model (i.e. <trend>-<seasonality>).

The remaining formula is used for Brown's polynomial models. In this case, the number of values found in the Array element within the Trend_ExpoSmooth element will dictate the order of the polynomial model. A single coefficient indicates a constant model, for which predictions are calculated by applying the formula Xt(m) = a0. For a linear model (two coefficients), the formula Xt(m) = a0 + a1m is used, while a quadratic model (three coefficients) would use the formula Xt(m) = a0 + a1m + (1/2)a2m2 .

N-N: Xt(m) = St

N-A: Xt(m) = St + It-p+m

N-M: Xt(m) = StIt-p+m

A-N: Xt(m) = St + mTt

A-A: Xt(m) = St + mTt + It-p+m

A-M: Xt(m) = (St + mTt)It-p+m

DA-N: Xt(m) = St + (Sum[i in 1..m]φiTt)

DA-A: Xt(m) = St + (Sum[i in 1..m]φiTt) + It-p+m

DA-M: Xt(m) = (St + (Sum[i in 1..m]φiTt))It-p+m

M-N: Xt(m) = StRtm

M-A: Xt(m) = StRtm + It-p+m

M-M: Xt(m) = (StRtm)It-p+m

DM-N: Xt(m) = StRt(Sum[i in 1..m]φi)

DM-A: Xt(m) = StRt(Sum[i in 1..m]φi) + It-p+m

DM-M: Xt(m) = (StRt(Sum[i in 1..m]φi))It-p+m

Brown: Xt(m) = a0 + a1m + (1/2)a2m2 + ... + (1/n!)anmn

Example for a time series model:

<PMML xmlns="http://www.dmg.org/PMML-4_4" version="4.4">
  <Header copyright="2018 DMG.org">
    <Application name="test application"/>
    <Timestamp>2008-06-23 10:30:00</Timestamp>
  </Header>
  <DataDictionary numberOfFields="2">
    <DataField dataType="integer" optype="continuous" name="TS" displayName="TS"/>
    <DataField dataType="double" optype="continuous" name="VALUE" displayName="TS-VALUE"/>
  </DataDictionary>
  <TimeSeriesModel modelName="AA2Model" functionName="timeSeries" algorithmName="exponential smoothing" bestFit="ExponentialSmoothing">
    <MiningSchema>
      <MiningField name="TS" usageType="order"/>
      <MiningField name="VALUE" usageType="target"/>
    </MiningSchema>
    <TimeSeries usage="logical" startTime="1" endTime="24" interpolationMethod="none">
      <TimeValue index="1" value="112"/>
      <TimeValue index="2" value="118"/>
      <TimeValue index="3" value="132"/>
      <TimeValue index="4" value="129"/>
      <TimeValue index="5" value="121"/>
      <TimeValue index="6" value="135"/>
      <TimeValue index="7" value="148"/>
      <TimeValue index="8" value="148"/>
      <TimeValue index="9" value="136"/>
      <TimeValue index="10" value="119"/>
      <TimeValue index="11" value="104"/>
      <TimeValue index="12" value="118"/>
      <TimeValue index="13" value="115"/>
      <TimeValue index="14" value="126"/>
      <TimeValue index="15" value="141"/>
      <TimeValue index="16" value="135"/>
      <TimeValue index="17" value="125"/>
      <TimeValue index="18" value="149"/>
      <TimeValue index="19" value="170"/>
      <TimeValue index="20" value="170"/>
      <TimeValue index="21" value="158"/>
      <TimeValue index="22" value="133"/>
      <TimeValue index="23" value="114"/>
      <TimeValue index="24" value="140"/>
    </TimeSeries>
    <TimeSeries usage="prediction" interpolationMethod="none">
      <TimeValue index="25" value="145" standardError="7.3"/>
      <TimeValue index="26" value="150" standardError="8.3"/>
      <TimeValue index="27" value="178" standardError="9.3"/>
      <TimeValue index="28" value="163" standardError="10.3"/>
      <TimeValue index="29" value="172" standardError="11.3"/>
      <TimeValue index="30" value="178" standardError="12.3"/>
      <TimeValue index="31" value="199" standardError="13.3"/>
      <TimeValue index="32" value="199" standardError="14.3"/>
      <TimeValue index="33" value="184" standardError="15.3"/>
      <TimeValue index="34" value="162" standardError="16.3"/>
      <TimeValue index="35" value="146" standardError="17.3"/>
      <TimeValue index="36" value="166" standardError="18.3"/>
    </TimeSeries>
    <ExponentialSmoothing RMSE="7.3">
      <Level alpha="0.233984" smoothedValue="139.8"/>
      <Trend_ExpoSmooth smoothedValue="4.139" gamma="3.910E-4" phi="1.006" trend="damped_additive"/>
      <Seasonality_ExpoSmooth type="multiplicative" period="12" unit="month" delta="0.8254" phase="12">
        <Array n="12" type="real">
          .900 .840 .924 .976 .994 1.120 0.981 1.025 1.038 1.038 0.908 1.259
        </Array>
      </Seasonality_ExpoSmooth>
    </ExponentialSmoothing>
  </TimeSeriesModel>
</PMML>

This is an example of an exponential smoothing time series model using Brown's multiple smoothing technique:

<PMML xmlns="http://www.dmg.org/PMML-4_4" version="4.4">
  <Header copyright="DMG.org">
    <Application name="test application"/>
  </Header>
  <DataDictionary numberOfFields="2">
    <DataField dataType="double" name="Month_Index" optype="continuous">
      <Interval closure="closedClosed" leftMargin="1" rightMargin="50"/>
    </DataField>
    <DataField dataType="double" name="QuadraticMonth" optype="continuous"/>
  </DataDictionary>
  <TimeSeriesModel bestFit="ExponentialSmoothing" functionName="timeSeries" modelName="QuadraticMonth Predictor">
    <MiningSchema>
      <MiningField invalidValueTreatment="asIs" name="Month_Index" outliers="asIs" usageType="order"/>
      <MiningField name="QuadraticMonth" outliers="asIs" usageType="target"/>
    </MiningSchema>
    <Output>
      <OutputField dataType="double" feature="predictedValue" name="QuadraticMonth Predictor - Predicted Value" optype="continuous"/>
    </Output>
    <TimeSeries>
      <TimeValue index="50" value="2550"/>
    </TimeSeries>
    <ExponentialSmoothing RMSE="1.469296539064514">
      <Level alpha="0.29"/>
      <Trend_ExpoSmooth trend="polynomial_exponential">
        <Array type="real" n="3">2549.999972 100.9999732 1.999994714</Array>
      </Trend_ExpoSmooth>
      <TimeValue index="1" value="1.499999999999986"/>
      <TimeValue index="2" value="4.434999999999974"/>
      <TimeValue index="3" value="9.422699999999999"/>
      <TimeValue index="4" value="16.69814500000001"/>
      <TimeValue index="5" value="26.30456892500002"/>
      <TimeValue index="6" value="38.19968310645001"/>
      <TimeValue index="7" value="52.31215520655601"/>
      <TimeValue index="8" value="68.56947439659567"/>
      <TimeValue index="9" value="86.90993292618671"/>
      <TimeValue index="10" value="107.2862207849398"/>
      <TimeValue index="11" value="129.6649358584095"/>
      <TimeValue index="12" value="154.0243926888923"/>
      <TimeValue index="13" value="180.3519824960263"/>
      <TimeValue index="14" value="208.6416885111437"/>
      <TimeValue index="15" value="238.8920018116692"/>
      <TimeValue index="16" value="271.1042947305884"/>
      <TimeValue index="17" value="305.281618813078"/>
      <TimeValue index="18" value="341.4278584112037"/>
      <TimeValue index="19" value="379.5471635161653"/>
      <TimeValue index="20" value="419.6435914851766"/>
      <TimeValue index="21" value="461.7208987097413"/>
      <TimeValue index="22" value="505.7824356899514"/>
      <TimeValue index="23" value="551.8311103729045"/>
      <TimeValue index="24" value="599.8693941784766"/>
      <TimeValue index="25" value="649.8993527234378"/>
      <TimeValue index="26" value="701.9226890294883"/>
      <TimeValue index="27" value="755.9407912489685"/>
      <TimeValue index="28" value="811.9547799736052"/>
      <TimeValue index="29" value="869.965552291198"/>
      <TimeValue index="30" value="929.9738211628786"/>
      <TimeValue index="31" value="991.980149602085"/>
      <TimeValue index="32" value="1055.984979693915"/>
      <TimeValue index="33" value="1121.988656811033"/>
      <TimeValue index="34" value="1189.991449540624"/>
      <TimeValue index="35" value="1259.993565893434"/>
      <TimeValue index="36" value="1331.99516636062"/>
      <TimeValue index="37" value="1405.996374344018"/>
      <TimeValue index="38" value="1481.997284428074"/>
      <TimeValue index="39" value="1559.997968898639"/>
      <TimeValue index="40" value="1639.998482851363"/>
      <TimeValue index="41" value="1721.998868174932"/>
      <TimeValue index="42" value="1805.999156642965"/>
      <TimeValue index="43" value="1891.999372304384"/>
      <TimeValue index="44" value="1979.999533324537"/>
      <TimeValue index="45" value="2069.999653398587"/>
      <TimeValue index="46" value="2161.999742833126"/>
      <TimeValue index="47" value="2255.999809371593"/>
      <TimeValue index="48" value="2351.9998588225"/>
      <TimeValue index="49" value="2449.999895536411"/>
      <TimeValue index="50" value="2549.999922767288"/>
    </ExponentialSmoothing>
  </TimeSeriesModel>
</PMML>

This model can be scored using Brown's formula for multiple smoothing. For instance, to predict the next value in the series (i.e. index=51, which corresponds to m=1), we use the following calculation:

Xt(m) = a0 + a1m + (1/2)a2m2
= 2549.999972 + (100.9999732 * 1) + (0.5 * 1.999994714 * 12)
= 2652

ARIMA and Transfer Function models

The element ARIMA contains an ARIMA (Autoregressive Integrated Moving Average) model for the time series data to better understand the data or to predict values for future time points in the series. An ARIMA model, also known as Box-Jenkins model, is composed of two parts: an autoregressive (AR) part and a moving average (MA) part. The AR part specifies the current value of the time series as a linear combination of the previous values of the time series plus a stochastic term. The MA part is a linear function of white noise error terms or random shocks at current and previous time points. A Transfer Function model is ARIMA model with predictor series. The element ARIMA supports both the non-seasonal and seasonal ARIMA models, with or without a set of predictors. Other features supported in the element include the specification of the numerator and denominator polynomials for each predictor, the definition of each detected outlier, and the input of extra information needed for forecasting. The technical details of the model are described in Brockwell, P. J., and R. A. Davis. Time series: theory and methods. Springer Science & Business Media, 1991.
Some basic information on ARIMA without predictors can be found at https://en.wikipedia.org/wiki/Autoregressive_integrated_moving_average

Overview of models and methods.

Denote the target time series to be modeled and predicted Yt.

We will use B to represent a backward shift (or lag) operator: BYt = Yt-1, its k-th degree is BkYt = Yt-k.

Δd is a differencing operator of order d: Δd = ( 1 - B )d.

The general form for an AR polynomial of order p is defined as:

ϕp (B) = 1 - φ1 B - φ2 B2 - ... - φp Bp

and a MA polynomial of order q:

θq (B) = 1 - θ1 B - θ2 B2 - ... - θq Bq

Those represent non-seasonal part of the model, and seasonal polynomials of order P and Q are used for the seasonal components of AR and MA:

ΦP (Bs) = 1 - Φ1 Bs - Φ2 B2s - ... - ΦP BPs
and
ΘQ (Bs) = 1 - Θ1 Bs - Θ2 B2s - ... - ΘQ BQs
where s is the seasonal length or period.

The coefficients φi, θi, Φi, Θi of the above polynomials constitute a major part of an ARIMA model.

Often the values of a time series are affected by one or more other time series, called predictor time series or dynamic regressors. We will denote them by Xit, i=1,..., k. Consider the numerator polynomial for the i-th predictor Xit:

  Numi = ( ωi0 - ωi1B - ... - ωiuBu ) * ( 1 - Ωi1Bs - ... - ΩivBvs )

and denominator polynomial for that predictor:

  Deni = ( 1 - δi1B - ... - δirBr ) * ( 1 - Δi1Bs - ... )

The following equations are given in terms of Yt and Xit, but they can be log or square root transformations of the original series. If the target time series was transformed then its predictions should be back-transformed to the original scale.

ARIMA model includes the following widely used variations:

ARMA( p, q ):    Yt = μ + θq( B ) / ϕp( B ) * at

Here μ is a constant, at is a white noise series normally distributed with mean zero and variance σ2.

ARIMA( p, d, q ):    Δd Yt = μ + θq( B ) / ϕp( B ) * at

Note: ARIMA( p, 0, q ) = ARMA( p, q ).

ARIMA( p, d, q )( P, D, Q ):   Δd ΔD Yt = μ + θq( B ) ΘQ( Bs ) / ϕp( B ) ΦP( Bs ) * at

Here Δd and ΔD are the difference operator and the seasonal difference operator for Yt.

Note: ARIMA( p, 0, q )( P, 0, Q ) = ARMA( p, q )( P, Q ).

Transfer Function model:    Δd ΔD Yt = μ + Σi=1 to K (Numi / Deni Δdi ΔDi Bli Xit) + Nt,

where Nt is the noise series that follows zero mean ARMA(p,q)(P,Q) model:

  Nt = θq( B ) ΘQ( Bs ) / ϕp( B ) ΦP( Bs ) * at

Here Δdi and ΔDi are the difference operator and seasonal difference operator for Xit, respectively, and Bli is delay term for Xit , li is the order of the delay.
Note: ARMA and ARIMA are special cases of transfer function model.

Transfer function model with outliers

Outliers can have significant effect on the predictions of time series models.

We follow the paper "Outliers, Level Shifts, and Variance Changes in Time Series" by Ruey S. Tsay. Journal of Forecasting, Vol. 7, I-20 (1988). Six types of outliers are allowed: additive outliers (AO), innovational outliers (IO), level shift (LS), temporary (or transient) change (TC), seasonal additive (SA), local trend (LT). Suppose there are M outliers at times T1, T2, ..., TM with types O1, O2, ..., OM and magnitudes w1, w2, ..., wM. The transfer function model incorporating all these outliers is
  Δd ΔD Yt = μ + Σi=1 to K Numi / Deni Δdi ΔDi Bli Xit + Σi=1 to M wkLOk(B)Δd ΔD ITk(t) + Nt,
where Nt is as before, IT(t) is 1 when t=T and 0 otherwise. The term LO(B) is defined as follows for each type of outlier.

Type of outlier Formula for LO(B)
Additive 1
Innovational 1/( Δd ΔD ϖ(B)), where ϖ(B) = ϕp( B ) ΦP( Bs ) / θq( B ) ΘQ( Bs )
Level Shift 1/(1-B)
Temporary Change 1/(1-δB)
Seasonal Additive 1/(1-Bs)
Local Trend 1/(1-B)2

Since ITk(t) can be regarded as a predictor series, i.e. the series takes value 1 if t=Tk, otherwise is 0, we can consider each outlier part LOk(B) Δd ΔD ITk(t) as a transfer function. Thus, in the scoring descriptions below they should be treated as transfer functions.

ARIMA element in PMML 4.4 and later includes all the parameters mentioned above and provides three forecasting methods: conditional least squares, Theta recursion, Kalman filter. Theta recursion and Kalman filter are two kinds of exact least squares prediction method. This method is normally used to predict the noise term Nt, and Yt is then obtained by adding the other terms such as the mean constant μ or transfer functions for predictors or outliers.

Theta recursion method

For a series with zero mean, and without predictors, outliers, or differencing operators, one-step-ahead forecast of Yt+1 based on the past observations Y1,...,Yt and past predicted values Ŷ1,...,Ŷt is:

 for t between 1 and m = max( p, q ):
   Ŷt+1 = Σj=1 to t θt,j (Yt+1-j-Ŷt+1-j ),
 for t > m:
   Ŷt+1= φ1 Yt2 Yt-1+...+φp Yt+1-p+ Σj=1 to q θt,j (Yt+1-j-Ŷt+1-j )

θ(t,j) is computed recursively based on theoretical auto-covariance of ARMA(p,q) model and the forecast variance at previous time point. The forecast variance of Ŷt+1 is estimated as σt+122t, where the forecast variance factor νt is also recursively computed based on theoretical auto-covariance of ARMA(p,q) model, θt,j, and the forecast variance factor at previous time point. The computation details can be found in proposition 5.2.2 and equation 5.3.9 in Chapter 5 of Brockwell and Davis(1991) where this is called "The innovation algorithm".

After model estimation, the following statistics need to be saved into PMML:
  • Final m observations: Y
  • Final q predicted values: Ŷ
  • Final thetas: θt,t-j, t=n-q+1,...,n, and j=n-q,...,t-1
  • Final nu: νnn-1,...,νn-q+1
  • AR coefficients: φ1,...,φp
Here n is the number of time points in historical data. If the model includes a constant term and a predictor (or regressor) term, during prediction these terms need to be added to the remaining part which is noise which follows ARMA(p,q) model. So terms "FinalNoise" and "FinalPredictedNoise" are used, all the listed statistics can be found in element ThetaRecursionState.

Kalman Filter method

Let m = max( p, q ). The state-space representation of a time series Yt with zero mean and without predictors, outliers, or differencing operators is:
  St+1 = F St + H et
  Yt = G St + et
where St is a state vector of m by 1, and
  G=(1,0,...,0)(1×m)
  HT=(ψ1,...,ψm )
where ψi are coefficients in
  (1-θ1 B-θ2 B2-...-θq Bq)/ (1-φ1 B-φ1 B2-...-φp Bp )= 1+ψ1 B+ψ2 B2+...
F is an m×m matrix with 1's above the main diagonal, the bottom row containing φm, φm-1, ..., φ1, and zeros elsewhere. Here φi = 0 if i > p.
The noise term et has expectation E(et)=0 and variance Var(et ) = σ2.
The details of one-step-ahead forecast and forecast variance are given by the proposition 12.2.2 in Chapter 12 of Brockwell and Davis (1991).
After model estimation, the following statistics need to be saved into PMML:
  • Final state vector: Sn
  • Final error covariance matrix of state Sn: Ωn
  • H vector: H
These statistics can be found in element KalmanState.

Maximum likelihood forecasting methods

The unknown parameters of ARMA model are estimated iteratively by maximizing the log likelihood function. In each iteration, we use the theta recursive method or Kalman filter method to compute one-step-ahead forecast values based on parameters in the last iteration, and then compute the log-likelihood value. Based on log-likelihood value, we update the parameters based on derivatives. These steps are repeated until the criteria of convergence are reached. Since the theta recursive method or Kalman filter method are involved in the maximum likelihood estimation of parameters, we call the PMML element containing statistics for the two forecasting methods "MaximumLikelihoodStats".

The schema

Here is the definition of ARIMA element:
<xs:element name="ARIMA">
  <xs:complexType>
    <xs:sequence>
      <xs:element ref="Extension" minOccurs="0" maxOccurs="unbounded"/>
      <xs:element ref="NonseasonalComponent" minOccurs="0"/>
      <xs:element ref="SeasonalComponent" minOccurs="0"/>
      <xs:element ref="DynamicRegressor" minOccurs="0" maxOccurs="unbounded"/>
      <xs:element ref="MaximumLikelihoodStat" minOccurs="0"/>
      <xs:element ref="OutlierEffect" minOccurs="0" maxOccurs="unbounded"/>
    </xs:sequence>
    <xs:attribute name="RMSE" type="REAL-NUMBER"/>
    <xs:attribute name="transformation" default="none">
      <xs:simpleType>
        <xs:restriction base="xs:string">
          <xs:enumeration value="none"/>
          <xs:enumeration value="logarithmic"/>
          <xs:enumeration value="squareroot"/>
        </xs:restriction>
      </xs:simpleType>
    </xs:attribute>
    <xs:attribute name="constantTerm" type="REAL-NUMBER" default="0"/>
    <xs:attribute name="predictionMethod" default="conditionalLeastSquares">
      <xs:simpleType>
        <xs:restriction base="xs:string">
          <xs:enumeration value="conditionalLeastSquares"/>
          <xs:enumeration value="exactLeastSquares"/>
        </xs:restriction>
      </xs:simpleType>
    </xs:attribute>
  </xs:complexType>
</xs:element>

The attribute RMSE is the root mean squared error, or standard deviation of the residuals.

The attribute transformation is similar to the one in ExponentialSmoothing, which specifies what transformation has been applied to the target series prior to executing the algorithm. If present, the inverse of this transform should be applied to the model predictions to get the predicted value of the original series.

The attribute constantTerm specifies the constant value μ in ARIMA part. This value should be added to the predicted value (before applying the inverse of the transformation, if specified) to get the prediction of the original series.

The attribute predictionMethod specifies the method of prediction of the ARIMA model. Description of each method is given in the Prediction Methods section. Note that if predictionMethod is exactLeastSquares then element MaximumLikelihoodStat must be present.

Nonseasonal Component and Seasonal Component

The AutoRegressive and MovingAverage components of the non-seasonal and seasonal component, respectively, of the time series model are described by the following elements:

<xs:element name="NonseasonalComponent">
  <xs:complexType>
    <xs:sequence>
      <xs:element ref="Extension" minOccurs="0" maxOccurs="unbounded"/>
      <xs:element ref="AR" minOccurs="0"/>
      <xs:element ref="MA" minOccurs="0"/>
    </xs:sequence>
    <xs:attribute name="p" type="xs:nonNegativeInteger" default="0"/>
    <xs:attribute name="d" type="xs:nonNegativeInteger" default="0"/>
    <xs:attribute name="q" type="xs:nonNegativeInteger" default="0"/>
  </xs:complexType>
</xs:element>

<xs:element name="SeasonalComponent">
  <xs:complexType>
    <xs:sequence>
      <xs:element ref="Extension" minOccurs="0" maxOccurs="unbounded"/>
      <xs:element ref="AR" minOccurs="0"/>
      <xs:element ref="MA" minOccurs="0"/>
    </xs:sequence>
    <xs:attribute name="P" type="xs:nonNegativeInteger" default="0"/>
    <xs:attribute name="D" type="xs:nonNegativeInteger" default="0"/>
    <xs:attribute name="Q" type="xs:nonNegativeInteger" default="0"/>
    <xs:attribute name="period" type="xs:nonNegativeInteger" use="required"/>
  </xs:complexType>
</xs:element>

<xs:element name="AR">
  <xs:complexType>
    <xs:sequence>
      <xs:element ref="Extension" minOccurs="0" maxOccurs="unbounded"/>
      <xs:group ref="REAL-ARRAY"/>
    </xs:sequence>
  </xs:complexType>
</xs:element>

<xs:element name="MA">
  <xs:complexType>
    <xs:sequence>
      <xs:element ref="Extension" minOccurs="0" maxOccurs="unbounded"/>
      <xs:element ref="MACoefficients" minOccurs="0"/>
      <xs:element ref="Residuals" minOccurs="0"/>
    </xs:sequence>
  </xs:complexType>
</xs:element>

<xs:element name="MACoefficients">
  <xs:complexType>
    <xs:sequence>
      <xs:element ref="Extension" minOccurs="0" maxOccurs="unbounded"/>
      <xs:group ref="REAL-ARRAY"/>
    </xs:sequence>
  </xs:complexType>
</xs:element>

<xs:element name="Residuals">
  <xs:complexType>
    <xs:sequence>
      <xs:element ref="Extension" minOccurs="0" maxOccurs="unbounded"/>
      <xs:group ref="REAL-ARRAY"/>
    </xs:sequence>
  </xs:complexType>
</xs:element>

The attribute p is the order of the non-seasonal autoregressive part of the model.

The attribute d is the order of the non-seasonal differencing.

The attribute q is the order of the non-seasonal moving average part of the model.

The attribute P is the order of the seasonal autoregressive part of the model.

The attribute D is the order of the seasonal differencing.

The attribute Q is the order of the seasonal moving average part of the model.

The attribute period is the seasonal length.

The REAL-ARRAY in AR specifies the polynomial coefficients of autoregressive (AR) part from order 1 to order p (or order P for seasonal AR).

The REAL-ARRAYs in MACoefficients specifies the polynomial coefficients of moving average (MA) part from order 1 to order q (or order Q for seasonal MA), as well as the residuals to which the coefficients are to be applied to.

The REAL-ARRAY in Residuals specifies the residual error values, that is the past differences between the actual time series values and the model predictions, to which the MA coefficients are multiplied by. These values are needed for the conditionalLeastSquares prediction method and for GARCH.

Dynamic Regressor

Time series predictions can usually depend on not just the previous values of the series but other independent time series as well. The fields containing such time series are called external or dynamic regressors.

The effects of the dynamic regressors can be added to the predictions of an ARIMA model or a State Space model (defined later in this document).

<xs:element name="DynamicRegressor">
  <xs:complexType>
    <xs:sequence>
      <xs:element ref="Extension" minOccurs="0" maxOccurs="unbounded"/>
      <xs:element ref="Numerator" minOccurs="0"/>
      <xs:element ref="Denominator" minOccurs="0"/>
      <xs:element ref="RegressorValues" minOccurs="0"/>
    </xs:sequence>
    <xs:attribute name="field" type="FIELD-NAME" use="required"/>
    <xs:attribute name="transformation" default="none">
      <xs:simpleType>
        <xs:restriction base="xs:string">
          <xs:enumeration value="none"/>
          <xs:enumeration value="logarithmic"/>
          <xs:enumeration value="squareroot"/>
        </xs:restriction>
      </xs:simpleType>
    </xs:attribute>
    <xs:attribute name="delay" type="INT-NUMBER" default="0"/>
    <xs:attribute name="futureValuesMethod" default="constant">
      <xs:simpleType>
        <xs:restriction base="xs:string">
          <xs:enumeration value="constant"/>
          <xs:enumeration value="trend"/>
          <xs:enumeration value="stored"/>
          <xs:enumeration value="otherModel"/>
          <xs:enumeration value="userSupplied"/>
        </xs:restriction>
      </xs:simpleType>
    </xs:attribute>
    <xs:attribute name="targetField" type="FIELD-NAME"/>
  </xs:complexType>
</xs:element>

<xs:element name="Numerator">
  <xs:complexType>
    <xs:sequence>
      <xs:element ref="Extension" minOccurs="0" maxOccurs="unbounded"/>
      <xs:element ref="NonseasonalFactor" minOccurs="0"/>
      <xs:element ref="SeasonalFactor" minOccurs="0"/>
    </xs:sequence>
  </xs:complexType>
</xs:element>

<xs:element name="Denominator">
  <xs:complexType>
    <xs:sequence>
      <xs:element ref="Extension" minOccurs="0" maxOccurs="unbounded"/>
      <xs:element ref="NonseasonalFactor" minOccurs="0"/>
      <xs:element ref="SeasonalFactor" minOccurs="0"/>
    </xs:sequence>
  </xs:complexType>
</xs:element>

<xs:element name="SeasonalFactor">
  <xs:complexType>
    <xs:sequence>
      <xs:element ref="Extension" minOccurs="0" maxOccurs="unbounded"/>
      <xs:group ref="REAL-ARRAY"/>
    </xs:sequence>
    <xs:attribute name="difference" type="INT-NUMBER" default="0"/>
    <xs:attribute name="maximumOrder" type="INT-NUMBER"/>
  </xs:complexType>
</xs:element>

<xs:element name="NonseasonalFactor">
  <xs:complexType>
    <xs:sequence>
      <xs:element ref="Extension" minOccurs="0" maxOccurs="unbounded"/>
      <xs:group ref="REAL-ARRAY"/>
    </xs:sequence>
    <xs:attribute name="difference" type="INT-NUMBER" default="0"/>
    <xs:attribute name="maximumOrder" type="INT-NUMBER"/>
  </xs:complexType>
</xs:element>

Numerator and Denominator define the numerator and denominator polynomials of the fraction for a specified predictor. More details are given in the prediction methods section.

field is the field name to which all the operations defined in this element are to be applied to.

transformation specifies the transformation applied to this field. Possible values are "none" "logarithmic" and "squareroot", with default "none".

delay specifies the number of intervals by which the input field's influence is delayed. For example, if delay="5", the value of the input predictor at time t doesn't affect forecast until five periods have elapsed (t+5).

difference specifies the order of differencing applied to the selected predictor series before estimating models.

maximumOrder defines the maximum order of numerator or denominator polynomial. All positive lower orders should be included in the model and represented in the REAL-ARRAY. In addition, order zero is always included for the numerator polynomial.

RegressorValues contains the observed (and possibly future) values of the predictor. It can also have additional information needed for predicting future values of the predictor. This element is defined below.

futureValuesMethod specifies a way to obtain future values of the predictor needed for predicting the target values. The following options are available:

  • constant: assume the predictor is constant for future times, i.e. use the last known value for all later times.
  • trend: use trend with coefficients ci found in <TrendCoefficients> as follows: xt+1 = xt + c0 + c1t + c2t2 ...
  • stored: future values of the predictor are stored in element <TimeSeries> inside <RegressorValues>
  • otherModel: future values of the predictor are computed as predictions of another model found in the same PMML document, either as part of a MiningModel or directly in PMML element. In the latter case modelName of the model for the predictor must be the same as the predictor name to help find the connections.
  • userSupplied: future values of the predictor should be supplied by the user during scoring. This approach allows "what-if" analysis, i.e. trying different future values of predictors to see how the target will be affected.

targetField the target time series field name to which the regressor value should be added to. This is only needed for the case of multidimensional time series.

<xs:element name="RegressorValues">
  <xs:complexType>
    <xs:sequence>
      <xs:element ref="Extension" minOccurs="0" maxOccurs="unbounded"/>
      <xs:element ref="TimeSeries" minOccurs="0"/>
      <xs:element ref="TrendCoefficients" minOccurs="0"/>
      <xs:element ref="TransferFunctionValues" minOccurs="0"/>
    </xs:sequence>
  </xs:complexType>
</xs:element>

<xs:element name="TrendCoefficients">
  <xs:complexType>
    <xs:sequence>
      <xs:element ref="Extension" minOccurs="0" maxOccurs="unbounded"/>
      <xs:element ref="REAL-SparseArray" minOccurs="0"/>
    </xs:sequence>
  </xs:complexType>
</xs:element>

<xs:element name="TransferFunctionValues">
  <xs:complexType>
    <xs:sequence>
      <xs:group ref="REAL-ARRAY"/>
    </xs:sequence>
  </xs:complexType>
</xs:element>

The coefficients ci used for trend computation described above are stored in the TrendCoefficients element, with the index of the array corresponding to the order of the polynomial term to which the coefficient is applied.

TransferFunctionValues can contain the past values of transfer function, which help in computation of the future values of transfer function, see scoring description below.

The Exact Least Squares method

Kalman filter and Theta recursion information is stored in the following element:
<xs:element name="MaximumLikelihoodStat">
  <xs:complexType>
    <xs:sequence>
      <xs:choice>
        <xs:element ref="KalmanState"/>
        <xs:element ref="ThetaRecursionState"/>
      </xs:choice>
    </xs:sequence>
    <xs:attribute name="method" use="required">
      <xs:simpleType>
        <xs:restriction base="xs:string">
          <xs:enumeration value="kalman"/>
          <xs:enumeration value="thetaRecursion"/>
        </xs:restriction>
      </xs:simpleType>
    </xs:attribute>
    <xs:attribute name="periodDeficit" default="0" type="INT-NUMBER"/>
  </xs:complexType>
</xs:element>

method specifies the method used to do the prediction. Different methods involve different parameters that are needed when the model is used to forecast into the future. Supported methods are “kalman” and “thetaRecursion”. The Kalman filter method can be used for time series with or without missing, and the theta recursion method can be used for time series without missing value. The predicted time points are calculated based on Chapters 5 and 12 of Brockwell and Davis’s book.

periodDeficit specifies the number of missing values at the end of differenced series. The actual model fitting period ends at (endTime – periodDeficit). For example, the original dependent values Y[97], Y[98], Y[99], Y[100] are 11, 10, missing, 12, respectively. If the model calls for first-differencing, the differenced series dY[98], dY[99], dY[100] will be -1, missing, missing. The model fitting period will end at (100 – 2), and the periodDeficit=2 means that user expects to forecast from this model to start at 101.

Noise prediction for Kalman filter method

<xs:element name="KalmanState">
  <xs:complexType>
    <xs:sequence>
      <xs:element ref="FinalOmega"/>
      <xs:element ref="FinalStateVector"/>
      <xs:element ref="HVector" minOccurs="0" maxOccurs="1"/>
    </xs:sequence>
  </xs:complexType>
</xs:element>

<xs:element name="FinalOmega">
  <xs:complexType>
    <xs:sequence>
      <xs:element ref="Matrix"/>
    </xs:sequence>
  </xs:complexType>
</xs:element>

<xs:element name="FinalStateVector">
  <xs:complexType>
    <xs:sequence>
      <xs:group ref="REAL-ARRAY"/>
    </xs:sequence>
  </xs:complexType>
</xs:element>

<xs:element name="HVector">
  <xs:complexType>
    <xs:sequence>
      <xs:group ref="REAL-ARRAY"/>
    </xs:sequence>
  </xs:complexType>
</xs:element>
KalmanState represents the final state of the Kalman iterations.
FinalOmega includes a symmetric m*m matrix which represents the final omega matrix where the m = max(p, q).
FinalStateVector includes a REAL-ARRAY. The length of the ARRAY should equal to m, ‘the larger of the degree of autoregression and the degree of moving average’.
HVector includes a REAL-ARRAY to define the first m psi weights instead of recalculating at scoring time.

Noise prediction for theta recursion method

<xs:element name="ThetaRecursionState">
  <xs:complexType>
    <xs:sequence>
      <xs:element ref="FinalNoise"/>
      <xs:element ref="FinalPredictedNoise"/>
      <xs:element ref="FinalTheta"/>
      <xs:element ref="FinalNu"/>
    </xs:sequence>
  </xs:complexType>
</xs:element>

<xs:element name="FinalNoise">
  <xs:complexType>
    <xs:sequence>
      <xs:group ref="REAL-ARRAY"/>
    </xs:sequence>
  </xs:complexType>
</xs:element>

<xs:element name="FinalPredictedNoise">
  <xs:complexType>
    <xs:sequence>
      <xs:group ref="REAL-ARRAY"/>
    </xs:sequence>
  </xs:complexType>
</xs:element>

<xs:element name="FinalTheta">
  <xs:complexType>
    <xs:sequence>
      <xs:element ref="Theta" maxOccurs="unbounded"/>
    </xs:sequence>
  </xs:complexType>
</xs:element>

<xs:element name="Theta">
  <xs:complexType>
    <xs:attribute name="i" type="INT-NUMBER"/>
    <xs:attribute name="j" type="INT-NUMBER"/>
    <xs:attribute name="theta" type="xs:double"/>
  </xs:complexType>
</xs:element>

<xs:element name="FinalNu">
  <xs:complexType>
    <xs:sequence>
      <xs:group ref="REAL-ARRAY"/>
    </xs:sequence>
  </xs:complexType>
</xs:element>
ThetaRecursionState represents the final state of theta-recursion estimation.
FinalNoise includes a REAL-ARRAY which represents the final values of noise. The length of the ARRAY equals to m, ‘the larger of the degree of autoregression and the degree of moving average’.
FinalPredictedNoise includes a REAL-ARRAY which represents the final predictions of noise. The length of the ARRAY equals to q, the degree of moving average.
FinalTheta contains a collection of θi,i-k values needed for predicting future values of the series. Those values are kept in triples with two indices to avoid ambiguity. Each triple is kept in element Theta that has attributes for i, j=i-k, and the corresponding θi,j value. FinalNu contain values for νi.

OutlierEffect

<xs:element name="OutlierEffect">
  <xs:complexType>
    <xs:sequence>
      <xs:element ref="Extension" minOccurs="0" maxOccurs="unbounded"/>
    </xs:sequence>
    <xs:attribute name="type" use="required">
      <xs:simpleType>
        <xs:restriction base="xs:string">
          <xs:enumeration value="additive"/>
          <xs:enumeration value="level"/>
          <xs:enumeration value="transient"/>
          <xs:enumeration value="seasonalAdditive"/>
          <xs:enumeration value="trend"/>
          <xs:enumeration value="innovational"/>
        </xs:restriction>
      </xs:simpleType>
    </xs:attribute>
    <xs:attribute name="startTime" use="required" type="REAL-NUMBER"/>
    <xs:attribute name="magnitude" use="required" type="REAL-NUMBER"/>
    <xs:attribute name="dampingCoefficient" use="optional" type="REAL-NUMBER"/>
  </xs:complexType>
</xs:element>

OutlierEffect supports a number of detected outliers during the modeling process. Their types, locations, and magnitudes are represented by attributes type, startTime, and magnitude respectively. dampingCoefficient specifies the value of delta for the transient outlier. Details of their application are given in the Prediction Methods section.

additive means there is an additive outlier, as a pulse function, at the time of startTime which needs to add a magnitude for predicted values at that time.

level means there is a level shift, as a step function, from the time of startTime which needs to add a magnitude for all predicted values from that time on.

transient means there is a transient change, as a damp function, from the time of startTime which needs to add a magnitude for predicted values from that time on, and the magnitude is damped over time.

seasonalAdditive is similar with level, but the magnitude, as a seasonal pulse function, is only added to the subsequent time points with the same seasonal phase of startTime.

trend means there is a local trend from the time of startTime which needs to add a magnitude for predicted values from that time, and the magnitude is increased or decreased over time based on its sign.

innovational means there is an innovational outlier at the time of startTime which needs to add a magnitude for the noise at that time.

Prediction Methods for ARIMA

For ARIMA models, the conditionalLeastSquares prediction method is simple. Given all information up to time t, one predicts the time series at time t+1, and then uses that to predict the series at time t+2 and so on. Given all information up to a given time t, the prediction at time t+1 is calculated using the ARIMA equations defined, the information required is the AR and MA coefficients and the residuals. To calculate predictions at time t+2 (or t+h in general), one needs the residuals and time series values up to time t+1 (or t+h-1 in general) to apply the AR and MA coefficients to respectively. One assumes all residuals after time t are 0 and the time series values after time t are given by the prediction calculations done for those time values. Scoring examples below provide all the details. The future values of the regressors are either calculated based on a trend, or are assumed not to change with time, or are supplied in PMML or are provided by the user during scoring.

Example for a non-seasonal ARIMA model with conditional least squares prediction

Consider an ARIMA(3,1,1) model with auto-regressive coefficients φ1, φ2, φ3 and a moving-average coefficient θ1. The model is represented as:

Δ1Yt =   θ1(B)/ϕ3(B) * at

Multiplying both sides by ϕ3 (B) this can be written as:

  (1 - φ1B - φ2B2 - φ3B3)(1-B)yt = (1 - θ1B)at

Rewriting this, the prediction at time t+1 is:

  yt+1 = (1 + φ1)yt - (φ1 - φ2)yt-1 - (φ2 - φ3)yt-2 - φ3yt-3 - θ1at

here at is the residual at time t. The AR and MA elements give the coefficients at each time step. The coefficient array gives the values of the coefficients starting at time t, t-1 and so on. Assume that the AR and MA coefficients at each time step were calculated as φ1=0.05, φ2=0.1, φ3=0.3, θ1=-0.4 and the fit error (residual) was 2:

<PMML xmlns="http://www.dmg.org/PMML-4_4" version="4.4">
  <Header copyright="DMG.org"/>
  <DataDictionary>
    <DataField name="orders" optype="continuous" dataType="double" displayName="TS"/>
    <DataField name="date" optype="continuous" dataType="dateDaysSince[1970]" displayName="Time"/>
  </DataDictionary>
  <TimeSeriesModel functionName="timeSeries" bestFit="ARIMA">
    <MiningSchema>
      <MiningField name="date" usageType="order"/>
      <MiningField name="orders" usageType="target"/>
    </MiningSchema>
    <TimeSeries usage="logical" startTime="1" endTime="6">
      <TimeAnchor type="dateDaysSince[1970]" offset="6940" stepsize="1" displayName="Month">
        <TimeCycle length="12" type="includeFromTo" displayName="Year">
          <Array type="int" n="2">0 11</Array>
        </TimeCycle>
      </TimeAnchor>
      <TimeValue index="1" value="11357.92"/>
      <TimeValue index="2" value="10605.95"/>
      <TimeValue index="3" value="16998.57"/>
      <TimeValue index="4" value="6563.75"/>
      <TimeValue index="5" value="6607.69"/>
      <TimeValue index="6" value="9839.0"/>
    </TimeSeries>
    <ARIMA RMSE="218.6" transformation="none">
      <NonseasonalComponent p="3" d="1" q="1">
        <AR>
          <Array type="real" n="3">0.05 0.1 0.3 </Array>
        </AR>
        <MA>
          <MACoefficients>
            <Array type="real">-0.4</Array>
          </MACoefficients>
          <Residuals>
            <Array type="real">2</Array>
          </Residuals>
        </MA>
      </NonseasonalComponent>
    </ARIMA>
  </TimeSeriesModel>
</PMML>

The prediction of 'orders' at time index "7" is therefore:

orders = 1.05*9839 + 0.05*6607.69 + 0.2*6563.75 - 0.3*16998.57 - (-0.4)*2

Transfer Function Scoring and Examples

Forecasting in the historical data

Suppose that a data set {Yt,X1t,…,XKt} is given for t=1,...,n, also a time series model is given. Then the one-step-ahead forecast can be obtained using the following steps:

Step 1. Computation of transfer function
For each predictor series, transfer function needs to be calculated. For a predictor series Xit, let the transfer function be Vit = Numi / Deni Δdi ΔDi Bli. It can be calculated as follows.
    Calculate Uit = Δdi ΔDi Bli Xit
    Recursively calculate Vit = -Σj=1 to di Deni.coeff(j) * Vit-j + Σj=0 to ni Numi.coeff(j) * Uit-j
where ni and di are degrees of numerator and denominator polynomials for the i-th predictor, respectively, Numi.coeff(j) and Deni.coeff(j) are the coefficients of those polynomials. All missing Vit-j in the first term are taken to be Vi,-∞ and missing Uit-j in the second term are taken to be Ui,-∞, where Ui,-∞ is the first non-missing measurement of Uit-j and Vi,-∞ = Numi.coeff(1)/ Deni.coeff(1) * Ui,-∞.
Step 2. Computation of noise process
  Nt = Δd ΔD Yt - μ - Σi=1 to K Numi / Deni Δdi ΔDi Bli Xit
Step 3. One-step-ahead forecast of noise process
Nt is a zero mean ARMA(p+sP,q+sQ) process. One-step-ahead forecast can be obtained by theta recursive method or Kalman filter method, denote it as N̂t.
Step 4. One-step-ahead of forecast for the series Δd Δd Yt
Let Zt = Δd ΔD Yt, then one-step-ahead of series Zt is
   Ẑt = μ + Σi=1 to K Vit + N̂t
Step 5. One-step-ahead of forecast the Yt
  Ŷt = Ẑt - Σj=1 to d+Ds τj Yt-j
where τj is the coefficient corresponding to power j of the difference operator Δd ΔD
Note: If Yt is a transformation of an original series, then we need to back transform results to get the prediction of the original series.

Forecasting beyond the historical data

Suppose that time series {Yt, X1t,…,XKt} are given up to time n, and that future predictor series {X1t,…,XKt} are given for up to time n+H or can be computed based on futureValuesMethod specified in DynamicRegressor element. Now we can use the transfer function model that is built on the historical data to forecast future values of target for t=n+1 to n+H using the method described in the above section except that we cannot compute the noise values Nt, t = n+1,...,n+H because there is no target value Yt, t = n+1,...,n+H. So based on Nt and N̂t in historical data, we will compute one-step-ahead forecast for noise process if t=n+1, and h-step-ahead forecast if h is between 1 and n+H.

Kalman filter example and scoring

<PMML xmlns="http://www.dmg.org/PMML-4_4" version="4.4">
    <Header copyright="DMG.org"/>
    <DataDictionary>
        <DataField name="VALUE" optype="continuous" dataType="double" displayName=" TS-VALUE"/>
        <DataField name="TS" optype="continuous" dataType="double" displayName="TS"/>
    </DataDictionary>
    <TimeSeriesModel functionName="timeSeries" bestFit="ARIMA">
        <MiningSchema>
            <MiningField name="TS" usageType="order"/>
            <MiningField name="VALUE" usageType="target"/>
        </MiningSchema>
        <TimeSeries usage="logical" startTime="1" endTime="240">
            <TimeValue index="1" value="1.00"/>
            <TimeValue index="2" value="0.58"/>
            <TimeValue index="239" value="0.49"/>
            <TimeValue index="240" value="0.48"/>
        </TimeSeries>
        <ARIMA RMSE="1.00854505389749" transformation="none" constantTerm="0.375271182246529" predictionMethod="exactLeastSquares">
            <NonseasonalComponent p="1" d="0" q="1">
                <AR>
                    <Array type="real"> 0.441912328691372 </Array>
                </AR>
                <MA>
                    <MACoefficients>
                        <Array type="real">-0.135110189708823</Array>
                    </MACoefficients>
                </MA>
            </NonseasonalComponent>
            <MaximumLikelihoodStat method="kalman" periodDeficit="0">
                 <KalmanState>
                    <FinalOmega>
                        <Matrix kind="symmetric" nbRows="1" nbCols="1">
                            <Array type="real" n="1">0</Array>
                        </Matrix>
                    </FinalOmega>
                    <FinalStateVector>
                        <Array n="1" type="real">0.00493251439212172</Array>
                    </FinalStateVector>
                </KalmanState>
            </MaximumLikelihoodStat>
        </ARIMA>
    </TimeSeriesModel>
</PMML>

This is an ARIMA(1,0,1) model Yt = μ + ( 1 - θ B ) / ( 1 - φ B ) at
Here μ = 0.375271182246529, θ = -0.135110189708823, and φ = 0.441912328691372. Since there is no predictor, we do not need to compute the transfer function values. Let Nt = Yt - μ, t = 1,2,...240. We can get the forecast value for N241 assuming that the noise Nt follows ARMA(1,1) model. Let m = max( p, q ). The state-space representation of Nt has been already given as
St+1 = FSt + Het
Nt = GSt + et
where St is a state vector of length m.

In this case, m=1, G=1, F=φ, and S241 = 0.00493251439212172 which is computed after model building and saved in PMML. The one-step-ahead forecast of N241 is
241 = GS241 = 0.00493251439212172.
And the one-step-ahead forecast of Y241 is Ŷ241 = μ + N̂241 = 0.375271182246529 + 0.00493251439212172 = 0.3802037
If we want the two-step-ahead forecast, then S242 = FS241 = 0.441912328691372 * 0.00493251439212172 = 0.002179739, N̂242 = GS242=0.002179739, and Ŷ242 = μ + N̂242 = 0.3774509.

Theta Recursion Example and Scoring

<PMML xmlns="http://www.dmg.org/PMML-4_4" version="4.4">
    <Header copyright="DMG.org"/>
    <DataDictionary>
        <DataField name="TS" optype="continuous" dataType="double" displayName="T"/>
        <DataField name="x" optype="continuous" dataType="double" displayName="predictor-series"/>
        <DataField name="y" optype="continuous" dataType="double" displayName="target-series"/>
    </DataDictionary>
    <TimeSeriesModel functionName="timeSeries" bestFit="ARIMA">
        <MiningSchema>
            <MiningField name="TS" usageType="order"/>
            <MiningField name="x" usageType="active"/>
            <MiningField name="y" usageType="target"/>
        </MiningSchema>
        <TimeSeries usage="logical" startTime="1" endTime="63">
            <TimeValue index="1" value="3750.1797916378027"/>
            <TimeValue index="2" value="3846.066273602286"/>
            <TimeValue index="63" value="12141.488999636887"/>
        </TimeSeries>
        <ARIMA RMSE="90.8960561930059" transformation="none" constantTerm="342.594932405502" predictionMethod="exactLeastSquares">
            <NonseasonalComponent p="1" d="1" q="1">
                <AR>
                    <Array type="real">0.590549588503187</Array>
                </AR>
                <MA>
                    <MACoefficients>
                        <Array type="real">0.459274266537189</Array>
                    </MACoefficients>
                </MA>
            </NonseasonalComponent>
            <DynamicRegressor field="x" futureValuesMethod="userSupplied">
                <Numerator>
                    <NonseasonalFactor>
                        <Array type="real">-0.0114728000479396 -.0115880130277021</Array>
                    </NonseasonalFactor>
                </Numerator>
                <RegressorValues>
                    <TimeSeries usage="logical" startTime="63" endTime="63">
                        <TimeValue index="63" value="2538309.727789499"/>
                    </TimeSeries>
                </RegressorValues>
            </DynamicRegressor>
            <MaximumLikelihoodStat method="thetaRecursion" periodDeficit="0">
                 <ThetaRecursionState>
                    <FinalNoise>
                        <Array n="1" type="real">127.264876980187</Array>
                    </FinalNoise>
                    <FinalPredictedNoise>
                        <Array n="1" type="real">-2.10096285515485</Array>
                    </FinalPredictedNoise>
                    <FinalTheta>
                        <Theta i="63" j="1" theta="-0.459274266537189"/>
                    </FinalTheta>
                    <FinalNu>
                        <Array n="1" type="real">1</Array>
                    </FinalNu>
                </ThetaRecursionState>
            </MaximumLikelihoodStat>
        </ARIMA>
    </TimeSeriesModel>
</PMML>

The dataset has two variables: target Y and predictor X. The dataset has 63 points and the model is ARIMA(1,1,1) with a predictor Xt:
  (1-B)Yt = μ + (ω0 - ω1 B) Xt + (1 - θB)/(1 - φB) at
Where μ=342.594932405502, ω0=-0.0114728000479396 , ω1=-0.0115880130277021, θ=0.459274266537189 and φ=0.590549588503187. Now the following steps are used to compute the forecast value for t = 64.

Step 1. Compute transfer function for Xt:
   V64 = ω0*X64 - ω1*X63 = -0.0114728000479396 * 2538309.727789499 - 0.0115880130277021 * 2538309.727789499 = 292.4462.
Where X63 is saved in the PMML inside RegressorValues in DynamicRegressor, and X64 is obtained based on the constant trend.

Step 2. Compute the noise prediction
  N̂64=φ * N63 + θ63,1 ( N63 - N̂63 )
where N63 is noise at the time t=63. The noise N63, noise prediction N̂63, and parameter θ63,1 are saved in PMML after model building. We compute N̂64 = 15.74182.

Step 3. Let Z64 = ( 1 - B ) Y64. Its prediction is Ẑ64 = μ + V64 + N̂64 = 650.783

Step 4. The forecasted value for target at t=64 is Ŷ64 = Y63 + Ẑ64 = 12141.488999636887 + 650.783 = 12792.27

More generally, theta recursion scoring is computed as follows. Let Nt follow ARMA( p, q ) model. One-step-ahead noise forecast is given by

  N̂n+1 = φ1 Nn + φ2 Nn-1 +...+ φp Nn+1-p + Σj=1 to q θn,j (Nn+1-j - N̂n+1-j )

h-step-ahead noise forecast when h is less than or equal q is
   N̂n(h) = Σi=1 to p φin (h-i) + Σj=h to q θn+h-1,j (Nn+h-j - N̂n+h-j )
and for h > q
   N̂n(h) = Σi=1 to p φin (h-i)
where N̂n (j) = Nn-j for j less than or equal to 0.

And θij are computed recursively as follows:
For i=n+1,n+2,... when k < i - q we set θi,i-k = 0. For k between i-q and i-1 we have
  θi,i-k = νk-1 (κ(i+1,k+1) - Σj=i-q to k-1 θk,k-j θi,i-j νj )
  νi = κ(i+1,i+1) - Σj=i-q to i-1 θi,i-j2 νj = Σr=0 to q ϑr2 - Σk=1 to q θi,k2 νi-k
where κ(i+1,k+1) = Σr=0 to q-|i-k| ϑr ϑr+|i-k|, for i-q-1 < k < i, with ϑ0 = 1, ϑi = -θi for i=1,...,q, and ϑi = 0 if i > q.

Consider a slightly more complicated example of a theta recursion model.

<PMML xmlns="http://www.dmg.org/PMML-4_4" version="4.4">
    <Header copyright="DMG.org"/>
    <DataDictionary>
        <DataField name="TS" optype="continuous" dataType="double" displayName="T"/>
        <DataField name="x" optype="continuous" dataType="double" displayName="predictor-series"/>
        <DataField name="y" optype="continuous" dataType="double" displayName="target-series"/>
    </DataDictionary>
    <TimeSeriesModel functionName="timeSeries" bestFit="ARIMA">
        <MiningSchema>
            <MiningField name="TS" usageType="order"/>
            <MiningField name="x" usageType="active"/>
            <MiningField name="y" usageType="target"/>
        </MiningSchema>
        <TimeSeries usage="logical" startTime="1" endTime="63">
            <TimeValue index="1" value="3750.1797916378027"/>
            <TimeValue index="2" value="3846.066273602286"/>
            <TimeValue index="63" value="12141.488999636887"/>
        </TimeSeries>
        <ARIMA transformation="none" constantTerm="320.636778635366" predictionMethod="exactLeastSquares">
            <NonseasonalComponent p="1" d="1" q="2">
                <AR>
                    <Array type="real">-0.09114206662863</Array>
                </AR>
                <MA>
                    <MACoefficients>
                        <Array type="real">-0.21505809467966 -0.182217351518243 </Array>
                    </MACoefficients>
                </MA>
            </NonseasonalComponent>
            <DynamicRegressor field="x" futureValuesMethod="userSupplied">
                <Numerator>
                    <NonseasonalFactor>
                        <Array type="real">-0.0101715603339601  -0.0102725915582332</Array>
                    </NonseasonalFactor>
                </Numerator>
                <RegressorValues>
                    <TimeSeries usage="logical" startTime="63" endTime="63">
                        <TimeValue index="63" value="2538309.727789499"/>
                    </TimeSeries>
                </RegressorValues>
            </DynamicRegressor>
            <MaximumLikelihoodStat method="thetaRecursion" periodDeficit="0">
                <ThetaRecursionState>
                    <FinalNoise>
                        <Array n="2" type="real">-0.948468422344376 130.727431958243</Array>
                    </FinalNoise>
                    <FinalPredictedNoise>
                        <Array n="2" type="real">-9.3141625486384 27.6726476335164</Array>
                    </FinalPredictedNoise>
                    <FinalTheta>
                        <Theta i="63" j="1" theta="0.21505809467966"/>
                        <Theta i="63" j="2" theta="0.182217351518243"/>
                    </FinalTheta>
                    <FinalNu>
                        <Array n="2" type="real">1 1</Array>
                    </FinalNu>
                </ThetaRecursionState>
            </MaximumLikelihoodStat>
        </ARIMA>
    </TimeSeriesModel>
</PMML>

The model is ARIMA(1,1,2) with a predictor Xt:
  (1-B)Yt = μ + (ω0 - ω1 B) Xt + (1 - θ1B - θ2B2)/(1 - φB) at
Where μ=320.636778635366, ω0=-0.0101715603339601 , ω1=-0.0102725915582332, θ1=-0.21505809467966, θ2=-0.182217351518243, and φ=-0.09114206662863. Now the following steps are used to compute the forecast value for t = 64.

Step 1. Compute transfer function for Xt:
   V64 = ω0*X64 - ω1*X63 = =-0.0101715603339601 * 2538309.727789499 -(-0.0102725915582332) * 2538309.727789499 = 256.4485.
Where X63 is saved in the PMML inside RegressorValues in DynamicRegressor, and X64 is specified by user at the time of scoring.

Step 2. Compute the noise prediction
  N̂64 = φ * N63 + θ63,1 ( N63 - N̂63 ) + θ63,2 ( N62 - N̂62 )
where N63 is noise at the time t=63, N62 is noise at the time t=62. The noise N62 and N63, noise prediction N̂62 and N̂63, and parameters θ63,1 and θ63,2 are saved in PMML after model building. We compute N̂64 = 11.77238.

Step 3. Let Z64 = ( 1 - B ) Y64. Its prediction is Ẑ64 = μ + V64 + N̂64 = 588.8577

Step 4. The forecasted value for target at t=64 is Ŷ64 = Y63 + Ẑ64 = 12141.488999636887 + 588.8577 = 12730.35

To predict the time series at t=65 we will use the same steps, but we also need to compute the new θi,i-k.

Step 1. Compute transfer function for Xt, with user-supplied predictor values:
   V65 = ω0*X65 - ω1*X64 = =-0.0101715603339601 * 2538308 -(-0.0102725915582332) * 2538309.727789499 = 256.4661.

Step 2. Compute the noise prediction
  N̂65 = φ * N̂64 + θ64,2 ( N63 - N̂63 )
where θ64,2 is computed as follows:
Let i=64, k=62, then according to the theta recursive method, θ64,2 = θ64,(64-62) = ν62-1 (κ(65,63) - θ62,1 * θ64,3 * ν61 ) = ν62-10 ϑ2 - θ62,1 * θ64,3 * ν61 )
where ν61=1 and ν62=1, and they are saved in PMML after model building.
κ(65,63) = ϑ0 * ϑ2 = 1*(-θ2 ) = 0.182217351518243, and θ64,3 = 0, therefore θ64,2 = ν62-10 ϑ2 ) = 0.1822174
Finally, N̂63(2) = φ * N̂64 + θ64,2 (N63 - N̂63 ) = -0.09114206662863 * 11.77238 + 0.1822174 * (130.727431958243 - 27.6726476335164) = 17.70541

Step 3. Let Z65 = (1-B)Y65. Its prediction is Ẑ65 = μ + V65 + N̂63(2) = 594.8083

Step 4. The forecast value for target at t=65 is Ŷ65 = Ŷ64 + Ẑ65 = 12730.35 + 594.8083 = 13325.16

Confidence of Prediction Calculation

Step 1. Compute the variance estimate of h-step ahead of noise process:

If the theta recursive method is used, the variance of h-step ahead of noise process can be computed as

σNt2(h) = σ2Σj=0 to h-1r=0 to j χ2θt+h-r-1,j-r)2νt+h-j-1

Where σ2 is estimated as the squared RMSE, the constants χr are calculated recursively as

χ0 = 1

χr = Σk=1 to min(p,r) φkχr-k, r=1,2,3,...

And the computation details of νt and θij can be found in proposition 5.2.2 and equation 5.3.9 in Chapter 5 of Brockwell and Davis (1991) where this is called "The innovation algorithm".

If the Kalman Filter method is used, the variance of h-step ahead of noise process can be computed as

σNt2(h) = σ2t(h)(1,1) + 1)

Where σ2 is estimated as the squared RMSE, and the Ωt(h)(1,1) can be computed as

Ωt(h) = FΩt(h-1)FT + HHT

with Ωt(1) = Ωt+1

Step 2. Compute the confidence interval of h-step-ahead prediction:

The 100(1-α)% confidence interval of prediction at the time point t+h is computed as

(Ŷt(h) - tdf,α/2Nt(h), Ŷt(h) + tdf,α/2Nt(h))

where tdf,α/2 is the (1-α/2)100th percentile of the t distribution with degree of freedom df which can be computed by the number of valid noise residuals minus the number of parameters.

GARCH

GARCH (Generalized Auto Regressive Conditional Heteroscedasticity) models (Bollerslev, T. P., 1986, Generalized Autoregressive Conditional Heteroscedasticity, Journal of Econometrics, 31: 307-327.) are widely used to characterize and model time series with time-varying volatility. In GARCH models the conditional variance of the error terms are assumed to have a structure very similar to the structure of the conditional expectation in an ARMA model, i.e., GARCH models assume the variance of the current error term to be a linear function of the squares of the previous errors and their variances:
  Yt = μ + Σi=1 to p φiYt-i + Σj=1 to q θjεt-j + εt
  ht = α0 + Σi=1 to gq αiεt-i2 + Σj=1 to gp βjht-j
Here εt are residuals from the MA part of the non-seasonal component, ht are variances, αi and βj are residual square coefficients and coefficients of past variances, respectively. gp and gq are order of variance and order of residual squares, respectively.

The following element contains both the ARMA part and a GARCH part for a complete GARCH model.

<xs:element name="GARCH">
  <xs:complexType>
    <xs:sequence>
      <xs:element ref="Extension" minOccurs="0" maxOccurs="unbounded"/>
      <xs:element ref="ARMAPart"/>
      <xs:element ref="GARCHPart"/>
    </xs:sequence>    
  </xs:complexType>
</xs:element>

<xs:element name="ARMAPart">
  <xs:complexType>
    <xs:sequence>
      <xs:element ref="Extension" minOccurs="0" maxOccurs="unbounded"/>
      <xs:element ref="AR"/>
      <xs:element ref="MA"/>
    </xs:sequence>    
    <xs:attribute name="constant" default="0" type="REAL-NUMBER"/>
    <xs:attribute name="p" use="required" type="INT-NUMBER"/>
    <xs:attribute name="q" use="required" type="INT-NUMBER"/>
  </xs:complexType>
</xs:element>

<xs:element name="GARCHPart">
  <xs:complexType>
    <xs:sequence>
      <xs:element ref="Extension" minOccurs="0" maxOccurs="unbounded"/>
      <xs:element ref="ResidualSquareCoefficients"/>
      <xs:element ref="VarianceCoefficients"/>
    </xs:sequence>    
    <xs:attribute name="constant" default="0" type="REAL-NUMBER"/>
    <xs:attribute name="gp" use="required" type="INT-NUMBER"/>
    <xs:attribute name="gq" use="required" type="INT-NUMBER"/>
  </xs:complexType>
</xs:element>

<xs:element name="ResidualSquareCoefficients">
  <xs:complexType>
    <xs:sequence>
      <xs:element ref="Extension" minOccurs="0" maxOccurs="unbounded"/>
      <xs:element ref="Residuals" minOccurs="0"/>
      <xs:element ref="MACoefficients" minOccurs="0"/>
    </xs:sequence>
  </xs:complexType>
</xs:element>

<xs:element name="VarianceCoefficients">
  <xs:complexType>
    <xs:sequence>
      <xs:element ref="Extension" minOccurs="0" maxOccurs="unbounded"/>
      <xs:element ref="PastVariances" minOccurs="0"/>
      <xs:element ref="MACoefficients" minOccurs="0"/>
    </xs:sequence>
  </xs:complexType>
</xs:element>

<xs:element name="PastVariances">
  <xs:complexType>
    <xs:sequence>
      <xs:element ref="Extension" minOccurs="0" maxOccurs="unbounded"/>
      <xs:group ref="REAL-ARRAY"/>
    </xs:sequence>
  </xs:complexType>
</xs:element>

GARCH Example and Scoring

<PMML xmlns="http://www.dmg.org/PMML-4_4" version="4.4">
  <Header copyright="(c) DMG 2016" description="Time Series Example"/>
  <DataDictionary numberOfFields="1">
    <DataField dataType="double" displayName="Time Series" name="ts" optype="continuous"/>
  </DataDictionary>
  <TimeSeriesModel algorithmName="GARCH" bestFit="GARCH" functionName="timeSeries">
      <MiningSchema>
        <MiningField name="ts" usageType="target"/>
      </MiningSchema> 
      <TimeSeries usage="original" startTime="97" endTime="100" interpolationMethod="none">
        <TimeValue index="97" value="1.1"/>
        <TimeValue index="98" value="1.2"/>
        <TimeValue index="99" value="1.3"/>
        <TimeValue index="100" value="1.4"/>
      </TimeSeries>
      <GARCH>
        <ARMAPart p="4" q="2" constant="-3.4">
            <AR>
                <Array n="4" type="real">1.5 1.4 1.3 1.2</Array>
            </AR>
            <MA>
                <MACoefficients>
                    <Array n="2" type="real">1.1 1.2</Array>
                </MACoefficients>
                <Residuals>
                    <Array n="2" type="real">1.1 1.2</Array>
                </Residuals>
            </MA>
        </ARMAPart>
        <GARCHPart constant="0.12" gp="2" gq="3">
            <ResidualSquareCoefficients>
                <Residuals>
                    <Array n="3" type="real">1.1 1.2 1.3</Array>
                </Residuals>
                <MACoefficients>
                    <Array n="3" type="real">1.3 1.2 1.1</Array>
                </MACoefficients>
            </ResidualSquareCoefficients>
            <VarianceCoefficients>
                <PastVariances>
                    <Array n="2" type="real">1.1 1.2</Array>
                </PastVariances>
                <MACoefficients>
                    <Array n="2" type="real">1.3 1.2</Array>
                </MACoefficients>
            </VarianceCoefficients>
        </GARCHPart>
      </GARCH>
  </TimeSeriesModel>
</PMML>

The steps to score such a model one step ahead are:

  1. Compute Yt+1 using conditional least squares prediction method.
  2. Compute εt+1 = Yt+1 - Σi=1 to p φiYt+1-i - Σj=1 to q θjεt+1-j - μ
  3. Compute ht+1 = α0 + Σi=1 to gq αiεt+1-i2 + Σj=1 to gp βjht+1-j, where α0 is the constant in GARCHPart element, in this case 0.12.
This can be repeated for the needed number of steps.

In this example PMML the values of Yt are: 1.1, 1.2, 1.3, 1.4, μ=-3.4, p=4, q=2, gp=2, gq=3; values of φi are found in AR element of ARMAPart: 1.5, 1.4, 1.3, 1.2; values of θj are in MACoefficients of MA in ARMAPart: 1.1, 1.2; αi's are the MACoefficients in ResidualSquareCoefficients: 1.3, 1.2, 1.1; βi's are the MACoefficients in VarianceCoefficients: 1.3, 1.2.

State Space models

State Space models are multi-dimensional time series models which can be represented in the following form:

Yt = GXt + Wt
Xt = FXt-1 + Vt-1


The Observable vector Y is a vector representing an m-dimensional time series yt at time t.
The State vector X is a size p vector describing how the series is advanced from time t-1 to time t.
The Measurement matrix G is a m by p matrix converting the State Vector X at time t to an observable vector Yt.
The Transition matrix F is a p by p matrix describing how the state of the series changes from the previous time step to the present time.
W and V represent noise and error, not important for prediction of the value of each time series itself.
If dynamic regressors are present, the final predicted value is the sum of the value predicted by the dynamic regressors and the value predicted by the matrix multiplication method.

Note that general ARIMA models can be expressed as state space models, but not all state space models are ARIMA models.

<xs:element name="StateSpaceModel">
  <xs:complexType>
    <xs:sequence>
      <xs:element ref="Extension" minOccurs="0" maxOccurs="unbounded"/>
      <xs:element ref="StateVector" minOccurs="0"/>
      <xs:element ref="TransitionMatrix" minOccurs="0"/>
      <xs:element ref="MeasurementMatrix" minOccurs="0"/>
      <xs:element ref="PsiVector" minOccurs="0"/>
      <xs:element ref="DynamicRegressor" minOccurs="0" maxOccurs="unbounded"/>
    </xs:sequence>
    <xs:attribute name="variance" type="REAL-NUMBER"/>
    <xs:attribute name="period" default="none"/>
    <xs:attribute name="intercept" type="REAL-NUMBER" default="0"/>
  </xs:complexType>
</xs:element>

<xs:element name="StateVector">
  <xs:complexType>
    <xs:sequence>
      <xs:element ref="Extension" minOccurs="0" maxOccurs="unbounded"/>
      <xs:group ref="REAL-ARRAY"/>
    </xs:sequence>
  </xs:complexType>
</xs:element>

<xs:element name="TransitionMatrix">
  <xs:complexType>
    <xs:sequence>
      <xs:element ref="Extension" minOccurs="0" maxOccurs="unbounded"/>
      <xs:element ref="Matrix"/>
    </xs:sequence>
  </xs:complexType>
</xs:element>

<xs:element name="MeasurementMatrix">
  <xs:complexType>
    <xs:sequence>
      <xs:element ref="Extension" minOccurs="0" maxOccurs="unbounded"/>
      <xs:element ref="Matrix"/>
    </xs:sequence>
  </xs:complexType>
</xs:element>

<xs:element name="PsiVector">
  <xs:complexType>
    <xs:sequence>
      <xs:element ref="Extension" minOccurs="0" maxOccurs="unbounded"/>
      <xs:group ref="REAL-ARRAY"/>
    </xs:sequence>
    <xs:attribute name="targetField" type="xs:string"/>
    <xs:attribute name="variance" type="xs:string"/>
  </xs:complexType>
</xs:element>

The vector "PsiVector" is a representation of an ARMA model as a MA model of infinite order. The elements of this vector, ψ, can be used to forecast the variance of the next step prediction; in most cases, a very good approximation of the variance of the predicted value h time steps in the future is:
σ2t+h = σt2(1 + ψ12 + ... + ψh-12)
For the case of multidimensional time series models, the targetField attribute defines which time series the vector belongs to and variance defines the variance of that time series. This allows for different variances for the different time series.

Example for a state space model with two target series

The result of fitting a state space model to a time series is the three parameters; the state vector Xt, the transition matrix F and the Measurement matrix G. For the purposes of prediction, the errors are ignored. Using the equations defining s state space model, the prediction of the observable vector Y is:
  Yt = G (FXt)
Continuing the method, the prediction for the next time step would be:
  Yt+1 = G (F Xt+1)
hence
   Yt+1 =G (F (F Xt))

Consider two stationary time series, Y1 and Y2, fit with a state space model resulting in the parameters:
 G =
1 0 0 1
0 0 1 0

 F =
0.052 1 0 0
0.019 0 1 0
0.373 0 0 0
1.0 0 0 1

 X = [-0.53, -2.41, 0.78, 88.74]T
Next suppose the time series depends on 1 regressor variable z, given by
  Y1(z) = zt + zt-1
  Y2(z) = zt
In other words, the final prediction for the time series at time t+1 is:

  Yt+1 = G (FXt) + Z
where
  Z1 = zt+1 + zt
  Z2 = zt+1

The PMML representation is given as:
<PMML xmlns="http://www.dmg.org/PMML-4_4" version="4.4">
  <Header copyright="DMG.org"/>
  <DataDictionary>
    <DataField name="Y1" optype="continuous" dataType="double"/>
    <DataField name="Y2" optype="continuous" dataType="double"/>
    <DataField name="date" optype="continuous" dataType="dateDaysSince[1970]" displayName="TS-VALUE"/>
    <DataField name="z" optype="continuous" dataType="double" displayName="ExternalRegressor"/>
  </DataDictionary>
  <TimeSeriesModel functionName="timeSeries" bestFit="StateSpaceModel">
    <MiningSchema>
      <MiningField name="date" usageType="order"/>
      <MiningField name="z" usageType="active"/>
      <MiningField name="Y1" usageType="target"/>
      <MiningField name="Y2" usageType="target"/>
    </MiningSchema>
    <Output>
      <OutputField name="orders" feature="predictedValue" targetField="Y1" dataType="double"/>
      <OutputField name="profit" feature="predictedValue" targetField="Y2" dataType="double"/>
    </Output>
    <TimeSeries usage="logical" startTime="1" endTime="6" field="Y1">
      <TimeAnchor type="dateDaysSince[1970]" offset="6940" stepsize="1" displayName="Month">
        <TimeCycle length="12" type="includeFromTo" displayName="Year">
          <Array type="int" n="2">0 11</Array>
        </TimeCycle>
      </TimeAnchor>
      <TimeValue index="1" value="11357.92"/>
      <TimeValue index="2" value="10605.95"/>
      <TimeValue index="3" value="16998.57"/>
      <TimeValue index="4" value="6563.75"/>
      <TimeValue index="5" value="6607.69"/>
      <TimeValue index="6" value="9839.0"/>
    </TimeSeries>
    <TimeSeries usage="logical" startTime="1" endTime="6" field="Y2">
      <TimeAnchor type="dateDaysSince[1970]" offset="6940" stepsize="1" displayName="Month">
        <TimeCycle length="12" type="includeFromTo" displayName="Year">
          <Array type="int" n="2">0 11</Array>
        </TimeCycle>
      </TimeAnchor>
      <TimeValue index="1" value="11357.92"/>
      <TimeValue index="2" value="10605.95"/>
      <TimeValue index="3" value="16998.57"/>
      <TimeValue index="4" value="6563.75"/>
      <TimeValue index="5" value="6607.69"/>
      <TimeValue index="6" value="9839.0"/>
    </TimeSeries>
      <StateSpaceModel>
        <StateVector>
          <Array type="real">-0.52 -2.41 0.78 88.74</Array>
        </StateVector>
        <TransitionMatrix>
      <Matrix nbRows="4" nbCols="4">
        <Array type="real">0.052 1 0 0</Array>
            <Array type="real">0.119 0 1 0</Array>
            <Array type="real">0.373 0 0 0</Array>
            <Array type="real">1.000 0 0 1</Array>
      </Matrix>
        </TransitionMatrix>
        <MeasurementMatrix>
          <Matrix nbRows="1" nbCols="4">
            <Array type="real">1 0 0 1</Array>
            <Array type="real">0 0 1 0</Array>
          </Matrix>
        </MeasurementMatrix>
        <PsiVector targetField="Y1" variance="10">
          <Array type="real">-0.402 0.098 0.330</Array>
        </PsiVector>
        <DynamicRegressor field="z" targetField="Y1" delay="0" futureValuesMethod="constant">
          <Numerator>
            <NonseasonalFactor>
              <Array type="real">1 -1</Array>
            </NonseasonalFactor>
          </Numerator>
          <Denominator>
            <NonseasonalFactor>
              <Array type="real">1</Array>
            </NonseasonalFactor>
          </Denominator>
          <RegressorValues>
            <TimeSeries>
              <TimeValue index="6" value="9.1"/>
            </TimeSeries>
          </RegressorValues>
        </DynamicRegressor>
        <DynamicRegressor field="z" targetField="Y2" delay="0" futureValuesMethod="constant">
          <Numerator>
            <NonseasonalFactor>
              <Array type="real">1</Array>
            </NonseasonalFactor>
          </Numerator>
          <Denominator>
            <NonseasonalFactor>
              <Array type="real">1</Array>
            </NonseasonalFactor>
          </Denominator>
          <RegressorValues>
            <TimeSeries>
              <TimeValue index="6" value="9.1"/>
            </TimeSeries>
          </RegressorValues>
        </DynamicRegressor>
      </StateSpaceModel>
  </TimeSeriesModel>
</PMML>

  Y1t = 85.77
  Y2t = -0.19
The coefficients of the regressors are stored in the Numerator and Denominator element. Substituting those, we get the expected equation for the regressor prediction:
  Y1(z) = z7 + z6
  Y2(z) = z7
The regressor value at t=6 is stored in the RegressorValue element
   z6 = 9.1
The regressor value at t=7 is unknown, but the futureValuesMethod="constant" attribute in the DynamicRegressor element indicates that we should take it to be same as that at t=6:
   z7 = 9.1
We therefore get the final prediction:
   orders = 85.77 + 1*9.1 + 1*9.1 = 103.97
   profit = -0.19 + 1*9.1 = 8.91

The variance of the series Y1at time index 6 is given as 10 and the first element of the PsiVector is given; the predicted variance of the series prediction at time index 7 is therefore:
  10*(1 + (-0.402)2) = 11.6
Since the variance of the series Y2 is not given, we cannot calculate the variance of the predicted value of Y2

Other types of models.

The following elements are not used in this version of PMML and only serve as placeholders for future versions.
SpectralAnalysis describes the Fourier spectrum of a time series.
SeasonalTrendDecomposition contains one or more fit functions which represent the trend component of the time series and optionally contain information on seasonal oscillations which are modeled on top of the trend component.

<xs:element name="SpectralAnalysis">
</xs:element>

<xs:element name="SeasonalTrendDecomposition">
</xs:element>
e-mail info at dmg.org