PMML 4.4.1 - Time Series Models

A Time Series is a sequence of data points, measured at points in time, usually, but not necessarily, forming equidistant intervals. Time series analysis strives to understand such time series, often with the goal of making forecasts (predictions) or of filling in missing values between known data points. Time series prediction is the use of a model to predict future events based on known past events before they are measured. Interpolation is the use of a model to complement or amend values between known data points.

The model must contain information on the general trend, a description of periodic behavior and an overall fitting function that can be used for forecasting and/or interpolation. It may also contain detailed information on various aspects of the time series and the expected forecasting accuracy.

In addition to the entries common to all models, a TimeSeriesModel contains results of at least one time series algorithm, for example SpectralAnalysis, ARIMA, ExponentialSmoothing or SeasonalTrendDecomposition. In PMML 4.4, only Exponential Smoothing, ARIMA, GARCH, and State Space models are defined, the other algorithms are planned for later versions. TimeSeries elements can hold original or predicted time series values.

Please find the documentation for particular model types in the following sections:

Exponential Smoothing
ARIMA and Transfer Function Models
GARCH
State Space Models
Other Model Types

<xs:element name="TimeSeriesModel">
  <xs:complexType>
    <xs:sequence>
      <xs:element ref="Extension" minOccurs="0" maxOccurs="unbounded"/>
      <xs:element ref="MiningSchema"/>
      <xs:element ref="Output" minOccurs="0"/>
      <xs:element ref="ModelStats" minOccurs="0"/>
      <xs:element ref="ModelExplanation" minOccurs="0"/>
      <xs:element ref="LocalTransformations" minOccurs="0"/>
      <xs:element ref="TimeSeries" minOccurs="0" maxOccurs="unbounded"/>
      <xs:element ref="SpectralAnalysis" minOccurs="0"/>
      <xs:element ref="ARIMA" minOccurs="0"/>
      <xs:element ref="ExponentialSmoothing" minOccurs="0"/>
      <xs:element ref="SeasonalTrendDecomposition" minOccurs="0"/>
      <xs:element ref="StateSpaceModel" minOccurs="0"/>
      <xs:element ref="GARCH" minOccurs="0"/>
      <xs:element ref="ModelVerification" minOccurs="0"/>
      <xs:element ref="Extension" minOccurs="0" maxOccurs="unbounded"/>
    </xs:sequence>
    <xs:attribute name="modelName" type="xs:string" use="optional"/>
    <xs:attribute name="functionName" type="MINING-FUNCTION" use="required"/>
    <xs:attribute name="algorithmName" type="xs:string" use="optional"/>
    <xs:attribute name="bestFit" type="TIMESERIES-ALGORITHM" use="required"/>
    <xs:attribute name="isScorable" type="xs:boolean" default="true"/>
  </xs:complexType>
</xs:element>
 

<xs:simpleType name="TIMESERIES-ALGORITHM">
  <xs:restriction base="xs:string">
    <xs:enumeration value="ARIMA"/>
    <xs:enumeration value="ExponentialSmoothing"/>
    <xs:enumeration value="SeasonalTrendDecomposition"/>
    <xs:enumeration value="SpectralAnalysis"/>
    <xs:enumeration value="StateSpaceModel"/>
    <xs:enumeration value="GARCH"/>
  </xs:restriction>
</xs:simpleType>

modelName and algorithmName can have arbitrary strings describing the specific model and algorithm.

functionName must be timeSeries.

The isScorable attribute indicates whether the model is valid for scoring. If this attribute is true or if it is missing, then the model should be processed normally. However, if the attribute is false, then the model producer has indicated that this model is intended for information purposes only and should not be used to generate results. In order to be valid PMML, all required elements and attributes must be present, even for non-scoring models. For more details, see General Structure.

The attribute bestFit is required, and indicates which of the time series algorithms provided by the model results in the best-fitting model. This algorithm should be used when scoring the model.

<xs:element name="TimeSeries">
  <xs:complexType>
    <xs:sequence>
      <xs:element ref="TimeAnchor" minOccurs="0" maxOccurs="1"/>
      <xs:element ref="TimeValue" minOccurs="1" maxOccurs="unbounded"/>
    </xs:sequence>
    <xs:attribute name="usage" type="TIMESERIES-USAGE" default="original"/>
    <xs:attribute name="startTime" type="REAL-NUMBER"/>
    <xs:attribute name="endTime" type="REAL-NUMBER"/>
    <xs:attribute name="interpolationMethod" type="INTERPOLATION-METHOD" default="none"/>
    <xs:attribute name="field" type="FIELD-NAME"/>
  </xs:complexType>
</xs:element>

<xs:simpleType name="TIMESERIES-USAGE">
  <xs:restriction base="xs:string">
    <xs:enumeration value="original"/>
    <xs:enumeration value="logical"/>
    <xs:enumeration value="prediction"/>
  </xs:restriction>
</xs:simpleType>

<xs:element name="TimeValue">
  <xs:complexType>
    <xs:sequence>
      <xs:element ref="Timestamp" minOccurs="0" maxOccurs="1"/>
    </xs:sequence>
    <xs:attribute name="index" type="INT-NUMBER" use="optional"/>
    <xs:attribute name="time" type="NUMBER" use="optional"/>
    <xs:attribute name="value" type="REAL-NUMBER" use="required"/>
    <xs:attribute name="standardError" type="REAL-NUMBER" use="optional"/>
  </xs:complexType>
</xs:element>

<xs:element name="TimeAnchor">
  <xs:complexType>
    <xs:sequence>
      <xs:element ref="TimeCycle" minOccurs="0" maxOccurs="unbounded"/>
      <xs:element ref="TimeException" minOccurs="0" maxOccurs="2"/>
    </xs:sequence>
    <xs:attribute name="type" type="TIME-ANCHOR"/>
    <xs:attribute name="offset" type="INT-NUMBER"/>
    <xs:attribute name="stepsize" type="INT-NUMBER"/>
    <xs:attribute name="displayName" type="xs:string" use="optional"/>
  </xs:complexType>
</xs:element>

<xs:element name="TimeCycle">
  <xs:complexType>
    <xs:sequence>
      <xs:group ref="INT-ARRAY" minOccurs="0" maxOccurs="1"/>
    </xs:sequence>
    <xs:attribute name="length" type="INT-NUMBER"/>
    <xs:attribute name="type" type="VALID-TIME-SPEC"/>
    <xs:attribute name="displayName" type="xs:string" use="optional"/>
  </xs:complexType>
</xs:element>

<xs:simpleType name="TIME-ANCHOR">
  <xs:restriction base="xs:string">
    <xs:enumeration value="dateTimeMillisecondsSince[0]"/>
    <xs:enumeration value="dateTimeMillisecondsSince[1960]"/>
    <xs:enumeration value="dateTimeMillisecondsSince[1970]"/>
    <xs:enumeration value="dateTimeMillisecondsSince[1980]"/>
    <xs:enumeration value="dateTimeSecondsSince[0]"/>
    <xs:enumeration value="dateTimeSecondsSince[1960]"/>
    <xs:enumeration value="dateTimeSecondsSince[1970]"/>
    <xs:enumeration value="dateTimeSecondsSince[1980]"/>
    <xs:enumeration value="dateDaysSince[0]"/>
    <xs:enumeration value="dateDaysSince[1960]"/>
    <xs:enumeration value="dateDaysSince[1970]"/>
    <xs:enumeration value="dateDaysSince[1980]"/>
    <xs:enumeration value="dateMonthsSince[0]"/>
    <xs:enumeration value="dateMonthsSince[1960]"/>
    <xs:enumeration value="dateMonthsSince[1970]"/>
    <xs:enumeration value="dateMonthsSince[1980]"/>
    <xs:enumeration value="dateYearsSince[0]"/>
  </xs:restriction>
</xs:simpleType>

<xs:simpleType name="VALID-TIME-SPEC">
  <xs:restriction base="xs:string">
    <xs:enumeration value="includeAll"/>
    <xs:enumeration value="includeFromTo"/>
    <xs:enumeration value="excludeFromTo"/>
    <xs:enumeration value="includeSet"/>
    <xs:enumeration value="excludeSet"/>
  </xs:restriction>
</xs:simpleType>

<xs:element name="TimeException">
  <xs:complexType>
    <xs:sequence>
      <xs:group ref="INT-ARRAY" minOccurs="1"/>
    </xs:sequence>
    <xs:attribute name="type" type="TIME-EXCEPTION-TYPE"/>
    <xs:attribute name="count" type="INT-NUMBER"/>
  </xs:complexType>
</xs:element>

<xs:simpleType name="TIME-EXCEPTION-TYPE">
  <xs:restriction base="xs:string">
    <xs:enumeration value="exclude"/>
    <xs:enumeration value="include"/>
  </xs:restriction>
</xs:simpleType>

<xs:simpleType name="INTERPOLATION-METHOD">
  <xs:restriction base="xs:string">
    <xs:enumeration value="none"/>
    <xs:enumeration value="linear"/>
    <xs:enumeration value="exponentialSpline"/>
    <xs:enumeration value="cubicSpline"/>
  </xs:restriction>
</xs:simpleType>

The element TimeSeries contains a time series consisting of one or more TimeValue objects. The time series can be an original time series as read from the input data; in this case the attribute interpolationMethod is 'none'. Or it can be a pre-processed and interpolated time series; pre-processing and interpolation may be necessary to produce a logical time series, for most time series algorithms require a sequence of logically equidistant time steps. If a logical time series is present, it was used as input into the algorithm. Finally, the time series (usage = 'prediction') may hold values predicted by the best-fitting model.

StartTime and endTime refer to points in an input time series between which the points were used for fitting. They can be integers indicating the index into a logical time series or real numbers indicating original points in time. These values are required when the usage type is original or logical.
The attribute interpolationMethod names the interpolation method used to compute values between existing (or predicted) data points. It is one of {'none', 'linear', 'exponentialSpline', 'cubicSpline'}.

The attribute field provides the field name for the time series variable and is required if there are several time series fields in the context.

TimeValue contains one single point of a time series. The point can either be a known point from the past; in this case, only the attribute value is required. In addition, time or index must be used. In case of a logical TimeSeries, index values must be present. Or the time point is a predicted future value; in this case, the attribute standardError can contain the incertitude (predicted standard error) of the prediction standard based on the empirically determined error.
Note: TimeAnchor and TimeCycle define the correlation to calendar times. Optionally, a contained element Timestamp may hold a string describing the time for presentation purposes, see Header.

TimeAnchor optionally defines the relationship between time points in a time series and a calendar. It is not used for computing predictions, but it may be used by applications or visualization tools that want to come up with predictions based on points of time in a calendar as opposed to just a look-ahead index. Time is anchored at an offset with respect to a specified calendar point given by type. And the flow of time is defined in smallest steps of size stepsize. Both offset and stepsize are (long) integer values in the units specified in type. An optional displayName, e.g. "day" can be provided as a name for the time step.

TimeCycle allows to express the situation where time steps are not contiguous on a calendar. As an example, consider hourly revenue data of a store that opens Mo-Sat from 7am to 9pm. One has to represent hours as being the step size of the data, but one also wants to be able to specify that Sundays and night-shifts should be disregarded, i.e. for the time series prediction, the value for Monday 8am (aggregated revenue between 7am and 8am) immediately follows that of Saturday 9pm.
Each TimeCycle divides the sequence of time steps defined by the previous TimeCycle (or the TimeAnchor) into cycles of equal length, each cycle consisting of length steps. Index values for these steps run from 0 to length - 1, and are used in the specification of valid steps. Type defines whether this definition is by interval or enumeration and whether by inclusion or exclusion, and the contained Array element provides the interval boundaries or enumerates the values. The following is the specification of the shop hours in the example:

<TimeAnchor type="dateTimeSecondsSince[1960]" offset="1530543600" stepsize="3600" displayName="hour">
  <TimeCycle length="24" type="includeFromTo" displayName="day">
    <Array type="int" n="2">7 20</Array>
  </TimeCycle>
  <TimeCycle length="7" type="excludeSet" displayName="week">
    <Array type="int" n="1">6</Array>
  </TimeCycle>
</TimeAnchor>

Calendar entries can now be described as a sequence of values. The 15^th hour of the 6^th day in the 30^th week since the time anchor (1530543600 seconds after the beginning of 1960) would become <29, 5, 14>.

In addition to the regular behavior, there may be exceptions to the TimeStep specification. The store may, for example, be closed on July 4^th, but exceptionally open late because of an event on some other day. This is captured by up to two TimeExceptions, which contain lists of unsystematic exclusions or inclusions as arrays of index values. All index values of a certain TimeCycle can be specified by using the length value instead of a valid index; -1 is used for the regular indexes..

The following TimeExceptions specify additional shop closure and opening hours.

<TimeExceptions type="exclude" count="2">
  <!-- closed in the 5^th week on the 6^th day at the 8^th hour -->
  <Array type="int">4 5 7</Array>
  <!-- closed in the 33^rd week, throughout the 1^st day -->
  <Array type="int">32 0 24</Array>
</TimeExceptions%gt;

<TimeExceptions type="include" count="2">
  <!-- open in the 1^st week on the 7^th day at regular hours -->
  <Array type="int">0 6 -1</Array>
  <!-- open in the 34^th week on the 6^th day at the 20^th hour -->
  <Array type="int">33 5 19</Array>
</TimeExceptions>

Exponential Smoothing

ExponentialSmoothing contains an exponential smoothing model for the time series. It is one out of the 15 possible model type combinations (no trend N, additive trend A, damped additive trend DA, multiplicative trend M, damped multiplicative trend DM) * (no seasonality N, additive seasonality A, multiplicative seasonality M). If the model contains a seasonality, the seasonality info is captured in the Seasonality sub-element. Each TimeValue sub-element contains one predicted time point. The predicted time points are calculated from Gardner, Jr., E.S., "Exponential smoothing: The state of the art - Part II", 3 Jun. 2005 for the given Trend combined with the Seasonality type. The number of predicted time points contained in the model may be determined by the modeling kernel, for example by using the incertitude ranges of each prediction.

This model also supports the multiple smoothing techniques described in the Multiple Smoothing for Higher-order Polynomials found in Brown, R.G., Smoothing, Forecasting and Prediction of Discrete Time Series, Mineola, New York: Dover Publications, Inc., 2004.

<xs:element name="ExponentialSmoothing">
  <xs:complexType>
    <xs:sequence>
      <xs:element ref="Level" minOccurs="1" maxOccurs="1"/>
      <xs:element ref="Trend_ExpoSmooth" minOccurs="0" maxOccurs="1"/>
      <xs:element ref="Seasonality_ExpoSmooth" minOccurs="0" maxOccurs="1"/>
      <xs:element ref="TimeValue" minOccurs="0" maxOccurs="unbounded"/>
    </xs:sequence>
    <xs:attribute name="RMSE" type="REAL-NUMBER"/>
    <xs:attribute name="transformation" default="none">
      <xs:simpleType>
        <xs:restriction base="xs:NMTOKEN">
          <xs:enumeration value="none"/>
          <xs:enumeration value="logarithmic"/>
          <xs:enumeration value="squareroot"/>
        </xs:restriction>
      </xs:simpleType>
    </xs:attribute>
  </xs:complexType>
</xs:element>

RMSE is the root mean squared error of the predictions.

Transformation specifies what transformation has been applied to the time series prior to executing the algorithm. Possible values are "none," "logarithmic" and "squareroot." This attribute is informational only, and does not affect scoring.

Level specifies smoothedValue the smoothed value of the time series at the last known point of the history. The optional attribute alpha is the optimal smoothing parameter for the level. It can be used to continue the fitting process if more data become known, but it is not needed for scoring. However, it may be used to compute theoretical confidence intervals.

<xs:element name="Level">
  <xs:complexType>
    <xs:attribute name="alpha" type="REAL-NUMBER" use="optional"/>
    <xs:attribute name="smoothedValue" type="REAL-NUMBER"/>
  </xs:complexType>
</xs:element>

Trend_ExpoSmooth specifies the smoothed value or coefficients of the trend at the last known point of the history. For the Gardner models, the smoothed value can be found in the smoothedValue attribute; for Brown's polynomial models, the smoothed coefficients can be found in the array sub-element. The smoothed coefficients are required if the specified trend is "polynomial_exponential;" otherwise, the smoothed value is required. The optional attribute gamma is the optimal smoothing parameter for the trend. It can be used to continue the fitting process if more data become known, but it is not needed for scoring. The damping parameter phi is needed when scoring the damped versions of the Gardner models.

<xs:element name="Trend_ExpoSmooth">
  <xs:complexType>
    <xs:sequence>
      <xs:group ref="REAL-ARRAY" minOccurs="0"/>
    </xs:sequence>
    <xs:attribute name="trend" default="additive">
      <xs:simpleType>
        <xs:restriction base="xs:NMTOKEN">
          <xs:enumeration value="additive"/>
          <xs:enumeration value="damped_additive"/>
          <xs:enumeration value="multiplicative"/>
          <xs:enumeration value="damped_multiplicative"/>
          <xs:enumeration value="polynomial_exponential"/>
        </xs:restriction>
      </xs:simpleType>
    </xs:attribute> 
    <xs:attribute name="gamma" type="REAL-NUMBER" use="optional"/>
    <xs:attribute name="phi" type="REAL-NUMBER" use="optional" default="1"/>
    <xs:attribute name="smoothedValue" type="REAL-NUMBER" use="optional"/>
  </xs:complexType>
</xs:element>

Seasonality_ExpoSmooth describes a periodic oscillation cycle with a length of period time units, where period must be a positive integer. The phase indicates the season index of the last known data point; it defaults to period. The oscillation can be additive, that means of the form 'trend + oscillation' or multiplicative, that means of the form 'trend * oscillation'. Unit is a string used for naming the cycles, such as "week" or "year." It is optional and serves only for explanatory purposes. delta is seasonal smoothing weight. The sub-element RealArray (of size period) contains floating point numbers which describe the local values of the oscillation at each of the season indices. In the additive case, the sum of all these numbers may be normalized to 0. In the multiplicative case, the product of all these numbers may be normalized to 1.

<xs:element name="Seasonality_ExpoSmooth">
  <xs:complexType>
    <xs:sequence>
      <xs:group ref="REAL-ARRAY"/>
    </xs:sequence>
    <xs:attribute name="type" use="required">
      <xs:simpleType>
        <xs:restriction base="xs:NMTOKEN">
          <xs:enumeration value="additive"/>
          <xs:enumeration value="multiplicative"/>
        </xs:restriction>
      </xs:simpleType>
    </xs:attribute>
    <xs:attribute name="period" type="INT-NUMBER" use="required"/>
    <xs:attribute name="unit" type="xs:string" use="optional"/>
    <xs:attribute name="phase" type="INT-NUMBER" use="optional"/>
    <xs:attribute name="delta" type="REAL-NUMBER" use="optional"/>
  </xs:complexType>
</xs:element>

Scoring Procedure

Predictions of future events in a time series are based on a set of well-defined formulae. The formula used to score a particular model is based on the specified trend and seasonality.

The formulae use the following notation:

Symbol	Definition	PMML equivalent
m	Number of periods in the forecast lead-time	Input to the model
X_t(m)	Forecast for m periods ahead from origin t	Output of the model
p	Number of periods in the seasonal cycle	period attribute within the Seasonal_ExpoSmooth element
α	Smoothing parameter for the level of the series	alpha attribute within the Level element
φ	Autoregressive or damping parameter	phi attribute within the Trend_ExpoSmooth element
S_t	Smoothed level of the series, computed after X_t is observed. Also the expected value of the data at the end of period t in some models	smoothedValue attribute within the Level element
T_t	Smoothed additive trend at the end of period t	smoothedValue attribute within the Trend_ExpoSmooth element
R_t	Smoothed multiplicative trend at the end of period t	smoothedValue attribute within the Trend_ExpoSmooth element
I_t	Smoothed seasonal index at the end of period t. The reference I_t-p+m found in the formulae below identifies a specific seasonal index, found by cycling through the list of seasonal indexes	Array element within the Seasonality_ExpoSmooth element
a₀,a₁...a_n	Smoothed coefficients used by Brown's multiple smoothing formulae	Array element within the Trend_ExpoSmooth element

The complete set of formulae are listed below. The first 15 formulae are identified by the combination of trend and seasonality specified by the model (i.e. <trend>-<seasonality>).

The remaining formula is used for Brown's polynomial models. In this case, the number of values found in the Array element within the Trend_ExpoSmooth element will dictate the order of the polynomial model. A single coefficient indicates a constant model, for which predictions are calculated by applying the formula X_t(m) = a₀. For a linear model (two coefficients), the formula X_t(m) = a₀ + a₁m is used, while a quadratic model (three coefficients) would use the formula X_t(m) = a₀ + a₁m + (1/2)a₂m² .

N-N: X_t(m) = S_t

N-A: X_t(m) = S_t + I_t-p+m

N-M: X_t(m) = S_tI_t-p+m

A-N: X_t(m) = S_t + mT_t

A-A: X_t(m) = S_t + mT_t + I_t-p+m

A-M: X_t(m) = (S_t + mT_t)I_t-p+m

DA-N: X_t(m) = S_t + (Sum[i in 1..m]φⁱT_t)

DA-A: X_t(m) = S_t + (Sum[i in 1..m]φⁱT_t) + I_t-p+m

DA-M: X_t(m) = (S_t + (Sum[i in 1..m]φⁱT_t))I_t-p+m

M-N: X_t(m) = S_tR_t^m

M-A: X_t(m) = S_tR_t^m + I_t-p+m

M-M: X_t(m) = (S_tR_t^m)I_t-p+m

DM-N: X_t(m) = S_tR_t^{(Sum[i in
1..m]φⁱ)}

DM-A: X_t(m) = S_tR_t^{(Sum[i in 1..m]φⁱ)} + I_t-p+m

DM-M: X_t(m) = (S_tR_t^{(Sum[i in
1..m]φⁱ)})I_t-p+m

Brown: X_t(m) = a₀ + a₁m + (1/2)a₂m² + ... + (1/n!)a_nmⁿ

Example for a time series model:

<PMML xmlns="https://www.dmg.org/PMML-4_4" version="4.4">
  <Header copyright="2018 DMG.org">
    <Application name="test application"/>
    <Timestamp>2008-06-23 10:30:00</Timestamp>
  </Header>
  <DataDictionary numberOfFields="2">
    <DataField dataType="integer" optype="continuous" name="TS" displayName="TS"/>
    <DataField dataType="double" optype="continuous" name="VALUE" displayName="TS-VALUE"/>
  </DataDictionary>
  <TimeSeriesModel modelName="AA2Model" functionName="timeSeries" algorithmName="exponential smoothing" bestFit="ExponentialSmoothing">
    <MiningSchema>
      <MiningField name="TS" usageType="order"/>
      <MiningField name="VALUE" usageType="target"/>
    </MiningSchema>
    <TimeSeries usage="logical" startTime="1" endTime="24" interpolationMethod="none">
      <TimeValue index="1" value="112"/>
      <TimeValue index="2" value="118"/>
      <TimeValue index="3" value="132"/>
      <TimeValue index="4" value="129"/>
      <TimeValue index="5" value="121"/>
      <TimeValue index="6" value="135"/>
      <TimeValue index="7" value="148"/>
      <TimeValue index="8" value="148"/>
      <TimeValue index="9" value="136"/>
      <TimeValue index="10" value="119"/>
      <TimeValue index="11" value="104"/>
      <TimeValue index="12" value="118"/>
      <TimeValue index="13" value="115"/>
      <TimeValue index="14" value="126"/>
      <TimeValue index="15" value="141"/>
      <TimeValue index="16" value="135"/>
      <TimeValue index="17" value="125"/>
      <TimeValue index="18" value="149"/>
      <TimeValue index="19" value="170"/>
      <TimeValue index="20" value="170"/>
      <TimeValue index="21" value="158"/>
      <TimeValue index="22" value="133"/>
      <TimeValue index="23" value="114"/>
      <TimeValue index="24" value="140"/>
    </TimeSeries>
    <TimeSeries usage="prediction" interpolationMethod="none">
      <TimeValue index="25" value="145" standardError="7.3"/>
      <TimeValue index="26" value="150" standardError="8.3"/>
      <TimeValue index="27" value="178" standardError="9.3"/>
      <TimeValue index="28" value="163" standardError="10.3"/>
      <TimeValue index="29" value="172" standardError="11.3"/>
      <TimeValue index="30" value="178" standardError="12.3"/>
      <TimeValue index="31" value="199" standardError="13.3"/>
      <TimeValue index="32" value="199" standardError="14.3"/>
      <TimeValue index="33" value="184" standardError="15.3"/>
      <TimeValue index="34" value="162" standardError="16.3"/>
      <TimeValue index="35" value="146" standardError="17.3"/>
      <TimeValue index="36" value="166" standardError="18.3"/>
    </TimeSeries>
    <ExponentialSmoothing RMSE="7.3">
      <Level alpha="0.233984" smoothedValue="139.8"/>
      <Trend_ExpoSmooth smoothedValue="4.139" gamma="3.910E-4" phi="1.006" trend="damped_additive"/>
      <Seasonality_ExpoSmooth type="multiplicative" period="12" unit="month" delta="0.8254" phase="12">
        <Array n="12" type="real">
          .900 .840 .924 .976 .994 1.120 0.981 1.025 1.038 1.038 0.908 1.259
        </Array>
      </Seasonality_ExpoSmooth>
    </ExponentialSmoothing>
  </TimeSeriesModel>
</PMML>

This is an example of an exponential smoothing time series model using Brown's multiple smoothing technique:

<PMML xmlns="https://www.dmg.org/PMML-4_4" version="4.4">
  <Header copyright="DMG.org">
    <Application name="test application"/>
  </Header>
  <DataDictionary numberOfFields="2">
    <DataField dataType="double" name="Month_Index" optype="continuous">
      <Interval closure="closedClosed" leftMargin="1" rightMargin="50"/>
    </DataField>
    <DataField dataType="double" name="QuadraticMonth" optype="continuous"/>
  </DataDictionary>
  <TimeSeriesModel bestFit="ExponentialSmoothing" functionName="timeSeries" modelName="QuadraticMonth Predictor">
    <MiningSchema>
      <MiningField invalidValueTreatment="asIs" name="Month_Index" outliers="asIs" usageType="order"/>
      <MiningField name="QuadraticMonth" outliers="asIs" usageType="target"/>
    </MiningSchema>
    <Output>
      <OutputField dataType="double" feature="predictedValue" name="QuadraticMonth Predictor - Predicted Value" optype="continuous"/>
    </Output>
    <TimeSeries>
      <TimeValue index="50" value="2550"/>
    </TimeSeries>
    <ExponentialSmoothing RMSE="1.469296539064514">
      <Level alpha="0.29"/>
      <Trend_ExpoSmooth trend="polynomial_exponential">
        <Array type="real" n="3">2549.999972 100.9999732 1.999994714</Array>
      </Trend_ExpoSmooth>
      <TimeValue index="1" value="1.499999999999986"/>
      <TimeValue index="2" value="4.434999999999974"/>
      <TimeValue index="3" value="9.422699999999999"/>
      <TimeValue index="4" value="16.69814500000001"/>
      <TimeValue index="5" value="26.30456892500002"/>
      <TimeValue index="6" value="38.19968310645001"/>
      <TimeValue index="7" value="52.31215520655601"/>
      <TimeValue index="8" value="68.56947439659567"/>
      <TimeValue index="9" value="86.90993292618671"/>
      <TimeValue index="10" value="107.2862207849398"/>
      <TimeValue index="11" value="129.6649358584095"/>
      <TimeValue index="12" value="154.0243926888923"/>
      <TimeValue index="13" value="180.3519824960263"/>
      <TimeValue index="14" value="208.6416885111437"/>
      <TimeValue index="15" value="238.8920018116692"/>
      <TimeValue index="16" value="271.1042947305884"/>
      <TimeValue index="17" value="305.281618813078"/>
      <TimeValue index="18" value="341.4278584112037"/>
      <TimeValue index="19" value="379.5471635161653"/>
      <TimeValue index="20" value="419.6435914851766"/>
      <TimeValue index="21" value="461.7208987097413"/>
      <TimeValue index="22" value="505.7824356899514"/>
      <TimeValue index="23" value="551.8311103729045"/>
      <TimeValue index="24" value="599.8693941784766"/>
      <TimeValue index="25" value="649.8993527234378"/>
      <TimeValue index="26" value="701.9226890294883"/>
      <TimeValue index="27" value="755.9407912489685"/>
      <TimeValue index="28" value="811.9547799736052"/>
      <TimeValue index="29" value="869.965552291198"/>
      <TimeValue index="30" value="929.9738211628786"/>
      <TimeValue index="31" value="991.980149602085"/>
      <TimeValue index="32" value="1055.984979693915"/>
      <TimeValue index="33" value="1121.988656811033"/>
      <TimeValue index="34" value="1189.991449540624"/>
      <TimeValue index="35" value="1259.993565893434"/>
      <TimeValue index="36" value="1331.99516636062"/>
      <TimeValue index="37" value="1405.996374344018"/>
      <TimeValue index="38" value="1481.997284428074"/>
      <TimeValue index="39" value="1559.997968898639"/>
      <TimeValue index="40" value="1639.998482851363"/>
      <TimeValue index="41" value="1721.998868174932"/>
      <TimeValue index="42" value="1805.999156642965"/>
      <TimeValue index="43" value="1891.999372304384"/>
      <TimeValue index="44" value="1979.999533324537"/>
      <TimeValue index="45" value="2069.999653398587"/>
      <TimeValue index="46" value="2161.999742833126"/>
      <TimeValue index="47" value="2255.999809371593"/>
      <TimeValue index="48" value="2351.9998588225"/>
      <TimeValue index="49" value="2449.999895536411"/>
      <TimeValue index="50" value="2549.999922767288"/>
    </ExponentialSmoothing>
  </TimeSeriesModel>
</PMML>

This model can be scored using Brown's formula for multiple smoothing. For instance, to predict the next value in the series (i.e. index=51, which corresponds to m=1), we use the following calculation:

X_t(m) = a₀ + a₁m + (1/2)a₂m²
= 2549.999972 + (100.9999732 * 1) + (0.5 * 1.999994714 * 1²)
= 2652

ARIMA and Transfer Function models

The element ARIMA contains an ARIMA (Autoregressive Integrated Moving Average) model for the time series data to better understand the data or to predict values for future time points in the series. An ARIMA model, also known as Box-Jenkins model, is composed of two parts: an autoregressive (AR) part and a moving average (MA) part. The AR part specifies the current value of the time series as a linear combination of the previous values of the time series plus a stochastic term. The MA part is a linear function of white noise error terms or random shocks at current and previous time points. A Transfer Function model is ARIMA model with predictor series. The element ARIMA supports both the non-seasonal and seasonal ARIMA models, with or without a set of predictors. Other features supported in the element include the specification of the numerator and denominator polynomials for each predictor, the definition of each detected outlier, and the input of extra information needed for forecasting. The technical details of the model are described in Brockwell, P. J., and R. A. Davis. Time series: theory and methods. Springer Science & Business Media, 1991.
Some basic information on ARIMA without predictors can be found at https://en.wikipedia.org/wiki/Autoregressive_integrated_moving_average

Overview of models and methods.

Denote the target time series to be modeled and predicted Y_t.

We will use B to represent a backward shift (or lag) operator: BY_t = Y_t-1, its k-th degree is B^kY_t = Y_t-k.

Δ_d is a differencing operator of order d: Δ_d = ( 1 - B )^d.

The general form for an AR polynomial of order p is defined as:

ϕ_p (B) = 1 - φ₁ B - φ₂ B² - ... - φ_p B^p

and a MA polynomial of order q:

θ_q (B) = 1 - θ₁ B - θ₂ B² - ... - θ_q B^q

Those represent non-seasonal part of the model, and seasonal polynomials of order P and Q are used for the seasonal components of AR and MA:

Φ_P (B^s) = 1 - Φ₁ B^s - Φ₂ B^2s - ... - Φ_P B^Ps
and
Θ_Q (B^s) = 1 - Θ₁ B^s - Θ₂ B^2s - ... - Θ_Q B^Qs
where s is the seasonal length or period.

The coefficients φ_i, θ_i, Φ_i, Θ_i of the above polynomials constitute a major part of an ARIMA model.

Often the values of a time series are affected by one or more other time series, called predictor time series or dynamic regressors. We will denote them by X_it, i=1,..., k. Consider the numerator polynomial for the i-th predictor X_it:

Num_i = ( ω_i0 - ω_i1B - ... - ω_iuB^u ) * ( 1 - Ω_i1B^s - ... - Ω_ivB^vs )

and denominator polynomial for that predictor:

Den_i = ( 1 - δ_i1B - ... - δ_irB^r ) * ( 1 - Δ_i1B^s - ... )

The following equations are given in terms of Y_t and X_it, but they can be log or square root transformations of the original series. If the target time series was transformed then its predictions should be back-transformed to the original scale.

ARIMA model includes the following widely used variations:

ARMA( p, q ): Y_t = μ + ^{θ_q( B )} / _{ϕ_p( B )} * a_t

Here μ is a constant, a_t is a white noise series normally distributed with mean zero and variance σ².

ARIMA( p, d, q ): Δ_d Y_t = μ + ^{θ_q( B )} / _{ϕ_p( B )} * a_t

Note: ARIMA( p, 0, q ) = ARMA( p, q ).

ARIMA( p, d, q )( P, D, Q ): Δ_d Δ_D Y_t = μ + ^{θ_q( B ) Θ_Q( B^s )} / _{ϕ_p( B ) Φ_P( B^s )} * a_t

Here Δ_d and Δ_D are the difference operator and the seasonal difference operator for Y_t.

Note: ARIMA( p, 0, q )( P, 0, Q ) = ARMA( p, q )( P, Q ).

Transfer Function model: Δ_d Δ_D Y_t = μ + Σ_{i=1 to K} (^Num_i / _{Den_i} Δ_{d_i} Δ_{D_i} B^l_i X_it) + N_t,

where N_t is the noise series that follows zero mean ARMA(p,q)(P,Q) model:

N_t = ^{θ_q( B ) Θ_Q( B^s )} / _{ϕ_p( B ) Φ_P( B^s )} * a_t

Here Δ_{d_i} and Δ_{D_i} are the difference operator and seasonal difference operator for X_it, respectively, and B^l_i is delay term for X_it , l_i is the order of the delay.
Note: ARMA and ARIMA are special cases of transfer function model.

Transfer function model with outliers

Outliers can have significant effect on the predictions of time series models.

We follow the paper "Outliers, Level Shifts, and Variance Changes in Time Series" by Ruey S. Tsay. Journal of Forecasting, Vol. 7, I-20 (1988). Six types of outliers are allowed: additive outliers (AO), innovational outliers (IO), level shift (LS), temporary (or transient) change (TC), seasonal additive (SA), local trend (LT). Suppose there are M outliers at times T₁, T₂, ..., T_M with types O₁, O₂, ..., O_M and magnitudes w₁, w₂, ..., w_M. The transfer function model incorporating all these outliers is
Δ_d Δ_D Y_t = μ + Σ_{i=1 to K} ^Num_i / _{Den_i} Δ_{d_i} Δ_{D_i} B^l_i X_it + Σ_{i=1 to M} w_kL_{O_k}(B)Δ_d Δ_D I_{T_k}(t) + N_t,
where N_t is as before, I_T(t) is 1 when t=T and 0 otherwise. The term L_O(B) is defined as follows for each type of outlier.

Type of outlier	Formula for L_O(B)
Additive	1
Innovational	1/( Δ_d Δ_D ϖ(B)), where ϖ(B) = ^{ϕ_p( B ) Φ_P( B^s )} / _{θ_q( B ) Θ_Q( B^s )}
Level Shift	1/(1-B)
Temporary Change	1/(1-δB)
Seasonal Additive	1/(1-B^s)
Local Trend	1/(1-B)²

Since I_{T_k}(t) can be regarded as a predictor series, i.e. the series takes value 1 if t=T_k, otherwise is 0, we can consider each outlier part L_{O_k}(B) Δ_d Δ_D I_{T_k}(t) as a transfer function. Thus, in the scoring descriptions below they should be treated as transfer functions.

ARIMA element in PMML 4.4 and later includes all the parameters mentioned above and provides three forecasting methods: conditional least squares, Theta recursion, Kalman filter. Theta recursion and Kalman filter are two kinds of exact least squares prediction method. This method is normally used to predict the noise term N_t, and Y_t is then obtained by adding the other terms such as the mean constant μ or transfer functions for predictors or outliers.

Theta recursion method

For a series with zero mean, and without predictors, outliers, or differencing operators, one-step-ahead forecast of Y_t+1 based on the past observations Y₁,...,Y_t and past predicted values Ŷ₁,...,Ŷ_t is:

for t between 1 and m = max( p, q ):
Ŷ_t+1 = Σ_{j=1 to t} θ_t,j (Y_t+1-j-Ŷ_t+1-j ),
for t > m:
Ŷ_t+1= φ₁ Y_t+φ₂ Y_t-1+...+φ_p Y_t+1-p+ Σ_{j=1 to q} θ_t,j (Y_t+1-j-Ŷ_t+1-j )

θ_(t,j) is computed recursively based on theoretical auto-covariance of ARMA(p,q) model and the forecast variance at previous time point. The forecast variance of Ŷ_t+1 is estimated as σ_t+1²=σ²*ν_t, where the forecast variance factor ν_t is also recursively computed based on theoretical auto-covariance of ARMA(p,q) model, θ_t,j, and the forecast variance factor at previous time point. The computation details can be found in proposition 5.2.2 and equation 5.3.9 in Chapter 5 of Brockwell and Davis(1991) where this is called "The innovation algorithm".

After model estimation, the following statistics need to be saved into PMML:

Final m observations: Y
Final q predicted values: Ŷ
Final thetas: θ_t,t-j, t=n-q+1,...,n, and j=n-q,...,t-1
Final nu: ν_n,ν_n-1,...,ν_n-q+1
AR coefficients: φ₁,...,φ_p

Here n is the number of time points in historical data. If the model includes a constant term and a predictor (or regressor) term, during prediction these terms need to be added to the remaining part which is noise which follows ARMA(p,q) model. So terms "FinalNoise" and "FinalPredictedNoise" are used, all the listed statistics can be found in element ThetaRecursionState.

Kalman Filter method

Let m = max( p, q ). The state-space representation of a time series Y_t with zero mean and without predictors, outliers, or differencing operators is:
  S_t+1 = F S_t + H e_t
  Y_t = G S_t + e_t
where S_t is a state vector of m by 1, and
  G=(1,0,...,0)_(1×m)
  H^T=(ψ₁,...,ψ_m )
where ψ_i are coefficients in
  (1-θ₁ B-θ₂ B²-...-θ_q B^q)/ (1-φ₁ B-φ₁ B²-...-φ_p B^p )= 1+ψ₁ B+ψ₂ B²+...
F is an m×m matrix with 1's above the main diagonal, the bottom row containing φ_m, φ_m-1, ..., φ₁, and zeros elsewhere. Here φ_i = 0 if i > p.
The noise term e_t has expectation E(e_t)=0 and variance Var(e_t ) = σ².
The details of one-step-ahead forecast and forecast variance are given by the proposition 12.2.2 in Chapter 12 of Brockwell and Davis (1991).
After model estimation, the following statistics need to be saved into PMML:

Final state vector: S_n
Final error covariance matrix of state S_n: Ω_n
H vector: H

These statistics can be found in element KalmanState.

Maximum likelihood forecasting methods

The unknown parameters of ARMA model are estimated iteratively by maximizing the log likelihood function. In each iteration, we use the theta recursive method or Kalman filter method to compute one-step-ahead forecast values based on parameters in the last iteration, and then compute the log-likelihood value. Based on log-likelihood value, we update the parameters based on derivatives. These steps are repeated until the criteria of convergence are reached. Since the theta recursive method or Kalman filter method are involved in the maximum likelihood estimation of parameters, we call the PMML element containing statistics for the two forecasting methods "MaximumLikelihoodStats".

The schema

Here is the definition of ARIMA element:

<xs:element name="ARIMA">
  <xs:complexType>
    <xs:sequence>
      <xs:element ref="Extension" minOccurs="0" maxOccurs="unbounded"/>
      <xs:element ref="NonseasonalComponent" minOccurs="0"/>
      <xs:element ref="SeasonalComponent" minOccurs="0"/>
      <xs:element ref="DynamicRegressor" minOccurs="0" maxOccurs="unbounded"/>
      <xs:element ref="MaximumLikelihoodStat" minOccurs="0"/>
      <xs:element ref="OutlierEffect" minOccurs="0" maxOccurs="unbounded"/>
    </xs:sequence>
    <xs:attribute name="RMSE" type="REAL-NUMBER"/>
    <xs:attribute name="transformation" default="none">
      <xs:simpleType>
        <xs:restriction base="xs:string">
          <xs:enumeration value="none"/>
          <xs:enumeration value="logarithmic"/>
          <xs:enumeration value="squareroot"/>
        </xs:restriction>
      </xs:simpleType>
    </xs:attribute>
    <xs:attribute name="constantTerm" type="REAL-NUMBER" default="0"/>
    <xs:attribute name="predictionMethod" default="conditionalLeastSquares">
      <xs:simpleType>
        <xs:restriction base="xs:string">
          <xs:enumeration value="conditionalLeastSquares"/>
          <xs:enumeration value="exactLeastSquares"/>
        </xs:restriction>
      </xs:simpleType>
    </xs:attribute>
  </xs:complexType>
</xs:element>

The attribute RMSE is the root mean squared error, or standard deviation of the residuals.

The attribute transformation is similar to the one in ExponentialSmoothing, which specifies what transformation has been applied to the target series prior to executing the algorithm. If present, the inverse of this transform should be applied to the model predictions to get the predicted value of the original series.

The attribute constantTerm specifies the constant value μ in ARIMA part. This value should be added to the predicted value (before applying the inverse of the transformation, if specified) to get the prediction of the original series.

The attribute predictionMethod specifies the method of prediction of the ARIMA model. Description of each method is given in the Prediction Methods section. Note that if predictionMethod is exactLeastSquares then element MaximumLikelihoodStat must be present.

Nonseasonal Component and Seasonal Component

The AutoRegressive and MovingAverage components of the non-seasonal and seasonal component, respectively, of the time series model are described by the following elements:

<xs:element name="NonseasonalComponent">
  <xs:complexType>
    <xs:sequence>
      <xs:element ref="Extension" minOccurs="0" maxOccurs="unbounded"/>
      <xs:element ref="AR" minOccurs="0"/>
      <xs:element ref="MA" minOccurs="0"/>
    </xs:sequence>
    <xs:attribute name="p" type="xs:nonNegativeInteger" default="0"/>
    <xs:attribute name="d" type="xs:nonNegativeInteger" default="0"/>
    <xs:attribute name="q" type="xs:nonNegativeInteger" default="0"/>
  </xs:complexType>
</xs:element>

<xs:element name="SeasonalComponent">
  <xs:complexType>
    <xs:sequence>
      <xs:element ref="Extension" minOccurs="0" maxOccurs="unbounded"/>
      <xs:element ref="AR" minOccurs="0"/>
      <xs:element ref="MA" minOccurs="0"/>
    </xs:sequence>
    <xs:attribute name="P" type="xs:nonNegativeInteger" default="0"/>
    <xs:attribute name="D" type="xs:nonNegativeInteger" default="0"/>
    <xs:attribute name="Q" type="xs:nonNegativeInteger" default="0"/>
    <xs:attribute name="period" type="xs:nonNegativeInteger" use="required"/>
  </xs:complexType>
</xs:element>

<xs:element name="AR">
  <xs:complexType>
    <xs:sequence>
      <xs:element ref="Extension" minOccurs="0" maxOccurs="unbounded"/>
      <xs:group ref="REAL-ARRAY"/>
    </xs:sequence>
  </xs:complexType>
</xs:element>

<xs:element name="MA">
  <xs:complexType>
    <xs:sequence>
      <xs:element ref="Extension" minOccurs="0" maxOccurs="unbounded"/>
      <xs:element ref="MACoefficients" minOccurs="0"/>
      <xs:element ref="Residuals" minOccurs="0"/>
    </xs:sequence>
  </xs:complexType>
</xs:element>

<xs:element name="MACoefficients">
  <xs:complexType>
    <xs:sequence>
      <xs:element ref="Extension" minOccurs="0" maxOccurs="unbounded"/>
      <xs:group ref="REAL-ARRAY"/>
    </xs:sequence>
  </xs:complexType>
</xs:element>

<xs:element name="Residuals">
  <xs:complexType>
    <xs:sequence>
      <xs:element ref="Extension" minOccurs="0" maxOccurs="unbounded"/>
      <xs:group ref="REAL-ARRAY"/>
    </xs:sequence>
  </xs:complexType>
</xs:element>

The attribute p is the order of the non-seasonal autoregressive part of the model.

The attribute d is the order of the non-seasonal differencing.

The attribute q is the order of the non-seasonal moving average part of the model.

The attribute P is the order of the seasonal autoregressive part of the model.

The attribute D is the order of the seasonal differencing.

The attribute Q is the order of the seasonal moving average part of the model.

The attribute period is the seasonal length.

The REAL-ARRAY in AR specifies the polynomial coefficients of autoregressive (AR) part from order 1 to order p (or order P for seasonal AR).

The REAL-ARRAYs in MACoefficients specifies the polynomial coefficients of moving average (MA) part from order 1 to order q (or order Q for seasonal MA), as well as the residuals to which the coefficients are to be applied to.

The REAL-ARRAY in Residuals specifies the residual error values, that is the past differences between the actual time series values and the model predictions, to which the MA coefficients are multiplied by. These values are needed for the conditionalLeastSquares prediction method and for GARCH.

Dynamic Regressor

Time series predictions can usually depend on not just the previous values of the series but other independent time series as well. The fields containing such time series are called external or dynamic regressors. They are also referred to as exogenous regressors in State Space models.

The effects of the dynamic regressors can be added to the predictions of an ARIMA model or a State Space model (defined later in this document).

<xs:element name="DynamicRegressor">
  <xs:complexType>
    <xs:sequence>
      <xs:element ref="Extension" minOccurs="0" maxOccurs="unbounded"/>
      <xs:element ref="Numerator" minOccurs="0"/>
      <xs:element ref="Denominator" minOccurs="0"/>
      <xs:element ref="RegressorValues" minOccurs="0"/>
    </xs:sequence>
    <xs:attribute name="field" type="FIELD-NAME" use="required"/>
    <xs:attribute name="transformation" default="none">
      <xs:simpleType>
        <xs:restriction base="xs:string">
          <xs:enumeration value="none"/>
          <xs:enumeration value="logarithmic"/>
          <xs:enumeration value="squareroot"/>
        </xs:restriction>
      </xs:simpleType>
    </xs:attribute>
    <xs:attribute name="delay" type="INT-NUMBER" default="0"/>
    <xs:attribute name="futureValuesMethod" default="constant">
      <xs:simpleType>
        <xs:restriction base="xs:string">
          <xs:enumeration value="constant"/>
          <xs:enumeration value="trend"/>
          <xs:enumeration value="stored"/>
          <xs:enumeration value="otherModel"/>
          <xs:enumeration value="userSupplied"/>
        </xs:restriction>
      </xs:simpleType>
    </xs:attribute>
    <xs:attribute name="targetField" type="FIELD-NAME"/>
  </xs:complexType>
</xs:element>

<xs:element name="Numerator">
  <xs:complexType>
    <xs:sequence>
      <xs:element ref="Extension" minOccurs="0" maxOccurs="unbounded"/>
      <xs:element ref="NonseasonalFactor" minOccurs="0"/>
      <xs:element ref="SeasonalFactor" minOccurs="0"/>
    </xs:sequence>
  </xs:complexType>
</xs:element>

<xs:element name="Denominator">
  <xs:complexType>
    <xs:sequence>
      <xs:element ref="Extension" minOccurs="0" maxOccurs="unbounded"/>
      <xs:element ref="NonseasonalFactor" minOccurs="0"/>
      <xs:element ref="SeasonalFactor" minOccurs="0"/>
    </xs:sequence>
  </xs:complexType>
</xs:element>

<xs:element name="SeasonalFactor">
  <xs:complexType>
    <xs:sequence>
      <xs:element ref="Extension" minOccurs="0" maxOccurs="unbounded"/>
      <xs:group ref="REAL-ARRAY"/>
    </xs:sequence>
    <xs:attribute name="difference" type="INT-NUMBER" default="0"/>
    <xs:attribute name="maximumOrder" type="INT-NUMBER"/>
  </xs:complexType>
</xs:element>

<xs:element name="NonseasonalFactor">
  <xs:complexType>
    <xs:sequence>
      <xs:element ref="Extension" minOccurs="0" maxOccurs="unbounded"/>
      <xs:group ref="REAL-ARRAY"/>
    </xs:sequence>
    <xs:attribute name="difference" type="INT-NUMBER" default="0"/>
    <xs:attribute name="maximumOrder" type="INT-NUMBER"/>
  </xs:complexType>
</xs:element>

Numerator and Denominator define the numerator and denominator polynomials of the fraction for a specified predictor. More details are given in the prediction methods section.

field is the field name to which all the operations defined in this element are to be applied to.

transformation specifies the transformation applied to this field. Possible values are "none" "logarithmic" and "squareroot", with default "none".

delay specifies the number of intervals by which the input field's influence is delayed. For example, if delay="5", the value of the input predictor at time t doesn't affect forecast until five periods have elapsed (t+5).

difference specifies the order of differencing applied to the selected predictor series before estimating models.

maximumOrder defines the maximum order of numerator or denominator polynomial. All positive lower orders should be included in the model and represented in the REAL-ARRAY. In addition, order zero is always included for the numerator polynomial.

RegressorValues contains the observed (and possibly future) values of the predictor. It can also have additional information needed for predicting future values of the predictor. This element is defined below.

futureValuesMethod specifies a way to obtain future values of the predictor needed for predicting the target values. The following options are available:

constant: assume the predictor is constant for future times, i.e. use the last known value for all later times.
trend: use trend with coefficients c_i found in <TrendCoefficients> as follows: x_t+1 = x_t + c₀ + c₁t + c₂t² ...
stored: future values of the predictor are stored in element <TimeSeries> inside <RegressorValues>
otherModel: future values of the predictor are computed as predictions of another model found in the same PMML document, either as part of a MiningModel or directly in PMML element. In the latter case modelName of the model for the predictor must be the same as the predictor name to help find the connections.
userSupplied: future values of the predictor should be supplied by the user during scoring. This approach allows "what-if" analysis, i.e. trying different future values of predictors to see how the target will be affected.

targetField the target time series field name to which the regressor value should be added to. This is only needed for the case of multidimensional time series.

<xs:element name="RegressorValues">
  <xs:complexType>
    <xs:sequence>
      <xs:element ref="Extension" minOccurs="0" maxOccurs="unbounded"/>
      <xs:element ref="TimeSeries" minOccurs="0"/>
      <xs:element ref="TrendCoefficients" minOccurs="0"/>
      <xs:element ref="TransferFunctionValues" minOccurs="0"/>
    </xs:sequence>
  </xs:complexType>
</xs:element>

<xs:element name="TrendCoefficients">
  <xs:complexType>
    <xs:sequence>
      <xs:element ref="Extension" minOccurs="0" maxOccurs="unbounded"/>
      <xs:element ref="REAL-SparseArray" minOccurs="0"/>
    </xs:sequence>
  </xs:complexType>
</xs:element>

<xs:element name="TransferFunctionValues">
  <xs:complexType>
    <xs:sequence>
      <xs:group ref="REAL-ARRAY"/>
    </xs:sequence>
  </xs:complexType>
</xs:element>

The coefficients c_i used for trend computation described above are stored in the TrendCoefficients element, with the index of the array corresponding to the order of the polynomial term to which the coefficient is applied.

TransferFunctionValues can contain the past values of transfer function, which help in computation of the future values of transfer function, see scoring description below.

The Exact Least Squares method

Kalman filter and Theta recursion information is stored in the following element:

<xs:element name="MaximumLikelihoodStat">
  <xs:complexType>
    <xs:sequence>
      <xs:choice>
        <xs:element ref="KalmanState"/>
        <xs:element ref="ThetaRecursionState"/>
      </xs:choice>
    </xs:sequence>
    <xs:attribute name="method" use="required">
      <xs:simpleType>
        <xs:restriction base="xs:string">
          <xs:enumeration value="kalman"/>
          <xs:enumeration value="thetaRecursion"/>
        </xs:restriction>
      </xs:simpleType>
    </xs:attribute>
    <xs:attribute name="periodDeficit" default="0" type="INT-NUMBER"/>
  </xs:complexType>
</xs:element>

method specifies the method used to do the prediction. Different methods involve different parameters that are needed when the model is used to forecast into the future. Supported methods are “kalman” and “thetaRecursion”. The Kalman filter method can be used for time series with or without missing, and the theta recursion method can be used for time series without missing value. The predicted time points are calculated based on Chapters 5 and 12 of Brockwell and Davis’s book.

periodDeficit specifies the number of missing values at the end of differenced series. The actual model fitting period ends at (endTime – periodDeficit). For example, the original dependent values Y[97], Y[98], Y[99], Y[100] are 11, 10, missing, 12, respectively. If the model calls for first-differencing, the differenced series dY[98], dY[99], dY[100] will be -1, missing, missing. The model fitting period will end at (100 – 2), and the periodDeficit=2 means that user expects to forecast from this model to start at 101.

Noise prediction for Kalman filter method

<xs:element name="KalmanState">
  <xs:complexType>
    <xs:sequence>
      <xs:element ref="FinalOmega"/>
      <xs:element ref="FinalStateVector"/>
      <xs:element ref="HVector" minOccurs="0" maxOccurs="1"/>
    </xs:sequence>
  </xs:complexType>
</xs:element>

<xs:element name="FinalOmega">
  <xs:complexType>
    <xs:sequence>
      <xs:element ref="Matrix"/>
    </xs:sequence>
  </xs:complexType>
</xs:element>

<xs:element name="FinalStateVector">
  <xs:complexType>
    <xs:sequence>
      <xs:group ref="REAL-ARRAY"/>
    </xs:sequence>
  </xs:complexType>
</xs:element>

<xs:element name="HVector">
  <xs:complexType>
    <xs:sequence>
      <xs:group ref="REAL-ARRAY"/>
    </xs:sequence>
  </xs:complexType>
</xs:element>

KalmanState represents the final state of the Kalman iterations.
FinalOmega includes a symmetric m*m matrix which represents the final omega matrix where the m = max(p, q).
FinalStateVector includes a REAL-ARRAY. The length of the ARRAY should equal to m, ‘the larger of the degree of autoregression and the degree of moving average’.
HVector includes a REAL-ARRAY to define the first m psi weights instead of recalculating at scoring time.

Noise prediction for theta recursion method

<xs:element name="ThetaRecursionState">
  <xs:complexType>
    <xs:sequence>
      <xs:element ref="FinalNoise"/>
      <xs:element ref="FinalPredictedNoise"/>
      <xs:element ref="FinalTheta"/>
      <xs:element ref="FinalNu"/>
    </xs:sequence>
  </xs:complexType>
</xs:element>

<xs:element name="FinalNoise">
  <xs:complexType>
    <xs:sequence>
      <xs:group ref="REAL-ARRAY"/>
    </xs:sequence>
  </xs:complexType>
</xs:element>

<xs:element name="FinalPredictedNoise">
  <xs:complexType>
    <xs:sequence>
      <xs:group ref="REAL-ARRAY"/>
    </xs:sequence>
  </xs:complexType>
</xs:element>

<xs:element name="FinalTheta">
  <xs:complexType>
    <xs:sequence>
      <xs:element ref="Theta" maxOccurs="unbounded"/>
    </xs:sequence>
  </xs:complexType>
</xs:element>

<xs:element name="Theta">
  <xs:complexType>
    <xs:attribute name="i" type="INT-NUMBER"/>
    <xs:attribute name="j" type="INT-NUMBER"/>
    <xs:attribute name="theta" type="xs:double"/>
  </xs:complexType>
</xs:element>

<xs:element name="FinalNu">
  <xs:complexType>
    <xs:sequence>
      <xs:group ref="REAL-ARRAY"/>
    </xs:sequence>
  </xs:complexType>
</xs:element>

ThetaRecursionState represents the final state of theta-recursion estimation.
FinalNoise includes a REAL-ARRAY which represents the final values of noise. The length of the ARRAY equals to m, ‘the larger of the degree of autoregression and the degree of moving average’.
FinalPredictedNoise includes a REAL-ARRAY which represents the final predictions of noise. The length of the ARRAY equals to q, the degree of moving average.
FinalTheta contains a collection of θ_i,i-k values needed for predicting future values of the series. Those values are kept in triples with two indices to avoid ambiguity. Each triple is kept in element Theta that has attributes for i, j=i-k, and the corresponding θ_i,j value. FinalNu contain values for ν_i.

OutlierEffect

<xs:element name="OutlierEffect">
  <xs:complexType>
    <xs:sequence>
      <xs:element ref="Extension" minOccurs="0" maxOccurs="unbounded"/>
    </xs:sequence>
    <xs:attribute name="type" use="required">
      <xs:simpleType>
        <xs:restriction base="xs:string">
          <xs:enumeration value="additive"/>
          <xs:enumeration value="level"/>
          <xs:enumeration value="transient"/>
          <xs:enumeration value="seasonalAdditive"/>
          <xs:enumeration value="trend"/>
          <xs:enumeration value="innovational"/>
        </xs:restriction>
      </xs:simpleType>
    </xs:attribute>
    <xs:attribute name="startTime" use="required" type="REAL-NUMBER"/>
    <xs:attribute name="magnitude" use="required" type="REAL-NUMBER"/>
    <xs:attribute name="dampingCoefficient" use="optional" type="REAL-NUMBER"/>
  </xs:complexType>
</xs:element>

OutlierEffect supports a number of detected outliers during the modeling process. Their types, locations, and magnitudes are represented by attributes type, startTime, and magnitude respectively. dampingCoefficient specifies the value of delta for the transient outlier. Details of their application are given in the Prediction Methods section.

additive means there is an additive outlier, as a pulse function, at the time of startTime which needs to add a magnitude for predicted values at that time.

level means there is a level shift, as a step function, from the time of startTime which needs to add a magnitude for all predicted values from that time on.

transient means there is a transient change, as a damp function, from the time of startTime which needs to add a magnitude for predicted values from that time on, and the magnitude is damped over time.

seasonalAdditive is similar with level, but the magnitude, as a seasonal pulse function, is only added to the subsequent time points with the same seasonal phase of startTime.

trend means there is a local trend from the time of startTime which needs to add a magnitude for predicted values from that time, and the magnitude is increased or decreased over time based on its sign.

innovational means there is an innovational outlier at the time of startTime which needs to add a magnitude for the noise at that time.

Prediction Methods for ARIMA

For ARIMA models, the conditionalLeastSquares prediction method is simple. Given all information up to time t, one predicts the time series at time t+1, and then uses that to predict the series at time t+2 and so on. Given all information up to a given time t, the prediction at time t+1 is calculated using the ARIMA equations defined, the information required is the AR and MA coefficients and the residuals. To calculate predictions at time t+2 (or t+h in general), one needs the residuals and time series values up to time t+1 (or t+h-1 in general) to apply the AR and MA coefficients to respectively. One assumes all residuals after time t are 0 and the time series values after time t are given by the prediction calculations done for those time values. Scoring examples below provide all the details. The future values of the regressors are either calculated based on a trend, or are assumed not to change with time, or are supplied in PMML or are provided by the user during scoring.

Example for a non-seasonal ARIMA model with conditional least squares prediction

Consider an ARIMA(3,1,1) model with auto-regressive coefficients φ₁, φ₂, φ₃ and a moving-average coefficient θ₁. The model is represented as:

Δ₁Y_t = θ₁(B)/ϕ₃(B) * a_t

Multiplying both sides by ϕ₃ (B) this can be written as:

(1 - φ₁B - φ₂B² - φ₃B³)(1-B)y_t = (1 - θ₁B)a_t

Rewriting this, the prediction at time t+1 is:

y_t+1 = (1 + φ₁)y_t - (φ₁ - φ₂)y_t-1 - (φ₂ - φ₃)y_t-2 - φ₃y_t-3 - θ₁a_t

here a_t is the residual at time t. The AR and MA elements give the coefficients at each time step. The coefficient array gives the values of the coefficients starting at time t, t-1 and so on. Assume that the AR and MA coefficients at each time step were calculated as φ₁=0.05, φ₂=0.1, φ₃=0.3, θ₁=-0.4 and the fit error (residual) was 2:

<PMML xmlns="https://www.dmg.org/PMML-4_4" version="4.4">
  <Header copyright="DMG.org"/>
  <DataDictionary>
    <DataField name="orders" optype="continuous" dataType="double" displayName="TS"/>
    <DataField name="date" optype="continuous" dataType="dateDaysSince[1970]" displayName="Time"/>
  </DataDictionary>
  <TimeSeriesModel functionName="timeSeries" bestFit="ARIMA">
    <MiningSchema>
      <MiningField name="date" usageType="order"/>
      <MiningField name="orders" usageType="target"/>
    </MiningSchema>
    <TimeSeries usage="logical" startTime="1" endTime="6">
      <TimeAnchor type="dateDaysSince[1970]" offset="6940" stepsize="1" displayName="Month">
        <TimeCycle length="12" type="includeFromTo" displayName="Year">
          <Array type="int" n="2">0 11</Array>
        </TimeCycle>
      </TimeAnchor>
      <TimeValue index="1" value="11357.92"/>
      <TimeValue index="2" value="10605.95"/>
      <TimeValue index="3" value="16998.57"/>
      <TimeValue index="4" value="6563.75"/>
      <TimeValue index="5" value="6607.69"/>
      <TimeValue index="6" value="9839.0"/>
    </TimeSeries>
    <ARIMA RMSE="218.6" transformation="none">
      <NonseasonalComponent p="3" d="1" q="1">
        <AR>
          <Array type="real" n="3">0.05 0.1 0.3 </Array>
        </AR>
        <MA>
          <MACoefficients>
            <Array type="real">-0.4</Array>
          </MACoefficients>
          <Residuals>
            <Array type="real">2</Array>
          </Residuals>
        </MA>
      </NonseasonalComponent>
    </ARIMA>
  </TimeSeriesModel>
</PMML>

The prediction of 'orders' at time index "7" is therefore:

orders = 1.05*9839 + 0.05*6607.69 + 0.2*6563.75 - 0.3*16998.57 - (-0.4)*2

Transfer Function Scoring and Examples

Forecasting in the historical data

Suppose that a data set {Y_t,X₁t,…,X_Kt} is given for t=1,...,n, also a time series model is given. Then the one-step-ahead forecast can be obtained using the following steps:

Step 1. Computation of transfer function

For each predictor series, transfer function needs to be calculated. For a predictor series X_it, let the transfer function be V_it = ^Num_i / _{Den_i} Δ_{d_i} Δ_{D_i} B^l_i. It can be calculated as follows.
Calculate U_it = Δ_{d_i} Δ_{D_i} B^l_i X_it
Recursively calculate V_it = -Σ_{j=1 to d_i} Den_i.coeff(j) * V_it-j + Σ_{j=0 to n_i} Num_i.coeff(j) * U_it-j
where n_i and d_i are degrees of numerator and denominator polynomials for the i-th predictor, respectively, Num_i.coeff(j) and Den_i.coeff(j) are the coefficients of those polynomials. All missing V_it-j in the first term are taken to be V_i,-∞ and missing U_it-j in the second term are taken to be U_i,-∞, where U_i,-∞ is the first non-missing measurement of U_it-j and V_i,-∞ = ^{Num_i.coeff(1)}/ _{Den_i.coeff(1)} * U_i,-∞.

Step 2. Computation of noise process

N_t = Δ_d Δ_D Y_t - μ - Σ_{i=1 to K} ^Num_i / _{Den_i} Δ_{d_i} Δ_{D_i} B^l_i X_it

Step 3. One-step-ahead forecast of noise process

N_t is a zero mean ARMA(p+sP,q+sQ) process. One-step-ahead forecast can be obtained by theta recursive method or Kalman filter method, denote it as N̂_t.

Step 4. One-step-ahead of forecast for the series Δ_d Δ_d Y_t

Let Z_t = Δ_d Δ_D Y_t, then one-step-ahead of series Z_t is
Ẑ_t = μ + Σ_{i=1 to K} V_it + N̂_t

Step 5. One-step-ahead of forecast the Y_t

Ŷ_t = Ẑ_t - Σ_{j=1 to d+Ds} τ_j Y_t-j
where τ_j is the coefficient corresponding to power j of the difference operator Δ_d Δ_D
Note: If Y_t is a transformation of an original series, then we need to back transform results to get the prediction of the original series.

Forecasting beyond the historical data

Suppose that time series {Y_t, X_1t,…,X_Kt} are given up to time n, and that future predictor series {X_1t,…,X_Kt} are given for up to time n+H or can be computed based on futureValuesMethod specified in DynamicRegressor element. Now we can use the transfer function model that is built on the historical data to forecast future values of target for t=n+1 to n+H using the method described in the above section except that we cannot compute the noise values N_t, t = n+1,...,n+H because there is no target value Y_t, t = n+1,...,n+H. So based on N_t and N̂_t in historical data, we will compute one-step-ahead forecast for noise process if t=n+1, and h-step-ahead forecast if h is between 1 and n+H.

Kalman filter example and scoring

<PMML xmlns="https://www.dmg.org/PMML-4_4" version="4.4">
    <Header copyright="DMG.org"/>
    <DataDictionary>
        <DataField name="VALUE" optype="continuous" dataType="double" displayName=" TS-VALUE"/>
        <DataField name="TS" optype="continuous" dataType="double" displayName="TS"/>
    </DataDictionary>
    <TimeSeriesModel functionName="timeSeries" bestFit="ARIMA">
        <MiningSchema>
            <MiningField name="TS" usageType="order"/>
            <MiningField name="VALUE" usageType="target"/>
        </MiningSchema>
        <TimeSeries usage="logical" startTime="1" endTime="240">
            <TimeValue index="1" value="1.00"/>
            <TimeValue index="2" value="0.58"/>
            <TimeValue index="239" value="0.49"/>
            <TimeValue index="240" value="0.48"/>
        </TimeSeries>
        <ARIMA RMSE="1.00854505389749" transformation="none" constantTerm="0.375271182246529" predictionMethod="exactLeastSquares">
            <NonseasonalComponent p="1" d="0" q="1">
                <AR>
                    <Array type="real"> 0.441912328691372 </Array>
                </AR>
                <MA>
                    <MACoefficients>
                        <Array type="real">-0.135110189708823</Array>
                    </MACoefficients>
                </MA>
            </NonseasonalComponent>
            <MaximumLikelihoodStat method="kalman" periodDeficit="0">
                 <KalmanState>
                    <FinalOmega>
                        <Matrix kind="symmetric" nbRows="1" nbCols="1">
                            <Array type="real" n="1">0</Array>
                        </Matrix>
                    </FinalOmega>
                    <FinalStateVector>
                        <Array n="1" type="real">0.00493251439212172</Array>
                    </FinalStateVector>
                </KalmanState>
            </MaximumLikelihoodStat>
        </ARIMA>
    </TimeSeriesModel>
</PMML>

This is an ARIMA(1,0,1) model Y_t = μ + ^{( 1 - θ B )} / _{( 1 - φ B )} a_t
Here μ = 0.375271182246529, θ = -0.135110189708823, and φ = 0.441912328691372. Since there is no predictor, we do not need to compute the transfer function values. Let N_t = Y_t - μ, t = 1,2,...240. We can get the forecast value for N₂₄₁ assuming that the noise N_t follows ARMA(1,1) model. Let m = max( p, q ). The state-space representation of N_t has been already given as
S_t+1 = FS_t + He_t
N_t = GS_t + e_t
where S_t is a state vector of length m.

In this case, m=1, G=1, F=φ, and S₂₄₁ = 0.00493251439212172 which is computed after model building and saved in PMML. The one-step-ahead forecast of N₂₄₁ is
N̂₂₄₁ = GS₂₄₁ = 0.00493251439212172.
And the one-step-ahead forecast of Y₂₄₁ is Ŷ₂₄₁ = μ + N̂₂₄₁ = 0.375271182246529 + 0.00493251439212172 = 0.3802037
If we want the two-step-ahead forecast, then S₂₄₂ = FS₂₄₁ = 0.441912328691372 * 0.00493251439212172 = 0.002179739, N̂₂₄₂ = GS₂₄₂=0.002179739, and Ŷ₂₄₂ = μ + N̂₂₄₂ = 0.3774509.

Theta Recursion Example and Scoring

<PMML xmlns="https://www.dmg.org/PMML-4_4" version="4.4">
    <Header copyright="DMG.org"/>
    <DataDictionary>
        <DataField name="TS" optype="continuous" dataType="double" displayName="T"/>
        <DataField name="x" optype="continuous" dataType="double" displayName="predictor-series"/>
        <DataField name="y" optype="continuous" dataType="double" displayName="target-series"/>
    </DataDictionary>
    <TimeSeriesModel functionName="timeSeries" bestFit="ARIMA">
        <MiningSchema>
            <MiningField name="TS" usageType="order"/>
            <MiningField name="x" usageType="active"/>
            <MiningField name="y" usageType="target"/>
        </MiningSchema>
        <TimeSeries usage="logical" startTime="1" endTime="63">
            <TimeValue index="1" value="3750.1797916378027"/>
            <TimeValue index="2" value="3846.066273602286"/>
            <TimeValue index="63" value="12141.488999636887"/>
        </TimeSeries>
        <ARIMA RMSE="90.8960561930059" transformation="none" constantTerm="342.594932405502" predictionMethod="exactLeastSquares">
            <NonseasonalComponent p="1" d="1" q="1">
                <AR>
                    <Array type="real">0.590549588503187</Array>
                </AR>
                <MA>
                    <MACoefficients>
                        <Array type="real">0.459274266537189</Array>
                    </MACoefficients>
                </MA>
            </NonseasonalComponent>
            <DynamicRegressor field="x" futureValuesMethod="userSupplied">
                <Numerator>
                    <NonseasonalFactor>
                        <Array type="real">-0.0114728000479396 -.0115880130277021</Array>
                    </NonseasonalFactor>
                </Numerator>
                <RegressorValues>
                    <TimeSeries usage="logical" startTime="63" endTime="63">
                        <TimeValue index="63" value="2538309.727789499"/>
                    </TimeSeries>
                </RegressorValues>
            </DynamicRegressor>
            <MaximumLikelihoodStat method="thetaRecursion" periodDeficit="0">
                 <ThetaRecursionState>
                    <FinalNoise>
                        <Array n="1" type="real">127.264876980187</Array>
                    </FinalNoise>
                    <FinalPredictedNoise>
                        <Array n="1" type="real">-2.10096285515485</Array>
                    </FinalPredictedNoise>
                    <FinalTheta>
                        <Theta i="63" j="1" theta="-0.459274266537189"/>
                    </FinalTheta>
                    <FinalNu>
                        <Array n="1" type="real">1</Array>
                    </FinalNu>
                </ThetaRecursionState>
            </MaximumLikelihoodStat>
        </ARIMA>
    </TimeSeriesModel>
</PMML>

The dataset has two variables: target Y and predictor X. The dataset has 63 points and the model is ARIMA(1,1,1) with a predictor X_t:
(1-B)Y_t = μ + (ω₀ - ω₁ B) X_t + (1 - θB)/(1 - φB) a_t
Where μ=342.594932405502, ω₀=-0.0114728000479396 , ω₁=-0.0115880130277021, θ=0.459274266537189 and φ=0.590549588503187. Now the following steps are used to compute the forecast value for t = 64.

Step 1. Compute transfer function for X_t:
V₆₄ = ω₀*X₆₄ - ω₁*X₆₃ = -0.0114728000479396 * 2538309.727789499 - 0.0115880130277021 * 2538309.727789499 = 292.4462.
Where X₆₃ is saved in the PMML inside RegressorValues in DynamicRegressor, and X₆₄ is obtained based on the constant trend.

Step 2. Compute the noise prediction
N̂₆₄=φ * N₆₃ + θ_63,1 ( N₆₃ - N̂₆₃ )
where N₆₃ is noise at the time t=63. The noise N₆₃, noise prediction N̂₆₃, and parameter θ_63,1 are saved in PMML after model building. We compute N̂₆₄ = 15.74182.

Step 3. Let Z₆₄ = ( 1 - B ) Y₆₄. Its prediction is Ẑ₆₄ = μ + V₆₄ + N̂₆₄ = 650.783

Step 4. The forecasted value for target at t=64 is Ŷ₆₄ = Y₆₃ + Ẑ₆₄ = 12141.488999636887 + 650.783 = 12792.27

More generally, theta recursion scoring is computed as follows. Let N_t follow ARMA( p, q ) model. One-step-ahead noise forecast is given by

  N̂_n+1 = φ₁ N_n + φ₂ N_n-1 +...+ φ_p N_n+1-p + Σ_{j=1 to q} θ_n,j (N_n+1-j - N̂_n+1-j )

h-step-ahead noise forecast when h is less than or equal q is
   N̂_n(h) = Σ_{i=1 to p} φ_i N̂_n (h-i) + Σ_{j=h to q} θ_n+h-1,j (N_n+h-j - N̂_n+h-j )
and for h > q
   N̂_n(h) = Σ_{i=1 to p} φ_i N̂_n (h-i)
where N̂_n (j) = N_n-j for j less than or equal to 0.

And θ_ij are computed recursively as follows:
For i=n+1,n+2,... when k < i - q we set θ_i,i-k = 0. For k between i-q and i-1 we have
  θ_i,i-k = ν_k^-1 (κ(i+1,k+1) - Σ_{j=i-q to k-1} θ_k,k-j θ_i,i-j ν_j )
  ν_i = κ(i+1,i+1) - Σ_{j=i-q to i-1} θ_i,i-j² ν_j = Σ_{r=0 to q} ϑ_r² - Σ_{k=1 to q} θ_i,k² ν_i-k
where κ(i+1,k+1) = Σ_{r=0 to q-|i-k|} ϑ_r ϑ_r+|i-k|, for i-q-1 < k < i, with ϑ₀ = 1, ϑ_i = -θ_i for i=1,...,q, and ϑ_i = 0 if i > q.

Consider a slightly more complicated example of a theta recursion model.

<PMML xmlns="https://www.dmg.org/PMML-4_4" version="4.4">
    <Header copyright="DMG.org"/>
    <DataDictionary>
        <DataField name="TS" optype="continuous" dataType="double" displayName="T"/>
        <DataField name="x" optype="continuous" dataType="double" displayName="predictor-series"/>
        <DataField name="y" optype="continuous" dataType="double" displayName="target-series"/>
    </DataDictionary>
    <TimeSeriesModel functionName="timeSeries" bestFit="ARIMA">
        <MiningSchema>
            <MiningField name="TS" usageType="order"/>
            <MiningField name="x" usageType="active"/>
            <MiningField name="y" usageType="target"/>
        </MiningSchema>
        <TimeSeries usage="logical" startTime="1" endTime="63">
            <TimeValue index="1" value="3750.1797916378027"/>
            <TimeValue index="2" value="3846.066273602286"/>
            <TimeValue index="63" value="12141.488999636887"/>
        </TimeSeries>
        <ARIMA transformation="none" constantTerm="320.636778635366" predictionMethod="exactLeastSquares">
            <NonseasonalComponent p="1" d="1" q="2">
                <AR>
                    <Array type="real">-0.09114206662863</Array>
                </AR>
                <MA>
                    <MACoefficients>
                        <Array type="real">-0.21505809467966 -0.182217351518243 </Array>
                    </MACoefficients>
                </MA>
            </NonseasonalComponent>
            <DynamicRegressor field="x" futureValuesMethod="userSupplied">
                <Numerator>
                    <NonseasonalFactor>
                        <Array type="real">-0.0101715603339601  -0.0102725915582332</Array>
                    </NonseasonalFactor>
                </Numerator>
                <RegressorValues>
                    <TimeSeries usage="logical" startTime="63" endTime="63">
                        <TimeValue index="63" value="2538309.727789499"/>
                    </TimeSeries>
                </RegressorValues>
            </DynamicRegressor>
            <MaximumLikelihoodStat method="thetaRecursion" periodDeficit="0">
                <ThetaRecursionState>
                    <FinalNoise>
                        <Array n="2" type="real">-0.948468422344376 130.727431958243</Array>
                    </FinalNoise>
                    <FinalPredictedNoise>
                        <Array n="2" type="real">-9.3141625486384 27.6726476335164</Array>
                    </FinalPredictedNoise>
                    <FinalTheta>
                        <Theta i="63" j="1" theta="0.21505809467966"/>
                        <Theta i="63" j="2" theta="0.182217351518243"/>
                    </FinalTheta>
                    <FinalNu>
                        <Array n="2" type="real">1 1</Array>
                    </FinalNu>
                </ThetaRecursionState>
            </MaximumLikelihoodStat>
        </ARIMA>
    </TimeSeriesModel>
</PMML>

The model is ARIMA(1,1,2) with a predictor X_t:
(1-B)Y_t = μ + (ω₀ - ω₁ B) X_t + (1 - θ₁B - θ₂B²)/(1 - φB) a_t
Where μ=320.636778635366, ω₀=-0.0101715603339601 , ω₁=-0.0102725915582332, θ₁=-0.21505809467966, θ₂=-0.182217351518243, and φ=-0.09114206662863. Now the following steps are used to compute the forecast value for t = 64.

Step 1. Compute transfer function for X_t:
V₆₄ = ω₀*X₆₄ - ω₁*X₆₃ = =-0.0101715603339601 * 2538309.727789499 -(-0.0102725915582332) * 2538309.727789499 = 256.4485.
Where X₆₃ is saved in the PMML inside RegressorValues in DynamicRegressor, and X₆₄ is specified by user at the time of scoring.

Step 2. Compute the noise prediction
N̂₆₄ = φ * N₆₃ + θ_63,1 ( N₆₃ - N̂₆₃ ) + θ_63,2 ( N₆₂ - N̂₆₂ )
where N₆₃ is noise at the time t=63, N₆₂ is noise at the time t=62. The noise N₆₂ and N₆₃, noise prediction N̂₆₂ and N̂₆₃, and parameters θ_63,1 and θ_63,2 are saved in PMML after model building. We compute N̂₆₄ = 11.77238.

Step 3. Let Z₆₄ = ( 1 - B ) Y₆₄. Its prediction is Ẑ₆₄ = μ + V₆₄ + N̂₆₄ = 588.8577

Step 4. The forecasted value for target at t=64 is Ŷ₆₄ = Y₆₃ + Ẑ₆₄ = 12141.488999636887 + 588.8577 = 12730.35

To predict the time series at t=65 we will use the same steps, but we also need to compute the new θ_i,i-k.

Step 1. Compute transfer function for X_t, with user-supplied predictor values:
V₆₅ = ω₀*X₆₅ - ω₁*X₆₄ = =-0.0101715603339601 * 2538308 -(-0.0102725915582332) * 2538309.727789499 = 256.4661.

Step 2. Compute the noise prediction
N̂₆₅ = φ * N̂₆₄ + θ_64,2 ( N₆₃ - N̂₆₃ )
where θ_64,2 is computed as follows:
Let i=64, k=62, then according to the theta recursive method, θ_64,2 = θ_64,(64-62) = ν₆₂^-1 (κ(65,63) - θ_62,1 * θ_64,3 * ν₆₁ ) = ν₆₂^-1 (ϑ₀ ϑ₂ - θ_62,1 * θ_64,3 * ν₆₁ )
where ν₆₁=1 and ν₆₂=1, and they are saved in PMML after model building.
κ(65,63) = ϑ₀ * ϑ₂ = 1*(-θ₂ ) = 0.182217351518243, and θ_64,3 = 0, therefore θ_64,2 = ν₆₂^-1 (ϑ₀ ϑ₂ ) = 0.1822174
Finally, N̂₆₃(2) = φ * N̂₆₄ + θ_64,2 (N₆₃ - N̂₆₃ ) = -0.09114206662863 * 11.77238 + 0.1822174 * (130.727431958243 - 27.6726476335164) = 17.70541

Step 3. Let Z₆₅ = (1-B)Y₆₅. Its prediction is Ẑ₆₅ = μ + V₆₅ + N̂₆₃(2) = 594.8083

Step 4. The forecast value for target at t=65 is Ŷ₆₅ = Ŷ₆₄ + Ẑ₆₅ = 12730.35 + 594.8083 = 13325.16

Confidence of Prediction Calculation

Step 1. Compute the variance estimate of h-step ahead of noise process:

If the theta recursive method is used, the variance of h-step ahead of noise process can be computed as

σ_{N_t}²(h) = σ²Σ_{j=0 to h-1} (Σ_{r=0 to j} χ₂θ_t+h-r-1,j-r)²ν_t+h-j-1

Where σ² is estimated as the squared RMSE, the constants χ_r are calculated recursively as

χ₀ = 1

χ_r = Σ_{k=1 to min(p,r)} φ_kχ_r-k, r=1,2,3,...

And the computation details of ν_t and θ_ij can be found in proposition 5.2.2 and equation 5.3.9 in Chapter 5 of Brockwell and Davis (1991) where this is called "The innovation algorithm".

If the Kalman Filter method is used, the variance of h-step ahead of noise process can be computed as

σ_{N_t}²(h) = σ²(Ω_t^(h)(1,1) + 1)

Where σ² is estimated as the squared RMSE, and the Ω_t^(h)(1,1) can be computed as

Ω_t^(h) = FΩ_t^(h-1)F^T + HH^T

with Ω_t⁽¹⁾ = Ω_t+1

Step 2. Compute the confidence interval of h-step-ahead prediction:

The 100(1-α)% confidence interval of prediction at the time point t+h is computed as

(Ŷ_t(h) - t_df,α/2*σ_{N_t}(h), Ŷ_t(h) + t_df,α/2*σ_{N_t}(h))

where t_df,α/2 is the (1-α/2)100^th percentile of the t distribution with degree of freedom df which can be computed by the number of valid noise residuals minus the number of parameters.

GARCH

GARCH (Generalized Auto Regressive Conditional Heteroscedasticity) models (Bollerslev, T. P., 1986, Generalized Autoregressive Conditional Heteroscedasticity, Journal of Econometrics, 31: 307-327.) are widely used to characterize and model time series with time-varying volatility. In GARCH models the conditional variance of the error terms are assumed to have a structure very similar to the structure of the conditional expectation in an ARMA model, i.e., GARCH models assume the variance of the current error term to be a linear function of the squares of the previous errors and their variances:

Y_t = μ + Σ_{i=1 to p} φ_iY_t-i + Σ_{j=1 to q} θ_jε_t-j + ε_t
h_t = α₀ + Σ_{i=1 to gq} α_iε_t-i² + Σ_{j=1 to gp} β_jh_t-j

Here ε_t are residuals from the MA part of the non-seasonal component, h_t are variances, α_i and β_j are residual square coefficients and coefficients of past variances, respectively. gp and gq are order of variance and order of residual squares, respectively.

The following element contains both the ARMA part and a GARCH part for a complete GARCH model.

<xs:element name="GARCH">
  <xs:complexType>
    <xs:sequence>
      <xs:element ref="Extension" minOccurs="0" maxOccurs="unbounded"/>
      <xs:element ref="ARMAPart"/>
      <xs:element ref="GARCHPart"/>
    </xs:sequence>    
  </xs:complexType>
</xs:element>

<xs:element name="ARMAPart">
  <xs:complexType>
    <xs:sequence>
      <xs:element ref="Extension" minOccurs="0" maxOccurs="unbounded"/>
      <xs:element ref="AR"/>
      <xs:element ref="MA"/>
    </xs:sequence>    
    <xs:attribute name="constant" default="0" type="REAL-NUMBER"/>
    <xs:attribute name="p" use="required" type="INT-NUMBER"/>
    <xs:attribute name="q" use="required" type="INT-NUMBER"/>
  </xs:complexType>
</xs:element>

<xs:element name="GARCHPart">
  <xs:complexType>
    <xs:sequence>
      <xs:element ref="Extension" minOccurs="0" maxOccurs="unbounded"/>
      <xs:element ref="ResidualSquareCoefficients"/>
      <xs:element ref="VarianceCoefficients"/>
    </xs:sequence>    
    <xs:attribute name="constant" default="0" type="REAL-NUMBER"/>
    <xs:attribute name="gp" use="required" type="INT-NUMBER"/>
    <xs:attribute name="gq" use="required" type="INT-NUMBER"/>
  </xs:complexType>
</xs:element>

<xs:element name="ResidualSquareCoefficients">
  <xs:complexType>
    <xs:sequence>
      <xs:element ref="Extension" minOccurs="0" maxOccurs="unbounded"/>
      <xs:element ref="Residuals" minOccurs="0"/>
      <xs:element ref="MACoefficients" minOccurs="0"/>
    </xs:sequence>
  </xs:complexType>
</xs:element>

<xs:element name="VarianceCoefficients">
  <xs:complexType>
    <xs:sequence>
      <xs:element ref="Extension" minOccurs="0" maxOccurs="unbounded"/>
      <xs:element ref="PastVariances" minOccurs="0"/>
      <xs:element ref="MACoefficients" minOccurs="0"/>
    </xs:sequence>
  </xs:complexType>
</xs:element>

<xs:element name="PastVariances">
  <xs:complexType>
    <xs:sequence>
      <xs:element ref="Extension" minOccurs="0" maxOccurs="unbounded"/>
      <xs:group ref="REAL-ARRAY"/>
    </xs:sequence>
  </xs:complexType>
</xs:element>

GARCH Example and Scoring

<PMML xmlns="https://www.dmg.org/PMML-4_4" version="4.4">
  <Header copyright="(c) DMG 2016" description="Time Series Example"/>
  <DataDictionary numberOfFields="1">
    <DataField dataType="double" displayName="Time Series" name="ts" optype="continuous"/>
  </DataDictionary>
  <TimeSeriesModel algorithmName="GARCH" bestFit="GARCH" functionName="timeSeries">
      <MiningSchema>
        <MiningField name="ts" usageType="target"/>
      </MiningSchema> 
      <TimeSeries usage="original" startTime="97" endTime="100" interpolationMethod="none">
        <TimeValue index="97" value="1.1"/>
        <TimeValue index="98" value="1.2"/>
        <TimeValue index="99" value="1.3"/>
        <TimeValue index="100" value="1.4"/>
      </TimeSeries>
      <GARCH>
        <ARMAPart p="4" q="2" constant="-3.4">
            <AR>
                <Array n="4" type="real">1.5 1.4 1.3 1.2</Array>
            </AR>
            <MA>
                <MACoefficients>
                    <Array n="2" type="real">1.1 1.2</Array>
                </MACoefficients>
                <Residuals>
                    <Array n="2" type="real">1.1 1.2</Array>
                </Residuals>
            </MA>
        </ARMAPart>
        <GARCHPart constant="0.12" gp="2" gq="3">
            <ResidualSquareCoefficients>
                <Residuals>
                    <Array n="3" type="real">1.1 1.2 1.3</Array>
                </Residuals>
                <MACoefficients>
                    <Array n="3" type="real">1.3 1.2 1.1</Array>
                </MACoefficients>
            </ResidualSquareCoefficients>
            <VarianceCoefficients>
                <PastVariances>
                    <Array n="2" type="real">1.1 1.2</Array>
                </PastVariances>
                <MACoefficients>
                    <Array n="2" type="real">1.3 1.2</Array>
                </MACoefficients>
            </VarianceCoefficients>
        </GARCHPart>
      </GARCH>
  </TimeSeriesModel>
</PMML>

The steps to score such a model one step ahead are:

Compute Y_t+1 using conditional least squares prediction method.
Compute ε_t+1 = Y_t+1 - Σ_{i=1 to p} φ_iY_t+1-i - Σ_{j=1 to q} θ_jε_t+1-j - μ
Compute h_t+1 = α₀ + Σ_{i=1 to gq} α_iε_t+1-i² + Σ_{j=1 to gp} β_jh_t+1-j, where α₀ is the constant in GARCHPart element, in this case 0.12.

This can be repeated for the needed number of steps.

In this example PMML the values of Y_t are: 1.1, 1.2, 1.3, 1.4, μ=-3.4, p=4, q=2, gp=2, gq=3; values of φ_i are found in AR element of ARMAPart: 1.5, 1.4, 1.3, 1.2; values of θ_j are in MACoefficients of MA in ARMAPart: 1.1, 1.2; α_i's are the MACoefficients in ResidualSquareCoefficients: 1.3, 1.2, 1.1; β_i's are the MACoefficients in VarianceCoefficients: 1.3, 1.2.

State Space models

State Space models are multi-dimensional time series models which can be represented in the following form:

Y_t = GX_t + W_t
X_t = FX_t-1 + V_t-1

The Observable vector Y is a vector representing an m-dimensional time series y_t at time t.
The State vector X is a size p vector describing how the series is advanced from time t-1 to time t.
The Measurement matrix G is a m by p matrix converting the State Vector X at time t to an observable vector Y_t.
The Transition matrix F is a p by p matrix describing how the state of the series changes from the previous time step to the present time.
The Intercept vector is a vector of intercepts which is used for multivariate models. In the case of univariate models and single intercept values the attribute "intercept" can be used.
The Predicted State Covariance matrix is the estimate of the state uncertainty matrix.
The Selected State Covariance matrix is utilized to calculate the estimate of the state uncertainity matrix.
The Observation Variance matrix defines the observation variance of that time series.
W and V represent noise and error, not important for prediction of the value of each time series itself.
If dynamic regressors are present, the final predicted value is the sum of the value predicted by the dynamic regressors and the value predicted by the matrix multiplication method.

Note that general ARIMA models can be expressed as state space models, but not all state space models are ARIMA models.

<xs:element name="StateSpaceModel">
  <xs:complexType>
    <xs:sequence>
      <xs:element ref="Extension" minOccurs="0" maxOccurs="unbounded"/>
      <xs:element ref="StateVector" minOccurs="0"/>
      <xs:element ref="TransitionMatrix" minOccurs="0"/>
      <xs:element ref="MeasurementMatrix" minOccurs="0"/>
      <xs:element ref="InterceptVector" minOccurs="0"/>
      <xs:element ref="PredictedStateCovarianceMatrix" minOccurs="0"/>
      <xs:element ref="SelectedStateCovarianceMatrix" minOccurs="0"/>
      <xs:element ref="ObservationVarianceMatrix" minOccurs="0"/>
      <xs:element ref="PsiVector" minOccurs="0"/>
      <xs:element ref="DynamicRegressor" minOccurs="0" maxOccurs="unbounded"/>
    </xs:sequence>
    <xs:attribute name="variance" type="REAL-NUMBER"/>
    <xs:attribute name="period" default="none"/>
    <xs:attribute name="intercept" type="REAL-NUMBER" default="0"/>
  </xs:complexType>
</xs:element>

<xs:element name="StateVector">
  <xs:complexType>
    <xs:sequence>
      <xs:element ref="Extension" minOccurs="0" maxOccurs="unbounded"/>
      <xs:group ref="REAL-ARRAY"/>
    </xs:sequence>
  </xs:complexType>
</xs:element>

<xs:element name="TransitionMatrix">
  <xs:complexType>
    <xs:sequence>
      <xs:element ref="Extension" minOccurs="0" maxOccurs="unbounded"/>
      <xs:element ref="Matrix"/>
    </xs:sequence>
  </xs:complexType>
</xs:element>

<xs:element name="MeasurementMatrix">
  <xs:complexType>
    <xs:sequence>
      <xs:element ref="Extension" minOccurs="0" maxOccurs="unbounded"/>
      <xs:element ref="Matrix"/>
    </xs:sequence>
  </xs:complexType>
</xs:element>

<xs:element name="InterceptVector">
    <xs:complexType>
        <xs:sequence>
            <xs:element ref="Extension" minOccurs="0" maxOccurs="unbounded"/>
            <xs:group ref="REAL-ARRAY"/>
        </xs:sequence>
        <xs:attribute name="type" default="state">
          <xs:simpleType>
            <xs:restriction base="xs:string">
              <xs:enumeration value="state"/>
              <xs:enumeration value="observation"/>
            </xs:restriction>
         </xs:simpleType>
    </xs:attribute>
    </xs:complexType>
</xs:element>

<xs:element name="PredictedStateCovarianceMatrix">
    <xs:complexType>
        <xs:sequence>
            <xs:element ref="Extension" minOccurs="0" maxOccurs="unbounded"/>
            <xs:element ref="Matrix"/>
        </xs:sequence>
    </xs:complexType>
</xs:element>

<xs:element name="SelectedStateCovarianceMatrix">
    <xs:complexType>
        <xs:sequence>
            <xs:element ref="Extension" minOccurs="0" maxOccurs="unbounded"/>
            <xs:element ref="Matrix"/>
        </xs:sequence>
    </xs:complexType>
</xs:element>

<xs:element name="ObservationVarianceMatrix">
    <xs:complexType>
        <xs:sequence>
            <xs:element ref="Extension" minOccurs="0" maxOccurs="unbounded"/>
            <xs:element ref="Matrix"/>
        </xs:sequence>
    </xs:complexType>
</xs:element>

<xs:element name="PsiVector">
  <xs:complexType>
    <xs:sequence>
      <xs:element ref="Extension" minOccurs="0" maxOccurs="unbounded"/>
      <xs:group ref="REAL-ARRAY"/>
    </xs:sequence>
    <xs:attribute name="targetField" type="xs:string"/>
    <xs:attribute name="variance" type="xs:string"/>
  </xs:complexType>
</xs:element>

The vector "PsiVector" is a representation of an ARMA model as a MA model of infinite order. The elements of this vector, ψ, can be used to forecast the variance of the next step prediction; in most cases, a very good approximation of the variance of the predicted value h time steps in the future is:
σ²_t+h = σ_t²(1 + ψ₁² + ... + ψ_h-1²)
For the case of multidimensional time series models, the targetField attribute defines which time series the vector belongs to and variance defines the variance of that time series. This allows for different variances for the different time series.

Example for a state space model with two target series

The result of fitting a state space model to a time series is the three parameters; the state vector X_t, the transition matrix F and the Measurement matrix G. For the purposes of prediction, the errors are ignored. Using the equations defining s state space model, the prediction of the observable vector Y is:
Y_t = G (FX_t)
Continuing the method, the prediction for the next time step would be:
Y_t+1 = G (F X_t+1)
hence
Y_t+1 =G (F (F X_t))

Consider two stationary time series, Y₁ and Y₂, fit with a state space model resulting in the parameters:
G =

1	0	0	1
0	0	1	0

F =

0.052	1	0	0
0.019	0	1	0
0.373	0	0	0
1.0	0	0	1

X = [-0.53, -2.41, 0.78, 88.74]^T
Next suppose the time series depends on 1 regressor variable z, given by
Y₁(z) = z_t + z_t-1
Y₂(z) = z_t
In other words, the final prediction for the time series at time t+1 is:

Y_t+1 = G (FX_t) + Z
where
Z₁ = z_t+1 + z_t
Z₂ = z_t+1

The PMML representation is given as:

<PMML xmlns="https://www.dmg.org/PMML-4_4" version="4.4.1">
  <Header copyright="DMG.org"/>
  <DataDictionary>
    <DataField name="Y1" optype="continuous" dataType="double"/>
    <DataField name="Y2" optype="continuous" dataType="double"/>
    <DataField name="date" optype="continuous" dataType="dateDaysSince[1970]" displayName="TS-VALUE"/>
    <DataField name="z" optype="continuous" dataType="double" displayName="ExternalRegressor"/>
  </DataDictionary>
  <TimeSeriesModel functionName="timeSeries" bestFit="StateSpaceModel">
    <MiningSchema>
      <MiningField name="date" usageType="order"/>
      <MiningField name="z" usageType="active"/>
      <MiningField name="Y1" usageType="target"/>
      <MiningField name="Y2" usageType="target"/>
    </MiningSchema>
    <Output>
      <OutputField name="orders" feature="predictedValue" targetField="Y1" dataType="double"/>
      <OutputField name="profit" feature="predictedValue" targetField="Y2" dataType="double"/>
    </Output>
    <TimeSeries usage="logical" startTime="1" endTime="6" field="Y1">
      <TimeAnchor type="dateDaysSince[1970]" offset="6940" stepsize="1" displayName="Month">
        <TimeCycle length="12" type="includeFromTo" displayName="Year">
          <Array type="int" n="2">0 11</Array>
        </TimeCycle>
      </TimeAnchor>
      <TimeValue index="1" value="11357.92"/>
      <TimeValue index="2" value="10605.95"/>
      <TimeValue index="3" value="16998.57"/>
      <TimeValue index="4" value="6563.75"/>
      <TimeValue index="5" value="6607.69"/>
      <TimeValue index="6" value="9839.0"/>
    </TimeSeries>
    <TimeSeries usage="logical" startTime="1" endTime="6" field="Y2">
      <TimeAnchor type="dateDaysSince[1970]" offset="6940" stepsize="1" displayName="Month">
        <TimeCycle length="12" type="includeFromTo" displayName="Year">
          <Array type="int" n="2">0 11</Array>
        </TimeCycle>
      </TimeAnchor>
      <TimeValue index="1" value="11357.92"/>
      <TimeValue index="2" value="10605.95"/>
      <TimeValue index="3" value="16998.57"/>
      <TimeValue index="4" value="6563.75"/>
      <TimeValue index="5" value="6607.69"/>
      <TimeValue index="6" value="9839.0"/>
    </TimeSeries>
      <StateSpaceModel>
        <StateVector>
          <Array type="real">-0.52 -2.41 0.78 88.74</Array>
        </StateVector>
        <TransitionMatrix>
      <Matrix nbRows="4" nbCols="4">
        <Array type="real">0.052 1 0 0</Array>
            <Array type="real">0.119 0 1 0</Array>
            <Array type="real">0.373 0 0 0</Array>
            <Array type="real">1.000 0 0 1</Array>
      </Matrix>
        </TransitionMatrix>
        <MeasurementMatrix>
          <Matrix nbRows="1" nbCols="4">
            <Array type="real">1 0 0 1</Array>
            <Array type="real">0 0 1 0</Array>
          </Matrix>
        </MeasurementMatrix>
        <PsiVector targetField="Y1" variance="10">
          <Array type="real">-0.402 0.098 0.330</Array>
        </PsiVector>
        <DynamicRegressor field="z" targetField="Y1" delay="0" futureValuesMethod="constant">
          <Numerator>
            <NonseasonalFactor>
              <Array type="real">1 -1</Array>
            </NonseasonalFactor>
          </Numerator>
          <Denominator>
            <NonseasonalFactor>
              <Array type="real">1</Array>
            </NonseasonalFactor>
          </Denominator>
          <RegressorValues>
            <TimeSeries>
              <TimeValue index="6" value="9.1"/>
            </TimeSeries>
          </RegressorValues>
        </DynamicRegressor>
        <DynamicRegressor field="z" targetField="Y2" delay="0" futureValuesMethod="constant">
          <Numerator>
            <NonseasonalFactor>
              <Array type="real">1</Array>
            </NonseasonalFactor>
          </Numerator>
          <Denominator>
            <NonseasonalFactor>
              <Array type="real">1</Array>
            </NonseasonalFactor>
          </Denominator>
          <RegressorValues>
            <TimeSeries>
              <TimeValue index="6" value="9.1"/>
            </TimeSeries>
          </RegressorValues>
        </DynamicRegressor>
      </StateSpaceModel>
  </TimeSeriesModel>
</PMML>

  Y_1t = 85.77
  Y_2t = -0.19
The coefficients of the regressors are stored in the Numerator and Denominator element. Substituting those, we get the expected equation for the regressor prediction:
  Y₁(z) = z₇ + z₆
  Y₂(z) = z₇
The regressor value at t=6 is stored in the RegressorValue element
   z₆ = 9.1
The regressor value at t=7 is unknown, but the futureValuesMethod="constant" attribute in the DynamicRegressor element indicates that we should take it to be same as that at t=6:
   z₇ = 9.1
We therefore get the final prediction:
   orders = 85.77 + 1*9.1 + 1*9.1 = 103.97
   profit = -0.19 + 1*9.1 = 8.91

The variance of the series Y₁at time index 6 is given as 10 and the first element of the PsiVector is given; the predicted variance of the series prediction at time index 7 is therefore:
  10*(1 + (-0.402)²) = 11.6
Since the variance of the series Y₂ is not given, we cannot calculate the variance of the predicted value of Y₂

ARIMA, SARIMAX and VARMAX Models in State Space form

VARMAX are Vector AutoRegressive Moving-Average models with optional eXogenous regressors. SARIMAX are Seasonal ARIMA models with optional eXogenous regressors.

To express ARIMA, SARIMAX and VARMAX models in state space form, we add intercept terms to the state and observation equations. For SARIMAX and multivariate VARMAX models, μ_o is 0. For ARIMA models, μ_s is 0.
X_t = FX_t-1 + μ_s
Y_t = GX_t + μ_o

These intercepts are stored in the InterceptVector element, and the attribute type (state or observation) distinguishes between the intercepts.

To calculate prediction intervals exactly, we use the following:
P_t = FP_t-1F^T + RQR^T
J = H + GP_tG^T
σ̂_h² = σ̂² * [ {J_t¹, ...}, .... , {..., J_t^r}]
CPI: Y_t ± t_df,α/2*σ̂_h

Here, P is the predicted state covariance matrix which is updated using the transition matrix F and the selected state covariance matrix RQR^T. The conditional mean square error matrix J is calculated using the observation variance matrix H, measurement matrix G, and P. The variance of the point forecast h steps in the future is calculated using the estimate of the variance of the residuals σ̂² and J. For a multivariate model, J is a matrix, and its diagonal elements are used in this step. For a univariate model, J is of size 1x1. The confidence of prediction intervals are then calculated using the multiplier t_df,α/2 and the variance from the previous step. The multiplier t_df,α/2 is computed in the same way as in the Confidence of Prediction Calculation section above.

Note: Notations like [{a₁₁, a₁₂}, {a₂₁, a₂₂}] represent matrix. Here the number of rows and columns are 2, the element at row 1 and column 1 is a₁₁, row 1 and column 2 is a₁₂, row 2 and column 1 is a₂₁ and row 2 and column 2 is a₂₂.

Example for univariate ARIMA model with intercept

<PMML xmlns="https://www.dmg.org/PMML-4_4" version="4.4">
 <Header description="ARIMA Time Series Model"/>
 <DataDictionary numberOfFields="2">
  <DataField name="ts_value" optype="continuous" dataType="double"/>
  <DataField name="h" optype="continuous" dataType="double"/>
 </DataDictionary>
 <TimeSeriesModel modelName="ARIMA_model" functionName="timeSeries" bestFit="StateSpaceModel">
  <MiningSchema>
   <MiningField name="ts_value" usageType="predicted" invalidValueTreatment="returnInvalid"/>
   <MiningField name="h" usageType="supplementary" invalidValueTreatment="returnInvalid"/>
  </MiningSchema>
  <Output>
   <OutputField name="Predicted_ts_value" optype="continuous" dataType="string" feature="predictedValue"/>
   <OutputField name="cpi_80_lower" optype="continuous" dataType="string" feature="confidenceIntervalLower" value="80"/>
   <OutputField name="cpi_80_upper" optype="continuous" dataType="string" feature="confidenceIntervalUpper" value="80"/>
  </Output>
  <TimeSeries usage="original" startTime="1" endTime="100">
   <TimeValue index="1" value="88"/>
   <TimeValue index="2" value="84"/>
   <TimeValue index="99" value="222"/>
   <TimeValue index="100" value="220"/>
  </TimeSeries>
  <StateSpaceModel variance="9.97357634862506">
   <StateVector>
    <Array type="real" n="2">74.9567433980672 -52.3002235159934</Array>
   </StateVector>
   <TransitionMatrix>
    <Matrix nbRows="2" nbCols="2">
     <Array type="real">1.66128548696599 1</Array>
     <Array type="real">-0.679287914563663 0</Array>
    </Matrix>
   </TransitionMatrix>
   <MeasurementMatrix>
    <Matrix nbRows="1" nbCols="2">
     <Array type="real">1 0</Array>
    </Matrix>
   </MeasurementMatrix>
   <InterceptVector type="observation">
    <Array type="real" n="1">143.007281014872</Array>
   </InterceptVector>
   <PredictedStateCovarianceMatrix>
    <Matrix nbRows="2" nbCols="2">
     <Array type="real">0 0</Array>
     <Array type="real">0 0</Array>
    </Matrix>
   </PredictedStateCovarianceMatrix>
   <SelectedStateCovarianceMatrix>
    <Matrix nbRows="2" nbCols="2">
     <Array type="real">1 0.508921473603515</Array>
     <Array type="real">0.508921473603515 0.259001066294773</Array>
    </Matrix>
   </SelectedStateCovarianceMatrix>
   <ObservationVarianceMatrix>
    <Matrix nbRows="1" nbCols="1">
     <Array type="real">0</Array>
    </Matrix>
   </ObservationVarianceMatrix>
  </StateSpaceModel>
 </TimeSeriesModel>
</PMML>

This example is an ARIMA(2,0,1) model expressed in state space form.

For h=1 (one step ahead), the point forecast is calculated as follows. Since interceptVector contains type = "observation", the intercept μ_o is used when calculating Y.

X_t+1 = FX_t
X_t+1 = [{1.66, 1},{−0.68, 0}] * [{77.00},{−52.95}] = [{74.96},{−52.30}]
Y_t+1 = GX_t+1 + μ_o
Y_t+1 = [{1, 0}] * [{74.96},{−52.30}] + 143.01 = 217.96

And the 80% prediction intervals for h=1 are calculated as follows:
P_t+1 = FP_tF^T + RQR^T
P_t+1 = [{1.66, 1},{−0.68, 0}] * [{0, 0},{0, 0}] * [{1.66, −0.68},{1, 0}] + [{1, 0.51},{0.51, 0.26}] = [{1, 0.51},{0.51, 0.26}]
J = H + GP_t+1G^T
J = 0 + [{1, 0}] * [{1, 0.51},{0.51, 0.26}] * [{1},{0}] = 1
σ̂₁² = σ̂²*J
σ̂₁² = 9.97 = 3.16
CPI: Y_t+1 ± cσ̂₁
CPI: 217.96 ± 1.28 * 3.16 = [213.92, 222.01]

To calculate values for h=2, the state vector X is updated again:
X_t+2 = FX_t+1
X_t+2 = [{1.66, 1},{−0.68, 0}] * [{74.96},{−52.30}}] = [{72.22},{−50.92}]
Y_t+2 = GX_t+2 + μ_o
Y_t+2 = [{1, 0}] * [{72.22}, {−50.92}] + 143.01 = 215.23

Similarly, to get prediction intervals, predicted state covariance matrix P is updated again:
P_t+2 = FP_t+1F^T + RQR^T
P_t+2 = [{1.66, 1},{−0.68, 0}] * [{1, 0.51},{0.51, 0.26}] * [{1.66, −0.68},{1, 0}] + [{1, 0.51},{0.51, 0.26}] = [{5.71, −0.97}, {−0.97, 0.72}]
J = H+GP_t+2G^T
J = 0 + [{1, 0}] * [{5.71, −0.97}, {−0.97, 0.72}] * [{1}, {0}] = 5.71
σ̂₂² = σ̂²*J
σ̂₂² = 9.97*5.71 = 7.55
CPI: Y_t+2 ± cσ̂₂
CPI: 215.23 ± 1.28*7.55=[205.56, 224.90]

Other types of models.

The following elements are not used in this version of PMML and only serve as placeholders for future versions.
SpectralAnalysis describes the Fourier spectrum of a time series.
SeasonalTrendDecomposition contains one or more fit functions which represent the trend component of the time series and optionally contain information on seasonal oscillations which are modeled on top of the trend component.

<xs:element name="SpectralAnalysis">
  <xs:complexType>
    <xs:sequence>
      <xs:element ref="Extension" minOccurs="0" maxOccurs="unbounded"/>
    </xs:sequence>
  </xs:complexType>
</xs:element>

<xs:element name="SeasonalTrendDecomposition">
  <xs:complexType>
    <xs:sequence>
      <xs:element ref="Extension" minOccurs="0" maxOccurs="unbounded"/>
    </xs:sequence>
  </xs:complexType>
</xs:element>

e-mail

info at dmg.org