PMML Sample Models (ARCHIVE):
Please note: Not all models on this archive page are conformant to the PMML standard, so use them with care.
The PMML files provided below are examples of predictive models
built with the PMML standard. These samples are not intended for
performance or vendor comparisons; they are provided solely to help users gain
a better understanding of PMML. No representation is made as to the
accuracy or applicability of these models. Also included are the
datasets used to train and validate these predictive models.
For a full list of our most current examples, please visit our current examples page.
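Because not every archived file is conformant, it can help to inspect a file before relying on it. The following is a minimal sketch, not an official validator: it reads the `version` attribute of the root element and lists the top-level model elements and data field names. The embedded PMML fragment, the function names, and the set of non-model element names are assumptions made for illustration; the fragment is hand-written and is not one of the archive files.

```python
import xml.etree.ElementTree as ET

# Hypothetical, hand-written PMML fragment used only for illustration.
SAMPLE_PMML = """<?xml version="1.0"?>
<PMML version="4.0" xmlns="http://www.dmg.org/PMML-4_0">
  <Header description="illustrative example"/>
  <DataDictionary numberOfFields="1">
    <DataField name="sepal_length" optype="continuous" dataType="double"/>
  </DataDictionary>
  <RegressionModel functionName="regression">
    <MiningSchema>
      <MiningField name="sepal_length"/>
    </MiningSchema>
  </RegressionModel>
</PMML>
"""

# Top-level elements treated as "not a model" in this sketch (an assumption).
NON_MODEL = {"Header", "MiningBuildTask", "DataDictionary",
             "TransformationDictionary", "Extension"}


def local_name(tag):
    """Drop an XML namespace prefix such as '{http://...}PMML'."""
    return tag.rsplit("}", 1)[-1]


def summarize_pmml(text):
    """Return the declared version, top-level model elements, and field names."""
    root = ET.fromstring(text)
    if local_name(root.tag) != "PMML":
        raise ValueError("root element is not PMML")
    models = [local_name(child.tag) for child in root
              if local_name(child.tag) not in NON_MODEL]
    fields = [elem.get("name") for elem in root.iter()
              if local_name(elem.tag) == "DataField"]
    return {"version": root.get("version"), "models": models, "fields": fields}


summary = summarize_pmml(SAMPLE_PMML)
print(summary)
```

A check like this only confirms the basic document shape. It does not validate a file against the PMML schema for its version.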
PMML Version | Model Type | Vendor | Application | Dataset | PMML File
2.0 | Association | Oracle | Oracle 9i Data Mining, 9.2.0 | Iris | View
2.0 | Center-based Clustering | IBM | DB2 Intelligent Miner for Data V8.1 | Iris | View
2.0 | Distribution-based Clustering | IBM | DB2 Intelligent Miner for Data V8.1 | Iris | View
2.0 | Naïve Bayes | Oracle | Oracle 9i Data Mining, 9.2.0 | Iris | View
2.0 | Neural Network (Classification) | IBM | DB2 Intelligent Miner for Data V8.1 | Iris | View
2.0 | Neural Network (Regression) | IBM | DB2 Intelligent Miner for Data V8.1 | Iris | View
2.0 | Regression | IBM | DB2 Intelligent Miner for Data V8.1 | Iris | View
2.0 | Tree | IBM | DB2 Intelligent Miner for Data V8.1 | Iris | View
2.1 | Association | IBM | DB2 Intelligent Miner Modeling V8.2 | Voting | View
2.1 | Clustering | IBM | DB2 Intelligent Miner Modeling V8.2 | Robustness | View
2.1 | Tree | IBM | DB2 Intelligent Miner Modeling V8.2 | Robustness | View
3.0 | Association | IBM | DB2 Data Warehouse Edition V9.1 | Shopping | View
3.0 | Association | SPSS | Clementine, 10.0 | Shopping | View
3.0 | Distribution-based Clustering | IBM | DB2 Data Warehouse Edition V9.1 | Elnino | View
3.0 | Center-based Clustering | IBM | DB2 Data Warehouse Edition V9.1 | Elnino | View
3.0 | Clustering | SPSS | Clementine, 10.0 | Iris | View
3.0 | Model Composition | IBM | DB2 Data Warehouse Edition V9.1 | Elnino | View
3.0 | Neural Network | SPSS | Clementine, 10.0 | Iris | View
3.0 | Neural Network | SPSS | Clementine, 10.0 | Heart | View
3.0 | Neural Network | SPSS | Clementine, 10.0 | Iris | View
3.0 | Neural Network | SPSS | Clementine, 10.0 | Heart | View
3.0 | General Regression | SPSS | Clementine, 10.0 | Iris | View
3.0 | Regression | IBM | DB2 Data Warehouse Edition V9.1 | Elnino | View
3.0 | Regression | IBM | DB2 Data Warehouse Edition V9.1 | Elnino | View
3.0 | Regression | SPSS | Clementine, 10.0 | Elnino | View
3.0 | Regression | SPSS | Clementine, 10.0 | Elnino | View
3.0 | Regression | SPSS | Clementine, 10.0 | Heart | View
3.0 | Ruleset | SPSS | Clementine, 10.0 | Heart | View
3.0 | Sequence | SPSS | Clementine, 10.0 | Visits | View
3.0 | Tree | IBM | DB2 Data Warehouse Edition V9.1 | Heart | View
3.0 | Tree | SPSS | Clementine, 10.0 | Iris | View
3.0 | Tree | SPSS | Clementine, 10.0 | Heart | View
3.1 | Sequence | IBM | DB2 Data Warehouse Edition V9.1 | Visits | View
3.1 | Association | SAS | SAS 9.2 | Unknown | View
3.1 | Ann | SAS | SAS 9.2 | Iris | View
3.1 | Cluster | SAS | SAS 9.2 | Iris | View
3.1 | Logistic Reg. | SAS | SAS 9.2 | Iris | View
3.1 | Tree | SAS | SAS 9.2 | Iris | View
4.0 | Cluster | KNIME | KNIME 2.4 | Iris | View
4.0 | Regression | KNIME | KNIME 2.4 | Iris | View
4.0 | Neural Network | KNIME | KNIME 2.4 | Iris | View
4.0 | Support Vector Machine | KNIME | KNIME 2.4 | Iris | View
4.0 | Regression | KNIME | KNIME 2.4 | Elnino | View
4.0 | Regression | KNIME | KNIME 2.4 | Elnino | View
The Data Mining Group is always looking to increase the
variety of these samples. If you would like to submit samples,
please see the instructions on our current examples page.
Datasets for PMML Sample Models
These datasets are used in conjunction with the sample
PMML models. A high-level description is provided here; more details
can be found in the ReadMe text file associated with each dataset. If you
publish material based on these datasets, please note the source in your
acknowledgements.
Dataset | Description | Source | Comma-Delimited File
Elnino | Contains oceanographic and surface meteorological readings taken from a series of buoys positioned throughout the equatorial Pacific. The "small" dataset is provided here; larger datasets are available via the UCI KDD Archive. The data is expected to aid in the understanding and prediction of El Nino/Southern Oscillation (ENSO) cycles (from National Oceanic and Atmospheric Administration, donated by Dr. Di Cook of Iowa State University). | UCI KDD Archive | View
Heart | Data provided by the Cleveland Clinic Foundation on the diagnosis of heart disease. The data file consists of 13 potential predictors and a target field (num) that is assigned >50 for patients diagnosed with more than 50% diameter narrowing of the arteries and <50 otherwise. In the original file, categorical values were represented by numeric codes; these have been replaced with representative strings for ease of use. | UCI Machine Learning Repository | View
Iris | Perhaps the best-known database in the pattern recognition literature; R. A. Fisher's 1936 paper is a classic in the field and is referenced frequently to this day. The data set contains 3 classes of 50 instances each, where each class refers to a type of iris plant. One class is linearly separable from the other 2; the latter are NOT linearly separable from each other (from Fisher, R.A. "The use of multiple measurements in taxonomic problems," Annals of Eugenics, 7, Part II, 179-188, 1936). | UCI Machine Learning Repository | View
Robustness | This dataset is aimed at finding flaws in PMML export implementations. In terms of data mining, the data makes no sense at all, since the values are randomly distributed and in no way meant to be correlated. If you obtain a meaningful model, you most probably did something wrong. | IBM | View Apply Data, View Train Data
Shopping | Contains data for SPSS SHOPPING_ASSOC.xml | SPSS | View
Visits | Describes the page visits of users who visited msnbc.com on September 28, 1999. Visits are recorded at the level of URL category and are recorded in time order (from David Heckerman of Microsoft Corporation). Visits_Small.csv contains about 65,000 visits; Visits_Large.csv contains over 880,000 visits. | UCI KDD Archive | View 65KB, View 880KB
Voting | Includes the votes of each U.S. House of Representatives Congressperson on 16 key votes (from Congressional Quarterly Almanac, 98th Congress, 2nd session 1984, Volume XL: Congressional Quarterly Inc. Washington, D.C., 1985. Donated by Jeff Schlimmer at Carnegie-Mellon University). | UCI Machine Learning Repository | View
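As a quick sanity check after downloading one of the comma-delimited files, the class distribution can be tallied and compared with the description above (for Iris, 3 classes of 50 instances each). The sketch below uses a small inline excerpt so that it is self-contained. The column names, the `class` target column, and the sample rows are assumptions for illustration; consult the dataset's ReadMe file for the actual layout.

```python
import csv
import io
from collections import Counter

# Hypothetical excerpt; the real file's column names and values may differ.
SAMPLE_CSV = """sepal_length,sepal_width,petal_length,petal_width,class
5.1,3.5,1.4,0.2,Iris-setosa
4.9,3.0,1.4,0.2,Iris-setosa
7.0,3.2,4.7,1.4,Iris-versicolor
6.3,3.3,6.0,2.5,Iris-virginica
"""


def class_counts(handle, target="class"):
    """Count how many rows fall into each value of the target column."""
    reader = csv.DictReader(handle)
    return Counter(row[target] for row in reader)


counts = class_counts(io.StringIO(SAMPLE_CSV))
for label, n in sorted(counts.items()):
    print(label, n)
```

To run the same check on a downloaded file, pass an open file handle instead of the `io.StringIO` wrapper, for example `open(path, newline="")`, where `path` is wherever the file was saved.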
Additional PMML Examples
These models are additional examples of PMML that are not based on
the datasets listed above. Datasets marked * can be found by searching the
UCI Machine Learning Repository; datasets marked N/A are not available.
These models are included here to provide a wider range
of PMML examples for inspection and understanding.
PMML Version | Model Type | Vendor | Application | Dataset | PMML File
3.0 | Regression | Salford Systems | MARS | N/A | View
2.0 | Tree | Weka | Weka 3-3-5 | Anneal* | View
2.0 | Tree | Weka | Weka 3-3-5 | Audiology* | View
2.0 | Tree | Weka | Weka 3-3-5 | Autos* | View
2.0 | Tree | Weka | Weka 3-3-5 | Balance Scale* | View
2.0 | Tree | Weka | Weka 3-3-5 | Breast Cancer* | View
2.0 | Tree | Weka | Weka 3-3-5 | Wisconsin Breast Cancer* | View
Acknowledgements:
The Data Mining Group thanks the UCI Repository of Machine
Learning Databases for being a valuable resource:
Blake, C.L. & Merz, C.J. (1998). UCI Repository of
machine learning databases [http://www.ics.uci.edu/~mlearn/MLRepository.html].
Irvine, CA: University of California, Department of Information and Computer
Science.