PMML Associative Rules

PMML 1.1 -- DTD of Association Rules Model

The Association Rule model represents rules in which one set of items is associated with another set of items. For example, a rule can express that a certain product is often bought in combination with a certain set of other products.

The attribute definitions of the association rule model use the entity ELEMENT-ID to express a semantic constraint: the value must be unique within the set of elements of the same type contained in the same XML document. Because the entity expands to CDATA, an XML parser does not enforce this constraint itself.

			
	<!ENTITY % ELEMENT-ID "CDATA">

An Association Rule model consists of four major parts:

					
	<!ELEMENT AssociationModel (Extension*, AssocInputStats,  AssocItem+, 
	     AssocItemset+, AssocRule+)>

	<!ATTLIST AssociationModel
	     modelName     CDATA     #IMPLIED
	>

Basic information of the input data:

		
		<!ELEMENT AssocInputStats EMPTY>
		
		
		<!ATTLIST AssocInputStats
		     numberOfTransactions     %INT-NUMBER;     #REQUIRED
		     maxNumberOfItemsPerTA    %INT-NUMBER;     #IMPLIED
		     avgNumberOfItemsPerTA    %REAL-NUMBER;    #IMPLIED
		     minimumSupport           %PROB-NUMBER;    #REQUIRED
		     minimumConfidence        %PROB-NUMBER;    #REQUIRED
		     lengthLimit              %INT-NUMBER;     #IMPLIED
		     numberOfItems            %INT-NUMBER;     #REQUIRED
		     numberOfItemsets         %INT-NUMBER;     #REQUIRED
		     numberOfRules            %INT-NUMBER;     #REQUIRED
		>

Attribute description:

    numberOfTransactions : The number of transactions (baskets of items) contained in the input data.

    maxNumberOfItemsPerTA : The number of items contained in the largest transaction.

    avgNumberOfItemsPerTA : The average number of items contained in a transaction.

    minimumSupport : The minimum relative support value (#supporting transactions / #total transactions) satisfied by all rules.

    minimumConfidence : The minimum confidence value satisfied by all rules. Confidence is calculated as (support (rule) / support(antecedent)).

    lengthLimit : The maximum number of items allowed in a rule. This limit was used to restrict the number of rules.

    numberOfItems : The number of different items contained in the input data.

    numberOfItemsets : The number of itemsets contained in the model.

    numberOfRules : The number of rules contained in the model.
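
The support and confidence formulas given above can be illustrated with a short Python sketch. This is only an illustration under stated assumptions: the function names `relative_support` and `confidence` are hypothetical and not part of PMML, transactions are modeled as Python sets, and the sample data are the four transactions used in the example later in this document.

```python
def relative_support(itemset, transactions):
    """#supporting transactions / #total transactions for the given itemset."""
    wanted = frozenset(itemset)
    supporting = sum(1 for t in transactions if wanted <= t)
    return supporting / len(transactions)


def confidence(antecedent, consequent, transactions):
    """support(rule) / support(antecedent); the rule covers both itemsets."""
    rule_items = frozenset(antecedent) | frozenset(consequent)
    # Assumes the antecedent occurs in at least one transaction.
    return (relative_support(rule_items, transactions)
            / relative_support(antecedent, transactions))


# Assumed encoding of the four example transactions as sets of item values.
transactions = [
    {"Cracker", "Coke", "Water"},
    {"Cracker", "Water"},
    {"Cracker", "Water"},
    {"Cracker", "Coke", "Water"},
]

coke_support = relative_support({"Coke"}, transactions)
cracker_water_conf = confidence({"Cracker"}, {"Water"}, transactions)
```

With minimumSupport set to 0.6, as in the example model, an itemset containing Coke would not qualify because it is supported by only two of the four transactions.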


Items contained in itemsets


	<!ELEMENT AssocItem EMPTY>
		
	<!ATTLIST AssocItem
	     id           %ELEMENT-ID;       #REQUIRED
	     value        CDATA              #REQUIRED
	     mappedValue  CDATA              #IMPLIED
	     weight       %REAL-NUMBER;      #IMPLIED
	>

Attribute description:

    id : An identifier that uniquely identifies an item.

    value : The value of the item as in the input data.

    mappedValue : Optional, a value to which the original item value is mapped. For instance, this could be a product name if the original value is an EAN code.

    weight : The weight of the item. For example, the price or value of an item.


Itemsets which are contained in rules

					
	<!ELEMENT AssocItemset (Extension*, AssocItemRef+)>
	
	<!ATTLIST AssocItemset
	     id              %ELEMENT-ID;        #REQUIRED
	     support         %PROB-NUMBER;       #REQUIRED
	     numberOfItems   %INT-NUMBER;        #REQUIRED
	>
	

Attribute description:

    id : An identifier that uniquely identifies an itemset.

    support : The relative support of the itemset.

    numberOfItems : The number of items contained in this itemset.

    Subelements : Item references that point to AssocItem elements.

					
	<!ELEMENT AssocItemRef EMPTY>
					
	<!ATTLIST AssocItemRef
	     itemRef           %ELEMENT-ID;     #REQUIRED
	>
	

Attribute description:

    itemRef : The id value of an item element.


Rules: Elements of the form <antecedent itemset> => <consequent itemset>

					
	<!ELEMENT AssocRule (Extension*)>

	<!ATTLIST AssocRule
	     support           %PROB-NUMBER;      #REQUIRED
	     confidence        %PROB-NUMBER;      #REQUIRED
	     antecedent        %ELEMENT-ID;       #REQUIRED
	     consequent        %ELEMENT-ID;       #REQUIRED
	>
	

Attribute description:

    support : The relative support of the rule.

    confidence : The confidence of the rule.

    antecedent : The id value of the itemset which is the antecedent of the rule.

    consequent : The id value of the itemset which is the consequent of the rule.
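
Since ELEMENT-ID is declared as CDATA, neither the uniqueness of id values nor the validity of itemRef, antecedent and consequent references is checked by a DTD-validating parser. A consumer may want to check them itself. The following is a minimal sketch of such a check using Python's standard `xml.etree.ElementTree`; the function name `check_references` and the message texts are hypothetical, and the sample fragment is deliberately reduced (it omits AssocInputStats, so it is not valid against the DTD) and deliberately contains one broken reference.

```python
import xml.etree.ElementTree as ET
from collections import Counter


def check_references(model):
    """Return a list of problems found in an AssociationModel element."""
    problems = []
    item_ids = [e.get("id") for e in model.findall("AssocItem")]
    itemset_ids = [e.get("id") for e in model.findall("AssocItemset")]

    # ids must be unique among elements of the same type.
    for kind, ids in (("AssocItem", item_ids), ("AssocItemset", itemset_ids)):
        for value, count in Counter(ids).items():
            if count > 1:
                problems.append(f"duplicate {kind} id {value!r}")

    # Every itemRef must name an existing AssocItem id.
    for itemset in model.findall("AssocItemset"):
        for ref in itemset.findall("AssocItemRef"):
            if ref.get("itemRef") not in item_ids:
                problems.append(
                    f"AssocItemRef itemRef {ref.get('itemRef')!r} "
                    "does not match any AssocItem id")

    # antecedent and consequent must name existing AssocItemset ids.
    for rule in model.findall("AssocRule"):
        for attr in ("antecedent", "consequent"):
            if rule.get(attr) not in itemset_ids:
                problems.append(
                    f"AssocRule {attr} {rule.get(attr)!r} "
                    "does not match any AssocItemset id")
    return problems


# Reduced illustrative fragment; consequent="9" points at no itemset.
sample = ET.fromstring(
    '<AssociationModel>'
    '<AssocItem id="1" value="Cracker"/>'
    '<AssocItemset id="1" support="1.0" numberOfItems="1">'
    '<AssocItemRef itemRef="1"/>'
    '</AssocItemset>'
    '<AssocRule support="1.0" confidence="1.0" antecedent="1" consequent="9"/>'
    '</AssociationModel>'
)
problems = check_references(sample)
```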


Example:

Let's assume we have four transactions with the following data:

    t1: Cracker, Coke, Water

    t2: Cracker, Water

    t3: Cracker, Water

    t4: Cracker, Coke, Water

			
	<?xml version="1.0" ?>
	<PMML version="1.1">

	<Header copyright="www.dmg.org" 
	     description="example model for association rules"/>

	<DataDictionary numberOfFields="1">
	<DataField name="item" optype="categorical"/>
	</DataDictionary>
	
	<AssociationModel>
	
	<AssocInputStats numberOfTransactions="4" 
	     numberOfItems="3" minimumSupport="0.6" 
	     minimumConfidence="0.5" numberOfItemsets="3" 
	     numberOfRules="2"/>
	
	<!-- We have three items in our input data -->
	
	<AssocItem id="1" value="Cracker"/>
	<AssocItem id="2" value="Coke"/>
	<AssocItem id="3" value="Water"/>
	
	<!-- and two frequent itemsets with a single item -->
	
	<AssocItemset id="1" support="1.0" 
	     numberOfItems="1">
	<AssocItemRef itemRef="1"/>
	</AssocItemset>
	
	<AssocItemset id="2" support="1.0"
	     numberOfItems="1">
	<AssocItemRef itemRef="3"/>
	</AssocItemset>
	
	<!-- and one frequent itemset with two items. -->
	
	<AssocItemset id="3" support="1.0" 
	     numberOfItems="2">
	<AssocItemRef itemRef="1"/>
	<AssocItemRef itemRef="3"/>
	</AssocItemset>

	<!-- Two rules satisfy the requirements -->
	
	<AssocRule support="1.0" confidence="1.0"
	     antecedent="1" consequent="2"/>
	<AssocRule support="1.0" confidence="1.0"
	     antecedent="2" consequent="1"/>

	</AssociationModel>
	</PMML>
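
To show how the id references in the example resolve, the following Python sketch reads a copy of the AssociationModel element and renders each rule in the form <antecedent itemset> => <consequent itemset>. It is an illustration only: the helper name `readable_rules` and the output format are assumptions, and the embedded XML is a trimmed copy of the model element without the surrounding PMML, Header and DataDictionary elements.

```python
import xml.etree.ElementTree as ET

# Trimmed copy of the AssociationModel element from the example.
MODEL_XML = """
<AssociationModel>
  <AssocInputStats numberOfTransactions="4" numberOfItems="3"
       minimumSupport="0.6" minimumConfidence="0.5"
       numberOfItemsets="3" numberOfRules="2"/>
  <AssocItem id="1" value="Cracker"/>
  <AssocItem id="2" value="Coke"/>
  <AssocItem id="3" value="Water"/>
  <AssocItemset id="1" support="1.0" numberOfItems="1">
    <AssocItemRef itemRef="1"/>
  </AssocItemset>
  <AssocItemset id="2" support="1.0" numberOfItems="1">
    <AssocItemRef itemRef="3"/>
  </AssocItemset>
  <AssocItemset id="3" support="1.0" numberOfItems="2">
    <AssocItemRef itemRef="1"/>
    <AssocItemRef itemRef="3"/>
  </AssocItemset>
  <AssocRule support="1.0" confidence="1.0" antecedent="1" consequent="2"/>
  <AssocRule support="1.0" confidence="1.0" antecedent="2" consequent="1"/>
</AssociationModel>
"""


def readable_rules(model):
    """Resolve itemset and item ids and return one string per rule."""
    # Map item id -> item value, then itemset id -> list of item values.
    items = {e.get("id"): e.get("value") for e in model.findall("AssocItem")}
    itemsets = {
        s.get("id"): [items[r.get("itemRef")] for r in s.findall("AssocItemRef")]
        for s in model.findall("AssocItemset")
    }
    rules = []
    for rule in model.findall("AssocRule"):
        lhs = ", ".join(itemsets[rule.get("antecedent")])
        rhs = ", ".join(itemsets[rule.get("consequent")])
        rules.append(
            f"{{{lhs}}} => {{{rhs}}} "
            f"(support={rule.get('support')}, confidence={rule.get('confidence')})"
        )
    return rules


rules = readable_rules(ET.fromstring(MODEL_XML))
for line in rules:
    print(line)
```

Note that antecedent and consequent refer to itemset ids, not item ids: consequent="2" denotes the itemset with id 2, which contains the item Water.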