2.4 Frequent itemsets

Assume that four customers (f, g, h, i) each purchased some of the items a, b, c, d in a supermarket.

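Let’s find the itemsets that are common to three or more customers in the purchase data. The procedure is simple: first enumerate all subsets of the itemset each customer purchased, then add them together. Each factor such as (a+1) stands for the choice between including item a and leaving it out, so the product for one customer expands into every subset of that customer's purchases. The method is shown below.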

> f=(a+1)*(b+1)*(c+1)*(d+1)
> g=(b+1)*(d+1)
> h=(a+1)*(c+1)*(d+1)
> i=(a+1)*(b+1)*(d+1)
> all=f+g+h+i
> all.show
a b c d + a b c + 2 a b d + 2 a b + 2 a c d + 2 a c + 3 a d + 3 a + b c d + b c +
 3 b d + 3 b + 2 c d + 2 c + 4 d + 4  

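In the above result, the weight of each term is the number of customers whose purchases contain that itemset. For example, two customers purchased products a and b together, and only one customer purchased products a, b, and c together. The constant term 4 corresponds to the empty itemset, which is a subset of every customer's purchases. Now, use the termsGE function to select the terms whose weight is greater than or equal to 3.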

> all.termsGE(3).show
 3 a d + 3 a + 3 b d + 3 b + 4 d + 4
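
For readers who want to check the numbers by hand, the enumeration and the termsGE(3) selection above can be mimicked in plain Ruby without the ZDD package. This is only an illustrative sketch: the names `baskets`, `subset_counts`, and `frequent` are hypothetical and not part of the package, and a plain Hash has none of the compression that a ZDD provides.

```ruby
# Plain-Ruby sketch (no ZDD package). All names here are hypothetical.
# Purchases of the four customers, as in the session above.
baskets = {
  "f" => %w[a b c d],
  "g" => %w[b d],
  "h" => %w[a c d],
  "i" => %w[a b d]
}

# Enumerate every subset (including the empty set) of each basket and
# count how many customers each subset occurs for.
def subset_counts(baskets)
  counts = Hash.new(0)
  baskets.each_value do |items|
    (0..items.size).each do |k|
      items.sort.combination(k) { |subset| counts[subset] += 1 }
    end
  end
  counts
end

# Keep only the itemsets whose count reaches the threshold,
# analogous to selecting terms with weight >= min_support.
def frequent(counts, min_support)
  counts.select { |_itemset, n| n >= min_support }
end

counts = subset_counts(baskets)
result = frequent(counts, 3)

# Print each surviving itemset with its count; the empty itemset
# plays the role of the constant term.
result.each do |itemset, n|
  label = itemset.empty? ? "(empty)" : itemset.join(" ")
  puts "#{n} #{label}"
end
```

The six itemsets that survive the filter are the same six terms printed by termsGE(3), with the empty itemset standing in for the constant term.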

One may also use the restrict function to select the itemsets that include the itemset "a b" (its supersets), or the permit function to select the itemsets that are contained in the itemset "a b" (its subsets).

> all.restrict("a b").show
a b c d + a b c + 2 a b d + 2 a b 
> all.permit("a b").show
2 a b + 3 a + 3 b + 4  
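
The difference between the two selections can be illustrated with a plain-Ruby sketch that models only the single-pattern case shown here: one filter keeps supersets of the pattern, the other keeps subsets. The helper names `supersets_of` and `subsets_of` are hypothetical, the weights are transcribed from the `all.show` output above, and nothing here is meant to describe how the package implements restrict or permit.

```ruby
# Plain-Ruby sketch (no ZDD package). Helper names are hypothetical.
# Itemset => weight, transcribed from the all.show output above.
counts = {
  %w[a b c d] => 1, %w[a b c] => 1, %w[a b d] => 2, %w[a b] => 2,
  %w[a c d]   => 2, %w[a c]   => 2, %w[a d]   => 3, %w[a]   => 3,
  %w[b c d]   => 1, %w[b c]   => 1, %w[b d]   => 3, %w[b]   => 3,
  %w[c d]     => 2, %w[c]     => 2, %w[d]     => 4, []      => 4
}

# Keep itemsets that contain every item of the pattern
# (the selection that restrict("a b") performs in the example).
def supersets_of(counts, pattern)
  counts.select { |itemset, _n| (pattern - itemset).empty? }
end

# Keep itemsets made up only of items from the pattern
# (the selection that permit("a b") performs in the example).
def subsets_of(counts, pattern)
  counts.select { |itemset, _n| (itemset - pattern).empty? }
end

restricted = supersets_of(counts, %w[a b])
permitted  = subsets_of(counts, %w[a b])

puts restricted.map { |s, n| "#{n} #{s.join(' ')}" }.join(" + ")
puts permitted.map { |s, n| "#{n} #{s.empty? ? '(empty)' : s.join(' ')}" }.join(" + ")
```

Array difference is used for the containment tests because each itemset is a short array of distinct items; a Set would work equally well.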

Besides the method described above, the ZDD package contains several other methods that store the enumeration results of frequent itemsets as a ZDD object. More details can be found in the descriptions of the freqpatA function and the lcm function.