3.31 lcm : LCM over ZDD

Format

$obj$.lcm($type,transaction,minsup[,order,ub]$) $\rightarrow $ $zdd$

  $type$ : string

  $transaction$ : string

  $minsup$ : integer

  $order$ : string

  $ub$ : integer

Description

Using the LCM over ZDD algorithm, enumerates the frequent patterns whose support is at least the specified minimum support $minsup$ in the transaction file specified at $transaction$, and returns them as the ZDD object $zdd$. Optionally, specify the ZDD item order file at $order$ and the upper limit of the itemset size at $ub$.

Three types of frequent patterns can be enumerated, selected by $type$: "F" (frequent itemsets), "M" (maximal itemsets), and "C" (closed itemsets).

Further, appending "Q" to the type, as in "FQ", returns the frequency of each enumerated frequent itemset as its weight. When "F" is used without "Q", the itemsets that meet the minimum support are returned without frequency information.
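
The three pattern types can be illustrated with a brute-force sketch in plain Ruby. It does not use the zdd library or the LCM algorithm; it only applies the usual textbook definitions of frequent, maximal, and closed itemsets to the five sample transactions used in this section. The method names (support, frequent_with_support, maximal, closed) are hypothetical helpers, and whether lcm reports the empty itemset for "M" and "C" is not stated here, so its handling in the sketch is an assumption.

```ruby
# Sample transactions from this section (one array per row of the file).
transactions = [[1, 2, 3, 6], [4, 5, 6], [1, 2, 4, 6], [2, 4, 6], [1, 2, 4, 5]]

# Support = number of transactions that contain every item of the itemset.
def support(itemset, transactions)
  transactions.count { |t| (itemset - t).empty? }
end

# Brute force: test every subset of the item universe (tiny data only).
# Returns a hash of itemset => support, which is what "FQ" conveys.
def frequent_with_support(transactions, minsup)
  items = transactions.flatten.uniq.sort
  result = {}
  (0..items.size).each do |k|
    items.combination(k) do |set|
      s = support(set, transactions)
      result[set] = s if s >= minsup
    end
  end
  result
end

# Maximal: frequent itemsets with no frequent proper superset.
def maximal(freq)
  freq.keys.reject { |a| freq.keys.any? { |b| b.size > a.size && (a - b).empty? } }
end

# Closed: frequent itemsets with no proper superset of equal support.
# (A superset with equal support is itself frequent, so checking freq suffices.)
def closed(freq)
  freq.keys.reject { |a| freq.any? { |b, s| b.size > a.size && (a - b).empty? && s == freq[a] } }
end

freq = frequent_with_support(transactions, 3)
p freq           # itemsets with supports; [] (the empty itemset) has support 5
p maximal(freq)  # [[1, 2], [2, 4], [2, 6], [4, 6]]
p closed(freq)   # [[], [2], [4], [6], [1, 2], [2, 4], [2, 6], [4, 6]]
```

The supports computed for "F" agree with the coefficients shown by p1.show in Example 1, where the constant term 5 corresponds to the empty itemset.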

In the transaction file, each row corresponds to one transaction, as shown in the text file below. Items are specified by sequential numbers starting from 1, separated by spaces.

Alphabetic characters cannot be used as items.

1 2 3 6
4 5 6
1 2 4 6
2 4 6
1 2 4 5
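
Because items must be numbers, data whose items are names has to be renumbered before it can be written as a transaction file. The following is a minimal sketch of one way to do this in plain Ruby; the basket contents, the item_ids table, and the temporary file location are all assumptions made for illustration, and sorting the items within a row simply mirrors the sample file above rather than a documented requirement.

```ruby
require 'tmpdir'

# Hypothetical raw data: item names instead of numbers.
baskets = [%w[apple bread milk], %w[bread eggs], %w[apple milk]]

# Assign sequential numbers starting from 1, in order of first appearance.
item_ids = {}
baskets.flatten.each { |name| item_ids[name] ||= item_ids.size + 1 }

# One row per transaction, items separated by a single space.
lines = baskets.map { |b| b.map { |name| item_ids[name] }.sort.join(' ') }

path = File.join(Dir.mktmpdir, 'tra.txt')
File.write(path, lines.join("\n") + "\n")

puts File.read(path)
p item_ids  # keep this table to translate result item numbers back to names
```

Keeping the item_ids table is what makes the enumerated itemsets interpretable afterwards, since the result only contains numbers.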

The $order$ file is a text file that lists the items in the order in which they are registered in the ZDD item order table. Typically, all items contained in the transaction data are listed as sequential numbers. Note that if an item number is missing from the transaction data, the missing number must still be specified in the order file.

1 2 3 4 5 6
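
An order file of this form can be generated from the transaction file. The sketch below, in plain Ruby, assumes that "missing numbers must be specified" means every number from 1 up to the largest item number should appear in the order file, including numbers that never occur in a transaction. The file paths and sample rows are hypothetical.

```ruby
require 'tmpdir'

dir = Dir.mktmpdir
tra_path = File.join(dir, 'tra.txt')
order_path = File.join(dir, 'order.txt')

# Hypothetical transaction file in which items 3, 4 and 6 never occur.
File.write(tra_path, "1 2 5\n2 5 7\n")

# Integer() raises ArgumentError on non-numeric tokens, which also catches
# alphabetic items that the transaction format does not allow.
max_item = File.foreach(tra_path)
               .flat_map { |line| line.split.map { |tok| Integer(tok) } }
               .max

# List every number from 1 to the largest item, gaps included.
File.write(order_path, (1..max_item).to_a.join(' ') + "\n")

puts File.read(order_path)  # 1 2 3 4 5 6 7
```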

When the $order$ file is not specified (or is specified as nil), the item order is determined by the internal algorithm of LCM to increase efficiency.

However, with this method the item numbers are reassigned sequentially in that internal order, so the item numbers in the resulting ZDD frequent itemsets differ from the original item numbers in the transaction file.

As long as the analysis does not depend on the identity of the items, it is computationally more efficient to omit $order$. Conversely, if the purpose is to analyze the contents of the items, an order file as shown above should be specified.

Specify the upper limit for the size of the frequent itemsets to be enumerated at $ub$. When this parameter is not specified, nil is assigned, which means that there is no upper limit on the itemset size.
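
The effect of a size limit can be sketched in plain Ruby without the zdd library. The hypothetical helper frequent_up_to below enumerates frequent itemsets by brute force and stops at itemsets of size ub; it assumes that the limit is inclusive and that nil means no limit, and it is not how LCM itself works.

```ruby
# Sample transactions from this section.
transactions = [[1, 2, 3, 6], [4, 5, 6], [1, 2, 4, 6], [2, 4, 6], [1, 2, 4, 5]]

# Enumerate frequent itemsets of size 0..ub (ub = nil: no size limit).
def frequent_up_to(transactions, minsup, ub = nil)
  items = transactions.flatten.uniq.sort
  max_size = ub || items.size
  (0..max_size).flat_map do |k|
    items.combination(k).select do |set|
      # Keep the itemset if enough transactions contain all of its items.
      transactions.count { |t| (set - t).empty? } >= minsup
    end
  end
end

p frequent_up_to(transactions, 3, 1)  # [[], [1], [2], [4], [6]]
p frequent_up_to(transactions, 3)     # also includes [1, 2], [2, 4], [2, 6], [4, 6]
```

With minsup 3 the largest frequent itemsets in the sample data have two items, so any limit of 2 or more gives the same result as no limit.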

Examples

Example 1: Basic Example

> require 'zdd'
# Contents of tra.txt
# 1 2 3 6
# 4 5 6
# 1 2 4 6
# 2 4 6
# 1 2 4 5
# Contents of order.txt
# 1 2 3 4 5 6
> p1=ZDD::lcm("FQ","tra.txt",3,"order.txt")
> p1.show
 3 x6 x4 + 3 x6 x2 + 4 x6 + 3 x4 x2 + 4 x4 + 3 x2 x1 + 4 x2 + 3 x1 + 5

# If the order file is not specified, the same frequent itemsets are enumerated.
# Note, however, that the item numbers in the result differ from the item
# numbers in the transaction file.
> p2=ZDD::lcm("FQ","tra.txt",3)
> p2.show
 3 x4 x1 + 3 x4 + 3 x3 x2 + 3 x3 x1 + 4 x3 + 3 x2 x1 + 4 x2 + 4 x1 + 5

See Also

freqpatA : Enumerate frequent itemsets

freqpatM : Enumerate maximal itemsets

freqpatC : Enumerate closed itemsets