A priori - Frequent Item Approach

Main Idea

Supersets of infrequent Itemsets should not be gernerated and tested.

Downward-Closure Property

Method

  • Scan Database to get 1-frequent-Itemsets
  • Generate k+1 long candidates where each subset is frequent (Downward-Closure Property)
  • Test these against DB and discard infrequent Itemsets
  • Terminate when no further candidate can be generated.

Example

Problems

  • DB has to be scanned a lot of times.
  • Count of candidates can get huge (exponential crossproduct).

Itemsets can be stored in Hashtrees like this: