Main Page | Namespace List | Class List | File List | Class Members | File Members

Apriori_Trie Class Reference

Apriori_Trie (or prefix-tree) is a tree-based datastructure. More...

#include <Apriori_Trie.hpp>

Collaboration diagram for Apriori_Trie:

[legend]
List of all members.

Public Member Functions

 Apriori_Trie (const countertype counter_of_emptyset)
void insert_frequent_items (const vector< countertype > &counters)
 Insert the frequent items and their counters into the trie;.
void candidate_generation (const itemtype &frequent_size, Input_Output_Manager &input_output_manager)
 Generates candidates.
void find_candidate (const vector< itemtype > &basket, const itemtype candidate_size, const countertype counter=1)
 Increases the counter of those candidates that are contained by the given basket.
void delete_infrequent (const double min_occurrence, const itemtype candidate_size)
 Deletes unfrequent itemsets.
bool is_there_any_candidate () const
 Returns true if trie is not empty.
 ~Apriori_Trie ()

Protected Member Functions

bool is_all_subset_frequent (const set< itemtype > &maybe_candidate) const
 Decides if all subset of an itemset is contained in the Apriori_Trie.
void candidate_generation_two ()
 Generates candidate of size two.
void candidate_generation_assist (Trie *trie, set< itemtype > &maybe_candidate, Input_Output_Manager &input_output_manager)
 Generates candidate of size more than two.
void find_candidate_two (const vector< itemtype > &basket, const countertype counter=1)
 Increases the counter for those itempairs that are in the given basket.
void delete_infrequent_two (const double min_occurrence)
 Deletes the Tries that represent infrequent itemsets of size 2.

Protected Attributes

Trie main_trie
 Trie to store the candidates.
vector< vector< countertype > > temp_counter_array
 temp_counter_array stores the occurences of the itempairs

Detailed Description

Apriori_Trie (or prefix-tree) is a tree-based datastructure.

Apriori_Trie is a special tree designed to provide efficient methods for the apriori algorithm. It mostly uses a regular trie except when there exist faster solution. For example for storing one and two itemset candidate where a simple vector and array gives better performance. Apriori_Trie extends the functions provided by the regular trie with a candidate generation process.

Definition at line 35 of file Apriori_Trie.hpp.


Constructor & Destructor Documentation

Apriori_Trie::Apriori_Trie const countertype  counter_of_emptyset  )  [inline]
 

Parameters:
counter_of_emptyset The support of the empty set, i.e. the number of transactions.

Definition at line 43 of file Apriori_Trie.hpp.

Apriori_Trie::~Apriori_Trie  )  [inline]
 

Definition at line 74 of file Apriori_Trie.hpp.


Member Function Documentation

void Apriori_Trie::candidate_generation const itemtype frequent_size,
Input_Output_Manager input_output_manager
 

Generates candidates.

Generates candidates and writes frequent itemset that are obtained in the previous iteration to disk.

Definition at line 32 of file Apriori_Trie.cpp.

References candidate_generation_assist(), candidate_generation_two(), and main_trie.

void Apriori_Trie::candidate_generation_assist Trie trie,
set< itemtype > &  maybe_candidate,
Input_Output_Manager input_output_manager
[protected]
 

Generates candidate of size more than two.

Definition at line 121 of file Apriori_Trie.cpp.

References Trie::add_empty_state(), Trie::counter, Trie::edgevector, is_all_subset_frequent(), and Input_Output_Manager::write_out_basket_and_counter().

Referenced by candidate_generation().

void Apriori_Trie::candidate_generation_two  )  [protected]
 

Generates candidate of size two.

Definition at line 104 of file Apriori_Trie.cpp.

References Trie::edgevector, main_trie, and temp_counter_array.

Referenced by candidate_generation().

void Apriori_Trie::delete_infrequent const double  min_occurrence,
const itemtype  candidate_size
 

Deletes unfrequent itemsets.

Parameters:
min_occurrence The threshold of absolute support.
candidate_size The size of the candidate itemset.

Definition at line 69 of file Apriori_Trie.cpp.

References Trie::delete_infrequent(), delete_infrequent_two(), and main_trie.

void Apriori_Trie::delete_infrequent_two const double  min_occurrence  )  [protected]
 

Deletes the Tries that represent infrequent itemsets of size 2.

Parameters:
min_occurrence The occurence threshold

temp_counter_array[stateIndex_1] will never be used again!

temp_counter_array will never be used again!

Definition at line 203 of file Apriori_Trie.cpp.

References Trie::add_empty_state(), Trie::edgevector, main_trie, and temp_counter_array.

Referenced by delete_infrequent().

void Apriori_Trie::find_candidate const vector< itemtype > &  basket,
const itemtype  candidate_size,
const countertype  counter_incr = 1
 

Increases the counter of those candidates that are contained by the given basket.

Parameters:
basket The basket that hs to be analyzed.
candidate_size the size of the candidates.
counter_incr The number of time the basket occured. The counters of candidates that occure in the basket has to be incremented by counter_incr.

Definition at line 53 of file Apriori_Trie.cpp.

References Trie::find_candidate(), find_candidate_two(), and main_trie.

Referenced by Apriori::support().

void Apriori_Trie::find_candidate_two const vector< itemtype > &  basket,
const countertype  counter = 1
[protected]
 

Increases the counter for those itempairs that are in the given basket.

Parameters:
basket the given basket
counter The number the processed basket occures in the transactional database

Definition at line 183 of file Apriori_Trie.cpp.

References temp_counter_array.

Referenced by find_candidate().

void Apriori_Trie::insert_frequent_items const vector< countertype > &  counters  ) 
 

Insert the frequent items and their counters into the trie;.

Parameters:
counters It stores the support of the items. counters[i] stores the suport of item i.

Definition at line 20 of file Apriori_Trie.cpp.

References Trie::add_empty_state(), and main_trie.

bool Apriori_Trie::is_all_subset_frequent const set< itemtype > &  maybe_candidate  )  const [protected]
 

Decides if all subset of an itemset is contained in the Apriori_Trie.

Parameters:
maybe_candidate The itemset that has to be checked.

Definition at line 82 of file Apriori_Trie.cpp.

References Trie::is_included(), and main_trie.

Referenced by candidate_generation_assist().

bool Apriori_Trie::is_there_any_candidate  )  const [inline]
 

Returns true if trie is not empty.

Definition at line 69 of file Apriori_Trie.hpp.


Member Data Documentation

Trie Apriori_Trie::main_trie [protected]
 

Trie to store the candidates.

Definition at line 105 of file Apriori_Trie.hpp.

Referenced by candidate_generation(), candidate_generation_two(), delete_infrequent(), delete_infrequent_two(), find_candidate(), insert_frequent_items(), and is_all_subset_frequent().

vector< vector<countertype> > Apriori_Trie::temp_counter_array [protected]
 

temp_counter_array stores the occurences of the itempairs

We can use a simple array to determine the support of itemset of size two. This requires less memory than the trie-based supportcount. temp_counter_array[i][j-i] stores the occurence of the itempair (i,j).

Definition at line 113 of file Apriori_Trie.hpp.

Referenced by candidate_generation_two(), delete_infrequent_two(), and find_candidate_two().


The documentation for this class was generated from the following files:
Generated on Fri Mar 11 14:48:06 2005 for APRIORI algorithm by  doxygen 1.3.9.1