main-fsm.cpp File Reference

The main function of frequent itemsequence mining algorithms.

#include "common.hpp"
#include "io/input/transaction_reader/brBufferedTransactionReader.hpp"
#include "io/codec/decoder/df/CacheDFDecoder.hpp"
#include "util/StreamParser.hpp"
#include "util/SeqFrequentFilter.cpp"
#include "datastructures/maxvector.hpp"
#include "datastructures/trie/edgelist/OrderedEdgelist.hpp"
#include "apriori/bodon/Trie.hpp"
#include "apriori/SeqAprioriSelector.hpp"
#include <vector>
#include <iostream>
#include <string>

Functions

void init ()
void usage ()
 commandline -- some utility methods for fim commandline tools.
int process_arguments (int argc, char *argv[], counter_t &min_supp, bool &isrel, double &relminsupp, unsigned int &maxsize)
 This procedure processes the arguments.
int main (int argc, char *argv[])

Variables

std::string file_format


Detailed Description

The main function of frequent itemsequence mining algorithms.

Mining frequent item sequences is a natural generalization of frequent itemset mining (FIM). Given a sequence of transactions, where each transaction is a sequence over an alphabet $I$, we have to find the sequences that occur as a subsequence in at least $min\_supp$ transactions. A sequence $s=\langle i_1, i_2, \ldots, i_n\rangle$ is a subsequence of $s'=\langle i'_1, i'_2, \ldots, i'_m\rangle$ (written $s \prec s'$) if there exist integers $1\le j_1 < j_2 < \cdots < j_n \le m$ such that $i_1 = i'_{j_1}, i_2 = i'_{j_2}, \ldots, i_n = i'_{j_n}$, i.e. we can get $s$ by deleting some items from $s'$. For example, $\langle e, a, a, b\rangle \prec \langle f, e, a, b, c, a, a, c, b\rangle$ because $j_1 = 2$, $j_2 = 3$, $j_3 = 6$, $j_4 = 9$ meet the requirements.
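The subsequence relation and the support count defined above can be sketched in a few lines of C++. This is only an illustration of the definitions, not the trie-based implementation used by this program; the names Item, Sequence, is_subsequence and support are hypothetical and do not appear in the FIM environment.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical types for illustration; the real code base has its own typedefs.
using Item = char;
using Sequence = std::vector<Item>;

// Returns true if s is a subsequence of t, i.e. s can be obtained
// from t by deleting some items. Greedy left-to-right scan: each
// item of s is matched to the earliest usable position in t.
bool is_subsequence(const Sequence& s, const Sequence& t) {
    std::size_t matched = 0;  // number of items of s matched so far
    for (std::size_t j = 0; j < t.size() && matched < s.size(); ++j) {
        if (t[j] == s[matched]) ++matched;
    }
    return matched == s.size();
}

// Support of s: the number of transactions that contain s as a subsequence.
std::size_t support(const Sequence& s,
                    const std::vector<Sequence>& transactions) {
    std::size_t count = 0;
    for (const Sequence& t : transactions) {
        if (is_subsequence(s, t)) ++count;
    }
    return count;
}
```

On the example above, the greedy scan matches the items of $\langle e, a, a, b\rangle$ at positions 2, 3, 6 and 9 of the longer sequence. A sequence is frequent when support(s, transactions) is at least min_supp. Scanning every transaction for every candidate in this way is the naive approach; the trie-based Apriori referenced below exists to avoid that cost.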

Currently this program contains only a trie-based Apriori implementation. For more information about this solution, see the paper "A Trie-based APRIORI Implementation for Mining Frequent Item Sequences" by Ferenc Bodon. It can be downloaded from the web page of the OSDM'05 workshop of ACM SIGKDD (http://www.cs.rpi.edu/~zaki/OSDM05/papers/p56-bodon.pdf).

Author:
Ferenc BODON
Date:
2005-04-19

Definition in file main-fsm.cpp.


Function Documentation

void init ()

Definition at line 52 of file main-fsm.cpp.

References file_format.

int main (int argc, char *argv[])

Definition at line 149 of file main-fsm.cpp.

References SeqFrequentFilter< IT_R >::findFrequentItems(), init(), process_arguments(), FileReprBase::READ, usage(), and FileReprBase::WRITE.

int process_arguments (int argc, char *argv[], counter_t &min_supp, bool &isrel, double &relminsupp, unsigned int &maxsize)

This procedure processes the arguments.

Returns:
  • 0, if no error occurred,
  • 1, in case of an IO error,
  • 2, in case of too few arguments,
  • 3, if proper min_supp cannot be generated.
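
A caller may find it convenient to give the documented return values symbolic names. The following is a minimal sketch under that assumption; the enum ArgStatus, its enumerators and describe_status are hypothetical and are not part of main-fsm.cpp, which returns plain integers.

```cpp
#include <string>

// Hypothetical names for the return values documented above.
enum ArgStatus {
    ARGS_OK = 0,           // no error occurred
    ARGS_IO_ERROR = 1,     // IO error
    ARGS_TOO_FEW = 2,      // too few arguments
    ARGS_BAD_MIN_SUPP = 3  // proper min_supp cannot be generated
};

// Maps a status code to a short human-readable message.
std::string describe_status(int status) {
    switch (status) {
        case ARGS_OK:           return "no error";
        case ARGS_IO_ERROR:     return "IO error";
        case ARGS_TOO_FEW:      return "too few arguments";
        case ARGS_BAD_MIN_SUPP: return "proper min_supp cannot be generated";
        default:                return "unknown status";
    }
}
```

With such a helper, a caller could print describe_status(code) and exit early whenever the returned code is nonzero.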

Definition at line 88 of file main-fsm.cpp.

References convert(), largest_itemsetsize, and usage().

void usage ()

commandline -- some utility methods for fim commandline tools.

Definition at line 63 of file main-fsm.cpp.

References file_format.


Variable Documentation

std::string file_format
 

Definition at line 50 of file main-fsm.cpp.


Generated on Sun Sep 17 17:53:13 2006 for FIM environment by doxygen 1.4.4