Data Mining and Knowledge Discovery vol:28 issue:3 pages:593-633
In this paper, we present ParaMiner, a generic and parallel algorithm for closed pattern mining. ParaMiner is built on the principles of pattern enumeration in strongly accessible set systems. Its efficiency is due to a novel dataset reduction technique (which we call EL-reduction), combined with a novel technique for performing dataset reduction in a parallel execution on a multi-core architecture. We illustrate ParaMiner’s genericity by using the algorithm to solve three different pattern mining problems: frequent itemset mining, frequent connected relational graph mining, and gradual itemset mining. We prove the soundness and completeness of ParaMiner. Furthermore, our experiments show that, despite being a generic algorithm, ParaMiner can compete with specialized state-of-the-art algorithms designed for the pattern mining problems mentioned above. Moreover, for the particular problem of gradual itemset mining, ParaMiner outperforms the state-of-the-art algorithm by two orders of magnitude.