Lecture Notes in Computer Science vol:5476 pages:664-672
Pacific-Asia Conference on Advances in Knowledge Discovery and Data Mining edition:13 location:Bangkok date:27-30 April
The usual data mining setting uses the full data set to derive patterns for different purposes. Taking cues from machine learning techniques, we explore ways to divide the data into subsets, mine patterns on each subset, and use post-processing techniques to assemble the result set. Using the patterns as features in a classification task to evaluate their quality, we compare different subset compositions and selection techniques. The two main results, that small independent sets are better suited than large amounts of data and that uninformed selection techniques perform well, can to a certain degree be explained by quantitative characteristics of the derived pattern sets.
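
The pipeline outlined above (split the data, mine each subset, pool and select patterns, encode them as features) might be sketched as follows. This is only an illustration under stated assumptions, not the authors' implementation: the abstract does not name a pattern type, so frequent itemsets are assumed here, mined by brute force; random sampling stands in for an "uninformed" selection technique; and all function names, the toy transactions, and the parameter values are hypothetical.

```python
import random
from itertools import combinations


def split_disjoint(transactions, n_subsets, seed=0):
    """Shuffle the transactions and deal them into n_subsets disjoint parts."""
    rng = random.Random(seed)
    shuffled = list(transactions)
    rng.shuffle(shuffled)
    return [shuffled[i::n_subsets] for i in range(n_subsets)]


def mine_frequent_itemsets(transactions, min_support, max_size=2):
    """Brute-force miner (assumed pattern type): itemsets of up to max_size
    items whose relative support in `transactions` is at least min_support."""
    if not transactions:
        return set()
    items = sorted(set().union(*transactions))
    frequent = set()
    for size in range(1, max_size + 1):
        for candidate in combinations(items, size):
            cand = frozenset(candidate)
            support = sum(cand <= t for t in transactions) / len(transactions)
            if support >= min_support:
                frequent.add(cand)
    return frequent


def select_random(patterns, k, seed=0):
    """Uninformed selection (assumed here to be uniform random sampling)."""
    ordered = sorted(patterns, key=sorted)  # fixed order so sampling is reproducible
    rng = random.Random(seed)
    return rng.sample(ordered, min(k, len(ordered)))


def to_features(transactions, patterns):
    """Binary feature matrix: 1 if the pattern occurs in the transaction, else 0."""
    return [[int(p <= t) for p in patterns] for t in transactions]


# Hypothetical toy data: each string is one transaction over items a-d.
transactions = [frozenset(t) for t in (
    "ab", "abc", "ac", "bc", "abd", "bd", "cd", "abcd",
)]

subsets = split_disjoint(transactions, n_subsets=2)
pooled = set().union(*(mine_frequent_itemsets(s, min_support=0.5) for s in subsets))
selected = select_random(pooled, k=3)
features = to_features(transactions, selected)
```

The rows of `features` could then be passed, together with class labels, to any classifier, with the resulting accuracy serving as a proxy for pattern quality in the spirit of the evaluation described above.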