Title: Declarative Pattern Mining using Constraint Programming (Een declaratieve aanpak tot pattern mining door middel van constraint programming)
Other Titles: Declarative Pattern Mining using Constraint Programming
Authors: Guns, Mattias
Issue Date: 27-Jan-2012
Abstract: The goal of pattern mining is to discover patterns in data. Many techniques have been proposed for this task, differing in the type of patterns they find. To ensure that only patterns of interest are found, a common approach is to impose constraints on the patterns. Constraint-based mining systems exist in which multiple constraints can be specified. However, combining constraints in new ways or adding complex constraints requires changing the underlying algorithms. A truly general approach to constraint-based pattern mining has been missing.In this thesis we propose a general, declarative approach to pattern mining based on constraint programming. In a declarative approach one specifies what patterns need to be found, instead of algorithmically specifying how they must be found. Constraint programming offers a methodology in which a problem is stated in terms of constraints and a generic solver finds the solutions.A first contribution of this thesis is that we show how constraint programming can be used to solve constraint-based, closed and discriminative itemset mining problems as well as combinations thereof. A second contribution is that we demonstrate how the difference in performance between general constraint solvers and specialised mining algorithms can be reduced. A third contribution is the introduction of the k-pattern set mining problem, which involves finding a set of k patterns that together satisfy constraints. We propose a high-level declarative language for k-pattern set mining as well as a transformation of this language to constraint programming. Finally we apply our declarative pattern mining framework on a challenging problem in bioinformatics, namely cis-regulatory module detection. For this application, the ability to add domain-specific constraints and to combine them with existing constraints is essential.Hence we investigate for the first time how constraint programming can be used in pattern mining. We conclude on this promising approach with several remaining challenges.
Table of Contents: 1 Introduction
1.1 Data Mining
1.2 Constraint Programming
1.3 Contributions
1.4 Structure of the thesis
2 Background
2.1 Pattern Mining
2.2 Constraint Programming (CP)
3 Constraint-based Itemset Mining using CP
3.1 Introduction
3.2 Frequent Itemset Mining
3.3 Closed Itemset Mining
3.4 Discriminative Itemset Mining
3.5 Itemset Mining with Costs
3.6 Experiments
3.7 Conclusions
4 Integrating Constraint Programming and Itemset Mining
4.1 Introduction
4.2 Existing Methods
4.3 Comparison of Methods
4.4 An Integrated Approach
4.5 Complexity Analysis
4.6 Experiments
4.7 Conclusions
5 k-Pattern Set Mining using CP
5.1 Introduction
5.2 k-Pattern Set Mining
5.3 Constraint Programming Model
5.4 Experiments
5.5 Related Work
5.6 Conclusions
6 Evaluating Pattern Set Mining Strategies using CP
6.1 Introduction
6.2 Pattern Set Mining Task
6.3 Constraint Programming Model
6.4 Experiments
6.5 Conclusions
7 cis-Regulatory Module Detection using CP
7.1 Introduction
7.2 Method Overview
7.3 Constraint Programming Model
7.4 Experiments
7.5 Conclusions
8 Conclusions and Future Work
8.1 Summary and conclusions
8.2 Discussion and Future work
ISBN: 978-94-6018-468-0
Publication status: published
KU Leuven publication type: TH
Appears in Collections:Informatics Section

Files in This Item:
File Status SizeFormat
phd_thesis_tias.pdf Published 4413KbAdobe PDFView/Open


All items in Lirias are protected by copyright, with all rights reserved.