Download PDF

IEEE International Conference on Data Mining (ICDM 2015), Date: 2016/12/13 - 2016/12/15, Location: Barcelona, Spain

Publication date: 2016-12-01
Pages: 913 - 918
ISSN: 9781509054725
Publisher: IEEE Computer Society

Proceedings of the 2016 IEEE International Conference on Data Mining (ICDM 2016)

Author:

Guns, Tias
Aknin, Achille ; Lijffijt, Jefrey ; De Bie, Tijl ; Bonchi, F ; DomingoFerrer, J ; BaezaYates, R ; Zhou, ZH ; Wu, X

Keywords:

Science & Technology, Technology, Computer Science, Artificial Intelligence, Computer Science, Information Systems, Computer Science, Data mining, Relational databases

Abstract:

Data is typically complex and relational. Therefore, the development of relational data mining methods is an in- creasingly active topic of research. Recent work has resulted in new formalisations of patterns in relational data and in a way to quantify their interestingness in a subjective manner, taking into account the data analyst’s prior beliefs about the data. Yet, a scalable algorithm to find such most interesting patterns is lacking. We introduce a new algorithm based on two notions: (1) the use of Constraint Programming, which results in a notably shorter development time, faster runtimes, and more flexibility for extensions such as branch-and-bound search; and (2), the direct search for the most interesting patterns only, instead of exhaustive enumeration of patterns before ranking them. Through empirical evaluation, we find that our novel bounds yield speedups up to several orders of magnitude, especially on dense data with a simple schema. This makes it possible to mine the most subjectively-interesting relational patterns present in databases where this was previously impractical or impossible.