Machine Learning

Publication date: 2016-10-01
Volume: 105
Pages: 41-75
Publisher: Springer New York LLC

Author:

van Leeuwen, Matthijs
De Bie, Tijl ; Spyropoulou, Eirini ; Mesnage, Cédric

Keywords:

Science & Technology, Technology, Computer Science, Artificial Intelligence, Computer Science, Dense subgraph patterns, Community detection, Subjective interestingness, Maximum entropy, DISCOVERY, 0801 Artificial Intelligence and Image Processing, 0806 Information Systems, 1702 Cognitive Sciences, Artificial Intelligence & Image Processing, 4611 Machine learning

Abstract:

© 2016, The Author(s). The utility of a dense subgraph in gaining a better understanding of a graph has been formalised in numerous ways, each striking a different balance between approximating actual interestingness and computational efficiency. A difficulty in making this trade-off is that, while the computational cost of an algorithm is relatively well-defined, a pattern’s interestingness is fundamentally subjective. As a result, interestingness is often treated only informally or neglected altogether, and some form of density is used as a proxy instead. We resolve this difficulty by formalising what makes a dense subgraph pattern interesting to a given user. Unsurprisingly, the resulting measure depends on the user’s prior beliefs about the graph. For concreteness, in this paper we consider two cases: one where the user only has a belief about the overall density of the graph, and another where the user has prior beliefs about the degrees of the vertices. Furthermore, we illustrate how the resulting interestingness measure differs from previous proposals. We also propose effective exact and approximate algorithms for mining the most interesting dense subgraph according to the proposed measure. Usefully, the proposed interestingness measure and approach lend themselves well to iterative dense subgraph discovery. Contrary to most existing approaches, our method naturally allows subsequently found patterns to overlap. The empirical evaluation highlights the properties of the new interestingness measure under different sets of prior beliefs, as well as our approach’s ability to find interesting subgraphs that other methods are unable to find.