Lecture Notes in Computer Science vol:8577 pages:25-30
International Conference on Conceptual Structures edition:21 location:Iasi, Romania date:27-30 July 2014
Over the past decades, databases have grown steadily in both size and
complexity. While the growth in size is easy to measure in bytes, kilobytes or
terabytes, the growth in complexity is more difficult to quantify. It
nevertheless has a profound effect on the theory we use to reason about the
data. While in earlier days many researchers reasoned in terms of sets of
similarly structured and independent objects, today we face large networks of
data in which everything is connected, directly or indirectly, to everything
else. Examples include social networks, traffic networks, biological networks,
administrative networks and economic networks.
These developments have spurred a renewed interest in data storage and
knowledge extraction (answers to queries, patterns, models, ...). Three key
underlying challenges are representing the data and knowledge, managing the
computational cost of the problems we need to solve, and meeting the
statistical challenge posed by the complexity of the data.
In this contribution, I will survey these challenges from a data mining point of
view. I will argue that in order to address the current challenges it is valuable to
gain a better understanding of fundamental statistical and algorithmic properties
of large data networks and to integrate ideas from the many fields of research
that are concerned with such networks.