Chemometrics and Intelligent Laboratory Systems vol:104 issue:1 pages:83-94
As a consequence of our information society, not only are more and larger data sets becoming available, but so are data sets that include multiple sorts of information regarding the same system. Such data sets can be denoted by the terms coupled, linked, or multiset data, and the associated data analysis by the term data fusion. In this paper, we first give a formal description of coupled data, which allows the data analyst to typify the structure of a coupled data set at hand. Second, we list two meta-questions and a series of complicating factors that may help to focus the initial content-driven research questions that go with coupled data and to choose a suitable data-analytic method. Third, we propose a generic framework for a family of decomposition-based models pertaining to an important subset of data fusion problems. This framework is intended both as a means to arrive at a better understanding of the features and interrelations of the specific models it subsumes and as a powerful device for the development of novel, custom-made data fusion models. We conclude the paper by showing how the proposed formal data description, meta-questions, and generic model may assist the data analyst in choosing and developing suitable strategies for the treatment of coupled data in practice. Throughout the paper, we illustrate the ideas with examples from the domain of systems biology. (C) 2010 Elsevier B.V. All rights reserved.