Title: DISCO-SCA and Properly Applied GSVD as Swinging Methods to Find Common and Distinctive Processes
Authors: Van Deun, Katrijn ×
Van Mechelen, Iven
Thorrez, Lieven
Schouteden, Martijn
De Moor, Bart
van der Werf, Mariët J.
De Lathauwer, Lieven
Smilde, Age K.
Kiers, Henk A. L. #
Issue Date: 2012
Publisher: Public Library of Science
Series Title: PLoS One vol:7 issue:5 pages:1-13
Article number: e37840
Abstract: Background: In systems biology it is common to obtain for the same set of biological entities information from multiple
sources. Examples include expression data for the same set of orthologous genes screened in different organisms and data
on the same set of culture samples obtained with different high-throughput techniques. A major challenge is to find the
important biological processes underlying the data and to disentangle therein processes common to all data sources and
processes distinctive for a specific source. Recently, two promising simultaneous data integration methods have been
proposed to attain this goal, namely generalized singular value decomposition (GSVD) and simultaneous component
analysis with rotation to common and distinctive components (DISCO-SCA).
Results: Both theoretical analyses and applications to biologically relevant data show that: (1) straightforward applications
of GSVD yield unsatisfactory results, (2) DISCO-SCA performs well, (3) provided proper pre-processing and algorithmic
adaptations, GSVD reaches a performance level similar to that of DISCO-SCA, and (4) DISCO-SCA is directly generalizable to
more than two data sources. The biological relevance of DISCO-SCA is illustrated with two applications. First, in a setting of
comparative genomics, it is shown that DISCO-SCA recovers a common theme of cell cycle progression and a yeast-specific
response to pheromones. The biological annotation was obtained by applying Gene Set Enrichment Analysis in an
appropriate way. Second, in an application of DISCO-SCA to metabolomics data for Escherichia coli obtained with two
different chemical analysis platforms, it is illustrated that the metabolites involved in some of the biological processes
underlying the data are detected by one of the two platforms only; therefore, platforms for microbial metabolomics should
be tailored to the biological question.
Conclusions: Both DISCO-SCA and properly applied GSVD are promising integrative methods for finding common and
distinctive processes in multisource data. Open source code for both methods is provided.
ISSN: 1932-6203
Publication status: published
KU Leuven publication type: IT
Appears in Collections: Quantitative Psychology and Individual Differences
ESAT - STADIUS, Stadius Centre for Dynamical Systems, Signal Processing and Data Analytics
Electrical Engineering (ESAT), Campus Kulak Kortrijk
Electrical Engineering - miscellaneous
× corresponding author
# (joint) last author

All items in Lirias are protected by copyright, with all rights reserved.
