Bioinformatics

Publication date: 2008-01-01
Volume: 24 Pages: 63 - 70
Publisher: Oxford University Press

Author:

Mantini, Dante ; Petrucci, Francesca ; Del Boccio, Piero ; Pieragostino, Damiana ; Di Nicola, Marta ; Lugaresi, Alessandra ; Federici, Giorgio ; Sacchetta, Paolo ; Di Ilio, Carmine ; Urbani, Andrea

Keywords:

Algorithms, Amino Acid Sequence, Molecular Sequence Data, Pattern Recognition, Automated, Peptide Mapping, Principal Component Analysis, Proteome, Reproducibility of Results, Sensitivity and Specificity, Sequence Analysis, Protein, Spectrometry, Mass, Matrix-Assisted Laser Desorption-Ionization, Science & Technology, Life Sciences & Biomedicine, Technology, Physical Sciences, Biochemical Research Methods, Biotechnology & Applied Microbiology, Computer Science, Interdisciplinary Applications, Mathematical & Computational Biology, Statistics & Probability, Biochemistry & Molecular Biology, Computer Science, Mathematics, LASER-DESORPTION, PEAK DETECTION, ALGORITHMS, SPECTROMETRY, SEPARATION, LIMITATIONS, IONIZATION, MS, 01 Mathematical Sciences, 06 Biological Sciences, 08 Information and Computing Sciences, Bioinformatics, 31 Biological sciences, 46 Information and computing sciences, 49 Mathematical sciences

Abstract:

MOTIVATION: Independent component analysis (ICA) is a signal processing technique that recovers independent signals from a set of their linear mixtures. We propose ICA for the analysis of signals obtained from large proteomics investigations, such as clinical multi-subject studies based on MALDI-TOF MS profiling. The method is validated on simulated and experimental data to demonstrate that it correctly extracts protein profiles from MALDI-TOF mass spectra.

RESULTS: A comparison of peak detection against one open-source and two commercial methods shows that the ICA-based method is more reliable in reducing the false discovery rate of protein peak masses. Moreover, integrating ICA with statistical tests for differences in peak intensities between experimental groups makes it possible to identify protein peaks that could be indicators of a diseased state. This data-driven approach proves to be a promising tool for biomarker-discovery studies based on MALDI-TOF MS technology.

AVAILABILITY: The MATLAB implementation of the method described in the article, together with the simulated and experimental data, is freely available at http://www.unich.it/proteomica/bioinf/.
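
The "linear mixtures" mentioned in the abstract can be written as a generic ICA mixing model. The block below is only a sketch of that standard model. The symbols X, S, A, W and the matrix dimensions are assumed notation that does not come from the article, and the sketch does not describe how the authors arrange or preprocess the MALDI-TOF spectra.

```latex
% Assumed notation (not taken from the article):
%   X : n x m matrix of n observed signals, each sampled at m points
%   S : k x m matrix of k statistically independent source signals
%   A : n x k unknown mixing matrix
\[
  X = A\,S
\]
% ICA estimates an unmixing matrix W (k x n) such that the rows of
\[
  \hat{S} = W\,X
\]
% are as statistically independent as possible. The sources are
% recovered only up to a permutation and a scaling of the rows.
```

Under this reading, peak detection would operate on the estimated components rather than on the raw mixtures. The abstract alone does not specify this detail.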