Bioinformatics

Publication date: 2012-12-01
Volume: 28 Pages: 3081 - 3088
Publisher: Oxford University Press

Author:

Börnigen, D ; Tranchevent, Léon-Charles ; Bonachela Capdevila, Francisco ; Devriendt, Koenraad ; De Moor, Bart ; De Causmaecker, Patrick ; Moreau, Yves

Keywords:

ITEC, SISTA, iMinds, Science & Technology, Life Sciences & Biomedicine, Technology, Physical Sciences, Biochemical Research Methods, Biotechnology & Applied Microbiology, Computer Science, Interdisciplinary Applications, Mathematical & Computational Biology, Statistics & Probability, Biochemistry & Molecular Biology, Computer Science, Mathematics, GENOME-WIDE ASSOCIATION, IDENTIFIES SUSCEPTIBILITY LOCI, CANDIDATE GENES, CONGENITAL-ANOMALIES, RECEPTOR GENE, DISEASE GENES, MUTATIONS, VARIANTS, DUPLICATION, HAPLOINSUFFICIENCY, Databases, Genetic, Genetic Association Studies, Humans, Internet, 01 Mathematical Sciences, 06 Biological Sciences, 08 Information and Computing Sciences, Bioinformatics, 31 Biological sciences, 46 Information and computing sciences, 49 Mathematical sciences

Abstract:

MOTIVATION: Gene prioritization aims at identifying the most promising candidate genes among a large pool of candidates, so as to maximize the yield and biological relevance of downstream validation experiments and functional studies. During the past few years, several gene prioritization methods have been defined, and some of them have been implemented and made freely available as web tools. In this study, we compare the predictive performance of eight publicly available prioritization tools on novel data. We performed an analysis in which 42 recently reported disease-gene associations from the literature are used to benchmark these tools before the underlying databases are updated.

RESULTS: Cross-validation on retrospective data provides performance estimates that are likely to be overoptimistic, because some of the data sources are contaminated with knowledge of the disease-gene association. Our approach mimics a novel discovery more closely and thus provides more realistic performance estimates. There are, however, marked differences between tools, and tools that rely on more advanced data integration schemes appear more powerful.

CONTACT: yves.moreau@esat.kuleuven.be
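
The benchmark described in the abstract can be illustrated with a minimal sketch of how one tool's rankings might be scored. This is an assumption for illustration only: the abstract does not state which performance measures were used, and the function names, the 10% threshold, and the toy gene identifiers below are all hypothetical. The sketch takes, for each benchmark case, the tool's ranked candidate list and the newly reported disease gene, then reports the median rank ratio and the share of cases in which the gene falls in the top part of the ranking.

```python
from statistics import median


def rank_ratio(ranking, true_gene):
    """Return the 1-based rank of true_gene divided by the number of candidates."""
    position = ranking.index(true_gene) + 1
    return position / len(ranking)


def summarize(cases, threshold=0.10):
    """Summarize one tool's results over several benchmark cases.

    cases: list of (ranked_candidates, true_gene) pairs, best candidate first.
    threshold: hypothetical cut-off; a case counts as a hit when the true
    gene's rank ratio is at most this value.
    """
    ratios = [rank_ratio(ranking, gene) for ranking, gene in cases]
    hits = sum(1 for ratio in ratios if ratio <= threshold)
    return {
        "median_rank_ratio": median(ratios),
        "top_fraction": hits / len(ratios),
    }


# Hypothetical toy data: two benchmark cases for a single tool.
cases = [
    (["G1", "G2", "G3", "G4", "G5", "G6", "G7", "G8", "G9", "G10"], "G1"),
    (["G3", "G7", "G1", "G9", "G2", "G5", "G4", "G8", "G6", "G10"], "G2"),
]
result = summarize(cases)
print(result)
```

A lower median rank ratio means the true gene tends to appear nearer the top of the candidate list. In a prospective setting such as the one described, the rankings would have to be collected before the tools' underlying databases incorporate the new association.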