Title: GA(M)E-QSAR: A Novel, Fully Automatic Genetic-Algorithm-(Meta)-Ensembles Approach for Binary Classification in Ligand-Based Drug Design
Authors: Perez-Castillo, Yunierkis ×
Lazar, Cosmin
Taminau, Jonatan
Froeyen, Mathy
Angel Cabrera-Perez, Miguel
Nowe, Ann #
Issue Date: Sep-2012
Publisher: American Chemical Society
Series Title: Journal of Chemical Information and Modeling vol:52 issue:9 pages:2366-2386
Abstract: Computer-aided drug design has become an important component of the drug discovery process. Despite the advances in this field, there is no single modeling approach that can be successfully applied to solve the whole range of problems faced during QSAR modeling. Feature selection and ensemble modeling are active areas of research in ligand-based drug design. Here we introduce the GA(M)E-QSAR algorithm that combines the search and optimization capabilities of Genetic Algorithms with the simplicity of the Adaboost ensemble-based classification algorithm to solve binary classification problems. We also explore the usefulness of Meta-Ensembles trained with Adaboost and Voting schemes to further improve the accuracy, generalization, and robustness of the optimal Adaboost Single Ensemble derived from the Genetic Algorithm optimization. We evaluated the performance of our algorithm using five data sets from the literature and found that it is capable of yielding classification results similar to or better than those reported for these data sets, with a higher enrichment of active compounds relative to the whole actives subset when only the most active chemicals are considered. More importantly, we compared our methodology with state-of-the-art feature selection and classification approaches and found that it can provide highly accurate, robust, and generalizable models. In the case of the Adaboost Ensembles derived from the Genetic Algorithm search, the final models are quite simple since they consist of a weighted sum of the output of single-feature classifiers. Furthermore, the Adaboost scores can be used as a ranking criterion to prioritize chemicals for synthesis and biological evaluation after virtual screening experiments.
ISSN: 1549-9596
Publication status: published
KU Leuven publication type: IT
Appears in Collections: Medicinal Chemistry (Rega Institute)
× corresponding author
# (joint) last author
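
The abstract describes the final models as a weighted sum of the outputs of single-feature classifiers, with the Adaboost score doubling as a ranking criterion. The sketch below illustrates only that idea and is not the authors' implementation: it uses plain discrete AdaBoost with one-descriptor threshold classifiers (decision stumps), replaces the Genetic Algorithm search with an exhaustive stump search, and omits the Meta-Ensembles. The toy descriptor values, the compound names, and every function name are hypothetical.

```python
import math

def stump_predict(row, feature, threshold, polarity):
    """Single-feature classifier: returns +1 (active) or -1 (inactive)."""
    return polarity if row[feature] > threshold else -polarity

def best_stump(X, y, weights):
    """Exhaustively pick the stump with the lowest weighted error.
    Assumption: this stands in for the GA-driven search, purely for illustration."""
    best = None
    for feature in range(len(X[0])):
        values = sorted({row[feature] for row in X})
        # Candidate thresholds: midpoints between consecutive distinct values.
        for threshold in [(a + b) / 2 for a, b in zip(values, values[1:])]:
            for polarity in (1, -1):
                error = sum(w for row, label, w in zip(X, y, weights)
                            if stump_predict(row, feature, threshold, polarity) != label)
                if best is None or error < best[0]:
                    best = (error, feature, threshold, polarity)
    return best

def adaboost_fit(X, y, n_rounds=3):
    """Discrete AdaBoost; returns a list of (alpha, feature, threshold, polarity)."""
    n = len(X)
    weights = [1.0 / n] * n
    model = []
    for _ in range(n_rounds):
        error, feature, threshold, polarity = best_stump(X, y, weights)
        error = min(max(error, 1e-10), 1 - 1e-10)  # guard against log(0)
        alpha = 0.5 * math.log((1 - error) / error)
        model.append((alpha, feature, threshold, polarity))
        # Up-weight misclassified compounds, down-weight correct ones, renormalize.
        weights = [w * math.exp(-alpha * label * stump_predict(row, feature, threshold, polarity))
                   for row, label, w in zip(X, y, weights)]
        total = sum(weights)
        weights = [w / total for w in weights]
    return model

def adaboost_score(model, row):
    """Weighted sum of single-feature classifier outputs; the sign gives the class."""
    return sum(alpha * stump_predict(row, feature, threshold, polarity)
               for alpha, feature, threshold, polarity in model)

# Hypothetical toy data: two descriptors per compound, label +1 = active, -1 = inactive.
names = ["cpd_A", "cpd_B", "cpd_C", "cpd_D", "cpd_E", "cpd_F"]
X = [[1, 5], [2, 1], [3, 4], [4, 6], [5, 2], [6, 3]]
y = [1, 1, -1, 1, -1, -1]

model = adaboost_fit(X, y, n_rounds=3)

# Rank compounds by score, highest first, as one might after a virtual screen.
ranking = sorted(((name, adaboost_score(model, row)) for name, row in zip(names, X)),
                 key=lambda item: item[1], reverse=True)
for name, score in ranking:
    print(f"{name}: {score:+.3f}")
```

On this toy set the three boosting rounds draw on both descriptors, and the three actives receive the highest scores. Because each term in the sum depends on a single descriptor and a single threshold, the contribution of every descriptor to a compound's score can be read off directly, which is the simplicity the abstract refers to.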

Files in This Item:
File                  Description  Status     Size    Format
ci300146h.pdf         article      Published  4159Kb  Adobe PDF
ci300146h_si_001.pdf  supp mat     Published  1241Kb  Adobe PDF

These files are only available to some KU Leuven Association staff members


