Title: Predicting gene function in S. cerevisiae and A. thaliana using hierarchical multi-label decision tree ensembles
Authors: Schietgat, Leander
Vens, Celine
Struyf, Jan
Kocev, Dragi
Dzeroski, Saso
Blockeel, Hendrik #
Issue Date: 2008
Conference: Spring Workshop on Mining and Learning, edition 1, Traben-Trarbach, Germany, 23-25 April 2008
Abstract: We introduce a new machine learning technique for gene function prediction and investigate its performance on S. cerevisiae and A. thaliana. Two characteristics distinguish this task from common machine learning problems. First, a single gene may have multiple functions. Second, the functions are organized in a hierarchy: a gene that is related to some function is automatically related to all of its "superfunctions" (the hierarchy constraint). This problem setting is known in machine learning as hierarchical multi-label classification (HMC). We present an HMC decision tree learner that makes predictions for all classes together, takes the hierarchy constraint into account, and can process DAG hierarchies. We compare it to other decision tree approaches for HMC, most of which learn a separate tree for each class. We show that our method outperforms previously published results on functional genomics tasks. Moreover, predictive performance can be increased further by upgrading the method to an ensemble technique, if the user is willing to (partly) give up interpretability.
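Illustration: the hierarchy constraint described in the abstract can be sketched in a few lines of Python. This is only a minimal illustration under stated assumptions, not the HMC decision tree learner itself. The class names, the `PARENTS` mapping, and both helper functions are hypothetical and do not come from the poster. One class is given two parents to show a DAG rather than a tree.

```python
from typing import Dict, Iterable, Set

# Hypothetical toy hierarchy (not taken from the poster): each class maps to
# its direct parents. "lipid transport" has two parents, so this is a DAG.
PARENTS: Dict[str, Set[str]] = {
    "metabolism": set(),
    "transport": set(),
    "lipid metabolism": {"metabolism"},
    "lipid transport": {"transport", "lipid metabolism"},
}


def with_superfunctions(labels: Iterable[str],
                        parents: Dict[str, Set[str]]) -> Set[str]:
    """Return the labels plus every ancestor reachable via parent links."""
    closed: Set[str] = set()
    stack = list(labels)
    while stack:
        cls = stack.pop()
        if cls in closed:
            continue  # already visited through another path in the DAG
        closed.add(cls)
        stack.extend(parents[cls])
    return closed


def satisfies_hierarchy_constraint(labels: Iterable[str],
                                   parents: Dict[str, Set[str]]) -> bool:
    """True if every label's direct parents are also in the label set."""
    label_set = set(labels)
    return all(parents[cls] <= label_set for cls in label_set)


# A gene annotated only with the most specific class violates the constraint
# until its superfunctions are added.
raw = {"lipid transport"}
closed = with_superfunctions(raw, PARENTS)
print(satisfies_hierarchy_constraint(raw, PARENTS))
print(satisfies_hierarchy_constraint(closed, PARENTS))
```

Checking only direct parents is enough here: if every label's parents are present, then by induction all of its ancestors are present as well.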
Publication status: published
KU Leuven publication type: IMa
Appears in Collections: Informatics Section
# (joint) last author

Files in This Item:
File: SML08.pdf
Description: Poster
Status: Accepted
Size: 1184 Kb
Format: Adobe PDF

