Title: Multilayered class discrimination in large-scale taxonomies
Authors: Gomez, Juan Carlos; Moens, Marie-Francine
Issue Date: 2012
Publisher: IOS Press
Host Document: Frontiers in Artificial Intelligence, vol. 243, pages 615-625
Conference: International Conference on Knowledge-Based and Intelligent Information & Engineering Systems, edition 16, San Sebastian, Spain, 10-12 September 2012
Abstract: In this work we implement and evaluate a methodology for classifying multi-labeled web documents into large-scale taxonomies using their text content. Multi-label hierarchical classification with large-scale taxonomies is a hard task because of the scarcity of training data in many nodes of the hierarchy, overlapping content, and complex decision surfaces. We propose a novel feature extraction model called Multilayered Class Discrimination (MCD), which reduces the dimensionality of the text-content features of the web documents along the different levels of the hierarchy, helping to discriminate each class from the other classes at the same level and reducing the effects of these problems. The results of categorizing web documents from the DMOZ directory show that our model improves categorization accuracy compared with the use of word features, and that the results are competitive with those presented in the Second LSHTC Challenge.
ISSN: 0302-9743
Publication status: published
KU Leuven publication type: IC
Appears in Collections: Informatics Section

Files in This Item:
File: GomezMoensKES2012.pdf
Description: Main article
Status: Published
Size: 158Kb
Format: Adobe PDF