Title: Feature construction based on class outliers
Authors: Zimmermann, Albrecht
Issue Date: Nov-2013
Publisher: Department of Computer Science, KU Leuven
Series Title: CW Reports, vol. CW648
Abstract: Data for training a classification model can be considered to consist of two types of points: easy-to-classify ones, which are typical for a class, and difficult-to-classify ones, which are atypical for a class and often lie on class boundaries. Most existing techniques deal with atypical points in later stages of model building, after typical points have been modeled. This means that atypical points are often modeled only if doing so improves on the model of typical points. An alternative way of viewing atypical points is as outliers w.r.t. the class to which they supposedly belong. Based on this realization, we introduce the concept of class outliers, whose immediate neighborhoods we use to construct discriminative features. We investigate ways of employing the newly derived features and compare the quality of the resulting models with results on un-augmented data for a variety of UCI benchmark sets. We find that while some overfitting control can be necessary, the newly derived features improve the classification accuracy of SVM, Naive Bayes, and C4.5 classifiers.
Publication status: published
KU Leuven publication type: IR
Appears in Collections: Informatics Section

Files in This Item:
File: CW648.pdf
Description: Document
Status: Published
Size: 1003Kb
Format: Adobe PDF

