CLEF 2014 Evaluation Labs, Date: 2014/09/15 - 2014/09/18, Location: Sheffield, UK

Publication date: 2014-01-01
Volume: 1180 Pages: 1129 - 1136
Publisher: CEUR-WS.org

Proceedings of CLEF 2014 Evaluation Labs

Author:

Marquardt, James ; Farnadi, Golnoosh ; Vasudevan, Gayathri ; Moens, Marie-Francine ; Davalos, Sergio ; Teredesai, Ankur ; De Cock, Martine ; Cappellato, Linda ; Ferro, Nicola ; Halvey, Martin ; Kraaij, Wessel

Keywords:

Gender identification, Age prediction, Text mining

Abstract:

This paper describes the submission of the University of Washington's Center for Data Science to the PAN 2014 author profiling task. We examine how well several sets of features, extracted from various genres of online social media, predict age and gender. Through comparison, we establish a feature set that maximizes the accuracy of gender and age prediction across all genres examined. We report accuracies obtained by two approaches to the multi-label classification problem of predicting both age and gender: a model wherein the multi-label problem is reduced to a single-label problem using powerset transformation, and a chained classifier approach wherein the output of a dedicated classifier for gender is used as input for a classifier for age.
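
The two approaches named in the abstract can be illustrated with a minimal sketch. Everything below is an assumption made for illustration: the abstract does not state which classifiers or features were used, so a toy 1-nearest-neighbour classifier, two made-up numeric features, and hypothetical gender and age labels stand in for them.

```python
from typing import Callable, List, Sequence, Tuple

Vector = Sequence[float]


def train_1nn(X: List[Vector], y: List[str]) -> Callable[[Vector], str]:
    """Toy 1-nearest-neighbour classifier (a stand-in, not the paper's model)."""
    def predict(x: Vector) -> str:
        # Squared Euclidean distance to every training row.
        dists = [sum((a - b) ** 2 for a, b in zip(x, row)) for row in X]
        return y[dists.index(min(dists))]
    return predict


def train_powerset(X: List[Vector], genders: List[str], ages: List[str]):
    """Powerset transformation: fuse (gender, age) into one class label."""
    combined = [f"{g}|{a}" for g, a in zip(genders, ages)]
    model = train_1nn(X, combined)

    def predict(x: Vector) -> Tuple[str, str]:
        gender, age = model(x).split("|")
        return gender, age
    return predict


def train_chain(X: List[Vector], genders: List[str], ages: List[str]):
    """Chained classifiers: the gender output is an extra input feature for age."""
    def encode(gender: str) -> float:
        return 1.0 if gender == "female" else 0.0

    gender_model = train_1nn(X, genders)
    # The age model is trained on the features plus the true gender.
    X_aug = [list(x) + [encode(g)] for x, g in zip(X, genders)]
    age_model = train_1nn(X_aug, ages)

    def predict(x: Vector) -> Tuple[str, str]:
        gender = gender_model(x)
        # At prediction time the *predicted* gender is appended instead.
        age = age_model(list(x) + [encode(gender)])
        return gender, age
    return predict


# Hypothetical toy data: two numeric features per author.
X = [[0.9, 0.1], [0.8, 0.3], [0.2, 0.7], [0.1, 0.9]]
genders = ["female", "female", "male", "male"]
ages = ["18-24", "25-34", "18-24", "25-34"]

powerset_predict = train_powerset(X, genders, ages)
chain_predict = train_chain(X, genders, ages)
```

In this sketch the powerset model treats each gender and age combination as a single class, while the chained model lets an error in the gender prediction propagate into the age prediction, which is the trade-off between the two designs.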