CLEF 2014 Evaluation Labs, Date: 2014/09/15 - 2014/09/18, Location: Sheffield, UK
Proceedings of CLEF 2014 Evaluation Labs
Author:
Keywords:
Gender identification, Age prediction, Text mining
Abstract:
This paper describes the submission of the University of Washington's Center for Data Science to the PAN 2014 author profiling task. We examine how well several sets of features, extracted from various genres of online social media, predict age and gender. Through comparison, we establish a feature set that maximizes the accuracy of gender and age prediction across all genres examined. We report accuracies obtained by two approaches to the multi-label classification problem of predicting both age and gender: a model in which the multi-label problem is reduced to a single-label problem using a powerset transformation, and a chained classifier approach in which the output of a dedicated classifier for gender is used as input for a classifier for age.
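The two approaches named in the abstract can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration, not the submission's actual system: the `NearestCentroid` stand-in classifier, the helper names, the two-dimensional toy features, the label strings, and the choice to train the age classifier on true gender labels while using predicted gender at test time.

```python
from collections import defaultdict


class NearestCentroid:
    """Tiny stand-in classifier (hypothetical; not the model used in the paper)."""

    def fit(self, X, y):
        sums, counts = {}, defaultdict(int)
        for row, label in zip(X, y):
            acc = sums.setdefault(label, [0.0] * len(row))
            for i, value in enumerate(row):
                acc[i] += value
            counts[label] += 1
        # One mean feature vector per class label.
        self.centroids = {
            label: [s / counts[label] for s in acc] for label, acc in sums.items()
        }
        return self

    def predict(self, X):
        def sq_dist(a, b):
            return sum((p - q) ** 2 for p, q in zip(a, b))

        return [
            min(self.centroids, key=lambda label: sq_dist(row, self.centroids[label]))
            for row in X
        ]


def gender_feature(gender):
    # Encode gender as one extra numeric feature (an assumed encoding).
    return 1.0 if gender == "female" else 0.0


def fit_powerset(X, genders, ages):
    # Powerset transformation: each (gender, age) pair becomes a single class.
    return NearestCentroid().fit(X, list(zip(genders, ages)))


def fit_chain(X, genders, ages):
    # Chained classifiers: a gender model, then an age model that also sees gender.
    gender_clf = NearestCentroid().fit(X, genders)
    X_aug = [row + [gender_feature(g)] for row, g in zip(X, genders)]
    age_clf = NearestCentroid().fit(X_aug, ages)
    return gender_clf, age_clf


def predict_chain(models, X):
    gender_clf, age_clf = models
    predicted_gender = gender_clf.predict(X)
    # At prediction time the age model receives the predicted gender.
    X_aug = [row + [gender_feature(g)] for row, g in zip(X, predicted_gender)]
    return list(zip(predicted_gender, age_clf.predict(X_aug)))


# Toy data: four documents with two made-up numeric features each.
X = [[0.0, 0.0], [0.0, 1.0], [1.0, 0.0], [1.0, 1.0]]
genders = ["female", "female", "male", "male"]
ages = ["18-24", "25-34", "18-24", "25-34"]

powerset_predictions = fit_powerset(X, genders, ages).predict(X)
chain_predictions = predict_chain(fit_chain(X, genders, ages), X)
```

In this sketch the powerset model treats every gender and age combination as its own class, so the number of classes grows with the product of the two label sets, whereas the chain keeps two smaller problems but passes any gender error on to the age classifier.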