User Modeling in Social Media

Publication date: 2017-06-27

Author:

Farnadi, Golnoosh
De Cock, Martine ; Moens, Marie-Francine

Abstract:

The era of social media as we know it today started in the early 2000s. Social media enable a form of virtual content sharing that is fundamentally different from what came before. Social media content is no longer created and published by specific individuals, but is instead continuously modified by all users in a collaborative fashion. Users around the world now treat social media as a key component of their communication and rely on it for news and information. The large number of social media users provides a unique opportunity for researchers to explore user modeling. Applications in a wide array of fields, such as marketing, law enforcement, and targeted advertising, benefit from reliable approaches to user modeling. The traditional approach to building a user model, gathering data by directly asking users to fill out questionnaires, is time-consuming and impractical for online users. Meanwhile, users continuously generate content about themselves, their lifestyles, likes and dislikes, and preferences on social media platforms. This user-generated content (UGC), together with the social ties among users and the platform itself, contains a wealth of data about users. In this thesis, we address user modeling by processing user data available on social media platforms. We leverage both UGC and users' social relational content to automatically infer user attributes such as age, gender, and personality traits. The thesis has five main contributions. First, to model social media users based on their UGC, we present a comparative analysis of state-of-the-art machine learning methods for computational personality recognition on a varied set of social media benchmark datasets. In addition to UGC, we model users given their social relational content, such as their friendship connections.
Our second contribution is a novel graph mining technique that dynamically adapts to the underlying characteristics of the network's connections to infer the profiles of users in social media. Third, we propose the first statistical relational learning (SRL) framework that supports reasoning with soft quantifiers such as "most" and "a few" to better model relational content in social media. We show that the use of soft quantifiers not only increases the expressivity of the language, so that it can better express users' behavior in social media, but also allows more accurate predictions about users' profiles, such as their age and gender. Our fourth contribution is a new SRL model that integrates both UGC and social relational content to jointly model user behavior. We provide novel sub-models based on user-item relations in our SRL fusion model. Our fusion model successfully incorporates multiple sources of information and outperforms competing methods that use only one source of information for modeling users in social media. Fifth, we propose a novel deep neural network fusion framework that models social media users given their posts, profile picture, and the pages that they like, in order to infer users' age, gender, and personality traits. We propose a stacking strategy to strengthen the fusion framework for multi-task learning, and a hybrid fusion process that integrates modalities at both the feature level and the decision level. The results demonstrate the advantage of using a hybrid fusion framework in user profiling, with highly accurate results in age and gender prediction. The techniques presented in this thesis are evaluated empirically on benchmark datasets from Facebook, YouTube, Twitter, and Netlog, with data from thousands to millions of users.