ITEM METADATA RECORD
Title: A machine learning approach to sentiment analysis in multilingual Web texts
Authors: Boiy, Erik ×
Moens, Marie-Francine #
Issue Date: 2009
Publisher: Kluwer Academic Publishers
Series Title: Information Retrieval, vol. 12, issue 5, pages 526-558
Abstract: Sentiment analysis, also called opinion mining, is a form of information extraction from text of growing commercial and research interest. In this paper we present our machine learning experiments on sentiment analysis in blog, review and forum texts found on the World Wide Web and written in English, Dutch and French. We train on a set of example sentences or statements that are manually annotated as positive, negative or neutral with regard to a certain entity of interest. We are interested in the feelings that people express towards certain consumer products. We learn and evaluate several classification models that can be configured in a cascaded pipeline. We have to deal with several problems: the noisy character of the input texts, the attribution of the sentiment to a particular entity of interest, and the small size of the training set. We succeed in identifying positive, negative and neutral feelings towards the entity under consideration with ca. 83% accuracy for English texts, based on unigram features augmented with linguistic features. The accuracy for the Dutch and French texts is ca. 70% and 68% respectively, due to the larger variety of linguistic expressions that more often diverge from standard language and thus demand more training patterns. In addition, our experiments give us insights into the portability of the learned models across domains and languages. A substantial part of the article investigates the role of active learning techniques in reducing the number of examples to be manually annotated.
ISSN: 1386-4564
Publication status: published
KU Leuven publication type: IT
Appears in Collections: Informatics Section
× corresponding author
# (joint) last author

Files in This Item:
File            Description   Status     Size   Format
BoiyIR2008.pdf  Main article  Published  324Kb  Adobe PDF

 


All items in Lirias are protected by copyright, with all rights reserved.
