Title: The Construction and Evaluation of Word Space Models
Authors: Peirsman, Yves
De Deyne, Simon
Heylen, Kris
Geeraerts, Dirk #
Issue Date: 2008
Publisher: ELRA
Host Document: Proceedings of the Language Resources and Evaluation Conference (LREC) pages:7p
Conference: Language Resources and Evaluation Conference (LREC) edition:6 location:Marrakech, Morocco date:26 May - 1 June 2008
Abstract: Semantic similarity is a key issue in many computational tasks. This paper examines the development and evaluation of two common ways of automatically calculating the semantic similarity between two words. On the one hand, such methods may depend on a manually constructed thesaurus like (Euro)WordNet. Their performance is often evaluated on the basis of a very restricted set of human similarity ratings. On the other hand, corpus-based methods rely on the distribution of two words in a corpus to determine their similarity. Their performance is generally quantified through a comparison with the judgements of the first type of approach. This paper introduces a new Gold Standard of more than 5,000 human intra-category similarity judgements. We show that corpus-based methods regularly outperform (Euro)WordNet on this data set, and that the use of the latter as a Gold Standard for the former is thus often far from ideal.
ISBN: 2-9517408-4-0
Publication status: published
KU Leuven publication type: IC
Appears in Collections:Quantitative Lexicology and Variational Linguistics (QLVL), Leuven
Onderzoeksgroep hogere cognitie en individuele verschillen (-)
# (joint) last author

Files in This Item:
File: PeirsmanEtAl_LREC2008.pdf
Description: Main article
Status: Published
Size: 127Kb
Format: Adobe PDF

