EACL 2009 Workshop on GEMS: GEometrical Models of Natural Language Semantics, Date: 2009/03/31, Location: Athens, Greece

Publication date: 2009-03-01
16
ISBN: 9781618392374
Publisher: Association for Computational Linguistics

Proceedings of the EACL 2009 Workshop on GEMS: GEometrical Models of Natural Language Semantics

Authors:

Peirsman, Yves
Speelman, Dirk

Keywords:

lexical variation, semantics

Abstract:

In the recognition of words that are typical of a specific language variety, the classic keyword approach performs rather poorly. We show how this keyword analysis can be complemented with a word space model constructed on the basis of two corpora: one representative of the language variety under investigation, and a reference corpus. This combined approach is able to recognize the markers of a language variety as words that not only have a significantly higher frequency as compared to the reference corpus, but also a different distribution. The application of word space models moreover makes it possible to automatically discover the lexical alternative to a specific marker in the reference corpus.
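
The combined approach described in the abstract can be illustrated with a minimal sketch. The abstract does not name a keyness statistic, a context definition, or any thresholds, so everything below is an assumption made for illustration: the log-likelihood ratio (G2) as the keyword measure, a symmetric window of context words, cosine similarity as the distributional comparison, and the hypothetical function names and cut-off values. It is not the authors' implementation.

```python
import math
from collections import Counter


def log_likelihood(freq_a, total_a, freq_b, total_b):
    """Log-likelihood (G2) keyness of one word across two corpora (assumed statistic)."""
    combined = freq_a + freq_b
    expected_a = total_a * combined / (total_a + total_b)
    expected_b = total_b * combined / (total_a + total_b)
    g2 = 0.0
    for observed, expected in ((freq_a, expected_a), (freq_b, expected_b)):
        if observed > 0:
            g2 += observed * math.log(observed / expected)
    return 2.0 * g2


def context_vector(tokens, target, window=2):
    """Count the words occurring within `window` positions of each `target` token."""
    counts = Counter()
    for i, token in enumerate(tokens):
        if token == target:
            start = max(0, i - window)
            end = min(len(tokens), i + window + 1)
            for j in range(start, end):
                if j != i:
                    counts[tokens[j]] += 1
    return counts


def cosine(u, v):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(u[k] * v[k] for k in u if k in v)
    norm_u = math.sqrt(sum(x * x for x in u.values()))
    norm_v = math.sqrt(sum(x * x for x in v.values()))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


def is_marker(word, variety_tokens, reference_tokens,
              g2_threshold=3.84, sim_threshold=0.5):
    """Hypothetical decision rule: the word is significantly more frequent in the
    variety corpus AND its contexts differ between the two corpora."""
    freq_v = variety_tokens.count(word)
    freq_r = reference_tokens.count(word)
    total_v, total_r = len(variety_tokens), len(reference_tokens)
    more_frequent = freq_v / total_v > freq_r / total_r
    keyness = log_likelihood(freq_v, total_v, freq_r, total_r)
    similarity = cosine(context_vector(variety_tokens, word),
                        context_vector(reference_tokens, word))
    return more_frequent and keyness > g2_threshold and similarity < sim_threshold


# Toy corpora (invented data): "w" is more frequent in the variety corpus
# and appears there among different neighbours than in the reference corpus.
variety = ["x", "w", "y"] * 20 + ["z"] * 40
reference = ["p", "w", "q"] * 5 + ["z"] * 85
print(is_marker("w", variety, reference))
print(is_marker("z", variety, reference))
```

The default `g2_threshold` of 3.84 is the chi-square critical value for p < 0.05 with one degree of freedom; `sim_threshold` is an arbitrary illustrative value. Discovering the lexical alternative to a marker, as mentioned in the abstract, would additionally require comparing the marker's variety-corpus vector with the reference-corpus vectors of other words, which this sketch leaves out.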