International Journal of Corpus Linguistics

Publication date: 2016-03-01
Volume: 21 Pages: 48 - 79
Publisher: John Benjamins Publishing

Authors:

Ruette, Tom ; Ehret, Katharina ; Szmrecsanyi, Benedikt

Keywords:

Social Sciences, Linguistics, Language & Linguistics, lectometry, lexis, aggregation, Semantic Vector Space models, Standard English, IDENTIFICATION, LANGUAGE, 1702 Cognitive Sciences, 2004 Linguistics, Languages & Linguistics, 4703 Language studies, 4704 Linguistics

Abstract:

Lectometry is a corpus-based methodology that explores, from an aggregate perspective, how multiple language-external dimensions shape language usage. The paper combines this methodology with Semantic Vector Space (SVS) modeling to investigate lexical variability in written Standard English, as sampled in the original Brown family of corpora (Brown, LOB, Frown and F-LOB). Based on a joint analysis of 303 lexical variables, which are semi-automatically extracted by means of an SVS, we find that lexical variation in the Brown family is systematically related to three lectal dimensions: discourse type (informative versus imaginative), standard variety (British English versus American English), and time period (1960s versus 1990s). It turns out that most lexical variables are sensitive to at least one of these three language-external dimensions, yet not every dimension has dedicated lexical variables: in particular, distinctive lexical variables for the real-time dimension fail to emerge.