International Journal of Corpus Linguistics

Publication date: 2014-11-01
Volume: 19 Pages: 478 - 504
Publisher: John Benjamins Pub. Co.

Author:

Tummers, José
Speelman, Dirk
Geeraerts, Dirk

Keywords:

variational linguistics, spurious effects, confounding, Multiple Correspondence Analysis, Social Sciences, Linguistics, Language & Linguistics, 1702 Cognitive Sciences, 2004 Linguistics, Languages & Linguistics, 4703 Language studies, 4704 Linguistics

Abstract:

As repositories of spontaneously realized language, corpora generally have an uncontrolled and unbalanced structure in which all variables operate simultaneously. Consequently, a variable’s real effect can be concealed when it is studied in isolation, because the impact of other, potentially confounding variables is excluded. Using a variational case study, the alternation between inflected and uninflected attributive adjectives in Dutch, we demonstrate how confounding variables alter the impact of explanatory variables on the response variable, resulting in spurious effects in the bivariate analyses. Multiple Correspondence Analysis is used as a heuristic tool to unveil the association patterns between explanatory variables in the data matrix that induce the spurious effects. Based on these findings, we argue for a thorough analysis of the database patterns to gain insight into the underlying associations between explanatory variables before modeling their real impact on the response variable in a multivariate model.
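
The mechanism behind a spurious bivariate effect can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration only: the counts are invented toy numbers, and the variable names (`register` as the explanatory variable, `noun_class` as the confounder) are hypothetical and not taken from the study. The sketch compares the inflection rate per level of the explanatory variable, first ignoring the confounder and then within each level of the confounder.

```python
from collections import defaultdict

# Hypothetical toy counts, NOT data from the study:
# (register, noun_class) -> (inflected tokens, total tokens)
counts = {
    ("formal", "A"): (64, 80),
    ("informal", "A"): (16, 20),
    ("formal", "B"): (4, 20),
    ("informal", "B"): (16, 80),
}


def marginal_rates(table):
    """Inflection rate per register, ignoring noun_class (bivariate view)."""
    inflected = defaultdict(int)
    total = defaultdict(int)
    for (register, _noun_class), (k, n) in table.items():
        inflected[register] += k
        total[register] += n
    return {register: inflected[register] / total[register] for register in total}


def stratified_rates(table):
    """Inflection rate per register within each noun_class (confounder held fixed)."""
    return {cell: k / n for cell, (k, n) in table.items()}


bivariate = marginal_rates(counts)
stratified = stratified_rates(counts)

print("Bivariate:", bivariate)
for cell in sorted(stratified):
    print(cell, round(stratified[cell], 2))
```

In this toy table the two registers differ clearly in the bivariate view, yet within each noun class their inflection rates are identical. The apparent register effect arises only because register and noun class are unevenly distributed over each other, which is the kind of association between explanatory variables that the abstract proposes to uncover before fitting a multivariate model.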