International Conference on the Statistical Analysis of Textual Data, Date: 2012/06/11 - 2012/06/13, Location: Liège

Publication date: 2012-06-01
Pages: 923 - 936
Publisher: Presses Universitaires de Louvain; Louvain-la-Neuve

Proceedings of the 11th International Conference on the Statistical Analysis of Textual Data

Author:

Tummers, José
Speelman, Dirk ; Geeraerts, Dirk

Keywords:

confounding variables, corpus linguistics, multiple correspondence analysis

Abstract:

Corpus linguistic research relies on corpora which generally display an unbalanced structure. We will discuss a potential corollary of this biased structure which is rarely accounted for in (corpus) linguistics, namely confounding variables. These are variables that increase, diminish or reverse an explanatory variable's marginal effect compared to its conditional effect. Analyzing four instances of confounding in a variational case study governed by a series of categorical explanatory variables, we will argue that these latent confounders can be unveiled by modeling the co-occurrence patterns of the explanatory variables by means of a multiple correspondence analysis.
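
The contrast between a marginal and a conditional effect can be illustrated with a minimal sketch. It is not taken from the paper: the counts, the explanatory variable levels `x1`/`x2`, the confounder levels `z1`/`z2`, and the function names are all hypothetical, chosen only so that the unbalanced cell sizes produce a reversal. The effect is measured here simply as a difference in the proportion of one variant, which is an assumption made for illustration.

```python
# Hypothetical token counts, NOT data from the paper.
# Key: (confounder level, explanatory variable level)
# Value: (tokens showing variant A, total tokens in that cell)
counts = {
    ("z1", "x1"): (81, 87),
    ("z1", "x2"): (234, 270),
    ("z2", "x1"): (192, 263),
    ("z2", "x2"): (55, 80),
}


def proportion(hits, total):
    """Share of variant A among all tokens in a cell."""
    return hits / total


def marginal_effect(counts):
    """P(A | x1) - P(A | x2), pooling over the confounder levels."""
    pooled = {}
    for (_, x), (hits, total) in counts.items():
        h, t = pooled.get(x, (0, 0))
        pooled[x] = (h + hits, t + total)
    return proportion(*pooled["x1"]) - proportion(*pooled["x2"])


def conditional_effects(counts):
    """P(A | x1, z) - P(A | x2, z), computed within each confounder level z."""
    levels = sorted({z for z, _ in counts})
    return {
        z: proportion(*counts[(z, "x1")]) - proportion(*counts[(z, "x2")])
        for z in levels
    }


marginal = marginal_effect(counts)
conditional = conditional_effects(counts)

print(f"marginal effect of x1 vs x2: {marginal:+.3f}")
for z, diff in conditional.items():
    print(f"effect within {z}: {diff:+.3f}")
```

In this toy table `x1` favours variant A within both confounder levels, yet the pooled comparison points the other way, because `x1` is concentrated in the level where variant A is rarer overall. This is the reversal case named in the abstract; the increasing and diminishing cases arise from the same kind of imbalance.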