International Conference on the Statistical Analysis of Textual Data, Date: 2012/06/11 - 2012/06/13, Location: Liège
Proceedings of the 11th International Conference on the Statistical Analysis of Textual Data
Author:
Keywords:
confounding variables, corpus linguistics, multiple correspondence analysis
Abstract:
Corpus linguistic research relies on corpora that generally display an unbalanced structure. We will discuss a potential corollary of this biased structure that is rarely accounted for in (corpus) linguistics, namely confounding variables. These are variables that increase, diminish, or reverse an explanatory variable's marginal effect compared to its conditional effect. Analyzing four instances of confounding in a variational case study governed by a series of categorical explanatory variables, we will argue that these latent confounders can be unveiled by modeling the co-occurrence patterns of the explanatory variables by means of a multiple correspondence analysis.
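The difference between a marginal and a conditional effect can be illustrated with a minimal sketch. The counts below are hypothetical and are not taken from the case study; the names `counts`, `rate`, `conditional_effect` and `marginal_effect`, as well as the labels `x` (explanatory variable with levels "a" and "b") and `z` (confounder with levels "z1" and "z2"), are assumptions made for illustration only. Because the levels of `x` are unevenly distributed over the levels of `z`, the effect of `x` is positive within each stratum of `z` but negative once the strata are pooled.

```python
# Hypothetical counts, NOT from the paper: for each combination of
# confounder level z and explanatory level x, store
# (occurrences of the variant of interest, total observations).
counts = {
    ("z1", "a"): (81, 87),
    ("z1", "b"): (234, 270),
    ("z2", "a"): (192, 263),
    ("z2", "b"): (55, 80),
}


def rate(pairs):
    """Proportion of the variant of interest over a list of (uses, total) pairs."""
    uses = sum(u for u, _ in pairs)
    total = sum(t for _, t in pairs)
    return uses / total


def conditional_effect(z):
    """Difference in proportion between x = 'a' and x = 'b' within stratum z."""
    return rate([counts[(z, "a")]]) - rate([counts[(z, "b")]])


def marginal_effect():
    """Difference in proportion between x = 'a' and x = 'b', ignoring z."""
    rate_a = rate([v for (_, x), v in counts.items() if x == "a"])
    rate_b = rate([v for (_, x), v in counts.items() if x == "b"])
    return rate_a - rate_b


if __name__ == "__main__":
    for z in ("z1", "z2"):
        print(f"conditional effect of x within {z}: {conditional_effect(z):+.3f}")
    print(f"marginal effect of x: {marginal_effect():+.3f}")
```

In this constructed example the sign of the effect flips between the stratified and the pooled comparison, which corresponds to the "reversing" case named in the abstract. It does not reproduce the multiple correspondence analysis itself.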