Computational Statistics & Data Analysis vol:37 issue:1 pages:65-75
In this paper, we show that the recent notion of regression depth can be used as a data-analytic tool to measure the amount of separation between successes and failures in the binary response framework. Extending the algorithm for computing regression depth allows us to compute the overlap in data sets which are commonly fitted by logistic or probit regression models. The overlap is the number of observations that would need to be removed to obtain complete or quasi-complete separation, i.e., the situation where the regression parameters are no longer identifiable and the maximum likelihood estimate does not exist. It turns out that the overlap is often quite small. The results are equally useful in linear discriminant analysis.
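To make the definition of overlap concrete, the following is a minimal brute-force sketch for the simplest case of a single covariate plus an intercept. It is not the regression-depth algorithm referred to above. The function name `overlap_1d` and the example data are hypothetical and chosen only for illustration. In this setting, complete or quasi-complete separation means that some threshold on the covariate puts all successes on one side and all failures on the other, with ties at the threshold allowed.

```python
def overlap_1d(x, y):
    """Smallest number of observations to drop so that a single threshold
    on x separates successes (y == 1) from failures (y == 0).

    Illustrative brute force for one covariate plus an intercept only;
    ties at the threshold are allowed (quasi-complete separation).
    """
    best = len(x)  # dropping every observation always suffices
    for t in set(x):  # thresholds at observed values are enough
        # Direction A: successes should lie at x >= t, failures at x <= t.
        violations_a = sum(
            1 for xi, yi in zip(x, y)
            if (yi == 1 and xi < t) or (yi == 0 and xi > t)
        )
        # Direction B: successes should lie at x <= t, failures at x >= t.
        violations_b = sum(
            1 for xi, yi in zip(x, y)
            if (yi == 1 and xi > t) or (yi == 0 and xi < t)
        )
        best = min(best, violations_a, violations_b)
    return best


# Hypothetical data: one failure (x = 4) sits between two successes.
x = [1, 2, 3, 4, 5, 6]
y = [0, 0, 1, 0, 1, 1]
print(overlap_1d(x, y))
```

Under these assumptions, a data set that is already separated has overlap 0, which is the case in which the maximum likelihood estimate does not exist. The sketch checks every observed threshold in both directions, so its cost grows quadratically with the number of observations, and it does not extend to several covariates.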