Computational Statistics & Data Analysis vol:43 issue:3 pages:315-332
The logistic regression model is commonly used to describe the effect of one or several explanatory variables on a binary response variable. It suffers from the problem that its parameters are not identifiable when there is separation in the space of the explanatory variables. In that case, existing fitting techniques fail to converge or give the wrong answer. To remedy this, a slightly more general model is proposed under which the observed response is strongly related but not equal to the unobservable true response. This model will be called the hidden logistic regression model because the unobservable true responses are comparable to a hidden layer in a feedforward neural net. The maximum estimated likelihood estimator is proposed in this model. It is robust against separation, always exists, and is easy to compute. Outlier-robust estimation is also studied in this setting, yielding the weighted maximum estimated likelihood estimator.