K.U.Leuven - Departement toegepaste economische wetenschappen
DTEW Research Report 0455, pages 1-26
Partial Least Squares (PLS) is a standard statistical method in chemometrics. It can be considered as an incomplete, or 'partial', version of the Least Squares estimator of regression, applicable when high or perfect multicollinearity is present in the predictor variables. The Least Squares estimator is well known to be an optimal estimator for regression, but only when the error terms are normally distributed. In the absence of normality, and in particular when outliers are present in the data set, other, more robust regression estimators have better properties. In this paper a 'partial' version of M-regression estimators is defined. If an appropriate weighting scheme is chosen, partial M-estimators become entirely robust to any type of outlying points. It is shown that Partial Robust M-regression outperforms existing methods for robust PLS regression in terms of statistical precision and computational speed, while keeping the robustness properties. The method is applied to a data set consisting of EPXMA spectra of archaeological glass vessels. This data set contains several outliers, and it is used to illustrate the advantages of Partial Robust M-regression. Applying Partial Robust M-regression yields much smaller prediction errors for noisy calibration samples than PLS. On the other hand, if the data follow a normal model exactly, the loss in efficiency is very small.