Title: How partial least squares validation procedures may lead to deceptive models
Authors: Kemps, Bart ×
Mertens, Kristof
Saeys, Wouter
Darius, Paul
De Baerdemaeker, Josse
De Ketelaere, Bart #
Issue Date: 2009
Series Title: Journal of Near Infrared Spectroscopy, issue: submitted
Abstract: Partial least squares (PLS) is a very powerful multivariate technique; however, great care should be taken that the obtained results reflect the matters of interest. In this work, different validation procedures were investigated on a dataset collected over several measuring days. Validation procedures frequently applied in practice, i.e. leave-one-out cross-validation (LOOCV) and validation based on a random subdivision into a calibration and a validation set (RSS), were compared with validation based on splitting the dataset into a calibration and a validation set by measurement day (MD). LOOCV and RSS validation led to very high RPD values, whilst MD validation indicated that the spectra contained no information. PLS analysis was shown to exploit small differences between spectra measured on different days for prediction. A random permutation of the dependent variable showed that these differences were not related to differences in the dependent variable. The authors conclude that a PLS model must be interpreted and fully understood before the results of the PLS analysis can be relied upon. Furthermore, they state that, when a dataset is gathered over time, the only reliable PLS validation is one based on splitting the dataset into a calibration and a validation set by measurement day (MD).
ISSN: 0967-0335
Publication status: submitted
KU Leuven publication type: IT
Appears in Collections: Division of Mechatronics, Biostatistics and Sensors (MeBioS)
× corresponding author
# (joint) last author
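
Illustration: the contrast the abstract draws between RSS and MD validation can be sketched on toy data. The sketch below is not the authors' code or data, and rests on several assumptions. All numbers are invented. A single value per sample stands in for a spectrum. A 1-nearest-neighbour predictor stands in for PLS, so that only the Python standard library is needed. RPD is taken in its commonly used form, the standard deviation of the reference values divided by the root mean square error of prediction. The names `rpd`, `predict_1nn` and `validate` are hypothetical helpers.

```python
import math
import random
import statistics


def rpd(y_true, y_pred):
    """RPD as assumed here: SD of the reference values divided by RMSEP."""
    rmsep = math.sqrt(
        sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)
    )
    return statistics.stdev(y_true) / rmsep


def predict_1nn(cal_x, cal_y, x_new):
    """Stand-in for PLS: return y of the calibration sample closest to x_new."""
    nearest = min(range(len(cal_x)), key=lambda i: abs(cal_x[i] - x_new))
    return cal_y[nearest]


def validate(x, y, cal_idx, val_idx):
    """Calibrate on cal_idx, predict val_idx, and return the RPD."""
    cal_x = [x[i] for i in cal_idx]
    cal_y = [y[i] for i in cal_idx]
    preds = [predict_1nn(cal_x, cal_y, x[i]) for i in val_idx]
    return rpd([y[i] for i in val_idx], preds)


# Toy data (invented): the "spectrum" x carries only a day-specific offset,
# and y differs between days for reasons unrelated to x within a day.
rng = random.Random(0)
day_offset = [0.0, 10.0, 20.0, 30.0, 40.0, 50.0]
day_mean_y = [5.0, 1.0, 8.0, 2.0, 7.0, 9.0]
days, x, y = [], [], []
for d in range(6):
    for _ in range(10):
        days.append(d)
        x.append(day_offset[d] + rng.uniform(-0.1, 0.1))
        y.append(day_mean_y[d] + rng.uniform(-0.1, 0.1))

# RSS: random subdivision; every day ends up in both sets.
idx = list(range(len(x)))
rng.shuffle(idx)
n_val = len(idx) // 4
rpd_rss = validate(x, y, idx[n_val:], idx[:n_val])

# MD: whole measurement days are held out for validation.
held_out_days = {4, 5}
cal_idx = [i for i in range(len(x)) if days[i] not in held_out_days]
val_idx = [i for i in range(len(x)) if days[i] in held_out_days]
rpd_md = validate(x, y, cal_idx, val_idx)

print(f"RSS validation RPD: {rpd_rss:.2f}")
print(f"MD validation RPD:  {rpd_md:.2f}")
```

In this toy setting the random split rewards the predictor for recognising the measurement day, so its RPD is high, while holding out whole days removes that shortcut and the RPD falls below 1. With a real PLS model the same idea applies: the calibration and validation sets should share no measurement day.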

All items in Lirias are protected by copyright, with all rights reserved.