Title: The importance of choosing the right validation strategy in inverse modelling
Authors: Kemps, Bart ×
Saeys, Wouter
Mertens, Kristof
Darius, Paul
De Baerdemaeker, Josse
De Ketelaere, Bart #
Issue Date: 2010
Publisher: NIR Publications
Series Title: Journal of Near Infrared Spectroscopy vol:18 issue:4 pages:231-237
Abstract: Inverse modelling techniques, such as principal component regression, partial least squares regression and support vector machines, are powerful multivariate calibration strategies that are widely used in near infrared spectroscopy. However, these techniques are so efficient at finding correlations between the spectral variables and the parameter to be predicted that great care should be taken to avoid over-optimistic results by using a proper validation strategy. In this study, different validation strategies were investigated on a dataset that was acquired over several measurement days. The goal was to predict albumen freshness from spectral measurements. Validation procedures frequently applied in practice, i.e. 10-fold cross-validation (10-fold CV) and validation based on random subdivision into a calibration set and a validation set (RS), were compared to cross-validation across measurement days (MD). Whereas 10-fold CV and RS validation suggested that prediction of albumen freshness is possible, MD validation on the same dataset indicated that albumen freshness cannot be predicted from the spectral measurements. It is shown that inverse modelling is very sensitive to unspecific correlations between the spectral measurements and the dependent variable, which might be artifacts of the measurement protocol and will not persist in the future. Therefore, selecting the right validation strategy for a given application and critically evaluating the obtained results are crucial steps in inverse modelling to obtain useful calibration models. More specifically, in the context of process analytical technology, where spectra are acquired over time, great care should be taken to break the unspecific correlation between the dependent variable and the variations in the spectral measurements over time.
ISSN: 0967-0335
Publication status: published
KU Leuven publication type: IT
Appears in Collections: Division of Mechatronics, Biostatistics and Sensors (MeBioS)
× corresponding author
# (joint) last author
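
The contrast the abstract draws between random cross-validation and cross-validation across measurement days can be illustrated with a small synthetic sketch. Everything in it is an assumption made for illustration, not the authors' data or method: the three-channel "spectra" carry only a day-specific offset, the per-day freshness values are invented, and a simple k-nearest-neighbour regressor stands in for PLS or support vector regression. Only the Python standard library is used.

```python
import random

# Hypothetical per-day mean of the target (e.g. a freshness score); invented values.
DAY_FRESHNESS = [0.0, 6.0, 1.0, 7.0, 2.0, 8.0]


def make_data(n_per_day=20, seed=0):
    """Synthetic data in which the 'spectrum' encodes only the measurement day."""
    rng = random.Random(seed)
    X, y, days = [], [], []
    for d, mean in enumerate(DAY_FRESHNESS):
        for _ in range(n_per_day):
            # Day-specific instrument offset plus noise on 3 hypothetical channels.
            X.append([d + rng.gauss(0.0, 0.1) for _ in range(3)])
            # The target depends on the day, not on the spectrum itself.
            y.append(mean + rng.gauss(0.0, 0.5))
            days.append(d)
    return X, y, days


def knn_predict(X_train, y_train, x, k=3):
    """Average the targets of the k training samples closest to x."""
    dists = sorted(
        (sum((a - b) ** 2 for a, b in zip(xt, x)), yt)
        for xt, yt in zip(X_train, y_train)
    )
    return sum(yt for _, yt in dists[:k]) / k


def cv_rmse(X, y, fold_of):
    """Cross-validated RMSE; fold_of[i] is the fold label of sample i."""
    sq_err = 0.0
    for fold in sorted(set(fold_of)):
        train = [i for i, f in enumerate(fold_of) if f != fold]
        test = [i for i, f in enumerate(fold_of) if f == fold]
        X_tr = [X[i] for i in train]
        y_tr = [y[i] for i in train]
        for i in test:
            sq_err += (knn_predict(X_tr, y_tr, X[i]) - y[i]) ** 2
    return (sq_err / len(y)) ** 0.5


X, y, days = make_data()

# Random 10-fold CV: every measurement day appears in both training and test folds.
order = list(range(len(y)))
random.Random(1).shuffle(order)
random_folds = [0] * len(y)
for position, i in enumerate(order):
    random_folds[i] = position % 10

rmse_random = cv_rmse(X, y, random_folds)
# Cross-validation across measurement days: each fold is one complete day.
rmse_by_day = cv_rmse(X, y, days)

print(f"random 10-fold CV RMSE: {rmse_random:.2f}")
print(f"leave-one-day-out RMSE: {rmse_by_day:.2f}")
```

In this toy setting the random split looks accurate only because the model recognises the measurement day from the offset, while holding out whole days removes that shortcut and the error rises sharply. The sketch shows the mechanism only and says nothing about the magnitudes reported in the paper.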

Files in This Item:
File          Description  Status     Size    Format
J18_0231.pdf               Published  309 Kb  Adobe PDF

These files are only available to some KU Leuven Association staff members


