On estimating model accuracy with repeated cross-validation

Vanwinckelen, Gitte; Blockeel, Hendrik; De Baets, Bernard; Manderick, Bernard; Rademaker, Michaël; Waegeman, Willem

BeneLearn 2012: Proceedings of the 21st Belgian-Dutch Conference on Machine Learning

Keywords:

repeated cross-validation, predictive model evaluation, conditional prediction error

Abstract:

Evaluation of predictive models is a ubiquitous task in machine learning and data mining. Cross-validation is often used as a means for evaluating models. There appears to be some confusion among researchers, however, about best practices for cross-validation, and about the interpretation of cross-validation results. In particular, repeated cross-validation is often advocated, and so is the reporting of standard deviations, confidence intervals, or an indication of "significance". In this paper, we argue that, under many practical circumstances, when the goal of the experiments is to see how well the model returned by a learner will perform in practice in a particular domain, repeated cross-validation is not useful, and the reporting of confidence intervals or significance is misleading. Our arguments are supported by experimental results.
