Title: On estimating model accuracy with repeated cross-validation
Authors: Vanwinckelen, Gitte ×
Blockeel, Hendrik #
Issue Date: 2012
Host Document: BeneLearn 2012: Proceedings of the 21st Belgian-Dutch Conference on Machine Learning pages:39-44
Conference: Belgian-Dutch Conference on Machine Learning (BeneLearn) edition:21 location:Ghent date:24-25 May 2012
Abstract: Evaluation of predictive models is a ubiquitous task in machine learning and data mining. Cross-validation is often used as a means for evaluating models. There appears to be some confusion among researchers, however, about best practices for cross-validation, and about the interpretation of cross-validation results. In particular, repeated cross-validation is often advocated, and so is the reporting of standard deviations, confidence intervals, or an indication of "significance". In this paper, we argue that, under many practical circumstances, when the goal of the experiments is to see how well the model returned by a learner will perform in practice in a particular domain, repeated cross-validation is not useful, and the reporting of confidence intervals or significance is misleading. Our arguments are supported by experimental results.
ISBN: 978-94-6197-044-2
Publication status: published
KU Leuven publication type: IC
Appears in Collections: Informatics Section
× corresponding author
# (joint) last author

Files in This Item:
File: OnEstimatingModelAccuracy.pdf
Status: Published
Size: 242Kb
Format: Adobe PDF

