Title: Does ignoring clustering in multicenter data influence the performance of prediction models? A simulation study
Authors: Wynants, Laure ×
Vergouwe, Y
Van Huffel, Sabine
Timmerman, Dirk
Van Calster, Ben #
Issue Date: Sep-2016
Publisher: Edward Arnold
Series Title: Statistical Methods in Medical Research
Pages: 1-14
Article number: 0962280216668555
Abstract: Clinical risk prediction models are increasingly being developed and validated on multicenter datasets. In this article, we present a comprehensive framework for evaluating the predictive performance of prediction models at the center level and the population level, considering population-averaged predictions, center-specific predictions, and predictions assuming an average random center effect. We demonstrate in a simulation study that calibration slopes deviate from one not only because of over- or underfitting of patterns in the development dataset, but also as a result of the choice of model (standard versus mixed effects logistic regression), the type of predictions (marginal versus conditional versus assuming an average random effect), and the level of model validation (center versus population). In particular, when data are heavily clustered (intraclass correlation coefficient, ICC, of 20%), center-specific predictions offer the best predictive performance at both the population level and the center level. We recommend that models reflect the data structure, while the level of model validation should reflect the research question.
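The dependence of the calibration slope on the type of prediction can be illustrated with a minimal sketch. It is not the authors' simulation design: the coefficients `alpha` and `beta`, the number and size of centers, the single predictor, and the latent-variable definition of the ICC are all assumptions made for illustration. To isolate the effect of the prediction type from over- or underfitting, the sketch uses the true data-generating coefficients instead of fitting a model, and compares center-specific linear predictors with linear predictors that assume an average random center effect, both validated at the population level.

```python
import numpy as np

def fit_logistic(X, y, n_iter=25):
    """Logistic regression by Newton-Raphson; X must contain an intercept column."""
    coef = np.zeros(X.shape[1])
    for _ in range(n_iter):
        p = 1.0 / (1.0 + np.exp(-X @ coef))
        w = p * (1.0 - p)
        grad = X.T @ (y - p)
        hess = X.T @ (X * w[:, None])
        coef = coef + np.linalg.solve(hess, grad)
    return coef

def calibration_slope(lp, y):
    """Slope b in logit P(y = 1) = a + b * lp, where lp is the linear predictor."""
    X = np.column_stack([np.ones_like(lp), lp])
    return fit_logistic(X, y)[1]

rng = np.random.default_rng(0)
n_centers, n_per_center = 200, 500            # hypothetical design
icc = 0.20                                    # heavy clustering, as in the abstract
# Assumption: latent-variable ICC, sigma2_u / (sigma2_u + pi^2 / 3)
sigma2_u = icc / (1.0 - icc) * np.pi**2 / 3.0
alpha, beta = -1.0, 1.0                       # hypothetical true conditional coefficients

center = np.repeat(np.arange(n_centers), n_per_center)
u = rng.normal(0.0, np.sqrt(sigma2_u), n_centers)   # random center intercepts
x = rng.normal(0.0, 1.0, center.size)               # one standardized predictor

lp_center = alpha + beta * x + u[center]      # center-specific (conditional) predictions
lp_avg = alpha + beta * x                     # assuming an average random effect (u = 0)
y = rng.binomial(1, 1.0 / (1.0 + np.exp(-lp_center))).astype(float)

# Population-level validation: all centers pooled.
slope_center = calibration_slope(lp_center, y)
slope_avg = calibration_slope(lp_avg, y)
print(f"center-specific: {slope_center:.2f}, average random effect: {slope_avg:.2f}")
```

Under these assumptions, the center-specific predictions should have a population-level calibration slope close to one, while the predictions that assume an average random effect should have a slope below one even though no overfitting is involved, because conditional coefficients are too extreme for a pooled population.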
ISSN: 0962-2802
Publication status: accepted
KU Leuven publication type: IT
Appears in Collections:ESAT - STADIUS, Stadius Centre for Dynamical Systems, Signal Processing and Data Analytics
Organ Systems (+)
× corresponding author
# (joint) last author

Files in This Item:
File: smmr 2016.pdf
Status: Accepted
Size: 627Kb
Format: Adobe PDF

These files are only available to some KU Leuven Association staff members

