Title: Clinical Risk Prediction Models based on Multicenter Data: Methods for Model Development and Validation
Other Titles: Klinische predictiemodellen op basis van multicentrische studies: methoden voor geclusterde data (Clinical prediction models based on multicenter studies: methods for clustered data)
Authors: Wynants, Laure
Issue Date: 1-Dec-2016
Abstract: Risk prediction models are developed to assist doctors in diagnosing patients, making decisions, counseling patients or providing a prognosis. To enhance the generalizability of risk models, researchers increasingly collect patient data in different settings and join forces in multicenter collaborations. The resulting datasets are clustered: patients from one center may be more similar to each other than to patients from other centers, for example, due to regional population differences or local referral patterns. Consequently, the assumption of independent observations, which underlies the statistical techniques most often used to analyze such data (e.g., logistic regression), does not hold. This violation is ignored in much of the current clinical prediction research. Research that relies on faulty assumptions may yield misleading results and lead to suboptimal improvements in patient care.
To address this issue, I investigated the consequences of wrongly assuming independence and studied alternative techniques that acknowledge clustering throughout the process of planning a study, building a model and validating models in new data. I used mixed and random effects methods throughout the research because they allow differences between centers to be modeled explicitly, and I evaluated the proposed solutions with simulations and real clinical data. This dissertation covers sample size requirements, data collection and predictor selection, model fitting, and the validation of risk models in new data, focusing mainly on diagnostic models. The main case study is the development and validation of models for the pre-operative diagnosis of ovarian cancer, for which the multicenter dataset collected by the International Ovarian Tumor Analysis (IOTA) consortium is used.
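As an illustration of the kind of clustered data described above, the following is a minimal sketch of a random-intercept logistic model, assuming the common latent-scale definition of the intraclass correlation, ICC = σ² / (σ² + π²/3), where σ² is the variance of the center-specific intercepts. It is not the dissertation's own simulation code (which, according to the appendices, is written in R); the function names, the single standardized predictor and all parameter values are hypothetical.

```python
import math
import random

# Variance of the standard logistic distribution (residual variance on the logit scale).
LOGISTIC_RESIDUAL_VAR = math.pi ** 2 / 3


def latent_icc(intercept_var: float) -> float:
    """ICC on the latent (logit) scale for a random-intercept logistic model."""
    return intercept_var / (intercept_var + LOGISTIC_RESIDUAL_VAR)


def intercept_var_for_icc(icc: float) -> float:
    """Random-intercept variance that yields the requested latent-scale ICC."""
    return icc * LOGISTIC_RESIDUAL_VAR / (1.0 - icc)


def simulate_centers(n_centers, n_per_center, intercept, slope, intercept_var, seed=0):
    """Return (center, x, y) rows drawn from a random-intercept logistic model."""
    rng = random.Random(seed)
    rows = []
    for center in range(n_centers):
        # Center-specific deviation from the overall intercept.
        u = rng.gauss(0.0, math.sqrt(intercept_var))
        for _ in range(n_per_center):
            x = rng.gauss(0.0, 1.0)  # hypothetical standardized predictor
            p = 1.0 / (1.0 + math.exp(-(intercept + u + slope * x)))
            y = 1 if rng.random() < p else 0
            rows.append((center, x, y))
    return rows


def event_rate_by_center(rows):
    """Observed event rate per center; clustering shows up as spread in these rates."""
    totals = {}
    for center, _, y in rows:
        events, n = totals.get(center, (0, 0))
        totals[center] = (events + y, n + 1)
    return {center: events / n for center, (events, n) in totals.items()}


# Hypothetical setting: 10 centers of 200 patients each, latent-scale ICC of 20%.
var_u = intercept_var_for_icc(0.20)
example_rows = simulate_centers(10, 200, intercept=-1.0, slope=0.8, intercept_var=var_u, seed=1)
for center, rate in sorted(event_rate_by_center(example_rows).items()):
    print(f"center {center}: event rate {rate:.2f}")
```

Running the sketch with a larger intercept variance widens the spread of center-specific event rates, which is the dependence between patients of the same center that standard logistic regression does not account for.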
The results suggested that mixed effects logistic regression models offer center-specific predictions that perform better in new patients than the predictions from standard logistic regression models. Although simulations showed that models were severely overfitted with only five events per variable, mixed effects models did not require more demanding sample size guidelines than standard logistic regression models. A case study on predictors of ovarian malignancy demonstrated that in multicenter data, measurements may vary systematically from one center to another, indicating potential threats to generalizability. Such predictors could be detected using the residual intraclass correlation coefficient and may be excluded from risk models. In addition, a case study showed that, if statistical variable selection is used, mixed effects models are required in every step of the selection procedure to prevent incorrect inferences. Finally, case studies on risk models for ovarian cancer demonstrated that the predictive performance of risk models varied considerably between centers. This variation could be detected using meta-analytic models to analyze discrimination, calibration and clinical utility.
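To make the idea of meta-analyzing center-specific performance concrete, here is a minimal sketch under stated assumptions. It uses the standard net benefit formula, NB = TP/n − (FP/n) · t/(1 − t) at risk threshold t, and the generic DerSimonian-Laird random effects estimator. The abstract does not say which estimator or weights the dissertation uses, so this is a stand-in rather than the author's method, and the input numbers are toy values, not results from the IOTA data.

```python
import math


def net_benefit(true_pos, false_pos, n, threshold):
    """Net benefit at a risk threshold: TP/n - (FP/n) * threshold / (1 - threshold)."""
    return true_pos / n - (false_pos / n) * threshold / (1.0 - threshold)


def dersimonian_laird(estimates, variances):
    """Random effects pooling of center-specific estimates (DerSimonian-Laird).

    Returns (pooled estimate, standard error, between-center variance tau^2).
    """
    k = len(estimates)
    w = [1.0 / v for v in variances]  # inverse-variance (fixed effect) weights
    sum_w = sum(w)
    fixed = sum(wi * yi for wi, yi in zip(w, estimates)) / sum_w
    # Cochran's Q: weighted squared deviations from the fixed effect estimate.
    q = sum(wi * (yi - fixed) ** 2 for wi, yi in zip(w, estimates))
    c = sum_w - sum(wi ** 2 for wi in w) / sum_w
    # Method-of-moments estimate of the between-center variance, truncated at zero.
    tau2 = max(0.0, (q - (k - 1)) / c) if c > 0 else 0.0
    w_star = [1.0 / (v + tau2) for v in variances]  # random effects weights
    pooled = sum(wi * yi for wi, yi in zip(w_star, estimates)) / sum(w_star)
    se = math.sqrt(1.0 / sum(w_star))
    return pooled, se, tau2


# Toy example: hypothetical logit-transformed c-statistics from four centers,
# each with a hypothetical within-center variance.
logit_c = [2.2, 1.6, 2.9, 1.9]
within_var = [0.04, 0.09, 0.16, 0.05]
pooled, se, tau2 = dersimonian_laird(logit_c, within_var)
pooled_c = 1.0 / (1.0 + math.exp(-pooled))  # back-transform to the c-statistic scale
print(f"pooled c-statistic {pooled_c:.3f}, between-center variance {tau2:.3f}")
```

A between-center variance estimate well above zero indicates that performance differs between centers by more than sampling error alone would explain, which is the kind of heterogeneity the paragraph above refers to.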
In conclusion, taking into account differences between centers during the planning of prediction research, the development of a model and the validation of risk predictions in new patients offers insight into heterogeneity between centers and yields better predictions in local settings. Many methodological challenges remain, among them the inclusion of predictor-by-center interactions, the optimal application of mixed effects models in new centers, and the refinement of techniques to summarize clinical utility in multicenter data. Nonetheless, the findings in this dissertation imply that current clinical prediction research would benefit from adopting mixed and random effects techniques to make full use of the information that is available in multicenter data.
Table of Contents: Acknowledgments
Nederlandse samenvatting (Dutch summary)
Abstract
Nomenclature
Table of contents
List of figures
List of tables
1 General introduction
1.1 Outline of the thesis
1.2 Intended audience
2 Development and validation of risk models
2.1 Methods to develop and validate risk models
2.1.1 What is a risk model?
2.1.2 Formulating the research question
2.1.3 Study design and setup
2.1.4 Modeling strategy
2.1.5 Fitting the model
2.1.6 Validation of model performance
2.1.7 Reporting
2.1.8 Impact studies
2.1.9 Model implementation
2.1.10 Conclusion
2.2 Predicting successful vaginal birth after a Cesarean section
2.2.1 The clinical need for a new VBAC model
2.2.2 Subjects and methods
2.2.3 Results
2.2.4 Discussion
3 Statistical methods for multicenter data
3.1 Issues and opportunities of clustered data
3.2 Methods for clustered data
3.2.1 Ignoring the clustered data structure
3.2.2 Center-specific models
3.2.3 Correcting for clustering
3.2.4 Combining within-cluster results
3.2.5 Discussion
3.3 Multicenter data from the International Ovarian Tumor Analysis group
3.3.1 Background
3.3.2 The IOTA dataset
3.3.3 IOTA models and classification rules
3.3.4 The Simple Rules risk scoring system
3.4 Conclusion
4 Sample size for multicenter studies
4.1 Background
4.2 Design of the simulation study
4.2.1 The source populations
4.2.2 Sampling
4.2.3 Model building
4.2.4 Model evaluation
4.3 Results of the simulation study
4.3.1 Data clustering and the number of events per variable
4.3.2 Variable selection
4.3.3 Sample size
4.3.4 Random cluster effects correlated with predictors
4.4 Empirical example
4.5 Discussion
4.6 Conclusion
5 Predictor selection for multicenter studies
5.1 Screening for data clustering in multicenter studies: the residual intraclass correlation
5.1.1 Background
5.1.2 Methods
5.1.3 Results
5.1.4 Discussion
5.2 Statistical variable selection in clustered data
5.2.1 Background
5.2.2 Methods
5.2.3 Results
5.2.4 Discussion
5.3 Conclusion
6 Performance evaluation in multicenter studies
6.1 Heterogeneity in predictive performance
6.2 Performance measures for multicenter validation
6.2.1 Sensitivity and specificity
6.2.2 The c-statistic
6.2.3 Calibration
6.2.4 Net benefit
6.2.5 Explaining heterogeneity
6.2.6 Leave-one-center-out cross-validation
6.3 Some examples
6.3.1 The validation of IOTA strategies on phase III data: a meta-analysis of discrimination and calibration
6.3.2 The validation of the IOTA Simple Rules risks scoring system on phase III data: a meta-analysis of discrimination and a graphical assessment of calibration in specialized oncology centers and other centers
6.3.3 The validation of IOTA models and RMI in the hands of users with varied training on phase IVb data: a meta-regression of test accuracy
6.3.4 The validation of the clinical utility of models on phase III data: net benefit in specialized oncology centers and other centers
6.4 A meta-analysis of net benefit
6.4.1 Various fixed and random effects weights
6.4.2 Random effects meta-analysis of the net benefit: an example
6.4.3 Future research: a Bayesian approach
6.5 Conclusion
7 Does ignoring clustering in multicenter data influence the predictive performance of risk models? A simulation study
7.1 Introduction
7.2 A framework of performance evaluation of risk models in clustered data
7.3 Calibration slopes for marginal and center-specific logistic regression models
7.4 Simulation study
7.4.1 Design
7.4.2 Results
7.5 Empirical example
7.6 Discussion
7.7 Conclusion
8 General discussion
8.1 Implications and recommendations
8.2 Future research
Appendices
A1 Multiple imputation in the IOTA dataset
A2 Technical overview of the design of the EPV simulation study
A3 Examples of R code for the EPV simulation study
A3.1 Generation of source populations
A3.2 Sampling from the source population and model building within samples
A4 Additional results from the EPV simulation study
A4.1 Bias in the estimated regression coefficients
A4.2 Standard (population-level or “overall”) c-statistics and calibration slopes
A4.3 Bias in the estimated random intercept variance
A5 SAS macro to estimate the residual intraclass correlation
A6 Difference in net sensitivity and net amount of avoided false positives per 100 patients
A7 Center-specific case-mix and population models to generate a heterogeneous multicenter dataset
A8 Observed differences in center-specific decision curves in simulated data
A9 Results for meta-analyses of center-specific net benefit using various weights
A10 Additional results of the random effects meta-analysis of NB in IOTA data
A11 Formulas for the c-statistic and logistic calibration in a comprehensive framework
A12 Calibration with a biased estimate of the between-center variance
A13 Correspondence between predictions from the standard logistic regression model and marginalized predictions from the mixed effects logistic regression model
A14 R code for the simulation study of the impact of ignoring clustering on predictive performance
A15 Additional results for the simulation study of the impact of ignoring clustering on predictive performance
A15.1 Detailed calibration results of simulations with EPV=100 and ICC=20%
A15.2 Relation between the estimated random intercept variance and the calibration slope
A15.3 Calibration intercepts and c-statistics obtained with development samples with EPV 100 in a population with ICC=5%
A15.4 Calibration intercepts and c-statistics obtained with development samples with EPV 5 in a population with ICC=20%
References
Curriculum vitae
List of publications
Papers in international journals
Letters and replies in international journals
Conference abstracts in international journals
Unpublished conference contributions
Invited talks
Publication status: published
KU Leuven publication type: TH
Appears in Collections: Screening, Diagnostics and Biomarkers (-)
Section Woman - Miscellaneous (-)
Organ Systems (+)
Electrical Engineering - miscellaneous
ESAT - STADIUS, Stadius Centre for Dynamical Systems, Signal Processing and Data Analytics

Files in This Item:
File: PhD Dissertation Laure Wynants Clinical Risk Prediction Models Based On Multicenter Data.pdf
Status: Published
Size: 5833Kb
Format: Adobe PDF

These files are only available to some KU Leuven Association staff members


All items in Lirias are protected by copyright, with all rights reserved.