Background: It is widely recommended that any developed prediction model, whether diagnostic or prognostic, should be externally validated across different settings and populations. When multiple validations have been performed, a systematic review followed by a formal meta-analysis can help to establish whether, and under what circumstances, the model performs accurately or requires further improvement.
Objectives: To discuss methods for summarising the performance of prediction models with both binary and time-to-event outcomes.
Methods: We present statistical methods for dealing with incomplete reporting of performance and precision estimates, and for obtaining time-specific summary estimates of the c-statistic, the calibration-in-the-large and the calibration slope. In addition, we provide guidance on implementing a Bayesian estimation framework and discuss several empirically based prior distributions. All methods are illustrated in two example reviews in which we evaluate the predictive performance of EuroSCORE II and Framingham Wilson.
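As a minimal sketch of one common approach to summarising discrimination across validation studies, the snippet below pools logit-transformed c-statistics with a DerSimonian-Laird random-effects model; the study values and the function name are hypothetical illustrations, not results from the example reviews or the specific estimators developed in this work.

import numpy as np
from scipy.special import logit, expit

def pool_c_statistics(c, se_c):
    """Random-effects (DerSimonian-Laird) pooling of c-statistics on the logit scale."""
    c, se_c = np.asarray(c, float), np.asarray(se_c, float)
    y = logit(c)                          # logit-transformed c-statistics
    v = (se_c / (c * (1.0 - c))) ** 2     # delta-method variances on the logit scale
    w = 1.0 / v                           # inverse-variance (fixed-effect) weights
    mu_fe = np.sum(w * y) / np.sum(w)
    q = np.sum(w * (y - mu_fe) ** 2)      # Cochran's Q statistic
    k = len(y)
    tau2 = max(0.0, (q - (k - 1)) / (np.sum(w) - np.sum(w ** 2) / np.sum(w)))
    w_re = 1.0 / (v + tau2)               # random-effects weights
    mu = np.sum(w_re * y) / np.sum(w_re)
    se_mu = np.sqrt(1.0 / np.sum(w_re))
    # Back-transform the summary c-statistic and its 95% confidence interval
    return expit(mu), expit(mu - 1.96 * se_mu), expit(mu + 1.96 * se_mu), tau2

# Hypothetical validation results: reported c-statistics and their standard errors
c_hat = [0.78, 0.74, 0.81, 0.70]
se = [0.02, 0.03, 0.025, 0.04]
print(pool_c_statistics(c_hat, se))

The same inverse-variance machinery applies to the calibration-in-the-large and the calibration slope, with the transformation and variance expressions adapted to each measure.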