The area under the ROC curve may be a biased performance measure for meta-analysis of diagnostic accuracy studies. A simulation study

ID: 18621

Session: Short oral session 2: Considerations for meta-analyses

Date: Wednesday 13 September 2017 - 11:00 to 12:30

Location: 

All authors in correct order:

Wang J1, Leeflang M1
1 Academic Medical Center, University of Amsterdam, The Netherlands
Presenting author and contact person

Presenting author:

Mariska Leeflang

Contact person:

Abstract text
Background: In systematic reviews of diagnostic accuracy studies, summary estimates of sensitivity and specificity and the summary ROC curve are the preferred test-performance measures. In some recent systematic reviews, the area under the summary ROC curve (AUSROC) is also reported as an overall performance measure.

Methods: We investigated the performance of AUSROC estimates based on simulated test results in primary studies and 2-by-2 tables derived at different thresholds. The area under the ROC curve (AUC) was estimated in several ways: a summary AUC from the HSROC/bivariate model; a summary AUC from a meta-analysis of the AUCs reported by the primary studies; and an overall AUC from an individual participant data (IPD) meta-analysis. Four scenarios were considered, with the true AUC fixed at 0.64, 0.76, 0.81 and 0.91, respectively. The true AUC was calculated parametrically from the known distribution (mean and SD) of the test results. Performance of the estimates was assessed by bias and root-mean-square error (RMSE).
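As an illustration, a minimal sketch of how a "true" AUC can be obtained parametrically under a binormal model is given below; the parameter values and function name are illustrative assumptions, not the settings used in the simulation.

```python
# Minimal sketch (Python): closed-form "true" AUC under a binormal model,
# where non-diseased results ~ N(mu0, sd0) and diseased results ~ N(mu1, sd1).
# Parameter values below are illustrative only, not the simulation settings.
from scipy.stats import norm

def binormal_auc(mu0: float, sd0: float, mu1: float, sd1: float) -> float:
    """AUC = Phi((mu1 - mu0) / sqrt(sd0^2 + sd1^2))."""
    return norm.cdf((mu1 - mu0) / (sd0**2 + sd1**2) ** 0.5)

# Example: a one-SD separation between the two groups gives an AUC of about 0.76.
print(round(binormal_auc(0.0, 1.0, 1.0, 1.0), 2))
```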

Results: In all four scenarios, the bivariate model underestimated the AUC when the pre-defined threshold was used and overestimated it when the optimal threshold was used; both approaches resulted in a high RMSE. Meta-analysis of AUCs, whether based on the empirical estimates or on the distribution of the test results, performed fairly well. The AUC calculated from pooled IPD was not superior to the meta-analysis of AUCs, but it was more accurate than the AUC estimated from the bivariate model. When the number of primary studies included in the meta-analysis increased from 5 to 20, all approaches returned a lower RMSE.

Conclusions: This simulation study provides empirical evidence that the AUSROC does not accurately estimate the performance of a test in a meta-analysis and should therefore not be reported as an overall accuracy measure. Directly meta-analysing the AUCs and their SEs reported in the primary studies gives a better summary estimate of the AUC. In those cases where the AUC is a relevant measure of test accuracy, hierarchical models may thus not be the most accurate way to estimate it.
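A minimal sketch of what direct meta-analysis of reported AUCs could look like is shown below, assuming fixed-effect inverse-variance pooling of study-level AUC estimates and their standard errors; the function and example values are illustrative, and the abstract does not specify whether a fixed- or random-effects model was used.

```python
# Minimal sketch (Python): fixed-effect inverse-variance pooling of study-level
# AUCs and their standard errors. Illustrative only.
def pool_auc(aucs, ses):
    weights = [1.0 / se**2 for se in ses]           # inverse-variance weights
    pooled = sum(w * a for w, a in zip(weights, aucs)) / sum(weights)
    pooled_se = (1.0 / sum(weights)) ** 0.5         # SE of the pooled estimate
    return pooled, pooled_se

# Example with made-up study-level estimates:
print(pool_auc([0.78, 0.74, 0.80], [0.04, 0.05, 0.03]))
```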
