3 March 2021
Jose Luis Raya Povedano et al
Replacing double reading in mammography screening with single reading and artificial intelligence: a large retrospective screening evaluation
Jose Luis Raya Povedano, Sara Romero Martín, Esperanza Elías Cabot, Albert Gubern-Merida, Alejandro Rodríguez-Ruiz, Marina Álvarez Benito
To investigate the impact of replacing the double human reading of mammograms (DM) with a single human reading and a standalone artificial intelligence (AI) system in a breast cancer screening program.
A consecutive cohort of 15987 DM screening exams (with 113 cancers, including 15 interval cancers) was retrospectively collected from a previous trial (Tomosynthesis Cordoba Screening Trial) that compared DM and breast tomosynthesis in a paired cohort. Each DM screening exam was independently double read by radiologists, without consensus.
All DM exams were processed by an AI system (Transpara, ScreenPoint Medical), which categorizes each exam on a scale of 1 to 10 representing the risk that it contains visible abnormalities.
An AI-based screening scenario was simulated by replacing the original double reading of DM with a single human reading of DM plus the AI system acting as an independent reader (operating at the same specificity).
Sensitivity and recall rate were compared between the original and the AI-based scenario using a McNemar test.
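As an illustration of the paired comparison, the following minimal sketch implements an exact (binomial) McNemar test on discordant pairs. The abstract does not state whether an exact or chi-square variant was used, so the exact form is an assumption, and the function name `mcnemar_exact` and the toy counts are hypothetical, not study data.

```python
from math import comb


def mcnemar_exact(b: int, c: int) -> float:
    """Two-sided exact McNemar p-value.

    b: cases positive under scenario A only (discordant pairs)
    c: cases positive under scenario B only (discordant pairs)
    Concordant pairs do not enter the test.
    """
    n = b + c
    if n == 0:
        return 1.0  # no discordant pairs: no evidence of a difference
    k = min(b, c)
    # Under H0 each discordant pair falls on either side with probability 0.5,
    # so the smaller count follows Binomial(n, 0.5).
    tail = sum(comb(n, i) for i in range(k + 1)) / 2 ** n
    return min(1.0, 2 * tail)


# Toy counts for illustration only (not taken from the study).
p_toy = mcnemar_exact(5, 0)
```

Only the discordant cases matter here: exams recalled (or cancers detected) in one scenario but not the other.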
During the original double human reading with DM, 805 women were recalled (5.04%, 95% CI: 4.7-5.4) and 74 cancers were detected (sensitivity 66.7%, 95% CI: 57.5-74.7). In the single human reading + AI scenario, 734 women would have been recalled (4.6%, 95% CI: 4.3-4.9; relative reduction of 8.8%, P=0.002) and 79 cancers would have been detected (sensitivity 71.2%, 95% CI: 62.1-78.8; relative improvement of 6.7%, P=0.26).
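The reported rates can be rechecked from the raw counts with a short sketch. Two assumptions are made here that the abstract does not state: the sensitivity denominator is taken as 111, since 74/111 and 79/111 give the reported 66.7% and 71.2%, and the confidence intervals are computed as Wilson score intervals, which agree with the reported intervals to the stated precision. The helper name `wilson_ci` is hypothetical.

```python
from math import sqrt


def wilson_ci(successes: int, total: int, z: float = 1.96):
    """Wilson score interval for a binomial proportion (95% by default)."""
    p = successes / total
    denom = 1 + z ** 2 / total
    centre = (p + z ** 2 / (2 * total)) / denom
    half = z * sqrt(p * (1 - p) / total + z ** 2 / (4 * total ** 2)) / denom
    return centre - half, centre + half


N_EXAMS = 15987
N_CANCERS = 111  # assumption: denominator implied by the reported sensitivities

recall_double = 805 / N_EXAMS   # about 0.0504
recall_ai = 734 / N_EXAMS       # about 0.046
sens_double = 74 / N_CANCERS    # about 0.667
sens_ai = 79 / N_CANCERS        # about 0.712

# Relative changes between the two scenarios.
rel_recall_reduction = (805 - 734) / 805  # about 0.088
rel_sens_gain = (79 - 74) / 74            # about 0.068 from raw counts (abstract reports 6.7%)

ci_recall_double = wilson_ci(805, N_EXAMS)
ci_recall_ai = wilson_ci(734, N_EXAMS)
ci_sens_double = wilson_ci(74, N_CANCERS)
ci_sens_ai = wilson_ci(79, N_CANCERS)
```

The P values in the text come from the paired McNemar test and cannot be derived from these marginal counts alone.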
AI could be used in screening programs as an independent reader. Replacing one screening reader with AI could halve radiologists' workload while potentially increasing sensitivity and reducing the recall rate.
Limitations: Evaluation on more mammograms, acquired with equipment from different vendors and processed with other AI algorithms, is needed to understand the generalizability of our results.