28 November 2020
C Balta et al
Can AI help to increase the positive predictive value of screen-recalled biopsies on calcifications?
C Balta, A Rodriguez-Ruiz, C Mieskes, N Karssemeijer, S H Heywang-Koebrunner
PURPOSE
To determine whether an artificial intelligence (AI) system can help reduce the number of non-cancer biopsies of suspicious calcifications seen on digital mammograms (DMs) in women recalled from screening.
METHOD AND MATERIALS
A consecutive sample of 17,886 screening DMs with 114 biopsy-proven cancers was retrospectively collected. Exams were acquired with Siemens and Hologic DM systems at a single institution that follows biennial screening, and all exams underwent independent double reading. Of the 960 recalled cases (recall rate: 5.4%), 333 were biopsied (biopsy rate: 1.8%), of which 30% were biopsied due to the presence of suspicious calcifications (n=108). An AI system automatically analysed each screening mammogram and assigned an “AI Exam-Score” ranging from 1 to 10. The higher the AI Exam-Score, the higher the likelihood that malignancy is present in the DM. The numbers of biopsies proven to contain malignant, high-risk, and benign calcifications were calculated.
Assuming that lesions in exams with AI Exam-Scores of 1-9 would not be biopsied, the positive predictive value of biopsies (PPV2) and the percentage of biopsies proven benign were determined, together with 95% confidence intervals (CI) for the binomial proportions (Wilson score interval).
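The Wilson score interval named above can be sketched in a few lines. This is a minimal illustration, not the authors' analysis code: the function name `wilson_interval` and the default two-sided 95% z-value are assumptions made for the example.

```python
import math


def wilson_interval(successes: int, n: int, z: float = 1.959964) -> tuple[float, float]:
    """Wilson score interval for a binomial proportion.

    `successes` and `n` are the event count and sample size; `z` is the
    standard-normal quantile (1.959964 gives a two-sided 95% interval).
    The function name and defaults are illustrative assumptions.
    """
    if n <= 0 or not 0 <= successes <= n:
        raise ValueError("need n > 0 and 0 <= successes <= n")
    p = successes / n
    denom = 1 + z * z / n
    # Centre of the interval is pulled from p towards 0.5.
    centre = (p + z * z / (2 * n)) / denom
    half_width = z * math.sqrt(p * (1 - p) / n + z * z / (4 * n * n)) / denom
    return centre - half_width, centre + half_width


# Example with counts reported in this abstract: 15 of 65 benign biopsies
# (3 + 2 + 7 + 3) came from exams with an Exam-Score lower than 10.
low, high = wilson_interval(15, 65)
print(f"{100 * 15 / 65:.1f}% (95% CI {100 * low:.1f}-{100 * high:.1f}%)")
```

Unlike the simpler normal-approximation (Wald) interval, the Wilson interval stays within 0-100% and behaves reasonably for the small counts involved here.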
RESULTS
Of the 108 biopsies performed for calcifications, 31 proved malignant (PPV2=28.7%), 65 benign, and 12 high-risk.
Of the 108 biopsies, 90 had an Exam-Score of 10, including 30 of the 31 malignant calcifications (97%). For exams with an Exam-Score of 10, the PPV2 of biopsies increased to 36.7% (a relative increase of 28%; 95% CI=[27.5-46.9]%).
In exams with an Exam-Score lower than 10, there was 1 malignant lesion, a low-grade (G1) DCIS with Exam-Score 8, and there were 2 high-risk lesions (lobular carcinoma in situ and papilloma) with Exam-Score 9.
Of the 65 biopsies of calcifications proven benign, 23.1% (95% CI=[14.5-34.6]%) had an AI Exam-Score lower than 10 (3 fibroadenomas, 2 adenoses, 7 cases of mastopathy, and 3 other benign lesions).
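The counts reported above can be tallied to show what a "biopsy only at Exam-Score 10" rule would imply. This is a rough sketch that uses only figures stated in this abstract; the `BiopsyCounts` class and the `pct` helper are hypothetical names introduced for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BiopsyCounts:
    """Biopsy outcomes for calcifications (illustrative container)."""
    malignant: int
    benign: int
    high_risk: int

    @property
    def total(self) -> int:
        return self.malignant + self.benign + self.high_risk


def pct(part: int, whole: int) -> float:
    """Percentage rounded to one decimal place."""
    return round(100 * part / whole, 1)


# All calcification biopsies reported in the abstract.
all_biopsies = BiopsyCounts(malignant=31, benign=65, high_risk=12)
# Biopsies from exams with an Exam-Score lower than 10:
# benign = 3 fibroadenomas + 2 adenoses + 7 mastopathy + 3 other.
below_10 = BiopsyCounts(malignant=1, benign=3 + 2 + 7 + 3, high_risk=2)

score_10_total = all_biopsies.total - below_10.total
baseline_ppv2 = pct(all_biopsies.malignant, all_biopsies.total)
malignant_kept = pct(all_biopsies.malignant - below_10.malignant, all_biopsies.malignant)
benign_avoided = pct(below_10.benign, all_biopsies.benign)

print(f"Biopsies at Exam-Score 10: {score_10_total}")
print(f"Baseline PPV2: {baseline_ppv2}%")
print(f"Malignant calcifications still biopsied: {malignant_kept}%")
print(f"Benign biopsies avoided: {benign_avoided}%")
```

Under this rule the 2 high-risk lesions with Exam-Score 9 would also go unbiopsied, which the tally makes explicit through `below_10.high_risk`.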
CONCLUSION
If only lesions in exams with an Exam-Score of 10 had been biopsied, only one low-grade DCIS would have been missed, while the number of benign biopsies would have been reduced by 23.1%.
AI has the potential to help reduce the number of benign biopsies resulting from screen-detected calcifications in mammography, thereby lowering the psychological burden and associated costs.