3 March 2023
J Waugh, et al
Performance comparison of artificial intelligence (AI) to double reading in the Australian BreastScreen Program
To evaluate the performance of Transpara using a ground-truth data set of consecutive prevalent-round digital mammograms with verified 3-year follow-up, in a low cancer prevalence (~1%) screening setting.
Methods and Materials
The Monash BreastScreen service uses independent double reading with arbitration by a single expert reader. Transpara (version 1.7) retrospectively analysed prevalent screens from 2017 and assigned each a score from 1 to 10 (deciles), with higher scores indicating a higher risk of malignancy. Interval cancers and next-round cancers were categorised at blinded consensus review. The findings were used to develop a proposed protocol for incorporating Transpara into standard practice.
Of the 7533 consecutive prevalent screens, 728 were recalled by the radiologists, and these recalls included 54 cases of invasive breast cancer (IBC). Fifty-two of the 54 IBCs were among the 1611 cases that Transpara scored 9 or 10. When the prevalent-round IBCs were combined with interval IBCs and subsequent-round IBCs judged at expert review to have had signs in 2017, the radiologists identified 54 of 67 IBCs and Transpara identified 62 of 67 (sensitivity 80.6% for radiologists vs 92.5% for Transpara, p = 0.07).
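The proportions quoted above can be recomputed directly from the reported counts. The following is a minimal sketch for checking the arithmetic, not the authors' analysis code: the helper `pct` and all variable names are illustrative assumptions, and the p-value is not reproduced because the abstract does not state which statistical test was used.

```python
def pct(numerator, denominator):
    """Return numerator/denominator as a percentage rounded to one decimal."""
    return round(100 * numerator / denominator, 1)

# Counts as reported in the abstract (variable names are hypothetical).
screens = 7533            # consecutive prevalent screens
recalled = 728            # screens recalled by the radiologists
scored_9_or_10 = 1611     # screens scored 9 or 10 by Transpara
prevalent_ibc = 54        # IBCs detected in the prevalent round
prevalent_ibc_high = 52   # of those, scored 9 or 10 by Transpara
all_ibc = 67              # prevalent + interval + subsequent-round IBCs with signs in 2017
found_radiologists = 54
found_transpara = 62

sens_radiologists = pct(found_radiologists, all_ibc)
sens_transpara = pct(found_transpara, all_ibc)
recall_rate = pct(recalled, screens)
high_score_rate = pct(scored_9_or_10, screens)
prevalent_high_share = pct(prevalent_ibc_high, prevalent_ibc)

print(f"Sensitivity, radiologists: {sens_radiologists}%")
print(f"Sensitivity, Transpara:    {sens_transpara}%")
print(f"Radiologist recall rate:   {recall_rate}%")
print(f"Screens scored 9 or 10:    {high_score_rate}%")
print(f"Prevalent IBCs scored 9 or 10: {prevalent_high_share}%")
```

The two sensitivities match the 80.6% and 92.5% given in the text; the recall rate and the share of screens scored 9 or 10 are derived here from the reported counts and are not stated in the abstract itself.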
Only 2 of the 54 prevalent IBCs were not scored 9 or 10 by Transpara, and the group scored 9 or 10 also included some interval cancers and subsequent-round cancers that the radiologists did not identify. A proposed protocol for incorporating AI into screening will be discussed, in which all images are read by at least one radiologist plus Transpara, with no loss of sensitivity. Potential benefits include reductions in the number of women recalled for benign lesions, in interval cancers, and in workload.