27 November 2022
Lauritzen A.D., Janssen N., Brehl A.K., Vejborg I., Lillholm M.
A prospective study of breast cancer screening with AI as first reader for likely normal mammographies
The effect of integrating artificial intelligence (AI) systems into breast cancer screening programs to safely reduce radiologists' workload has so far been investigated in retrospective studies. In Denmark, Transpara has been fully implemented in a large regional screening program, so the direct effects can be measured prospectively. We aimed to investigate the initial effects on radiologist workload and recall rate after the implementation of Transpara.
Materials and Methods
Women aged 50 to 69 in the Capital Region of Denmark are screened biennially. Each full-field digital mammography (FFDM) exam is read independently by two radiologists, who decide whether to recall the woman for diagnostic mammography, ultrasound, and possibly needle biopsy. Upon disagreement, a consensus conference is held with a third radiologist. In November 2021, the AI system Transpara (version 1.7.1, ScreenPoint Medical) was introduced into this screening program. Transpara assigns each exam a score from 1 to 10, with higher scores indicating a higher likelihood of cancer. FFDMs with an exam score of 5 or lower are considered likely normal and are therefore read by Transpara as first reader and by a senior radiologist as second reader. The remaining FFDMs are double read by radiologists as usual. Data on radiologists' decisions were extracted for October 2020 to October 2021 and for January 2022 to February 2022, which we refer to as baseline screening and AI screening, respectively.
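The reading workflow described above can be sketched as a simple routing rule. This is a minimal illustration only, not the program's actual software: the names `ReadingPlan`, `plan_reading`, and `LIKELY_NORMAL_MAX_SCORE` are hypothetical, and only the 1 to 10 score scale and the threshold of 5 come from the text.

```python
from dataclasses import dataclass

# Threshold stated in the abstract: scores of 5 or lower are "likely normal".
LIKELY_NORMAL_MAX_SCORE = 5


@dataclass(frozen=True)
class ReadingPlan:
    """Hypothetical container describing who reads an exam."""
    first_reader: str
    second_reader: str
    radiologist_reads: int


def plan_reading(exam_score: int) -> ReadingPlan:
    """Route an exam based on its AI exam score (1 to 10)."""
    if not 1 <= exam_score <= 10:
        raise ValueError("exam score must be between 1 and 10")
    if exam_score <= LIKELY_NORMAL_MAX_SCORE:
        # Likely normal: AI as first reader, one senior radiologist as second.
        return ReadingPlan("AI", "senior radiologist", 1)
    # All other exams: conventional double reading by two radiologists.
    return ReadingPlan("radiologist", "radiologist", 2)


if __name__ == "__main__":
    for score in (1, 5, 6, 10):
        print(score, plan_reading(score))
```

The sketch leaves out the consensus conference with a third radiologist, which is held only when the two readers disagree.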
The baseline screening sample consisted of 59,325 women (1,816 recalled). The AI screening sample consisted of 11,205 women (312 recalled). With AI screening, 58% of FFDMs were likely normal and were therefore read by Transpara and only one radiologist. Because one of the two radiologist reads is saved for these exams, this corresponded to a 29% workload reduction. The recall rate was 3.06% at baseline screening and 2.78% with AI screening; the difference was not statistically significant (p=0.12).
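The reported figures can be reproduced from the counts above with a short calculation. The abstract does not state which statistical test was used; the pooled two-proportion z-test below is an assumption chosen because it yields a p-value close to the reported one. The function names are hypothetical, and the workload calculation counts only first and second reads, ignoring consensus conferences.

```python
import math


def recall_rate(recalled: int, screened: int) -> float:
    """Fraction of screened women who were recalled."""
    return recalled / screened


def workload_reduction(likely_normal_fraction: float) -> float:
    """Relative reduction in radiologist reads versus double reading.

    Likely normal exams get 1 radiologist read, all others get 2.
    """
    reads_per_exam = likely_normal_fraction * 1 + (1 - likely_normal_fraction) * 2
    return 1 - reads_per_exam / 2


def two_proportion_p_value(x1: int, n1: int, x2: int, n2: int) -> float:
    """Two-sided p-value of a pooled two-proportion z-test (assumed test)."""
    pooled = (x1 + x2) / (n1 + n2)
    std_err = math.sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    z = (x1 / n1 - x2 / n2) / std_err
    # For a standard normal variable, P(|Z| > z) = erfc(z / sqrt(2)).
    return math.erfc(abs(z) / math.sqrt(2))


if __name__ == "__main__":
    baseline = recall_rate(1816, 59325)   # about 3.06%
    with_ai = recall_rate(312, 11205)     # about 2.78%
    print(f"baseline recall rate: {baseline:.2%}")
    print(f"AI screening recall rate: {with_ai:.2%}")
    print(f"workload reduction: {workload_reduction(0.58):.0%}")
    print(f"p-value: {two_proportion_p_value(1816, 59325, 312, 11205):.2f}")
```

Under these assumptions the recall rates round to 3.06% and 2.78%, the workload reduction to 29%, and the p-value to roughly 0.12, in line with the reported values.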
In these initial results, screening with Transpara as first reader for likely normal FFDMs reduced reader workload by 29% and yielded a lower recall rate, although the difference was not statistically significant. More data are needed to determine whether the recall rate can be safely reduced without sacrificing the cancer detection rate. An ongoing study, intended for future publication, is monitoring the rate of consensus conferences, the level of reader agreement, the interval cancer rate, and the cancer detection rate.
It is important to monitor AI screening performance to ensure a consistently high level of safety. Such monitoring might enable clinicians to raise the score threshold for likely normal mammograms and thereby reduce workload and recall rate even further.