28 November 2020
S J Vinnicombe et al
What impact could AI based computer aided detection have on the number and biological relevance of interval cancers in a population based screening programme?
S J Vinnicombe, O Parr, R Sidebottom, D Godden, E Cornford, I D Lyburn
PURPOSE
Interval cancers (ICs) occur in all mammographic screening programmes, and in the UK NHS Breast Screening Programme (NHS BSP) their classification forms a key component of quality assurance. The aims of this study were: 1) to assess whether AI-based computer-aided detection (CAD) software could identify relevant findings on prior screens; 2) to compare CAD with reader categorisation of ICs; and 3) to ascertain the histologic subtype of CAD-detected ICs.
METHOD AND MATERIALS
The requirement for informed consent was waived for this retrospective study of anonymised images. Consecutive ICs with available diagnostic and prior screening mammograms (PS), and controls with subsequent normal screens, were identified from NHS BSP records. Exclusion criteria were incomplete mammography, IC outside the field of view, and unavailable pathology. BI-RADS density, CAD global scores and CAD prompts were recorded. A prompt was deemed true positive (TP) if it corresponded to the site of the subsequent IC. CAD specificity was set at 96% to match the UK screening recall rate of 4%. CAD scores and prompts on PS were compared with the prior NHS BSP categorisation of each IC (true interval, occult, minimal signs or false negative). Proportions of TP prompts on PS were calculated and correlated with the grade and receptor status of the ICs. ROC curve analysis and chi-squared tests were used, with significance assessed at the 5% level.
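The operating point and ROC analysis described above can be illustrated with a minimal sketch. It is not the study's software: the function names, the "flag if score is above the threshold" rule and the toy scores are all assumptions made for illustration. It shows one way a CAD score threshold could be chosen from control scores so that about 96% of controls are not flagged, and how an AUC can be computed directly from case and control scores.

```python
import math


def threshold_at_specificity(control_scores, specificity=0.96):
    """Return a score threshold t such that flagging 'score > t' leaves at
    least `specificity` of the controls unflagged (hypothetical rule)."""
    if not control_scores:
        raise ValueError("control_scores must not be empty")
    ordered = sorted(control_scores)
    # Number of controls that must sit at or below the threshold.
    # round() guards against floating-point noise before taking the ceiling.
    k = max(1, math.ceil(round(specificity * len(ordered), 9)))
    return ordered[k - 1]


def auc(case_scores, control_scores):
    """AUC as the probability that a randomly chosen case scores higher than
    a randomly chosen control, counting ties as one half."""
    wins = 0.0
    for c in case_scores:
        for n in control_scores:
            if c > n:
                wins += 1.0
            elif c == n:
                wins += 0.5
    return wins / (len(case_scores) * len(control_scores))


# Toy data only: 100 hypothetical control scores, 1 to 100.
toy_controls = list(range(1, 101))
toy_threshold = threshold_at_specificity(toy_controls)
# Fraction of controls flagged at this threshold (the simulated recall rate).
toy_recall_rate = sum(s > toy_threshold for s in toy_controls) / len(toy_controls)
```

With the toy controls, four of the 100 scores lie above the chosen threshold, which mirrors a 4% recall rate. On real data the threshold would be derived from the control cohort and then applied unchanged to the prior screens of the ICs.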
RESULTS
A total of 852 controls and 300 ICs were analysable. Median CAD scores were 4.22 for controls, 7.53 for PS and 9.69 for ICs. By NHS BSP categorisation, 149 ICs (50%) were true intervals, 42 (14%) were mammographically occult, 49 (16%) showed minimal signs and 60 (20%) were false negatives. The ROC area under the curve (AUC) for the entire IC cohort was 0.72 (95% CI 0.68-0.75). Of the 109 false negative/minimal signs cases, CAD correctly identified 21 at a recall rate of 4%; of these, 15 (71%) were grade 1/2 and 20 of 21 were ER positive. In the true interval group, CAD would have recalled 14/149 (9%), of which 9 were grade 3 (p = 0.097). For the occult group diagnosed within 12 months of PS (n=45), the AUC was 0.73 (95% CI 0.66-0.81); for BI-RADS density alone it was 0.59 (95% CI 0.5-0.99).
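The proportions reported above follow directly from the stated counts, and the arithmetic can be checked with a short sketch. The counts are taken from the text; the helper name `pct` and the one-decimal rounding are illustrative choices, so values that do not appear in the text are simple derivations rather than additional results.

```python
def pct(numerator, denominator):
    """Percentage rounded to one decimal place."""
    return round(100 * numerator / denominator, 1)


# NHS BSP categorisation counts as reported in the text.
categories = {
    "true interval": 149,
    "occult": 42,
    "minimal signs": 49,
    "false negative": 60,
}
total_ics = sum(categories.values())  # 300 analysable ICs

# Share of each category among all ICs.
category_share = {name: pct(n, total_ics) for name, n in categories.items()}

# False negative and minimal signs cases combined (the CAD target group).
missed_or_subtle = categories["false negative"] + categories["minimal signs"]  # 109

cad_yield_missed = pct(21, missed_or_subtle)              # CAD-identified among the 109
cad_yield_true_interval = pct(14, categories["true interval"])
low_grade_share = pct(15, 21)                             # grade 1/2 among the 21
er_positive_share = pct(20, 21)                           # ER positive among the 21
```

Rounded to whole numbers, these reproduce the 50%, 14%, 16%, 20%, 9% and 71% figures quoted in the text.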
CONCLUSION
AI-based CAD localised some cancers on prior screens that had been missed by readers; these were mostly low- or intermediate-grade, ER-positive cancers.
CLINICAL RELEVANCE/APPLICATION
In a 3-yearly breast screening programme, AI-based CAD may improve detection of subtle cancers, and might have a role in triaging cases considered imperceptible to human readers towards enhanced screening.