Effects of a comprehensive brain computed tomography deep learning model on radiologist detection accuracy

24.08.23

European Radiology. First published online 22 August 2023.

Authors: Buchlak QD, Tang CHM, Seah JCY, Johnson A, Holt X, Bottrell GM, Wardman JB, Samarasinghe G, Dos Santos Pinheiro L, Xia H, Ahmad HK, Pham H, Chiang JI, Ektas N, Milne MR, Chiu CHY, Hachey B, Ryan MK, Johnston BP, Esmaili N, Bennett C, Goldschlager T, Hall J, Vo DT, Oakden-Rayner L, Leveque J-C, Farrokhi F, Abramson RG, Jones CM, Edelstein S & Brotchie P

Abstract

Objectives

Non-contrast computed tomography of the brain (NCCTB) is commonly used to detect intracranial pathology but is subject to interpretation errors. Machine learning can augment clinical decision-making and improve NCCTB scan interpretation. This retrospective detection accuracy study assessed the performance of radiologists assisted by a deep learning model and compared the standalone performance of the model with that of unassisted radiologists.

Methods

A deep learning model was trained on 212,484 NCCTB scans drawn from a private radiology group in Australia. Scans from inpatient, outpatient, and emergency settings were included. Scan inclusion criteria were age ≥ 18 years and series slice thickness ≤ 1.5 mm. Thirty-two radiologists reviewed 2848 scans with and without the assistance of the deep learning system and rated their confidence in the presence of each finding using a 7-point scale. Differences in AUC and Matthews correlation coefficient (MCC) were calculated using a ground-truth gold standard.
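
To make the two metrics concrete, the following is a minimal sketch of how an AUC and an MCC could be computed for a single reader and a single finding from 7-point confidence ratings. It is not the study's analysis code. The ratings, the ground-truth labels, and the rating threshold of 4 used to binarise ratings for the MCC are all invented for illustration, and the function names are hypothetical.

```python
import math
from typing import Sequence


def auc_from_ratings(labels: Sequence[int], ratings: Sequence[float]) -> float:
    """AUC as the probability that a positive case is rated higher than a
    negative case, counting ties as one half (Mann-Whitney formulation)."""
    pos = [r for y, r in zip(labels, ratings) if y == 1]
    neg = [r for y, r in zip(labels, ratings) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def mcc_from_ratings(labels: Sequence[int], ratings: Sequence[float],
                     threshold: float) -> float:
    """Matthews correlation coefficient after calling a finding present
    when the rating is at or above the (assumed) threshold."""
    tp = fp = tn = fn = 0
    for y, r in zip(labels, ratings):
        pred = 1 if r >= threshold else 0
        if y == 1 and pred == 1:
            tp += 1
        elif y == 1 and pred == 0:
            fn += 1
        elif y == 0 and pred == 1:
            fp += 1
        else:
            tn += 1
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    # By convention, report 0 when the coefficient is undefined.
    return (tp * tn - fp * fn) / denom if denom else 0.0


# Invented example: 8 scans, 1 = finding present in the ground truth.
labels = [1, 1, 1, 0, 0, 0, 1, 0]
unassisted = [5, 3, 6, 4, 2, 1, 4, 5]  # hypothetical 7-point ratings
assisted = [6, 5, 7, 3, 2, 1, 4, 5]    # hypothetical 7-point ratings

THRESHOLD = 4  # assumed cut-off, not taken from the study

auc_diff = auc_from_ratings(labels, assisted) - auc_from_ratings(labels, unassisted)
mcc_diff = (mcc_from_ratings(labels, assisted, THRESHOLD)
            - mcc_from_ratings(labels, unassisted, THRESHOLD))
print(f"AUC difference (assisted - unassisted): {auc_diff:.3f}")
print(f"MCC difference (assisted - unassisted): {mcc_diff:.3f}")
```

The AUC uses the full rating scale, whereas the MCC depends on the chosen cut-off, which is why the two metrics can move by different amounts for the same reader. A multi-reader study would additionally need to aggregate across readers and findings and test the differences statistically, which this sketch does not attempt.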

Results

The model demonstrated an average area under the receiver operating characteristic curve (AUC) of 0.93 across 144 NCCTB findings and significantly improved radiologist interpretation performance. Assisted and unassisted radiologists demonstrated average AUCs of 0.79 and 0.73, respectively, across 22 grouped parent findings, and 0.72 and 0.68, respectively, across 189 child findings. When assisted by the model, radiologist AUC was significantly improved for 91 findings (158 findings were non-inferior), and reading time was significantly reduced.

Conclusions

The assistance of a comprehensive deep learning model significantly improved radiologist detection accuracy across a wide range of clinical findings and demonstrated the potential to improve NCCTB interpretation.

Clinical relevance statement

This study evaluated a comprehensive CT brain deep learning model, which performed strongly, improved the performance of radiologists, and reduced interpretation time. The model may reduce errors, improve efficiency, facilitate triage, and better enable the delivery of timely patient care.

Key Points

• This study demonstrated that the use of a comprehensive deep learning system assisted radiologists in the detection of a wide range of abnormalities on non-contrast brain computed tomography scans.

• The deep learning model demonstrated an average area under the receiver operating characteristic curve of 0.93 across 144 findings and significantly improved radiologist interpretation performance.

• The assistance of the comprehensive deep learning model significantly reduced the time required for radiologists to interpret computed tomography scans of the brain.
