A so-called deep learning artificial intelligence (AI) algorithm for analyzing MRI scans showed “acceptable” agreement with expert human readers for detecting sacroiliac joint (SIJ) inflammation in patients with axial spondyloarthritis (axSpA), researchers said.
The AI-powered algorithm reached the same conclusions as a “gold standard” panel of three central readers on 543 of 731 patient images provided, with both finding inflamed SIJs in 304 and both agreeing on the absence of inflammation in 239, according to Joeri Nicolaes, PhD, of UCB Pharma in Brussels, Belgium, and colleagues.
In 132 of the remaining cases, however, the expert readers identified inflammation that was missed by the algorithm, Nicolaes and colleagues reported in Annals of the Rheumatic Diseases. And the algorithm flagged inflammation in the final 56 images where the experts determined there was none.
Assuming that the human readers were always correct, as is normal for such studies, that led to the following summary of the AI system’s statistical performance:
- Absolute agreement: 74% (95% CI 72-77)
- Sensitivity: 70% (95% CI 66-73)
- Specificity: 81% (95% CI 78-84)
- Positive predictive value: 84% (95% CI 82-87)
- Negative predictive value: 64% (95% CI 61-68)
- Cohen’s kappa: 0.49 (95% CI 0.43-0.55)
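The point estimates above follow directly from the four counts reported in the study. As a rough check, the short Python sketch below recomputes them, treating the expert panel as ground truth. The function and variable names are illustrative and not taken from the paper, and the confidence intervals are not reproduced because the article does not say how they were calculated.

```python
# Counts reported in the article, with the expert panel treated as ground truth.
TP = 304  # both algorithm and experts found inflammation
TN = 239  # both agreed inflammation was absent
FN = 132  # experts found inflammation, algorithm missed it
FP = 56   # algorithm flagged inflammation, experts found none


def diagnostic_metrics(tp: int, tn: int, fn: int, fp: int) -> dict:
    """Return agreement, sensitivity, specificity, PPV, NPV and Cohen's kappa."""
    n = tp + tn + fn + fp
    observed = (tp + tn) / n
    # Agreement expected by chance, from the marginal totals of each rater.
    expected = ((tp + fn) * (tp + fp) + (tn + fp) * (tn + fn)) / n**2
    return {
        "agreement": observed,
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "ppv": tp / (tp + fp),
        "npv": tn / (tn + fn),
        "kappa": (observed - expected) / (1 - expected),
    }


metrics = diagnostic_metrics(TP, TN, FN, FP)
for name, value in metrics.items():
    print(f"{name}: {value:.2f}")
```

Rounded to two decimal places, the results match the reported figures: 0.74 agreement, 0.70 sensitivity, 0.81 specificity, 0.84 PPV, 0.64 NPV, and a kappa of 0.49.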
Nicolaes and colleagues acknowledged that these values were not great on their face but noted some extenuating circumstances. “[I]t must be considered that the criteria used to define inflammation on MRI (≥2 SIJs with inflammation) were conservative, and the expert readers … in this study may have used other contextual or clinical information (e.g., CRP [C-reactive protein] levels or HLA-B27 positivity) when determining the presence of inflammation,” they wrote.
The researchers also argued that their expert panel was likely more skilled than some clinicians who interpret MRI images in real-world practice, such as general rheumatologists and radiologists, “further supporting the potential use case of machine-learning algorithms in contexts where expert readers may not be available.” If nothing else, an algorithm such as this might at least provide reproducible results, unlike human readers, whose interpretations may vary both from one reader to another and from one reading to the next by the same reader.
Overall, they concluded, the system “enabled acceptable detection of inflammation” as defined in published guidelines.
Study Details
The algorithm had been developed and reported previously using images and other data from an observational French cohort of 256 axSpA patients. The current study sought to validate its performance in a larger, unrelated group of patients. These were drawn from two UCB-funded clinical trials, RAPID-axSpA (152 patients) and C-OPTIMISE (579 patients). Both trials were testing medications for radiographic and non-radiographic axSpA, with MRI scans made at baseline as part of their respective protocols.
Mean trial participant age was about 34, and two-thirds of them were men; disease duration averaged about 5 years. Over 90% were white. Some 45% had radiographic axSpA. Active disease according to standard clinical evaluations (e.g., Bath Ankylosing Spondylitis Disease Activity Index score ≥4) was required for eligibility in the original trials.
Each scan was first assessed by two of the three expert readers; if they disagreed on the presence of SIJ inflammation, the third was called in to break the tie. As noted above, a positive finding required at least two SIJs to show inflammation.
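The tie-breaking scheme amounts to a simple rule, sketched below in Python. This is only an illustration of the logic as described here; the function name and the use of true/false calls are assumptions, not details from the study protocol.

```python
from typing import Optional


def adjudicate(reader1: bool, reader2: bool, reader3: Optional[bool] = None) -> bool:
    """Return the panel's final call on SIJ inflammation (True = inflamed).

    The first two readers decide when they agree; otherwise the third
    reader's call breaks the tie.
    """
    if reader1 == reader2:
        return reader1
    if reader3 is None:
        raise ValueError("Readers disagree; a third read is needed to break the tie.")
    return reader3


# Two readers agree, so no third read is needed.
print(adjudicate(True, True))
# Two readers disagree, so the third reader's call is final.
print(adjudicate(True, False, reader3=False))
```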
An important limitation, for both the study and the AI system’s clinical utility, was that out of 708 patients in C-OPTIMISE, the algorithm couldn’t process scans for 129 because image sizes or numbers of slices taken were outside its design specifications. (The same was true for eight of 172 patients in RAPID-axSpA.)
Nicolaes and colleagues also observed that standard classification criteria with regard to SIJ inflammation have changed since the algorithm was first designed, necessitating a future update. And the algorithm currently has no ability to detect structural damage, which clinicians also rely on in treatment planning.
John Gever was Managing Editor from 2014 to 2021; he is now a regular contributor.
Disclosures
This study and the trials from which the patient images were taken were funded by UCB Pharma.
Nicolaes and another author were UCB employees. Other authors reported relationships with UCB and other pharmaceutical companies.
Primary Source
Annals of the Rheumatic Diseases
Source Reference: Nicolaes J, et al “Performance analysis of a deep-learning algorithm to detect the presence of inflammation in MRI of sacroiliac joints in patients with axial spondyloarthritis” Ann Rheum Dis 2024; DOI: 10.1136/ard-2024-225862.