Manual data collection is still the gold standard for disease-specific patient registries. However, CAPRI-3 uses text mining, an artificial intelligence (AI) technology, for patient identification and data collection. The aim of this study is to demonstrate the reliability and efficiency of this AI-driven approach.
CAPRI-3 is an observational retrospective multicenter cohort registry on metastatic prostate cancer. We tested the patient-identification algorithm and automated data extraction through manual validation of the same patients in two pilots in 2019 and 2022.
Pilot one identified 2030 patients and pilot two identified 9464 patients. The negative predictive value of the algorithm was maximized to prevent false exclusions and reached 94.8%. The completeness and accuracy of the automated data extraction were 92.3% or higher, except for date fields and inaccessible data (images/PDF), for which they ranged from 10% to 88.9%. Additional manual quality control took over 3 h less time per patient than data collection in the original, fully manual CAPRI registry (105 vs. 300 min).
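The two headline figures above rest on simple arithmetic, which the following minimal Python sketch makes explicit. The negative predictive value (NPV) is the share of algorithm-excluded patients who were truly ineligible, TN / (TN + FN). The function names and the counts of 948 and 52 are hypothetical, chosen only to reproduce an NPV of 94.8%; the abstract does not report the underlying counts. The time figures (105 and 300 min) are taken from the results.

```python
def negative_predictive_value(true_negatives: int, false_negatives: int) -> float:
    """Return TN / (TN + FN): the fraction of excluded patients correctly excluded."""
    total_excluded = true_negatives + false_negatives
    if total_excluded == 0:
        raise ValueError("NPV is undefined when no patients were excluded")
    return true_negatives / total_excluded


def minutes_saved(manual_minutes: float, assisted_minutes: float) -> float:
    """Return the per-patient time difference between two workflows, in minutes."""
    return manual_minutes - assisted_minutes


# Hypothetical counts (NOT from the study), picked to illustrate an NPV of 94.8%.
npv = negative_predictive_value(true_negatives=948, false_negatives=52)

# Per-patient times reported in the abstract: 300 min manual vs. 105 min with AI support.
saved = minutes_saved(manual_minutes=300, assisted_minutes=105)

print(f"NPV: {npv:.1%}")
print(f"Time saved per patient: {saved:.0f} min ({saved / 60:.2f} h)")
```

A high NPV matters here because a false negative is an eligible patient who is wrongly excluded from the registry, which is the error the algorithm was tuned to avoid.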
The CAPRI-3 patient-identification algorithm is a sound replacement for manual screening when excluding ineligible candidates. The AI-driven data extraction is largely accurate and complete, but manual quality control is needed for less reliable and inaccessible data. Overall, the AI-driven approach of the CAPRI-3 registry is reliable and timesaving.
Cancers. 2023 Jul 27
Dianne Bosch, Malou C P Kuppen, Metin Tascilar, Tineke J Smilde, Peter F A Mulders, Carin A Uyl-de Groot, Inge M van Oort
Department of Urology, Radboud University Medical Center, 6525 GA Nijmegen, The Netherlands; Department of Radiotherapy, Maastro Clinic, 6229 ET Maastricht, The Netherlands; Department of Medical Oncology, Isala Hospital, 8025 AB Zwolle, The Netherlands; Department of Medical Oncology, Jeroen Bosch Hospital, 5223 GZ 's-Hertogenbosch, The Netherlands; Erasmus School of Health Policy and Management, Erasmus University Rotterdam, 3062 PA Rotterdam, The Netherlands.