IBCN 2024: AI in Bladder Cancer: The Good, the Bad, and the Ugly

(UroToday.com) The 2024 IBCN annual meeting included a session on emerging technologies in bladder cancer, featuring a presentation by Dr. Jethro Kwong discussing the good, the bad, and the ugly of artificial intelligence in bladder cancer. Dr. Kwong began his talk by defining basic artificial intelligence terminology:

  • Artificial Intelligence: the ability of machines to emulate or perform tasks that typically require human intelligence
  • Machine Learning: the ability to automatically learn patterns without explicit instructions and to improve with more data over time
  • Deep Learning: machine learning methods that mimic the structure and function of the human brain through artificial neural networks

The Good

Dr. Kwong notes that recent advances in artificial intelligence have improved cystoscopic evaluation: 3-dimensional bladder reconstruction provides a more complete assessment of the bladder. The input is a white light cystoscopy video and the output is a 3-dimensional bladder reconstruction.1

Early work suggests that real-time tumor detection has a sensitivity of 91% and a specificity of 99%. Additional applications include: (i) measuring tumor diameter, (ii) predicting stage and grade, and (iii) confirming adequate muscle sampling.
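As a reminder of what the two figures above measure, the following minimal sketch computes sensitivity and specificity from confusion-matrix counts. The counts are hypothetical and were chosen only to reproduce the quoted 91% and 99%; they are not data from the cited work.

```python
def sensitivity_specificity(tp: int, fn: int, tn: int, fp: int) -> tuple[float, float]:
    """Return (sensitivity, specificity) from confusion-matrix counts."""
    sensitivity = tp / (tp + fn)  # share of true tumors that were flagged
    specificity = tn / (tn + fp)  # share of tumor-free cases left unflagged
    return sensitivity, specificity


# Hypothetical counts, for illustration only.
sens, spec = sensitivity_specificity(tp=91, fn=9, tn=990, fp=10)
print(f"sensitivity={sens:.2f}, specificity={spec:.2f}")
```

A specificity of 99% means roughly 1 false alarm per 100 tumor-free cases, which matters for a tool that runs continuously during cystoscopy.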

The second application of artificial intelligence is improving pathological assessment. Work from Queen's University has focused on reducing inter- and intraobserver variability.2 The reported results include an AUROC of 0.94, a sensitivity of 87%, and a specificity of 89%. Additional applications include standardized assessment of pathology information.

A third application of artificial intelligence in bladder cancer is improving prognostication. Work presented by Dr. Yair Lotan at AUA 2024 looked at predicting response to BCG in high-risk non-muscle invasive bladder cancer (NMIBC). The investigators developed and validated a computational histologic artificial intelligence (CHAI) assay using a cohort of 944 patients with high-risk, BCG-naive NMIBC treated with BCG from 12 centers. All patients underwent TURBT +/- restaging TURBT followed by intravesical BCG. This cohort was split into development (n = 303) and external validation (n = 641) cohorts. Patients classified as high risk by the CHAI assay had a 1.7x higher risk of high-grade recurrence at 2 years, a 2.5x higher risk of progression at 5 years, and a 1.8x higher risk of BCG unresponsiveness.
At the University of Toronto, Dr. Kwong, along with colleagues Dr. Girish Kulkarni and Dr. Alex Zlotta, has developed PROGRxN-BCa, a model for predicting progression in NMIBC. At AUA 2024, Dr. Kwong presented validation data for PROGRxN-BCa across 13 academic institutions affiliated with the Canadian Bladder Cancer Information System (n = 3,708). PROGRxN-BCa outperformed the EAU risk calculator and a LASSO Cox model, achieving a c-index between 0.76 and 0.83. This performance benefit was consistent across clinically relevant subgroups, including age, sex, and tumor history. PROGRxN-BCa was well calibrated for risks between 0% and 40%. At 5 and 10 years, PROGRxN-BCa demonstrated a higher net benefit (i.e., avoiding unnecessary treatment escalation) compared to the EAU risk calculator for clinically relevant decision thresholds between 15% and 35%. Discussing work in progress, Dr. Kwong noted that PROGRxN-BCa is being further validated in 34 academic and community centers (n = 11,713), again outperforming the EAU risk calculator and LASSO Cox model, achieving a c-index between 0.76 and 0.83, with benefit across subgroups.
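For readers less familiar with the c-index quoted above, the sketch below shows one common definition (Harrell's concordance index) for right-censored follow-up data. It is a simplified illustration that skips pairs with tied follow-up times, and the five-patient dataset is hypothetical. It is not the evaluation code used for PROGRxN-BCa.

```python
from itertools import combinations


def concordance_index(times, events, risks):
    """Harrell's c-index for right-censored data (simplified: tied times skipped).

    times  : follow-up time per patient
    events : 1 if progression was observed, 0 if censored
    risks  : predicted risk score (higher means earlier progression expected)
    """
    concordant = 0.0
    comparable = 0
    for i, j in combinations(range(len(times)), 2):
        # Order the pair so that `a` has the shorter follow-up.
        a, b = (i, j) if times[i] < times[j] else (j, i)
        if times[a] == times[b] or not events[a]:
            continue  # not comparable: tied times, or shorter follow-up censored
        comparable += 1
        if risks[a] > risks[b]:
            concordant += 1.0  # model ranked the earlier event as higher risk
        elif risks[a] == risks[b]:
            concordant += 0.5  # tied predictions count as half
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


# Hypothetical data, for illustration only.
times = [2, 4, 6, 8, 10]
events = [1, 1, 0, 1, 0]
risks = [0.9, 0.4, 0.5, 0.3, 0.1]
c_index = concordance_index(times, events, risks)
print(f"c-index = {c_index:.3f}")  # 7 of 8 comparable pairs are concordant: 0.875
```

A c-index of 0.5 corresponds to random ranking and 1.0 to perfect ranking, so values of 0.76 to 0.83 indicate that the model orders most comparable patient pairs correctly.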
The group is also able to sub-stratify intermediate-risk NMIBC more effectively than the typical risk factors (multiple tumors, early recurrence (<1 year), frequent recurrence (>1 per year), tumor size >3 cm, and failure of previous intravesical treatment).
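The net benefit comparison mentioned for PROGRxN-BCa comes from decision curve analysis. The sketch below uses the standard textbook definition, net benefit = TP/n - (FP/n) x pt/(1 - pt), where pt is the decision threshold. The report does not give the exact formulation used in the study, so treat this as an assumption. The ten-patient dataset is hypothetical.

```python
def net_benefit(y_true, y_prob, threshold):
    """Net benefit of escalating treatment when predicted risk >= threshold."""
    n = len(y_true)
    tp = sum(1 for y, p in zip(y_true, y_prob) if p >= threshold and y == 1)
    fp = sum(1 for y, p in zip(y_true, y_prob) if p >= threshold and y == 0)
    # False positives are weighted by the odds of the threshold probability.
    return tp / n - (fp / n) * threshold / (1 - threshold)


def net_benefit_treat_all(y_true, threshold):
    """Reference strategy: escalate treatment for every patient."""
    prevalence = sum(y_true) / len(y_true)
    return prevalence - (1 - prevalence) * threshold / (1 - threshold)


# Hypothetical outcomes (1 = progressed) and predicted risks, for illustration only.
y_true = [1, 1, 0, 0, 0, 0, 0, 0, 0, 0]
y_prob = [0.6, 0.3, 0.4, 0.1, 0.1, 0.05, 0.2, 0.15, 0.1, 0.05]

nb_model = net_benefit(y_true, y_prob, threshold=0.25)
nb_all = net_benefit_treat_all(y_true, threshold=0.25)
print(f"model: {nb_model:.3f}, treat all: {nb_all:.3f}")
```

In this toy example, the model has a positive net benefit at a 25% threshold while treating everyone has a negative one, which is the sense in which a higher net benefit reflects fewer unnecessary escalations.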
The Bad

Dr. Kwong notes that concerns have been raised around appropriate methods and fairness regarding artificial intelligence in health care. Because artificial intelligence reporting guidelines do not provide a means of quantifying the overall quality of AI research, his group led an effort to develop a tool (APPRAISE-AI) to evaluate the methodological and reporting quality of AI prediction models for clinical decision support.3 The APPRAISE-AI tool is a quantitative method for evaluating the quality of AI studies across 6 domains: clinical relevance, data quality, methodological conduct, robustness of results, reporting quality, and reproducibility. The tool comprises 24 items with a maximum overall score of 100 points. Among 28 studies, the APPRAISE-AI scores ranged from 33 (low quality) to 67 (high quality), with most studies being moderate quality.

Dr. Kwong’s group then performed a systematic review, using APPRAISE-AI, of studies predicting NMIBC outcomes with artificial intelligence.4 Performance was compared between the artificial intelligence and non-artificial intelligence approaches included within 15 studies. Most studies were low quality, with only one classified as high quality.
While artificial intelligence models generally outperformed non-artificial intelligence approaches with respect to accuracy, c-index, sensitivity, and specificity, the margin of benefit varied with study quality.

Common pitfalls included dataset limitations, heterogeneous outcome definitions, methodological flaws, suboptimal model evaluation, and reproducibility issues. As part of their manuscript, Dr. Kwong and colleagues provided a summary of recommendations to improve artificial intelligence studies in NMIBC prognostication.
Importantly, Dr. Kwong emphasized that we should be using artificial intelligence reporting guidelines, including the STREAM-URO Framework5 and TRIPOD+AI.6

The Ugly 

Dr. Kwong notes that we are still far from scalable clinical deployment of artificial intelligence. To date, there are no artificial intelligence bladder cancer tools endorsed by the AUA, CUA, EAU, or NCCN guidelines, and the claim that “artificial intelligence can rapidly adapt to new data” is often not true.
Conversely, prostate cancer is considerably further along in incorporating scalable artificial intelligence, notably the ArteraAI Prostate Test, which is now in the NCCN guidelines.
Dr. Kwong notes that integrating artificial intelligence into healthcare systems is more than just an algorithm. In a recent NPJ Digit Med editorial, he emphasizes the importance of shifting the focus towards evaluating clinically relevant outcomes when adopting artificial intelligence tools.7 He proposes an algorithm for continuous monitoring systems to ensure the adaptability of these tools in the ever-evolving healthcare landscape.
Dr. Kwong concluded his presentation discussing the good, the bad, and the ugly of artificial intelligence in bladder cancer with the following take-home points:

  • The good: there are many exciting and new innovations in artificial intelligence in bladder cancer
  • The bad: there are common issues related to dataset limitations, outcome definitions, methodological flaws, suboptimal model evaluation, and reproducibility
  • The ugly: more work is needed before artificial intelligence is ready for primetime and available for all patients.

Presented by: Jethro Kwong, MSc Graduate Student, University of Toronto, Toronto, Ontario, Canada

Written by: Zachary Klaassen, MD, MSc – Urologic Oncologist, Associate Professor of Urology, Georgia Cancer Center, WellStar MCG Health, @zklaassen_md on Twitter during the 2024 International Bladder Cancer Network (IBCN) Annual Meeting, Bern, Switzerland, Thurs, Sept 19 – Sat, Sept 21, 2024

References:

  1. Shkolyar E, Zhou SR, Carlson CJ, et al. Optimizing cystoscopy and TURBT: Enhanced imaging and artificial intelligence. Nat Rev Urol. 2024 Jul 9 [Epub ahead of print].
  2. Slotman A, Xu M, Lindale K, et al. Quantitative Nuclear Grading: An Objective, Artificial Intelligence-Facilitated Foundation for Grading Noninvasive Papillary Urothelial Carcinoma. Lab Invest. 2023 Jul;103(7):100155.
  3. Kwong JCC, Khondker A, Lajkosz K, et al. APPRAISE-AI Tool for Quantitative Evaluation of AI Studies for Clinical Decision Support. JAMA Netw Open. 2023 Sep 5;6(9):e2335377.
  4. Kwong JCC, Wu J, Malik S, et al. Predicting non-muscle invasive bladder cancer outcomes using artificial intelligence: A systematic review using APPRAISE-AI. NPJ Digit Med. 2024 Apr 18;7(1):98.
  5. Kwong JCC, McLoughlin LC, Haider M, et al. Standardized reporting of machine learning applications in Urology: The STREAM-URO Framework. Eur Urol Focus. 2021;7:672-682.
  6. Collins GS, Moons KGM, Dhiman P, et al. TRIPOD+AI Statement: Updated guidance for reporting clinical prediction models that use regression or machine learning methods. BMJ. 2024;385:e078378.
  7. Kwong JCC, Nickel GC, Wang SCY, et al. Integrating artificial intelligence into healthcare systems: More than just an algorithm. NPJ Digit Med. 2024 Mar 1;7(1):52.