ESMO 2024: First Preliminary Results of Artificial Intelligence Generated Treatment Recommendations for Urothelial Cancer Based on Multidisciplinary Cancer Conferences from the KITTU Project

(UroToday.com) The 2024 European Society for Medical Oncology (ESMO) Annual Congress, held in Barcelona, Spain, hosted the presentation of Poster 2013. Dr. Gregor Duwe presented the first preliminary results of artificial intelligence (AI)-generated treatment recommendations for urothelial cancer based on multidisciplinary cancer conferences from the KITTU project.


KITTU (KI-unterstützte Therapiebegleitung am Beispiel der Urologie/AI-Supported Support in Treatment Recommendation for Cancer Patients in Urology) aims to develop an AI system that supports the treatment of patients with urological cancers. The system presents each therapeutic option available in an individual patient's situation and supports the patient in the treatment decision-making process.

Dr. Duwe began his poster presentation by noting that decisions on the best available treatment in clinical oncology are still mostly based on expert opinions from multidisciplinary cancer conferences (MCCs). Currently, there is a lack of support engines that could relieve the burden on physicians and patients and thereby improve treatment decision-making. The investigators believe that AI could serve this purpose by increasing evidence-based treatment recommendations in clinical oncology and providing additional support to MCC groups. Dr. Duwe presented the first preliminary data for patients with urothelial carcinoma (UC) using AI-generated treatment recommendations from the KITTU project.

They identified, analyzed, and transformed patient data (107 individual features) from 1029 MCCs for UC held between 2015 and 2022 into representations that can be used in software development. They then developed a two-step process to train a classifier to mimic the MCC recommendations:

  • In the first step, they identified superordinate categories of the recommendations.
  • In the second step, they specified the detailed recommendation.
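The two-step idea can be sketched as a small hierarchical classifier: one model picks the superordinate category, and a category-specific model then picks the detailed recommendation. This is only a minimal sketch under stated assumptions. The class names `MajorityByFeature` and `TwoStepRecommender`, the single `stage` feature, and the toy rows are all hypothetical and are not taken from the poster. The toy "model" is a lookup table standing in for a real classifier such as XGBoost, and the toy data carry no clinical meaning.

```python
from collections import Counter, defaultdict


class MajorityByFeature:
    """Toy stand-in for a real classifier (hypothetical, for illustration only).

    Predicts the most common label seen for the value of one feature.
    """

    def __init__(self, feature):
        self.feature = feature
        self.table = {}
        self.default = None

    def fit(self, rows, labels):
        counts = defaultdict(Counter)
        for row, label in zip(rows, labels):
            counts[row[self.feature]][label] += 1
        self.table = {v: c.most_common(1)[0][0] for v, c in counts.items()}
        # Fallback for feature values never seen during training.
        self.default = Counter(labels).most_common(1)[0][0]
        return self

    def predict(self, row):
        return self.table.get(row[self.feature], self.default)


class TwoStepRecommender:
    """Step 1 predicts the category; step 2 predicts the detail within it."""

    def __init__(self, make_model):
        self.make_model = make_model
        self.step1 = None
        self.step2 = {}

    def fit(self, rows, categories, details):
        self.step1 = self.make_model().fit(rows, categories)
        # Group the training cases by category, then train one model per group.
        grouped = defaultdict(lambda: ([], []))
        for row, cat, det in zip(rows, categories, details):
            grouped[cat][0].append(row)
            grouped[cat][1].append(det)
        self.step2 = {
            cat: self.make_model().fit(r, d) for cat, (r, d) in grouped.items()
        }
        return self

    def predict(self, row):
        category = self.step1.predict(row)
        return category, self.step2[category].predict(row)


# Invented toy data: labels are placeholders, not clinical guidance.
rows = [{"stage": "s1"}, {"stage": "s1"}, {"stage": "s2"}, {"stage": "s2"}]
categories = ["surgery", "surgery", "drug", "drug"]
details = ["procedure_X", "procedure_X", "regimen_Y", "regimen_Y"]

model = TwoStepRecommender(lambda: MajorityByFeature("stage"))
model.fit(rows, categories, details)
print(model.predict({"stage": "s2"}))
```

Training one second-step model per category keeps each detailed decision restricted to options that belong to the category chosen in the first step. The cost is that a first-step error cannot be corrected later.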

Dr. Duwe and his colleagues used different machine learning (CatBoost, XGBoost, Random Forest) and deep learning (TabPFN, TabNet) approaches. Model performance was assessed with the F1 score, a machine learning evaluation metric that combines the precision and recall scores.

The accuracy metric computes how many times a model made a correct prediction across the entire dataset. The F1 score is calculated as the harmonic mean of the precision and recall scores, as shown below. It ranges from 0 to 100%, with a higher F1 score denoting a better-quality classifier:

image-0.jpg
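For reference, the standard definitions can be written out as follows. The symbols TP, FP, and FN (true positives, false positives, and false negatives for a given class) are notation introduced here and do not appear in the poster.

```latex
\[
\text{Precision} = \frac{TP}{TP + FP}, \qquad
\text{Recall} = \frac{TP}{TP + FN}
\]
\[
F_1 = \frac{2 \cdot \text{Precision} \cdot \text{Recall}}{\text{Precision} + \text{Recall}}
    = \frac{2\,TP}{2\,TP + FP + FN}
\]
```

As an illustrative calculation with made-up counts, TP = 6, FP = 2, and FN = 6 give a precision of 0.75, a recall of 0.50, and an F1 score of 0.60. The harmonic mean sits closer to the weaker of the two components than a simple average (0.625) would.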

They developed an AI system capable of determining which type of overarching recommendation should be applied. For example, the AI system can decide whether the patient needs surgery, anti-cancer drugs, or radiotherapy. 

image-1.jpg

The investigators reported the performance of the AI-generated treatment recommendations for urothelial carcinoma (UC) as F1 scores. For the first step of treatment (surgery vs. drug-based cancer treatment vs. follow-up vs. best supportive care), the model achieved an overall F1 score of 0.7912.

image-2.jpg

The best performance was achieved by the XGBoost (XGB) classifier. Its confusion matrix is shown below (rows are AI-recommended drugs; columns are MCC-recommended drugs). The AI system was able to suggest the specific surgical treatment as well as the correct drugs in >70% of cases.

image-3.jpg
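Per-class F1 scores can be read off a confusion matrix with this orientation (rows = AI recommendation, columns = MCC recommendation). The following is a minimal sketch, not the investigators' code. The function names, the two placeholder labels, and the toy counts are invented for illustration. The poster does not say how per-class scores were averaged into an overall score, so the unweighted (macro) mean used here is an assumption.

```python
def per_class_f1(matrix, labels):
    """Compute one F1 score per class.

    matrix[i][j] = number of cases where the model recommended labels[i]
    and the MCC recommended labels[j] (rows = predicted, columns = reference).
    """
    scores = {}
    for k, label in enumerate(labels):
        tp = matrix[k][k]
        fp = sum(matrix[k]) - tp                  # predicted k, MCC chose another
        fn = sum(row[k] for row in matrix) - tp   # MCC chose k, model did not
        denom = 2 * tp + fp + fn
        scores[label] = 2 * tp / denom if denom else 0.0
    return scores


def macro_f1(scores):
    """Unweighted mean of the per-class F1 scores (assumed averaging scheme)."""
    return sum(scores.values()) / len(scores)


# Invented toy counts, not data from the poster.
labels = ["option_A", "option_B"]
toy_matrix = [
    [5, 1],  # model said option_A: 5 agreed with the MCC, 1 did not
    [2, 4],  # model said option_B: 2 disagreed with the MCC, 4 agreed
]
scores = per_class_f1(toy_matrix, labels)
print(scores, macro_f1(scores))
```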

For overall drug prediction, the F1-Score was 0.6880. The table below breaks down the F1-Scores for different anti-cancer drugs individually. Again, XGB was the classifier that performed the best.

image-4.jpg

Their results show that their selected deep neural network architectures were able to learn from the limited amount of data and the performance improves with an increasing amount of data.

The investigators deployed explanation techniques that are able to provide a local and a global explanation:

  • The local explanation indicates which features were relevant for the recommendation in a specific patient.
  • The global explanation indicates which features, based on the complete dataset, are relevant for a given class of recommendation (e.g., Gemcitabine/Cisplatin), as in the figure below:

image-5.jpg
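The difference between the two kinds of explanation can be illustrated with additive per-feature attributions. The poster does not name the explanation technique, so this is a sketch under assumptions: the model is a hypothetical linear score, each attribution is the weight times the feature's deviation from a baseline, and the global importance is the mean absolute attribution over all patients. The feature names and values are placeholders.

```python
def linear_attributions(weights, baseline, x):
    """Per-feature contribution of patient x relative to a baseline (linear score)."""
    return {f: weights[f] * (x[f] - baseline[f]) for f in weights}


def local_explanation(attribution, top=3):
    """Features ranked by absolute contribution for ONE patient."""
    ranked = sorted(attribution.items(), key=lambda kv: abs(kv[1]), reverse=True)
    return ranked[:top]


def global_explanation(attributions):
    """Features ranked by mean absolute contribution over ALL patients."""
    n = len(attributions)
    means = {
        f: sum(abs(a[f]) for a in attributions) / n for f in attributions[0]
    }
    return sorted(means.items(), key=lambda kv: kv[1], reverse=True)


# Hypothetical model and patients; names and numbers are placeholders.
weights = {"feature_a": 2.0, "feature_b": -1.0}
baseline = {"feature_a": 0.0, "feature_b": 0.0}
patients = [
    {"feature_a": 1.0, "feature_b": 3.0},
    {"feature_a": 2.0, "feature_b": 1.0},
]

attributions = [linear_attributions(weights, baseline, p) for p in patients]
print(local_explanation(attributions[0], top=1))
print(global_explanation(attributions))
```

In this toy case the most relevant feature for the first patient (`feature_b`) is not the most relevant feature across the dataset (`feature_a`). Cases like this are why both views are reported.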

Dr. Duwe wrapped up his poster with the following conclusions: These are the first data showcasing fully automated AI-based treatment recommendations for multidisciplinary cancer conferences in urothelial cancer, achieving good accuracy rates (F1 score 0.791). The new project aims to integrate clinical trial data into the AI-based treatment recommendation model, enabling explainable AI for the generated recommendations. The investigators will prospectively validate the AI system and provide a study-based explanation, matching patient data with study criteria. Deep learning techniques were able to outperform traditional approaches and additionally enable the prediction of multiple equivalent recommendations using the soft-labeling training given by the investigators.
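The poster does not describe how the soft-labeling training was implemented. One common reading is sketched below, with every detail an assumption: when the MCC accepts several options as equivalent, the training target spreads probability evenly over them instead of marking a single class, and the model is penalized with a cross-entropy against that distribution. The helper names, the `ratio` threshold, and the placeholder options are hypothetical.

```python
import math


def softmax(logits):
    """Convert raw scores into probabilities (numerically stable)."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def soft_target(classes, accepted):
    """Spread the target probability evenly over all accepted options."""
    share = 1.0 / len(accepted)
    return [share if c in accepted else 0.0 for c in classes]


def soft_cross_entropy(target, probs):
    """Cross-entropy between a soft target and predicted probabilities."""
    return -sum(t * math.log(p) for t, p in zip(target, probs) if t > 0)


def equivalent_predictions(classes, probs, ratio=0.5):
    """Return every option whose probability is at least `ratio` x the top one."""
    top = max(probs)
    return [c for c, p in zip(classes, probs) if p >= ratio * top]


classes = ["option_A", "option_B", "option_C"]
target = soft_target(classes, {"option_A", "option_B"})

balanced = softmax([2.0, 2.0, 0.0])   # model treats A and B as equally good
one_sided = softmax([4.0, 0.0, 0.0])  # model commits to A only

print(soft_cross_entropy(target, balanced))
print(soft_cross_entropy(target, one_sided))
print(equivalent_predictions(classes, balanced))
```

Under this target, a model that commits to only one of two accepted options incurs a higher loss than one that splits its probability between them. That is what allows more than one recommendation to be surfaced at prediction time.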

Presented by: Gregor Duwe MD, Department of Urology and Pediatric Urology, University Medical Center of the Johannes Gutenberg-University Mainz, Langenbeckstrasse, Mainz, Germany.  

Written by: Julian Chavarriaga, MD – Urologic Oncologist at Cancer Treatment and Research Center (CTIC) Luis Carlos Sarmiento Angulo Foundation via Society of Urologic Oncology (SUO) Fellow at The University of Toronto. @chavarriagaj on Twitter during the 2024 European Society for Medical Oncology (ESMO) Annual Meeting, Barcelona, Spain, Fri, Sept 13 – Tues, Sept 17, 2024.