A Leap Forward in Prostate Cancer Care: AI Model Guides Individualized Hormone Therapy Decisions, Journal Club - Rashid Sayyid & Zachary Klaassen

August 3, 2023

Rashid Sayyid and Zach Klaassen discuss an artificial intelligence (AI) predictive model for hormone therapy use in prostate cancer developed by Dr. Daniel Spratt. The model addresses the issue of unnecessary Androgen Deprivation Therapy (ADT), a treatment with severe side effects, currently recommended to varying extents based on National Comprehensive Cancer Network (NCCN) risk groups. The AI model integrates clinical data and digital pathology slide information, outperforming the NCCN risk grouping in identifying men with localized prostate cancer who benefit from ADT. The discussion reveals that over 60% of intermediate-risk patients did not benefit from additional ADT in the study. The next steps involve further developing and testing this model using NRG Oncology Phase 3 trial data, potentially improving personalized treatment and sparing patients from unnecessary side effects.

Biographies:

Rashid Sayyid, MD, MSc, Urologic Oncology Fellow, Division of Urology, University of Toronto, Toronto, Ontario

Zachary Klaassen, MD, MSc, Urologic Oncologist, Assistant Professor Surgery/Urology at the Medical College of Georgia at Augusta University, Georgia Cancer Center, Augusta, GA


Read the Full Video Transcript

Rashid Sayyid: Hello, everyone. This is Rashid Sayyid. I'm a urologic oncology fellow at the University of Toronto and along with Zach Klaassen, Program Director and Associate Professor at Augusta University, we'll be discussing a recent publication looking at an artificial intelligence predictive model for hormone therapy use in prostate cancer. This paper was recently published by Dr. Daniel Spratt et al. in the New England Journal of Medicine Evidence.

So we know that the addition of ADT to radiation is associated with improved survival outcomes among men with localized prostate cancer. And the duration of recommended ADT depends on the NCCN risk group. For intermediate-risk patients, currently the recommendation is for four to six months of concurrent ADT, and for high-risk patients, it's currently two to three years. Despite these improved survival outcomes, the fact remains that the majority of men treated with radiation alone do not develop distant metastases, and as such, giving them ADT unnecessarily is associated with significant side effects, including hot flashes, declines in libido and erectile dysfunction, loss of muscle mass, osteoporosis, weight gain, and cardiac and neurocognitive worsening, amongst many other adverse effects.

Currently, there are no available validated predictive models for the reliable selection of which prostate-cancer patients specifically derive a benefit from the addition of ADT to radiation, and as I alluded to in the prior slide, currently NCCN risk grouping is commonly used for the selection of ADT users plus the duration of ADT if used. And there have been attempts to utilize Gleason grading and tissue-based gene-expression, serum and imaging biomarkers to try to parse which patients may benefit from this. However, to date they've only demonstrated prognostic value, not predictive value. And the difference here is that with a prognostic tool, essentially you are just able to predict outcomes irrespective of the treatment, whereas with a predictive biomarker, you're able to parse which patients derive a specific benefit from a given treatment versus not. And so clearly, there's an unmet clinical need for an easy and reliable approach to guide the individualized use of concurrent ADT in localized prostate-cancer patients receiving radiation.

And one such tool that may help us in this setting is artificial intelligence, and there's been a lot of development in this space over the last few years whereby digital pathology-based AI tools have emerged for the diagnosis and grading of prostate cancer. However, these have been, for the most part, limited to human-interpretable and defined features such as the Gleason score. But in 2022, Esteva et al. developed the multimodal AI prostate prognostic model, which relies on digital histopathology and clinical data from five NRG Oncology Phase 3 trials, and this was used to develop and validate prognostic models using this AI tool that outperformed the NCCN risk grouping to determine which men with localized prostate cancer benefited from ADT addition.

And if we look at the schematic here, let's start up top. We see that this model incorporates clinical data such as age, PSA, Gleason score and the clinical T-stage, all routinely and widely available variables, but it also incorporates information from digital pathology slides. And as we see here, without getting into the details, these get processed, and these two inputs, the clinical data variables as well as the information from the digital pathology slides, get incorporated together to generate an AI score.

And then you may ask, "Well, how does this model use these different variables? Does it give equal weights to all the different variables?" And the fact is that this model relies very heavily on the Gleason primary score, as we see here, and also relies on the imaging features as well, generated from the digital histopathology slides.

And so what is the next step here? The next step is to extend this approach by adapting this multimodal AI prostate prognostic model to develop and test a predictive model based on deep learning to identify which intermediate-risk patients would benefit from ADT addition to XRT.

And so for the purposes of this analysis, the authors utilized data from NRG Oncology Phase 3 RCTs, with the following eligibility criteria for all the included RCTs. So all studies included patients with localized non-metastatic prostate cancer. The studies included must have had a subset of patients with intermediate-risk disease and must have included treatment with radiation, plus or minus ADT. We needed a long follow-up to ascertain the outcomes, so the median follow-up was set at at least eight years. And all the eligible studies needed to have histopathology slides stored in the NRG Oncology Biospecimen Bank that were available for digitization.

And so the first step with any model is developing the model. And so how did the authors develop the model? So they used studies from four NRG Oncology trials. So without getting into too many details, it started with the NRG/RTOG 9202, which included patients who received radiation plus four versus 28 months of ADT, and then the 9413, which looked at randomizing patients with four months of ADT with radiation and then looking at pelvic nodal versus no pelvic nodal radiation in intermediate/high-risk patients. And then 9910 looked at radiation plus 16 versus 36 weeks of ADT, and then the 0126, which looked at lower versus higher doses of XRT without ADT in intermediate-risk patients. So the good thing is all four studies included available data for histopathology, so all four studies were used to extract the imaging features for model development.

Next, the authors had to use different imaging, clinical, and outcomes data, and they only utilized the outcomes data from the 9910 and 0126 because for the most part these include intermediate-risk patients, which is the target cohort here, trying to figure out which patients would benefit from ADT addition, whereas 9202 and 9413 included a lot of high-risk patients. So for those two trials, they only used the imaging data without the clinical data for those reasons. Next, you have to validate the model. So for that purpose, they used the NRG/RTOG 9408, and this study included low, intermediate or high-risk prostate-cancer patients, so a very variable, heterogeneous population, which increases the fidelity of the model validation. In this study, patients received radiation plus or minus four months of ADT. So again, it addresses the question that we're trying to answer.

So the primary study objective was to develop and validate this AI-based predictive model to identify a differential benefit from the addition of short-term ADT to radiation in localized prostate cancer. For the purposes of this analysis, the primary endpoint was time to distant metastasis, and this was defined from randomization. They also looked at secondary outcomes such as prostate-cancer-specific mortality, metastasis-free survival, and overall survival.

So we're going to get a bit technical here and talk about how this works. How does the AI model essentially generate the imaging-based features? So what they do is they use initially unannotated H&E-stained slides that are digitized using a digital pathology scanner at a 20-times magnification level. And so they do the image feature extraction by recognizing and defining tissue features. Next, after they do that, all the tissues are sliced and divided into 256 by 256 pixel patches, and then they use a specific platform, the ResNet-50 extraction model, to essentially train it on the imaging patches. And then afterwards, they run 2.5 million tissue patches. They feed them through the model 200 times to train this model over and over and over, and so this is where the deep-learning feature comes into play.
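To make the tiling step described above concrete, here is a minimal sketch of cutting an image array into non-overlapping 256 by 256 pixel patches. It is an illustration only, not the authors' pipeline: the function name `tile_into_patches`, the choice to drop partial edge tiles, and the synthetic blank "slide" are all assumptions, and the tissue detection and ResNet-50 training steps are not shown.

```python
import numpy as np

def tile_into_patches(image, patch_size=256):
    """Split an H x W x C image array into non-overlapping square patches.

    Assumption for this sketch: edge regions that do not fill a whole
    patch are simply dropped.
    """
    height, width = image.shape[:2]
    rows = height // patch_size
    cols = width // patch_size
    patches = []
    for r in range(rows):
        for c in range(cols):
            top, left = r * patch_size, c * patch_size
            patches.append(image[top:top + patch_size, left:left + patch_size])
    return patches

# Synthetic stand-in for a digitized slide region (not real data).
slide = np.zeros((1024, 768, 3), dtype=np.uint8)
patches = tile_into_patches(slide)
print(len(patches), patches[0].shape)
```

In a real workflow each patch would then be passed to a feature extractor; here the point is only how one large image becomes many fixed-size inputs.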

Next, what they do is they incorporate the clinical variables. So we had initially the digital pathology slides. They incorporate the clinical variables, and this is called the downstream multimodal predictive model development, and it incorporates all of these features together. Once it does that, the model development cohort is essentially split into a training set of 60% and then a tuning set, and what the tuning set essentially does is hyperparameter tuning. So it essentially optimizes the model even further. Again, very technical details, but it's important to have an understanding of how this works in practice.

Next, after they do that, essentially the model generates scores for you, and this score, or the delta is what it's called, essentially gives you an idea of which patients are most likely to benefit versus not. And so for practical reasons, you need to split the score into a binary output: yes, patients will benefit, or model positive, and no, patients will not benefit, or model negative. Then you essentially see whether this cutoff is of clinical benefit. And so what they did is they plotted all the values, as you can see in the histogram to the left, and they used the 67th percentile to help define positive versus negative. So if your score is higher than the 67th percentile, it's considered model positive; less than the 67th, it's model negative. And then after the model was locked, it was validated in the NRG/RTOG 9408 cohort.
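The percentile cutoff described above can be sketched in a few lines. This is a hedged illustration rather than the study's code: the helper names `fit_cutoff` and `classify` are hypothetical, the scores are randomly generated stand-ins, and it assumes the cutoff is fixed on development scores and then applied unchanged to new scores once the model is locked.

```python
import numpy as np

def fit_cutoff(development_scores, percentile=67.0):
    """Return the score value at the given percentile of the development scores."""
    return float(np.percentile(development_scores, percentile))

def classify(scores, cutoff):
    """True = model positive (score above the cutoff), False = model negative."""
    return np.asarray(scores, dtype=float) > cutoff

# Randomly generated stand-in scores (not study data).
rng = np.random.default_rng(0)
development_scores = rng.normal(size=1000)

cutoff = fit_cutoff(development_scores)
dev_positive = classify(development_scores, cutoff)

# Once the cutoff is locked, new scores are compared against the same value.
new_positive = classify([cutoff - 1.0, cutoff + 1.0], cutoff)
print(dev_positive.mean(), new_positive)
```

By construction roughly one third of the development scores land above the cutoff; the share in a separate validation cohort does not have to match exactly.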

Just a bit on the statistical analysis. So essentially, it's a time-to-event analysis and we can do it either with Cox modeling, which doesn't account for the noise, the competing events, or you can do it using competing risk analysis, which accounts for competing events of interest. And so the authors here did the latter, whereby time to distant metastasis and prostate-cancer-specific mortality were adjusted for the competing risk of other-cause death. And then they used the Fine and Gray regression modeling, which is the equivalent of the Cox modeling for the competing risk analysis. Essentially, they did that to evaluate the association of short-term ADT with distant metastases and prostate-cancer-specific mortality rates.
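To illustrate why competing events matter, the sketch below computes a nonparametric cumulative incidence estimate for one cause in the presence of a competing cause. This is not the Fine and Gray regression used in the paper; it is a simpler descriptive estimator shown only to convey the idea, and the function name, event coding (0 = censored, 1 = event of interest, 2 = competing event) and four-patient toy data are all assumptions.

```python
import numpy as np

def cumulative_incidence(times, events, cause):
    """Nonparametric cumulative incidence for one cause with competing risks.

    events: 0 = censored, any other integer = a cause of failure.
    Returns the distinct event times and the cumulative incidence at each.
    """
    times = np.asarray(times, dtype=float)
    events = np.asarray(events, dtype=int)
    event_times = np.unique(times[events != 0])
    event_free = 1.0  # probability of no event of any cause before the current time
    cif = 0.0
    values = []
    for t in event_times:
        at_risk = np.sum(times >= t)
        d_cause = np.sum((times == t) & (events == cause))
        d_any = np.sum((times == t) & (events != 0))
        cif += event_free * d_cause / at_risk
        event_free *= 1.0 - d_any / at_risk
        values.append(cif)
    return event_times, np.array(values)

# Toy data (not study data): follow-up in years and event codes.
times = [1, 2, 3, 4]
events = [1, 2, 0, 1]  # 1 = distant metastasis, 2 = other-cause death, 0 = censored
event_times, cif = cumulative_incidence(times, events, cause=1)
print(event_times, cif)
```

In this toy example the cumulative incidence of the event of interest ends at 0.75, whereas treating the competing death as censoring and taking one minus a Kaplan-Meier estimate would give 1.0, which is the overestimation that competing risk methods are meant to avoid.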

And next, they tested for an interaction between the predictive model and treatment to evaluate this predictive model, and essentially, the treatment effects in the predictive model positive and negative subgroups were assessed in the same way as in the overall validation cohort to measure the relative treatment effect between the groups, again trying to parse out whether a model-positive cohort versus model-negative cohort... Essentially, is there enough discrimination to see whether this multimodal AI model has any predictive utility?

And at this point, I'll turn it over to Zach to go over the results and discussion for this study.

Zach Klaassen: Thanks so much, Rashid, for that detailed and excellent introduction of this study. So this is the CONSORT diagram for NRG/RTOG 9408 in the validation set. You can see here there were 2,028 patients registered and randomly assigned. This included 1,013 to radiotherapy plus short-term ADT and 1,015 to radiotherapy alone, and for this particular analysis, this included 788 patients with histopathology imaging data available in the radiotherapy plus short-term ADT arm and 806 patients in the radiotherapy alone arm.

These are the baseline characteristics for the study, and we'll focus on the far right, which is the imaging cohort, the ones that are included in this analysis. And we can see here that the median age was quite well balanced between these groups with regards to radiotherapy or radiotherapy plus short-term ADT, with a median age of roughly 70 years. The majority of patients, roughly three quarters, were white. However, I will mention that nearly 20% of these patients were Black, which is higher than in the majority of clinical trials in prostate cancer. These patients had excellent performance status, with more than 90% having a Karnofsky performance status of 90% to 100%. Baseline PSA median was eight for each of these groups. The most common T-stages were one and two, roughly 50-50 between those two groups. Most patients did not have nodal staging, so they were deemed NX at over 95%. The most common Gleason scores were less than seven and seven, making up more than 90% of this cohort, and the most common risk group was intermediate risk at just over 50% in both of these arms.

This is the cumulative incidence of distant metastases in the validation cohort of patients with histopathology images, by AI predictive model subgroup. So on the left, this is looking at the overall cohort. You can see here that radiotherapy plus short-term ADT is in blue; radiotherapy alone is in red. The 15-year estimate of distant metastases for radiotherapy alone was 9.8% compared to 5.9% for radiotherapy plus short-term ADT, so this was statistically significantly favoring radiotherapy plus short-term ADT, with a hazard ratio of 0.64 and a 95% confidence interval of 0.45 to 0.90.

Next, in the center of the screen is the predictive model positive group, so these are the patients that were deemed to be benefiting from short-term ADT. So this is 543 patients. We see again a significant benefit to short-term ADT with radiotherapy; a 15-year estimate of distant metastases, 14.4% for radiotherapy alone compared to only 4% for radiotherapy plus short-term ADT; hazard ratio favoring short-term ADT of 0.34; and 95% confidence interval of 0.19 to 0.63.

Importantly, on the far right is the predictive model negative group, so these are patients that the model said would not benefit from short-term ADT, and this makes up more than 60% of this cohort at 1,051 patients. And we see here no benefit to short-term ADT: 15-year estimate of distant metastases, 6.9% compared to 7.4% with radiotherapy alone, with a hazard ratio of 0.92 that was not statistically significant. Similar-looking figure, but looking at prostate-cancer-specific mortality, again in the overall cohort, a benefit to short-term ADT: 15-year estimate of 4.4% compared to radiotherapy alone of 8.6%, significant hazard ratio benefiting short-term ADT with radiotherapy of 0.52, and a 95% confidence interval of 0.35 to 0.78.

Again, we see with the predictive model positive a benefit to short-term ADT with a statistically significant hazard ratio of 0.28, 95% confidence interval of 0.14 to 0.57, and similarly to distant metastases in the patients that had predictive model negative for benefiting from short-term ADT, there was no benefit to radiotherapy plus short-term ADT, with a hazard ratio of 0.74. So we see very comparable results whether the endpoint is distant metastases or prostate-cancer-specific mortality.
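As a quick arithmetic check on the distant metastasis figures quoted above, the short sketch below computes the absolute difference in 15-year estimates for each group and the share of the imaging cohort that was model negative. The percentages and patient counts are taken from this discussion; the variable names are illustrative, and the differences are simple subtractions, not results reported by the study.

```python
# 15-year distant metastasis estimates quoted in the discussion, in percent:
# (radiotherapy alone, radiotherapy plus short-term ADT)
dm_15yr = {
    "overall": (9.8, 5.9),
    "model_positive": (14.4, 4.0),
    "model_negative": (7.4, 6.9),
}

# Absolute difference in percentage points for each group.
abs_diff = {
    group: round(rt_alone - rt_plus_adt, 1)
    for group, (rt_alone, rt_plus_adt) in dm_15yr.items()
}

# Share of the imaging cohort classified as model negative.
n_positive, n_negative = 543, 1051
negative_share = round(100 * n_negative / (n_positive + n_negative), 1)

print(abs_diff)
print(negative_share)
```

The two subgroup counts sum to 1,594, which matches the 788 plus 806 patients with imaging data in the two arms, and the model-negative share works out to about 66%, consistent with the "more than 60%" statement.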

This is a forest plot for all of these endpoints, again highlighting that the model-positive patients, for both distant metastases and prostate-cancer-specific mortality, benefited from short-term ADT, with the corresponding hazard ratios that we've seen on the previous slides.

These are additional forest plots for metastasis-free survival and overall survival in both the positive and negative predictive model groups. So these are additional, even further downstream endpoints. However, we see that while we did see a benefit for distant metastasis and prostate-cancer-specific mortality, there is no differential benefit by positive or negative model status in this dataset for either metastasis-free survival or overall survival.

So by way of discussion, the current standard of care for men with intermediate-risk, particularly unfavorable intermediate-risk localized prostate cancer, is radiotherapy plus short-term ADT. Despite the improvements of outcomes in all comers, the majority of men will not develop distant metastases with radiotherapy alone, although many will experience the side effects from ADT. Although there was a clear benefit of ADT in unselected patients in RTOG 9408, the majority of patients had no benefits identified. These results from the AI model indicate that more than 60% of patients at intermediate risk in this trial did not derive a benefit from ADT in addition to their radiotherapy.

With any model, specifically an AI-derived model, generalizability is key, and in the RTOG 9408 trial we see several points that support generalizability. First, patients were enrolled from more than 500 centers, primarily across the United States and Canada. It included academic, community and VA centers. And as I mentioned in the demographics, roughly 20% of patients in the validation cohort were Black or African-American.

So in conclusion, this study developed and independently validated, in a completed Phase 3 randomized trial, an AI-based predictive model to guide ADT use with radiotherapy in localized prostate cancer using a novel multimodal digital pathology AI-derived platform. Using this predictive model, this study showed that the majority of patients with intermediate-risk prostate cancer did not benefit from ADT treatment.

Thank you very much for your attention. We hope you enjoyed this UroToday Journal Club discussion of the recently published paper in the New England Journal of Medicine Evidence.