Correlation and Validation of Gene Expression Classifiers in Prostate Cancer: A Large-Scale Study - Daniel Spratt

November 7, 2023

Aditya Bagrodia hosts Daniel Spratt to discuss the complexities of gene expression classifiers in prostate cancer. Dr. Spratt's team conducted a study involving over 50,000 patients to evaluate the interchangeability of three major gene classifiers: Decipher, Oncotype, and Prolaris. Contrary to common belief, the study reveals that these tests are not functionally interchangeable, showing only poor to moderate correlation. While the study doesn't rank the tests, it does emphasize the need for clinicians to exercise caution before treating them as interchangeable options. Both doctors agree that staying updated on the latest research is crucial for making informed decisions in patient care. The discussion underscores the importance of robust validation and the need for personalized treatment approaches in prostate cancer.

Biographies:

Daniel Spratt, MD, Chair and Professor of Radiation Oncology, UH Cleveland Medical Center, Seidman Cancer Center, Case Western Reserve University, Cleveland, OH

Aditya Bagrodia, MD, FACS, Urologic Oncologist, Associate Professor of Urology, UC San Diego, San Diego, CA


Read the Full Video Transcript

Aditya Bagrodia: Hi, my name's Aditya Bagrodia. I'm a urologic oncologist at UC San Diego Health. And it's a pleasure to talk with Dr. Dan Spratt, who's the Chairman at UH Seidman Cancer Center, Case Western Reserve University Department of Radiation Oncology. Dan, how are you doing today?

Daniel Spratt: I'm doing fantastic. Thanks so much for having me.

Aditya Bagrodia: Perfect. Perfect. Well, really looking forward to kind of picking your brain on the work that you've most recently done, which is actually a piece of a body of work that you've built over the last decade or so, really digging into some of the biological underpinnings of prostate cancer and how that impacts our decision-making.

So maybe we could just jump into it. You're presenting on behalf of your team a fairly large experience, kind of comparing and contrasting some of the more commonly used gene expression classifiers. Can you tell us a little bit about that, Dan?

Daniel Spratt: Absolutely. So this work from my team was presented by Dr. Angela Jia, who's our Director of GU Radiation Oncology at our program. We took over 50,000 patients who had whole transcriptome testing with the Decipher microarray platform. And as many of the audience knows, there are three main commercial classifiers that are in the national guidelines today. There's the Decipher, or 22-gene classifier, there's the Oncotype, or GPS classifier, and there's the Prolaris, or CCP, or sometimes the CCR score. The acronyms get kind of confusing. But there are three main ones, and people sometimes don't know, which do I use? Is there one that's better, or is there one that's worse? And so recently, NCCN implemented these levels of evidence to try to say, "Well, we're not going to tell you which one's better, but we'll tell you which one has the strongest validation for clinical use."

And Decipher was the only test with that level of support, based on about 11 randomized trials that were analyzed post hoc with Decipher after they were completed. And so, it has level one biomarker evidence for use, and the others have level three evidence.

And so, the question we wanted to ask is, are these tests all interchangeable? Right? If the actual expressions of the various genes in these tests measure functionally the same thing, even though they're different genes, maybe this level of evidence doesn't matter so much. Given we are not going to send a biopsy sample, commercially pay for three companies to each profile it, and bill it to the patient, what we did is use all these patients who had the Decipher test, which also measures all the genes from these other assays, and derive a score for the Prolaris and a score for the Oncotype.
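(As a rough illustration only: the sketch below shows the general idea of a "derived score" computed as a weighted sum of a signature's genes from normalized whole-transcriptome expression data. The gene names and weights are hypothetical placeholders, not the commercial assays' proprietary algorithms, which are not described in this discussion.)

```python
import numpy as np
import pandas as pd

# Expression matrix: rows = patients, columns = genes (normalized values).
# Random numbers stand in for real microarray data.
rng = np.random.default_rng(0)
expr = pd.DataFrame(
    rng.normal(size=(5, 4)),
    index=[f"patient_{i}" for i in range(5)],
    columns=["GENE_A", "GENE_B", "GENE_C", "GENE_D"],
)

# Hypothetical signature: gene -> weight (sign reflects direction of association).
signature = {"GENE_A": 0.8, "GENE_B": -0.5, "GENE_C": 1.2}

def derived_score(expr: pd.DataFrame, signature: dict) -> pd.Series:
    """Per-patient weighted sum of the signature's genes (illustrative only)."""
    genes = [g for g in signature if g in expr.columns]
    weights = pd.Series({g: signature[g] for g in genes})
    return expr[genes].mul(weights, axis=1).sum(axis=1)

print(derived_score(expr, signature))
```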

Aditya Bagrodia: Yeah, it's fascinating. I mean, over the course of my education and training, you hear about these tests used interchangeably. There are certainly, I think, some institution-specific predilections. There's marketing; obviously this is big business, and that plays into it.

Familiarity with the reports, a lot of things which I think ultimately lead to people going with what they know, for better or for worse.

And I think the second part that's interesting is, sometimes it feels like it's a black box, right? You're getting this 22-gene expression classifier, or it's based on proliferation, or it's based on inflammation, and how they derived their predictive and prognostic values may not correlate, and they could have synergistic value. Who knows? But your study really was trying to dial in on whether they are similar. Is the readout similar? Is the prognostic value similar? Is that fair?

Daniel Spratt: Yeah. Really, we took the Decipher test, given it had the level one evidence, and said, "Let's use that as kind of the gold standard." And with these derived scores of the other two tests, do they strongly correlate, right? Because if the scores correlate one-to-one, then it really doesn't matter. They are functionally interchangeable, even if there's not necessarily the same validation.

But what we found, and I was actually quite surprised, is there's very poor to moderate correlation. One of the metrics is R squared, and it's only about 0.3. So really, the majority of the information was unique to Decipher or unique to the other tests. They were not fully correlated. So I think the take-home message is, at least in localized prostate cancer, where this was run in over 50,000 patients, these tests are not strongly correlated. They're not giving you the same readout. There are patients with high Decipher and low derived GPS or CCP scores, and vice versa. And so I think there just needs to be some caution about saying they're interchangeable. It's not telling you what you should use, but I wouldn't say they're interchangeable in what they're measuring.
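(For readers who want to see the arithmetic: an R squared of about 0.3 means the two scores share roughly 30% of their variance. A minimal sketch of this kind of comparison, using simulated stand-in scores rather than study data, might look like this.)

```python
import numpy as np

# Simulated stand-in scores (not study data): 1,000 patients with two
# imperfectly related risk scores, tuned so R^2 lands near 0.3.
rng = np.random.default_rng(1)
score_a = rng.uniform(0.0, 1.0, size=1000)                     # e.g., a 0-1 classifier score
score_b = 0.55 * score_a + rng.normal(0.0, 0.25, size=1000)    # a weakly related derived score

r = np.corrcoef(score_a, score_b)[0, 1]   # Pearson correlation coefficient
print(f"Pearson r = {r:.2f}, R^2 = {r ** 2:.2f}")
```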

Aditya Bagrodia: Yeah. So maybe digging into a little bit about the study, it's a large study, about 50,000 patients.

Daniel Spratt: Yeah.

Aditya Bagrodia: I think fairly representative of 21st century prostate cancer cohorts.

Daniel Spratt: Yeah.

Aditya Bagrodia: 20% or so were low risk, 40 to 60% were kind of favorable intermediate and, you know, unfavorable intermediate, and then a small 10 to 12% were high risk. Is that about right?

Daniel Spratt: Yeah, about that.

Aditya Bagrodia: And so for Decipher, that platform was actually based on looking at metastasis-free survival, correct?

Daniel Spratt: Yeah. Predicting whether you may see development of metastatic disease. Exactly.

Aditya Bagrodia: So from your study, we're not really able to say whether one of these tests performs better or worse in terms of meaningful endpoints: metastasis-free survival, cancer-specific survival, overall survival. We're just saying that these tests behave differently.

Daniel Spratt: Correct. And I think that's an important point. I mean, there are two things. One, you are spot on. We're not comparing these head-to-head to say one prognosticates better than the other. So that's important to note.

Two, these are derived scores. And I can't actually think of a point in time where the same tissue has its RNA extracted and two companies go head-to-head in a randomized trial, right? So I don't think that's ever going to happen. But I think that the way we typically decide how we treat patients, and the tests we use, is usually based upon the robustness and the confidence we have in their performance. And so, I think this sort of lends itself to saying that, if you're going to use the tests that have the strongest validation, you can let the NCCN or whatever guideline you use sort of guide your practice, but I wouldn't view these as simply interchangeable. But it doesn't mean that the one test performs worse than the other.

Aditya Bagrodia: Yeah. And it is complicated, right?

Daniel Spratt: Yeah.

Aditya Bagrodia: Is there value added in getting multiple tests, and are we just making this more and more-

Daniel Spratt: Yeah.

Aditya Bagrodia: Making this more and more complex, I suppose, is one way to phrase it. Again, could there be synergistic information coming from these?

And in my opinion, these really are kind of allowing us to personalize decisions beyond the relatively rudimentary tools that we have: PSA, stage, grade, family history, and some information we can derive, of course, from the pathology report, like percentage core involvement and cribriform or ductal patterns. So we're really looking at layering on something a bit more objective that can impact the broad strokes: treat or not treat, and if you're going to treat, how you treat. I mean, there really are some exciting clinical trials looking at either scaling back or intensifying androgen deprivation therapy across the unfavorable intermediate-risk, favorable intermediate-risk, and high-risk spectrum.

So I think these tests are going to be useful, massively important. And digging into which ones ultimately might be the best in helping us make these decisions is critical. And I think this is another really, really valuable piece of information, as we dissect out and tease out some of these important elements.

So how do you take this data that you all generated, and what are the next steps? Is it implementable into your practice? Are there kind of clinical takeaways as you see it?

Daniel Spratt: Yeah, I think so. Because you have multiple choices of tests out there, what our group has tended and shifted towards over the years is the use of the Decipher test, simply because of the robustness of that validation. And for various groups out there that, let's say, for logistical reasons or familiarity, use another test, I think this data may make them question and say, "Well, maybe these tests are not all the same." And it's really up to them: how important is the validation, in the context in which they use the test? And now that the NCCN guidelines have recognized that differential, or grading, of validation, this may be used. But I do believe we need to take our study with a grain of caution: we cannot prove one test is better, but we can say that the derived versions of the other tests don't correlate strongly with the gold standard, the Decipher test with level one evidence.

Aditya Bagrodia: Perfect. Perfect. And I think it's also incumbent on all of us to stay up to date with what's coming through the pipeline. Just because we adopted a test three, five, seven years ago, if it's static, or not developing the same degree of confidence, it behooves us to at least look at what's out there and make the best decision for ourselves and, of course, for our patients.

Daniel Spratt: 100%.

Aditya Bagrodia: Well, Dan, thanks for sharing all of this incredible information, continuing to move the needle for our patients with prostate cancer. Really appreciate your time and coming on today.

Daniel Spratt: Thank you so much. Appreciate it.