Assessing Reader Variability in PSMA-PET/CT Scan Interpretation: A Substudy of the VISION Trial, Journal Club - Rashid Sayyid & Zachary Klaassen
July 21, 2023
Rashid Sayyid and Zach Klaassen analyze an ad hoc analysis from the VISION study, published in the Journal of Nuclear Medicine. The study investigates the agreement between readers in the interpretation of 68Ga-PSMA-11 PET/CT scans to determine patient eligibility for 177Lu-PSMA-617 Radioligand Therapy in the treatment of prostate cancer. They elaborate on the study's methodology, including the reader selection, training, and proficiency assessment. The discussion further underscores the critical need to understand the interreader variability and intrareader reproducibility of visual assessments of 68Ga-PSMA-11 PET/CT scans. Dr. Klaassen presents the findings, emphasizing moderate-to-substantial interreader agreement and substantial-to-almost perfect intrareader agreement, suggesting the VISION read rules can be readily learned and reproduced among independent reviewers. They conclude by highlighting the study's potential implications, especially in non-academic settings.
Biographies:
Rashid Sayyid, MD, MSc, Urologic Oncology Fellow, Division of Urology, University of Toronto, Toronto, Ontario
Zachary Klaassen, MD, MSc, Urologic Oncologist, Assistant Professor Surgery/Urology at the Medical College of Georgia at Augusta University, Georgia Cancer Center, Augusta, GA
Read the Full Video Transcript
Rashid Sayyid: Hello, everyone. This is Rashid Sayyid. I'm a urologic oncology fellow at the University of Toronto, and along with Zach Klaassen, program director and associate professor at Augusta University, we'll be discussing a recent publication, which was an ad hoc analysis from the VISION study that looked at reader agreement on 68Ga-PSMA-11 PET/CT interpretation to determine patient eligibility for 177Lu-PSMA-617 Radioligand Therapy. This study was recently published in the Journal of Nuclear Medicine with Dr. Phillip Kuo as the first author.
So PSMA is a transmembrane glutamate carboxypeptidase that's highly expressed in prostate cancer cells. And 68Ga-PSMA-11 is the radioligand agent that's used to identify the PSMA-positive tumors on PET scan. The findings on 68Ga-PSMA-11 PET/CT were used to determine patient eligibility in the phase III VISION trial, whereby all patients had at least one PSMA-positive metastatic lesion, defined as lesion uptake greater than that of the liver, and no PSMA-negative metastatic lesions. Essentially, that's defined as having no lesions greater than 2.5 centimeters that don't light up on PSMA, no bone metastasis with a soft-tissue component of 1 centimeter or more in the short axis, and no solid organ lesion greater than or equal to 1 centimeter in the short axis diameter.
So essentially, you can't have a lesion detected on conventional imaging that has no PSMA-ligand uptake on the PET/CT scan. And so the results of this scan were used to select patients who were most likely to benefit from Lutetium. And so this tells us that appropriately selecting patients based on the result of PET/CT was very important for the purpose of this study. And just by way of an overview or a refresher, this is the study design for VISION. VISION was a phase III trial that included mCRPC patients who were treated with both 1 androgen receptor pathway inhibitor and 1 or 2 taxane regimens.
Patients were all planned for protocol-permitted standard of care before randomization. And then they were randomized to either standard of care alone or standard of care plus Lutetium. All patients were in good performance status with a life expectancy of at least six months. And again, importantly, they had PSMA-positive mCRPC on PET/CT with the Ga-PSMA-11 scan. And this was a positive trial. This showed both OS and rPFS benefits. As we see, the median overall survival improved from 11.3 to 15.3 months, so about a four-month improvement with Lu-PSMA.
But again, one of the challenges is determining patient eligibility. And so there are some concerns that there may be both interreader, so between different readers, and intrareader disagreements in reading these scans and determining eligibility. And so we need to better understand this, and this has important implications for practice when trying to determine which patients may benefit from Radioligand Therapy. And so there's some data from the proPSMA trial, which looked at 68Ga-PSMA in the staging setting. And they looked at how we are reading these scans. Are we agreeing amongst each other? What they did was they looked at the interreader agreement between local and central review and they found that it's quite high actually. For nodal metastases, the Kappa is about 0.87. So again, that's very impressive, and for distant metastasis it is also quite high. And so we need to transfer these results to the patient population that may be eligible for Radioligand Therapy.
And so the reliability of the read rules in this setting is yet to be determined. So that's the study objective for this analysis. Again, the objective was to investigate the interreader variability and intrareader reproducibility of visual assessments of 68Ga-PSMA-11 PET/CT scans using the VISION read criteria for Lu-PSMA-617 therapy eligibility. And so, in VISION, the 68Ga-PSMA-11 PET/CT scan was read by one reader from a pool of three board-certified nuclear medicine physicians/radiologists. And in this substudy, the scans were read by three independent central readers, each from a different institution. None of the three readers in this analysis were involved in VISION. So again, this is a different subset. And all were US board-certified nuclear medicine physicians, and two out of the three were board-certified in diagnostic radiology.
While all were experienced in reading PET/CT scans, none of them was experienced in reading 68Ga-PSMA-11 PET/CT scans or with the VISION read rules. Again, this has important implications for practice. In centers of excellence, you're going to find many specialists, but as we start thinking about broadening the applicability of this treatment for our mCRPC patients, it's very important to understand how we can transfer these results to different centers, and that can increase the uptake and appropriate use of this medication for our patients. Again, this has important clinical implications, particularly in non-academic settings. And then the nuclear medicine radiologist involved in the development of the VISION read criteria and training of the central readers was assigned as a trainer for this study. So essentially, they used the nuclear medicine physician from the VISION study to train these readers by a virtual teleconference platform. And then how did they ascertain that these nuclear medicine physicians were able to read them or were proficient enough? So at training completion, the readers were required to correctly assess 80% of 10 allocated training cases to ascertain proficiency.
What about scan selection and coding? So using a random generator, 125 scans along with their corresponding CT and MRI scans from VISION were selected. So out of the 125, 75 were deemed inclusion cases, or 60%, and 50 were exclusion cases, or 40%. Comparatively, in VISION, 85% of the screened cases were inclusion eligible. So you may ask why they included 60% when, in VISION, it was 85%. So they intentionally varied that to allow for a more robust analysis of interreader variability. And then out of these 125 scans, a subset of 20 cases, 12 inclusion and 8 exclusion cases, were also recoded for the evaluation of intrareader reproducibility. PET scans were uniquely ordered for each reader, so they couldn't cheat off each other. These were read over three consecutive days, about eight hours each day. The interreader variability, so between different readers, was assessed using the Fleiss kappa statistic.
So the lower the value, the more disagreement there is. The higher the value, the more agreement there is. And this generally varies between zero and one. So essentially, you always want to look at something at 0.6 or higher to deem this a good result. And then the overall average agreement was calculated as the average agreement rate across the three readers for each of the 125 cases.
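As a rough illustration of how a Fleiss kappa is computed for several readers making include/exclude calls, here is a minimal Python sketch. The function name `fleiss_kappa` and the four toy cases are assumptions made up for this example; they are not data or code from the substudy.

```python
from collections import Counter


def fleiss_kappa(ratings):
    """Fleiss' kappa for a list of cases, each a list with one label per reader.

    Every case must be rated by the same number of readers (at least two).
    """
    n_cases = len(ratings)
    n_raters = len(ratings[0])
    if n_raters < 2 or any(len(case) != n_raters for case in ratings):
        raise ValueError("each case needs the same number (>= 2) of ratings")

    label_totals = Counter()   # how often each label was used overall
    per_case_agreement = 0.0   # running sum of the per-case agreement P_i
    for case in ratings:
        counts = Counter(case)
        label_totals.update(counts)
        # P_i: share of reader pairs that agree on this case
        agreeing = sum(c * c for c in counts.values()) - n_raters
        per_case_agreement += agreeing / (n_raters * (n_raters - 1))

    p_observed = per_case_agreement / n_cases
    # Chance agreement from the overall label proportions
    p_chance = sum((t / (n_cases * n_raters)) ** 2 for t in label_totals.values())
    if p_chance == 1.0:
        raise ValueError("kappa is undefined when only one label is ever used")
    return (p_observed - p_chance) / (1.0 - p_chance)


# Hypothetical reads from three readers on four scans (toy data only)
toy_reads = [
    ["include", "include", "include"],
    ["include", "include", "exclude"],
    ["exclude", "exclude", "exclude"],
    ["include", "exclude", "exclude"],
]
print(round(fleiss_kappa(toy_reads), 3))  # 0.333
```

In this toy set the readers fully agree on two of four scans, yet the kappa is only about 0.33, because the statistic discounts the agreement expected by chance.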
Conversely, the pairwise variability and intrareader reproducibility were assessed using the Cohen kappa statistic. And then the agreement rate between the independent reads in the substudy and the VISION eligibility reads was calculated as the number of cases assessed as inclusion by each reader divided by the inclusion cases in VISION. So again, trying to see what the agreement is between the new readings and the original reads during the VISION study.
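For the pairwise and repeat-read comparisons, a Cohen kappa can be sketched the same way. The snippet below is a minimal illustration, assuming two aligned lists of include/exclude calls; the names `cohen_kappa` and `landis_koch_label` and the ten toy reads are hypothetical. The labels follow the commonly used Landis and Koch bands, which is where wording such as "moderate" and "substantial" agreement comes from.

```python
from collections import Counter


def cohen_kappa(reads_a, reads_b):
    """Cohen's kappa for two aligned lists of labels (two readers, or one reader twice)."""
    if len(reads_a) != len(reads_b) or not reads_a:
        raise ValueError("need two non-empty lists of equal length")
    n = len(reads_a)
    # Observed agreement: share of cases with identical calls
    p_observed = sum(a == b for a, b in zip(reads_a, reads_b)) / n
    # Chance agreement from each list's own label frequencies
    counts_a, counts_b = Counter(reads_a), Counter(reads_b)
    labels = set(counts_a) | set(counts_b)
    p_chance = sum(counts_a[label] * counts_b[label] for label in labels) / (n * n)
    if p_chance == 1.0:
        raise ValueError("kappa is undefined when both lists use a single label")
    return (p_observed - p_chance) / (1.0 - p_chance)


def landis_koch_label(kappa):
    """Map a kappa value to the conventional Landis and Koch descriptor."""
    if kappa < 0:
        return "poor"
    if kappa <= 0.20:
        return "slight"
    if kappa <= 0.40:
        return "fair"
    if kappa <= 0.60:
        return "moderate"
    if kappa <= 0.80:
        return "substantial"
    return "almost perfect"


# Hypothetical calls on ten scans (toy data only): the lists differ on one scan
first_read = ["include"] * 6 + ["exclude"] * 4
second_read = ["include"] * 5 + ["exclude"] * 5
print(round(cohen_kappa(first_read, second_read), 2))  # 0.8
print(landis_koch_label(0.54), "/", landis_koch_label(0.89))  # moderate / almost perfect
```

Here 9 of 10 calls match (90% raw agreement), but after discounting the 50% agreement expected by chance the kappa comes out at 0.8.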
And at this point, I'll turn it over to Zach to go over the results and discussion for this paper.
Zach Klaassen: Thanks so much, Rashid. So this is the concordance combination among the three readers, and I've highlighted two important discussion points. So when all three readers said that the images should be included, this occurred in 61% of cases. And importantly, when all three readers decided that the images should be excluded, this occurred in 16% of patients. And you can see a smattering of the other combinations with lower proportions in between the three inclusion and the three exclusion highlighted rows in this table.
So this is the pairwise interreader agreement for the 125 cases. On the left, we see the pairwise concordance combinations, and I've highlighted the Cohen kappa analysis for the various combinations of reader one versus two, one versus three, and reader two versus three. And essentially, we see a good interreader agreement between these radiologists with Cohen's kappa ranging from 0.54 to 0.67. Switching to intrareader agreement, so this is 20 cases among individual readers. We see the concordance combinations here on the left and again the Cohen kappa analysis with excellent intrareader agreement ranging from 0.76 to 0.89. This is the agreement between substudy read results and VISION eligibility read results for these 125 cases. And basically, the take-home message from this slide is the dark blue, concordant with VISION. So for readers one, two, and three, we see that this concordance with VISION ranged from roughly 76% to 79.5%.
Next, we'll talk about two discordant cases that were reviewed in this study. So this is the first case, looking at a left thoracic vertebral metastasis. You can see it highlighted by the blue arrow, roughly 1 centimeter, detected on CT scan. However, when we look at the right side, particularly at the bottom on the PSMA PET, this is a PSMA-negative vertebral lesion. So this is a discordant case between CT and PSMA. Similarly, this is case number two. This is a left lung nodule on the left, which you can see identified on CT scan at roughly 1.5 centimeters. But when we look at the corresponding PSMA PET, we see that this lung nodule does not light up. And this was a negative PSMA study, again highlighting discordance between CT scan and PSMA PET.
So by way of discussion, overall, the moderate-to-substantial level of interreader agreement in 68Ga-PSMA-11 scan interpretation in this VISION substudy was similar to what has been previously reported, although other studies used different study criteria and were in different disease settings. Intrareader reproducibility for repeated reads by the same reader was 90% to 95% with a corresponding Cohen kappa of 0.76 to 0.89, showing excellent agreement for all three readers. Comparison of read results in the substudy and VISION eligibility read results demonstrated an agreement rate of 76% to 80% with better concordance among inclusion cases. Importantly, all readers agreed on the assessment of 66 of 75 VISION inclusion cases.
In clinical practice, selection of patients for treatment with Lu-PSMA may require multidisciplinary consultation for scans that are borderline or difficult to interpret by a single reader. The authors did provide recommendations to minimize discordance in case assessment in clinical practice. First, a careful reading of the diagnostic CT scan using region-anatomic and organ-specific windows is recommended. Secondly, the often very high uptake of metastatic lesions in mCRPC can tempt readers to view the PET imaging at too wide of a window. Since the threshold of PSMA positivity or negativity is the liver, active windowing of the PET imaging with the liver initially placed in the middle of the window is also recommended. And finally, for extensive PSMA-positive adenopathy, focal areas of decreased uptake should be carefully assessed for negative nodes.
So in conclusion, this VISION substudy demonstrated moderate-to-substantial interreader agreement and substantial-to-almost perfect intrareader agreement on visual assessment of Ga-PSMA-11 PET/CT scans according to predefined VISION rules. Finally, the read rules used in VISION to determine patient eligibility for treatment with Lu-PSMA were readily learned and demonstrated good reproducibility among independent reviewers despite the limitations of this substudy.
We thank you very much for your attention and we hope you enjoyed this UroToday Journal Club discussion.