Development of a Longitudinal Prostate Cancer Transcriptomic and Clinical Data Linkage.

Although tissue-based gene expression testing has become widely used for prostate cancer risk stratification, its prognostic performance in the setting of clinical care is not well understood.

To develop a linkage between a prostate genomic classifier (GC) and clinical data across payers and sites of care in the US.

In this cohort study, clinical and transcriptomic data from clinical use of a prostate GC between 2016 and 2022 were linked with data aggregated from insurance claims, pharmacy records, and electronic health record (EHR) data. Participants were anonymously linked between datasets by deterministic methods through a deidentification engine using encrypted tokens. Algorithms were developed and refined for identifying prostate cancer diagnoses, treatment timing, and clinical outcomes using diagnosis codes, Current Procedural Terminology codes, pharmacy codes, Systematized Nomenclature of Medicine Clinical Terms, and unstructured text in the EHR. Data analysis was performed from January 2023 to January 2024.
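As an illustration of deterministic linkage on encrypted tokens, the following is a minimal sketch and not the deidentification engine used in the study. The identifier fields (first name, last name, date of birth), the HMAC-SHA256 tokenization, and the names `make_token` and `link_records` are all assumptions made for the example.

```python
import hashlib
import hmac


def make_token(secret_key: bytes, first: str, last: str, dob: str) -> str:
    """Build a keyed, non-reversible token from normalized identifiers.

    Hypothetical scheme: the field choice and HMAC-SHA256 are assumptions.
    """
    normalized = "|".join(part.strip().lower() for part in (first, last, dob))
    return hmac.new(secret_key, normalized.encode("utf-8"), hashlib.sha256).hexdigest()


def link_records(gc_records, clinical_records, secret_key: bytes) -> dict:
    """Deterministic linkage: two records match only if their tokens are identical."""
    clinical_by_token = {}
    for rec in clinical_records:
        token = make_token(secret_key, rec["first"], rec["last"], rec["dob"])
        clinical_by_token.setdefault(token, []).append(rec["clinical_id"])

    linked = {}
    for rec in gc_records:
        token = make_token(secret_key, rec["first"], rec["last"], rec["dob"])
        if token in clinical_by_token:
            linked[rec["gc_id"]] = clinical_by_token[token]
    return linked


if __name__ == "__main__":
    key = b"example-key-not-for-production"
    # Synthetic records for demonstration only.
    gc = [
        {"gc_id": "G1", "first": "Alex", "last": "Doe", "dob": "1955-02-01"},
        {"gc_id": "G2", "first": "Sam", "last": "Roe", "dob": "1960-07-15"},
    ]
    clinical = [
        {"clinical_id": "C9", "first": " alex ", "last": "DOE", "dob": "1955-02-01"},
    ]
    print(link_records(gc, clinical, key))
```

Because matching is exact on the token, normalization (trimming and lowercasing here) determines which records link; any identifier discrepancy that survives normalization produces a missed link rather than a partial match.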

Diagnosis of prostate cancer.

The primary outcomes were biochemical recurrence and development of prostate cancer metastases after diagnosis or radical prostatectomy (RP). The sensitivity of the linkage and identification algorithms for clinical and administrative data was calculated relative to clinical and pathological information obtained during the GC testing process as the reference standard.
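The sensitivity calculation against a reference standard can be sketched as follows. The abstract does not state which confidence interval method was used, so the Wilson score interval here is an assumption, and the function name `sensitivity_with_ci` and the example counts are hypothetical.

```python
import math


def sensitivity_with_ci(true_positives: int, false_negatives: int, z: float = 1.96):
    """Return (sensitivity, lower, upper) using a Wilson score interval.

    true_positives: reference-standard positives that the algorithm identified.
    false_negatives: reference-standard positives that the algorithm missed.
    z: normal quantile (1.96 gives an approximate 95% interval).
    """
    n = true_positives + false_negatives
    if n == 0:
        raise ValueError("no reference-standard positives")
    p = true_positives / n
    denom = 1 + z**2 / n
    center = (p + z**2 / (2 * n)) / denom
    half_width = z * math.sqrt(p * (1 - p) / n + z**2 / (4 * n**2)) / denom
    return p, center - half_width, center + half_width


if __name__ == "__main__":
    # Illustrative counts only, not study data.
    sens, lower, upper = sensitivity_with_ci(85, 15)
    print(f"sensitivity={sens:.3f}, 95% CI {lower:.3f}-{upper:.3f}")
```

With cohorts in the tens of thousands, as in this study, most interval methods give nearly identical results, which is consistent with the narrow intervals reported.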

A total of 92 976 of 95 578 (97.2%) participants who underwent prostate GC testing were successfully linked to administrative and clinical data, including 53 871 who underwent biopsy testing and 39 105 who underwent RP testing. The median (IQR) age at GC testing was 66.4 (61.0-71.0) years. The sensitivity of the EHR linkage data for prostate cancer diagnoses was 85.0% (95% CI, 84.7%-85.2%), including 80.8% (95% CI, 80.4%-81.1%) for biopsy-tested participants and 90.8% (95% CI, 90.5%-91.0%) for RP-tested participants. Year of treatment was concordant in 97.9% (95% CI, 97.7%-98.1%) of those undergoing GC testing at RP, and 86.0% (95% CI, 85.6%-86.4%) among participants undergoing biopsy testing. The sensitivity of the linkage was 48.6% (95% CI, 48.1%-49.1%) for identifying RP and 50.1% (95% CI, 49.7%-50.5%) for identifying prostate biopsy.

This study established a national-scale linkage of transcriptomic and longitudinal clinical data yielding high accuracy for identifying key clinical junctures, including diagnosis, treatment, and early cancer outcome. This resource can be leveraged to enhance understanding of disease biology, patterns of care, and treatment effectiveness.

JAMA Network Open. 2024 Jun 03 (epublish)

Michael S Leapman, Julian Ho, Yang Liu, Christopher P Filson, Xin Zhao, Alexander Hakansson, James A Proudfoot, Elai Davicioni, Darryl T Martin, Yi An, Tyler M Seibert, Daniel W Lin, Daniel E Spratt, Matthew R Cooperberg, Ashley E Ross, Preston C Sprenkle

Department of Urology, Yale University School of Medicine, New Haven, Connecticut; Veracyte, Inc, San Francisco, California; Department of Urology, Emory School of Medicine, Atlanta, Georgia; Department of Therapeutic Radiology, Yale School of Medicine, New Haven, Connecticut; Department of Radiation Medicine and Applied Sciences, University of California, San Diego, La Jolla; Department of Urology, University of Washington, Seattle; Department of Radiation Oncology, Case Western Reserve University, Cleveland, Ohio; Department of Urology, University of California, San Francisco, San Francisco; Department of Urology, Northwestern University Feinberg School of Medicine, Chicago, Illinois.