HPV+ OPSCC cohort. We identified 191 patient tumors previously profiled through the BD2Decide Project that met the criteria of definitive RT (12) (for details on patient selection, see Supplemental Figure 1; supplemental material available online with this article; https://doi.org/10.1172/JCI194073DS1). The characteristics for these HPV+ OPSCC patients are detailed in Supplemental Table 1.

GARD reveals underlying genomic heterogeneity in RT effect. We have previously shown that GARD reveals underlying heterogeneity in radiation treatment effect within groups presumed to have been treated uniformly (with approximately equivalent physical dose) (8, 9, 13, 14). In this cohort, we demonstrate again that GARD reveals wide heterogeneity in predicted RT effect despite relatively uniform RT dose prescribed. As shown in Figure 1, delivered GARD ranged from 15.4 to 71.7 (median = 39.1, IQR = 12.6). Plotted along the edges of the joint plot between GARD and equivalent dose in 2 Gy fractions (EQD2) are kernel density estimates for the entire cohort, revealing wide heterogeneity in delivered GARD (IQR = 12.6) in the setting of near homogeneity in RT dose (IQR = 0.04). The difference between GARD and EQD2 is best exemplified by the patients who received the whole course of standard radiation dose, with EQD2 measures between 69 and 71 Gy (see Figure 1). The range of GARDs for those patients was 19.7–71.7 (IQR = 12.7) even though they all were treated to (approximately) the same RT dose (EQD2), highlighting the wide differential in the predicted effect of our uniform clinical dosing strategies. The distributions of RSI and GARD by AJCC stage (AJCC Eighth Edition) did not differ significantly (Supplemental Figure 2 and Supplemental Table 3).
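The contrast between near-uniform dose and widely varying GARD can be illustrated with a minimal sketch. It is not the authors' code, and it assumes the GARD definition used in earlier GARD publications rather than anything stated in this section: a linear-quadratic model in which RSI is treated as the surviving fraction at 2 Gy, β is fixed at 0.05 Gy^-2, and GARD = n·d·(α + β·d). The EQD2 helper assumes a tumor α/β of 10 Gy. The function names and RSI values are hypothetical.

```python
import math

BETA = 0.05        # Gy^-2, assumed constant from earlier GARD publications
ALPHA_BETA = 10.0  # Gy, assumed tumor alpha/beta ratio for EQD2


def alpha_from_rsi(rsi: float) -> float:
    """Solve RSI = exp(-alpha*d - beta*d^2) for alpha at d = 2 Gy."""
    return -math.log(rsi) / 2.0 - BETA * 2.0


def gard(rsi: float, n_fractions: int, dose_per_fraction: float) -> float:
    """GARD = n * d * (alpha + beta * d), with alpha derived from RSI."""
    alpha = alpha_from_rsi(rsi)
    return n_fractions * dose_per_fraction * (alpha + BETA * dose_per_fraction)


def eqd2(n_fractions: int, dose_per_fraction: float) -> float:
    """Equivalent dose in 2 Gy fractions for the assumed alpha/beta ratio."""
    total = n_fractions * dose_per_fraction
    return total * (dose_per_fraction + ALPHA_BETA) / (2.0 + ALPHA_BETA)


# Hypothetical RSI values, all treated with 35 fractions of 2 Gy.
for rsi in (0.15, 0.30, 0.45, 0.60):
    print(f"RSI={rsi:.2f}  EQD2={eqd2(35, 2.0):.1f} Gy  GARD={gard(rsi, 35, 2.0):.1f}")
```

Under these assumptions every hypothetical patient has the same EQD2, while GARD changes with RSI alone, which is the pattern the joint plot in Figure 1 describes.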

Figure 1 GARD exhibits large underlying genomic heterogeneity in radiation effect compared with radiation dose alone. Left: EQD2 (median = 70.0, IQR = 0.7) is plotted against associated GARD (median = 39.1, IQR = 12.6) for each patient in the cohort. Kernel density estimates are plotted on each edge to show the distributions of the individual variables. Patients who received an EQD2 of 69–71 Gy (standard dosing) are indicated in yellow. Right: The GARD values of all patients receiving an EQD2 of 69–71 Gy (SOC) highlight GARD’s ability to stratify patients by their genomic heterogeneity. Data points are overlaid with a box-and-whisker plot, with boxes representing quartiles and whiskers extending to 1.5 times IQR.

GARD is continuously associated with OS in RT-treated HPV+ OPSCC patients. Previously, we demonstrated that GARD was associated with OS and recurrence risk and was predictive of RT benefit in a pooled pan-cancer analysis of 1,615 patients, including cohorts with HNSCC (9). Since RT therapeutic benefit is a critical factor impacting clinical outcome in HPV+ patients, we hypothesized that GARD would be associated with clinical outcome in this analysis of HPV+ OPSCC patients collected through the BD2Decide Project (12). To test this hypothesis, we performed a Cox proportional hazards analysis of GARD and OS in patients who were treated with definitive primary RT (n = 191) and those treated with SOC definitive primary RT (EQD2, 69–71 Gy) (n = 174). As shown in Figure 2, GARD is associated with OS as a continuous variable for patients treated with primary definitive RT and censored at 60 months. We found that for each unit increase in GARD there was an improvement in OS [HR (95% CI) = 0.941 (0.888, 0.998) per unit GARD, P = 0.041]. This association of GARD with OS also held when including only patients treated with primary definitive RT at SOC doses (EQD2, 69–71 Gy), as shown in Figure 2 [HR (95% CI) = 0.920 (0.857, 0.986) per unit GARD, P = 0.019]. These findings suggest that GARD can stratify patients by predicted effect even when radiation dose is approximately uniform.
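Because the hazard ratios above are reported per unit of GARD, their size over a clinically meaningful GARD difference is easier to read after rescaling. The short sketch below does that arithmetic, assuming the log-linear relationship that a Cox model with a continuous covariate implies. The 10-unit difference is an illustrative choice, not a value from the analysis, and the helper name is hypothetical.

```python
import math


def scale_hr(hr_per_unit: float, delta: float) -> float:
    """Hazard ratio for a `delta`-unit change, given a per-unit hazard ratio."""
    return math.exp(math.log(hr_per_unit) * delta)


# Per-unit estimates reported in the text: (HR, lower CI bound, upper CI bound).
reported = {
    "definitive RT (n = 191)": (0.941, 0.888, 0.998),
    "SOC dose subset (n = 174)": (0.920, 0.857, 0.986),
}

DELTA = 10.0  # illustrative GARD difference (an assumption, not from the text)

for label, estimates in reported.items():
    hr, lower, upper = (scale_hr(x, DELTA) for x in estimates)
    print(f"{label}: HR per {DELTA:g} GARD units = {hr:.2f} ({lower:.2f}, {upper:.2f})")
```

Rescaling the interval bounds in the same way is valid because the transformation is monotone on the log-hazard scale; it does not change the P value.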

Figure 2 GARD is a continuous predictor of OS in radiation-treated patients with HPV+ OPSCC. (A) Cox proportional hazards analysis demonstrates significant continuous association between GARD and OS for patients treated with primary definitive RT and censored at 60 months [P = 0.041, HR = 0.941 (0.888, 0.998) per unit GARD]. (B) Cox proportional hazards analysis demonstrates significant continuous association between GARD and OS for the subset of patients treated with SOC primary definitive RT (EQD2, 69–71 Gy) [P = 0.019, HR = 0.920 (0.857, 0.986) per unit GARD].

GARD is associated with OS in RT-treated HPV+ OPSCC patients. The BD2Decide cohort includes clinical variables for performance status (Eastern Cooperative Oncology Group [ECOG] > 0), T stage (T4 vs. T1–3), N stage (N2–N3 vs. N0–N1), and smoking pack years (>10). We performed univariable and multivariable analyses using these variables and GARD to test for statistical associations with OS. In the definitive primary RT cohort, N stage was significantly associated with OS in univariable analysis (P = 0.048), but in multivariable analysis, GARD was the only variable statistically associated with OS [HR = 0.943 (0.891, 0.999), P = 0.046]. These results are summarized in Table 1 and Supplemental Table 2.

Table 1 Multivariable analysis of definitive primary RT patients

In addition, we developed and evaluated a Cox regression model including the previously known prognostic clinical variables (T stage, N stage, smoking, and ECOG performance status) to determine whether a model including GARD improves overall model performance. As shown in Figure 3, the Cox model including clinical variables achieved an AUC (3-year OS) of 71.20, whereas GARD alone achieved a superior AUC of 78.26. Integrating GARD with the clinical variables improved the prognostic ability of the model (AUC = 83.81). We also evaluated the previously developed 3-cluster model (12), which, by itself, achieved an AUC similar to the clinical variable model (AUC = 72.83). However, integration of the 3-cluster information into the GARD plus clinical variable model did not improve the overall prognostic ability (AUC remained 83.81). This lack of improvement may be related to the relationship between clusters and GARD values (see Supplemental Figure 3 and Supplemental Table 4).
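To make the AUC comparison concrete, the following sketch computes a fixed-horizon AUC for a single risk score. It is a simplification under stated assumptions, not the estimator used for Figure 3 (which this section does not specify): patients censored before the horizon are dropped, whereas time-dependent AUC methods typically reweight them. The function name and toy data are hypothetical, and the result is on a 0 to 1 scale rather than the percentage scale used in the text.

```python
def auc_at_horizon(risk, time, event, horizon=36.0):
    """AUC of a risk score for death by `horizon` months.

    Cases: patients who died at or before the horizon.
    Controls: patients followed beyond the horizon.
    Patients censored before the horizon are dropped (a simplification).
    """
    cases = [r for r, t, e in zip(risk, time, event) if e == 1 and t <= horizon]
    controls = [r for r, t in zip(risk, time) if t > horizon]
    if not cases or not controls:
        raise ValueError("need at least one case and one control")
    # Mann-Whitney form: probability that a case outranks a control; ties count half.
    wins = sum((c > k) + 0.5 * (c == k) for c in cases for k in controls)
    return wins / (len(cases) * len(controls))


# Toy data (hypothetical). Higher GARD should mean lower risk, so -GARD is the score.
gard_values = [22.0, 28.5, 35.0, 41.0, 47.5, 55.0]
months = [14.0, 40.0, 30.0, 60.0, 20.0, 60.0]
died = [1, 0, 1, 0, 0, 0]

auc = auc_at_horizon([-g for g in gard_values], months, died)
print(f"3-year AUC for GARD alone on toy data: {auc:.3f}")
```

A combined model would be scored the same way, with the linear predictor from the fitted Cox model used as the risk score in place of -GARD.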

Figure 3 AUC analysis shows GARD outperforms standard clinical variables. In this AUC analysis, we compared Cox regression of clinical variables (red) with GARD alone (blue) and a combined model using all factors (black). The combined model shows dramatic improvement compared with clinical variables alone. This model included all 191 definitive primary RT patients and analyzed outcome at 3 years.

GARD predicts that empiric dose de-escalation would result in inferior clinical outcome. Although HPV+ patients have excellent prognosis, the interim analyses of HN005 have emphasized the importance of developing clinical tools to identify patient subsets with differential risk of clinical failure. We hypothesized that GARD could identify a subpopulation of HPV+ patients at differential risk of failure that may explain the failure of unselected empiric dose de-escalation as tested in HN005. In addition, understanding the differential risks of failure can lead to a better clinical strategy for dose de-escalation in selected patients. To develop this, we performed an exploratory discrete analysis based on an optimized cut point analysis. Minimizing the log-rank score at 1 discrete value revealed 2 groups with maximally different outcomes (see Supplemental Figure 4). This analysis revealed 1 cut point at GARD < 42 that optimally stratified patients, as shown in Figure 4A.
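The cut point search described above can be sketched as a scan over candidate thresholds, scoring each split with a two-group log-rank test. This is an illustrative implementation under assumptions, not the authors' procedure: it selects the threshold with the largest log-rank chi-square statistic (equivalently, the smallest log-rank P value), and the minimum group size, function names, and toy data are all hypothetical.

```python
def logrank_chi2(time, event, in_group):
    """Two-group log-rank chi-square statistic (1 degree of freedom)."""
    observed_minus_expected = 0.0
    variance = 0.0
    for t in sorted({ti for ti, ei in zip(time, event) if ei == 1}):
        at_risk = [i for i, ti in enumerate(time) if ti >= t]
        n = len(at_risk)
        n1 = sum(1 for i in at_risk if in_group[i])
        deaths = [i for i in at_risk if time[i] == t and event[i] == 1]
        d = len(deaths)
        d1 = sum(1 for i in deaths if in_group[i])
        observed_minus_expected += d1 - d * n1 / n
        if n > 1:
            variance += d * (n1 / n) * (1 - n1 / n) * (n - d) / (n - 1)
    return observed_minus_expected ** 2 / variance if variance > 0 else 0.0


def best_cut_point(marker, time, event, min_group_size=2):
    """Return (cut, chi2) for the split `marker < cut` with the largest statistic."""
    best = (None, 0.0)
    for cut in sorted(set(marker)):
        low = [m < cut for m in marker]
        if min(sum(low), len(low) - sum(low)) < min_group_size:
            continue  # skip splits that leave a very small group
        chi2 = logrank_chi2(time, event, low)
        if chi2 > best[1]:
            best = (cut, chi2)
    return best


# Toy data (hypothetical): GARD, follow-up in months, death indicator.
gard_values = [20, 25, 30, 35, 40, 45, 50, 55, 60, 65]
months = [10, 18, 25, 60, 33, 60, 60, 60, 60, 60]
died = [1, 1, 1, 0, 1, 0, 0, 0, 0, 0]

cut, chi2 = best_cut_point(gard_values, months, died)
print(f"best split: GARD < {cut} (log-rank chi-square = {chi2:.2f})")
```

Because the threshold is chosen to maximize the separation, the resulting P value is optimistic; that is the reason the text treats the cut point as exploratory.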

Figure 4 GARD predicts that uniform RT dose de-escalation in HPV+ patients would result in an inferior clinical outcome compared with SOC. (A) GARD-identified HPV+ patient subsets with differential risk of failure. An exploratory analysis identified 1 cut point that grouped patients in 2 risk levels. Patients that achieved the lowest GARD (<42) had a higher risk of failure (3-year OS = 90.5%). Statistical significance was evaluated using a log-rank test from Kaplan-Meier estimates. (B) In silico clinical trial designs. Left: Simulated unselected RT dose de-escalation (cisplatin + 60 Gy vs. cisplatin + 70 Gy). We utilized the RSI distribution of the BD2Decide cohort to generate GARD for 400 virtual patients randomized to either 60 or 70 Gy and repeated this 100 times. Right: GARD-based de-escalation. In 1 example of a potential trial, we simulated a GARD-selected trial where only patients with GARD ≥ 42 were eligible for randomization to de-intensification. Fx, fractions. (C) Simulation of unselected RT dose de-escalation (cisplatin + 60 Gy) resulted in inferior OS compared with SOC (cisplatin + 70 Gy). The unselected in silico clinical trial predicted that patients treated with RT dose de-escalation experience a statistically significantly worse OS when compared with SOC (3-year OS of 92.7% vs. 94.6%, nonoverlapping CIs). (D) Selective de-intensification produced similar OS.

Patients in the GARD-high group (GARD ≥ 42) had a 3-year OS of 100% (CI: 100%–100%) compared with 90% (CI: 85%–96%) for the GARD-low group (GARD < 42). These differences are statistically significant (P = 0.0045), though this analysis should be interpreted carefully because it was performed for hypothesis generation and the groups were chosen by maximizing differences post hoc.

One possible explanation for the HN005 results is that empiric dose de-escalation results in a small number of patients falling from the GARD-high cohort (GARD ≥ 42) to the GARD-low cohort (GARD < 42), leading to an inferior result for empiric dose de-escalation. To test this hypothesis, we performed an in silico clinical trial to evaluate GARD-based predictions of clinical outcome for empiric dose de-escalation to 60 Gy (with concurrent chemotherapy) as in HN005 (see Figure 4B for schema). We found that GARD predicts that empiric (unselected) dose de-escalation would result in an inferior clinical outcome. The predicted 3-year OS for patients modeled at 70 Gy was 94.6% compared with 92.7% for patients modeled at 60 Gy (Figure 4C). Empiric unselected dose de-intensification is predicted to increase the proportion of patients in the GARD-low group while decreasing the proportion of patients in the GARD-high group. The 70 Gy in silico arm had an average of 126 and 74 patients in the GARD-low and GARD-high groups, respectively, while the 60 Gy in silico arm had 168 and 32 patients in those groups.
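The shift in group membership between arms can be illustrated with a small resampling sketch. It covers only the GARD-group counts, not the survival modeling or the effect of cisplatin, and it rests on assumptions not stated in this section: the GARD definition from earlier GARD publications (which, for 2 Gy fractions, reduces to total dose × (−ln RSI)/2) and a synthetic RSI distribution standing in for the BD2Decide values. Names and parameters are hypothetical.

```python
import math
import random

CUT = 42.0  # exploratory GARD cut point reported in the text


def gard_2gy(rsi: float, total_dose: float) -> float:
    """GARD for 2 Gy fractions under the assumed definition.

    With d = 2 Gy the beta terms cancel, leaving total_dose * (-ln(RSI) / 2).
    """
    return total_dose * (-math.log(rsi) / 2.0)


def simulate_arm(rsi_pool, total_dose, n_patients, rng):
    """Resample RSI values with replacement and count GARD-high patients."""
    sample = [rng.choice(rsi_pool) for _ in range(n_patients)]
    return sum(gard_2gy(r, total_dose) >= CUT for r in sample)


rng = random.Random(0)
# Synthetic stand-in for the cohort RSI distribution (not BD2Decide data).
rsi_pool = [min(max(rng.gauss(0.33, 0.08), 0.05), 0.80) for _ in range(191)]

# 200 virtual patients per arm, repeated 100 times, mirroring the 400-patient design.
for dose in (70.0, 60.0):
    highs = [simulate_arm(rsi_pool, dose, 200, rng) for _ in range(100)]
    print(f"{dose:.0f} Gy: mean GARD-high per arm = {sum(highs) / len(highs):.1f}")

# RSI thresholds implied by the cut point under this formula.
for dose in (70.0, 60.0):
    print(f"{dose:.0f} Gy: GARD >= {CUT:g} requires RSI <= {math.exp(-2 * CUT / dose):.3f}")
```

Under these assumptions, lowering the dose tightens the RSI threshold for staying GARD-high, so patients with RSI between the two thresholds are the ones who change groups.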

Next, we determined whether we could use GARD to develop a clinical trial strategy that would predict an equivalent outcome at 70 or 60 Gy. In 1 approach, GARD can identify patients who would remain at or above the GARD-high cut point (≥42) at either 70 or 60 Gy. Based on simulations, approximately 16% of the HPV+ trial population would be eligible for dose de-escalation in this scenario. It should be noted that this approach excludes GARD-low patients and patients who fall from GARD-high to GARD-low at 60 Gy. The predicted OS curve for this approach to de-escalation is shown in Figure 4D. The 36-month survival proportion is equivalent (94.6%) in both arms of this simulated trial.

Another way to think about dose de-escalation is to ask the question: Can GARD identify a personalized target dose with the goal of maintaining current outcomes? This is fundamentally different from our previous approach, which asked if we could use GARD to select patients for stratified de-escalation to standard dose levels. In this approach, we instead asked what GARD cut point would provide equipoise with current SOC. Analyzing our cohort through this lens, we found that a substantially lower GARD cut point (GARD ≥ 32; see Supplemental Figure 5) would provide outcomes in line with current SOC. Figure 5A shows the outcome of patients who achieved a GARD of at least 32 compared with unselected patients in the BD2Decide cohort. As shown, the patients who achieved a GARD of at least 32 had the same OS as the whole unselected cohort, thus achieving equipoise with current SOC in unselected patients.

Figure 5 GARD-based opportunities for equipoise targeted RT dose reduction: maintaining equivalent outcomes to SOC. (A) GARD targeted equipoise RT dose reduction. Kaplan-Meier curves show that patients in the BD2Decide cohort who had a GARD of at least 32 achieved isocurative outcomes compared with the unselected cohort. Each patient could then have a prescription RT dose to match the target GARD (at least 32). Comparing this to SOC provided equivalent outcomes. Statistical significance was evaluated using a log-rank test from Kaplan-Meier estimates. (B) A histogram depicting the difference between the dose predicted to be required for each patient and the dose delivered. Red indicates patients who were underdosed (39 of 175), offering opportunities to improve oncologic outcomes, and blue indicates patients who were overdosed, suggesting opportunities to decrease toxicity. (C) Calculating the difference between fractions delivered in SOC and the number predicted on a per patient basis revealed a large potential for toxicity reduction at the population level. This averaged approximately 5 fractions (1 week of radiotherapy) per patient, but with large heterogeneity. XRT, radiation therapy.

In a trial designed this way, each patient would be assessed for their RSI, and then a physical radiation dose would be calculated such that they would achieve a prescribed GARD of at least 32. While exploratory and nonstandard, this analysis of a genomic prescription paradigm offers a window into a future where dose is truly personalized, allowing exactly enough radiation to be delivered for tumor cure while minimizing toxicity. Figure 5B shows the minimum dose required for each patient in the BD2Decide cohort to achieve a GARD of at least 32. Interestingly, the average dose needed (approximately 60 Gy) aligns well with our clinical intuition, but with large heterogeneity across individual patients. Of note as well is the large number of patients (22.3% in this cohort, 39 of 175) who we predict require between 60 and 70 Gy, revealing which patients would have inferior outcomes when de-escalated to 60 Gy.
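The genomic prescription idea can be sketched by inverting the GARD formula for the number of fractions. As before, this assumes the GARD definition from earlier GARD publications (β = 0.05 Gy^-2, α derived from RSI at 2 Gy), which is not restated in this section. The RSI values, the function name, and the choice of 2 Gy fractions are hypothetical.

```python
import math

BETA = 0.05         # Gy^-2, assumed constant from earlier GARD publications
TARGET_GARD = 32.0  # equipoise target described in the text
SOC_FRACTIONS = 35  # 70 Gy delivered in 2 Gy fractions


def fractions_for_target(rsi: float, dose_per_fraction: float = 2.0,
                         target: float = TARGET_GARD) -> int:
    """Smallest number of fractions n with n * d * (alpha + beta * d) >= target."""
    alpha = -math.log(rsi) / 2.0 - BETA * 2.0
    effect_per_fraction = dose_per_fraction * (alpha + BETA * dose_per_fraction)
    if effect_per_fraction <= 0:
        raise ValueError("non-positive effect per fraction for this RSI")
    return math.ceil(target / effect_per_fraction)


for rsi in (0.20, 0.30, 0.40, 0.50):  # hypothetical RSI values
    n = fractions_for_target(rsi)
    spared = SOC_FRACTIONS - n  # negative means more fractions than SOC
    print(f"RSI={rsi:.2f}: {n} fractions ({2 * n} Gy), {spared:+d} fractions spared vs SOC")
```

In this toy calculation the required dose moves steeply with RSI, which is consistent with the heterogeneity described for Figure 5B: some hypothetical patients need far fewer fractions than SOC, and others need more.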

This analysis also suggests that on average the toxicity (financial and clinical) of nearly 5 fractions/patient can be spared while maintaining similar outcomes (Figure 5C). However, the potential toxicity reduction for each patient is variable, with some patients predicted to only require 30 Gy, while a small minority may require higher doses than standard. Of note, there was a peak in the distribution between 60 and 70 Gy, meaning that as we reduced dose from 70 to 60 Gy without genomic guidance, we underdosed a substantial portion of patients, worsening our outcomes. This stands in contrast to our findings in non-small-cell lung cancer (15), where the dose escalation from 60 to 74 Gy (as in RTOG 0617) spanned a valley in the distribution, meaning that the escalation resulted in very few patients having a treatment benefit, while all received the increased toxicity.