Date: 2021-01-04

Ganciclovir reduced CMV disease and mortality at least 20 years after the original RCT. In a single-center study performed at the Fred Hutchinson Cancer Research Center from 1989 to 1990 (17), 72 allogeneic HCT recipients who were either CMV seropositive or who had received marrow from CMV-seropositive donors were screened weekly for CMV with viral cultures and were randomized to receive either ganciclovir or a placebo at the time of first positive culture. A description of the study design is provided in Figure 1, A and B, and baseline patient characteristics are shown in Supplemental Table 1 (supplemental material available online with this article; https://doi.org/10.1172/JCI133960DS1). A schematic of the VL analysis is shown in Figure 1C.

Figure 1 CONSORT diagram and study designs for the early treatment trial. Study design for Goodrich et al. RCT (A and B) and for VL kinetic analysis (C). (A) The reconstructed CONSORT diagram for the original RCT. (B) The original study design with surveillance and screening beginning at HCT and randomization beginning at the time of first positive surveillance culture. (C) The VL kinetics study design with analysis beginning at randomization (receipt of study drug) and ending at day 100 after HCT or a study endpoint of CMV disease or death, whichever occurred first.

The original trial was designed to enroll 116 patients, but it was terminated early after the interim analysis showed a large reduction in tissue-invasive CMV disease by 100 days after HCT. Ganciclovir significantly reduced the cumulative incidence of CMV disease and overall mortality at 100 and 180 days after HCT (Figure 2A). Extending the follow-up of the original RCT through chart review, we found that the cumulative incidence of CMV disease and of the composite endpoint of CMV disease or death remained significantly lower in the ganciclovir group after 20 years (Figure 2B). Overall mortality was also lower in the ganciclovir group after 20 years (Figure 2B), although the difference in mortality was no longer statistically significant by 10 years. When outcomes were counted from randomization rather than transplantation, the results were similar (Supplemental Figure 1). Detailed methods and results from the original study and extended follow-up are included in the supplemental material. By providing evidence of a successful intervention in an RCT, these results demonstrate that the Prentice definition can be applied to our data.

Figure 2 CMV disease and death clinical outcomes in the early treatment trial. CMV disease (right-censored for death), overall mortality, and first event of CMV disease or mortality in the placebo and ganciclovir (GCV) groups at time points defined in the original study (A) and at extended follow-up times out to 20 years (B). In all plots, the ganciclovir group is shown in red; the placebo group is shown in blue. Numbers at risk are shown below their respective plots (PLAC indicates the placebo group; GCV indicates the treatment group). Survival and first event of CMV disease or death curves were estimated using Kaplan-Meier methods. The cumulative incidence of CMV disease with death as a competing risk was estimated using the Aalen-Johansen method. Survival distributions and times to the composite endpoint of CMV disease or death were compared using a log-rank test. Cumulative incidence distributions for CMV disease with death as a competing risk were compared using Gray’s test.

Ganciclovir lowered CMV VL kinetics in the first 5 weeks after randomization. Validation of surrogate endpoints requires the measurement of candidate biomarkers at intermediate time points after randomization. Frozen serum samples left over from clinical testing were stored prospectively in a biorepository at Fred Hutchinson Cancer Research Center for all study participants. CMV DNA PCR VLs were measured from available samples collected at approximately weekly intervals up to day 100 after HCT. VLs collected near the time of randomization until the first event of CMV disease were included in the surrogate analysis. All 72 patients had VL samples available near the time of randomization. Sixty-five patients had at least 1 VL measured in weeks 1 through 5, and there was a median of 4 measurements per patient in both treatment groups. Detailed sample availability information is provided in Supplemental Table 2.

CMV VL kinetics, including mean VL (log10 IU/mL), maximum change in VL from randomization (log10 IU/mL), peak VL (log10 IU/mL), and percentage of positive VLs (viral shedding rate), were calculated from VLs measured in the first 5 weeks after randomization. Only early VLs (weeks 1–5) were included because surrogate endpoints are more useful when measured early after interventions and because many patients in the placebo group died or developed CMV disease soon after randomization. Weekly mean VLs, changes in VL, and all VL kinetics are shown in Figure 3.
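The four summary measures can be illustrated with a minimal Python sketch. It assumes each patient has a baseline VL and a list of weekly VLs already on the log10 IU/mL scale, that "maximum change" means the largest difference from the baseline value, and that a sample counts as positive when it exceeds the assay's limit of detection. The function name, argument names, and the handling of values at the limit of detection are illustrative assumptions, not the study's code.

```python
def vl_kinetics(baseline_log10, weekly_log10, lod_log10):
    """Summarize one patient's viral loads (log10 IU/mL) from weeks 1-5.

    All names and conventions here are illustrative assumptions.
    """
    if not weekly_log10:
        raise ValueError("at least one post-randomization viral load is required")
    n = len(weekly_log10)
    mean_vl = sum(weekly_log10) / n
    peak_vl = max(weekly_log10)
    # Assumed definition: largest difference between a weekly VL and the baseline VL.
    max_change = max(v - baseline_log10 for v in weekly_log10)
    # Shedding rate: percentage of samples above the limit of detection (assumed rule).
    pct_positive = 100.0 * sum(v > lod_log10 for v in weekly_log10) / n
    return {
        "mean": mean_vl,
        "peak": peak_vl,
        "max_change": max_change,
        "pct_positive": pct_positive,
    }


# Hypothetical patient: baseline 2.0, four weekly samples, limit of detection 1.5.
example = vl_kinetics(2.0, [2.5, 3.0, 1.0, 1.0], lod_log10=1.5)
```

For this hypothetical patient the sketch returns a mean of 1.875, a peak of 3.0, a maximum change of 1.0, and 50% positive samples.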

Figure 3 Weekly CMV VL kinetics in the early treatment trial. CMV VL kinetics from time of randomization (Week 0). (A and B) VL data are shown for patients who had not reached an endpoint of CMV disease or death by that week. GCV indicates patients in the ganciclovir treatment group (shown in red). Placebo indicates patients in the placebo treatment group (shown in blue). Error bars indicate 95% CI. The dashed horizontal line represents the limit of detection (LOD) of the CMV VL assay. VL kinetics summary calculations (C) were performed with the data shown in A and B. Box-and-whisker plots show the middle 50% of VL kinetics in gray boxes with a horizontal black line at the median. Whiskers extend upward from the third quartile at the top of the box to 1.5 times the IQR (the distance between first and third quartiles) and downward from the first quartile at the bottom of the box to 1.5 times the IQR. P values were calculated from 2-tailed t tests comparing the means of the viral kinetics in GCV versus placebo groups.

CMV VL kinetics fulfill the Prentice definition for valid surrogate endpoints. We evaluated whether each of the 4 VL kinetics defined above (mean, maximum change, peak, and percentage positive) is a valid surrogate for the clinical endpoints of tissue-invasive CMV disease or the composite endpoint of CMV disease or death by 8, 24, and 48 weeks after randomization (Figure 1C). In the main text, we focus on results for week 48. Results for weeks 8 and 24 are provided in Supplemental Results. Because patients were randomized at varying times from HCT based on positive viral culture results, we chose weeks rather than days to describe outcomes in the surrogate analysis to help differentiate time from randomization rather than time from transplantation.

We validated each VL kinetic based on fulfillment of the Prentice definition for valid surrogate endpoints. The Prentice definition requires that a hypothesis test of the treatment effect (e.g., ganciclovir effect) on the surrogate endpoint (e.g., VL kinetic) is a valid hypothesis test of the treatment effect on the clinical endpoint (e.g., CMV disease). In other words, if a clinical trial assessing the effect of a treatment was performed with the primary outcome being an effect on the surrogate marker rather than the clinical outcome, the overall conclusion would be the same as a study performed using the clinical endpoint (22, 23).
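In symbols, the definition and the three operational criteria commonly used to check it can be written as follows. The notation is an assumption introduced here for illustration: Z denotes treatment assignment, S the candidate surrogate (a VL kinetic), and T the clinical endpoint.

```latex
% Prentice definition: a test of no treatment effect on S
% is a valid test of no treatment effect on T.
\[
  P(S \mid Z) = P(S) \iff P(T \mid Z) = P(T)
\]
% Operational criteria:
\begin{align*}
  \text{(1)}\quad & P(S \mid Z) \neq P(S)
      && \text{treatment affects the surrogate} \\
  \text{(2)}\quad & P(T \mid S) \neq P(T)
      && \text{surrogate is associated with the clinical endpoint} \\
  \text{(3)}\quad & P(T \mid S, Z) = P(T \mid S)
      && \text{given the surrogate, treatment adds no information}
\end{align*}
```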

Prentice criterion 1. To satisfy the Prentice definition, surrogates must fulfill 3 main criteria. The first criterion requires that treatment (ganciclovir) affects the candidate surrogate endpoint (e.g., peak VL). This criterion was met for all VL kinetics as reported above and in Figure 3C, in that mean, maximum change, peak, and percentage of positive VLs were significantly lower in the ganciclovir group.

Prentice criterion 2. The second Prentice criterion is met if there is an association between candidate surrogates (VL kinetics) and clinical outcomes (CMV disease or death). Logistic regression models adjusted for aGVHD, CMV donor serostatus, and VL at randomization, but not for treatment group assignment, demonstrated that all VL kinetics met this criterion for all clinical endpoints at weeks 8, 24, and 48 (Supplemental Table 3), i.e., higher values of the VL kinetics correlated with significantly higher odds of CMV disease or death.

Prentice criterion 3. The third Prentice criterion states that for a given value of the candidate surrogate (e.g., maximum change in VL), the probability of the clinical outcome (e.g., CMV disease) is the same in each treatment group (ganciclovir or placebo group). We tested for fulfillment of this criterion with logistic regression models adjusted for aGVHD, CMV donor serostatus, VL at randomization, and treatment group. Because we adjusted for treatment group assignment and VL kinetics in these models, we were able to determine whether the treatment group assignment correlated with outcomes after adjustment for the VL kinetic. Thus, to fulfill Prentice criterion 3, the OR for the VL kinetic should be significantly greater than 1 at the P = 0.05 level, and the OR for the treatment assignment should not differ significantly from 1 at the P = 0.2 level (P value threshold higher to demonstrate similarity in values rather than difference). Figure 4 illustrates with asterisks that mean VL, peak VL, and maximum change in VL met Prentice criterion 3 (P < 0.05 for VL association; P ≥ 0.20 for treatment group association) for CMV disease by week 48 with no evidence of a treatment by marker interaction (P ≥ 0.20). Percentage of positive VLs nearly satisfied Prentice criteria (P = 0.07 for VL association). Mean, peak, and percentage of positive VLs also satisfied Prentice criteria for the composite outcome by week 48. Maximum change in VL did not meet Prentice criterion 3 for the composite outcome (P = 0.14 for treatment group association). Results for clinical endpoints occurring by weeks 8 and 24 were similar and are shown in Supplemental Figure 2.
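The decision rule described above can be sketched as a small Python function. It assumes the ORs and P values have already been estimated from the adjusted logistic regression models; the function name, argument names, and the OR values in the examples are hypothetical, and only the thresholds (0.05 and 0.20) and the two P values of 0.07 and 0.14 come from the text.

```python
def meets_prentice_criterion_3(or_kinetic, p_kinetic, p_treatment,
                               alpha_kinetic=0.05, alpha_treatment=0.20):
    """Apply the two-part rule for Prentice criterion 3 to fitted model results.

    The VL kinetic must remain associated with the outcome (OR > 1, P < 0.05),
    and treatment assignment must not (P >= 0.20) after adjustment for the kinetic.
    """
    kinetic_associated = or_kinetic > 1 and p_kinetic < alpha_kinetic
    treatment_not_associated = p_treatment >= alpha_treatment
    return kinetic_associated and treatment_not_associated


# Hypothetical ORs; P values chosen to mirror the cases reported in the text.
meets_prentice_criterion_3(or_kinetic=2.0, p_kinetic=0.01, p_treatment=0.50)  # True
meets_prentice_criterion_3(or_kinetic=2.0, p_kinetic=0.07, p_treatment=0.50)  # False
meets_prentice_criterion_3(or_kinetic=2.0, p_kinetic=0.01, p_treatment=0.14)  # False
```

The higher threshold for the treatment term makes the rule conservative: a kinetic is rejected as a surrogate whenever there is even weak evidence of a residual treatment effect.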

Figure 4 Prentice criteria evaluation using multivariable logistic regression and proportion of treatment effect captured in the early treatment trial. (A) Forest plots show the ORs for associations of VL kinetics with risk for CMV disease and CMV disease or death by week 48 after randomization, calculated from logistic regression models adjusted for baseline characteristics and treatment group. ORs for VL kinetics are indicated by navy dots, with 95% CIs indicated by navy lines; ORs with 95% CIs for treatment group assignment are shown with light green dots and lines. Asterisks (*) indicate VL kinetics that met the Prentice criteria by multivariable logistic regression testing, i.e., the coefficient for the VL kinetic was significantly different from 0 (P < 0.05), whereas the treatment group assignment coefficient was not significantly different from 0 (P ≥ 0.20). The treatment-by-marker interaction coefficient was not significantly different from 0 (P ≥ 0.20) for any kinetic. The percentage positive did not meet Prentice criteria for CMV disease, with P = 0.07 for the VL kinetic association. Max change did not meet Prentice criteria for CMV disease or death, with P = 0.14 for the GCV association. For mean, max change, and peak, ORs were calculated as the ratio of odds of the clinical outcome in groups differing by 1 log10 IU/mL. For percentage positive, the OR was calculated as the ratio of odds of the clinical outcome in groups differing by 25% in percentage of samples with detectable VL. Dashed vertical lines indicate OR = 1. (B) The percentages of ganciclovir’s effect on clinical outcomes captured by the candidate surrogate were calculated using Kobayashi and Kuroki’s measure (23) and are shown for each of the VL kinetics.

VL kinetics capture a large proportion of ganciclovir’s effect on clinical outcomes. We quantified how much of ganciclovir’s effect on clinical outcomes could be attributed to its effects on VL kinetics using the proportion of treatment effect captured by candidate surrogate endpoints (23). For the week 48 clinical outcome of CMV disease, several VL kinetics captured nearly all of the effect of ganciclovir: mean (99.9%), change (96.6%), peak (98.5%), and percentage positive (95.8%) (Figure 4B). Mean, maximum change, and percentage positive captured at least 93% of ganciclovir’s effect on the composite outcome of CMV disease or death at week 48, whereas peak captured 84.5% (Figure 4B). Almost all VL kinetics were considered “moderate” (> 63%) or “substantial” (> 85%) for composite outcomes by weeks 8 and 24. Maximum change captured 83.5% of ganciclovir’s effect on CMV disease by week 8, but other kinetics did not perform well for CMV disease by weeks 8 and 24 (Supplemental Figure 3B).
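To make the idea of a "proportion of treatment effect captured" concrete, the sketch below uses Freedman's classical proportion of treatment effect explained. This is a different and simpler measure than the Kobayashi and Kuroki measure used in this study, and it is shown only as an illustration of the concept. The function name and the coefficient values are hypothetical.

```python
def freedman_proportion_explained(beta_unadjusted, beta_adjusted):
    """Freedman's proportion of treatment effect explained by a surrogate.

    beta_unadjusted: treatment coefficient from a model without the surrogate.
    beta_adjusted:   treatment coefficient after adding the surrogate to the model.
    NOTE: this is NOT the Kobayashi-Kuroki measure used in the study.
    """
    if beta_unadjusted == 0:
        raise ValueError("the unadjusted treatment effect must be nonzero")
    return 1.0 - beta_adjusted / beta_unadjusted


# Hypothetical log-odds coefficients: the treatment effect shrinks from -2.0 to -0.5
# once the surrogate is included, so 75% of the effect is "explained".
proportion = freedman_proportion_explained(-2.0, -0.5)
```

A value near 1 means that, once the surrogate is accounted for, little of the treatment effect on the clinical outcome remains.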

Super Learner predicts clinical outcomes with high accuracy. The Super Learner is a cross-validation–based ensemble machine learning method for estimating the optimal weighted average of the predictions from a library of algorithms. Each of these algorithms estimates the conditional probability of an event (e.g., CMV disease or no CMV disease) given a set of potential risk factors (e.g., VL kinetics or baseline risk factors) using cross-validation (24, 25). For surrogate validation, in addition to providing optimal prediction accuracy, Super Learner predictions have the advantage of evaluating the ability of surrogate endpoints to predict clinical outcomes for individuals, rather than describing mean behavior on the population level (26). We built Super Learner models using baseline covariates (aGVHD, CMV donor serostatus, and VL at randomization) and all VL kinetics (mean, maximum change, peak, percentage positive). As an exploratory analysis, we also fit Super Learner models using absolute lymphocyte kinetics.
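The weighting step can be sketched in a few lines of Python. The example assumes two candidate algorithms whose cross-validated predictions are already available (each patient's prediction comes from a model fit without that patient's fold), a squared-error loss, and a simple grid search over convex weights. The learner library, loss function, and optimizer actually used in this study are not described here, so every name and value below is hypothetical.

```python
def cv_risk(weights, cv_preds, outcomes):
    """Mean squared error of the weighted average of cross-validated predictions."""
    total = 0.0
    for preds, y in zip(cv_preds, outcomes):
        blended = sum(w * p for w, p in zip(weights, preds))
        total += (blended - y) ** 2
    return total / len(outcomes)


def fit_two_learner_weights(cv_preds, outcomes, steps=100):
    """Grid-search the convex weights (w, 1 - w) that minimize cross-validated risk."""
    best_w, best_risk = 0.0, float("inf")
    for i in range(steps + 1):
        w = i / steps
        risk = cv_risk((w, 1.0 - w), cv_preds, outcomes)
        if risk < best_risk:
            best_w, best_risk = w, risk
    return (best_w, 1.0 - best_w), best_risk


# Hypothetical data: learner 1 predicts every outcome exactly, learner 2 always says 0.5.
outcomes = [1, 0, 1, 0]
cv_preds = [(1.0, 0.5), (0.0, 0.5), (1.0, 0.5), (0.0, 0.5)]
weights, risk = fit_two_learner_weights(cv_preds, outcomes)
```

In this toy case all of the weight goes to the first learner. With real data the weights are usually spread across several algorithms, and the ensemble is then refit on the full data set using those weights.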

We constructed receiver operating characteristic (ROC) curves to evaluate the sensitivity and specificity of Super Learner predictions for clinical outcomes and assessed their performance with leave-one-out cross-validated area under the ROC curves (cv-AUCs). cv-AUCs can be interpreted as the probability that a randomly selected patient experiencing a clinical outcome will have a higher predicted risk than a randomly selected patient not experiencing the outcome. Models that predict at the same level of accuracy as random chance have cv-AUC equal to 50%. Super Learner model predictions of both week 48 clinical outcomes yielded cv-AUCs greater than 90% (Figure 5, A–D). All models built on mean, maximum change, peak, and percentage positive VLs, whether fit separately on treatment groups or on the combined data set, predicted both clinical outcomes (CMV disease/CMV disease or death) at all time points (weeks 8, 24, and 48) with better than 85% cv-AUCs (Supplemental Figure 3A). Our results suggest that VL kinetics measured during the first 5 weeks of antiviral treatment combined with an ensemble machine learning algorithm allow for excellent clinical outcome prediction. In addition, models built on the placebo, ganciclovir, and combined groups performed similarly, consistent with the Prentice definition.
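The probabilistic interpretation of the AUC can be checked directly by comparing every patient who had the outcome with every patient who did not. The sketch below assumes the predicted risks are already cross-validated (each patient's risk comes from a model fit without that patient) and counts ties as one half. The function name and the example risks are hypothetical.

```python
def auc_from_pairs(risk_events, risk_nonevents):
    """Probability that a random patient with the outcome has a higher
    predicted risk than a random patient without it (ties count as 0.5)."""
    if not risk_events or not risk_nonevents:
        raise ValueError("both groups must contain at least one patient")
    wins = 0.0
    for e in risk_events:
        for n in risk_nonevents:
            if e > n:
                wins += 1.0
            elif e == n:
                wins += 0.5
    return wins / (len(risk_events) * len(risk_nonevents))


# Hypothetical cross-validated risks: 8 of the 9 event/non-event pairs are ordered correctly.
auc = auc_from_pairs([0.9, 0.8, 0.4], [0.5, 0.3, 0.2])
```

Here the result is 8/9, or about 89%. A model that assigns the same risk to everyone scores exactly 50%, matching the chance level described above.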

Figure 5 Prediction accuracy for clinical outcomes with Super Learner in the early treatment trial. (A and C) Receiver operating characteristic (ROC) curves are shown for Super Learner predictions for CMV disease and CMV disease or death by 48 weeks after randomization. The diagonal line drawn at y = x indicates the boundary above which ROC curves describe a prediction that is better than chance. (B and D) Forest plots show cross-validated area under the ROC curves (cv-AUC) of Super Learner predictions for CMV disease and CMV disease or death. For A–D, predictions made only on data from the placebo group are in blue, from the ganciclovir group (GCV) in red, and from both treatment groups (All) in purple. In B and D, the vertical line indicates cv-AUC = 50%, the area under the diagonal line in A and C.

To evaluate the contributions of VL kinetics to the accuracy of the Super Learner predictions, we fit Super Learner models using baseline characteristics only versus baseline characteristics plus all VL kinetics. We found that adding all VL kinetics to the baseline characteristics increased prediction accuracy greatly for all time points and both clinical outcomes (Supplemental Figure 4). For example, the model built on baseline characteristics alone had a cv-AUC of 75.5% (95% CI, 61%–90%) for CMV disease or death by week 8, but the cv-AUC increased to 96.8% (95% CI, 93%–100%) when VL kinetics were included.

Including absolute lymphocyte counts in Super Learner models improves prediction of some clinical outcomes. We calculated absolute lymphocyte count (ALC) kinetics, including ALC peak, ALC nadir, and mean ALC, during the 5-week period after randomization to explore whether adding longitudinal measures of immunity to the machine learning models might improve prediction accuracy for clinical outcomes (27). In addition, we added ALC at randomization to the baseline risk characteristics (donor CMV serostatus, aGVHD, and VL at randomization). We found that adding ALC kinetics did not change the prediction accuracy of CMV disease by earlier time points (weeks 8 and 24), but improved prediction of CMV disease by week 48. ALC also improved prediction of CMV disease or death at all time points (Supplemental Figure 5). However, importantly, absolute lymphocyte kinetics did not consistently increase or decrease with ganciclovir administration (Supplemental Figure 6), and thus cannot be assessed as surrogates for antiviral treatment. A surrogate of treatment effect must be affected in a consistent direction by the intervention.

Validation analysis of the ganciclovir prophylaxis RCT demonstrates that VL kinetics are valid surrogates in the prophylaxis setting. As follow-up to the early treatment trial, ganciclovir was studied as a prophylactic agent in a placebo-controlled RCT at the Fred Hutchinson Cancer Research Center from 1990 to 1991. Sixty-four CMV-seropositive allogeneic HCT recipients were randomized to receive ganciclovir or a placebo at engraftment and were followed for development of CMV infection (by culture) and CMV disease (18). The CONSORT diagram and trial design schematic are shown in Figure 6, A and B. Baseline patient characteristics are shown in Supplemental Table 4. We analyzed clinical outcomes by weeks 14, 24, and 48. The cumulative incidence of CMV disease was significantly lower in the ganciclovir treatment group by weeks 14 and 24, but no difference in mortality was found at these or later time points (Figure 7). The same results are shown in Supplemental Figure 7 in days and years from transplant rather than from randomization. The same VL kinetics (mean, peak, maximum change, and percentage of positive VLs, i.e., shedding rate) were calculated for the first 5 weeks after randomization. As in the early treatment RCT, all VL kinetics were significantly lower in the ganciclovir group, fulfilling Prentice criterion 1 (Figure 8).

Figure 6 CONSORT diagram and study design for the prophylaxis trial. Study design for Goodrich et al. AIM 1993 RCT. (A) The reconstructed CONSORT diagram for the original RCT. (B) The original study design with surveillance and screening beginning at HCT and randomization beginning at the time of engraftment.

Figure 7 CMV disease clinical outcomes in the prophylaxis trial. CMV disease (right-censored for death), overall mortality, and first event of CMV disease or mortality in the placebo and ganciclovir groups at 14, 24, and 48 weeks after randomization. The ganciclovir group is shown in red; the placebo group is shown in blue. Numbers at risk are shown below their respective plots (PLAC indicates the placebo group; GCV indicates the treatment group). Survival and first event of CMV disease or death curves were estimated using Kaplan-Meier methods. The cumulative incidence of CMV disease with death as a competing risk was estimated using the Aalen-Johansen method. Survival distributions and times to the composite endpoint of CMV disease or death were compared using a log-rank test. Cumulative incidence distributions for CMV disease with death as a competing risk were compared using Gray’s test.

Figure 8 Weekly CMV VL kinetics in the prophylaxis trial. CMV VL kinetics from time of randomization (week 0). (A and B) VL data are shown for patients who had not reached an endpoint of CMV disease or death by that week. GCV indicates patients in the ganciclovir treatment group (shown in red). Placebo indicates patients in the placebo treatment group (shown in blue). Error bars indicate 95% CI. The dashed horizontal line represents the limit of detection (LOD) of the CMV VL assay. (C) VL kinetics summary calculations were performed with the data shown in A and B. Box-and-whisker plots show the middle 50% of VL kinetics in gray boxes with a horizontal black line at the median. Whiskers indicate 1.5 times the IQR of the VL kinetics. P values were calculated from 2-tailed t tests comparing the means of the viral kinetics in ganciclovir (GCV) versus placebo groups.

Because no CMV disease events occurred in the ganciclovir group during the first 14 weeks of the study, we were unable to perform the analyses at this time point. Thus, CMV disease by week 24 served as our primary clinical outcome. Prentice criterion 2 was met for all VL kinetics by week 24 (Supplemental Table 5). Only the percentage of positive VLs (shedding rate) met Prentice criterion 3, demonstrating a significant association between the VL kinetic and CMV disease after adjustment for treatment group (Figure 9A). However, the remaining VL kinetics (mean, peak, and maximum change) nearly fulfilled this criterion: OR 2.4 (95% CI, 1.0–6.7), P = 0.07 for mean; OR 1.7 (95% CI, 1.0–3.2), P = 0.06 for peak; and OR 1.7 (95% CI, 1.0–3.2), P = 0.06 for maximum change in VL. In addition, CMV VL kinetics captured a large percentage of ganciclovir’s effect by week 24: mean captured 86.3%, peak 82.7%, maximum change 94.5%, and shedding rate 93.8% (Figure 9B). Super Learner models built using baseline characteristics of aGVHD, donor CMV serostatus, and baseline VL plus all VL kinetics, as in the main analysis, were able to predict CMV disease by week 24 with cv-AUCs greater than 75% (Figure 9, C and D). The results of this validation procedure support not only the robustness of our findings that VL kinetics can serve as surrogate endpoints for clinical outcomes under different treatment settings, but also their applicability to the modern antiviral prophylaxis setting.