Demographics. Of 601 patients enrolled in SCRIPT between June 2018 and March 2022, 585 had an adjudicated pneumonia category and clinical endpoints at the time of analysis (Figure 1): 190 had COVID-19, 50 had pneumonia secondary to other respiratory viruses, 252 had other pneumonia (bacterial), and 93 were initially suspected of having pneumonia yet were subsequently adjudicated as having respiratory failure unrelated to pneumonia (nonpneumonia controls). Except for BMI, demographics such as age and sex were similar between the groups (Figure 2, A–C, and Supplemental Table 1, which also includes a description of patient comorbidities). Severity of illness, as measured by the Acute Physiology Score (APS) from Acute Physiology And Chronic Health Evaluation (APACHE) IV (31) and the Sequential Organ Failure Assessment (SOFA) score (24, 32) in the first 2 days of admission, did not differ between the groups (Figure 2, D and E). Patients across the pneumonia categories underwent intubation following a similar duration of time in the ICU, with a trend toward later intubation in patients with COVID-19 (Supplemental Figure 1A and Supplemental Table 2; supplemental material available online with this article; https://doi.org/10.1172/JCI170682DS1). At the time of intubation, the SOFA scores were similar for patients with COVID-19 compared with other patients in the cohort who were intubated after admission to our hospital (Supplemental Table 2). On the first day of intubation, patients with COVID-19 had lower oxygen saturation levels despite a higher fraction of inspired O2 (FiO2) (Supplemental Table 2). They required higher levels of positive end-expiratory pressure (PEEP) but had lower heart rates and were receiving lower doses of norepinephrine (Supplemental Table 2).
Despite a similar overall severity of illness on ICU admission, the durations of intubation and ICU stays were more than twice as long among patients with SARS-CoV-2 pneumonia compared with any other group, reflected by a higher frequency of tracheostomy and longer ICU LOS (Figure 2, F–H, and Supplemental Figure 2). The longer ICU LOS persisted when patients who received extracorporeal membrane oxygenation (ECMO) support or who were received in external transfer (31.4% of the cohort) were excluded from the cohort (Supplemental Figure 1, B and C). Hospital mortality did not differ between groups (Figure 2I). A similar fraction of patients with SARS-CoV-2 pneumonia received corticosteroids during their ICU stay compared with the rest of the cohort, but patients with SARS-CoV-2 pneumonia received higher cumulative doses (Supplemental Table 1). Patients with SARS-CoV-2 pneumonia were more likely to receive IL-6 receptor antagonists and remdesivir (Supplemental Table 1).

Figure 1 CONSORT diagram of the SCRIPT study participants and analysis.

Figure 2 Demographics and outcomes of the cohort grouped by pneumonia category. Distribution of (A) patient age in years, (B) BMI in kg/m2 (1 patient did not have BMI data available), (C) sex, (D) APS, (E) SOFA score, (F) tracheostomy placement, (G) duration of intubation, (H) length of ICU stay, and (I) hospital mortality. Total days intubated (G) and total ICU days (H) include only days at our hospital and do not capture intubation duration or ICU LOS at a transferring hospital. Data on patients who lived include dispositions of discharge to home, acute inpatient rehabilitation, and admission to a long-term acute care hospital (LTACH) or skilled nursing facility (SNF) (see Supplemental Table 1). Data on patients who died include patients who died in the hospital, patients who underwent lung transplantation for refractory respiratory failure, and patients who were transferred to home or inpatient hospice. Both the APS from APACHE IV and the SOFA score were calculated from the worst values within the first 2 ICU days. In the box-and-whisker plots, the box shows quartiles and the median, and the whiskers show the minimum and maximum values except for outliers, which are shown as individual data points. Notches are bootstrapped 95% CI of the median. Numerical values were compared with the Mann-Whitney U test with FDR correction using the Benjamini-Hochberg procedure. Categorical values were compared using Fisher’s exact tests with FDR correction using the Benjamini-Hochberg procedure. A q value of less than 0.05 was the threshold for statistical significance. Numerical values and additional details are available in Supplemental Tables 1 and 2.
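The FDR correction named in this legend can be illustrated with a minimal sketch of the Benjamini-Hochberg step-up procedure. The function name and the example P values are hypothetical and are not taken from the study; the sketch shows only how q values are derived from a set of P values and compared with the 0.05 threshold.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg q values in the same order as the input P values."""
    m = len(p_values)
    # Indices of the P values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    q_values = [0.0] * m
    running_min = 1.0
    # Walk from the largest P value down so that each q value is the minimum
    # of its own scaled P value and the scaled values at all higher ranks.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        scaled = p_values[idx] * m / rank
        running_min = min(running_min, scaled)
        q_values[idx] = running_min
    return q_values


# Hypothetical P values from four pairwise comparisons (not study data).
example_p = [0.001, 0.04, 0.03, 0.5]
example_q = benjamini_hochberg(example_p)
# A comparison is called significant when its q value is below 0.05.
significant = [q < 0.05 for q in example_q]
```

In this toy example, two comparisons with nominal P values below 0.05 no longer meet the threshold after correction.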

CarpeDiem: a machine-learning approach to time-series data in the ICU. To address the challenge of comparing intercurrent ICU events between groups with different ICU LOS, we developed a machine-learning approach, CarpeDiem, to discretize each patient-day in the ICU. For all 12,495 ICU patient-days for the cohort, we extracted clinical data from the EHR describing 44 key clinical parameters, including flags for organ failures requiring mechanical support (e.g., mechanical ventilation, renal replacement therapy, and ECMO), continuously recorded clinical parameters (e.g., vital signs and doses of norepinephrine), and commonly measured laboratory values (Supplemental Figure 3A). Variables used to calculate the SOFA score are a subset of these parameters. Importantly, patient-intrinsic variables (e.g., demographics, BMI, tracheostomy, and diagnosis), biochemical and microbiological analyses of BAL fluid studies, and adjudication of VAP episodes were not included in the model. Correlation analysis identified expected associations between mathematically or physiologically coupled variables (e.g., plateau pressure, PEEP, and lung compliance; partial pressure of CO2 [PaCO2] and bicarbonate) and revealed clinically recognizable correlated features (e.g., ECMO, D-dimer, and lactate dehydrogenase [LDH]) (Supplemental Figure 3B). After reducing the weight of these highly correlated features, we performed clustering using several methods, all of which yielded similar results (Supplemental Figure 4, A–C). We designed a clustering strategy based on the similarity between patient-days (see details in Supplemental Methods) and selected the number of clusters by choosing a near-maximal difference in mortality between pairwise comparisons of clusters (Supplemental Figure 5A) while limiting the number of cluster breaks to those that were determined to be clinically meaningful by 4 ICU physicians (CAG, GRSB, RGW, BDS).
To explore the stability of our clustering approach, we randomly excluded patients from our cohort and independently reclustered this subset. While the overall patterns of clustering were similar, the assignment of patient-days to specific clusters differed (Supplemental Figure 5B). We visualized the resulting 14 clusters using heatmaps (Figure 3, A and B) and uniform manifold approximation and projection (UMAP) plots (Supplemental Figure 6). Median SOFA scores for the days in each cluster are shown in Supplemental Figure 7. Every cluster contained patients and patient-days from each pneumonia category, including COVID-19 (Supplemental Figure 8, A and B). Thus, the clinical states defined by CarpeDiem are useful to compare patient-days within a given cohort but do not represent a priori states to which patient-days can be prospectively assigned.

Figure 3 CarpeDiem groups patient-days into clusters representing clinical states associated with differential hospital mortality. (A) Heatmap of 44 clinical parameters with columns (representing 12,495 ICU patient-days for 585 patients) grouped into CarpeDiem-generated clusters (clinical states) ordered from the lowest to highest mortality rates. Rows are sorted into physiologically related groups. The top row signifies the hospital mortality outcome of the patient shown in the column (blue = lived, red = died). The hospital mortality rate associated with each cluster is shown above the heatmap. (B) Heatmap of the composite signal from each cluster and physiological group with ordering the same as in A.

As CarpeDiem uses physiological parameters and laboratory values evaluated by clinicians to develop a daily plan of care, the clusters generated by CarpeDiem are recognizable as clinical states. To visualize these data, we arranged the parameters into 6 physiological groups (neurologic, respiratory, shock, renal, inflammatory, and ventilator instability) and sorted the clusters in order of increasing mortality. The resulting heatmaps (Figure 3, A and B) and spider plots (Figure 4) revealed an association between patient-days characterized by multiple organ failure and mortality, findings consistent with published scoring systems (24–26). We compared mortality for each clinical state identified by CarpeDiem on the first, median, and last day in the ICU. On the first day of the ICU stay, only 2 of the clinical states were significantly associated with outcome (Supplemental Figure 9A). In contrast, the same analysis for the median and last ICU day for each patient revealed 8 and 9 significant associations, respectively, between clinical state and outcome (Supplemental Figure 9, B and C), supporting the construct validity of the CarpeDiem-generated clusters and the rationale to use CarpeDiem in an unsupervised fashion to evaluate all days of the ICU stay.
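The ordering of clusters by increasing mortality can be sketched as follows. This is an illustration only: it assumes that the mortality of a cluster is the fraction of its patient-days contributed by patients who died in the hospital, and the function name and toy records are hypothetical rather than drawn from the study code or data.

```python
from collections import defaultdict


def rank_clusters_by_mortality(patient_days):
    """Map raw cluster labels to clinical-state numbers ordered by mortality.

    patient_days: iterable of (cluster_label, patient_died) pairs, one per
    ICU patient-day. Assumption: cluster mortality is the fraction of its
    patient-days that belong to patients who died in the hospital.
    """
    days = defaultdict(int)
    deaths = defaultdict(int)
    for label, died in patient_days:
        days[label] += 1
        deaths[label] += int(died)
    mortality = {label: deaths[label] / days[label] for label in days}
    # State 1 has the lowest mortality; ties are broken by label.
    ordered = sorted(mortality, key=lambda label: (mortality[label], label))
    state_number = {label: i + 1 for i, label in enumerate(ordered)}
    return state_number, mortality


# Toy patient-days (hypothetical): label "a" is the highest-mortality cluster.
toy_days = [("a", True), ("a", True), ("b", False),
            ("b", True), ("c", False), ("c", False)]
toy_states, toy_mortality = rank_clusters_by_mortality(toy_days)
```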

Figure 4 CarpeDiem clinical states have different patterns of organ dysfunction. Spider plots of minimum–maximum normalized composite features from Figure 3B for each clinical state. Circles indicate values of 0.2 (innermost), 0.4, 0.6, 0.8, and 1 (outermost).
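A minimal sketch of the minimum–maximum normalization used for the spider plots is given here, assuming that each composite feature is rescaled across the clinical states so that its smallest value maps to 0 and its largest to 1. The function name and input values are hypothetical.

```python
def min_max_normalize(values):
    """Rescale a list of numbers linearly so the minimum is 0 and the maximum is 1."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # A constant feature carries no contrast; map every value to 0.
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


# Hypothetical composite respiratory signal for three clinical states.
normalized = min_max_normalize([2.0, 4.0, 6.0])
```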

Critical care physicians (CAG, GRSB, RGW, BDS) used these visualizations to interpret the clinical states. For example, clinical state 12 represents patient-days with very severe respiratory failure (mostly days spent receiving ECMO support), moderately high levels of sedation, an intermediate level of shock without substantial renal failure, and relatively stable ventilator settings. Importantly, while enriched for patients receiving ECMO support, clinical state 12 consisted of days spanning the duration of the ICU stay (Supplemental Figure 10), supporting the notion that ECMO is a marker of persistent, severe respiratory failure rather than a salvage or perimortem intervention applied at the end of the ICU stay. An illustration of time-series data and transitions between clinical states over a selected patient’s ICU course is provided in Supplemental Figure 6J.

Validation of the CarpeDiem approach in the MIMIC-IV data set. We next determined whether the CarpeDiem approach could be used to analyze an external data set. Within the Medical Information Mart for Intensive Care IV (MIMIC-IV) database of ICU patients (33), we identified the subset of 1,284 ICU stays similar to those in our cohort. The CarpeDiem approach applied to 15,642 ICU patient-days using 27 clinical parameters, a subset of the 44 used above that were readily available in the MIMIC-IV database, identified 12 clusters (Supplemental Figure 11, A–C). Similar to our observations in the SCRIPT cohort, CarpeDiem-generated clusters in MIMIC-IV were clinically recognizable with increasing organ failure associated with mortality (Supplemental Figure 11, D and E). Although these results support the generalizability of the CarpeDiem approach, the clinical states observed in the MIMIC-IV cohort were not identical to those in the SCRIPT cohort. This observation might be expected, as, for example, MIMIC-IV had very few patients who received ECMO, underscoring the concept that clinical states cannot be assigned a priori in a given cohort.

CarpeDiem reveals that the long LOS among patients with COVID-19 is associated with prolonged stays in clinical states characterized by severe respiratory failure. We reasoned that CarpeDiem could provide insight into the reasons why patients with severe SARS-CoV-2 pneumonia had longer ICU LOS relative to patients with pneumonia and respiratory failure secondary to other etiologies despite similar hospital mortality rates. We posited that this observation could result from (a) longer stays in a given clinical state with similar numbers of transitions between states, as would be observed for prolonged respiratory failure, or (b) similar durations of stay in any given clinical state with a balanced increase in the number of transitions between favorable and unfavorable states, as might be observed in patients developing multiple organ dysfunction. Although the absolute number of transitions between clinical states was higher among patients with SARS-CoV-2 pneumonia when compared with all other patient groups in the cohort (Figure 5A and Supplemental Figure 12A), the frequency of transitions was significantly lower (Figure 5B and Supplemental Figure 12B). The longer ICU LOS experienced by patients with severe SARS-CoV-2 pneumonia resulted from significantly prolonged stays in 4 clinical states (Figure 5C). Clusters that were enriched in days from patients with COVID-19 had higher respiratory severity scores (Figure 3, Figure 4, and Figure 5D), illustrating that patients with COVID-19 spent a disproportionate amount of time in clusters characterized by hypoxemic respiratory failure. Time spent in clinical state 12, characterized by severe hypoxemic respiratory failure, accounted for 29.9% of the difference in ICU LOS experienced by patients with COVID-19.
Overall, since some clusters were deficient in patients with COVID-19, time spent in the 4 clinical states that were significantly enriched in patients with COVID-19 accounted for over 100% of the difference in ICU LOS between patients with and without COVID-19.
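A share exceeding 100% is an arithmetic consequence of states in which patients with COVID-19 spent less time than other patients, which contribute negatively to the total difference. The sketch below illustrates this with hypothetical mean days per patient in three states; the numbers, the function name, and the use of per-patient means are assumptions for illustration and are not study data.

```python
def los_difference_shares(mean_days_a, mean_days_b):
    """Share of the total LOS difference (group A minus group B) per clinical state."""
    states = sorted(set(mean_days_a) | set(mean_days_b))
    diffs = {s: mean_days_a.get(s, 0.0) - mean_days_b.get(s, 0.0) for s in states}
    total = sum(diffs.values())
    if total == 0:
        raise ValueError("groups have identical total LOS; shares are undefined")
    return {s: d / total for s, d in diffs.items()}


# Hypothetical mean ICU days per patient in each state (not study data).
covid_days = {1: 2.0, 2: 6.0, 3: 8.0}
other_days = {1: 4.0, 2: 3.0, 3: 3.0}
shares = los_difference_shares(covid_days, other_days)
# States 2 and 3 are the "enriched" states in this toy example.
enriched_share = shares[2] + shares[3]
```

In this toy example, the enriched states account for about 133% of the difference, offset by a negative share of about 33% from state 1.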

Figure 5 The long LOS among patients with COVID-19 is driven by a lower frequency of transitions, resulting in longer durations of time spent in certain clinical states. (A) Distribution of transitions per patient. (B) Distribution of transitions normalized by ICU LOS. (C) Distribution of ICU days spent in each clinical state per patient. The y-axis is discontinuous to accommodate all data points. (D) Respiratory severity score per clinical state, which is numbered next to each point, split by whether that cluster was enriched in patient-days for patients with COVID-19. Green line indicates the median respiratory severity score for the cohort. For the box-and-whisker plots, the box shows quartiles and the median, and whiskers show the minimum and maximum values except for outliers, which are shown as individual data points. Numerical values were compared using Mann-Whitney U tests with FDR correction using the Benjamini-Hochberg procedure. A q value of less than 0.05 was our threshold for statistical significance.

To examine the robustness of our findings with regard to changes in the composition of the cohort, we randomly excluded 20% of the cohort and reclustered patient-days 500 times. As shown in Supplemental Figure 13, the main conclusions drawn from the full data set hold after random subsampling, including the finding that patients with COVID-19 experienced fewer transitions per day irrespective of outcome (as in Figure 5B) and experienced longer stays in clusters with high respiratory severity scores (as in Figure 5D).
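The subsampling scheme can be sketched as repeated random exclusion of a fixed fraction of patients. Only the subsampling step is shown; the reclustering applied to each subset is described in Supplemental Methods and is not reproduced here. The function name, the seed, and the use of integer patient identifiers are hypothetical.

```python
import random


def subsample_cohort(patient_ids, exclude_fraction=0.2, n_rounds=500, seed=0):
    """Yield patient-id subsets, each excluding a random fraction of the cohort."""
    rng = random.Random(seed)
    ids = list(patient_ids)
    keep = len(ids) - round(len(ids) * exclude_fraction)
    for _ in range(n_rounds):
        # Sample without replacement so that no patient appears twice.
        yield sorted(rng.sample(ids, keep))


# 585 placeholder patient identifiers; each subset would be reclustered
# independently and the downstream comparisons repeated on it.
subsets = list(subsample_cohort(range(585)))
```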

To explore the potential utility of the CarpeDiem approach within the context of a randomized, controlled trial, we analyzed the 10 patients within SCRIPT who were also enrolled in a randomized, placebo-controlled trial of the IL-6 receptor antagonist sarilumab for the treatment of patients with respiratory failure secondary to COVID-19. The results of randomized, controlled trials of IL-6 receptor antagonists in patients with COVID-19 have been mixed (34), with some trials reporting benefit, while others, including this trial (35), did not. We calculated the sum of CarpeDiem-defined clinical state transitions occurring 3 and 5 days following randomization to sarilumab (n = 6) or placebo (n = 4). Even within this very small group, we observed significantly more favorable transitions in patients who received sarilumab compared with those who received placebo in the 3 days after drug administration (Supplemental Figure 14, A and B). In contrast, no statistically significant difference was evident 5 days after randomization (Supplemental Figure 14C).

Unresolving VAP drives poor outcomes in patients with severe pneumonia, including pneumonia due to SARS-CoV-2. Nearly all patients (97.4%) underwent transitions between clinical states over the course of their ICU stay (median [IQR] of 4 [2, 7] transitions per patient). We defined transitions as favorable if the mortality associated with the destination clinical state was lower than that of the originating state and as unfavorable if it was higher. While the number of unfavorable transitions was similar in patients with SARS-CoV-2 pneumonia and other patients in the cohort, the number of favorable transitions was nominally lower in patients with SARS-CoV-2 pneumonia (Figure 6 and Supplemental Figure 15, A and B).
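The transition definitions above can be sketched for a single patient as follows. The sequence of daily clinical states and the state-level mortality values are hypothetical, and the function name is illustrative; the sketch counts transitions, classifies each as favorable or unfavorable, and normalizes the count by ICU LOS.

```python
def summarize_transitions(daily_states, state_mortality):
    """Count and classify day-to-day transitions between clinical states."""
    transitions = favorable = unfavorable = 0
    for prev, curr in zip(daily_states, daily_states[1:]):
        if curr == prev:
            continue  # Remaining in the same state is not a transition.
        transitions += 1
        if state_mortality[curr] < state_mortality[prev]:
            favorable += 1
        elif state_mortality[curr] > state_mortality[prev]:
            unfavorable += 1
    n_days = len(daily_states)
    return {
        "transitions": transitions,
        "favorable": favorable,
        "unfavorable": unfavorable,
        # Transitions normalized by the number of ICU days.
        "per_day": transitions / n_days if n_days else 0.0,
    }


# Hypothetical state mortality and an 8-day ICU course (not study data).
toy_mortality = {2: 0.1, 3: 0.2, 5: 0.4, 7: 0.6}
toy_course = [3, 3, 5, 5, 2, 2, 2, 7]
toy_summary = summarize_transitions(toy_course, toy_mortality)
```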

Figure 6 Patients with SARS-CoV-2 pneumonia have a longer LOS and fewer transitions between clinical states per day compared with patients with non–COVID-19–related respiratory failure. Clinical states are ordered and numbered 1–14 according to their associated mortality (blue to red). Rectangle width reflects the median number of days spent in each clinical state. Green arrows indicate transitions to a more favorable (lower mortality) clinical state; yellow arrows mark transitions to a less favorable (higher mortality) clinical state. Numbers at the arrow bases represent the number of transitions between the 2 clinical states connected by the arrow. Only transitions that occurred more than 30 times are shown.

We hypothesized that VAP would, at least in part, explain the disconnect between ICU LOS and mortality in patients with COVID-19. Overall, 35.5% of patients in the cohort developed at least 1 episode of VAP during their ICU stay (25.0% among patients without COVID-19 compared with 57.4% among patients with COVID-19, P < 0.001) (Figure 7A). A total of 8.7% of patients in the cohort experienced more than 1 episode of VAP (3.5% among patients without COVID-19 compared with 19.5% among patients with COVID-19, P < 0.001) (Figure 7B). Mortality for patients with VAP has been reported to increase substantially with each ensuing episode, approaching 100% in patients with 3 or more episodes (36). In contrast, we found that the mortality rate associated with a single VAP episode did not differ from the mortality rate associated with multiple VAP episodes (48.6% with a single episode, 53.6% with 2 episodes, 50.0% with 3 episodes; P = NS) (Figure 7C), suggesting that a cure can be achieved even in patients with multiple VAP episodes. Nevertheless, the relatively small number of patients with multiple VAP episodes limited the power to detect small differences (Figure 7D).
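Comparisons of proportions such as these rely on Fisher's exact test. The sketch below is a plain two-sided implementation for a 2x2 table based on the hypergeometric distribution; the function name is hypothetical, and the example counts are illustrative and are not the study counts.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact test P value for the 2x2 table [[a, b], [c, d]]."""
    row1, row2 = a + b, c + d
    col1 = a + c
    denom = comb(row1 + row2, col1)

    def prob(x):
        # Hypergeometric probability of x in the top-left cell, margins fixed.
        return comb(row1, x) * comb(row2, col1 - x) / denom

    observed = prob(a)
    lo = max(0, col1 - row2)
    hi = min(row1, col1)
    # Sum the probabilities of all tables no more likely than the observed one;
    # the small tolerance guards against floating-point ties.
    return sum(p for p in (prob(x) for x in range(lo, hi + 1))
               if p <= observed * (1 + 1e-9))


# Illustrative counts: 15 of 20 patients with an event in one group
# versus 5 of 20 in the other (not study data).
p_value = fisher_exact_two_sided(15, 5, 5, 15)
```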

Figure 7 Patients with COVID-19 experience more VAP episodes than do patients without COVID-19. (A) Proportion of patients with at least 1 VAP. (B) Proportion of patients with more than 1 VAP. (C) Outcomes for patients experiencing different numbers of VAP episodes. Outcomes are displayed in 2 columns: the first column aggregates favorable discharge dispositions (home, rehabilitation, SNF, LTACH); the second column aggregates unfavorable discharge dispositions (hospice, died). (D) Sankey diagram of VAP episodes and outcomes for each VAP episode. Categorical values were compared using Fisher’s exact test with FDR correction using the Benjamini-Hochberg procedure. A q value of less than 0.05 was the threshold for statistical significance.

Overall, mortality was not significantly different in patients who developed VAP compared with those who did not (Figure 8A). To further explore the association between VAP and ICU outcomes, we used the validated clinical adjudication results from the SCRIPT study to compare patients with successful treatment of VAP (cured) with those who experienced an indeterminate outcome or unsuccessful treatment (not cured). Examining these endpoints among patients who had only a single VAP episode, we found that mortality was lowest among patients with successful treatment (cured), intermediate among those with an indeterminate outcome, and highest among those with unsuccessful treatment (not cured) (Figure 8B). Among these patients, the rate of unfavorable outcomes (hospice or death) was 17.6% in patients with a cured episode and 76.5% in patients with unsuccessful treatment (indeterminate or not cured episode, P < 0.001). We also observed a similar pattern among the subset of patients with COVID-19 (Supplemental Figure 16A). Patients with COVID-19 experienced longer durations of VAP episodes (Figure 8C). Unresolving VAP episodes (patients with an indeterminate outcome or who were not cured) were of longer duration than cured episodes (Figure 8D). Since survival is included in our definition of successful VAP treatment, we performed a sensitivity analysis on VAP episodes experienced by patients who survived for at least 14 days after their VAP diagnosis. Even in this group, biased toward better outcomes, we found that unresolving VAP was associated with a higher mortality rate (Supplemental Figure 16B).

Figure 8 Unresolving VAP is associated with worse outcomes. (A) Mortality associated with at least 1 episode of VAP. (B) Outcomes for patients who experienced 1 episode of VAP that was cured, of indeterminate cure status, or that was not cured by day 14 following diagnosis. Outcomes are displayed in 2 columns: the first column aggregates favorable discharge dispositions (home, rehabilitation, SNF, LTACH); the second column aggregates unfavorable discharge dispositions (hospice, died). (C) VAP episode duration for patients with COVID-19 compared with patients without COVID-19. (D) VAP episode duration for patients who were cured or not cured or of indeterminate cure status. For the box-and-whisker plots, the box shows quartiles and the median, and whiskers show minimum and maximum values except for outliers, which are shown as individual data points. Numerical values were compared using the Mann-Whitney U test with FDR correction using the Benjamini-Hochberg procedure. Categorical values were compared using Fisher’s exact test with FDR correction using the Benjamini-Hochberg procedure. A q value of less than 0.05 was the threshold for statistical significance.

CarpeDiem corroborates the clinical adjudication analysis, identifying an association between unresolving VAP episodes and transitions to unfavorable clinical states associated with a higher hospital mortality rate. We then used the transition analyses provided by CarpeDiem to test whether unresolving VAP was associated with a subsequent trajectory toward progressively unfavorable clinical states. To visualize the transitions surrounding the diagnosis of VAP, we generated Sankey diagrams that show the clinical state and transitions encountered before and after the diagnosis of VAP. Successful treatment of VAP was associated with a higher likelihood of favorable subsequent transitions (Figure 9A). In contrast, indeterminate episodes demonstrated a flat trajectory (Figure 9B). Not-cured episodes were associated with a greater risk of unfavorable subsequent transitions (Figure 9C and Supplemental Figure 17). The robustness of the propensity for patients with cured VAP to undergo more favorable transitions than patients without cured VAP was confirmed in subsampling analysis (Supplemental Figure 13E). We then used the sum of transitions occurring in the 7 days following a diagnosis of VAP as a summative measure of trajectory and examined the distribution of trajectories to define favorable, intermediate, and unfavorable trajectory categories (Figure 10A). Favorable trajectories were significantly enriched in cured VAP episodes with significantly higher proportions of indeterminate and not-cured episodes in intermediate and unfavorable trajectory categories, respectively (Figure 10B). Finally, we examined the trajectory categories preceding a VAP diagnosis compared with the average inter-day trajectory across the cohort. 
We identified an increase in unfavorable transitions 1 day ahead of a VAP diagnosis, presumably reflecting the clinical events that prompted the diagnostic BAL procedure, that was not associated with the duration of the ensuing VAP episode (Supplemental Figure 18, A and B).

Figure 9 Trajectory analysis reveals that unresolving VAP is associated with transitions to progressively unfavorable clinical states. On these Sankey diagrams, day 0 represents the day that a BAL procedure was performed to evaluate VAP adjudicated as (A) cured, (B) indeterminate, or (C) not cured. More favorable (lower mortality) clinical states are at the top of the graphs, with leaving the ICU alive being the highest, and less favorable (higher mortality) clinical states are at the bottom, with death being the lowest. Graphs start at 2 days prior to the onset of the episode; patients who were not in our ICU are labeled as “Other” (patients who were received in external transfer or chronically ventilated patients) or “Floor” (within 48 hours of extubation or chronically ventilated patients). readm., readmission.

Figure 10 Unresolving VAP episodes are associated with unfavorable clinical states. (A) Distribution of the sum of transitions for the 7 days following VAP diagnosis by episode outcome, identifying a breakpoint of 0.1 in the middle of the distribution (shown by the cumulative data histogram along the right axis). Higher sums of transitions reflect transitions to unfavorable (higher mortality) clusters. (B) Proportion of VAP episode outcomes in each trajectory category. Trajectories were grouped into favorable (sum of transitions < –0.1), indeterminate (–0.1–0.1), and unfavorable (>0.1) categories. For box-and-whisker plots, the box shows quartiles and the median, and whiskers show minimum and maximum values except for outliers, which are shown as individual data points. Numerical values were compared using the Mann-Whitney U test with FDR correction using the Benjamini-Hochberg procedure. Categorical values were compared using χ2 tests with FDR correction with the Benjamini-Hochberg procedure. A q value of less than 0.05 was the threshold for statistical significance.
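The trajectory categories in this legend can be illustrated with a short sketch. It assumes that each transition is scored as the day-to-day change in the mortality associated with the patient's clinical state, so that positive sums indicate movement toward higher-mortality states; this scoring, the function name, and the example sequences are assumptions for illustration. Only the 0.1 breakpoint is taken from the legend.

```python
def trajectory_category(daily_state_mortality, cutoff=0.1):
    """Classify a post-VAP trajectory from the sum of its day-to-day transitions.

    daily_state_mortality: mortality associated with the clinical state on each
    day, starting at VAP diagnosis. Assumption: each transition is scored as the
    change in state-associated mortality between consecutive days.
    """
    steps = [later - earlier for earlier, later
             in zip(daily_state_mortality, daily_state_mortality[1:])]
    total = sum(steps)
    if total < -cutoff:
        return "favorable"
    if total > cutoff:
        return "unfavorable"
    return "indeterminate"


# Hypothetical state-associated mortality on days 0 through 7 (not study data).
improving = [0.5, 0.5, 0.4, 0.3, 0.3, 0.2, 0.2, 0.2]
flat = [0.3, 0.3, 0.35, 0.3, 0.3, 0.3, 0.3, 0.3]
worsening = [0.2, 0.2, 0.3, 0.4, 0.4, 0.5, 0.6, 0.6]
categories = [trajectory_category(t) for t in (improving, flat, worsening)]
```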

To assess whether the same associations could be revealed independently of the CarpeDiem approach, we added flags denoting the development of VAP and its outcome to a standard model of ICU mortality prediction based on clinical parameters measured early (in the first 2 days) of ICU admission. Using gradient boosting, we found only a nominal increase in the predictive ability of early clinical parameters with addition of the VAP flags (Supplemental Figure 19A). These findings are possibly explained by the disconnect between the clinical parameters measured early in a clinical course and the fact that VAP, by definition, occurs later in an ICU stay. As expected, the same clinical parameters applied to the median 2 days or final 2 days of the ICU stay had intermediate and excellent predictive capability, respectively, but were similarly unmodified by the addition of the VAP flags (Supplemental Figure 19, B and C).