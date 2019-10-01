Observation of a subpopulation of CRC with high CD8+ T cell infiltration and poor outcome. In melanoma, stratification of the TME based on tumor-infiltrating lymphocytes (TILs) and CD274 expression by IHC is prognostic and can predict response to checkpoint blockade immunotherapies (28, 29). However, analogous analysis has not been systematically established for other cancer types. To gain further insight into these IHC-based studies of melanoma and compare the immune contextures of melanoma versus CRC, we reanalyzed public genomic data sets. We started with TCGA RNA-Seq data, which followed consistent RNA extraction and analysis protocols across cancer types to avoid potential variation introduced by different tissues of origin and IHC-based assays across studies. Figure 1A shows the scatter plot of CD8A and CD274 expression in melanoma. We applied the optimal cut-off in CD8A and CD274 gene expression (shown in Supplemental Figure 1 and Supplemental Figure 2, A and B; supplemental material available online with this article; https://doi.org/10.1172/JCI127046DS1) to best dichotomize patients into risk populations (blue lines in Figure 1A) and subsequently defined 4 groups, denoted as group I (CD8AloCD274lo), group II (CD8AloCD274hi), group III (CD8AhiCD274lo), and group IV (CD8AhiCD274hi). Consistent with melanoma IHC studies described in the literature, group IV showed the most favorable prognosis while group I showed the worst OS outcomes (Figure 1B). These 2 groups accounted for the majority (75%) of patients due to the observed correlation between CD8A and CD274 expression (shown in Figure 1A; r = 0.71), compared with 79% for IHC-determined TIL+PD-L1+ and TIL–PD-L1– tumors (29).

Figure 1 Comparison of TME stratification based on CD8A and CD274 gene expression between TCGA melanoma and CRC. Scatter plots of log2-transformed CD8A and CD274 gene expression values are shown (A and C) for melanoma (n = 459) and CRC (n = 599), respectively. A linear regression line is plotted with the gray shaded region showing the 95% confidence interval. Pearson’s correlation coefficient r and P values are given at the bottom. MSI (black triangles) and MSS (gray circles) statuses are labeled for CRC samples. Median values of CD8A and CD274 expression are indicated with dashed gray lines. Log-rank statistics were applied to identify the optimal cut-off for transforming the continuous variable of gene expression into categorical high- and low-expression groups in a survfit model. The test score at each candidate cut-off across the log-transformed gene expression values was plotted. The cut-off with the highest test score (indicated with a blue arrow) was applied to best separate patients into 4 different risk groups (using solid blue lines; named groups I to IV). To compare risk groups between melanoma and CRC, we also applied a secondary peak of test scores (red arrow with an asterisk, which revealed a reverse pattern of survival in CRC as shown in Supplemental Figure 2) for CD274 stratification (indicated with a solid red line instead of a blue line; named groups I, II, III* and IV*). Each stratified risk group is labeled with its population fraction in percentages. Kaplan-Meier survival curves for the 4 risk groups are plotted for melanoma (B) and CRC (D and E). The log-rank test P values are shown for each plot.
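
The cut-off search described in this legend can be illustrated with a minimal sketch. It is not the code used for the analysis: the function names, the toy data, and the choice of a 2-group log-rank chi-square as the "test score" are assumptions made for illustration only.

```python
def logrank_chi2(times, events, in_high):
    """Two-group log-rank chi-square statistic (1 degree of freedom)."""
    obs_minus_exp = 0.0  # observed minus expected events in the "high" group
    variance = 0.0
    for t in sorted({ti for ti, e in zip(times, events) if e}):
        n = sum(1 for ti in times if ti >= t)                          # at risk, both groups
        n1 = sum(1 for ti, g in zip(times, in_high) if ti >= t and g)  # at risk, high group
        d = sum(1 for ti, e in zip(times, events) if ti == t and e)    # events at time t
        d1 = sum(1 for ti, e, g in zip(times, events, in_high)
                 if ti == t and e and g)                               # events at t, high group
        if n < 2:
            continue  # variance term is undefined with a single subject at risk
        obs_minus_exp += d1 - d * n1 / n
        variance += d * (n1 / n) * (1 - n1 / n) * (n - d) / (n - 1)
    return obs_minus_exp ** 2 / variance if variance > 0 else 0.0


def scan_cutoffs(expression, times, events, min_group=2):
    """Return (cutoff, chi2) for each candidate cut-off that leaves both groups populated."""
    results = []
    for cutoff in sorted(set(expression)):
        in_high = [x > cutoff for x in expression]
        n_high = sum(in_high)
        if n_high < min_group or len(expression) - n_high < min_group:
            continue
        results.append((cutoff, logrank_chi2(times, events, in_high)))
    return results


# Toy data (hypothetical): log2 expression, follow-up time, event indicator (1 = death).
expression = [1, 2, 3, 4, 5, 6, 7, 8]
times = [2, 3, 4, 5, 20, 21, 22, 23]
events = [1, 1, 1, 1, 1, 1, 1, 1]
scores = scan_cutoffs(expression, times, events)
best_cutoff, best_score = max(scores, key=lambda pair: pair[1])
```

Plotting `scores` against the cut-off gives a profile of the kind described here, in which the primary and any secondary peaks appear as local maxima.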

In contrast to melanoma, CRC showed a very different, multiple-peak pattern based on log-rank tests across CD8A and CD274 expression (Supplemental Figure 1C and Supplemental Figure 2C). Although dichotomizing patients using the highest peak (42.07 percentile for CD8A and 28.54 percentile for CD274) agreed with the general understanding that the high-expression group was associated with a favorable prognosis (Supplemental Figure 1D and Supplemental Figure 2D), the bimodal distribution with opposite survival patterns revealed a second population of high-risk subjects at the higher expression end (Supplemental Figure 1E and Supplemental Figure 2E). Since the majority of samples with expression levels above the secondary CD274 peak were within the CD8Ahi group, we applied this secondary cut-off (indicated with a red line in Figure 1C; 83.14 percentile) to stratify a new CD8AhiCD274hi* population as group IV* (the asterisk denotes the higher cut-off for CD274 expression). This yielded survival trends that were the reverse of those in melanoma (Figure 1D). While group I (CD8AloCD274lo) still showed unfavorable outcomes as in melanoma, group IV* (CD8AhiCD274hi*; accounting for 16.5% of the total population) now showed unfavorable outcomes despite having high levels of CD8A (Figure 1E). Group III* (CD8AhiCD274lo*) showed the most favorable outcomes in CRC, with the majority of these samples having high CD8A without intense CD274 expression. To further demonstrate the absence of a poor-risk group IV* in melanoma, we applied the same CRC cut-off (based on CD274 percentile) to a TCGA melanoma CD8Ahi population. As shown in Supplemental Figure 3, the clinical outcome of this virtual group IV* remained the most favorable.
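
The grouping rule described above can be written as a short sketch. The percentile interpolation, the strict "greater than" comparison, the function names, and the toy values are assumptions for illustration. Only the percentiles (42.07 for CD8A, 83.14 for the secondary CD274 cut-off) and the group labels come from the text.

```python
def percentile(values, q):
    """Percentile by linear interpolation between order statistics (q in 0..100)."""
    xs = sorted(values)
    pos = (len(xs) - 1) * q / 100.0
    lo = int(pos)
    hi = min(lo + 1, len(xs) - 1)
    return xs[lo] + (xs[hi] - xs[lo]) * (pos - lo)


def risk_group(cd8a, cd274, cd8a_cut, cd274_cut, secondary=False):
    """Label one sample; secondary=True marks the CD8Ahi groups with an asterisk."""
    cd8a_hi = cd8a > cd8a_cut      # assumption: values above the cut-off count as "hi"
    cd274_hi = cd274 > cd274_cut
    if not cd8a_hi:
        return "II" if cd274_hi else "I"
    star = "*" if secondary else ""
    return ("IV" if cd274_hi else "III") + star


# Toy log2 expression values (hypothetical, not TCGA data).
cd8a_values = [2.1, 3.4, 4.0, 5.2, 6.3, 7.1, 7.8, 8.5]
cd274_values = [1.0, 1.5, 2.2, 2.4, 3.0, 3.1, 4.9, 6.0]
cd8a_cut = percentile(cd8a_values, 42.07)
cd274_cut = percentile(cd274_values, 83.14)
labels = [risk_group(a, b, cd8a_cut, cd274_cut, secondary=True)
          for a, b in zip(cd8a_values, cd274_values)]
```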

Identification and validation of the CRC risk subpopulation with additional cohorts. To validate the existence of this CD8AhiCD274hi* risk population in CRC, we assembled additional cohorts and established a validation approach applicable to all cohorts. Figure 2 shows that the bimodal distribution of log-rank test scores, which reflects opposite prognostic behaviors of CD274, depends on CD8A gene expression levels. We applied the commonly used median value (near the optimal CD8A cut-off) to define CD8Ahi and CD8Alo populations, followed by OS analysis based on the optimal cut-off of CD274 expression independently determined in each population (Figure 2, B and D). The optimal cut-offs determined in the CD8Ahi and CD8Alo populations are close to the previously observed bimodal distribution peaks (compare with Figure 2A). Intriguingly, CD274 expression levels serve as a prognostic marker with opposite outcomes for the CD8Ahi and CD8Alo groups (Figure 2, C and E). Therefore, we adopted this approach to identify the poor-outcome group IV* from the CD8Ahi population and to validate it in additional cohorts. Figure 3, A–C, shows the OS comparison among the 2 CD8Ahi groups (group IV* and III* for CD274hi* and CD274lo*) and the CD8Alo population (group I+II) in the TCGA data set. In both analyses, considering either all patients (stage I to IV) or patients diagnosed at stage II and III, group IV* showed poor outcomes despite high CD8+ T cell infiltration.
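
A minimal sketch of the resulting 3-group labeling follows. It assumes that samples at or above the CD8A median are treated as CD8Ahi, which reproduces a 299/300 split for 599 samples without ties. The function name, the toy values, and the CD274 cut-off are hypothetical.

```python
from statistics import median


def three_group_labels(cd8a, cd274, cd274_cut_in_cd8a_hi):
    """Label samples as I+II (CD8Alo), III* (CD8Ahi CD274lo*), or IV* (CD8Ahi CD274hi*)."""
    cd8a_median = median(cd8a)
    labels = []
    for a, b in zip(cd8a, cd274):
        if a < cd8a_median:
            labels.append("I+II")   # CD8Alo: not split further for this comparison
        elif b > cd274_cut_in_cd8a_hi:
            labels.append("IV*")    # CD8Ahi with CD274 above the cut-off found in CD8Ahi samples
        else:
            labels.append("III*")
    return labels


# Toy values (hypothetical); the cut-off would come from a log-rank scan within CD8Ahi samples.
example = three_group_labels([1, 2, 3, 4, 5], [9, 9, 1, 1, 9], cd274_cut_in_cd8a_hi=5)
```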

Figure 2 Prognostic significance of CD274 is dependent on CD8A gene expression levels in CRC. The log-rank test score at each candidate cut-off across the log-transformed CD274 gene expression values was plotted (A). A bimodal score distribution was observed, and 2 cut-offs (indicated with blue and red arrows) were tested for dichotomizing the patients for survival analysis (as shown in Figure 1 and Supplemental Figure 2). Scatter plots of log2-transformed CD8A and CD274 gene expression values are shown for the TCGA data set (B and D). The median value of CD8A expression was applied to test the prognostic significance of CD274 expression in CD8Alo (B) and CD8Ahi (D) populations (n = 299 and 300, respectively; boxed by blue lines). Kaplan-Meier survival curves are plotted (C and E) for the risk groups stratified by the optimal CD274 cut-offs shown (B and D). The log-rank test P values are shown for each plot.

Figure 3 Validation of the CRC risk subpopulation using NCBI-GEO data set. Scatter plots of log2-transformed CD8A and CD274 gene expression values are shown for TCGA (A) and NCBI-GEO GSE39582 (D) data sets, with risk groups indicated (group I+II as CD8Alo, III* and IV* as CD8Ahi dichotomized by CD274 expression as shown in Figure 2). For OS analysis, Kaplan-Meier survival curves for the 3 risk groups are plotted for TCGA stages I to IV (B) (n = 599) and TCGA stages II to III samples (C) (n = 391), NCBI-GEO GSE39582 stages I to IV (E) (n = 557) and GSE39582 stages II to III samples (F) (n = 461). The log-rank test P values are shown for each plot.
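
For readers less familiar with the Kaplan-Meier curves referred to in these legends, the product-limit estimator can be sketched as below. This is a generic textbook form, not the software used in the study, and the toy follow-up data are hypothetical.

```python
def kaplan_meier(times, events):
    """Return [(t, S(t))] at each distinct event time (product-limit estimator)."""
    survival = 1.0
    curve = []
    for t in sorted({ti for ti, e in zip(times, events) if e}):
        at_risk = sum(1 for ti in times if ti >= t)
        deaths = sum(1 for ti, e in zip(times, events) if ti == t and e)
        survival *= 1 - deaths / at_risk   # censored subjects only shrink the risk set
        curve.append((t, survival))
    return curve


# Toy follow-up data (hypothetical): the second subject is censored at time 2.
curve = kaplan_meier(times=[1, 2, 3, 4], events=[1, 0, 1, 1])
```

One such curve would be computed per risk group (for example I+II, III*, and IV*) and the curves compared with a log-rank test.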

We then employed additional independent data sets to validate the existence of group IV*. The first validation cohort was GSE39582 from the NCBI-GEO data repository. This data set included CRC patients across clinical stages with information on both OS and relapse-free survival (RFS) outcomes, along with microarray-derived gene expression profiles of corresponding tumors. Based on OS analysis (Figure 3, D–F), we employed the same approach to identify group IV* and validated that this group has poor OS outcomes. RFS analysis of the same cohort using the same approach also revealed that group IV* had a higher risk of relapse compared with group III* (Supplemental Figure 4, A and B). To further strengthen the RFS investigation, we conducted a metaanalysis based on additional data sets (detailed in Methods) focusing on stage II or III CRC patients. As shown in Supplemental Figure 4, C and D, group IV* consistently showed less favorable outcomes compared with other risk groups.

Using a Cox regression model, associations between survival and clinicopathologic variables in the TCGA and NCBI-GEO data sets were analyzed for stage II and III patients (Supplemental Table 1). In a multivariate analysis, group IV* (CD8AhiCD274hi*) remained an independent prognostic variable for TCGA OS, NCBI-GEO GSE39582 OS, and NCBI-GEO metaanalysis RFS (Table 1), with hazard ratios against group III* (CD8AhiCD274lo*) of 3.27 (1.53–7.00, P = 0.0023) for TCGA (OS), 2.08 (1.13–3.84, P = 0.019) for NCBI-GEO GSE39582 (OS), and 1.69 (1.08–2.64, P = 0.021) for the NCBI-GEO metaanalysis (RFS).
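
The reported hazard ratios, intervals, and P values can be checked against one another with a short calculation. The sketch assumes that the intervals in parentheses are 95% Wald confidence intervals on the log-hazard scale, which the text does not state. Under that assumption the standard error is recovered from the interval width, and the Wald z statistic gives a 2-sided P value.

```python
import math


def wald_from_interval(hr, lower, upper, z_crit=1.96):
    """Recover SE(log HR), Wald z, and 2-sided P from a hazard ratio and its interval.

    Assumes (lower, upper) is a symmetric 95% interval on the log scale.
    """
    se = (math.log(upper) - math.log(lower)) / (2 * z_crit)
    z = math.log(hr) / se
    p = math.erfc(abs(z) / math.sqrt(2))  # equals 2 * (1 - Phi(|z|))
    return se, z, p


# Hazard ratios of group IV* against group III* as quoted in the text.
reported = {
    "TCGA OS": (3.27, 1.53, 7.00),
    "GSE39582 OS": (2.08, 1.13, 3.84),
    "metaanalysis RFS": (1.69, 1.08, 2.64),
}
p_values = {name: wald_from_interval(*vals)[2] for name, vals in reported.items()}
```

Under the stated assumption, the recomputed P values agree with the reported ones (0.0023, 0.019, and 0.021) to rounding.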

Table 1 Multivariable survival model

To further validate these genomic findings, we identified 71 stage III CRC patients from the City of Hope tumor registry. We utilized high-dimensional (4 to 7 colors) IHC staining with multispectral image analysis (PerkinElmer Vectra) of archival CRC tumors from this cohort to crossvalidate the genomic findings. An initial staining panel was developed to visualize CD8, PDCD1 (PD-1), CD274, and KRT20 (CK20) simultaneously on the same tissue slide. PDCD1 colocalized on CD8+ T cells, and CD274 was mainly present on KRT20– (nontumor) cells (Figure 4A and Supplemental Figure 5A). To further delineate the cell-type source of CD274 expression, a second multicolor IHC panel was developed that included CD274, CD68, and KRT20. Of these 71 primary tumors, 5 samples were observed to have CD274 expression on KRT20+ (tumor) cells. For these 5 samples, only 5%–8% of CD274+ cells were observed with KRT20 coexpression, while over 90% of CD274+ cells were observed with CD68 coexpression. In the remaining 66 samples, CD274 expression was observed exclusively on CD68+ cells. Thus, CD274 expression was almost entirely observed on CD68+ tumor-associated macrophages (TAMs) within CRC tumors (Supplemental Figure 5B). To validate the combined impact of CD8 and CD274 on patients’ clinical outcomes, we applied the same TCGA data-defined stratification. As shown in Figure 4B, low CD8+ T cell density was predictive of disease relapse, with the majority of relapse events observed in CD8lo group I+II. Favorable prognosis was again observed in group III* patients. Importantly, all 7 patients with group IV* profiles (10% of this cohort) relapsed. In addition, group IV* had the highest density of CD68+ TAMs compared with the other groups (Figure 4C). Furthermore, the high density of infiltrated CD68+ TAMs in group IV* was attributable to CD68+CD274+ cells (Figure 4, D and E), but not CD68+CD274– cells (Figure 4F). Overall, group IV* had the highest levels of CD274+ TAMs (Supplemental Figure 5C).

Figure 4 Histological analysis of archival CRC tumors. (A) Representative multiplex fluorescent image of a stage III colorectal tumor using a panel of markers including CD8, PD-1, CD274, KRT20 (CK20), and DAPI on FFPE tumor specimen in a City of Hope cohort (n = 71). Original magnification, ×200. (B) Scatter plot of log2-transformed CD8 and CD274 (Stroma) cell density (cells/mm2) across the entire cohort. Median values of CD8 and CD274 cell density are indicated with solid blue and dashed gray lines, respectively, along with relapse and MMR status. CD68+ TAM infiltration and the CD274 expression among CRC risk groups were quantified using a second panel of markers, including CD68 (representative images shown in Supplemental Figure 4). Standard boxplots (horizontal lines at the 25th percentile, the median, and the 75th percentile) are applied to visualize the distribution of log2-transformed cell density (cells/mm2) of (C) CD68+ macrophages, (E) CD274+CD68+ macrophages, and (F) CD274–CD68+ macrophages across the 3 observed risk groups. Fraction of CD68+ macrophages with CD274 expression for samples across the 3 observed risk groups is compared in D. MMR-deficient (black triangles) and -proficient (gray circles) samples are labeled. Statistical P values between groups were determined by Welch’s t tests after Bonferroni’s correction for multiple comparisons. ***P < 0.001; **P < 0.01; *P < 0.05.
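
The group comparisons named in this legend rest on 2 standard ingredients, sketched below under stated assumptions: the Welch t statistic with Welch-Satterthwaite degrees of freedom, and a Bonferroni adjustment that multiplies each P value by the number of comparisons. Converting t to a P value needs a t-distribution CDF from a statistics library and is omitted here. Function names and toy values are hypothetical.

```python
import math
from statistics import mean, variance


def welch_t(a, b):
    """Welch's t statistic and Welch-Satterthwaite degrees of freedom (unequal variances)."""
    va = variance(a) / len(a)   # squared standard error of the mean, group a
    vb = variance(b) / len(b)   # squared standard error of the mean, group b
    t = (mean(a) - mean(b)) / math.sqrt(va + vb)
    df = (va + vb) ** 2 / (va ** 2 / (len(a) - 1) + vb ** 2 / (len(b) - 1))
    return t, df


def bonferroni(p_values):
    """Multiply each P value by the number of comparisons, capped at 1."""
    m = len(p_values)
    return [min(1.0, p * m) for p in p_values]


# Toy log2 cell densities for 2 groups (hypothetical values).
t_stat, dof = welch_t([1, 2, 3, 4, 5], [2, 4, 6, 8, 10])
adjusted = bonferroni([0.01, 0.04, 0.5])
```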

Consistent with our IHC observations, both TCGA and NCBI-GEO GSE39582 data sets showed that CD68 expression in group IV* (CD8AhiCD274hi*) was significantly higher than in the other 2 groups (Supplemental Figure 6, E and F; illustration focuses on stage II and III samples from this point). This analysis also revealed a strong positive correlation between CD274 and CD68 genes, but a negative correlation between CD274 and KRT20 genes (Supplemental Figure 7). Together with a strong positive correlation between CD8A and PDCD1 genes, these findings support our IHC observations that PDCD1 colocalized on CD8+ T cells and that CD274 expression was mainly from macrophages rather than cancer cells. In addition, we examined the expression of IFNG, which has been reported as the most potent inducer of CD274 expression within the TME in various cancer types (30). As shown in Supplemental Figure 8, A–H, the IFNG gene was highly expressed in group IV*, as were genes involved in downstream signaling after IFN-γ binding, exemplified by JAK2, STAT1, and IRF1. In contrast, 2 recently identified posttranslational regulators of CD274, CMTM4 and CMTM6 (31), were either negatively correlated with CD8A expression levels or not different between the groups (Supplemental Figure 8, I–L).

Enrichment in MSI tumors. In the multivariable survival model across our investigated data sets, MSI status, which is known to associate with lymphocytic infiltrate and good prognosis (32), showed no significant association with outcome (Table 1). To reconcile the generally lower-risk profile of MSI tumors with CD274 expression, we further investigated the MSI status across the investigated data sets. As shown in Table 2 (also visualized in Figure 1C and Figure 3A using triangles for the TCGA data set), the 2 CD8Ahi groups IV* and III* showed a higher fraction of MSI samples (in combination, a total of 25.1%, 27.7%, and 27.8% in TCGA, NCBI-GEO GSE39582, and NCBI-GEO metaanalysis, respectively), compared with group I (6.9%, 8.4%, and 11.1%). The higher MSI proportion was particularly remarkable for group IV* (48.4%, 65.6%, and 57.5% in TCGA, NCBI-GEO GSE39582, and NCBI-GEO metaanalysis, respectively), given the normally low-risk profile of MSI. Similarly, despite the smaller sample size in our City of Hope data set, over half of the group IV* patients (4 of 7; 57%) were MSI (mismatch-repair [MMR] deficient).

Table 2 Microsatellite instability status across CD8A/CD274-stratified risk groups

Immune characteristics of the CD8hiCD274hi* risk group. Strong correlation of RNA expression among a set of Th-1 immune response genes with other immune-regulatory markers, including CD8A and CD274, has been demonstrated in multiple cancer types (exemplified in Bedognetti et al.; ref. 12). The coactivation of these proinflammatory and regulatory genes has been shown to associate with favorable outcomes. To investigate the counterbalancing mechanisms in CRC beyond CD8A and CD274, we first examined the expression correlation among 20 Th-1 immune response and regulatory genes and compared it with the expression correlation among the same gene set in melanoma (Supplemental Figure 9, A and B). Such correlation was also observed for the majority of these signature genes across the TCGA samples applied to this study (mean Spearman’s correlation coefficient 0.59 or 0.63 with exclusion of 3 genes showing relatively lower correlation). Supplemental Figure 9C shows a heatmap for the coexpression of Th-1 signature genes for the risk groups stratified by CD8A and CD274 using TCGA data. Although these genes were individually expressed at different levels, group IV* samples, regardless of MSI status, showed the highest overall expression across the gene set. We then expanded the gene list of interest to a well-annotated panel of immune-related genes (NanoString nCounter PanCancer Immune Profiling Panel). Supplemental Figure 9D illustrates the expression pattern across a total of 625 genes, including cell-type specific, immune response, and checkpoint genes. A similar coexpression pattern across the risk groups was again observed, extending beyond the Th-1 signature genes.
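
A mean pairwise Spearman coefficient of the kind quoted above can be computed as in the following sketch. It assumes the mean is taken over all unordered gene pairs, which the text does not specify. The helper names and the 3-gene toy matrix are hypothetical.

```python
import math
from itertools import combinations
from statistics import mean


def ranks(values):
    """1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = average_rank
        i = j + 1
    return result


def pearson(x, y):
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def spearman(x, y):
    """Spearman's rho as the Pearson correlation of the ranks."""
    return pearson(ranks(x), ranks(y))


def mean_pairwise_spearman(expr_by_gene):
    """Average rho over all unordered gene pairs (assumed definition of the mean)."""
    pairs = combinations(sorted(expr_by_gene), 2)
    return mean(spearman(expr_by_gene[g], expr_by_gene[h]) for g, h in pairs)


# Toy expression values across 3 samples (hypothetical, not TCGA data).
toy = {"CD8A": [1, 2, 3], "CD274": [2, 4, 6], "KRT20": [3, 2, 1]}
toy_mean_rho = mean_pairwise_spearman(toy)
```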

To further investigate the relative proportions of immune infiltrates in group IV* tumors, we employed 2 different deconvolution methods, CIBERSORT (33) and TIMER (34), using TCGA data. As these 2 methods were developed for inferring numbers of different immune subsets using different algorithms (35, 36), we aggregated relevant CIBERSORT results into the same 6 major cell types from TIMER: B cells, CD4+ T cells, CD8+ T cells, macrophages, dendritic cells, and neutrophils. We observed similar patterns showing that all 6 major cell-type infiltrates were highly enriched in group IV* (Supplemental Figure 10). We also examined the expression of markers commonly used for identifying immune cell types, e.g., CD19 and MS4A1 (CD20) for B cells; CD3D, CD3E, and CD4 for T cells; CD163 and CD68 for macrophages; ITGAX (CD11c), CD209, and HLA-DRB1 for dendritic cells; FCGR3A (CD16), FCGR2A (CD32), and CSF3R (CD114) for neutrophils; and FOXP3 for regulatory T cells. As exemplified in Supplemental Figure 6, these markers were all highly upregulated in group IV* samples in both TCGA and NCBI-GEO GSE39582 data sets. Notably, total immune infiltrates estimated by CIBERSORT (sum of all 22 inferred immune subsets) in both data sets were highest in group IV* (Figure 5, A and B). In contrast to total immune infiltrates, group IV* had relatively lower levels of cancer cells (based on KRT20 expression) than group I+II or III* (Figure 5, C and D).
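
The aggregation step can be pictured with the sketch below. The mapping from CIBERSORT subsets to the 6 TIMER cell types is an illustrative assumption, not the mapping used in the study, and the subset names are assumed to match the column headers of the CIBERSORT output. Subsets left out of the mapping (for example NK cells or mast cells) contribute only to the total.

```python
# Assumed, illustrative mapping of CIBERSORT subsets onto the 6 TIMER cell types.
CELL_TYPE_MAP = {
    "B cells": ["B cells naive", "B cells memory", "Plasma cells"],
    "CD4+ T cells": ["T cells CD4 naive", "T cells CD4 memory resting",
                     "T cells CD4 memory activated", "T cells follicular helper",
                     "T cells regulatory (Tregs)"],
    "CD8+ T cells": ["T cells CD8"],
    "Macrophages": ["Macrophages M0", "Macrophages M1", "Macrophages M2"],
    "Dendritic cells": ["Dendritic cells resting", "Dendritic cells activated"],
    "Neutrophils": ["Neutrophils"],
}


def aggregate_subsets(scores):
    """Sum subset scores of one sample into the 6 major cell types."""
    return {major: sum(scores.get(subset, 0.0) for subset in subsets)
            for major, subsets in CELL_TYPE_MAP.items()}


def total_infiltrate(scores):
    """Total immune infiltrate as the sum over all inferred subsets."""
    return sum(scores.values())


# Toy absolute scores for one sample (hypothetical values, subsets not listed are absent).
sample = {"T cells CD8": 0.2, "Macrophages M1": 0.1, "Macrophages M2": 0.3,
          "Neutrophils": 0.05, "NK cells resting": 0.1}
major = aggregate_subsets(sample)
total = total_infiltrate(sample)
```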

Figure 5 Comparisons of total immune infiltrates and expression levels of cancer cell and representative checkpoint markers across the CRC OS risk groups in TCGA and NCBI-GEO GSE39582 stage II and III samples. Standard boxplots (horizontal lines at the 25th percentile, the median, and the 75th percentile) are applied to visualize total immune infiltrates and overall gene expression of cancer cells and representative checkpoint markers for each risk group, with MSI (black triangles) and MSS (gray circles) samples labeled. Total immune infiltrates estimated by the tumor deconvolution algorithm CIBERSORT (sum of absolute scores across 22 immune cell types) are shown in panels A and B. KRT20 is applied to represent CRC cells (C and D), and CTLA4 represents immune checkpoint genes (E and F). Statistical P values between groups were determined by Welch’s t tests after Bonferroni’s correction for multiple comparisons: ***P < 0.001, **P < 0.01, *P < 0.05. (A, C, and E) TCGA, n = 391; (B, D, and F) NCBI-GEO GSE39582, n = 461.

To further reconcile why group IV* has poor outcomes despite high CD8+ T cell infiltration, we analyzed the expression of representative checkpoint genes, as they are known to associate with T cell dysfunction (37) and may act to limit antitumor immune responses (38). Group IV* had the highest expression of all checkpoint genes we examined, exemplified by CTLA4 (Figure 5, E and F), CD274, HAVCR2 (TIM-3), TNFRSF9 (4-1BB or CD137), LAG3, TIGIT, and ICOS (Supplemental Figure 11). Moreover, recent studies proposed that elevation of transforming growth factor β (TGF-β) signaling is the primary mechanism of immune evasion (39–41). As shown in Figure 6A, TGF-β–encoding genes (as the average of TGFB1, TGFB2, and TGFB3) were highly expressed in group IV* irrespective of MSI status. Also, a recent pan-cancer study identified a set of 30 upregulated extracellular matrix genes in cancer (referred to as C-ECM genes) that were significantly associated with poor prognosis (42). Figure 6B demonstrates elevated expression of these genes in group IV*.

Figure 6 Expression of TGF-β–encoding and C-ECM signature genes and the distribution of CMSs across the CRC OS risk groups in TCGA stage II and III samples. Standard boxplots (horizontal lines at the 25th percentile, the median, and the 75th percentile) are applied to visualize the expression levels of (A) TGF-β–encoding genes (log2-transformed averages of TGFB1, TGFB2, TGFB3 genes) and (B) C-ECM genes (log2-transformed average of 30 upregulated signature genes). Median expression value is indicated with a dashed line. Statistical P values between groups were determined by Welch’s t tests after Bonferroni’s correction for multiple comparisons: ***P < 0.001. (C) Fractions of CMS subtypes (CMS1, MSI immune; CMS2, canonical; CMS3, metabolic; CMS4, mesenchymal) in each of our stratified risk groups. (D) Kaplan-Meier survival curves for CMS1 patients further separated into CD8Ahi risk groups III* and IV*. (A–C) TCGA, n = 301; (D) TCGA CMS1, n = 48.
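
The signature scores in panels A and B can be read as the log2 of the mean expression across the listed genes. The sketch below follows that reading, which is an interpretation of the legend rather than a confirmed detail of the pipeline. It assumes strictly positive linear-scale expression values; the function name and toy values are hypothetical.

```python
import math


def signature_score(expression, genes):
    """log2 of the mean linear-scale expression across the signature genes."""
    average = sum(expression[g] for g in genes) / len(genes)
    return math.log2(average)   # requires a strictly positive average


# Toy linear-scale expression for one sample (hypothetical values).
sample_expression = {"TGFB1": 8.0, "TGFB2": 2.0, "TGFB3": 2.0}
tgfb_score = signature_score(sample_expression, ["TGFB1", "TGFB2", "TGFB3"])
```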

Together, these results demonstrate that all major immune cell types and checkpoint genes are overrepresented in group IV* CRC tumors regardless of MSI status, reflecting an immune overdrive phenotype. Furthermore, TGF-β–encoding genes are upregulated, reflecting the immunosuppressive nature of group IV*.

Finally, our risk group stratification is distinct from the recent CMS classification of CRC tumors (27). Figure 6C shows a comparison between the CMS classification and our CD8A/CD274-stratified risk groups: group IV* largely overlaps with the CMS1 (referred to as MSI immune subtype; featuring immune infiltration and activation) and CMS4 (mesenchymal; stromal infiltration and TGF-β activation) subtypes (47.5% and 36.1%, respectively, for a total of 83.6%). Furthermore, CD8A/CD274 can further stratify CMS1 into different risk groups (Figure 6D): group IV* CMS1 patients carried a 3.8-fold higher OS risk than group III* CMS1 patients. Similar patterns were also validated in the NCBI-GEO GSE39582 data set (Supplemental Figure 12). This demonstrates that CD8A/CD274 stratification has additional and independent prognostic implications beyond CMS classification.