Date: 2019-06-03

Distinct cell surface coding DEG profiles from ccRCC CD8+ and CD19+ PBLs and TILs. To investigate pan-cancer immunity, we performed comprehensive microarray analyses on matched case-control pairs of CD8+ TILs and CD19+ TIL-Bs from ccRCC tumors, CD8+ and CD19+ TIICs from normal tumor-adjacent tissues, and CD8+ and CD19+ PBLs from patients with ccRCC along with CD8+ and CD19+ PBLs from matched healthy control donors (Supplemental Figure 1; supplemental material available online with this article; https://doi.org/10.1172/JCI125301DS1). Study patient clinicopathologic characteristics are presented in Supplemental Table 1. Quality control experiments assessing the yield and quality of rapidly isolated immune cell subsets from tumors were performed (Supplemental Figure 2, A–D), in addition to stringent bulk total RNA quality testing prior to its amplification and application to comprehensive microarrays interrogating more than 67,000 transcripts (Supplemental Figure 2E). The Affymetrix Transcriptome Analysis Console was used to observe prominent DEGs in TILs and ptPBLs relative to TIICs and cdPBLs (Supplemental Figure 2F), totaling 7300 (i.e., CD8+ and CD19+ TIL-Bs/TIICs and ptPBLs/cdPBLs; 1.5-fold change; P < 0.05) (Supplemental Figure 1). Principal component analyses (PCAs) were generated using the Partek Genomics Suite for all paired CD8+ or CD19+ biospecimens and PBL controls (Figure 1, A and B). Venn diagrams were generated to demonstrate overlaps in DEGs represented by CD8+ and CD19+ ptPBLs (20.4%) and TILs (37.8%) (Figure 1C) and to show overlaps of possible splice junctions generating spliceoforms common to CD8+ ptPBLs and TILs. This analysis was made possible by the comprehensive HTA 2.0 microarrays (Figure 1D) and suggests that patient-inherent posttranscriptional modification programs generating distinct RNA isoforms may also influence the behavior of TILs. 
To assess the feasibility of pursuing DEGs more easily amenable to therapeutic interventions such as ICB (i.e., actionable targets), we used unsupervised clustering and PCA to examine DEGs coding for molecules expressed on the plasma membranes (PMs). These analyses efficiently stratified immune isolates, with the largest differences maintained between TILs and TIICs (Figure 1E) and also permitted efficient stratification of ptPBLs and cdPBLs (Figure 1, E and F).
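
The DEG threshold used in this section (1.5-fold change, P < 0.05) can be illustrated with a minimal Python sketch. This is not the Affymetrix Transcriptome Analysis Console procedure used in the study; it only assumes that group means are available as log 2 values and that a P value has already been computed for each transcript. The function name, gene names, and numbers are hypothetical.

```python
import math

def is_deg(mean_log2_case, mean_log2_control, p_value,
           min_fold=1.5, alpha=0.05):
    """Return True if a transcript passes both cutoffs.

    Inputs are mean log2 expression values, so the fold change is
    2 ** (case - control); the test is symmetric for up and down regulation.
    """
    log2_fc = mean_log2_case - mean_log2_control
    return abs(log2_fc) >= math.log2(min_fold) and p_value < alpha

# Hypothetical transcripts: (name, mean log2 in TILs, mean log2 in TIICs, P)
transcripts = [
    ("GENE_A", 9.0, 8.0, 0.01),   # 2-fold up, significant
    ("GENE_B", 7.2, 7.0, 0.001),  # significant, but below 1.5-fold
    ("GENE_C", 5.0, 6.0, 0.20),   # 2-fold down, not significant
    ("GENE_D", 4.0, 5.0, 0.03),   # 2-fold down, significant
]

degs = [name for name, case, ctrl, p in transcripts if is_deg(case, ctrl, p)]
print(degs)
```

Working on the log 2 scale keeps the cutoff symmetric: a 1.5-fold decrease and a 1.5-fold increase are treated identically.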

Figure 1 Distinct comprehensive transcriptomics from paired CD8+ and CD19+ profiles from ccRCC blood, tumors and tissues, and control donor blood isolates. (A and B) PCA demonstrating distinct DEG profiles from comprehensive HTA 2.0 microarray analyses of (A) CD8+ (n = 15) and (B) CD19+ (n = 15) immune cell subsets from TILs and TIL-Bs, TIICs, and circulating ptPBLs, and cdPBLs (n = 10). (C) Four-way Venn diagram demonstrating percentage overlaps of DEGs identified by microarrays across different source biospecimens analyzed. (D) Venn diagram showing that ptPBLs have greater numbers of differentially represented exon-exon PSR junctions compared with TILs, relative to TIICs from paired CD8+ samples (P < 0.05; ANOVA, Transcriptome Analysis Console v.3, Affymetrix). Thirteen percent of shared PSR junctions exist between ptPBLs and TILs, representing 33% of total genes common to ptPBLs and TILs having shared isoform identity. (E) GO PM proteins identified by Partek and unsupervised hierarchical clustering algorithm-generated heatmaps demonstrating that the 4 different CD8+ isolates are stratified according to PM, using log 2 expression values applying the Euclidean distance metric and complete linkage clustering method (R programming language; R-studio). Heatmaps demonstrate the unsupervised clustering of PBL isolates as most closely related, with TILs and TIICs at their boundaries, suggesting that their profiles may be influenced by the cancer microenvironment. (F) Feasibility of using PM-associated proteins toward identifying pan-cancer DEGs that can stratify patients is demonstrated by PCA biplots of PM DEGs from CD8+ cdPBL and ptPBL isolates created on log 2 values using the biplot function (R; R-studio). diff., differential; id., identity; PSR, probe-selected region.

Prognostic ccRCC DEGs have pan-cancer prognostic potential. To identify prognostically important ccRCC DEGs, we generated Kaplan-Meier plots and P values for the 7300 significant DEGs using TCGA KIRC RNA-seq and associated clinical data sets (n = 534 tumor, n = 72 healthy control donors). This step detected 2257 prognostic DEGs (Supplemental Figure 1). To further refine prognostic DEGs and find the most feasible actionable targets, we focused on PM-associated proteins or those having known targeting compounds. Partek and PANTHER Gene Ontology (GO) were both used to identify PM proteins, ensuring that most PM-associated DEGs would be retained. ChEMBL target searches were used to identify proteins with known targeting compounds. Together, these 2 approaches reduced target DEGs to 779, which were then investigated for their pan-cancer potential using more than 11,500 patients with lung, breast, gastric, and ovarian cancer from an online Kaplan-Meier plotter, generating 467 (i.e., 60%) target DEGs with pan-cancer potential. This refined list represented pan-cancer FIR biomarkers, grouped as either (a) agonistic targets decreased in tumors relative to normal tissues and having a positive prognosis or (b) antagonistic targets increased in tumors relative to normal tissues and having a negative prognosis (Supplemental Figure 1). PCA permitted visualization of how these pan-cancer FIR-DEGs identified from ptPBLs (Figure 2A and Supplemental Figure 3A) or TILs (Figure 2B and Supplemental Figure 3B) were distributed across the 5 cancers and how they correlated with each other. Many were found to be common to both CD8+ TILs and CD19+ TIL-Bs relative to their TIIC counterparts (Figure 2C; see full gene list in Supplemental Figure 3C). A subset of pan-cancer FIR-DEGs was also found to be common between TILs and ptPBLs (Supplemental Figure 3D).
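
The agonistic/antagonistic grouping defined above can be expressed as a short Python sketch. It assumes, purely for illustration, that the direction of expression is given as a tumor-versus-normal log 2 fold change and that prognosis is summarized by a hazard ratio for high expression (below 1 taken as positive prognosis, above 1 as negative). The study itself does not state this encoding, and the function name and values are hypothetical.

```python
def classify_target(tumor_vs_normal_log2fc, hazard_ratio_high_expr):
    """Group a DEG by expression direction and prognostic effect.

    agonistic:    decreased in tumors and associated with positive prognosis
    antagonistic: increased in tumors and associated with negative prognosis
    Anything else does not fit either group in this simplified scheme.
    """
    lower_in_tumor = tumor_vs_normal_log2fc < 0
    higher_in_tumor = tumor_vs_normal_log2fc > 0
    if lower_in_tumor and hazard_ratio_high_expr < 1:
        return "agonistic"
    if higher_in_tumor and hazard_ratio_high_expr > 1:
        return "antagonistic"
    return "unclassified"

# Hypothetical genes: (log2 fold change tumor/normal, hazard ratio)
examples = {
    "GENE_A": (-1.2, 0.6),
    "GENE_B": (1.5, 1.8),
    "GENE_C": (0.9, 0.7),
}
groups = {gene: classify_target(fc, hr) for gene, (fc, hr) in examples.items()}
print(groups)
```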

Figure 2 A subset of prognostic ccRCC DEGs have pan-cancer prognostic potential. (A and B) PCAs of nominal derivatives of combined modulation of expression and effects on prognosis to visualize CD8+ and CD19+ DEGs from (A) ptPBLs and (B) TILs with significant gene modulation and effect on prognosis across the 5 cancers tested. Genes on the far left are more highly expressed in normal tissues than tumors and have positive prognostic effects (N/T pos prog), representing agonistic targets. Genes on the far right are more highly expressed in tumors than normal tissues and have negative effects on prognosis (T/N neg prog), representing antagonistic targets. PCAs also illustrate linkage between gene coexpression and cancer types, in which breast cancer (BC) ptPBLs and NSCLC TILs are most related to other cancers. In (A), all ptPBL DEGs are shown. In (B), DEGs unique to CD8+ TILs or CD19+ TIL-Bs are shown. (C) DEGs common to CD8+ TILs and CD19+ TIL-Bs are shown, where dark highlighted gene names represent best antagonistic targets, and green highlighted gene names represent best agonistic targets. (D) Correlograms representing linkage between the 5 cancers from nominal derivatives demonstrating that NSCLC and BC are most related to ccRCC, independently of patient sample number (Spearman method, coexpression coefficient ladder on right). (E) Graph demonstrating similar expression patterns of pan-cancer DEGs and genes representing infiltrating immune cell subsets used: CD45, CD3, CD4, CD8, CD20, CD56, and CD68 across pan-cancers (n = 11,577). (F) Graph demonstrating distributions of relative ratios of 467 agonistic vs. antagonistic pan-cancer genes, in which TILs have higher percentages of genes that are lower in tumors and have positive prognostic value. GI, gastrointestinal; OV, ovarian.

Correlograms reflected increased correlations between the 5 cancers used to refine for pan-cancer target FIR-DEGs (compare Supplemental Figure 3E and Figure 2D). Because the selected 467 pan-cancer FIR-DEGs were discovered using whole tumor TCGA data sets, we compared percentages of correlations between 5 cancers to that of their immune infiltrates (n > 11,500) providing similar trends, suggesting a strong likelihood that global FIR-DEG signatures were immune based (Figure 2E). Of these 467 pan-cancer FIR-DEGs, proportions of agonistic and antagonistic targets derived from ptPBLs were equal, whereas those derived from TILs had increased agonistic target representation (Figure 2F).
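
A correlogram of the kind referred to above is a matrix of pairwise correlation coefficients. The following Python sketch shows how Spearman coefficients between cancer types could be computed from per-gene scores, using only the standard library. It is an illustration rather than the authors' R workflow, and the scores are hypothetical.

```python
import math
from itertools import combinations

def average_ranks(values):
    """Rank values from 1..n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks

def pearson(x, y):
    """Pearson correlation; undefined (division by zero) for constant input."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

def spearman(x, y):
    """Spearman coefficient = Pearson correlation of the ranks."""
    return pearson(average_ranks(x), average_ranks(y))

# Hypothetical per-gene scores (same 5 genes, same order) in 3 cancers.
scores = {
    "ccRCC": [1, 2, 3, 4, 5],
    "NSCLC": [2, 1, 4, 3, 5],
    "BC":    [5, 4, 3, 2, 1],
}
correlogram = {
    (a, b): spearman(scores[a], scores[b]) for a, b in combinations(scores, 2)
}
for pair, rho in correlogram.items():
    print(pair, round(rho, 3))
```

Because the coefficient is computed on ranks, it is insensitive to the scale of the underlying scores, which suits data sets that are not directly comparable in absolute terms.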

To further refine pan-cancer FIR-DEG targets, nominal derivatives (binomial values) were generated to integrate quantitative and nonquantitative, and thus nonharmonizable, data sets and analyses. These were used to acquire an overall score representing (a) coupled expression and effect on overall survival (n = 5 cancers, Kaplan-Meier plotter), (b) coupled RNA and protein expression in myeloid and lymphoid cells relative to 12 other cancers, (c) modified expression levels in cancers relative to normal tissues (n = 17 cancers; The Human Protein Atlas), and (d) direct published evidence in the literature of DEG expression in the immune subtypes from which they were identified (Supplemental Table 2). The top 200 scoring pan-cancer ptPBL and TIL FIR-DEGs were subjected to PPI analyses using the rudimentary search engine (STRING) (Supplemental Figure 3F and ref. 25), providing a PPI enrichment value (P = 1.85 × 10^-10) warranting further investigation. For more comprehensive PPI analyses, we used IID, pathDIP, and NAViGaTOR, providing new evidence of interactions (Figure 3). The pathways most highly associated with antagonistic targets included the immune system, TNF signaling, and NF-κB, whereas those associated with agonistic targets included WNT signaling, chemokine signaling, proteoglycans, and GPCRs (P < 1 × 10^-10) (Supplemental File 1A). Finally, the top-scoring 200 ptPBL and TIL pan-cancer FIR-DEGs were further refined by retaining those that were the most correlated in differential pan-cancer gene expression, toward discovery of novel mechanistic pathways not deciphered from the above analyses (Figure 4). The combination of these scoring methods was used to select pan-cancer FIR-DEGs for validation on a new RCC patient cohort (Supplemental Figure 1).

Figure 3 Pan-cancer DEGs have extensive PPI. PPI networks of the top 200 DEGs. The high PPI enrichment value (P = 1.85 × 10^-10) indicates that interactions among these DEGs are highly significant relative to proteins drawn at random, suggesting a biological connection as groups in defined pathways. Pan-cancer agonistic (red) and antagonistic (green) DEGs (nodes/circles) and their interactions (edges/lines) demonstrate groupings of these 2 pan-cancer DEG subclasses, and gray lines highlight interactions between them. Noninteracting DEGs are on the right (NAViGaTOR v3 and IID v04-2018). DEG nodes are colored according to GO Molecular Functions listed in the top left legend. Larger node circles represent DEGs with the highest degree of interactors within the network, and blue DEG names represent centrality of interactors (as determined by the all-pairs shortest path algorithm in NAViGaTOR).

Figure 4 Pan-cancer DEGs have extensive coexpression dynamics. Correlograms of the top 200 selected prognostic pan-cancer DEGs demonstrate extensive coexpression dynamics in CD8+ ccRCC isolates (Spearman method, expression ladder on right) (n = 20). For the top right correlating ptPBL gene group, predominant pathways of the 4 most highly correlating pan-cancer gene groups included GO biological processes (cellular responses to stimulus, receptor signaling, and regulation of metabolic processes) and KEGG pathways (adherens junctions and colorectal, endometrial, blood, and pancreatic cancers). The bottom left ptPBL gene group was associated with extracellular matrix disassembly. For TILs, the bottom left gene group was stronger for GO biological processes such as receptor signaling, developmental processes, cell communication, and signal transduction, while the top right TIL gene group was dominated by cell cycle regulation processes (P = 4.98 × 10^-5) and also regulation of T cell activation and cytokine production.

Pan-cancer and polarizing DEGs stratify CD8+, CD19+, PBLs, TILs, and TIICs. Twenty-eight pan-cancer FIR-DEGs and 62 commonly used T cell–polarizing genes defining known T cell subsets were selected for validation on a new, independent 74-patient RCC cohort, using TaqMan Gene Expression Assays on 96.96 microfluidic BioMark HD Real-Time PCR system dynamic arrays (Fluidigm), providing the advantage of DEG coexpression analysis. Total CD8+ ptPBL RNA from 41 patients with ccRCC, 8 with RCC, and 6 patients with papillary renal cell carcinoma (pRCC), and CD8+ cdPBL RNA from control donors were analyzed, with 3 ccRCC patient duplicates added as inter-assay RNA extraction controls. Five total ptPBMC and five total ndPBMC RNA preparations were also included. Finally, to maximize use of the microfluidics chip and to determine whether these could provide a baseline for DEG expression, pooled total RNA samples from CD8+ (n = 50 patients) and CD19+ (n = 50 patients) ptPBLs, CD8+ (n = 15 patients) and CD19+ (n = 15 patients) cdPBLs, ccRCC PBMCs (n = 10 patients), pRCC PBMCs (n = 10 patients), ndPBMCs (n = 10 control donors), and paired ccRCC TILs (n = 8) and TIICs (n = 8) were also included. BioMark HD–generated heatmaps, housekeeping genes, and loading controls are shown in Supplemental Figure 4, A–D.

Following normalization, correlograms were used to visualize coexpression dynamics between all DEGs (Supplemental Figure 4E). Unsupervised clustering demonstrated that pooled RNA fractions were stratified as expected, with CD8+ and CD19+ isolates stratifying furthest apart, and total PBMC isolates stratifying independently, but remaining closer to CD8+, as a function of T cells (7%–24%) representing a larger frequency of total PBMCs than B cells (1%–7%) (Figure 5A). Also expected, TILs stratified closest to total PBMCs, yet remained close to TIICs—reflecting tissue-infiltrating immune profiles. Finally, ccRCC ptPBLs and cdPBLs from either CD19+ or CD8+ isolates clustered closely, at opposite ends of the heatmap. Unsupervised clustering was also used to observe that individual ccRCC ptPBMCs were efficiently stratified from ndPBMCs (Figure 5B). PCA was used to visualize coupling of pooled RNA fractions and DEG coexpression, here demonstrating that patient TILs, PBMCs, and CD8+ ptPBLs were distantly stratified from both TIICs and CD19+ ptPBLs (Figure 5C). This 3-dimensional view also provided evidence of coexpressing groups of pan-cancer FIR-DEGs and polarizing genes.
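
The unsupervised clustering described above (Euclidean distance, complete linkage, on –ΔCt values) can be sketched in a few lines of Python. This is a simplified stand-in for the R routines used in the study. It assumes –ΔCt is computed as the negative of the target Ct minus a reference (housekeeping) Ct, and the sample names and values are hypothetical.

```python
import math
from itertools import combinations

def neg_delta_ct(ct_target, ct_reference):
    """-dCt: higher values mean higher expression relative to the reference."""
    return -(ct_target - ct_reference)

def euclidean(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def complete_linkage(profiles):
    """Agglomerative clustering; returns merges as (cluster_a, cluster_b, distance).

    The distance between two clusters is the LARGEST pairwise Euclidean
    distance between their members (complete linkage).
    """
    def cluster_distance(c1, c2):
        return max(euclidean(profiles[a], profiles[b]) for a in c1 for b in c2)

    clusters = [(name,) for name in profiles]
    merges = []
    while len(clusters) > 1:
        c1, c2 = min(combinations(clusters, 2),
                     key=lambda pair: cluster_distance(*pair))
        merges.append((c1, c2, cluster_distance(c1, c2)))
        clusters = [c for c in clusters if c not in (c1, c2)] + [c1 + c2]
    return merges

# Hypothetical -dCt profiles over 2 genes for 4 pooled isolates.
profiles = {
    "CD8_pt":  [neg_delta_ct(22.0, 18.0), neg_delta_ct(25.0, 18.0)],
    "CD8_cd":  [neg_delta_ct(22.5, 18.0), neg_delta_ct(25.5, 18.0)],
    "CD19_pt": [neg_delta_ct(28.0, 18.0), neg_delta_ct(20.0, 18.0)],
    "CD19_cd": [neg_delta_ct(28.0, 18.0), neg_delta_ct(21.0, 18.0)],
}
merges = complete_linkage(profiles)
for a, b, d in merges:
    print(a, "+", b, "at distance", round(d, 3))
```

In this toy example the two CD8+ isolates merge first, the two CD19+ isolates second, and the two lineages join last, which is the pattern a dendrogram would display.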

Figure 5 Coexpressing pan-cancer and polarizing DEGs stratify CD8+, CD19+, PBLs, TILs, and TIICs. Using all genes from qRT-PCR validation (A and B), unsupervised hierarchical clustering algorithms using –ΔCt normalized qRT-PCR expression values applying the Euclidean distance metric and complete linkage clustering method were used to generate heatmap clustering and associated dendrograms (R programming language; R-studio). Heatmaps demonstrated that (A) pooled fraction used for validation experiments can efficiently stratify all isolates as expected from their genetic linkages and immune cell subset ratios of PBL populations (n = 74 patients, n = 176 samples, n = 9 sample pools) and that (B) total individual ptPBLs and cdPBLs cluster separately (n = 10). Pooled fractions are also used for PCA in (C), using –ΔCt normalized qRT-PCR expression values, applying the Euclidean distance metric and complete linkage clustering method (R programming language; R-studio) (n = 176 samples), demonstrating that PBLs are most closely linked to circulating CD8+ T cells and are different in DEG composition relative to circulating CD19+ B cells, TIL-Bs, and TIICs. PCA presented also demonstrates the common and differing coexpression of certain T cell–polarizing and pan-cancer DEGs in TIL and CD8+ ptPBL isolates. Pan-cancer genes are highlighted in green throughout. n, number of patients in pool; N, normal tissues DEG; nd, normal donor; neg, negative; pos, positive; prog, prognosis; pt, patient; T, tumor tissues DEG; tot, total.

Pan-cancer DEGs stratify patients with RCC from control donors. Differential expression and correlation analyses were coupled to identify pan-cancer FIR-DEG combinations most efficiently stratifying patients. Several pan-cancer FIR-DEGs (ICOS, PF4V1, IFNG, LAG3, TIGIT, CDA, PDK4, KLF4, PIM2, TIMP1, IGF2BP3, IL23A, LEF1, and TCF7), in combination with other T cell genes, efficiently stratified patients from control donors to an accuracy of 90.1% (Figure 6, A and D). The absence of newly discovered pan-cancer FIR-DEGs uncommon to T cell polarization caused loss of patient stratification (Supplemental Figure 5A); however, control donors still stratified with an LEF1- and NT5E-expressing population, which included other biomarkers of activation and the immune checkpoint BTLA, which we and others believe marks T cells having enhanced survival properties (26, 27).

Figure 6 Iterative DEG combination testing defining minimal gene sets required for stratifying patients from control donors according to circulating CD8+ T cells. Normalized –ΔCt qRT-PCR DEG expression values from individual and pooled CD8+ ptPBLs and cdPBLs were used for PCA, applying the Euclidean distance metric and complete linkage clustering method (R programming language; R-studio) (n = 69). (A) Patients are stratified using 32 DEGs including pan-cancer (ICOS, PF4V1, IFNG, LAG3, TIGIT, CDA, PDK4, KLF4, PIM2, TIMP1, IGF2BP3, IL23A, LEF1, and TCF7), T cell–polarizing (FASLG, ZEB2, EOMES, CCR5, TOX, PRDM1, BATF, FOXO1, CD28, and CD27), adhesion (JAM3, SELP), and immune checkpoint DEGs (CD160, CD244, PDCD1, TIM-3, BTLA, and NT5E). (B) Patients are stratified using 12 DEGs including pan-cancer (CDA, PDK4, KLF4, and IGF2BP3) and adhesion (JAM3, SELP) DEGs. (C) Patients are stratified using 3 DEGs (pan-cancer, MMP9 and LEF1; T cell polarizing, FASLG). Boxes with a pale yellow background highlight PCA-stratified control donors used to calculate the percentage of patient stratification. (D) Graph representing the percentage of patient stratification from DEG groups in (A–C) and in Supplemental Figure 5, with representative numbers of pan-cancer genes among groups at the bottom (n = 66, nonduplicate samples). (E and F) Venn diagrams demonstrating overlaps between (E) CD8+ ccRCC ptPBL DEGs, CD8+ ccRCC TIL DEGs, CD8+ HIV elite controllers, and PBMC from patients infected with bacteria and (F) effect of pan-cancer pipeline on enhancing CD8+ DEG identity. dupe, duplicate sample; misclas., misclassified benign kidney lesion; n, number of pooled samples; other, other DEGs; pan-can, pan-cancer; sub-fig., associated sub-figure.

Combination testing revealed that a smaller set of these patient-stratifying pan-cancer genes (IFNG, CDA, PDK4, KLF4, IGF2BP3, and LEF1) could also stratify patients to an accuracy of 89.1% (Figure 6, B and D), which could not be met in their absence (Supplemental Figure 5, B and C). Additional combination testing identified a minimal set of 3 DEGs (MMP9, FASLG, and LEF1) stratifying patients to an accuracy of 79.3% (Figure 6, C and D). Interestingly, aside from stratifying patients from control donors, pan-cancer FIR-DEG PCAs revealed 2 dominant CD8+ ptPBL populations containing either FASLG or LEF1, respectively responsible for triggering cell death or cell activation. In addition, the 3 internal patient duplicates remained closely clustered throughout PCAs, whereas pooled RNA fractions were centralized among their counterparts (Figure 6, A–C). Further correlation analyses performed on patients with RCC populating yellow PCA quadrants occupied by control donors demonstrated these to have increased CXCR3 (P = 0.0021; r = 0.4898; CI, 0.1874–0.7074) and CXCR5 (P = 0.0029; r = 0.4764; CI, 0.1705–0.6988) (Spearman method), suggesting that these may be 2 key RCC fitness genes, also recently linked to increased abilities of broadly neutralizing antibody production by HIV-1 elite controllers (28).

Pan-cancer FIR-DEGs common to RCC and HIV. This link between RCC ptPBL DEGs and HIV-1 controllers prompted us to examine other pan-cancer FIR-DEGs commonly expressed by HIV-1 controllers. Intriguingly, the majority of our validated pan-cancer FIR-DEGs were represented in an HIV-1 elite DEG screen (29). As such, we searched the literature to elucidate which of these DEGs were useful to both cancer and HIV-1 when expressed by PBLs, demonstrating that 60% of these similarly polarized T cells toward permissiveness to cancer development and HIV-1 infection (Supplemental Table 5). In comparing HIV-1 controller DEG similarities to ccRCC DEGs, we observed that the pan-cancer DEG prioritization pipeline increased identity to HIV-1 controller DEGs from 17% (467 pan-cancer DEGs) to 50% (top 100 pan-cancer DEGs) (Figure 6E). This finding led us to consider whether the pan-cancer FIR-DEG pipeline was actually identifying pan-pathology genes. We thus compared our pan-cancer DEGs to data sets from another study aimed at identifying frontline biomarkers common to numerous pathologies (30). Strikingly, 82.1% of our ptPBL-based and 42.8% of our TIL-based top 200 pan-cancer FIR-DEGs were confirmed by their findings, with 51% of 467 pan-cancer DEGs and 59% of the top 100 pan-cancer DEGs present (Figure 6F). A total of 71.1% of DEGs were commonly reflected by bacterial infection data sets. Having potentially revealed pan-pathology T cell biomarkers, we then compared our lists to cancer patient data sets of response to anti–PD-1 immunotherapy (31, 32), highlighting a few of our pan-cancer FIR-DEGs (Supplemental Table 6), notably including our MMP9, FASLG, and LEF1 minimal triad stratifying patients with ccRCC from control donors (Figure 6C; see Supplemental Table 7 for a summary of validated DEGs common to other data sets).

Pan-cancer DEGs are associated with pan-cancer recurrence and T cell activation. Within the validation cohort, 10 of 28 patients with ccRCC (35.7%) were recorded as having been previously treated for other malignancies including kidney, bladder, blood, breast, colon, liver, melanoma, ovary, prostate, rectal, and uterine cancers, in which a few had suffered from 3 different malignancies with no recorded metastases. We used this opportunity to compare validated DEGs across control donors, patients with RCC, and those positive or negative for pan-cancer recurrence. Strikingly, MMP9 expression best stratified patients with pan-cancer recurrence (P < 0.0001, t test; P = 0.007, 2-way ANOVA with Tukey’s post test) (Figure 7A). All patients categorized as MMP9hi had previously suffered from RCC, along with blood, breast, colon, melanoma, ovarian, prostate, or uterine cancers with a high proportion of adenocarcinomas (80.0%). A disproportionate number of patients categorized as MMP9lo had previously suffered from bladder or prostate cancer (66.6%). Other DEGs stratifying patients with recurring pan-cancer were KLF4, RORC, PDK4, and CCR4; yet these genes were decreased in these patients with recurrence.

Figure 7 Additive prognostic pan-cancer DEGs stratify multi-cancer recurring ccRCC patients having activated CD8+ T cell profiles. (A) DEGs from the validation cohort were compared among cdPBLs (n = 12) and ptPBLs with (n = 10) or without (n = 18) recurring multi-cancers. P, 2-way ANOVA with Tukey post test; *, P < 0.05; **, P < 0.01; ****, P < 0.0001; boxes, upper and lower quartiles; whiskers, all points maxima to minima; +, mean; line, median. Functional classifications of DEG groups are listed above and below, and the literature was used to (B) segregate DEGs according to tolerance or activation phenotypes. Correlograms (Spearman method) using normalized –ΔCt qRT-PCR expression values for visualization of 2 groups of pan-cancer and T cell–polarizing DEGs, with differences observed between all patients with ccRCC vs. control donors and patients with vs. without recurring cancers (Student t test, P < 0.05) (red, increased expression; green, decreased expression). Only MMP9 is significantly increased in multi-cancer patients relative to all others. (C) Pan-cancer DEG combinations tested for additive prognostic effects using the TCGA KIRC data set. Only MMP9, LEF1, PF4V1, TIMP1, and TMEFF1 demonstrate additive prognostic effects, and these cluster in correlograms (as above) examining pan-cancer DEGs with combinatorial effects on prognosis. Kaplan-Meier plots P, log-rank. (D) Venn diagram illustrating that ptPBLs have more differentially represented exon-exon PSR junctions relative to TILs; both are relative to TIICs (P < 0.05; ANOVA, Transcriptome Analysis Console v.3, Affymetrix), with 8% overlap of total PSR junctions between ptPBLs and TILs and 47% of all pan-cancer DEGs having shared ptPBL and TIL PSR junction identity (see Supplemental Table 8).

Because PCA demonstrated that pan-cancer DEGs stratified 2 CD8+ T cell pools in addition to individual patients (Figure 6), and because these DEGs also stratified patients according to pan-cancer recurrence (Figure 7A), we applied correlograms to observe whether their combined expression could tip the balance between tolerant/anergic and activated/effector T cell profiles. Merging the correlograms, which provided a split in DEG populations, with the expression levels of DEGs between isolates and with the balance of DEGs formerly documented in the literature as being associated with activation or tolerance phenotypes suggested the existence of a dominant activated effector CD8+ ptPBL population (Figure 7B). The majority of these effector DEGs (69.2%) providing an activation phenotype were downregulated in patients with pan-cancer recurrence, suggesting these patients may lack the ability to mount an immune response.

Pan-cancer DEGs synergize toward prognosis and are subject to splicing defects. The TCGA KIRC–probing prognostic algorithm was modified to test all combinations of additive effects of pan-cancer FIR-DEGs on patient prognosis. The only DEGs with marked additive effects on patient survival were MMP9, LEF1, PF4V1, and TIMP1; this observation gained additional support from correlograms providing evidence of their coexpression (Figure 7C). Additionally, relative to the 3-DEG signature stratifying patients (Figure 6C), although FASLG was not associated with prognosis (P = 0.401), patients categorized as MMP9hi LEF1hi FASLGhi KIRC had reduced survival rates (HR, 0.0988–0.6324; P = 3.71 × 10^-5). While the TCGA KIRC data set represents whole tumor RNA expression, the Human Protein Atlas showed that, unlike the others having additive effects on prognosis, expression of MMP9 RNA and protein is strictly associated with lymphoid and myeloid systems, thus possibly enhancing prognostic effects by identifying immune-relevant signature populations from whole tumor data sets. An inverse correlation reported between expression of the oncostatic melatonin receptor 1A (MTNR1A), which is extensively expressed by lymphocytes, and MMP-9 expression in RCC, a plausible mechanism for our findings (33), prompted us to reexamine the microarray data sets, in which MTNR1A was reduced in ccRCC CD8+ TILs (P = 8.3 × 10^-4) and CD19+ TIL-Bs (P = 1.4 × 10^-4).
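
To make the combined-expression survival comparison more concrete, the sketch below groups patients as "triple-high" for MMP9, LEF1, and FASLG and computes a Kaplan-Meier survival curve for each group. It is a minimal illustration, not the TCGA KIRC–probing algorithm used in the study: the expression cutoffs, follow-up times, and patient records are all hypothetical, and no log-rank test is performed.

```python
def kaplan_meier(times, events):
    """Product-limit estimate: [(time, survival probability)] at each death time.

    times:  follow-up duration per patient
    events: 1 if death was observed at that time, 0 if censored
    """
    data = sorted(zip(times, events))
    at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        same_time = [e for tt, e in data if tt == t]
        deaths = sum(same_time)
        if deaths:
            survival *= 1 - deaths / at_risk
            curve.append((t, survival))
        at_risk -= len(same_time)  # deaths and censored leave the risk set
        i += len(same_time)
    return curve

def is_triple_high(expr, cutoffs, genes=("MMP9", "LEF1", "FASLG")):
    """True if every gene in the signature exceeds its (assumed) cutoff."""
    return all(expr[g] > cutoffs[g] for g in genes)

# Hypothetical cutoffs and patients (expression, months of follow-up, death flag).
cutoffs = {"MMP9": 8.0, "LEF1": 7.0, "FASLG": 6.0}
patients = [
    ({"MMP9": 9.1, "LEF1": 8.0, "FASLG": 7.5}, 12, 1),
    ({"MMP9": 8.6, "LEF1": 7.4, "FASLG": 6.2}, 20, 1),
    ({"MMP9": 9.4, "LEF1": 7.9, "FASLG": 6.8}, 30, 0),
    ({"MMP9": 6.0, "LEF1": 8.1, "FASLG": 7.0}, 25, 1),
    ({"MMP9": 7.2, "LEF1": 6.5, "FASLG": 5.1}, 40, 0),
    ({"MMP9": 8.5, "LEF1": 6.0, "FASLG": 6.9}, 60, 0),
]

high = [(t, e) for expr, t, e in patients if is_triple_high(expr, cutoffs)]
other = [(t, e) for expr, t, e in patients if not is_triple_high(expr, cutoffs)]
curve_high = kaplan_meier([t for t, _ in high], [e for _, e in high])
curve_other = kaplan_meier([t for t, _ in other], [e for _, e in other])
print("triple-high:", curve_high)
print("other:", curve_other)
```

Censored patients contribute to the number at risk until their last follow-up but never lower the survival estimate themselves, which is why the curve only steps down at observed deaths.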

To gain insight into other possible mechanisms behind effects of pan-cancer FIR-DEGs on patients, and because we used the Human Transcriptome Array (HTA) 2.0 microarray, which is able to distinguish between differential gene expression and transcript isoform modulation, we used the microarray data set to observe whether these differed at the isoform level. In paired patient CD8+ ptPBLs and TILs, with the exception of HIST1H2BG, ICOS, and IFNG, all other validated pan-cancer FIR-DEGs had modified isoforms, and 47.36% of these were found to be mirrored between CD8+ ptPBLs and TILs relative to TIICs (Figure 7D) (Supplemental Table 8). Additionally, as determined by Affymetrix Transcriptome Analysis Console software, there were many more distinct transcript isoforms present and heightened splicing indices for ptPBLs than for TILs, relative to TIICs (i.e., ptPBLs vs. TIICs, 71.97%; avg. splicing index = 18.432, avg. splicing event score = 0.224; TILs vs. TIICs, 28.57%, avg. splicing index = 1.727, avg. splicing event score = 0.306). Thus, the transcript isoform repertoire of CD8+ ptPBLs is much larger than that of CD8+ TILs, likely reflecting similarities among tissue infiltrates, but with a few notable differences including higher isoform numbers for immune checkpoints TIGIT and LAG3. Both MMP9 and TCF7 common isoforms were further increased in ptPBLs relative to TILs, and both CD69 and IQGAP1 common isoforms were modified in ptPBLs relative to cdPBLs. For MMP9, TIMP1, IQGAP1, MPHOSPH8, CD69, TCF7, LAG3, and TIGIT, the same isoforms are repeatedly represented among isolate types (i.e., CD8+ ptPBLs and TILs, relative to CD8+ TIICs and cdPBLs) (Supplemental Table 8). Together, these results suggest that prognostic effects of pan-cancer FIR-DEGs may also be the result of deficiencies in transcript isoforms required for optimal T cell fitness.

Enrichment of pan-cancer–disrupted MMP9 pathways in ccRCC ptPBLs. Our initial strategy to use PPI analyses for refining ccRCC DEGs for validation was only partially useful. Armed with validation experiments and statistical strength for individual DEGs, we repeated the PPI analysis using a rudimentary search engine (STRING). MMP9 had the highest combined interaction annotation score (14.91) and was positioned as a central interacting node of pan-cancer FIR-DEGs (TIMP1, PDK4, LEF1, CDA, KLF4, PF4V1, SELP, PIM2, ICOS, IFNG, IL23A, IL6ST, TCF7, SELL, SERPINE1, OSM, CXCL5, HBA1, COLA1, MAB2, LIFR, IQGAP1, MAPK8, PIK3CA, BCL2, LAG3, and TIGIT), with associated cytokine production and immune cell migration and adhesion cellular processes, which gave its importance more weight (Supplemental Figure 6).

MMP9 in CD8+ PBLs was 1 of 3 DEGs able to stratify patients with RCC from control donors, and MMP9 was increased in patients with RCC who had recurring pan-cancer. We thus used advanced PPI and pathway analyses (IID and pathDIP, using NAViGaTOR) to reexamine the microarray data sets with the aim of deciphering the role of MMP-9 in signaling cascades at play in patients with ccRCC. A comprehensive pathway enrichment analysis using all 1036 nonredundant ptPBL DEGs identified pathways including amyloid fiber formation, platelet activation, sirtuin (SIRT) and histone deacetylase (HDAC) activation, leukocyte transendothelial migration, alcoholism, SUMOylation, androgen receptor, and TNF-α (P < 1 × 10^-10); all had links to MMP9 regulation (Supplemental File 1B). To identify the most relevant MMP-9 pathways of the 235 revealed by pathDIP in ccRCC ptPBLs, we performed correlation analyses revealing that 216 of the 1036 DEGs were significantly correlated with MMP-9–positive pathways (Supplemental Table 9). Generating physical PPI networks using NAViGaTOR demonstrated that all but 6 of these 216 DEGs (97.22%) do interact (Figure 8A; see Supplemental Figure 7 for full PPI). From pathway-enrichment analysis using pathDIP, many disease, cancer, and immunity pathways could be repeatedly observed in MMP-9–significant DEG-associated pathways (Figure 8B; see Supplemental Figure 8 for full analyses). Tissue-specific disrupted PPI networks among MMP-9 interactors in 13 cancers were examined. The majority of identified genes were represented in cell/leukocyte migration and adhesion processes, extracellular matrix disassembly, and collagen metabolism (Supplemental Figure 8), which were recently reported to represent pretreatment serum biomarkers of response to ICB (34). Genes common to ccRCC ptPBLs are involved in immune response and activation, apoptosis regulation, and migration in response to bacteria (35). 
Interestingly, cancers having the highest MMP-9 gained and lost PPIs were colon, mouth, and lung (Figure 8C). Finally, an independent differential correlation analysis and organization of MMP-9 pathways and their significantly associated DEGs was used to validate that although extracted from ccRCC ptPBL expression signatures, the majority of MMP-9 pathways filtered on ccRCC DEGs were most linked to a variety of renal diseases; numerous viral, bacterial, and parasitic infections; numerous cancers; immunity and antigen recognition and activation; differentiation; and cellular survival pathways (Figure 9; see Supplemental Figure 9 for expanded pathway DEG names).
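
Two of the network summaries used in this section, the share of DEGs that have at least one interaction and the node degree that determines node size in the PPI figures, can be illustrated with a short Python sketch. The edge list below is hypothetical and is not taken from IID or NAViGaTOR; only the gene symbols come from the text.

```python
def node_degrees(nodes, edges):
    """Count distinct interaction partners per node.

    Duplicate edges (in either orientation) and self-loops are ignored.
    """
    unique_edges = {frozenset(edge) for edge in edges if edge[0] != edge[1]}
    degree = {node: 0 for node in nodes}
    for edge in unique_edges:
        for node in edge:
            degree[node] += 1
    return degree

# Hypothetical network: 5 DEGs and an illustrative edge list.
nodes = ["MMP9", "TIMP1", "LEF1", "TCF7", "CDA"]
edges = [
    ("MMP9", "TIMP1"),
    ("MMP9", "LEF1"),
    ("LEF1", "TCF7"),
    ("TIMP1", "MMP9"),  # same interaction as the first edge, reversed
]

degree = node_degrees(nodes, edges)
interacting = [n for n in nodes if degree[n] > 0]
percent_interacting = 100 * len(interacting) / len(nodes)
noninteracting = [n for n in nodes if degree[n] == 0]

print(degree)
print(percent_interacting, noninteracting)
```

Applied to a full network, the same calculation yields the percentage of interacting DEGs and identifies the highest-degree nodes that would be drawn as the largest circles.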

Figure 8 Enrichment of disrupted MMP9 pathways in ccRCC patient circulating cells and various cancers. (A) PPI network linking pan-cancer proteins from significant MMP-9 pathways-associated ccRCC ptPBL DEGs. DEGs (nodes/circles) and their interactions (edges/lines) are shown in red (high expression) and green (low expression), and gray edges highlight interactions between them (NAViGaTOR v3 and IID v04-2018). Noninteracting proteins are listed on the top right. DEG nodes are colored according to GO Molecular Functions listed in the legend. Larger node circles represent a high degree of interactions with all other DEGs, and blue DEG names represent centrality of interactors. (B) Pathway enrichment analysis graphs depicting results of pathDIP analysis for MMP-9 pathway interactors from correlation analyses. Upper panel shows significance of enrichment obtained for individual pathways (P value, –log10) adjusted for multiple testing using FDR and Bonferroni methods. Lower bar plot shows overlap between query genes and members of individual pathways. Respective numbers of known and predicted pathway members are distinguished by opacity, and fill color indicates source of given pathway. Plots are restricted to the top 100 most significant pathways (see Supplemental Figure 7A for the full pathways). (C) Tissue-specific disrupted PPI networks among MMP-9 interactors in cancer. Gained and lost MMP-9 PPIs in 13 nonmalignant and pretreatment tumors highlighting its tissue-specific role in cancer, where 106 disrupted MMP-9 PPIs were identified (n = 2801) (see Supplemental Figure 7). A total of 1814 disrupted PPIs were found, 81% of which are disrupted in only 1 or 2 tissues, and fewer than 5% are present in more than 3 tissues.