Date: 2022-08-15

Total RNA-Seq profiling reveals distinct clusters of repeat RNA expression across different epithelial cancers. In order to comprehensively define the expression of repeat RNAs in EOC and compare this with expression in other cancers, we applied our previously established computational alignment methods for total RNA-Seq (19) to 31 patient-derived low-passage EOC, 17 commercially available ovarian, 26 pancreatic ductal adenocarcinoma (PDAC), and 11 colorectal cancer (CRC) cell lines (Figure 1A). In EOC models, this confirmed high contribution of noncoding transcripts to the total transcriptome (Figure 1B), with all major subclasses of repeats represented (Figure 1C). Expression levels of individual repeat RNAs varied, with some repeat RNAs (e.g., L1HS and HERVH) expressed at high levels that were comparable to those of traditional housekeeping genes, such as ACTB and GAPDH (Supplemental Figure 1A; supplemental material available online with this article; https://doi.org/10.1172/JCI155931DS1). Across EOC, PDAC, and CRC, clustering of cell lines by coding genes segregated the samples by cancer type with high accuracy (Figure 1D). As expected, given unique transcriptional programs associated with different cancer types, major clusters comprising nearly entirely EOC (cluster 1 and cluster 5), CRC (cluster 3), and PDAC (cluster 4) cell lines emerged, with 1 additional cluster comprising samples of all 3 cancer types (cluster 2). Notably, clustering by repeat RNA expression alone was able to similarly distinguish between cancer types with few exceptions (Figure 1E), suggesting that, despite common overall repeat dysregulation across epithelial cancers, some repeat RNA species are cancer-type specific and may have important biological roles or consequences in these tumors.

Figure 1 Diverse repeat RNA expression profiles are present in epithelial cancers and cluster tumors by tissue of origin distinctly compared with coding gene-based clustering. (A) Graphical abstract of experimental strategy. (B) Proportion of the total transcriptome represented by mRNA, ribosomal RNA/transfer RNA (rRNA/tRNA), annotated repeats, and nonannotated repeats, averaged across all epithelial ovarian cancer (EOC) cell lines. (C) Quantification of subclasses of repeat RNAs across EOC models using total RNA-Seq expressed as proportion of total transcription, including coding and noncoding reads in each cell line or in patient-derived cells. (D) Heatmap and hierarchical clustering of EOC (green), pancreatic ductal adenocarcinoma (PDAC; purple), and colorectal cancer (CRC; gold) cell lines by coding gene expression, including all coding genes that were differentially expressed between any 2 cell lines (adjusted P < 0.05 and |log2 fold change| > 1). Expression is plotted as scaled log2(normalized counts per million). Pie charts C1–C5 depict the cancer-type composition of each cluster as labeled. (E) Heatmap and hierarchical clustering of EOC (green), PDAC (purple), and CRC (gold) cell lines by repeat RNA expression, including all repeat species that were differentially expressed between any 2 cell lines (adjusted P < 0.05 and |log2 fold change| > 1). Expression is plotted as scaled log2(normalized counts per million). Major clusters defined by similar repeat expression profiles are outlined by black boxes. Pie charts R1–R5 depict the cancer-type composition of each cluster as labeled.
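
The inclusion rule quoted for panels D and E (adjusted P < 0.05 and an absolute log2 fold change greater than 1) amounts to a two-condition filter. The Python sketch below is illustrative only: the record layout `DEResult`, the function name `passes_de_filter`, and the feature names and values are hypothetical and are not taken from the authors' pipeline.

```python
from typing import NamedTuple


class DEResult(NamedTuple):
    """One differential-expression result (hypothetical record layout)."""
    feature: str    # gene or repeat species name
    log2_fc: float  # log2 fold change between two cell lines
    adj_p: float    # multiple-testing-adjusted P value


def passes_de_filter(r: DEResult, max_adj_p: float = 0.05,
                     min_abs_log2_fc: float = 1.0) -> bool:
    """Return True when a feature meets both thresholds quoted in the legend."""
    return r.adj_p < max_adj_p and abs(r.log2_fc) > min_abs_log2_fc


# Made-up values, for illustration only.
results = [
    DEResult("repeat_A", 2.4, 0.001),
    DEResult("repeat_B", 0.6, 0.010),   # significant, but fold change too small
    DEResult("repeat_C", -1.8, 0.200),  # large change, but not significant
    DEResult("repeat_D", -1.2, 0.030),
]
selected = [r.feature for r in results if passes_de_filter(r)]
print(selected)
```

Both up- and downregulated features are retained because the fold-change threshold is applied to the absolute value.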

Satellite repeat RNAs cluster distinctly from other repeat elements and display variable expression across cancer models. To identify subclasses of repeat RNAs with biological relevance across tissue types, consensus clustering analysis of repetitive elements across all cell line samples was performed (see Methods). We detected 5 distinct clusters of coexpressed repetitive elements, with cluster 2 demonstrating the strongest consensus correlation across samples (Figure 2A, denoted by the red asterisk, and Supplemental Figure 1B). Subclass analysis revealed an enrichment for satellite (SAT) repeats in cluster 2 (Figure 2B). Notably, SAT expression was found to be highly variable in EOC cell lines, with SAT RNAs representing the highest proportion of the 50 most variant transcripts (Figure 2C). In line with this, clustering of consensus expression profiles for each major subclass of repeats showed the distinct expression patterns of the SAT subclass across cell lines (Figure 2D). Furthermore, hierarchical clustering of EOC, PDAC, and CRC cell lines based on SAT RNA expression demonstrated unique clustering of samples that was not driven by tissue of origin (Figure 2E). This indicated that SAT RNA expression patterns could reflect transcriptional programs that are shared across diverse epithelial cancers.

Figure 2 Repeat RNAs are coregulated in discrete clusters, with satellite repeat RNAs exhibiting unique expression patterns in epithelial cancers. (A) Heatmap for consensus clustering of repeat elements based on normalized expression. The red asterisk highlights satellite repeat–driven (SAT-driven) cluster 2, which has the strongest consensus correlation of the analyzed clusters. (B) Mosaic plot demonstrating relative repeat element subclass composition of each consensus cluster from A. The red box indicates SAT representation in cluster 2. (C) Proportion of total repeat expression for each subclass within the top 50 variant repeat RNAs across cell lines. (D) Hierarchical clustering of consensus expression of each repeat subclass across EOC (green), PDAC (purple), and CRC (gold) cell lines, depicting SAT consensus expression distinct from consensus expression of other repeat subclasses. (E) Heatmap and hierarchical clustering of EOC (green), PDAC (purple), and CRC (gold) cell lines by SAT RNA expression. Expression is plotted as scaled log2(normalized counts per million). Major clusters defined by similar SAT expression profiles are outlined by black boxes. Pie charts S1–S5 depict the cancer-type composition of each cluster, highlighting clusters distinct from tissue of origin.

SAT repeat RNA expression is linked with an immunosuppressive epithelial-mesenchymal transition gene expression pattern in EOC. To better characterize the relationship of repeatome profiles with coding gene behavior in EOC, we first applied gene set enrichment analysis (GSEA) with the hallmark gene set from the Broad Institute’s Molecular Signatures Database (24, 25) to a gene list ranked based on correlation with the consensus expression calculated for each repeat subclass across EOC cell lines. This demonstrated high positive correlation of SAT repeats with the epithelial-mesenchymal transition (EMT) gene set and anticorrelation with several immune and IFN-response sets, including the IFN-α and IFN-γ gene sets (Figure 3A). A parallel analysis separating EOC cell lines into SAT-high and SAT-low cell lines on the basis of median consensus expression identified enrichment for the hallmark EMT gene set in SAT-high cell lines, while IFN-α, IFN-γ, and inflammatory-response gene sets were enriched in SAT-low cell lines (Supplemental Figure 2A), further validating these associations.
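
The two analyses in this paragraph (ranking genes by their correlation with a subclass consensus expression value, and splitting cell lines at the median consensus value) can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions: the consensus expression is taken as given (its calculation is described in Methods and is not reproduced here), Pearson correlation is assumed as the correlation measure, and all sample names, gene names, and values are made up.

```python
from math import sqrt
from statistics import mean, median


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


def rank_genes(gene_expr, consensus):
    """Sort genes from most positively to most negatively correlated."""
    scores = {g: pearson(v, consensus) for g, v in gene_expr.items()}
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)


def median_split(consensus, samples):
    """Label a sample 'high' if its value exceeds the median, else 'low'."""
    cut = median(consensus)
    return {s: ("high" if c > cut else "low") for s, c in zip(samples, consensus)}


# Toy data (made up): SAT consensus expression in six cell lines.
samples = ["line1", "line2", "line3", "line4", "line5", "line6"]
sat_consensus = [0.2, 0.5, 1.1, 1.9, 2.4, 3.0]
gene_expr = {
    "emt_like_gene": [1.0, 1.4, 2.1, 3.0, 3.3, 4.1],  # rises with SAT
    "ifn_like_gene": [5.0, 4.6, 3.9, 2.8, 2.5, 1.7],  # falls with SAT
    "flat_gene": [2.0, 2.1, 1.9, 2.0, 2.1, 1.9],
}
ranked = rank_genes(gene_expr, sat_consensus)  # ranked list for a preranked GSEA
groups = median_split(sat_consensus, samples)  # SAT-high versus SAT-low labels
```

A ranked list of this kind is the usual input to a preranked GSEA run; the enrichment step itself is outside the scope of this sketch.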

Figure 3 Satellite repeat expression is associated with upregulation of epithelial-mesenchymal transition and downregulation of innate immune-response genes in EOC models. (A) Heatmap of enriched Gene Ontology terms identified using gene set enrichment analysis (GSEA) plotted based on normalized enrichment score. GSEA was applied to a ranked gene list based on correlation with the consensus expression calculated for each repeat subclass, with the FDR set at 0.05. Positive enrichment scores (red) indicate functions that positively correlate with repeat subclass expression. Negative enrichment scores (green) indicate functions that negatively correlate with repeat expression. (B) Hierarchical clustering of consensus expression calculated for each repeat subclass in EOC cell lines. Major clusters are outlined by black boxes. (C) Representative RNA-ISH images with HSATII-specific probes in 2 EOC cell lines and correlation (Pearson’s r²) between HSATII RNA expression, as determined with RNA-Seq in units of log(reads per million [RPM]), and percentage of tumor cells with positive staining for HSATII by RNA-ISH. Original magnification, ×40. (D) Heatmap for consensus clustering of all repeat elements except HSATII, which was removed from analysis, based on normalized expression. The asterisk highlights SAT-driven cluster 1, which shows the highest consensus correlation of analyzed clusters, similar to clustering when HSATII was included. (E) GSEA of hallmark terms ranked on the basis of log2 FC of coding genes for pathways containing genes that are upregulated and downregulated in HSATII-high compared with HSATII-low ovarian cancer cell lines, based on highest (Q4) and lowest (Q1) quantile (see Supplemental Figure 3C). Colored boxes represent the pathways indicated in F. Circle size represents gene set size, and circle color represents adjusted P value. (F) Volcano plot depicting differentially expressed coding genes between HSATII-high and HSATII-low EOC cell lines. EMT, IFN-α, IFN-γ, and inflammatory hallmark pathways are highlighted.

To further investigate this observation, hierarchical clustering of EOC models by consensus expression of repeat RNA subclasses was performed; this separated EOC cell lines into 3 major clusters, as depicted in Figure 3B. Repeat-high (Rep-H) cell lines displayed high expression of all subclasses, while repeat-low (Rep-L) cell lines had relatively low repeat RNA expression in general. A third distinct cluster also emerged; it exhibited high expression of all subclasses of repeat RNAs except for SAT RNAs, and we refer to these as SAT-depleted (SAT-D) cell lines. To characterize the specific contribution of SAT repeats, GSEA was performed on Rep-H and SAT-D EOC cell lines (Supplemental Figure 2B). GSEA demonstrated enrichment of EMT-related genes and downregulation of genes related to innate immune and IFN-response pathways in the Rep-H cell lines compared with SAT-D cell lines, confirming the association observed in the total cohort when analyzed by correlation with SAT expression (Figure 3A and Supplemental Figure 2A). Rep-L EOC had higher enrichment of cell cycle– and replication-related pathways (hallmark E2F, G2M, MYC targets), indicating an anticorrelation of repeat expression with mitotic activity. Collectively, this refined repeat subtyping identifies unique characteristics, including high EMT expression in Rep-H cell lines, activation of IFN-response genes in SAT-D cell lines, and high proliferative activity in Rep-L cell lines.

Human SAT II is a representative SAT repeat RNA that correlates with worsened clinical outcomes in human EOC. In order to further investigate the biological implications of high SAT expression in EOC, we selected human SAT II (HSATII) as a representative repeat species within the SAT subclass; we had previously found it enriched across epithelial cancers (3). As expected, HSATII expression was significantly higher in Rep-H cell lines (Supplemental Figure 3A); this was validated in a subset of cell lines by RNA-ISH (Figure 3C). When HSATII was removed from the analysis of EOC samples, consensus clustering of repetitive elements across samples (Figure 3D and Supplemental Figure 3B) still yielded a similar SAT-driven cluster that displayed the strongest consensus correlation; this implies that HSATII is not the sole driver of the SAT subclass but is instead a representative member. Differential gene expression analysis was run on HSATII-low and HSATII-high EOC cell lines to determine coding gene expression patterns linked with HSATII expression. This revealed that genes related to EMT were upregulated and genes related to IFN-response and inflammatory pathways were downregulated in HSATII-high samples (Figure 3, E and F). These results are similar to those from the same comparisons based on total SAT expression (Supplemental Figure 2A).
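
The HSATII-high versus HSATII-low comparison uses the highest (Q4) and lowest (Q1) quartiles of HSATII expression, as noted in the Figure 3E legend. The Python sketch below shows one way such groups and a simple log2 fold change could be derived. It is a simplification under stated assumptions: quartile cut points come from the standard library's `statistics.quantiles` (default exclusive method), the fold change is a ratio of group means with a pseudocount of 1 rather than the output of a dedicated differential-expression model, and all sample names and values are made up.

```python
from math import log2
from statistics import mean, quantiles


def quartile_groups(expr_by_sample):
    """Return (low, high): samples in the bottom (Q1) and top (Q4) quartiles."""
    values = list(expr_by_sample.values())
    q1, _q2, q3 = quantiles(values, n=4)  # three cut points for four groups
    low = sorted(s for s, v in expr_by_sample.items() if v <= q1)
    high = sorted(s for s, v in expr_by_sample.items() if v >= q3)
    return low, high


def log2_fold_change(gene_by_sample, high, low, pseudocount=1.0):
    """log2 ratio of mean expression in the high group over the low group."""
    mean_high = mean(gene_by_sample[s] for s in high)
    mean_low = mean(gene_by_sample[s] for s in low)
    return log2((mean_high + pseudocount) / (mean_low + pseudocount))


# Toy data (made up): HSATII expression in eight cell lines.
hsatii = {f"s{i}": float(i) for i in range(1, 9)}
# Toy coding gene that is higher in the HSATII-high lines.
gene = {"s1": 1.0, "s2": 1.0, "s3": 3.0, "s4": 3.0,
        "s5": 3.0, "s6": 3.0, "s7": 7.0, "s8": 7.0}

low, high = quartile_groups(hsatii)
lfc = log2_fold_change(gene, high, low)
```

Samples in the two middle quartiles are excluded from the comparison, which sharpens the contrast at the cost of sample size.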

To interrogate the association of SAT repeats with transcriptional programs in patients, we investigated patterns of HSATII expression in total RNA-Seq data from a cohort of 96 human primary ovarian carcinomas (26). Similar to EOC cell lines, tumors with high levels of HSATII expression (Supplemental Figure 3C) demonstrated upregulation of genes related to EMT and downregulation of genes in the IFN-α, IFN-γ, and inflammatory pathways (Figure 4, A and B) compared with those with low HSATII expression. We then performed HSATII RNA-ISH with quantitative image analysis in a separate cohort of patients with advanced high-grade serous ovarian cancer (HGSOC) from the Dana-Farber Cancer Institute to segregate primary tumors into those with low or high HSATII expression (Figure 4C). Notably, separating tumors by HSATII expression revealed significantly shorter overall survival of patients with HSATII-high tumors (Figure 4D). Taken together, this work shows that repeats are a diverse set of RNA species; some are associated with tumor cell IFN response (9, 10, 12, 13, 27, 28), while others, such as SAT repeats, are associated with EMT and low IFN signaling that is typically seen in more aggressive tumors.

Figure 4 High satellite repeat expression is linked with upregulation of epithelial-mesenchymal transition, suppressed immune response, and worsened clinical outcomes in primary human EOC. (A) GSEA results ranked by normalized enrichment score for pathways containing genes that are upregulated (right) and downregulated (left) in HSATII-high compared with HSATII-low early-stage human ovarian carcinoma samples (n = 96). Colored boxes represent the pathways indicated in B. Circle size represents gene set size, and circle color represents adjusted P value. (B) Volcano plot depicting differentially expressed coding genes between HSATII-high and HSATII-low early-stage human ovarian carcinoma samples (n = 96). Genes driving the enrichment of EMT, IFN-α, IFN-γ, and inflammatory Hallmark pathways are highlighted. (C) Representative images of RNA-ISH with an HSATII-specific probe, depicting an example of an HSATII-low (left) and HSATII-high (right) primary human EOC tumor. Scale bar: 100 μm. (D) Kaplan-Meier survival curves for HSATII-high (red) and HSATII-low (blue) in a cohort of 16 primary human EOC tumors using quantified RNA-ISH. All data points and 95% CI are shown (dotted lines). Number at risk is number of patients in the analysis at that time point. Number censored are those who did not experience an event but had their last data point at that time interval. Log-rank, P = 0.0016.
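
For readers unfamiliar with the survival curves in panel D, the Kaplan-Meier estimate is the running product of (1 - deaths / number at risk) over the observed event times, with censored patients leaving the risk set without lowering the curve. The Python sketch below implements only that estimator; the function name `kaplan_meier` and the follow-up times are hypothetical, and the log-rank test and confidence intervals reported in the legend are not reproduced.

```python
def kaplan_meier(times, events):
    """Return [(time, survival_probability)] at each time with an observed event.

    times  : follow-up duration for each patient
    events : 1 if the event (death) was observed, 0 if the patient was censored
    """
    data = sorted(zip(times, events))
    at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = 0
        leaving = 0
        # Collect everyone whose follow-up ends at time t.
        while i < len(data) and data[i][0] == t:
            deaths += data[i][1]
            leaving += 1
            i += 1
        if deaths:
            survival *= 1 - deaths / at_risk
            curve.append((t, survival))
        at_risk -= leaving  # events and censored patients both leave the risk set
    return curve


# Made-up follow-up times (months), for illustration only.
times = [5, 8, 8, 12, 20]
events = [1, 1, 0, 1, 0]
curve = kaplan_meier(times, events)
```

In practice a dedicated survival-analysis package would be used, since it also provides the confidence bands and group comparison shown in the figure.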

Given the correlation of HSATII with low IFN response, high EMT, and worsened survival, we next evaluated the relationship between HSATII expression and the immune microenvironment in ovarian cancer. Cellular deconvolution analysis of the 96 total RNA-Seq ovarian tumors using xCell (29) was performed to estimate percentages of specific immune populations and then calculate their correlation with HSATII expression (Figure 5A and Supplemental Figure 4A). Immune cells positively correlated with HSATII included immature dendritic cells, Tregs, and myeloid cells (monocytes, macrophages, neutrophils), indicating an immune microenvironment dominated by innate immune cells. Given our prior work demonstrating that some noncoding RNAs expressed in cancer cells can directly activate cells of the mononuclear phagocytic system (17), we hypothesized that extracellular vesicles (EVs) could serve as a vehicle to deliver HSATII and other repeats with the ability to modulate innate immune cells within the tumor microenvironment. To test this, we first collected EVs released by PDAC and EOC cell lines and confirmed that isolated EVs expressed typical EV-associated cell surface markers (Supplemental Figure 4B). RNA was then purified from tumor cell–derived EVs and subjected to total RNA-Seq. Compared with the RNA profile of each parental cell line, a robust enrichment of a diverse set of repeat RNAs was detected in EVs isolated from each cell line (Figure 5B), with HSATII being one of the most prevalent RNAs (Figure 5C). To then test the effect of HSATII-enriched EVs on human myeloid cells, we purified EOC-derived EVs and applied them to flow cytometry–sorted CD14+ PBMCs collected from healthy human donors (Figure 6A). CD14+ PBMCs exposed to EOC EVs demonstrated upregulation of genes related to the activation of the innate immune and IFN responses (hallmark IFN-α, IFN-γ, and inflammatory response) compared with unexposed CD14+ cells (Figure 6, B and C). A similar activation of genes within these pathways was observed in response to both PDAC and EOC tumor cell–derived EVs and in CD14+ cells from multiple individual healthy donors in separate experiments (Supplemental Figure 4C), suggesting a common response of monocyte-derived cells to repeat RNA–enriched EVs in the tumor microenvironment.

Figure 5 Repeat RNAs enriched in tumor cell–derived extracellular vesicles can induce changes in the tumor immune microenvironment. (A) Pearson’s correlation coefficients between normalized HSATII expression and the relative frequency of immune cell types in 96 human early-stage ovarian carcinoma tumor samples as identified by the xCell algorithm. Red asterisks indicate correlations with Q < 0.1. *P < 0.05. (B) RNA content of tumor cells (C) and tumor cell–derived extracellular vesicles (E) in PDAC (top) and EOC (bottom) cell lines, as determined by total RNA-Seq and plotted as a fraction of the total transcriptome. (C) Expression heatmap of representative repetitive elements in extracellular vesicles released by EOC cell lines.
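
Panel A flags correlations at two levels, a nominal P < 0.05 and a multiple-testing-adjusted Q < 0.1. The Python sketch below shows how adjusted values could be derived from a list of per-cell-type P values. It assumes the Benjamini-Hochberg procedure, which the legend does not name, and uses made-up cell-type labels and P values.

```python
def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted values (Q values) in input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    q = [0.0] * m
    running_min = 1.0
    # Walk from the largest P value to the smallest, enforcing monotonicity.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        q[i] = running_min
    return q


# Made-up P values for four hypothetical cell types.
cell_types = ["type_a", "type_b", "type_c", "type_d"]
pvals = [0.01, 0.04, 0.03, 0.20]
qvals = benjamini_hochberg(pvals)

flagged = [c for c, q in zip(cell_types, qvals) if q < 0.1]
```

Because the adjustment depends on how many cell types are tested together, the same P value can pass or fail the Q threshold depending on the size of the panel.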

Figure 6 Repeat RNA-enriched extracellular vesicles can induce changes in the tumor immune microenvironment. (A) Schema of experimental design relating to data in B–E. (B) Gene set enrichment analysis of IFN-response signatures and inflammatory response in extracellular vesicle–treated (EV-treated) versus untreated samples. NES, normalized enrichment score. (C) Volcano plot depicting the differential expression of coding genes between EV-treated and untreated EOC cell lines. Genes driving the enrichment in IFN-α, IFN-γ, and inflammatory hallmark pathways are noted. (D) Quantitative RT-PCR of IFN-response genes from THP-1 monocyte cell line treated with high-dose or low-dose EVs from ovarian cell lines, OAW28 (left) and IGROV1 (right). (E) Schema of THP-1 cells treated with HSATII or GFP RNA transfection and quantitative RT-PCR of IFN-response genes without transfection (TF) or with transfection of GFP RNA or HSATII RNA. For RT-PCR, all data points are shown as mean ± SD. One-way ANOVA analysis was performed with Tukey’s multiple comparisons test; significance is shown between EV treatment and PBS or HSATII and GFP RNA. *P < 0.05, **P < 0.01, ***P < 0.001, ****P < 0.0001.

Given the presence of multiple classes of repeat RNAs in EOC-derived EVs (Figure 5B) and the distinct effects on tumor and immune cells conferred by different repeat RNA species, we sought to determine whether HSATII specifically can stimulate myeloid cells. To test this, we used THP-1 monocytic cells to evaluate the responses of these innate immune cells to treatment with EOC EVs and in vitro transcribed HSATII RNA. THP-1 cells treated with EOC EVs from 2 different cell lines had significant induction of IFN-response genes, including DHX58, IFNB1, ISG15, OAS2, MX1, MX2, and IFI44, as measured by quantitative RT-PCR (Figure 6D). HSATII transfection, compared with GFP RNA transfection, significantly induced expression of IFNB1, OAS2, ISG15, MX1, MX2, and IFI44, which indicates that HSATII RNA in EVs partially contributes to the IFN response induced by EOC EVs in monocyte-derived cells (Figure 6E). This suggests that HSATII is sensed by and can generate an IFN response in immune cells that are enriched in the tumor microenvironment of HSATII-high tumors.

Modulating the repeatome with epigenetic drugs or repeat-specific antisense oligos has diverse effects in EOC. We have shown that repeat RNAs can be transmitted to responding innate immune cells and drive an IFN response. However, our collective analyses in EOC cell line models and tumors have indicated that tumor cells with high baseline levels of SAT RNAs lack IFN pathway activation, implying that they have developed an adaptation to suppress the IFN response to repeats. Therefore, we hypothesized that modulating different repeats in tumor cells may overcome this repeatome tolerance. Repetitive elements are known to be suppressed in the normal genome, in part, by DNA and histone methylation (7, 8), and epigenetic therapies have been shown to induce transcription of some repeat species in ovarian cancer models (9, 12, 13). Thus, we first tested the effect of treatment with a DNA methyltransferase inhibitor (DNMTi; 5-azacytidine, 500 nM) and a histone deacetylase inhibitor (HDACi; trichostatin A, 250 nM) on EOC cell lines. As expected, these drugs induced broad changes in repeat element expression, but there were notable differences, with DNMTi promoting a greater induction of ERV, SINE, and LINE elements, while HDACi consistently increased SAT elements across cell lines (Supplemental Figure 5, A and B). Analysis of coding genes induced by these agents revealed enrichment of IFN-response gene expression in cell lines treated with DNMTi, whereas EMT pathway genes were enriched in cell lines treated with HDACi (Supplemental Figure 5, C and D); this was consistent with coexpression patterns of these distinct repeat subsets in our EOC cell lines (Figure 3A) and tumors (Figure 4, A and B). These findings suggest that DNA methylation and histone acetylation have different contributions to the regulation of the repeatome profile in EOC, and, importantly, the response to these drugs can have discordant pro- and antitumoral effects on cancer cells.

Given the consistent relationship between SAT repeat expression and EMT-high and IFN-low phenotypes, we pursued direct targeting of HSATII with HSATII-specific locked nucleic acids (LNAs) as an antisense oligo therapeutic. HSATII LNAs and control scramble LNAs were transfected into EOC cell lines, followed by total RNA-Seq analysis at various times after transfection. This revealed a specific and marked increase in HSATII RNA in the cells, which peaked on days 2–3 (Figure 7A), with minimal off-target effects on other repeat RNA species. Analysis of the coding gene transcripts in HSATII LNA–transfected cells over time revealed an upregulation of innate immune-response genes and IFN-stimulated genes, indicating that HSATII LNAs could target cancer-specific HSATII RNA and trigger an IFN response (Figure 7B). In addition, EOC cells grown in nonadherent culture following HSATII LNA transfection consistently demonstrated a significant reduction in tumorsphere growth and, in the case of OVSAHO cells, increased cell death, compared with cells transfected with control LNA (Figure 7C). Further investigation into the immune-related transcriptional changes in HSATII LNA–transfected tumor cells also revealed alterations in expression of genes related to MHC class I (MHC-I) antigen presentation. Similar to the anticorrelation observed between steady-state HSATII levels and innate immune and IFN-response genes, we found that EOC cell lines (Supplemental Figure 5E) and primary human EOC tumors (Supplemental Figure 5F) with higher baseline HSATII RNA levels had decreased expression of MHC-I–related genes. However, EOC cells transfected with HSATII LNA revealed a striking upregulation of these MHC-I–related genes (Figure 7E). Furthermore, HSATII LNA–transfected EOC cell lines also demonstrated an increase in MHC-I proteins on the cell surface compared with control LNA-transfected cells (Figure 7D). Taken together, the increase in HSATII RNA levels by targeted LNA induced an IFN response associated with EOC tumor cell cytotoxicity and upregulation of MHC genes, suggesting the possibility that HSATII RNA modulation could sensitize EOC tumor cells to immunotherapy strategies.