Genomic and transcriptomic profiling reveals distinct molecular subsets associated with outcomes in mantle cell lymphoma

Mantle cell lymphoma (MCL) is a phenotypically and genetically heterogeneous malignancy in which the genetic alterations determining clinical indications are not fully understood. Here, we performed a comprehensive whole-exome sequencing analysis of 152 primary samples derived from 134 MCL patients, including longitudinal samples from 16 patients and matched RNA-Seq data from 48 samples. We classified MCL into 4 robust clusters (C1–C4). C1 featured mutated immunoglobulin heavy variable (IGHV), CCND1 mutation, amp(11q13), and active B cell receptor (BCR) signaling. C2 was enriched with del(11q)/ATM mutations and upregulation of NF-κB and DNA repair pathways. C3 was characterized by mutations in SP140, NOTCH1, and NSD2, with downregulation of BCR signaling and MYC targets. C4 harbored del(17p)/TP53 mutations, del(13q), and del(9p), and active MYC pathway and hyperproliferation signatures. Patients in these 4 clusters had distinct outcomes (5-year overall survival [OS] rates for C1–C4 were 100%, 56.7%, 48.7%, and 14.2%, respectively). We also inferred the temporal order of genetic events and studied clonal evolution of 16 patients before treatment and at progression/relapse. Eleven of these samples showed drastic clonal evolution that was associated with inferior survival, while the other samples showed modest or no evolution. Our study thus identifies genetic subsets that clinically define this malignancy and delineates clonal evolution patterns and their impact on clinical outcomes.

datasets was applied to filter out germline risk mutation sites. This includes: 1)a panel of normal files from Novogene that was generated from 2800 WES data and 500 WGS from a Chinese population (provided by Novogene); 2) a panel of normal files from the ChinaMap database (16), which was created using 10,588 WGS of Chinese individuals. 3) a panel of normal files derived from 1800 WES and 380 WGS (provided by St. Jude Children's Research Hospital).
Driver mutations were defined using MutSig2CV software (17). Genes with q value ≤ 0.1 and frequency > 5% were defined as driver mutations (13)(14)(15). Recurrently mutated genes are identified mutation occurred in >5 samples and with the mutation frequency >3%. To confirm the efficiency of the germline filter step, data for the 102 tumor samples that had matched normal data were reanalyzed using the tumor-only pipeline. After panel of normal filtration, the recurrent mutation landscape of matched normal samples analyzed as tumor-only was comparable with that of somatic mutations identified from normal pipeline with matched normal (Supplemental Figure 2C). The q values of candidate driver genes identified using either of the two Mutsig2CV modes were highly correlated (R 2 =0.887, P=2.31×10 -5 ，supplemental Figure 2D). Fisher's exact test was used to estimate occurrence bias between tumor-only mode and matched-normal mode for all SNVs present in more than 3 samples. The expected P values were calculated using uniform distribution (Supplemental Figure 2E).

Copy number analysis from WES data
Somatic copy number alterations (SCNAs) and somatic allelic copy number variations were detected using best practices for the GATK4 somatic copy number variant discovery pipeline (4). First, interval files were preprocessed and annotated. Second, the coverage across intervals was calculated in tumor and normal samples. Third, a panel of normal coverage files was generated using the 102 normal samples from matched tumor-normal pairs. Fourth, segment copy ratio and allelic segment copy ratios were calculated and normalized. To generate an allelic copy number variation file that could work with ABSOLUTE, a WDL file from github was used (https://github.com/broadinstitute/gatk/blob/master/scripts/unsupported/combine_tracks_postprocessing_cnv).
Significant CNA regions in the cohort were detected by using GISTIC2 (18). In brief, GISTIC2 assigns a G-score that considers the amplitude of the aberration as well as the frequency of its occurrence across samples for each aberration. Ninety-nine percent was used as the confidence level to determine wide peak boundaries and q value ≤ 0.1 was used to determine significantly altered regions.
To evaluate the consistence of copy number detection of tumor-only samples with samples have paired normal, the 102 matched normal samples were re-analyzed using tumor-only mode. Allelic SCNAs identified using tumor-only mode showed high correlation with those identified using matched-normal mode (Supplemental Figure 7A). In addition, the significantly deleted and amplified regions were highly consistent between the modes (Supplemental Figure 7D).

Mutual exclusivity and co-occurrence estimation
Odds ratios (OR) were calculated to assess associations among different genetic lesions; Fisher's exact test was used to determine significance. Multiple testing correction was performed using the Benjamini and Hochberg method and a false-discovery rate of 5% (q ≤ 0.05) was defined as the threshold for significance.

Estimation of purity, ploidy, and cancer cell fraction (CCF)
The ABSOLUTE algorithm was used to calculate the tumor purity, ploidy, and cancer cell fraction for somatic mutations and copy number variations (19). Candidate models were reviewed by three independent reviewers. For tumor samples that had matched normal samples, ABSOLUTE was run independently in both matchednormal and tumor-only modes. The resulting estimates of ploidy and purity showed high similarity between both 5 modes (Supplemental Figure 7B-C). Fluorescence in situ hybridization (FISH) was used to independently validate the purity of samples. Purity as determined using ABSOLUTE was highly consistent with purity estimated by FISH (Supplemental Figure 6C).

Inference of the order of genetic alterations
Statistical methods were adapted to infer the order of genetic alterations, as previously reported (20). Briefly, CCFs of mutation and SCNA pairs were first calculated for each sample in the full 134 MCL sample cohort. Then samples with paired clonal and subclonal genetic events were identified. All known driver pairs for which at least 3 clonal-subclonal orderings were observed were considered. Next, a one-sided Fisher's exact test was performed to test whether an event was more frequently changed from clonal evolution than random occurrence. Only mutations and SCNAs that occurred at a frequency > 5% were considered.

Clonal evolution of longitudinally collected samples
Phylogenetic analysis was performed on longitudinally collected samples (collected at diagnosis and at relapse or first and second relapse) using the PhylogicNDT package (21). Briefly, raw CCF probability density distributions of the somatic genetic variants identified using the ABSOLUTE pipeline were used for multidimensional nonparametric Dirichlet process to define the underlying clonal structure. The assignment of mutations to clusters were sampled and learned via a Markov Chain Monte Carlo (MCMC) Gibbs Sampler through a multinomial distribution. Tumor clone(s) with a CCF change between the two time points > 0.5 were defined as "extreme evolution"; a change between 0.2 and 0.5 defined as "modest evolution"; and a change <0.2 defined as "no evolution".

Mutation signature analysis
MutationalPattern was used to determine de novo mutation signatures (22). Briefly, all SNVs were classified into 96 trinucleotide changes in each sample. SNVs were categorized into two groups "Cluster" and "Noncluster" according to the nearest mutation distance, using the threshold of 1Kb (23). Then, "Cluster" and "Noncluster" SNVs were split in each sample to generate a 96x2M (M is the sample number) matrix. MutationalPattern was applied to discover mutation signatures by using non-negative matrix factorization (NMF) to reduce the dimensions. Cosine similarity was calculated to determine whether "extracted_signatures" were in line with the COSMIC V3 mutational signature database(24) (https://cancer.sanger.ac.uk/cosmic/signatures). The signature from COSMIC V3 with highest cosine similarity were choosed (Supplemental Figure 5B). The contribution of each mutation signature in each sample was plotted using the "plot_contribution" function in both "absolute" and "relative" modes.

CLUMPS analysis
The details of the calculation of the weighted average proximity (WAP) score are described previously (25). In brief, protein structures corresponding to each gene were downloaded from the PDB databank and the ones containing the majority of the mutated residues were used for WAP score calculation according to the following equation (25).
= ∑ − , 2 2 2 . Here q and r are distinct protein residues, dq,r is the distance in Å between the centroids of the two residues, and nq (and nr) are sigmoidal functions: = + . Nq is the number of patient samples where mutation q was observed, and m are constants with values 2 and 3, respectively. The significance of a given WAP score was calculated by estimating the P value from the null distribution. The null distribution was constructed by randomly selecting the residue positions from all over the protein structure and calculating the WAP scores, repeating the procedure 10,000 times. The following protein structures were used for calculating the WAP scores; ATM: (6K9L, 6K9K(26), 5NP1 (27)

RNA sequencing library preparation
Sequencing libraries were generated using the NEBNext UltraTM RNA Library Prep Kit for Illumina (NEB). Briefly, mRNA was purified from 1 μg of total RNA using poly-T oligo-attached magnetic beads. Fragmentation was carried out using divalent cations in NEBNext First Strand Synthesis Reaction Buffer (5X). First strand cDNA synthesis was performed using random hexamer primer and M-MuLV reverse transcriptase (RNase H-). Second strand cDNA was subsequently synthesized using DNA polymerase I and RNase H. Remaining overhangs were converted into blunt ends via exonuclease/polymerase activities. After adenylation of the 3' ends of DNA fragments, NEBNext adaptors with hairpin loop structures were ligated to the fragments to prepare them for hybridization. The library fragments were purified using the AMPure XP system (Beckman Coulter) to select fragments of 250-300 bp in length. Library quality and size was assessed using an Agilent Bioanalyzer 2100 system (Agilent Technologies).

Integrative analysis of gene expression and copy number variations
All the libraries were sequenced on Illumina Hiseq platform and 150bp paired-end 50 million reads were generated per library. Adaptors and low-quality bases were first removed from the sequencing reads using Trimmomatic (2). The remaining reads were then aligned to the human reference genome (GRCh38/hg38) using STAR with default parameters (40). Expression levels of mRNAs were normalized to transcript per million (TPM).
To identify gene expression associated with SCNAs, mRNAs that were expressed independent of SNCAs were filtered out by performing Pearson correlation analysis between gene expression (log2 (TPM+1)) and copy number change of corresponding genes (log2(X/2)), where X indicates copy number of the corresponding gene. Transcripts with P value < 0.1 were defined as "SCNA sensitive mRNAs". SCNA-associated RNA expression was determined by assigning MCL samples into two groups based on the presence or absence of SCNAs. Genes with fewer than 3 samples in either group were filtered out for the statistical analysis. Absolute log2 fold change > 0.5 with q value ≤ 0.05 (Mann-Whitney U test and Benjamini-Hocherg correction for multiple comparisons) was used as threshold to define significantly dysregulated mRNAs associated with SCNAs.

Gene expression enrichment analysis
Gene expression signatures in the different MCL subsets were identified using Gene Set Enrichment Analysis (GSEA) in the 48 samples that had paired WES and RNA-seq data. These studies used three databases: KEGG pathway, HALLMARK, and Signature DB databases (updated 03/09/2020, https://lymphochip.nih.gov/signaturedb/) (41) to capture expression signatures associated with different clusters. Gene sets with P < 0.05 and FDR < 0.25 were considered significantly enriched pathways.
To directly compare pathway expression for each cluster, the log2-transformed pseudo TPM values (TMP+1) for all genes in the gene set were averaged to provide a signature value for each sample. The average signature value for samples assigned to each cluster was calculated as the cluster average expression of the signature. These values were then linearly transformed and F test used to compare each cluster.

Allele-specific expression in RNA-seq
Expression of mutations was called using the GATK4 RNA-seq short variant discovery pipeline for the 48 samples with matched WES and RNA-seq data (4). Briefly, reads were mapped to the human reference genome (GRCh37/hg19) in two-pass mode with STAR to better align the reads to novel splice junctions (40). The alignment file was transformed into a DNA aligner-like format using splitNCigarReads from GATK4. Then a somatic mutation calling module, including BaseRecalibrator, Apply Recalibration and HaplotypeCaller, was used to detect expressed somatic mutations. Variants were filtered using the VariatFilteration command. Many mutations could not be detected by RNA somatic mutation pipeline due to lack of read coverage of the base, which indicated low expression of the genes. Therefore, read coverage was calculated for each base using BEDTools (42). Only mutations covered by more than 20 reads were included in the analysis and those with more than 5 alternative reads were considered expressed mutations. Variants with |RNA_MAF -DNA_MAF|≥0.2 and a false discovery rate ≤ 0.01 (Fisher's exact test) were considered to show allele-specific expression.

Survival and statistical analysis
The primary endpoints were progression free survival (PFS) and overall survival (OS). PFS was defined as the interval between diagnosis and first progression or date of the last follow-up. OS was defined as the interval between diagnosis and death or last follow-up. Survival curves were estimated using the Kaplan-Meier method and log-rank test was used to assess statistical significance for PFS and OS between cohorts. Multivariate Cox regression analysis was used to assess the independent prognostic impact from MIPI risk, IGHV mutational status, and individual genetic factors for outcomes in the MCL cohort. Student's t test or Mann-Whitney U tests were used to evaluate differences between continuous variables. Fisher's exact test or chi-squared tests were used to examine the significance of differences in categorical variables. Spearman correlation was used to measure the association between continuous variables. Statistical analyses were performed using SPSS Version 21.0 or R version 3.6.2. Differences were considered statistically significant when p values were less than 0.05. P values for multiple comparisons were adjusted using the Benjamini-Hochberg correction.

Quantitative Polymerase chain reaction (PCR) validation
cDNA synthesis was performed by using 5X All-In-One RT MasterMix Synthesis System (Applied Biological Materials, Vancouver, Canada) following the manufacturer's instructions. The SYBR premix Ex TaqTM assays (Takara, Dalian, China) was then utilized to amplify specific cDNA fragments in the Prism 7500 PCR system (Applied Biosystems) to evaluate the expression of selected genes, including NAA38 (17p13 3). All these selected genes were located in significant SCNVs. The sequences for the primers were provided below. Data were analyzed by comparative threshold cycle method using B2M as housekeeping reference gene.      Figure 2. Quality control for mutation call pipelines. A. Scatter plots showing the correlation between the measured allele frequency (AF) and the predicted AF for germline events identified by using the tumor-only pipeline to analyze matched-normal samples. Left: positive correlation if events are predicted as germline events by the tumor-only pipeline. Right: no correlation if the events are predicted as germline events by the tumor-only pipeline. B. Scatter plots showing the correlation between the measured AF and the predicted AF for somatic events identified by using the tumor-only pipeline to analyze matched-normal samples. The correlation was more pronounced when the events were predicted as somatic events than as germline events. C. Recurrent somatic mutations present in the 89 tumor samples that had matched normal samples. Mutations were determined using the paired pipeline (left) and the tumor-only pipeline (right). D. Scatter plot showing the correlation of Mutsig2CV false discovery rate (FDR) values for recurrent mutations that were called by the paired pipeline and tumor-only pipeline (events with FDR q value <0.1 were plotted). The plot shows the consistency of two pipelines for Mutsig2CV analyses. E. Two-sided Fisher's exact tests were used for somatic events determined by paired pipeline and tumor only pipeline. The observed P values were lower than randomly expected P value, indicating no significant bias of somatic mutation calls between the two pipelines.                          AMP15q24, NSD2,  FAM8A1, SMARCD1, SPTSSA,  HIVEP2, GART, RIPK4, TEX26,  GALNT13, SLC5A7, INSL5,   SSR3  TMIGD3  CARD11  ODF3  MYB  CACNA1A   NOTCH2, ALK  KCNB1, COPE  SHISA6, WDR74  HOXA4, PRPF31  ALOX15, Figure 17. Clonality changes of genetic events in our longitudinal samples. Dynamic changes in genetic alterations during disease progression from our longitudinal samples. Representative genetic alterations for each cluster are listed in the plot.