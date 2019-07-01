Erythrocyte membrane preparation, quantitation of spectrin content, and limited tryptic digestion of spectrin. Erythrocyte membranes were prepared from peripheral blood as previously described (39). Membrane proteins were analyzed and spectrin content determined as described (40). Membrane proteins were separated by SDS-PAGE on 3.5%–17% gradient polyacrylamide gels and stained with Coomassie blue. The spectrin/band 3 ratio was quantified by densitometric scanning of the stained gels at 540 nm and integration of the surface area under the spectrin and band 3 peaks (41). Spectrin was extracted by incubating ghosts overnight at 4°C in low ionic strength buffer. Limited tryptic digests of spectrin extracts were prepared as described (39) and separated by 2-dimensional gel electrophoresis with isoelectric focusing (IEF) as modified by Speicher et al. (42).
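
The peak-area integration behind the spectrin/band 3 ratio can be illustrated with a minimal sketch. It assumes a densitometric trace stored as paired position and absorbance values, uses simple trapezoidal integration without baseline correction, and takes hand-chosen peak windows; the trace values and window boundaries are invented for illustration and are not data from this study.

```python
def peak_area(positions, absorbance, start, end):
    """Trapezoidal area under the trace for points with start <= position <= end."""
    pts = [(x, y) for x, y in zip(positions, absorbance) if start <= x <= end]
    area = 0.0
    for (x0, y0), (x1, y1) in zip(pts, pts[1:]):
        area += (x1 - x0) * (y0 + y1) / 2.0
    return area


def spectrin_band3_ratio(positions, absorbance, spectrin_window, band3_window):
    """Ratio of the integrated spectrin peak to the integrated band 3 peak."""
    spectrin = peak_area(positions, absorbance, *spectrin_window)
    band3 = peak_area(positions, absorbance, *band3_window)
    return spectrin / band3


# Hypothetical scan: positions along the lane and absorbance at 540 nm.
positions = [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
absorbance = [0, 2, 4, 2, 0, 0, 1, 2, 1, 0]

# Hypothetical peak windows chosen by eye for this toy trace.
ratio = spectrin_band3_ratio(positions, absorbance, (0, 4), (5, 9))
print(f"spectrin/band 3 ratio (toy data): {ratio:.2f}")
```

In practice the windows would be set from the actual scan, and a baseline subtraction step may be needed before integration.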

Exon capture and WES. Targeted regions of genomic DNA were captured using a SeqCap EZ Exome V2.0 (Roche) solution-based capture system according to the manufacturer’s protocol. The captured, purified, and amplified libraries targeting exomes from patients were sequenced on a HiSeq Analyzer with paired-end sequencing at 75-bp read length. Sequencing reads were processed and analyzed as described (8).

Fastq sequence reads were aligned to the human genome (hg19/GRCh37, build 37) using BWA mem 0.7.9a software (9). Variant analysis was performed using Genome Analysis Toolkit (GATK) analysis software v.3.1-1 (10, 11). Sequencing reads in the region of known insertions/deletions in the 1000 Genomes database were realigned to reduce false-positive variants. Individual base quality scores were empirically recalibrated using covariate information in combination with the variation in dbSNP build 137 and from 629 human genome sequences from the 1000 Genomes project (12, 13). Base alignment quality scores were determined and then recalibrated with GATK to further reduce false-positive calls (14). SNPs and insertions/deletions (indels) were called using the GATK HaplotypeCaller Bayesian algorithm for variant discovery and genotyping, which uses base qualities and allele counts to determine probabilities for called variants. Indels and novel SNPs were annotated using Annovar (43), which determines whether a SNP or indel changes a protein sequence or splice site and provides a variety of predictions, including the combined annotation dependent depletion (CADD) algorithm (44) and MutationTaster (45), to assess the likely effect of a mutation on protein function.

Whole-genome sequencing. Genomic DNA was prepared using the TruSeq PCR-free DNA HT sample preparation kit (Illumina) with 450 bp insert size. Intact genomic DNA was sheared, followed by end-repair and bead-based size selection of fragmented molecules. Adenines were added to the 3′ ends of the size-selected DNA fragments, followed by ligation of Illumina sequence adaptors onto the fragments and PCR. Library quality control included a measurement of the average size of library fragments using a BioAnalyzer (Agilent), estimation of the total concentration of DNA by PicoGreen (Thermo Fisher Scientific), and a measurement of the yield and efficiency of the adaptor ligation process via a quantitative PCR assay using primers specific to the adaptor sequence.

Sequencing was performed on a HiSeq X instrument (Illumina) generating 2 × 150-bp read lengths. After alignment and duplicate removal, this equated to 30× mean genome coverage (for the gender-specific ~2.85 gigabase mappable human genome). WGS data were processed through an automated pipeline at New York Genome Center’s high-performance computational facility. Paired-end 150-bp reads from the WGS were aligned to the GRCh37 human reference using the Burrows-Wheeler Aligner (BWA-MEM v0.7.8) (46) and processed using the best practices pipeline that includes marking of duplicate reads by the use of Picard tools (v1.83, http://picard.sourceforge.net), realignment around indels, and base recalibration via GATK v3.2.2 (47).
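
As a rough illustration of what this coverage figure implies, the following sketch computes the number of retained read pairs needed to reach a given mean coverage. It assumes that coverage is simply total retained bases divided by mappable genome size and that every retained read contributes its full 150 bp; the function name is illustrative and not part of any pipeline described here.

```python
def read_pairs_for_coverage(mean_coverage, genome_bp, read_len, reads_per_pair=2):
    """Read pairs needed so that total sequenced bases / genome size = mean coverage."""
    bases_needed = mean_coverage * genome_bp
    return bases_needed / (read_len * reads_per_pair)


# Values taken from the text: 30x coverage, ~2.85 Gb mappable genome, 2 x 150-bp reads.
pairs = read_pairs_for_coverage(30, 2_850_000_000, 150)
print(f"approximate retained read pairs: {pairs:,.0f}")
```

Under these assumptions the estimate is about 285 million read pairs remaining after alignment and duplicate removal; the number of raw pairs sequenced would be higher.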

Two patients (nos. 1 and 23) were selected for WGS, both heterozygous for the αBH variant. From the WGS data, 122 variants in the SPTA1 region (chr1:158545000-158690000) present in both samples were selected for further analysis using Annovar software to annotate allele frequencies and predict functional consequences. In addition, we identified samples in the 1000 Genomes database of healthy individuals who were homozygous for αBH and downloaded genotype data for these individuals. Variants present in the 2 patient samples that were homozygous in the 1000 Genomes samples were excluded from further analysis, leaving 16 candidate variants, of which 4 were excluded because they were present at high frequency (>1%) in the 1000 Genomes database. Eleven of the remaining variants were excluded because they were homopolymeric or short tandem repeat size variations that are common and often not accurately genotyped. The sole remaining variant, rs200830867, chr1:158613314 G to A, is the αLEPRA allele (13). This variant was initially described in an rHS patient, in trans to an SPTA1 nonsense mutation, and is associated with an elongated α-spectrin mRNA transcript containing 70 nt from the 3′ end of intron 30, hypothesized to lead to frameshift and premature chain termination (13).
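
The successive exclusion steps described above can be sketched as a simple filter. This is only an illustration of the logic, not the software actually used: the record layout, field names, and every toy variant other than rs200830867 are hypothetical, and the allele frequency assigned to rs200830867 is a placeholder rather than a reported value.

```python
from dataclasses import dataclass

REGION = ("chr1", 158_545_000, 158_690_000)  # SPTA1 region given in the text
MAX_AF = 0.01  # variants above 1% frequency in 1000 Genomes are excluded


@dataclass
class Variant:
    name: str
    chrom: str
    pos: int
    in_both_patients: bool        # present in both WGS samples
    hom_in_abh_homozygotes: bool  # homozygous in healthy alpha-BH homozygotes
    af_1000g: float               # allele frequency in 1000 Genomes
    is_repeat: bool               # homopolymer or short tandem repeat size variant


def filter_candidates(variants):
    """Apply the exclusion steps in the order described in the text."""
    chrom, start, end = REGION
    kept = []
    for v in variants:
        if v.chrom != chrom or not (start <= v.pos <= end):
            continue  # outside the SPTA1 region
        if not v.in_both_patients:
            continue  # must be shared by both patients
        if v.hom_in_abh_homozygotes:
            continue  # seen homozygous in healthy alpha-BH homozygotes
        if v.af_1000g > MAX_AF:
            continue  # too common
        if v.is_repeat:
            continue  # unreliable repeat-length genotype
        kept.append(v)
    return kept


# Toy input: only the position of rs200830867 comes from the text.
toy_variants = [
    Variant("toy_outside_region", "chr1", 158_700_000, True, False, 0.001, False),
    Variant("toy_hom_in_controls", "chr1", 158_600_000, True, True, 0.002, False),
    Variant("toy_common", "chr1", 158_610_000, True, False, 0.15, False),
    Variant("toy_homopolymer", "chr1", 158_620_000, True, False, 0.001, True),
    Variant("rs200830867", "chr1", 158_613_314, True, False, 0.005, False),
]

candidates = [v.name for v in filter_candidates(toy_variants)]
print(candidates)
```

With these toy records only rs200830867 passes every step, mirroring the outcome reported for the real data.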

Mutation validation. Variants identified by WES or WGS were validated by Sanger sequencing. Sanger sequencing was performed on a 3130XL capillary sequencer (Applied Biosystems). SPTA1 variants were classified as cis or trans by study of the proband’s parents. Variants were classified as novel if not found in the 1000 Genomes or Exome Aggregation Consortium (ExAC) databases.

Molecular modeling. The effects of α-spectrin missense mutations on tetramer structure and their likely functional consequences were evaluated using the structures of the univalent α-β tetramer complex (48), closed spectrin dimers (49, 50), and the divalent tetramer complex as templates (50).

Minigene analyses. Minigene splicing assays were designed as described (18). Plasmids were constructed containing the human erythroid ANK1 promoter, the human HBG1 gene including the HBG1 ATG/Kozak consensus, and the region of the SPTA1 gene containing either WT or αLEPRA sequence for use in minigene splicing assays. Briefly, a SmaI-BglII fragment of the ANK1 gene promoter was linked to a BglII-HindIII fragment containing exons 1–3 of the HBG1 gene and flanking 3′ region. A 1684-bp XbaI fragment of the SPTA1 gene from intron 29 to intron 32, containing either the WT or αLEPRA sequence, was cloned into intron 2 of the HBG1 gene. These plasmids contained a termination codon and the polyadenylation signal from the HBG1 gene.

Wild-type and αLEPRA spectrin minigene plasmids containing fragments of the SPTA1 gene from intron 29 to intron 32 were prepared and transfected into 10^7 K562 cells (ATCC, CCL-243) using nucleofection program T-016, buffer V (Amaxa). Cells were harvested after 48 hours, RNA was prepared and reverse transcribed using oligo d(T) and random hexamers, and cDNA was amplified with primers that included sequences from flanking HBG1 and internal SPTA1 exons. These primers allow differentiation of minigene amplification products from endogenous SPTA1 transcripts.

Fluorescent Taqman probes (Applied Biosystems) corresponding to total SPTA1 and the elongated α-spectrin transcript were added to amplification reactions, allowing determination of the contribution of elongated α-spectrin transcripts after normalization. To assay the total minigene SPTA1 transcript, PCR was performed using primers 5′-CTTGGAGACTATGCCAACCTAAA-3′ (sense) and 5′-CACATTTCCCAGGAGCTGAA-3′ (antisense) with transcript quantitation by Taqman probe 5′-AATGGATCAGTGAGATGCTGCCCA-3′ (sense). To assay elongated α-spectrin transcripts, PCR was performed using primers G4833, 5′-GAGAATTCCCTGAGGTCAGATG-3′ (sense) and G44820, 5′-ATCCCAGACTCCCTCCTG-3′ (antisense) with transcript quantitation by Taqman probe 5′-AGGCTTTGATGAAGAAACGGGACGA-3′ (sense). All analyses included 3 biologic and 2 technical replicates.
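
For readers who want to sanity-check the oligonucleotides listed above, the following minimal sketch reports length, GC fraction, and reverse complement for each sequence. The dictionary keys are descriptive labels introduced here (only G4833 and G44820 are names from the text), and no melting temperature is computed because the assay conditions needed for that are not stated.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Reverse complement of an uppercase or lowercase DNA sequence."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def gc_fraction(seq):
    """Fraction of bases that are G or C."""
    seq = seq.upper()
    return (seq.count("G") + seq.count("C")) / len(seq)


# Sequences copied from the text; the labels are illustrative.
OLIGOS = {
    "total_sense_primer": "CTTGGAGACTATGCCAACCTAAA",
    "total_antisense_primer": "CACATTTCCCAGGAGCTGAA",
    "total_probe": "AATGGATCAGTGAGATGCTGCCCA",
    "elongated_sense_primer_G4833": "GAGAATTCCCTGAGGTCAGATG",
    "elongated_antisense_primer_G44820": "ATCCCAGACTCCCTCCTG",
    "elongated_probe": "AGGCTTTGATGAAGAAACGGGACGA",
}

for label, seq in OLIGOS.items():
    print(f"{label}: {len(seq)} nt, GC {gc_fraction(seq):.2f}, "
          f"revcomp {reverse_complement(seq)}")
```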

CRISPR/Cas9 gene editing analyses. Four guide RNAs (gRNAs) in the regions flanking the predicted BP2/αLEPRA region were designed, synthesized, and tested for use in gene editing experiments. gRNAs were synthesized using HiScribe T7 High Yield RNA Synthesis Kit (New England Biolabs). Testing was done using the Guide-it sgRNA Screening Kit (Takara). One gRNA was selected and used for all experiments (Target sequence: GGATTCAGAAGATATACTCA). Cas9 protein (15 μg, PNAbio) was complexed with 200 pmol gRNA. The gRNA-Cas9-RNP complex and 200 pmol donor ssDNA (BP2 A to G donor ssDNA: g(s)g(s)a(s)actctgtaccacacaagtagcccattattagatgtttctcctcctattaagttgaaacccacctccctgtaaggcatatattatttgaccctgagtatatcttctgaatccGcaaaggatacctt(s)c(s)a(s) or αLEPRA donor ssDNA: g(s)g(s)a(s)actctgtaccacacaagtagcccattattagatgtttctcctcctattaagttgaaacccacctccctgtaaggcatatattatttgaccctgagtatatcttctgaatTcacaaaggatacctt(s)c(s)a(s)) were nucleofected (Amaxa) into K562 cells using the T-016 nucleofector program. Cells were placed into 96-well plates 48 hours after nucleofection. Clones were expanded for 10–14 days and then screened for either homozygous substitution of the BP2 A to G or homozygous substitution of the WT nucleotide with the αLEPRA C to T change, by PCR amplification followed by Sanger nucleotide sequencing. Homozygous mutant BP2 and αLEPRA clones were identified and used in additional studies (Supplemental Figure 4).
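
The relationship between the gRNA target and the two donor oligonucleotides can be checked with a short sketch. It assumes that "(s)" marks a modified linkage (commonly a phosphorothioate bond) that can be ignored for sequence comparison and that the uppercase letter in each donor marks the substituted base; both readings are assumptions, not statements from the text.

```python
GRNA_TARGET = "GGATTCAGAAGATATACTCA"

# Donor sequences copied from the text, including the "(s)" marks.
BP2_DONOR = "g(s)g(s)a(s)actctgtaccacacaagtagcccattattagatgtttctcctcctattaagttgaaacccacctccctgtaaggcatatattatttgaccctgagtatatcttctgaatccGcaaaggatacctt(s)c(s)a(s)"
LEPRA_DONOR = "g(s)g(s)a(s)actctgtaccacacaagtagcccattattagatgtttctcctcctattaagttgaaacccacctccctgtaaggcatatattatttgaccctgagtatatcttctgaatTcacaaaggatacctt(s)c(s)a(s)"

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def strip_marks(oligo):
    """Drop the "(s)" marks and normalize to uppercase bases."""
    return oligo.replace("(s)", "").upper()


def reverse_complement(seq):
    return seq.upper().translate(COMPLEMENT)[::-1]


bp2 = strip_marks(BP2_DONOR)
lepra = strip_marks(LEPRA_DONOR)

# Positions (0-based) where the two donors differ, with the base in each.
mismatches = [(i, a, b) for i, (a, b) in enumerate(zip(bp2, lepra)) if a != b]

# The gRNA target is searched on the strand opposite to the donor sequence.
target_rc = reverse_complement(GRNA_TARGET)
site_in_bp2 = bp2.find(target_rc)

print("donor lengths:", len(bp2), len(lepra))
print("differences (index, BP2 base, LEPRA base):", mismatches)
print("target reverse complement found in BP2 donor at index:", site_in_bp2)
```

In this check the two donors have equal length and differ at two positions two nucleotides apart, one carrying each intended substitution, and the reverse complement of the target sequence occurs within the BP2 donor, which suggests that the gRNA is written on the strand opposite to the donors.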

mRNA stability assays. The influence of NMD on the stability of WT α-spectrin and αLEPRA mRNA was examined in WT K562 cells and in K562 cells rendered homozygous for the αLEPRA allele. Cells were incubated at 37°C with the NMD inhibitor emetine (100 μg/ml, emetine dihydrochloride hydrate, MilliporeSigma, E2375) for 8 hours or with the NMD inhibitor cycloheximide (100 μg/ml, MilliporeSigma, C1988) for 4 hours, or with a combination of both. The amounts of total α-spectrin mRNA transcripts and αLEPRA mRNA transcripts were determined by real-time RT-PCR with fluorescent probes as described above. Analyses included 3 biologic and 3 technical replicates.

RNA-seq data. RNA-seq data sets accessed from the Gene Expression Omnibus database were GSE61566 and GSE53983 for human erythroid cells and GSM958729 for K562 cells.

Statistics. GraphPad Prism 8 software (GraphPad Software) was used for statistical analyses. All data were reported as mean ± SD in tables or as error bars. Comparisons between 2 groups were performed using the 2-tailed Student’s t test. Multiple test correction was performed using the Benjamini and Hochberg method (51). Significance was set at P less than 0.05.
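
The multiple test correction step can be illustrated with a minimal pure-Python sketch of the Benjamini and Hochberg procedure. The analyses here were run in GraphPad Prism, so this is not the implementation used; it only shows the adjustment logic, and the raw P values are hypothetical.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so that each adjusted value is the
    # minimum of p * m / rank over its own rank and all larger ranks.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


ALPHA = 0.05  # significance threshold stated in the text

raw_p = [0.01, 0.04, 0.03, 0.005]  # hypothetical P values from 4 comparisons
adj_p = benjamini_hochberg(raw_p)
significant = [p < ALPHA for p in adj_p]

for raw, adj, sig in zip(raw_p, adj_p, significant):
    print(f"raw P = {raw:.3f}, adjusted P = {adj:.3f}, significant: {sig}")
```

Each adjusted value is compared with the 0.05 threshold in place of the raw P value, which controls the expected proportion of false discoveries across the set of comparisons.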

Human subjects. All human studies were approved by the Yale University Human Investigation Committee review board. Written informed consent was received from participants or their parents, as appropriate, prior to inclusion in the study.