Disease-specific T cell receptors maintain pathogenic T helper cell responses in postinfectious Lyme arthritis

BACKGROUND Antibiotic-Refractory Lyme Arthritis (ARLA) involves a complex interplay of T cell responses targeting Borrelia burgdorferi antigens progressing toward autoantigens by epitope spreading. However, the precise molecular mechanisms driving the pathogenic T cell response in ARLA remain unclear. Our aim was to elucidate the molecular program of disease-specific Th cells. METHODS Using flow cytometry, high-throughput T cell receptor (TCR) sequencing, and scRNA-Seq of CD4+ Th cells isolated from the joints of patients with ARLA living in Europe, we aimed to infer antigen specificity through unbiased analysis of TCR repertoire patterns, to identify surrogate markers for disease-specific TCRs, and to connect TCR specificity to transcriptional patterns. RESULTS PD-1hiHLA-DR+CD4+ effector T cells were clonally expanded within the inflamed joints and persisted throughout the disease course. Among these cells, we identified a distinct TCR-β motif restricted to HLA-DRB1*11 or *13 alleles. These alleles, being underrepresented in patients with ARLA living in North America, were unexpectedly prevalent in our European cohort. The identified TCR-β motif served as a surrogate marker for a convergent TCR response specific to ARLA, distinguishing it from other rheumatic diseases. In the scRNA-Seq data set, the TCR-β motif particularly mapped to peripheral T helper (TPH) cells displaying signs of sustained proliferation, continuous TCR signaling, and expressing CXCL13 and IFN-γ. CONCLUSION By inferring disease-specific TCRs from synovial T cells, we identified a convergent TCR response in the joints of patients with ARLA that continuously fueled the expansion of TPH cells expressing a pathogenic cytokine effector program. The identified TCRs will aid in uncovering the major antigen targets of the maladaptive immune response.
FUNDING Supported by the German Research Foundation (DFG) MO 2160/4-1; the Federal Ministry of Education and Research (BMBF; Advanced Clinician Scientist-Program INTERACT; 01EO2108) embedded in the Interdisciplinary Center for Clinical Research (IZKF) of the University Hospital Würzburg; the German Center for Infection Research (DZIF; Clinical Leave Program; TI07.001_007) and the Interdisciplinary Center for Clinical Research (IZKF) Würzburg (Clinician Scientist Program, Z-2/CSP-30).


MACSima™ Imaging Cycling Staining
Tissue sections were obtained from paraformaldehyde-fixed, paraffin-embedded synovial tissue.
Acquired images were processed and analyzed with MACSiQ® View Imaging Software (Miltenyi Biotec) following the current processing workflow (Miltenyi Biotec). Processed data were segmented to identify individual cell nuclei using DAPI, cytoplasm using CD20 and Ki-67, and cellular membranes using CD3, CD4, CD68, CD138, and HLA-DR.

TCRβ repertoire sequencing
TCRβ rearrangements were amplified using amplicon rescue multiplex PCR and sequenced by next-generation sequencing on an Illumina MiSeq platform (iRepertoire®, Huntsville, AL, USA). After initial filtering and mapping using iRepertoire® algorithms, multiple copies of a unique sequence were counted as a single sequence unless otherwise stated.
During the study, an in-house pipeline for TCRβ sequencing was also established: cDNA was generated using the Applied Biosystems™ High-Capacity cDNA Reverse Transcription Kit per the manufacturer's protocol. For semi-nested multiplex PCR, primers from (1) were used. In a 1st PCR (15 cycles, annealing temperature 62 °C), a common linker sequence was introduced with forward primers targeting the different TRBV segments (equimolar mix of 28 primers, final concentration of the mix 0.6 µM); the reverse primer targeted the constant region of the TCRβ chain. A 2nd PCR (50 cycles, annealing temperature 65 °C) was performed with a forward primer targeting the linker sequence and a reverse primer binding to a more 5' part of the constant region (final concentration 0.5 µM each). The resulting product was purified using the QIAquick PCR Purification Kit (Qiagen) and sent to a commercial provider for adapter ligation and amplicon sequencing (GENEWIZ, Azenta Life Sciences, Amplicon-EZ with Illumina MiSeq 2x250 bp). Paired-end reads were assembled, quality filtered (mean Phred score >20), and deduplicated using the pRESTO workflow (2). Sequences occurring fewer than twice were excluded from further analysis.
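The read-level filtering described above (mean Phred score >20, collapsing duplicates, discarding sequences seen fewer than twice) can be illustrated with a minimal plain-Python sketch. This is not the pRESTO implementation; the function names, the Phred+33 quality encoding, and the toy reads are assumptions made for illustration only.

```python
from collections import Counter

def mean_phred(quality_string, offset=33):
    """Mean Phred score of a FASTQ quality string (assumes Phred+33 encoding)."""
    return sum(ord(ch) - offset for ch in quality_string) / len(quality_string)

def filter_and_collapse(reads, min_mean_q=20, min_count=2):
    """Keep reads with mean Phred > min_mean_q, collapse identical sequences,
    and retain only sequences observed at least min_count times.

    reads: iterable of (sequence, quality_string) tuples (hypothetical input format).
    Returns a dict mapping each retained sequence to its copy number.
    """
    counts = Counter(
        seq for seq, qual in reads if qual and mean_phred(qual) > min_mean_q
    )
    return {seq: n for seq, n in counts.items() if n >= min_count}

# Toy reads, invented for illustration: 'I' encodes Q40, '5' encodes Q20.
reads = [
    ("ACGT", "IIII"),  # passes the quality filter
    ("ACGT", "IIII"),  # second copy of the same sequence
    ("ACGT", "5555"),  # mean Q20 is not > 20, so this read is dropped
    ("TTGA", "IIII"),  # passes quality but occurs only once
]
collapsed = filter_and_collapse(reads)
```

In this toy input, only the sequence seen twice among the quality-passing reads is retained.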

TCRβ repertoire analysis
The resulting sequence data were analyzed using IMGT/HighV-QUEST, and unproductive sequences were filtered out. The resulting files were further analyzed for V-, D-, and J-segment usage, CDR3 length, and clonality using ARGalaxy (3). TCR clonotypes were defined by identical V-, D-, and J-gene segment usage as well as identical CDR3 nucleotide sequence. For calculation of clonal diversity, the resulting Change-O databases were analyzed using Alakazam (4). Standard settings were used for computation of the diversity scores (bootstrap n=200, ci=0.95).
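For orientation, the clonotype definition and the generalized (Hill) diversity index can be sketched in a few lines of Python. This is a simplified illustration, not the Alakazam implementation: it omits the bootstrap resampling and confidence intervals mentioned above, and the record keys and example abundances are hypothetical.

```python
import math
from collections import Counter

def clonotype_counts(records):
    """Count clonotypes defined by identical V, D, and J genes plus CDR3 nucleotide sequence.

    records: iterable of dicts with hypothetical keys 'v', 'd', 'j', 'cdr3_nt'.
    """
    return Counter((r["v"], r["d"], r["j"], r["cdr3_nt"]) for r in records)

def hill_diversity(counts, q):
    """Hill diversity of order q for a list of clonotype abundances.

    q = 0 gives richness, q = 1 the exponential of Shannon entropy,
    q = 2 the inverse Simpson index.
    """
    total = sum(counts)
    props = [c / total for c in counts if c > 0]
    if abs(q - 1.0) < 1e-9:
        # Limit of the general formula as q approaches 1
        return math.exp(-sum(p * math.log(p) for p in props))
    return sum(p ** q for p in props) ** (1.0 / (1.0 - q))

# Invented abundances: an even repertoire versus one dominated by a single clone
even = [10, 10, 10, 10]
skewed = [97, 1, 1, 1]
profile_even = [hill_diversity(even, q) for q in (0, 1, 2)]
profile_skewed = [hill_diversity(skewed, q) for q in (0, 1, 2)]
```

For the even repertoire the index stays at the number of clonotypes across all orders, whereas for the skewed repertoire it drops toward 1 as q increases, which is the pattern expected for an oligoclonally expanded population.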
Immunarch was used for visualization of clonotype tracking and repertoire overlap (5). AIRR output tables from IMGT/HighV-QUEST (6) were adapted for use in Grouping of Lymphocyte Interactions by Paratope Hotspots version 2 (GLIPH2) (7). The standard CD4 reference version 2.0 was used. Output data from GLIPH2 specificity groups containing sequences from at least two patients were visualized as networks using Cytoscape v3.9.1. Sequence logos were generated using the 'ggseqlogo' R package. Additionally, the frequencies of aa doublets at specific CDR positions were computed: 'GH' at the 2nd and 3rd positions in CDR1β (IMGT positions 28 and 29) and 'SL' or 'SV' at the 8th and 7th positions from the end of the CDR3β junction (including Cys-104 and Trp/Phe-118; IMGT positions 111 and 112 for a junction length of 15 amino acids (aa) / CDR3β length of 13 aa). Sequences with a junction length of less than 13 aa or a CDR3β length of less than 11 aa were excluded from the analysis.
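The positional doublet search can be sketched as follows. This is a minimal Python illustration under stated assumptions: junctions are given as amino acid strings including Cys-104 and Trp/Phe-118, the function names are hypothetical, and the example sequences are invented rather than taken from the study data.

```python
def has_cdr3_doublet(junction, doublets=("SL", "SV"), min_len=13):
    """Check for a doublet at the 8th and 7th positions from the end of the junction.

    Returns None for junctions shorter than min_len (excluded from the analysis).
    """
    if len(junction) < min_len:
        return None
    return junction[-8:-6] in doublets

def has_cdr1_doublet(cdr1, doublet="GH"):
    """Check for a doublet at the 2nd and 3rd positions of CDR1."""
    return cdr1[1:3] == doublet

def doublet_frequency(junctions):
    """Fraction of evaluable junctions that carry the CDR3 doublet."""
    flags = [has_cdr3_doublet(j) for j in junctions]
    evaluated = [f for f in flags if f is not None]
    return sum(evaluated) / len(evaluated) if evaluated else 0.0

# Invented example junctions
junctions = [
    "CASSPGQSLSYEQYF",  # 15 aa, 'SL' at positions 8 and 9 (IMGT 111 and 112)
    "CASSLGGNTEAFF",    # 13 aa, no doublet at the tested positions
    "CASSLSYEQYF",      # 11 aa, too short and therefore excluded
]
freq = doublet_frequency(junctions)
cdr1_hit = has_cdr1_doublet("SGHTA")  # hypothetical CDR1 sequence
```

Counting from the end of the junction keeps the tested positions anchored to the conserved Trp/Phe-118, so the same slice applies to junctions of different lengths.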

10x Genomics Chromium single-cell RNA-seq
Subsequent analysis of scRNA-seq output files was carried out using Seurat v4 for gene expression analysis (8) and scRepertoire version 1.7.2 for integrating VDJ sequencing data (9). Multiplexed samples were demultiplexed using cellsnp-lite and vireo and assigned to donors by combining these data with Hashtag Oligo UMIs in each donor (10). For quality control, upper thresholds were calculated as median + 3*(median absolute deviation) for RNA counts, number of features, and the percentage of mitochondrial reads per cell. Droplets with values above these thresholds, or with fewer than 2500 RNA counts or 1000 features, were excluded from further analysis.
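The quality-control rule can be written out as a short Python sketch. It assumes the unscaled median absolute deviation (R's `mad()` applies a scaling constant by default, and the text does not state which variant was used), and the per-cell metrics below are invented for illustration.

```python
import statistics

def upper_threshold(values, n_mads=3):
    """Upper cutoff: median + n_mads * median absolute deviation (unscaled MAD assumed)."""
    med = statistics.median(values)
    mad = statistics.median([abs(v - med) for v in values])
    return med + n_mads * mad

def qc_filter(cells, min_counts=2500, min_features=1000):
    """Return the cells passing all upper (median + 3*MAD) and lower cutoffs.

    cells: list of dicts with hypothetical keys 'counts', 'features', 'pct_mito'.
    """
    max_counts = upper_threshold([c["counts"] for c in cells])
    max_features = upper_threshold([c["features"] for c in cells])
    max_mito = upper_threshold([c["pct_mito"] for c in cells])
    return [
        c for c in cells
        if min_counts <= c["counts"] <= max_counts
        and min_features <= c["features"] <= max_features
        and c["pct_mito"] <= max_mito
    ]

# Invented per-droplet metrics
cells = [
    {"counts": 3000, "features": 1200, "pct_mito": 2},
    {"counts": 4000, "features": 1500, "pct_mito": 3},
    {"counts": 5000, "features": 1800, "pct_mito": 4},
    {"counts": 6000, "features": 2000, "pct_mito": 5},
    {"counts": 50000, "features": 6000, "pct_mito": 6},  # likely multiplet: above upper cutoffs
    {"counts": 2000, "features": 900, "pct_mito": 3},    # low-content droplet: below lower cutoffs
]
kept = qc_filter(cells)
```

Because the cutoffs are derived from the median and MAD rather than the mean and standard deviation, a few extreme droplets do not shift the thresholds much.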
To focus the analysis on CD4+ T cells, each sample was normalized and clustered with the standard settings from Seurat (log-normalization, 2000 most variable features), and clusters expressing genes specific for B cells ("CD79A", "CD79B", "CD19", "MS4A1"), monocytes ("LYZ"), and γδ T cells ("TRGC1", "TRGC2", "TRDC") were excluded from further analysis. TCR gene segment-related features were removed from the gene expression matrices to avoid an impact on clustering. The three samples were subsequently integrated following the data integration vignette of Seurat after normalization with the SCTransform function. Cell clusters were defined based on gene expression using FindClusters with a resolution of 0.2.
Differentially expressed genes were identified with the FindAllMarkers function of Seurat. Gene set enrichment analysis (GSEA) was done using the 'fgsea' R package with Reactome pathways. To create expression scores, genes from Reactome pathways were used as input for the AddModuleScore function of Seurat.
TCR data were integrated from the Cell Ranger output files using scRepertoire. Sequences from cells with more than one α and β chain each were excluded. Clonotypes were defined as having the same VDJC genes comprising the TCR and the same nucleotide sequence of the CDR3 region. Clonal networks and occupied repertoire space were analyzed using the built-in functions of scRepertoire. Clones (here defined by the same VJ-segment usage and the same CDR3 aa sequence) were considered convergent when they consisted of at least two unique CDR3 nucleotide sequences, as proposed in (11). For velocity analyses, loom files with information on spliced and unspliced reads were computed from Cell Ranger output using velocyto and then further processed using scVelo in dynamical mode (12,13).
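The convergence criterion can be made concrete with a small Python sketch: cells are grouped by V gene, J gene, and CDR3 aa sequence, and a clone counts as convergent if at least two distinct CDR3 nucleotide sequences encode it. The record keys and the toy sequences are hypothetical and serve only to illustrate the rule.

```python
from collections import defaultdict

def convergent_clones(cells, min_unique_nt=2):
    """Return clones (V gene, J gene, CDR3 aa) encoded by at least min_unique_nt
    distinct CDR3 nucleotide sequences.

    cells: iterable of dicts with hypothetical keys 'v_gene', 'j_gene', 'cdr3_aa', 'cdr3_nt'.
    """
    nt_by_clone = defaultdict(set)
    for cell in cells:
        key = (cell["v_gene"], cell["j_gene"], cell["cdr3_aa"])
        nt_by_clone[key].add(cell["cdr3_nt"])
    return {key for key, nts in nt_by_clone.items() if len(nts) >= min_unique_nt}

# Toy records: shortened sequences, invented for illustration
cells = [
    # Same aa sequence 'CAS' encoded by two different nucleotide sequences
    {"v_gene": "TRBV7-2", "j_gene": "TRBJ2-7", "cdr3_aa": "CAS", "cdr3_nt": "TGTGCCAGC"},
    {"v_gene": "TRBV7-2", "j_gene": "TRBJ2-7", "cdr3_aa": "CAS", "cdr3_nt": "TGCGCCAGT"},
    # Expanded clone with a single nucleotide sequence: not convergent
    {"v_gene": "TRBV7-2", "j_gene": "TRBJ2-7", "cdr3_aa": "CAT", "cdr3_nt": "TGTGCCACC"},
    {"v_gene": "TRBV7-2", "j_gene": "TRBJ2-7", "cdr3_aa": "CAT", "cdr3_nt": "TGTGCCACC"},
]
convergent = convergent_clones(cells)
```

The second clone shows why expansion alone is not sufficient: many cells sharing one nucleotide sequence may derive from a single rearrangement, whereas distinct nucleotide sequences indicate independent recombination events.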

Adaptation of published TCRβ data
Publicly available TCRβ repertoire sequencing data sets were re-analyzed for this study. The resulting AIRR tables, filtered for productive TCRβ sequences, were used for further analysis.
Sakuragi et al. (16): TCRβ bulk sequencing data from FACS-sorted SF CD4+PD-1hiCXCR5- T cells of three RA patients were downloaded as raw reads from the DDBJ archive (DRA011207). Raw read assembly and quality control were done using the pRESTO pipeline with standard settings (2). Sequences with at least two occurrences were used as input for IMGT/HighV-QUEST and further analyzed as described above.
Fischer, Dirks et al. (17): TCRβ bulk sequencing data from FACS-sorted SF CD4+PD-1hiCXCR5- T cells of four JIA patients, processed in our laboratory in a previous study (GenBank KEXF01000000), were further analyzed as described above.

TCRs with known antigen specificities/VDJdb database:
To elucidate CDR3β motifs implicated in the binding of viral antigens, a workflow similar to that described in (18) was employed. Specifically, a total of 55,471 TCR sequences with documented specificities for EBV, CMV, or Influenza A were retrieved from the VDJdb database as of January 26, 2023 (19) and used as input data for GLIPH2 analysis. Patterns were selected if they had unique CDR3 counts greater than 4, Fisher scores below 0.05, and vb and length scores below 0.05; these patterns are hereafter referred to as the 'viral pattern'. Subsequently, cells in the ARLA scRNA-seq dataset of this study carrying any of these viral patterns were categorized into the 'viral motifs' group.
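The pattern selection criteria can be expressed as a simple filter over the GLIPH2 result table. The sketch below is in Python and assumes the table has already been loaded as a list of dictionaries; the key names and the example rows are hypothetical and may not match the actual GLIPH2 output columns.

```python
def select_patterns(rows, min_unique_cdr3=4, max_score=0.05):
    """Select patterns with more than min_unique_cdr3 unique CDR3s and with
    Fisher, vb, and length scores all below max_score.

    rows: list of dicts with hypothetical keys 'pattern', 'number_unique_cdr3',
    'fisher_score', 'vb_score', 'length_score'.
    """
    return [
        row["pattern"]
        for row in rows
        if row["number_unique_cdr3"] > min_unique_cdr3
        and row["fisher_score"] < max_score
        and row["vb_score"] < max_score
        and row["length_score"] < max_score
    ]

# Invented rows for illustration
rows = [
    {"pattern": "motif_A", "number_unique_cdr3": 12, "fisher_score": 1e-6,
     "vb_score": 0.01, "length_score": 0.02},   # passes all criteria
    {"pattern": "motif_B", "number_unique_cdr3": 4, "fisher_score": 1e-4,
     "vb_score": 0.01, "length_score": 0.01},   # fails: not more than 4 unique CDR3s
    {"pattern": "motif_C", "number_unique_cdr3": 9, "fisher_score": 0.2,
     "vb_score": 0.01, "length_score": 0.01},   # fails: Fisher score too high
    {"pattern": "motif_D", "number_unique_cdr3": 7, "fisher_score": 0.001,
     "vb_score": 0.3, "length_score": 0.01},    # fails: vb score too high
]
viral_patterns = select_patterns(rows)
```

All thresholds are strict inequalities, following the wording "greater than 4" and "below 0.05".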
To investigate the presence of ARLA-associated TCRβ motifs among TCRs with known specificities, a search was conducted within the VDJdb database (19). This search targeted MHCII-restricted TCRβ sequences and considered only organisms with more than ten TCRs directed against their antigens.

Supplemental Table 2 - Distribution of HLA-DRB1 allele frequencies
The distribution of selected HLA-DRB1 alleles is shown for the cohort of patients with antibiotic-refractory Lyme arthritis (ARLA) analyzed in this study (Germany, Refractory) alongside data from a previous study of Caucasian patients from North America with antibiotic-responsive and antibiotic-refractory Lyme arthritis (20). Data from German control individuals were retrieved from the population 'Germany DKMS - German donors' in the 'Allele frequency net database' (21). Data from North American control individuals were retrieved from European-American bone marrow donors (20,22). The upper part of the table reports the known 'risk alleles' identified in the North American cohort, while the lower part summarizes the known 'protective alleles'. Cumulative frequencies for each allele group were compared between the two antibiotic-refractory patient cohorts using Fisher's exact test (¹p=0.18; ²p<0.01).

Supplemental Figure 2 - PD-1hiHLA-DR+CD4+ T cells are oligoclonally expanded effector cells
(A) Dot plots presenting CCR7 and CD45RO expression on SF PD-1hiHLA-DR+ (red) and PD-1loHLA-DR- (blue) CD4+ T cells. (B) Clonal diversity analysis of the TCR Vβ repertoire within both T cell populations from 2 ARLA patients. The generalized diversity index (Hill) was computed across a range of diversity orders (q) using uniform resampling to correct for sequencing depth. The diversity index (qD) is depicted as a smooth curve. (C) Dot plot and histogram demonstrating Ki-67 expression in PD-1hiHLA-DR+ and PD-1loHLA-DR- SF CD4+ T cells. (D) Compiled data from four ARLA patients, indicating Ki-67+ cell frequencies within the specified populations. Bars represent mean frequency ± standard deviation. Paired two-tailed Student's t-test, **: p<0.01.

Supplemental Figure 4 - The TCR CDR3β motif enriched in synovial fluid PD-1hiHLA-DR+CD4+ T cells in ARLA patients is non-germline encoded
(A) Tracking of occupied repertoire space using CDR3β amino acid (aa) sequences derived from the 'specificity cluster' (Figure 2B) at various time points in three patients with available follow-up samples. Each color corresponds to a unique CDR3β aa sequence. (B) Sequence plots of the CDR3β joining region depict nucleotide (nt) sequences (each in the upper row) and their corresponding aa sequences (each in the lower row) for motifs containing 'SLSY' in the CDR3β, organized separately for each patient. (C) The absolute counts of unique nt or aa sequences resulting in CDR3βs with specified aa motifs (motifs selected from the 'specificity cluster', filtered for an overall count of unique CDR3β (aa) ≥ 20). Each symbol represents one patient.

Supplemental Table 3 - Experimental analyses conducted in ARLA patients
Supplemental Tables 4, 5 and 6 are provided in a separate Excel file.