Identical gluten-specific clonotypes are found in peripheral blood and gut mucosa. We sorted gluten-specific CD4+ T cells binding to a pool of 4 HLA-DQ:gluten tetramers presenting the most immunodominant HLA-DQ2.5–restricted gluten epitopes from matched blood and gut biopsy samples from 3 untreated CD (UCD) patients (Supplemental Figure 1; supplemental material available online with this article; https://doi.org/10.1172/JCI98819DS1). While such tetramer-binding cells amount to around 2% of CD4+ T cells in intestinal lamina propria of untreated patients, these cells are rare in blood, ranging from 3 to 70 cells per million CD4+ T cells (Figure 1A and Supplemental Table 1). Identical TCR-β clonotypes defined by a unique nucleotide sequence were found in both sampled compartments (Supplemental Figure 2A). Because of sampling limitations, the maximum observed clonotype overlap between 2 independent sequencing experiments of the same sample was around 50% (95% CI, 42%–59%) (Supplemental Figure 2B). Based on the high degree of clonotype sharing and the fact that the HLA-DQ:gluten tetramer–binding effector-memory T cells in blood were gut homing (Supplemental Figure 1), we conclude that the more easily accessible gluten-specific T cells in blood reflect the repertoire of the gluten-specific T cells in gut.
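The effect of sampling limitations on the overlap between 2 replicate experiments can be illustrated with a small simulation. This is a sketch under stated assumptions, not the analysis used in the study: the repertoire shape, the number of cells drawn, and the definition of overlap (shared clonotypes divided by the clonotypes in the smaller draw) are all hypothetical.

```python
import random


def replicate_overlap(clone_sizes, n_cells, seed=0):
    """Overlap between two independent draws of `n_cells` from one repertoire.

    Assumption: overlap is the number of shared clonotypes divided by the
    number of clonotypes in the smaller draw; the source does not define it.
    """
    rng = random.Random(seed)
    clonotype_ids = list(range(len(clone_sizes)))
    # Each cell is drawn with probability proportional to its clone size.
    first = set(rng.choices(clonotype_ids, weights=clone_sizes, k=n_cells))
    second = set(rng.choices(clonotype_ids, weights=clone_sizes, k=n_cells))
    return len(first & second) / min(len(first), len(second))


# Hypothetical repertoire: 5 expanded clones plus 500 rare clonotypes.
toy_repertoire = [50] * 5 + [1] * 500
overlap = replicate_overlap(toy_repertoire, n_cells=100)
```

Even though both draws come from the same repertoire, rare clonotypes are seldom captured twice, so the overlap stays well below 100%. Expanded clones, by contrast, tend to appear in both draws.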

Figure 1 The number of circulating gluten-specific T cells decreases after commencement of GFD, and the T cell repertoires overlap in samples taken weeks or years apart. (A) Frequency of gut-homing effector-memory CD4+ T cells binding to a pool of HLA-DQ:gluten tetramers in blood and gut samples taken from 6 patients during the first weeks and 1 to 2 years after commencement of GFD. (B) Distribution of TCR-αβ clonotypes obtained by single-cell TCR sequencing of gluten-specific T cells from 2 patients with the most available TCR data. Data from the remaining 4 patients are shown in Supplemental Figure 3A. Clonotypes observed in at least 2 cells are plotted as stacked boxes in the percentage of the total number of cells. The clonal size of the most dominant clonotype is displayed as a number. The total numbers of clonotypes and cells in each sample are shown below each stacked bar. (C) Area-proportional Venn diagrams of TCR-αβ clonotypes obtained by single-cell sequencing at various time points after commencement of GFD. The patients indicated in the top panels were followed for 10 weeks up to 1 year, whereas the patients indicated in the lower panels were followed for 1 to 2 years after commencement of GFD. The dark red areas represent clonotypes that were observed both at the latest time points and when the patients were untreated. The percentages denote the proportion of these shared clonotypes (dark red areas) at the latest time point (black border). The remaining clonotype overlaps are marked in light red. Asterisks indicate that data were obtained from blood sample only.

Frequency of gluten-specific CD4+ T cells decreases upon GFD. We analyzed gluten-specific T cells in gut biopsies and in peripheral blood of 6 UCD patients who were followed up until 2 years after commencement of GFD. Upon commencement of GFD, the frequency of gluten-specific T cells in blood decreased in all subjects, but at a variable rate. Most subjects had a clear decline by 1 year, except 2 subjects (CD1283 and CD1268) who showed a decrease in the frequency of gluten-specific CD4+ T cells only at additional follow-up after 2 years of GFD. From all 6 patients, we sorted circulating and gut tissue–resident gluten-specific CD4+ T cells as single cells and performed paired TCR-αβ sequencing. We observed expansion of multiple clones in all samples. The extent of clonal dominance, calculated by the sample-corrected Shannon diversity index, was highest in UCD patients and decreased upon GFD (Figure 1B and Supplemental Figure 3, A and B). Thus, clonal contraction contributes to the observed decrease in the frequency of circulating gluten-specific CD4+ T cells upon GFD.
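The diversity comparison can be illustrated with a minimal sketch. The text does not state how the Shannon index was corrected for sample size, so the version below assumes one common approach: repeatedly subsampling each sample to the same number of cells and averaging the index. The function names, the subsampling depth, and the clone sizes are hypothetical.

```python
import math
import random
from collections import Counter


def shannon_index(counts):
    """Shannon diversity H = -sum(p_i * ln p_i) over clonotype frequencies."""
    total = sum(counts)
    return -sum((c / total) * math.log(c / total) for c in counts if c > 0)


def subsampled_shannon(counts, depth, n_iter=200, seed=0):
    """Mean Shannon index after drawing `depth` cells without replacement.

    Assumption: subsampling to a common depth stands in for the
    "sample-corrected" index; the source does not state the exact correction.
    """
    rng = random.Random(seed)
    # Expand clone sizes into one clonotype label per cell.
    cells = [i for i, c in enumerate(counts) for _ in range(c)]
    if depth > len(cells):
        raise ValueError("depth exceeds the number of cells in the sample")
    values = []
    for _ in range(n_iter):
        draw = Counter(rng.sample(cells, depth))
        values.append(shannon_index(list(draw.values())))
    return sum(values) / len(values)


# Hypothetical clone sizes (cells per clonotype), not data from the study.
dominated = [40, 10, 5, 2, 1, 1, 1]  # a few large clones
even = [9, 9, 9, 8, 8, 9, 8]         # similar-sized clones

h_dominated = subsampled_shannon(dominated, depth=30)
h_even = subsampled_shannon(even, depth=30)
```

A lower index indicates that a few clonotypes account for most of the cells, which is how clonal dominance is read from this measure.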

The same clonotypes are found in multiple samples taken weeks to years apart. Next, we studied whether cells of the same clonotype, defined as cells expressing identical paired TCR-αβ nucleotide sequences, were present in samples taken at different time points from the same individual. In all 6 patients, many clonotypes recurred in multiple samples (Figure 1C). The proportion of clonotypes found after commencement of GFD that were also found in the first samples, taken when the patients were untreated, varied somewhat, likely due to limited sampling. More importantly, there was no trend of decreasing overlap over time. Since the patients were on GFD after the initial sampling point, new gluten-specific clonotypes should not have been recruited from the naive to the memory repertoire. Thus, after commencement of GFD, the clonally expanded gluten-specific T cells contract and remain as memory T cells.
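The clonotype definition used here, identity of both chains at the nucleotide level, maps onto a simple set comparison. The sketch below is illustrative only: the `Clonotype` type, the `shared_fraction` helper, and the short placeholder sequences are hypothetical and do not come from the study.

```python
from typing import Iterable, NamedTuple


class Clonotype(NamedTuple):
    """A clonotype keyed by its paired TCR-alpha and TCR-beta nucleotide sequences."""
    alpha_nt: str
    beta_nt: str


def shared_fraction(earlier: Iterable[Clonotype], later: Iterable[Clonotype]) -> float:
    """Fraction of distinct clonotypes in `later` that also occur in `earlier`."""
    earlier_set = set(earlier)
    later_set = set(later)
    if not later_set:
        raise ValueError("later sample contains no clonotypes")
    return len(later_set & earlier_set) / len(later_set)


# Placeholder sequences for illustration; real entries would be the full
# nucleotide sequences of the rearranged chains.
untreated = [
    Clonotype("ACGT", "TTGA"),
    Clonotype("GGCA", "CCAT"),
    Clonotype("ATAT", "GCGC"),
]
on_gfd = [
    Clonotype("ACGT", "TTGA"),  # both chains identical: shared
    Clonotype("ACGT", "TTGC"),  # same alpha, different beta: not shared
    Clonotype("ATAT", "GCGC"),  # both chains identical: shared
    Clonotype("CCCC", "GGGG"),  # not seen before
]

fraction = shared_fraction(untreated, on_gfd)  # 2 of 4 clonotypes are shared
```

Keying on the pair rather than on a single chain means that 2 cells sharing only one chain are counted as different clonotypes.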

Gluten-specific memory T cells expand and dominate on oral gluten challenge. To study the impact of gluten antigen reintroduction on the gluten-specific T cell repertoire, we challenged treated CD patients with dietary gluten for 14 days. In 7 participants who showed a significant increase in the number of HLA-DQ:gluten tetramer–binding T cells after gluten challenge, we performed paired single-cell TCR-αβ sequencing (Figure 2A). Consistent with earlier findings (20), we found that the total number of circulating gluten-specific T cells reached a peak level on day 6 (Figure 2A) and that the repertoires were composed of clonally expanded cells from a diverse set of clonotypes (Figure 2B). The degree of clonal expansion increased, as demonstrated by a lower sample-corrected Shannon diversity index, in the circulating gluten-specific T cells on day 6 (Supplemental Figure 3, C and D).

Figure 2 Preexisting T cell clonotypes expand and dominate during and after gluten-challenge response. (A) Frequency of CD4+ T cells binding to a pool of HLA-DQ:gluten tetramers in blood and duodenal biopsy samples from 7 patients during gluten challenge. Tem, effector-memory T cells. (B) Distribution of TCR-αβ clonotypes obtained by single-cell TCR sequencing of tetramer-binding T cells from the 2 patients who showed the most response. Data from the remaining 5 patients are shown in Supplemental Figure 3C and Supplemental Figure 4. The x axes denote the sampling time points baseline before challenge (B) and day 6 (D6), day 14, and day 28 after the initiation of gluten challenge. The y axes show the percentage share of each clonotype represented as stacked boxes. Only clonotypes observed in at least 2 cells are plotted, and the most dominant clonotypes are displayed as numbers within the boxes. The colored boxes represent the 3 most dominant clonotypes at day 6 that were also observed at other time points. The isolated and nonstacked colored boxes represent shared clonotypes with clonal size 1. The total numbers of clonotypes and cells in each sample are shown below each stacked bar. Reoccurrence of identical TCR clonotypes in different samples from patients CD1300 and CD442 is depicted in area-proportional Venn diagrams (C and D). (C) TCR-αβ clonotype data obtained by single-cell sequencing. (D) TCR-β clonotype data compiled from both single-cell and bulk sequencing. The dark red areas represent clonotypes that were observed both at baseline and at the latest time point. The percentages denote the proportion of these shared clonotypes (dark red areas) at the latest time points (black border). The light red areas represent all other clonotype overlaps. Asterisks show only single-cell data for day 28.

A major question arising from this challenge study is whether the gluten-specific T cell response induced by reexposure to gluten consists of reactivation of preexisting memory T cells or involves recruitment of naive cells. When we compared clonotypes sampled on day 6 with the baseline memory repertoire, we found a considerable overlap (Figure 2C and Supplemental Figure 4A). These data suggest that the gluten-specific T cell repertoire on day 6 is made up of clonal expansions of preexisting memory T cells.

Unchanged dominance of memory clonotypes 28 days after reintroduction of gluten. We next compared paired nucleotide TCR-αβ clonotype data from blood and biopsy samples taken on day 14, or an additional day-28 blood sample after gluten challenge, with clonotype data at baseline. From the single-cell data of all 7 patients, we found that 12%–44% of TCR-αβ clonotypes detected at the latest time point were also found in the memory T cell repertoire at baseline prior to challenge (Figure 2C and Supplemental Figure 4A). To maximize the sample sizes, we performed, in addition, bulk sequencing of samples from 2 patients who had many gluten-specific T cells. With more clonotypes being detected by bulk sequencing, we found that 52%–55% of TCR-β clonotypes detected at the latest time point were present in the baseline samples (Figure 2D). Note that the proportion of clonotypes in samples taken at day 6, day 14, and day 28 that had already been observed at baseline remained remarkably stable (48%–58%), with no indication of declining dominance of memory clonotypes over time (Supplemental Figure 4B). The data suggest that reintroduction of gluten causes a transient clonal expansion of the existing gluten-specific memory T cells. The overlap observed was largely within the range of maximum expected clonotype overlap between 2 independent sequencing experiments (Supplemental Figure 2B), indicating little change of the overall gluten-specific T cell repertoire upon gluten challenge.

Similar fraction of clonotypes is observed 6 months and 27 years apart. Patients in the challenge study were followed for only up to 28 days. It is possible that the gluten-specific T cell repertoire changes slowly or only after repeated gluten antigen exposure. To compare TCR repertoires many years apart, we invited 5 patients, from whom we had historic T cell material from decades ago, to donate new blood and biopsy samples. By single-cell sequencing, we observed paired TCR-αβ clonotype sharing on the nucleotide level, including identical nucleotide sequences of secondary productive TCR-α chains (Supplemental Figure 5), between historic and recent samples, but to a variable degree (Figure 3A). For patients CD373 and CD412, the sharing was low (2%–4%) due to the small number of clonotypes we could retrieve from the few cryopreserved blood cells from the 1990s. By bulk sequencing a T cell line (TCL) established from a single biopsy specimen of CD412 in the 1990s, we obtained a higher number of unique TCR-β clonotypes and found an overlap of 18% (Figure 3B). For CD114, who was diagnosed in his early childhood, we had 2 historic samples from the 1980s that were taken 19.5 and 20 years after the diagnosis and commencement of the GFD. These 2 samples taken 6 months apart had 51 clonotypes in common, which made up 71% of the smaller 19.5-year GFD sample (total of 72 clonotypes), but only 19% of the much larger (n = 264) 20-year GFD sample (Figure 3B). Interestingly, we found a degree of TCR-β clonotype overlap in the recent samples taken 47 years after the diagnosis that was similar to that of the previous samples taken more than 2 decades ago (22%–53%). Identical clonotypes, especially those with the largest clonal sizes, were also observed in samples taken 16 to 20 years apart in the remaining 2 patients (CD364 and CD436, Supplemental Figure 6). Taking the limited sampling from a diverse repertoire into account, we conclude that the gluten-specific T cell repertoire in CD patients remains remarkably stable over several decades.
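The dependence of the reported percentages on the choice of denominator can be checked directly from the counts given for CD114. The sketch below uses only numbers stated in the text; the function and variable names are illustrative.

```python
def overlap_percentages(n_shared, n_first, n_second):
    """Shared clonotypes as a rounded percentage of each of the two samples."""
    if n_shared > min(n_first, n_second):
        raise ValueError("shared count cannot exceed either sample size")
    return (
        round(100 * n_shared / n_first),
        round(100 * n_shared / n_second),
    )


# CD114: 51 shared clonotypes between the 19.5-year GFD sample (72 clonotypes)
# and the 20-year GFD sample (264 clonotypes).
pct_of_smaller, pct_of_larger = overlap_percentages(51, 72, 264)
print(pct_of_smaller, pct_of_larger)  # 71 19
```

The same 51 shared clonotypes give very different percentages depending on which sample is used as the reference. Unequal sampling depth therefore has to be considered when overlaps are compared across time points.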

Figure 3 T cell clonotypes persist in gut tissue and blood for decades. Gluten-specific TCR clonotypes observed at various time points in years after commencement of GFD from patients CD412, CD114, and CD373 are depicted in area-proportional Venn diagrams. (A and B) Single-cell data (TCR-αβ) and combined single-cell and bulk sequencing data (TCR-β), respectively. The dark red areas represent clonotypes that were observed both at the latest time point and when the patient was untreated (CD412) or in the earliest samples we had access to (19.5-year or 20-year GFD for CD114; 2-year GFD for CD373). The percentage (black font) denotes the proportion of shared clonotypes (dark red areas) at the latest time point (black border). For CD114, the proportion of shared clonotypes at 19.5-year GFD and 20-year GFD is also shown (blue font). Asterisks show data obtained from blood sample only. Double asterisks show data obtained from a TCL generated from a single biopsy.

Public TCR sequences observed in 10% of gluten-specific T cells. We collected a total of 1,813 unique paired amino acid TCR-αβ sequences from 17 HLA-DQ2.5+ CD patients by single-cell TCR sequencing. Within this data set, we frequently observed identical amino acid sequences for either the TCR-α or TCR-β chain in different individuals (Figure 4 and Supplemental Table 2). Closer inspection of these public TCR sequences revealed common CDR3 motifs. We collapsed public TCR sequences that used the same V- and J-gene segment, had the same CDR3 length, and differed by no more than 3 amino acids in the CDR3 sequences to generate a list of semipublic TCR sequence motifs (Figure 4). Lists of the top semipublic CDR3α and CDR3β motifs are given in Table 1 and Table 2, respectively. In addition, we identified 40 paired public TCR-αβ sequences in which identical amino acid TCR-αβ sequences were found among cells from 2 to 4 individuals. In most cases, this public response was a result of convergent recombination in which each individual expresses unique nucleotide sequences that converge toward identical amino acid sequences (Supplemental Table 3). In total, there were 188 publicly used TCR-α, TCR-β, or paired TCR-αβ sequences amounting to 10% of all paired TCR-αβ amino acid sequences in this study (Figure 4).
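The collapsing rule described above (same V- and J-gene segment, same CDR3 length, and no more than 3 amino acid differences) can be sketched as follows. This is a minimal illustration under stated assumptions: the text does not say how groups were formed when several sequences are mutually similar, so the sketch merges linked sequences transitively (single linkage). The gene labels and CDR3 strings are hypothetical placeholders, not sequences from the study.

```python
from collections import defaultdict
from itertools import combinations


def hamming(a, b):
    """Number of mismatched positions between two equal-length strings."""
    if len(a) != len(b):
        raise ValueError("sequences must have equal length")
    return sum(x != y for x, y in zip(a, b))


def collapse_semipublic(chains, max_mismatches=3):
    """Group (v_gene, j_gene, cdr3_aa) records into candidate motifs.

    Records are bucketed by V gene, J gene, and CDR3 length, then linked when
    their CDR3s differ by at most `max_mismatches` residues.
    Assumption: linked records are merged transitively (single linkage).
    """
    buckets = defaultdict(list)
    for v, j, cdr3 in set(chains):
        buckets[(v, j, len(cdr3))].append(cdr3)

    groups = []
    for (v, j, _), cdr3s in buckets.items():
        cdr3s = sorted(cdr3s)
        parent = {s: s for s in cdr3s}  # union-find forest over CDR3s

        def find(s):
            while parent[s] != s:
                parent[s] = parent[parent[s]]
                s = parent[s]
            return s

        for a, b in combinations(cdr3s, 2):
            if hamming(a, b) <= max_mismatches:
                parent[find(a)] = find(b)

        members = defaultdict(list)
        for s in cdr3s:
            members[find(s)].append(s)
        groups.extend((v, j, sorted(m)) for m in members.values())
    return groups


# Hypothetical records: (V gene, J gene, CDR3 amino acid sequence).
chains = [
    ("V1", "J1", "CAAAAAAF"),
    ("V1", "J1", "CAAGAAAF"),  # 1 mismatch from the first: same motif
    ("V1", "J1", "CWWWWWWF"),  # 6 mismatches: separate group
    ("V1", "J2", "CAAAAAAF"),  # different J gene: separate group
    ("V1", "J1", "CAAAAAF"),   # different CDR3 length: separate group
]
groups = collapse_semipublic(chains)
```

Bucketing first by gene usage and CDR3 length keeps the pairwise comparison restricted to sequences that could belong to the same motif. It also guarantees that the position-by-position mismatch count is well defined.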

Figure 4 Public TCR sequences amount to 10% of the gluten-specific T cell repertoire. (A) Number of public TCRs defined as identical TCR-α, TCR-β, or paired TCR-αβ amino acid sequences observed in at least 2 individuals in a data set of a total of 1,813 gluten-specific TCR amino acid sequences from 17 HLA-DQ2.5+ patients. (B and C) Number of public TCR-α and TCR-β sequences, respectively, that were found in the number of patients plotted on the y axes. The open bars show public TCR-α or TCR-β sequences defined as identical amino acid sequences, whereas gray bars show semipublic TCR-α and TCR-β motifs generated by collapsing TCR-α or TCR-β amino acid sequences that differ by 3 or fewer residues.

Table 1 The top semipublic CDR3α motifs