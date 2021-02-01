Accessing HIV-1–infected antigen-responsive CD4+ T cells. To study antigen-responsive CD4+ T cells infected with HIV-1, we obtained PBMCs from 10 HIV-1–infected adults on suppressive ART for a median of 8 years (Supplemental Table 1; supplemental material available online with this article; https://doi.org/10.1172/JCI145254DS1). Cells were depleted of CD8+ T cells and stimulated with CMV or HIV-1 Gag antigens for 18 hours. Because the CMV peptidome is large (213 ORFs) and CMV-specific T cell responses are broad and heterogeneous (38, 39), we used lysates of CMV-infected cells rather than immunodominant proteins (e.g., pp65). To study HIV-1–specific cells, we used overlapping Gag peptides, since responses to class II–restricted Gag epitopes have been well characterized (40). We sorted responding cells on the basis of upregulation of both CD40 ligand (CD40L) and CD69 (Figure 1A). The median frequencies of CMV- and Gag-responding cells among all CD4+ T cells were 2.5% and 1.8%, respectively (Figure 1B), comparable to findings in previous studies (39, 41). For each antigen, we also sorted CD40L and CD69 double-negative cells with high expression of CD45RO to obtain memory populations depleted of responding cells (Figure 1A). PBMCs from HIV–CMV– donors showed no increase in CD40L+CD69+ cells compared with conditions without antigens (Figure 1B and Supplemental Figure 1C). We also sorted CD40L+CD69+ T cells responding to anti-CD3/anti-CD28 beads to obtain a population representing the CD4+ T cell repertoire (Figure 1A). As expected, we observed that a higher proportion of cells became activated in response to anti-CD3/anti-CD28 beads (median of 30%, P < 0.0001), and a lower fraction of these were memory cells (Supplemental Figure 1B).

Figure 1 Experimental approach to study HIV-1–infected, antigen-responding CD4+ T cells. (A) Experimental design and gating logic to isolate cells responding to stimulation. CD8-depleted PBMCs were stimulated with no antigen, anti-CD3/anti-CD28–conjugated beads, CMV lysates, or HIV-1 Gag peptides. CD4+ T cells upregulating both activation markers CD40L and CD69 were sorted. For CMV and Gag stimulations, nonresponding cells with high CD45RO expression were also isolated (highlighted in green). (B) Frequencies of CD4+ T cells responding to the indicated stimulation. Mean values of all time points are shown for each of 10 participants; horizontal bars show the median and interquartile values. Statistical significance was determined by 1-way ANOVA. (C) Experimental design to characterize the clones of the HIV-1–infected antigen-responding cells. Samples from follow-up time points were processed as in A; responding cells were sorted in small pools and subjected to WGA. Pools containing infected cells were detected by u5-gag or env PCR. Proviruses matching potential clones previously identified by single-genome sequencing were detected by Sanger sequencing. Whole-genome–amplified DNA was then used for integration site analysis, full proviral genome sequencing, the intact proviral DNA assay, and TCRβ sequencing. Ag, antigen.

Initially, DNA from sorted cells was used for single-genome sequencing of HIV-1 proviruses to study the clonality of infected cells, whereas TCRβ sequencing was used to assess the clonality of all sorted cells (see Methods). Subsequently, responding cells from the same donors (Supplemental Figure 1) were sorted in small pools representing limiting dilution with respect to infected cells (Figure 1C). We then used whole-genome amplification (WGA) with phi29 polymerase to generate thousands of copies of cell genomes (Supplemental Figure 3; see complete unedited blots in the supplemental material) (42, 43). PCR on whole-genome–amplified DNA identified HIV-1–positive pools, and then sequencing identified the proviruses of interest for determination of the integration site and the full proviral sequence (Figure 1C).

Identical proviral sequences are common among antigen-specific cells. Figure 2A shows a phylogenetic tree of 186 HIV-1 sequences from independent limiting dilution PCRs for a representative participant (P2). We found that higher frequencies of identical sequences (appearing as “rakes” on the tree) were present in both antigen-responding (0.65 and 0.67 for CMV and Gag, respectively) and unrelated memory cells (0.68) compared with cells responding to anti-CD3/anti-CD28 stimulation (0.21), likely reflecting the presence of naive cells in the latter condition. We identified at least 1 set of identical sequences in antigen-responding cells from all participants (10 of 10 for CMV and 8 of 8 for Gag) (Figure 2B, Supplemental Figure 4, and Supplemental Table 2). Integration site analysis proved that most identical proviral sequences were true clones of infected cells (see below). In aggregated single-genome sequences (SGSs) from all participants (n = 1787), a higher fraction of the proviral sequences were identical among CMV-responding cells than among Gag-responding cells, anti-CD3/anti-CD28–responding cells, or memory cells that did not respond to CMV or Gag (Figure 2B). These results link responsiveness to a chronic viral antigen to in vivo proliferation of HIV-1–infected cells. Figure 2C shows clonal proviruses with defined integration sites (see below) dominating HIV-1–infected, CMV-responding cells in 4 participants (P1, P3, P5, P8); we identified these sequences in multiple samples collected up to 10 months apart, demonstrating stable persistence.
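As an illustration of how a frequency of identical sequences can be computed from single-genome sequencing data, the following minimal sketch counts the fraction of sequences that match at least one other sequence exactly. The definition (exact string identity, no alignment or trimming) and the toy sequences are assumptions made for illustration; the precise criteria used in the study are described in its Methods, not here.

```python
from collections import Counter

def identical_fraction(sequences):
    """Fraction of sequences that are identical to at least one other sequence.

    Assumes the sequences are already trimmed to the same amplicon, so that
    exact string equality is a meaningful test of identity.
    """
    if not sequences:
        raise ValueError("need at least one sequence")
    counts = Counter(sequences)
    # Sum the sizes of all groups that contain two or more identical sequences.
    shared = sum(n for n in counts.values() if n > 1)
    return shared / len(sequences)

# Hypothetical toy data: one variant seen 3 times, one seen twice, one unique.
toy = ["ACGT", "ACGT", "ACGT", "ACGA", "ACGA", "TCGT"]
print(identical_fraction(toy))
```

In this toy set, 5 of 6 sequences belong to a group of identical sequences.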

Figure 2 HIV-1–infected, CMV-responding cells are enriched in proviral populations generated by clonal expansion. (A) Representative NJ tree of 186 independent HIV-1 u5-gag DNA SGSs from participant P2, rooted to the HIV-1 subtype C consensus. Sequences from different sorted populations are color coded (see legend). A branch distance of 1 nucleotide is shown on the tree scale. (B) Frequencies of identical proviral sequences within sorted populations for all 10 participants. (C) NJ trees of HIV-1 sequences recovered from CMV-responding CD4+ T cells from 4 participants. Identical sequences are collapsed onto the same branch, and trees are rooted to the HIV-1 subtype B consensus sequence. Dashed branches indicate hypermutated proviruses. Symbols indicate the method and time point used to generate the sequences. Large CMV-specific clones are colored as in D, and the gene containing or closest to the integration site is indicated (see Supplemental Table 3 for detailed integration site data). (D) Dot plot showing increased frequencies of probable clones identified in CMV-responding cells compared with cells responding to anti-CD3/anti-CD28 stimulation or CMV-nonresponding memory cells. Only clones confirmed by integration site or potential clones composed of at least 4 sequences were included. Probable clones are color-coded across stimulation conditions and as in C and Supplemental Figure 4. (E) Dot plot showing higher clonality of proviral populations from CMV-responding cells measured with the Gini coefficient. Horizontal bars show the median and interquartile range. Statistical significance was determined by 1-way ANOVA. CMV-nr, CMV-nonresponding; CMV-re, CMV-responding; Gag-re, Gag-responding; Memory nr, CMV-nonresponding memory cells. Gene symbols next to the tree branches show the genes containing or closest to (indicated by an asterisk) the integration site. Genes previously linked to the persistence of HIV-1–infected cells are highlighted in red.

Considering all sets of 4 or more identical sequences to be potential clones, we calculated the frequencies of cells from each potential clone among all HIV-1–infected, CMV-responding cells from each donor. Potential clones had a median frequency of 0.22 (range, 0.03–0.89). In the control population of CMV-nonresponding CD45ROhi cells, the same sequences were either absent or present at a much lower frequency (Figure 2D). In some cases, the relative abundance of these clonal variants was not only dominant among HIV-1–infected, CMV-responding cells, but also among HIV-1–infected cells responding to anti-CD3/anti-CD28 (Supplemental Figure 4), suggesting that in some cases, CMV-specific clones represent the most expanded clones of HIV-1–infected cells. For example, in participant P8, sequences from the provirus integrated in the PAFAH1B1 gene represented 87% of all HIV-1 sequences from CMV-responding cells and 22% of those from cells responding to anti-CD3/anti-CD28 (Figure 2C and Supplemental Figure 4). To compare the clonality of HIV-1–infected cells across conditions, we used the Gini coefficient, a measure of distribution previously used to estimate oligoclonality in human T cell leukemia virus, type 1 (HTLV-1) infection (44). Proviral populations from CMV-responding cells showed significantly higher oligoclonality than did those from Gag- or anti-CD3/anti-CD28–responding cells (Figure 2E). As expected, nonresponding memory cells also contained groups of identical sequences (Figure 2A and Supplemental Figure 4), specific for other unknown antigens and in some cases present at high frequencies.
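For readers unfamiliar with the Gini coefficient as a measure of oligoclonality, the sketch below shows one standard formulation applied to clone sizes (the number of sequences observed per distinct provirus). A value of 0 indicates that all variants are equally abundant, and values approaching 1 indicate dominance by a few clones. This section does not specify the estimator or any bias correction used in the study, so the formula and the example counts should be read as assumptions.

```python
def gini(clone_sizes):
    """Gini coefficient of a list of non-negative clone sizes.

    Uses the rank-based formula G = 2*sum(i*x_i)/(n*sum(x)) - (n+1)/n,
    with sizes x_i sorted in ascending order and ranks i = 1..n.
    """
    sizes = sorted(clone_sizes)
    n = len(sizes)
    total = sum(sizes)
    if n == 0 or total <= 0:
        raise ValueError("need at least one clone with a positive size")
    if sizes[0] < 0:
        raise ValueError("clone sizes must be non-negative")
    weighted = sum(rank * size for rank, size in enumerate(sizes, start=1))
    return 2 * weighted / (n * total) - (n + 1) / n

# Hypothetical counts: an even population versus one dominated by a single clone.
even_population = [5, 5, 5, 5]
skewed_population = [1, 1, 1, 17]
print(gini(even_population), gini(skewed_population))
```

Because the maximum of this formulation is (n − 1)/n rather than 1, comparisons are most meaningful between populations sampled to a similar depth.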

These results show that clonally expanded proviral populations were present within CMV- and Gag-responding CD4+ T cells; the significantly higher clonality of the infected cells responding to CMV suggests that the chronic immune responses to CMV antigens, characterized by memory inflation (30), contribute to proliferation and maintenance of the HIV-1 reservoir in many infected individuals.

Integration site analysis of HIV-1–infected, antigen-responding clones. To confirm that the identical proviral sequences result from clonal expansion, we recovered integration sites using linker-mediated PCR (LM-PCR) on whole-genome–amplified DNA from pools of antigen-responding cells that contained identical proviruses (Figure 1C). We retrieved integration sites of 22 expanded clones, 16 from CMV-responding cells and 6 from Gag-responding cells (Figure 3A and Supplemental Table 3). Most were within introns (19 of 22), consistent with previous studies (45, 46). There was no bias in orientation relative to the host gene (9 in the same and 13 in the opposite orientation). Ten integration sites were in cancer-related genes; among these, we found 1 provirus in MKL1, 4 in BACH2, and 2 in STAT5B. HIV-1 integration has been previously described in these 3 genes (Supplemental Table 3). Integration in BACH2 and STAT5B has been linked to HIV-1 persistence due to gene activation by promoter insertion (23). The proviruses identified here shared the same features as those previously found in individuals on ART (Figure 3B). Strikingly, all 4 proviruses in BACH2 were in the same orientation relative to host gene transcription and upstream of the BACH2 translation start site, similar to the 55 BACH2 integrants identified previously (see Methods). Moreover, despite defects including deletions and/or inversions (Figure 3A), these proviruses retained the 5′ long terminal repeat (LTR) and splicing donor sequences required to generate LTR-driven chimeric transcripts (47, 48). Similarly, the 2 proviruses in STAT5B were in the same orientation as STAT5B in intron 1, upstream of the translation start site, as with 42 of the 57 proviruses previously described.
These unique features, likely the result of postintegration positive selection, suggest that some HIV-1–infected clones not only proliferate in response to antigen, but also gain a survival advantage from the effects of HIV-1 integration. However, 15 of the 22 proviruses from antigen-responding expanded clones showed integration sites in loci not associated with proliferation of HIV-1–infected cells (Supplemental Table 3). Moreover, some of these were near or within genes with only trace to low mRNA levels in lymphocytes, as their expression is restricted to other tissues (see Methods). CD4+ T cell clones carrying these proviruses have probably undergone extensive proliferation, with negligible contribution of HIV-1 integration–related effects.

Figure 3 Characterization of defective and infectious proviruses from antigen-responding CD4+ T cell clones. (A) Genome sequences and integration sites (IS) recovered from proviruses in antigen-responding clones obtained from each participant. Each horizontal bar represents 1 provirus found in CMV- or Gag-responding cells (indicated by teal and purple boxes, respectively). Sequence features are color coded (see legend). The intact provirus from participant P3 is highlighted in black. Captured host-proviral junctions are depicted as squares flanking the horizontal bars, and the gene symbols listed on the right show the gene containing or closest to (indicated by an asterisk) the integration site. Genes previously linked to the persistence of HIV-1–infected cells are highlighted in red. Proviruses with asymmetrical aberrant integration are marked with black and red squares. (B) Integration sites in BACH2 and STAT5B found in CMV- and Gag-responding clones (teal and purple, respectively) compared with those previously reported in individuals on ART (black). Arrowhead direction represents proviral orientation relative to host gene transcription. The small gray arrows show host gene transcriptional orientation, and the large gray arrows show the translation start site. (C) Summary of the proviral sequences in A. (D) Frequency of infected cells carrying inducible replication-competent proviruses from 5 participants. Horizontal bars show the median and interquartile values. Statistical significance was determined using a 1-way ANOVA. (E) NJ tree including u5-gag sequences from p24-positive qVOA wells, gDNA SGS, and provirus P3.c.FBXO22, sampled according to the inset legend. The highlighter plot shows mismatches from the top sequences in the tree.

Antigen-responding clones carry both defective and infectious proviruses. To investigate whether clones of HIV-1–infected, antigen-responding cells carried intact or defective proviruses, we recovered the partial or full-length sequences of the proviruses for which we identified integration sites (Figure 3, A and C). As expected, most genomes (13 of 22) were defective because of mapped or inferred deletions or G-to-A hypermutation (see Supplemental Results). However, we detected 1 intact genome. Despite the limited sampling, this proportion of intact and defective proviruses was consistent with a previous analysis of the proviral landscape in CD4+ T cells (49).

We also carried out quantitative viral outgrowth assays (qVOAs) (50) (Figure 3, D and E, and Supplemental Figure 7) on the sorted cell populations and observed outgrowth from all cell populations from all participants. Infected cell frequencies in infectious units per million (IUPM) were not significantly different for CMV- or Gag-responding cells than for cells responding to anti-CD3/anti-CD28 stimulation or for nonresponding memory cells. In addition, we observed no difference relative to historical qVOAs using resting CD4+ T cells from the same individuals (Figure 3D), suggesting that, at least in these participants, the HIV-1 reservoir was not enriched in cells responding to the antigens tested.

To determine whether the induced infectious proviruses were from clonally expanded cells, we sequenced the HIV-1 RNA from the supernatants of p24-positive wells (Figure 3E and Supplemental Figure 7). In participant P3, we detected viral outgrowth in 4 independent wells seeded with CMV-responding cells (corresponding to 10.8 IUPM). These replication-competent viruses had identical near-full-length genome sequences that were genetically distinct from 1 outgrowth virus recovered from CMV-nonresponding cells (1.03 IUPM). In addition, these 4 identical isolates matched 3 HIV-1 DNA SGSs found in CMV-responding cells 8 months previously. Interestingly, we did not detect outgrowth of the intact provirus integrated into the FBXO22 gene, which was the most abundant sequence among HIV-1–infected, CMV-responding cells from participant P3 (frequency of 0.2) (Supplemental Figure 4) and persisted across multiple time points (Figure 3E), suggesting a lack of inducibility or the presence of missense mutations that would affect replicative fitness.

These results provide strong evidence that CD4+ T cell clones carrying an inducible replication-competent provirus can be selected over time in response to CMV. However, the size of infectious proviral clones within antigen-specific populations varied, as in the study from Mendoza et al. (37). In participant P2, cells responding to anti-CD3/anti-CD28 and CMV-nonresponding cells had a markedly higher frequency of cells carrying inducible, replication-competent proviruses (18.3 and 33.5 IUPM, respectively, versus 2.7 IUPM in CMV-responding cells). This was likely due to a single large clone specific for an antigen other than CMV (Supplemental Figure 4). Proviral sequences identical to this outgrowth virus were found in PBMCs from 8 months previously, when they represented the most abundant viral variant in both CD3/CD28 and CMV- or Gag-nonresponding memory cells (frequency of 0.15, 0.49, and 0.32, respectively) (Figure 2A and Supplemental Figure 4). Thus, other factors, including antigens not explored here, can lead to massive expansion of clones carrying infectious proviruses.

TCRβ repertoire mirrors the clonality of proviral populations. VDJ recombination generates a TCR repertoire with a theoretical diversity exceeding the total body number of T cells (up to 10^15 cells) (51). Therefore, we used VDJ rearrangements as T cell barcodes to compare clonality in CMV- and Gag-responding cells with that in nonresponding memory T cells and cells activated by anti-CD3/anti-CD28. Despite comparable sequencing depth (Supplemental Figure 8), we found that species richness for TCRβ sequences in CMV- and Gag-responding cells was significantly lower than that for CD3/CD28-responding cells (Figure 4A), and the oligoclonality indices for CMV-responding cells were not only higher than those for CD3/CD28-stimulated cells but also than those for Gag-responding cells (Figure 4B). Thus, patterns of clonality for the whole responding CD4+ T cell population mirrored those of the HIV-1–infected cells within that cell population. Oligoclonality of TCRβ sequences correlated with oligoclonality of proviruses within the relevant cell populations (Spearman’s coefficient r = 0.52, P = 0.01), and samples from CMV-responding cells clustered away from anti-CD3/anti-CD28 and Gag-responding cells (Figure 4C), suggesting that adaptive immune responses globally affected the expansion of HIV-1–infected clones, supporting previous studies describing inflation of memory cell populations driven by CMV (30).
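Species richness of a sampled TCRβ repertoire is commonly estimated from the number of clonotypes seen exactly once or twice. The sketch below implements the bias-corrected Chao1 estimator as one plausible reading of the "Chao index" named in Figure 4A; which Chao variant the study used is not stated in this section, and the template counts are hypothetical.

```python
def chao1(template_counts):
    """Bias-corrected Chao1 richness estimate.

    S_chao1 = S_obs + F1*(F1 - 1) / (2*(F2 + 1)), where S_obs is the number of
    observed clonotypes, F1 the number seen exactly once (singletons), and
    F2 the number seen exactly twice (doubletons).
    """
    counts = [c for c in template_counts if c > 0]
    s_obs = len(counts)
    f1 = sum(1 for c in counts if c == 1)
    f2 = sum(1 for c in counts if c == 2)
    return s_obs + f1 * (f1 - 1) / (2 * (f2 + 1))

# Hypothetical template counts per clonotype in one sorted sample.
sample = [1, 1, 1, 2, 5]
print(chao1(sample))
```

A sample with many singletons yields an estimate well above the observed richness, whereas a sample made up of a few large clones yields an estimate close to the observed count, which is the pattern expected for oligoclonal antigen-responding cells.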

Figure 4 Antigen-responding cells show higher clonality and evidence of convergent selection. (A) TCR diversity, estimated by the Chao index (93), was lower in antigen-responding cells. Gray circles represent memory cells nonresponsive to either CMV or Gag stimulation (teal and purple borders, respectively). (B) Gini coefficients based on TCRs from CMV-responding cells showed the highest clonality. (C) Correlation of Gini coefficients based on TCR and proviral populations. Spearman’s r value and linear correlation with 95% CIs are shown. (D) log10 abundance of productive TCRs from CMV-responding cells from participant P1 collected at month 6 and month 9 of the study. (E) VDJ sequences from the 10 most abundant clonotypes encoding CASRGSTEAFF. (F) Summed abundance of all degenerate and expanded CDR3β sequences. (G) Frequency of TCR clusters normalized by CDR3β input. Clusters were filtered on the basis of CDR3β of 2 or greater and a Fisher’s exact test P < 0.0001. (H) TCR clusters are plotted on the basis of the number of total CDR3β and Vβ gene scores. Lower scores indicate more homogeneous Vβ. Circle size is scaled to the total sum of TCR templates in each group, indicating clonal expansion of cells within the cluster. (I) Representative TCR clusters involving more than 1 participant, displayed as networks and CDR3β sequence logos. Nodes represent each CDR3β sequence, with circle colors based on participant and circle size representing clonal expansion. Edge colors highlight antigen stimulation (teal for CMV and purple for Gag). CDR3β logos display amino acid representation at each position. The core motif shared by the convergent cluster is colored. The table shows cluster characteristics and shared HLA alleles (see Supplemental Table 4 for additional details). Exclamation marks indicate enrichment of the participants’ HLA allele contributing to the cluster. Horizontal bars show the median and interquartile values. Statistical significance was determined by 1-way ANOVA.

To estimate the distribution and the relative sizes of CMV- and Gag-responding T cell clones within the entire CD4+ T cell repertoire, we identified TCRβ sequences with a significantly (P < 0.01) higher abundance in CMV- and Gag-responding cells than in cells activated by anti-CD3/anti-CD28 stimulation (see Methods). Across 7 participants, we identified an average of 210 (SD ± 103) and 137 (SD ± 82) T cell clones enriched in CMV- and Gag-specific cells, respectively (Supplemental Figure 8). Of note, significantly fewer clones were enriched upon either stimulation (average of 16, SD ± 10, P = 0.0004), likely reflecting cross-reactivity or nonspecific activation. Interestingly, although these CMV- and Gag-reactive T cell clones represented only a small percentage of all CD4+ T cells (average of 2.5% and 1%, respectively; Supplemental Figure 8, C and D), they were among the top expanded T cell clones in the anti-CD3/anti-CD28–responding cell population. CMV-responding T cell clones were particularly dominant (Supplemental Figure 8), as was the case in the analysis of clonal proviral populations among HIV-1–infected cells (Figure 2D and Supplemental Figure 4).

To prove that the striking clonality of CMV-responding CD4+ T cells resulted from antigen-dependent selection and not homeostatic proliferation, we examined the most expanded clonotypes (defined here as cell clones sharing an identical VDJ β sequence) and found that many shared the same amino acid sequence in the CDR3β region, despite different VDJ rearrangements at the nucleotide level. These so-called “degenerate” TCRs are signatures of convergent immune responses, selected over time for binding to specific peptide-loaded MHC molecules (52, 53). Figure 4, D and E, shows one such degenerate TCR. TCRβ sequencing on CMV-responding CD4+ T cells collected 3 months apart from participant P1 showed overlapping distributions of clonal frequencies (suggesting stability over time), including 19 different clonotypes with an identical CDR3β amino acid sequence (CASRGSTEAFF) (Figure 4E). These rearrangements represented the most abundant CDR3β sequences among CMV-responding cells (7% and 10%, for the 2 time points). Degenerate expanded clones of this type were significantly more abundant in antigen-responding cells (median of 2.8% for CMV and 2% for Gag, respectively) compared with nonresponding and anti-CD3/anti-CD28–stimulated cells (median of 0.5% and 0.003%, respectively) (Figure 4F and Supplemental Figure 8), supporting the hypothesis that CD4+ T cell responses against these 2 chronic infections undergo a process of convergent selection.

To further explore clonal selection based on TCR specificity, we performed a TCR cluster analysis of shared structural features using the GLIPH2 algorithm, which groups TCRs into clusters predicted to bind the same peptide-MHC complex (54). TCRs from CMV-responding cells had a significantly higher frequency of clusters than did those from anti-CD3/anti-CD28–stimulated cells (median of 4.6 vs. 0.8 clusters every 1000 TCR input sequences, P = 0.005; Figure 4G). In addition, a higher proportion of clusters from CMV-responding cells were larger in size (and included a higher number of clonotypes), included expanded clones, and showed a restricted V gene use (P < 0.05; Figure 4H), supporting an overall higher degree of selection toward convergent immune responses. Finally, when we used GLIPH2 on TCRs sampled from all participants (n = 8 for CMV and n = 7 for Gag), we found convergent clusters including CDR3s from multiple participants, likely representing public responses against shared immunodominant epitopes. Four exemplary clusters are described in Figure 4I and Supplemental Table 4. The CDR3 sequences in these clusters showed a restricted use of V and J genes, shared significant motif residues, and were often degenerate. Moreover, the individuals contributing to these clusters shared 1 or more class II HLA alleles. Although it is not technically feasible to extend these TCR analyses to the rare subset of CMV-responding cells that harbor latent proviruses, our results strongly support the conclusion that the clonality of CMV-responding cells is the result of antigen-driven proliferation and not a homeostatic process that occurs independently of TCR specificity.

Coupled quantification of provirus and VDJ rearrangements. To better understand the nature of antigen-specific CD4+ clones carrying proviruses, we developed a method to identify their cognate TCRβ sequences. We sequenced the TCRβ repertoire from multiple whole-genome–amplified cell pools containing the same provirus and searched for the unique VDJ rearrangement that recurred across all pools carrying a provirus with a specific integration site. To confirm the assignment to TCR/provirus pairs, we leveraged the extraordinarily high diversity of the TCR repertoire and adopted combinatorial statistics previously used to confidently pair α and β chains (see Methods and Supplemental Figure 9A) (55). The patterns of co-occurrence observed for assigned pairs of proviruses and TCRβ sequences indicated that they belonged to the same T cell clones and did not occur together simply by chance (P value between 10^-3 and 10^-13) (Supplemental Figure 9B). We identified 8 unique pairs across 6 participants, 5 pairs from CMV-responding cells and 3 from Gag-responding cells (Figure 5). T cell clones carrying specific proviruses ranked among the most abundant CMV-responding TCR sequences (Figure 5D). As expected, the TCRs of the 3 Gag-specific clones paired with proviruses were less abundant, in agreement with our analyses of clonality of Gag-responding cells.
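The logic of testing whether a TCRβ sequence and a provirus co-occur across pools more often than expected by chance can be sketched with a hypergeometric tail probability. This is an illustrative simplification, not the published statistic adapted from reference 55: it assumes that, under the null hypothesis, the pools containing a given TCRβ sequence are a random subset of all sequenced pools. The function name and the pool counts are hypothetical.

```python
from math import comb

def cooccurrence_p_value(n_pools, provirus_pools, tcr_pools, shared_pools):
    """P(overlap >= shared_pools) under a hypergeometric null.

    n_pools:        total number of pools sequenced
    provirus_pools: pools positive for the provirus of interest
    tcr_pools:      pools in which the candidate TCRb sequence was detected
    shared_pools:   pools positive for both
    """
    if not (0 <= provirus_pools <= n_pools and 0 <= tcr_pools <= n_pools):
        raise ValueError("pool counts must lie between 0 and n_pools")
    upper = min(provirus_pools, tcr_pools)
    if not 0 <= shared_pools <= upper:
        raise ValueError("shared_pools is inconsistent with the other counts")
    total = comb(n_pools, tcr_pools)
    tail = sum(
        comb(provirus_pools, x) * comb(n_pools - provirus_pools, tcr_pools - x)
        for x in range(shared_pools, upper + 1)
    )
    return tail / total

# Hypothetical example: 20 pools, provirus in 5, TCRb in the same 5 pools.
print(cooccurrence_p_value(20, 5, 5, 5))
```

With these hypothetical counts, only 1 of the 15,504 possible placements of the TCRβ sequence coincides exactly with the provirus-positive pools, which shows how a modest number of pools can give very small P values when overlap is complete.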

Figure 5 Analysis of VDJ and provirus pairs belonging to the same antigen-responding clone. (A) ddPCR design for duplex quantification of VDJ and proviral copies from gDNA of sorted cells. (B) Representative ddPCR 2D plot of duplex amplification of CASIGSSAAFF and cognate provirus integrated into the MKL1 gene. (C) Quantification of clonotypes by VDJ-specific ddPCR strongly correlated with TCRβ ImmunoSeq. Five CMV-responding clonotypes (described in D) were quantified in sorted cells responding to CMV or anti-CD3/anti-CD28 stimulation; axes represent log10 abundance. (D) log10-ranked abundance plots of CMV- or Gag-responding cells showing HIV-1–infected clonotypes for which both the VDJ rearrangement and the integration site were identified (highlighted in orange). For each pair, bar graphs show the frequency of provirus (orange) and VDJ (blue) copies in CMV-responding and nonresponding memory cells. Provirus-to-VDJ ratios were used to calculate the percentage of a given clone that was HIV-1 infected.

To estimate the fraction of a given CMV-specific CD4+ clone that carried the associated provirus, we designed duplex digital droplet PCR (ddPCR) assays to directly count, within the same sample of sorted cells, both the rearranged VDJ sequence and the host-provirus junction belonging to 5 CMV-specific clones (Figure 5, A–C). These corresponding VDJ and proviral sequences were highly enriched in CMV-responding cells but absent or rare in memory cells not responding to CMV. We calculated the ratio between proviral copies (infected cells) and VDJ copies (total cells in the antigen-specific clone) (Figure 5D). We observed 2 patterns: in 3 clones from 3 participants, only a fraction of all the cells comprising the clone carried the proviruses (range, 14%–72%), suggesting that these clones were expanded before a member of the clone became infected and subsequently proliferated. Conversely, in 2 clones from participant P5, almost 100% of the cells comprising the clone carried the relevant provirus, suggesting that extensive antigen-driven clonal expansion happened after the infection event in one of the progenitor cells of that clone (Figure 6, A and B).
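The provirus-to-VDJ ratio can be illustrated with the usual Poisson correction for droplet counts. The sketch assumes that each cell of the clone carries one copy of the rearranged VDJ target and that each infected cell carries one provirus, and it converts positive-droplet counts to mean copies per droplet as λ = −ln(1 − positive/total). The droplet numbers are hypothetical, and the study's own quantification pipeline may differ.

```python
from math import log

def copies_per_droplet(positive, total):
    """Mean target copies per droplet from a Poisson model of droplet loading."""
    if total <= 0 or not 0 <= positive < total:
        raise ValueError("need 0 <= positive < total droplets")
    return -log(1 - positive / total)

def infected_fraction(provirus_positive, vdj_positive, total_droplets):
    """Fraction of a clone carrying the provirus (provirus copies / VDJ copies)."""
    vdj = copies_per_droplet(vdj_positive, total_droplets)
    if vdj == 0:
        raise ValueError("no VDJ-positive droplets; clone not detected")
    return copies_per_droplet(provirus_positive, total_droplets) / vdj

# Hypothetical duplex run: 10,000 droplets, 100 VDJ-positive, 50 provirus-positive.
print(infected_fraction(50, 100, 10_000))
```

Because both targets are counted in the same well, the total droplet number and the input cell number cancel in the ratio, which is the practical advantage of a duplex design.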

Figure 6 Total body clone sizes and contribution of CD4+ T cell subsets. (A) Proviral and VDJ frequencies were used to back-calculate total body clone sizes. Clones with integration sites in genes previously linked to the persistence of infected cells are highlighted in red. Venn diagrams show the fraction of a clonotype (orange) carrying its cognate provirus (blue). (B) Two scenarios of infection-expansion dynamics of antigen-driven clonal selection. The left panel shows infection of a clone already expanded in response to antigen, whereas the right panel shows selection occurring after an early infection event. (C) Contribution of CD4+ T cell memory subsets to clones based on provirus and VDJ measurements. Memory subsets are defined as shown in Supplemental Figure 10. Horizontal bars show the relative contribution of memory subsets for each clone. Gene symbols show the genes containing or closest to (indicated by an asterisk) the integration site. Genes previously linked to the persistence of HIV-1–infected cells are highlighted in red.

To investigate the contribution of T cell differentiation to the persistence of these antigen-specific clones, we applied the ddPCR assay to CD4+ T cell subsets sorted on the basis of CCR7 and CD45RA expression levels (Supplemental Figure 10). Although we observed the expected proportions of naive (TNa), central memory (Tcm), effector memory (Tem), and effector memory CD45RA+ (TEMRA) T cells within total CD4+ T cells, most proviral copies were found in Tcm and Tem subsets (Figure 6C and Supplemental Figure 10), in accordance with previous studies based on HIV-1 DNA (19, 25, 26), HIV-RNA (56), and on specific expanded clones carrying proviruses (57, 58). However, we observed a wide variation in the relative contribution of Tcm or Tem cells for different proviruses (from 0% to 100%), suggesting that adaptive immune responses to individual epitopes are heterogeneous and likely influenced by other factors, such as the frequency of peptide-MHC stimulation, TCR affinity (59), and T cell activation. More important, despite this variation, we observed a similar contribution of Tcm and Tem subsets between proviral and VDJ copies of the same clone (Figure 6C), supporting the hypothesis that infected and uninfected cells within an antigen-responding T cell clone are under the same differentiation-proliferation program. This finding further supports the idea that expansion of HIV-1–infected T cell clones is mostly regulated by T cell physiology rather than HIV-1–mediated effects.

To understand the extent of expansion of these HIV-1–infected, CMV-specific clones, we estimated their total body size. We observed striking clone sizes ranging between 10^5 and 10^8 cells (Figure 6A). The number of divisions needed to reach such sizes (mean of 21, SD ± 2.9) is achievable, considering the short doubling time of activated T cells (60), but it does not account for cell death (during clonal contraction) and the fact that antigen-driven expansion occurs only for rare (specific) clones and not for the whole T cell population. To investigate whether clones of this size could result from homeostatic proliferation, we calculated the likelihood that a clone could reach a given size by chance if the entire population of CD4+ T cells was maintained by a constant, balanced process of division and death (see Supplemental Methods and Supplemental Table 5). For the sizes of the HIV-1–infected CMV-responsive clones described here, the probabilities approached zero even for high turnover rates, supporting a scenario in which nonrandom events drove the proliferation of rare cells (antigen-specific clonotypes) rather than a homogeneous process of “cohort homeostasis.” Thus, the major driver of infected cell proliferation, and hence HIV-1 persistence, is the response to antigen and not to specificity-independent homeostatic or integration site–related proliferation.
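The relationship between clone size and number of divisions can be made explicit with a short calculation. Assuming a single founder cell, perfect doubling at every division, and no cell death (the simplification that the text itself notes), the minimum number of divisions is log2 of the clone size. The function name is illustrative.

```python
from math import log2

def min_divisions(clone_size):
    """Minimum synchronous doublings for one founder cell to reach clone_size,
    ignoring cell death and asymmetric or incomplete division."""
    if clone_size < 1:
        raise ValueError("clone size must be at least 1 cell")
    return log2(clone_size)

# Bounds of the clone-size range reported in the text (10^5 to 10^8 cells).
for size in (1e5, 1e8):
    print(f"{size:.0e} cells: about {min_divisions(size):.1f} divisions")
```

Under these assumptions the reported range corresponds to roughly 17 to 27 doublings, which brackets the mean of 21 given above; any cell death during clonal contraction would raise the true number of divisions.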