Study participants. Ten PWH on ART and 1 HIV-seronegative individual were recruited to donate rectosigmoid biopsies and leukapheresis products or whole blood through the University of North Carolina (UNC) Global HIV Prevention and Treatment Clinical Trials Unit and the UNC Center for AIDS Research HIV Clinical Cohort. The study was approved by the UNC Biomedical Institutional Review Board. Informed consent was obtained from all participants prior to study enrollment. The PWH cohort was 80% male and 20% female; participants had a median age of 59 years (IQR, 43–61 years), had a median CD4 count of 880 cells/μL (IQR, 588–1139 cells/μL), and had a median pre-ART CD4 nadir of 345 cells/μL (IQR, 179.5–389.5 cells/μL). At the time of sample collection, individuals had been diagnosed with HIV for a median of 16 years (IQR, 13.3–23.2 years), treated with ART for a median of 12.8 years (IQR, 11.2–19.8 years), and durably suppressed (<50 copies/mL) for a median of 11.2 years (IQR, 5.7–15.4 years) (Supplemental Table 1; supplemental material available online with this article; https://doi.org/10.1172/JCI196536DS1).

Single-cell proteomic and transcriptomic profiling of colon and blood cells from PWH on ART. To define features of the HIV-1 reservoir in GI tissue–resident immune cells, fresh rectosigmoid pinch biopsies from the 10 PWH on ART and the HIV-seronegative individual were dissociated to obtain single-cell suspensions. In parallel, we isolated PBMCs from the same individuals. Both PBMCs and GI cells were stimulated for 6 hours with PMA/ionomycin and IL-2, or with control vehicle, before performing scRNAseq/surface protein analysis (Figure 1A). For some samples (donors 4–11), magnetic CD4+ T cell isolation was performed to increase the frequency of CD4+ T cells in the cell suspension. For blood-derived samples from 3 donors, we also performed a combined analysis of single-cell transcriptomes and surface protein abundances (CITEseq) using a panel of barcoded antibodies against 135 different surface proteins and 6 isotype controls. The overall scRNAseq/surface protein dataset was assessed using a computational analysis pipeline (Supplemental Figure 1) that included comprehensive quality control (QC) analysis to exclude empty droplets, doublets, or dying cells and the combination of all samples into a single data object. After filtering, we acquired high-quality single-cell transcriptomes from 394,107 cells (161,171 cells from colon and 232,936 cells from peripheral blood) (Supplemental Table 2). QC metrics were similar across donors, compartments, and treatment conditions (Supplemental Figure 2, A–C).
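The QC step described above (exclusion of empty droplets, doublets, and dying cells) can be illustrated with a minimal sketch. The thresholds, the `CellQC` record, and the function names are hypothetical and chosen only for illustration; this section does not report the cutoffs or the software used in the study's pipeline.

```python
from dataclasses import dataclass


@dataclass
class CellQC:
    barcode: str
    n_umis: int        # total UMIs detected in the droplet
    n_genes: int       # number of genes with at least one UMI
    frac_mito: float   # fraction of UMIs from mitochondrial genes
    is_doublet: bool   # flag from an upstream doublet caller (assumed to exist)


# Hypothetical thresholds for illustration only; not the study's values.
MIN_UMIS = 500
MIN_GENES = 200
MAX_FRAC_MITO = 0.15


def passes_qc(cell: CellQC) -> bool:
    """Return True if the droplet looks like a single, viable cell."""
    if cell.n_umis < MIN_UMIS or cell.n_genes < MIN_GENES:
        return False  # too little RNA: likely an empty droplet
    if cell.frac_mito > MAX_FRAC_MITO:
        return False  # high mitochondrial fraction: likely a dying cell
    return not cell.is_doublet


def filter_cells(cells):
    """Keep only the cells that pass every QC criterion."""
    return [c for c in cells if passes_qc(c)]
```

Applying such a filter per sample before merging keeps low-quality droplets from influencing the downstream clustering.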

Figure 1 Single-cell proteomic and transcriptomic profiling of colon and blood cells from people with HIV. (A) Schematic overview of experimental design and sample processing pipeline. (B) Uniform manifold approximation and projection (UMAP) visualization of the combined dataset of scRNAseq and surface protein abundance for matched colon and blood cells from 10 people with HIV (PWH) on antiretroviral therapy (ART) and 1 HIV-seronegative participant. Visualization shown is after reciprocal principal component analysis (RPCA) correction. Top: Blood-derived cells highlighted in red. Bottom: Colon-derived cells highlighted in blue. (C) UMAP visualization of cell clusters colored and labeled by annotated cell type. (D) Dot plot of gene expression levels from scRNAseq dataset, including both stimulated and unstimulated cells, organized by clusters (y axis) and gene of interest (x axis). Shown are key markers used in identification and annotation of cell types.

We first visualized the scRNAseq dataset from all 11 donors with uniform manifold approximation and projection (UMAP) (Figure 1, B and C). Initial examination of the data revealed heterogeneity driven by the stimulation condition and, to a lesser extent, the tissue compartment (Supplemental Figure 2D). We then employed batch correction with reciprocal principal component analysis (RPCA) to identify corresponding cell types across samples. In the absence of RPCA, unstimulated and PMAi/IL-2–stimulated cells were separated on the UMAP plot, indicating the potent effect of this stimulation condition on the transcriptome of both blood and colon cells. Blood and colon cells were also somewhat separated on the UMAP plot, indicating transcriptomic differences between the anatomical compartments. After RPCA, cells from each condition and compartment were aligned across sample conditions, enabling the visualization of shared characteristics in UMAP projections and neighbor finding while still retaining clear compartment-specific cell populations (Figure 1B and Supplemental Figure 2D). Examining the data from the 11 donors separately in the RPCA-corrected UMAP plot, we observed that the data from different donors largely overlapped, suggesting that the overall cluster structure of the data was independent of the donor (Supplemental Figure 3).

We then employed graph-based neighbor finding and clustering to define transcriptionally distinct groups of cells within the data. Across the entire dataset, we identified 32 distinct cellular clusters. We annotated the clusters manually by examining expression of a panel of known lineage marker genes (Figure 1D and Supplemental Table 3) and the differentially expressed genes (DEGs) detected in each cluster for unstimulated cells to avoid potential activation-induced masking of cell identity (Supplemental Figure 4). The annotation of cluster identities was further informed by parallel surface protein (CITEseq) analysis from peripheral blood cells (Supplemental Figures 5 and 6) and analysis of cytokines induced by stimulation (Supplemental Figure 7). Of the 32 clusters, 30 were positive (expressed in >25% of cells) for PTPRC (CD45) RNA, an immune cell lineage marker (Figure 1D). Furthermore, 20 clusters were robustly positive (expressed in >50% of cells) for RNA encoding the T cell lineage marker CD3D. We renamed the clusters to form numerically close groupings, such that clusters 1–20 contained predominantly CD3+ T cells while other immune cell populations formed clusters 21–29.
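The positivity calls above (a marker counted as expressed in a cluster when detected in >25% or >50% of its cells) amount to a per-cluster detection fraction. The following is a minimal sketch of that calculation, assuming one UMI count per cell for a single gene and a parallel list of cluster labels; the function names are hypothetical and the study's own implementation is not described here.

```python
from collections import defaultdict


def fraction_expressing(counts, clusters):
    """Fraction of cells in each cluster with at least one UMI for the gene.

    counts   -- per-cell UMI counts for a single gene
    clusters -- cluster label for each cell, in the same order as counts
    """
    total = defaultdict(int)
    positive = defaultdict(int)
    for count, cluster in zip(counts, clusters):
        total[cluster] += 1
        if count > 0:
            positive[cluster] += 1
    return {cluster: positive[cluster] / total[cluster] for cluster in total}


def positive_clusters(counts, clusters, threshold):
    """Clusters in which the gene is detected in more than `threshold` of cells."""
    fractions = fraction_expressing(counts, clusters)
    return sorted(c for c, f in fractions.items() if f > threshold)
```

With a threshold of 0.25 this mirrors the PTPRC criterion, and with 0.5 the CD3D criterion.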

The predominant site of HIV infection and persistence is CD4+ T cells, leading us to focus our analysis on CD4+ T cells. While CD4 RNA was overall sparsely detected, CD4 RNA and CD4 surface protein staining revealed CD4 expression in all T cell clusters except for cluster 19, which we assessed as likely containing a mixture of NK cells (NCAM1/CD56+) and CD8+ T cells. CD8 RNA and CD8 surface protein staining were primarily detected in clusters 17–20; these clusters also expressed transcripts for the cytotoxic effector GZMB, even in the absence of stimulation (Supplemental Figure 7). Thus, clusters 17–20 likely consist of a mixture of CD8+ T cells, NK cells, and CD4+ T cells with a cytotoxic phenotype.

We then identified subsets of functionally distinct cells within the CD4+ T cell clusters (clusters 1–16). Clusters 1, 2, and 3 were identified as naive or naive-like T (Tn) cells owing to the absence of the FAS/CD95 surface marker that defines antigen-experienced T cells as well as the presence of the Tn surface protein marker CD45RA (Figure 1D and Supplemental Figure 5). Clusters 1 and 2, and to a lesser extent cluster 3, also expressed comparatively high levels of transcripts for genes encoding surface markers associated with Tn and central memory T (Tcm) cells, including SELL/CD62L, CCR7, CD28, IL7R/CD127, and S1PR1 (32, 33). Additionally, these cells expressed the Tn/Tcm-associated transcription factors (TFs), TCF7, LEF1, and KLF2, that are important for a long-lived quiescent cellular phenotype (34). Clusters 4, 5, and 6 were separated from the main clusters of T cells and were identified as Tregs based on high levels of transcripts for the IL2 receptor (IL2RA) and the Treg lineage TF FOXP3 (Figure 1D) (35). Cluster 4 had comparatively lower expression of FAS/CD95 and may be partially composed of naive Tregs. Cluster 6 exhibited a distinct transcriptional profile, with high levels of expression for IL1R1 and the TFs MAF and IKZF3, whereas clusters 4 and 5 exhibited high levels of IKZF2 expression. Furthermore, unlike the Treg clusters 4 and 5, stimulated cells in cluster 6 expressed IL10, IL17A, and IL17F and may represent a previously described population of intestinal Tregs that expresses both Treg (FOXP3+) and Th17 (IL17A+, IL17F+) characteristics, referred to as Treg17 cells (36, 37). We identified clusters 7–12 as long-lived Tcm cell clusters owing to the presence of FAS/CD95 RNA indicating antigen-experienced cells; expression of the Tn/Tcm-associated genes CCR7, LEF1, TCF7, and IL7R (Figure 1D); and the lack of surface CD45RA protein expression (Supplemental Figure 5).
Cluster 12 expressed TOX and EOMES, suggesting the presence of exhausted T cells, a dysfunctional cell state best characterized in chronic infections and cancer (38). Cluster 13 cells expressed IL17F and IL17A after stimulation, suggesting the presence of Th17 cells; this designation is supported by expression of IL23R and the TF RORC (39). Finally, clusters 14 and 15 were annotated as tissue-resident memory CD4+ T cells (Trm cells), based on expression of CD69, ITGAE, S1PR1, and KLF2, while cluster 16 was annotated as containing Th2 cells based on high expression of GATA3 (40).

Blood and colon T cells exhibit distinct subpopulation abundances and transcriptomic profiles. We next restricted our analyses to the T cell clusters (clusters 1–20) (Figure 2A) and compared blood T cells to colon T cells. From a visual observation of the UMAP projection, we noted that colon and blood cells exhibited different distributions across the transcriptomic clusters (Figure 2B). To assess global differences between blood and colon T cells, we identified DEGs between cells from these compartments. Among all T cells, 1,311 genes had higher levels of expression in colon T cells, while 2,438 genes had higher levels of expression in blood T cells (P adj < 0.05, log 2 fold change > 0.3) (Supplemental Figure 8A and Supplemental Table 4). When we examined the top 500 upregulated genes in blood or colon, we observed significant (P adj < 0.05) enrichment for several specific biological pathways. In genes that were more highly expressed in colon T cells, the most enriched pathways from the MSigDB database (https://www.gsea-msigdb.org/gsea/msigdb) were “TNF-alpha signaling via NF-kB,” “IL-2/STAT5 Signaling,” and “Inflammatory Response,” while in the genes that were more highly expressed in blood T cells, “IFNa Response” and “IFNg Response” were the most enriched (Supplemental Tables 5 and 6). Notably, when we examined genes that regulate lymph node homing and retention (CCR7, S1PR1, and SELL/CD62L) these genes were all more highly expressed in blood T cells, while 2 integrins that promote binding of cells to extracellular matrix in tissues (ITGAE and ITGA1) as well as the T cell tissue retention and activation marker (CD69) (11) were all expressed more highly in colon-resident T cells (Figure 2C). We also examined several functionally important T cell TFs (41, 42) and observed that RUNX3, STAT3, and STAT4 had higher expression in colon-derived T cells, while RUNX1, LEF1, FOXO1, TCF7, and KLF2 were more highly expressed in blood-derived T cells (Figure 2C). 
When we considered only the unstimulated samples, we also observed several thousand DEGs between colon and blood T cells. Among unstimulated T cells, 2,113 genes were expressed at higher levels in colon T cells and 735 genes were expressed at higher levels in blood T cells (Supplemental Table 7 and Supplemental Figure 8B).
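The DEG criteria stated above (P adj < 0.05 and log 2 fold change > 0.3) can be sketched as a simple partition of per-gene test results. This is a minimal illustration that assumes the fold change is expressed as colon relative to blood, so positive values mean higher expression in colon; that sign convention, the function name, and the numbers in the usage example are assumptions for illustration and are not data from the study.

```python
def split_degs(results, lfc_cutoff=0.3, padj_cutoff=0.05):
    """Partition genes into colon-enriched and blood-enriched DEG lists.

    results -- dict mapping gene name to (log2 fold change, adjusted P value),
               with the fold change assumed to be colon relative to blood
    """
    colon_up, blood_up = [], []
    for gene, (log2_fc, padj) in results.items():
        if padj >= padj_cutoff:
            continue  # not significant after multiple-testing adjustment
        if log2_fc > lfc_cutoff:
            colon_up.append(gene)
        elif log2_fc < -lfc_cutoff:
            blood_up.append(gene)
    return sorted(colon_up), sorted(blood_up)


# Made-up values for illustration only.
example = {
    "ITGAE": (1.2, 1e-10),
    "CD69": (0.8, 0.001),
    "CCR7": (-1.5, 1e-8),
    "GENE_X": (0.1, 1e-6),   # significant but below the effect-size cutoff
    "GENE_Y": (2.0, 0.2),    # large effect but not significant
}
colon_genes, blood_genes = split_degs(example)
```

Requiring both an effect-size and a significance cutoff prevents very small but statistically detectable shifts, which are common with hundreds of thousands of cells, from being counted as DEGs.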

Figure 2 Blood and colon T cells exhibit distinct population abundances and transcriptomic profiles. (A) UMAP visualization of combined T cell clusters 1–20. (B) UMAP visualization of unstimulated T cells from the blood and colon, with the different compartments shown separately. (C) Dot plot comparing expression for selected differentially expressed transcription factor and surface markers between unstimulated blood and colon CD4+ T cells. (D) Abundance of cells within individual T cell clusters in each compartment, as a fraction of all cells in the compartment. Data from colon cells are shown in blue; data from blood cells are shown in red. (E and F) Dot plots showing expression of transcription factor and surface markers in unstimulated colon-derived (E) and blood-derived (F) T cells within each cluster.

When we examined the proportional abundance of each of the clusters within blood or colon cells, we observed that certain clusters were preferentially represented in one of the compartments (Figure 2D). In particular, we observed that Tn cells were much more abundant in blood, while Treg17, Th17, and Trm cells were more abundant in colon tissue. Thus, colon and blood T cells differ in the subtypes of T cells present. We then examined expression of selected sets of genes across the different clusters within both unstimulated blood and unstimulated colon T cells, including TFs and tissue retention molecules (Figure 2, E and F); known HIV expression–regulating TFs, NF-κB and AP1 (Supplemental Figure 8, C and D); chemokines and chemokine receptors (Supplemental Figure 8, E and F); and interferon-stimulated genes (Supplemental Figure 8, G and H). We observed that the pattern of elevated expression of tissue retention markers (ITGAE, ITGA1, and CD69) was present across most colon T cell clusters, consistent with the higher fraction of cells with a tissue-resident phenotype in the colon (Figure 2, E and F). Similarly, genes that exhibited elevated expression in blood-derived T cells (LEF1, TCF7, and KLF2) had higher expression across multiple subclusters of blood T cells. Overall, these data identify important differences between the composition and molecular phenotype of cells in the colon and the blood of PWH.
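The proportional abundance comparison described above reduces to counting cells per cluster within each compartment and dividing by the compartment total. A minimal sketch follows, assuming each cell carries a single cluster label; the function names are hypothetical and the labels in any usage are illustrative.

```python
from collections import Counter


def cluster_fractions(labels):
    """Fraction of cells assigned to each cluster within one compartment."""
    counts = Counter(labels)
    total = sum(counts.values())
    return {cluster: n / total for cluster, n in counts.items()}


def compare_compartments(blood_labels, colon_labels):
    """Return {cluster: (blood fraction, colon fraction)} for every cluster seen."""
    blood = cluster_fractions(blood_labels)
    colon = cluster_fractions(colon_labels)
    clusters = sorted(set(blood) | set(colon))
    return {c: (blood.get(c, 0.0), colon.get(c, 0.0)) for c in clusters}
```

Normalizing within each compartment, rather than across the pooled dataset, keeps the comparison independent of how many cells were recovered from blood versus colon.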

HIV RNA+ cells display heterogeneous phenotypes in the blood and colon of PWH on ART. We next examined the expression of HIV RNA transcripts (vRNA) across the cell populations by alignment of the data to a consensus clade B HIV reference genome (43). Although this approach likely underestimates the true abundance of infected cells, due to the limited depth of sampling with scRNAseq and the presence of transcriptionally silent proviruses, it can nonetheless be used to characterize a subset of infected cells. Across cells from all 10 PWH, we identified a total of 125 cells with vRNA, 123 of which were from T cell clusters (Supplemental Table 8). The 2 HIV vRNA+ cells identified outside of these T cell clusters were found in clusters annotated as “unclassified lymphocytes.” To simplify analyses, we then focused on the 123 HIV vRNA+ cells within the 20 T cell clusters. Of these, 66 were found within the colon T cells, and 57 were found within the blood T cells. The number of HIV vRNA+ cells within each sample was highly variable across the donors, with values ranging from 0 to 21, and the frequency of vRNA+ cells per million ranging from 0 per million (/M) to 1,698/M (Figure 3A). As expected, vRNA+ cells were more numerous in the PMAi/IL-2–stimulated samples than in the unstimulated samples (92 vRNA+ cells vs. 31 vRNA+ cells, respectively). The frequency per million of vRNA+ cells in unstimulated colon samples compared with unstimulated blood cells (mean 296/M vs. 143/M) was not significantly different (P = 0.41, Mann-Whitney test). Within vRNA+ cells, we also examined the expression level of HIV transcripts and found no significant difference in expression for colon cells versus blood cells (P = 0.17, Mann-Whitney test) (Figure 3B) or in PMAi/IL-2–stimulated cells versus unstimulated cells (P = 0.27, Mann-Whitney test) (Figure 3C).
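The per-million frequencies and the Mann-Whitney comparison reported above can be sketched as follows. This is a minimal illustration under stated assumptions: `per_million` and `mann_whitney_u` are hypothetical helper names, only the U statistic is computed (the P value, which depends on sample size and tie handling, is left to a statistics package), and no study data are reproduced.

```python
def per_million(n_positive, n_total):
    """Frequency of vRNA+ cells expressed per million cells profiled."""
    if n_total <= 0:
        raise ValueError("n_total must be positive")
    return 1e6 * n_positive / n_total


def mann_whitney_u(x, y):
    """Mann-Whitney U statistic by direct pairwise comparison.

    Each (a, b) pair contributes 1 if a > b and 0.5 if tied; the smaller of
    the two possible U values is returned, as is conventional.
    """
    u_x = sum(
        1.0 if a > b else 0.5 if a == b else 0.0
        for a in x
        for b in y
    )
    return min(u_x, len(x) * len(y) - u_x)
```

Because the test is rank-based, a few donors with very high frequencies do not dominate the comparison the way they would in a test on means.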

Figure 3 HIV RNA+ cells display heterogeneous phenotypes in the blood and colon of PWH. (A) The frequency of cells with detectable viral RNA per million cells is displayed for each donor, separated by tissue of origin (blood/colon) and condition (unstimulated/PMAi/IL-2 stimulated). Horizontal bar represents mean values for each column. Each dot represents data from an individual donor. (B) Violin plot of normalized HIV expression level for each infected CD4+ T cell within the dataset divided by compartment. Blood cells are shown in red; colon cells are shown in blue. Each dot represents a single infected cell. (C) As in B, but with the data subdivided by stimulation condition. PMAi/IL-2 cells are shown in pale blue, and unstimulated cells are shown in purple. (D) UMAP visualization of the combined dataset with vRNA+ cells from the blood compartment highlighted in red. (E) As in D, but with vRNA+ cells from the colon compartment highlighted in blue. (F) Pie chart showing proportion of vRNA+ cells and total cells in each transcriptomic cluster for blood and colon cells separately. (G) Bar chart of the frequency of vRNA+ cells within each transcriptomic cluster for blood and colon tissue. Tem, effector memory T cell.

Next, we examined the distribution of HIV-mapping reads across the viral genome. From the 125 vRNA+ cells in the scRNAseq analysis we identified a total of 1,599 HIV-mapping reads derived from 873 unique molecular identifiers (UMIs) (Supplemental Table 9). It should be noted that the raw reads derived from 1 UMI can have multiple start points due to the random fragmentation step after the first PCR of the scRNAseq protocol. We observed several notable features of the distribution of viral reads across the HIV genome. First, many of the reads mapped to the 5′ region of the virus. These included both reads within the 5′ long terminal repeat (LTR), with a large number of reads beginning at the transcriptional start site (TSS), and reads extending past the 5′ LTR into the 5′ region of the Gag open reading frame. Second, additional peaks of viral gene reads were present near nucleotide positions 4,900, 5,400, and 6,000 (Supplemental Figure 9, A and B). Since the library construction protocol used for this study relies on oligo-dT–dependent priming at the poly-A tails of mRNAs and enzymatic fragmentation to generate reads proximal to the poly-A site, this distribution of viral reads distant from the viral poly-A site was initially unexpected. It is unlikely that the TSS proximal reads represent fragments from paused TAR RNAs, since these partial transcripts are not polyadenylated. These observations were consistent across donors, as we observed reads that began at the TSS in 9 donors, reads past the 5′ LTR at the 5′ end of Gag in 7 donors, and peaks near 4.9 kb, 5.4 kb, and 6 kb in 2, 3, and 3 donors respectively. Notably, our findings resemble findings from recent studies using a similar 3′ scRNAseq approach (44, 45). As explored by Schlachetzski et al., the LTR regions are identical in HIV proviruses, so alignments to the 5′ LTR versus the 3′ LTR are difficult to resolve if the read does not extend past the LTR into unique regions (44). 
However, mapping ambiguity between the 5′ and 3′ LTR does not fully explain the presence of 5′ LTR-mapping reads, since mis-assigned 3′ LTR reads would not extend past the U5 region into Gag. We also identified A-G–rich regions of the HIV genome (13 bp A/G at HXB2 position 778–790, 16 bp A/G at HXB2 position 858–873) internal to the Gag transcript that are likely sufficient to allow internal priming during the reverse transcription step of library construction. Internal priming from A-rich regions is a well-described phenomenon in oligo-dT primed RNA-sequencing libraries (46), and base-pairing between T and G is also thermodynamically stable when G is in an A-rich region (47). Furthermore, the large size of the HIV genome has previously been shown to restrict recovery of full-length viral sequences in near full-length proviral sequencing approaches (48). Thus, we speculate that, in addition to the ambiguity created by identity between the 5′ and 3′ LTRs, internal reverse transcription priming events during library construction may account for a large fraction of viral mapping reads in scRNAseq libraries generated with an oligo-dT based priming approach.
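The search for A/G-rich stretches that could support internal oligo-dT priming can be sketched as a scan for uninterrupted purine runs. The minimum run length of 13 follows the shorter of the two regions noted above; the function name and the toy sequence in the usage example are hypothetical, and the example sequence is not taken from HXB2.

```python
import re


def purine_runs(sequence, min_len=13):
    """Find uninterrupted runs of A/G at least `min_len` nucleotides long.

    Returns (start, end, run) tuples with 1-based inclusive coordinates.
    """
    pattern = re.compile(r"[AG]{%d,}" % min_len)
    return [
        (match.start() + 1, match.end(), match.group())
        for match in pattern.finditer(sequence.upper())
    ]


# Toy sequence for illustration only: one 13-nt purine run, then a shorter
# 8-nt run that falls below the length cutoff.
toy = "CTCT" + "AGGAAAGGAGAGA" + "TC" + "AGAGAGAG" + "C"
runs = purine_runs(toy)
```

A scan of this kind flags only candidate sites; whether a given run actually primes reverse transcription would still need to be checked against where reads begin.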

We then examined the distribution of vRNA+ cells across the transcriptomic clusters for both blood- and colon-derived T cells (Figure 3, D–G, and Supplemental Figure 10A). Overall, we observed for both blood- and colon-derived cells that, while vRNA+ cells were heterogeneous in nature (found in several different transcriptomic clusters) (Figure 3F), they disproportionately originated from a subset of the T cell clusters. In particular, vRNA+ cells were predominantly Th17, Trm, Tcm, Th2, and cytotoxic CD4+ T cells (Figure 3F), and their relative proportions differed depending on tissue source. For example, Trm cells made up a larger proportion of the overall pool of infected cells in the colon than in the blood. While the cluster distribution of vRNA+ cells in the blood was quite different from that of total blood cells, the vRNA+ cells in the colon more closely resembled total colon T cells (Figure 3F). It is noteworthy that the population of vRNA+ cells in the blood also more closely resembles total colon T cells than total blood T cells with respect to cluster proportions. We thus speculate that a substantial fraction of the vRNA+ cells detected in the blood originated from colon tissue.

We then examined the proportional abundance of vRNA+ cells as a frequency per million cells for each cluster (Figure 3G and Supplemental Figure 10A). Interestingly, Treg17 cells and Th17 cells exhibited the highest proportional abundance of vRNA+ cells, with Tcm and Trm clusters also exhibiting a relatively high abundance of vRNA+ cells. Th2 cells in the colon also exhibited a relatively high frequency of HIV RNA–expressing cells. By contrast, vRNA+ cells were relatively scarce in cells with a Tn phenotype for both compartments. We also examined the expression level of HIV within the vRNA+ cells across the different clusters to determine whether specific cell types are more prone to elevated HIV expression. A broad range of HIV RNA expression was evident across cell clusters, but no statistical difference between the clusters was detected (P = 0.197 by Kruskal-Wallis test) (Supplemental Figure 10B), possibly due to the small number of vRNA+ cells in some clusters. Additional data may reveal evidence for differential expression of HIV proviruses in different cellular environments.

The frequency of vRNA+ cells in the colon is correlated with the frequency of vRNA+ cells in blood. To better understand the relationship between the blood reservoir and the colon reservoir, we next assessed the level of correlation between the frequency of HIV vRNA+ cells in these compartments. When we examined the correlation between blood and colon vRNA+ cells for all samples, both unstimulated and stimulated, we observed a significant correlation across the cohort (r = 0.6387, P = 0.0024, Spearman’s correlation) (Figure 4A). When we separated the two conditions (stimulated and unstimulated), we still observed a significant correlation between vRNA+ cells in blood and colon for stimulated samples (r = 0.7072, P = 0.0198) but not for unstimulated samples (r = 0.5435, P = 0.1442). Since a subset of the study participants had previously had their replication-competent blood reservoir quantified by quantitative viral outgrowth assay (QVOA), we also examined the correlation between infectious units per million cells (IUPM) measured by this assay and the frequency of vRNA+ cells. Notably, the frequency of vRNA+ cells in neither the blood nor the colon correlated with the IUPM measured by QVOA (Figure 4B). Thus, while the frequency of inducible vRNA+ CD4+ T cells in the colon correlates with the frequency of inducible vRNA+ cells in the blood, we do not observe a clear correlation between the frequency of vRNA+ cells and the inducible replication-competent HIV blood reservoir measured by QVOA.
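Spearman's correlation, used above to relate blood and colon vRNA+ frequencies, is the Pearson correlation computed on ranks. The sketch below shows the coefficient only, with tied values given their average rank; the function names are hypothetical, and the P values reported in the text would come from a statistics package rather than from this code.

```python
from math import sqrt


def average_ranks(values):
    """1-based ranks, with tied values sharing the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        shared = (start + end) / 2 + 1  # mean of ranks start+1 .. end+1
        for pos in range(start, end + 1):
            ranks[order[pos]] = shared
        start = end + 1
    return ranks


def spearman(x, y):
    """Spearman's rank correlation coefficient between paired samples."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must be paired and contain at least 2 values")
    rx, ry = average_ranks(x), average_ranks(y)
    mean_x, mean_y = sum(rx) / len(rx), sum(ry) / len(ry)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(rx, ry))
    var_x = sum((a - mean_x) ** 2 for a in rx)
    var_y = sum((b - mean_y) ** 2 for b in ry)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined when one sample is constant")
    return cov / sqrt(var_x * var_y)
```

Average ranks matter here because several donors can share a frequency of 0/M, which produces ties.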

Figure 4 The frequency of vRNA+ cells in the colon is correlated with the frequency of vRNA+ cells in blood. (A) Scatter plot of frequency of vRNA+ cells in blood cells vs. vRNA+ cells in colon-derived T cells for all conditions (left), unstimulated cells (middle), and stimulated cells (right). Spearman’s correlation coefficients and P values are shown. Each data point represents a single participant. (B) Scatter plots of the frequency of vRNA+ cells in blood cells (left) or colon-derived T cells (right) vs. infectious units per million cells (IUPM) calculated from a quantitative viral outgrowth assay (QVOA) of blood CD4+ T cells for each PWH. Spearman’s correlation coefficients and P values are shown. Each data point represents a single participant.

Identification of DEGs in vRNA+ cells. We next examined whether the HIV vRNA+ cells exhibit any unique gene expression patterns that could distinguish them from uninfected cells (Figure 5A). First, we compared all HIV vRNA+ T cells across both compartments and both stimulation conditions to vRNA– T cells. We identified 111 genes that were expressed at a higher level in vRNA+ cells (Figure 5B and Supplemental Table 10). This list included several cytokines (IL2, IL17F, IL21, IL22, IFNG, TNF, and CCL20) as well as ZBED2, a TF that regulates IFN responses (49); spermine oxidase (SMOX), an enzyme that has been proposed to mediate Tat-dependent neurotoxicity (50); and the TF MAF. By contrast, only 5 genes, MALAT1, EEF1D, NACA1, RPL13A, and CYTIP, were transcriptionally downregulated in the HIV vRNA+ cell population relative to vRNA– cells. We next compared gene expression between vRNA+ and vRNA– T cells within blood or colon, dichotomized by stimulation condition, and observed that 50 genes (including HIV) exhibited elevated expression in vRNA+ unstimulated blood T cells compared with vRNA– cells (Figure 5C and Supplemental Table 11), while 23 genes were elevated in unstimulated vRNA+ colon T cells (Figure 5D and Supplemental Table 12). Notably, these 2 sets of genes exhibited no overlap, suggesting that DEGs that characterize vRNA+ cells in the blood are distinct from the DEGs that define vRNA+ cells in the colon. When we looked within the stimulated blood cell dataset, we observed 22 upregulated genes and no downregulated genes in vRNA+ cells (Figure 5E and Supplemental Table 13). In vRNA+ cells, ZBED2, the zinc finger transcriptional repressor ZIK1, and SMOX were upregulated, along with interferon regulatory factor 8 (IRF8). Consistent with our observed enrichment of vRNA+ cells within the Th17 compartment, the cytokine IL17F also exhibited upregulated expression in vRNA+ cells within stimulated blood samples.
Within stimulated colon T cells, only 9 genes exhibited differential expression, all of which were higher in vRNA+ cells (Figure 5F and Supplemental Table 14). The most significantly upregulated (P adj < 0.05) gene in stimulated colon cells was the long noncoding RNA ITPR1-DT, followed by an antisense transcript for SRCAP. Some genes associated with Th17 cell identity such as RORC and IL21 also trended high in stimulated vRNA+ colon T cells, but these differences were not statistically significant (P adj > 0.05). For the blood cell dataset, we also examined differential expression of surface proteins between vRNA+ and vRNA– cells. Prior to correction for multiple comparisons, 29 proteins were differentially abundant between vRNA+ cells and vRNA– cells across the entire dataset, with 21 proteins upregulated and 8 downregulated (Figure 5G and Supplemental Table 15). Following multiple comparison correction, 5 proteins were found to be significantly differentially abundant — 3 that were higher in vRNA+ cells (CD45RO, CD95, and CD54) and 2 that were higher in vRNA– cells (CD3 and CD7) (Figure 5H). These data are consistent with enrichment of vRNA+ cells in memory T cells.
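The drop from 29 nominally significant surface proteins to 5 after correction reflects how strict a family-wise correction is across a large antibody panel. The following minimal sketch shows a Bonferroni adjustment, assuming the number of tests equals the 135 surface proteins in the panel; that test count, the function names, and the P values in the usage example are assumptions for illustration and are not values from the study.

```python
def bonferroni(pvalues, n_tests=None):
    """Bonferroni-adjusted P values: multiply by the number of tests, cap at 1.

    pvalues -- dict mapping feature name to unadjusted P value
    n_tests -- number of hypotheses tested; defaults to len(pvalues)
    """
    m = len(pvalues) if n_tests is None else n_tests
    return {name: min(1.0, p * m) for name, p in pvalues.items()}


def significant(pvalues, alpha=0.05, n_tests=None):
    """Names of features whose adjusted P value is below alpha."""
    adjusted = bonferroni(pvalues, n_tests)
    return sorted(name for name, p in adjusted.items() if p < alpha)


# Made-up unadjusted P values for illustration only.
raw = {"protein_A": 0.0001, "protein_B": 0.0003, "protein_C": 0.02}
hits = significant(raw, n_tests=135)
```

In this toy case a protein with an unadjusted P value of 0.02 is nominally significant but does not survive correction across 135 tests.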

Figure 5 Identification of differentially expressed genes in vRNA+ cells. (A) Schematic overview showing experimental design. (B) Volcano plot of differentially expressed (log 2 fold change > 0.3, P adj < 0.05, Wilcoxon’s rank-sum test) genes in HIV vRNA+ cells compared with vRNA– CD4+ T cells across all samples (blood and colon, stimulated and unstimulated cells). Upregulated genes in red, downregulated genes in blue. (C–F) Volcano plots of differentially expressed genes, as in B, specifically within the indicated sample conditions. (G) Differentially expressed surface proteins between vRNA+ cells and vRNA– cells before correction for multiple testing. (H) As in G, but after Bonferroni’s P value correction for multiple testing.