Towards a Systems Immunology Approach to Understanding Correlates of Protective Immunity against HCV

Abstract

Over the past decade, tremendous progress has been made in systems biology-based approaches to studying immunity to viral infections and responses to vaccines. These approaches, which integrate multiple facets of the immune response, including transcriptomics, serology and immune functions, are now being applied to understand correlates of protective immunity against hepatitis C virus (HCV) infection and to inform vaccine development. This review focuses on recent progress in understanding immunity to HCV using systems biology, specifically transcriptomic and epigenetic studies. It also examines proposed strategies for moving towards an integrated systems immunology approach for predicting and evaluating the efficacy of the next generation of HCV vaccines.

1. Introduction

Acute hepatitis C virus (HCV) infection resolves spontaneously in approximately 30% (15–45%) of infected individuals, whereas the remaining 70% (55–85%) develop persistent infection and progressive chronic liver disease and are at risk of developing liver cancer. In 2016, the World Health Organization proposed a strategy to eliminate viral hepatitis as a public health problem by 2030. Elimination (a reduction to zero new cases in a defined geographical area) relies on the use of highly effective direct-acting antivirals (DAA) that can achieve complete cure in ~95% of treated subjects. However, elimination of HCV and potentially its eradication (a complete and permanent worldwide reduction to zero new cases) are unlikely to occur without vaccines that can limit new virus transmission, especially in high-risk populations who have reduced access to testing and treatment and who are at higher risk of reinfection. A highly immunogenic T-cell-based vaccine against HCV recently failed to prevent chronic infection in a phase 2 clinical trial in people who inject drugs (PWID). Therefore, there is an urgent need to dissect the key components of protective immunity against HCV in real-life conditions and to identify elements that were missing in the vaccine-induced immune response.

Systems immunology approaches that employ integrated interdisciplinary methods to define the interactions between the different cellular and molecular components of the immune system have become powerful tools for profiling immune responses to vaccines and viral infections. HCV represents an interesting model to apply systems immunology approaches to study immunity against a human viral infection with two dichotomous outcomes, where the correlates of protective immunity in spontaneously resolved versus chronic infection can be examined. Furthermore, chronic HCV infection can be completely cured using DAA, thus offering a unique opportunity to assess the reversal of immune exhaustion/dysfunction post-virus clearance. Finally, neither spontaneous nor DAA-mediated HCV clearance protects against reinfection. This is partly due to the large number of viral variants that are continuously produced during infection and transmitted among high-risk populations, resulting in mixed infections or superinfections. Indeed, high-risk individuals such as PWID continue to be exposed and reinfected, thus presenting an opportunity to examine the correlates of protective immunity against a highly variable virus in a natural rechallenge experiment. Recently, several studies examined the transcriptomic and epigenetic changes in the context of HCV infection with different outcomes and post-DAA-treatment. Here, I will summarize these studies and propose an integrated systems immunology approach to define correlates of protective immunity against HCV.

2. Overview and Comparison of Systems-Transcriptomic Methods

Over the past two decades, different methods have been developed and used to measure gene expression by quantifying mRNA levels in transcriptomic analysis, as well as gene expression modifications through epigenetic analysis. Each method has its own advantages and limitations. The most common methods that were used or are likely to be used in the context of HCV and their limitations are summarized in this section.

2.1. Microarrays

Microarrays are the earliest form of high-throughput technology used in transcriptomic analysis. This approach is based on the use of many probes (typically thousands), each a specific single-stranded DNA fragment corresponding to a gene of interest, that are loaded on a chip. These probes can bind fluorescently labeled complementary DNA (cDNA) reverse transcribed from mRNA samples of interest. The intensity of the fluorescence corresponds to the expression level of the corresponding transcript within the sample. However, microarrays require a relatively high mRNA input, and they can only detect specific predefined transcripts that are included on the chip.
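As a minimal illustration of how fluorescence intensities are turned into a relative expression measure, the sketch below computes a log2 ratio of probe intensities between two samples. The gene names, intensity values, background level and floor are all hypothetical, and real microarray pipelines apply additional normalization steps that are not shown here.

```python
import math

def log2_ratio(intensity_a, intensity_b, background=0.0, floor=1.0):
    """Log2 ratio of background-corrected probe intensities (sample A vs. B)."""
    # Subtract background and clamp to a small floor so the logarithm is defined.
    a = max(intensity_a - background, floor)
    b = max(intensity_b - background, floor)
    return math.log2(a / b)

# Hypothetical probe intensities (arbitrary fluorescence units) for two samples.
probes = {"GENE_A": (4100.0, 1100.0), "GENE_B": (600.0, 600.0)}
ratios = {name: log2_ratio(a, b, background=100.0) for name, (a, b) in probes.items()}
```

A positive ratio indicates higher signal in sample A and a ratio of zero indicates equal signal. A transcript with no probe on the chip simply has no entry, which is the limitation noted above.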

2.2. RNA-Sequencing (RNA-Seq)

The development of next-generation sequencing (NGS) led to its application to RNA-sequencing (RNA-seq), in which the entire transcriptome can be examined. This method is based on the direct sequencing of fragmented cDNA libraries reverse transcribed from the mRNA of the sample of interest. Sequence reads are then mapped to the genome and the data are further transformed and processed to obtain read counts that reflect expression levels within the sample. RNA-seq has largely replaced microarrays due to several key advantages. First, it provides a significantly superior dynamic range for measuring expression across a wide range of transcript levels with less input RNA, making it more suitable for rare patient samples and for the detection of transcripts of low abundance. Second, because it relies on direct sequencing rather than on the predefined probes used in microarrays, it is suitable for the detection of novel genes or transcripts of interest that may not be represented on the microarray chip. Third, the direct sequencing approach makes RNA-seq highly suitable for use in surrogate animal models of different species such as the equine hepacivirus (EqHV) model. Finally, RNA-seq data provide additional analysis opportunities for important biological information, e.g., the comparison of differential splicing across samples, functionally relevant single nucleotide polymorphism (SNP) analysis and RNA editing events. Both microarrays and RNA-seq can be performed on either total mRNA extracted from tissues (e.g., liver tissue), total peripheral blood mononuclear cells (PBMCs) or purified/sorted cells of interest (e.g., total CD8+ T cells or tetramer+ CD8+ T cells). RNA-seq analysis, performed on whole tissue or populations of cells, reflects the averaged gene expression across all cells and is thus termed ‘bulk RNA-seq’. It is useful for detecting differences between experimental conditions, disease outcomes and/or tissues.
It is now also possible to infer the relative frequencies of different cellular populations from bulk RNA-seq data in a process known as deconvolution, using tools such as CIBERSORT. However, the sensitivity of the detection of expression signals associated with a specific cellular subset is dependent upon the frequency of that subset and the levels of expression of the transcript of interest. Hence, signals from rare but key cellular populations may be missed.
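The idea of inferring cell-type frequencies from bulk data can be illustrated with a toy linear mixing model, in which each gene's bulk signal is the frequency-weighted sum of its expression in each cell type. The sketch below is not the CIBERSORT algorithm, which uses support vector regression; a simpler non-negative least-squares fit, solved by projected gradient descent, is swapped in here for clarity. The signature matrix, the cell types and the frequencies are hypothetical, and the data are noise-free.

```python
def mix(signature, fractions):
    """Bulk expression per gene: sum over cell types of fraction * expression."""
    return [sum(s * f for s, f in zip(row, fractions)) for row in signature]

def deconvolve(signature, bulk, n_iter=1000):
    """Estimate cell-type fractions by non-negative least squares.

    Uses projected gradient descent: take a gradient step on the squared
    error, clamp negative values to zero, and finally rescale to sum to 1.
    """
    n_types = len(signature[0])
    # 1 / (sum of squared entries) is a safe step size for this quadratic loss.
    step = 1.0 / sum(s * s for row in signature for s in row)
    f = [1.0 / n_types] * n_types
    for _ in range(n_iter):
        residual = [p - b for p, b in zip(mix(signature, f), bulk)]
        grad = [sum(row[k] * r for row, r in zip(signature, residual))
                for k in range(n_types)]
        f = [max(0.0, fk - step * gk) for fk, gk in zip(f, grad)]
    total = sum(f)
    return [fk / total for fk in f] if total > 0 else f

# Hypothetical signature matrix: rows are genes, columns are three cell types.
signature = [
    [10.0, 0.0, 0.0],   # marker gene of cell type 0
    [0.0, 8.0, 0.0],    # marker gene of cell type 1
    [0.0, 0.0, 12.0],   # marker gene of cell type 2 (the rare subset)
    [5.0, 5.0, 5.0],    # gene expressed equally in all types
]
true_fractions = [0.90, 0.08, 0.02]
bulk = mix(signature, true_fractions)
estimated = deconvolve(signature, bulk)
```

In this toy mixture the marker gene of the rare subset contributes only 12 × 0.02 = 0.24 units to the bulk signal, compared with 10 × 0.90 = 9.0 units for the marker of the dominant subset. This is why, once measurement noise is added, signals from rare populations are the first to be lost.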

2.3. Single Cell RNA-Seq (scRNA-Seq)

The advent of scRNA-seq has provided unprecedented opportunities for examining transcripts at the single-cell level and for defining the heterogeneity of cellular populations within a tissue or specific cellular subsets (e.g., HCV-specific CD8+ T cells). This process is based on partitioning individual cells into plates or droplets, where they are lysed and labeled with a unique identifier and then processed using procedures similar to those used for bulk RNA-seq. This method can also be coupled with analysis of the T or B cell receptors, thus providing the capacity to analyze the T or B cell repertoire and to track the expansion of specific T or B cell clones during the course of infection or in different tissue compartments. This technology is highly useful in understanding the interaction of individual cells with their microenvironment and has been successfully used to characterize the heterogeneity of macrophage populations in the liver and their evolution during different stages of fibrosis. However, the additional processing steps involved in scRNA-seq may differentially affect the viability and consequently the representation of certain cellular populations in the final dataset, leading to interpretation bias. In addition, scRNA-seq uses a low cellular and RNA input, which is associated with low RNA capture efficiency and a phenomenon known as “dropout”, occurring when a gene is observed at a moderate or even high expression level in one cell but is not detected in another cell. As a result, scRNA-seq data are sparser and exhibit higher technical noise than bulk RNA-seq data. In addition, microarrays, bulk RNA-seq and scRNA-seq are all prone to ‘batch effects’, i.e., variation introduced by non-biological factors across different experimental batches. These issues represent challenges for the computational analysis of transcriptomic data.
Several methods are currently available to correct for some of these issues (e.g., batch effect) and new bioinformatic analysis tools are constantly being developed to overcome these limitations, but they all need to be carefully considered and customized in data analysis and interpretation for each experiment. In summary, both bulk and scRNA-seq have their advantages and limitations and the choice of the method to use should carefully consider the research question and the limitations associated with each method.
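The link between low capture efficiency and dropout can be made concrete with a small simulation. The sketch below assumes a deliberately simplified model in which every mRNA molecule in a cell is captured independently with a fixed probability. The cell number, molecule count and efficiency values are hypothetical and are not taken from any specific scRNA-seq platform.

```python
import random

def capture(true_counts, efficiency, rng):
    """Keep each mRNA molecule independently with probability `efficiency`."""
    return [sum(1 for _ in range(n) if rng.random() < efficiency)
            for n in true_counts]

def zero_fraction(counts):
    """Fraction of cells in which the gene is not detected at all."""
    return sum(1 for c in counts if c == 0) / len(counts)

rng = random.Random(0)  # fixed seed so the simulation is reproducible

# Hypothetical gene present at 10 mRNA molecules in each of 1000 identical cells.
true_counts = [10] * 1000
low = capture(true_counts, 0.05, rng)    # low capture efficiency
high = capture(true_counts, 0.50, rng)   # high capture efficiency

print("undetected at 5% efficiency:", zero_fraction(low))
print("undetected at 50% efficiency:", zero_fraction(high))
```

Under this model the expected fraction of cells with no detected molecule is (1 − efficiency)^10, roughly 0.60 at 5% efficiency and 0.001 at 50%. The gene is equally expressed in every simulated cell, so the zeros are purely technical, which is the dropout behavior described above.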

2.4. Combined Transcriptomic and Proteomic Analysis Methods

Transcriptomic data require validation at the protein level. Advances in high-dimensional flow cytometry and mass cytometry (cytometry by time of flight, CyTOF) and the capacity to run high-resolution panels have provided additional dimensions for data validation, but these methods are still limited in the number of antibodies that can be combined (typically 40–50 markers). Recent methods, which have not yet been applied to HCV, combine transcriptomic and proteomic analysis. These include cellular indexing of transcriptomes and epitopes by sequencing (CITE-Seq) and RNA expression and protein sequencing (REAP-Seq). Both methods combine RNA sequencing with the use of antibodies that are labeled with a specific DNA barcode corresponding to the protein of interest and a stretch of adenine bases that serves as a starting point for RNA sequencing. The two technologies differ only in the conjugation methods used to link the DNA barcode to the antibodies. They are used to obtain quantitative and