4  AIRR-Seq

AIRR-Seq, Rep-Seq, Repertoire, VDJ and immune profiling

4.1 Immunity overview

4.1.1 Innate vs Adaptive immunity

  • Innate (Physical, Phagocytes, Dendritic cells, NK cells, ILCs, 0-12 hours)
  • Adaptive immunity (Begins ~3 days or more)
Fig 4.1: Just as resistance to disease can be innate (inborn) or acquired, the mechanisms mediating it can be correspondingly divided into innate (left) and adaptive (right), each composed of both cellular (lower half) and humoral elements (i.e. free in serum or body fluids; upper half). Adaptive mechanisms, more recently evolved, perform many of their functions by interacting with the older innate ones.
Fig 4.2: Innate and adaptive immunity time line. The mechanisms of innate immunity provide the initial defense against infections. Adaptive immune responses develop later and require the activation of lymphocytes. The kinetics of the innate and adaptive immune responses are approximations and may vary in different infections.

Source

4.1.2 Adaptive immunity

  • Humoral immunity (B cells, antibodies)
  • Cell mediated immunity (CD4 Helper T cells, CD8 Cytotoxic T cells, Macrophages)
Fig 4.3: The figure shows an adaptive immune response to the first and second encounter with the SARS-CoV-2 antigen (spike protein).

Following the first encounter, various antigens go through the process of phagocytosis and decomposition inside the antigen-presenting cells (APCs). APCs fragment the antigen into smaller peptides, which they present on their surface via receptors called major histocompatibility complex class II (MHC-II) molecules. The antigens are then presented to several types of cells in the host, among which we emphasize CD4+ T helper cells (also B and CD8+ cells). B cells, which differentiate into plasma cells, secrete antibodies that inhibit the entry of the viral particle into healthy cells. The activation of T helper cells by APCs causes them to differentiate into different subtypes with specific functions mediated by cytokine secretion and cell-to-cell contact. Th2-differentiated T helper cells help humoral responses mature by providing a second signal to B cells, mostly through IL-4 secretion and CD40/CD40L interaction. Some CD4+ cells also become T follicular helper cells (Tfh), which govern the interactions in the germinal centers that are important for the maturation of memory B cells and long-lived high-affinity antibody-producing plasma cells. Another subset of CD4+ T cells differentiates into a pool of memory T helper cells. Th1 T helper cells play a crucial role in cellular response formation. They support the MHC-I-mediated activation of CD8+ cells (CTL, cytotoxic T lymphocytes) by interacting with APCs simultaneously. Activated CTLs then act by causing apoptosis (by Fas–FasL binding) of the host cells that are infected with SARS-CoV-2. Some CTLs differentiate into memory cytotoxic T cells, which rapidly restore the CTL response on secondary antigen contact. A similar mechanism of destruction occurs when NK cells interact with the virally infected cell. They contain granules with IFNγ and TNFα in their cytoplasm, which, when secreted, cause programmed cell death. In addition, the fact that NK cell activation does not occur through MHC molecules is important because MHC is not always present on virally infected cells.

Primorac et al. (2022)

Fig 4.4: Detailed lineage tree showing development of various blood cells. Image source: Beckman Coulter.

Fig 4.5: Cells of the immune system showing a simplified lineage.

Kavathas et al. (2019), Lewis & Blutt (2019)

4.2 Immune cells

4.2.1 B cell development

Fig 4.6: B cell development.
  • The aim is to generate diverse antigen receptors, eliminate self-reactive cells, and promote foreign-reactive cells to mature B cells
  • Starts in the bone marrow as HSC followed by CLP (Common lymphoid progenitor)
  • Early pro B cell → Heavy chain random DJ rearrangement
    • If it succeeds on both chromosomes, this cell becomes Late pro B cell
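  • Late pro B cell → V to DJ rearrangement
    • Tried on one chromosome first, then on the second if the first attempt is non-productive. A productive rearrangement on either chromosome makes this cell a Large pre B cell
  • Large pre B cell → Heavy chain peptide is synthesized, pairs with surrogate light chains
    • On successful pairing, this cell proliferates
    • Each daughter cell later rearranges its own light chain, so the same heavy chain can pair with many different light chains.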
  • Small pre B cell → Undergoes light chain rearrangement
    • kappa rearrangement on one chr
    • kappa rearrangement on second chr
    • If kappa didn’t work then
    • lambda rearrangement on one chr
    • lambda rearrangement on second chr
    • When light chain pairs successfully with heavy chain → Immature B cell
  • Immature B cell
    • VDJ recombination is complete
    • Functional BCR expressed as IgM → Mature naive B cell
    • BCRs can bind to diverse ligands such as peptides, lipids, carbohydrates etc.
  • Mature naive B cell
    • The heavy chain primary transcript undergoes alternative splicing, which changes the constant region to produce either IgM or IgD with the same variable region
  • Naive B cells interact with antigen in the lymph nodes. When the antigen crosslinks the BCR, the cell is activated. An activated B cell proliferates massively, which is called clonal expansion. All clonally expanded cells derive from the same antigen-specific parent but undergo somatic hypermutation (SHM). SHM may increase antigen affinity relative to the original activated parent B cell (affinity maturation).
  • Affinity matured B cells differentiate into plasma cells. Plasma cells are antibody-secreting B cells. The first antibody produced is IgM. T cell independent differentiation only produces IgM antibodies and does not produce memory B cells.
  • T cell mediated B cell maturation: B cells interact with T cells in the lymph node, enabling Ig class switching, i.e. a change in the heavy chain constant region.

Source: Kuby immunology

Morgan & Tergaonkar (2022)

4.2.2 T cell development

Fig 4.7: T cell development.
  • TCR has two chains: alpha and beta. TCRs only bind to peptides presented on self MHC.
  • In bone marrow, HSC → CLP, which migrates to the thymus
  • Thymus is divided into peripheral cortex and central medulla. Most T cell development happens in the cortex.
  • In the thymus, the CLP starts off as double negative T cell
  • Double negative T cells: T cells with germline DNA that are negative for CD4 and CD8
    • Early double-negative thymocyte: D-J beta chain rearrangement, tried first on one allele, else on the second allele
    • Late double-negative thymocyte: V-DJ beta chain rearrangement, tried first on one allele, else on the second allele
    • CD3 receptor is expressed on cell surface
  • Double-positive T cells:
    • Express both CD4 and CD8 co-receptors.
    • Alpha chain rearrangement. V-J rearrangement is tried first with one V-J pair; if unsuccessful, further V-J pairs are tried.
    • Alpha and beta chains are expressed on the cell surface and the cell undergoes selection
      • Cells that have too strong or too weak self affinity are discarded
      • Cells whose TCR recognises self MHC class II with moderate strength → CD4+
      • Cells whose TCR recognises self MHC class I with moderate strength → CD8+
  • Single-positive: Naive CD4 or CD8 T cells
  • Memory CD4 or CD8 T cells
  • Effector CD4 or CD8 T cells

Source: Kuby immunology

De Simone et al. (2018)

4.3 VDJ recombination

Fig 4.8: Panels 1 and 2 show BCR/antibody structure and the V(D)J locus. Panel 3 shows the genomic structure of BCR and TCR chains. Image references: 1, 2, 3

V(D)J recombination is the mechanism of somatic recombination that occurs only in developing lymphocytes during the early stages of T and B cell maturation. It results in the highly diverse repertoire of antibodies/immunoglobulins and T cell receptors (TCRs) found in B cells and T cells, respectively. The process is a defining feature of the adaptive immune system.

V(D)J recombination in mammals occurs in the primary lymphoid organs (bone marrow for B cells and thymus for T cells) and in a nearly random fashion rearranges variable (V), joining (J), and in some cases, diversity (D) gene segments. The process ultimately results in novel amino acid sequences in the antigen-binding regions of immunoglobulins and TCRs that allow for the recognition of antigens from nearly all pathogens including bacteria, viruses, parasites, and worms as well as “altered self cells” as seen in cancer.
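The two sources of diversity described above (random choice of gene segments, plus trimming and random nucleotide addition at the junctions) can be illustrated with a minimal simulation. This is only a sketch: the segment names and sequences, the trimming and insertion limits, and the crude in-frame check are all assumptions made for illustration and do not correspond to real germline genes or measured recombination statistics.

```python
import random

# Toy germline segments: hypothetical sequences for illustration only,
# not real IGH/TRB gene segments.
V_SEGMENTS = {"V1": "TGTGCGAGAGA", "V2": "TGTGCGAAAGG", "V3": "TGTGCCAGCAG"}
D_SEGMENTS = {"D1": "GGTATAGCAGC", "D2": "GACTACGGTGA"}
J_SEGMENTS = {"J1": "ACTACTTTGACTACTGG", "J2": "ACTGGTTCGACCCCTGG"}


def trim(seq, max_trim, rng, end):
    """Delete 0..max_trim nucleotides from the 'right' (3') or 'left' (5') end."""
    n = rng.randint(0, min(max_trim, len(seq)))
    if n == 0:
        return seq
    return seq[:-n] if end == "right" else seq[n:]


def random_insert(max_len, rng):
    """Random non-templated nucleotides added at a junction."""
    return "".join(rng.choice("ACGT") for _ in range(rng.randint(0, max_len)))


def recombine(rng, max_trim=3, max_insert=4):
    """Return one simulated V(D)J rearrangement as a dict."""
    # Combinatorial diversity: pick one V, one D and one J segment at random.
    v_name, v_seq = rng.choice(sorted(V_SEGMENTS.items()))
    d_name, d_seq = rng.choice(sorted(D_SEGMENTS.items()))
    j_name, j_seq = rng.choice(sorted(J_SEGMENTS.items()))
    # Junctional diversity: trim segment ends, then add random nucleotides.
    v_part = trim(v_seq, max_trim, rng, "right")
    d_part = trim(trim(d_seq, max_trim, rng, "left"), max_trim, rng, "right")
    j_part = trim(j_seq, max_trim, rng, "left")
    junction = (v_part + random_insert(max_insert, rng) + d_part
                + random_insert(max_insert, rng) + j_part)
    return {"v": v_name, "d": d_name, "j": j_name, "junction": junction,
            # Crude proxy for a productive rearrangement (assumption).
            "in_frame": len(junction) % 3 == 0}


rng = random.Random(0)
rearrangements = [recombine(rng) for _ in range(1000)]
n_combinations = len(V_SEGMENTS) * len(D_SEGMENTS) * len(J_SEGMENTS)
n_unique = len({r["junction"] for r in rearrangements})
print(f"{n_combinations} V-D-J combinations, {n_unique} unique junctions in 1000 draws")
```

Even with only a handful of toy segments, the number of distinct junctions far exceeds the number of segment combinations, which is the point of junctional diversity.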

Germline organisation of the Ig locus

  • Ig proteins consist of two identical heavy chains and two identical light chains
    • H chain locus is located on Chr 14
  • The light chains can be kappa or lambda
    • Kappa chain locus on Chr 2
    • Lambda chain locus on Chr 22

4.3.0.1 BCR review

  • BCR consists of two heavy and two light chains. Heavy chain has 1 V domain and 3-4 C domains, depending on the isotype. Light chain contains 1 V and 1 C domain.
  • CDR in V domains makes contact with antigen
  • Following antigen stimulation, B cells secrete immunoglobulins that bear the same antigen binding site as the BCR

4.3.0.2 TCR review

  • Human T cells develop in the thymus and they acquire the ability to recognize foreign antigens and provide protection against many different pathogens.
  • This functional flexibility is guaranteed by the expression of highly polymorphic surface receptors called T cell receptors (TCRs).
  • TCR is composed of two different protein chains. The vast majority of human T cells express TCRs composed of α (alpha) and β (beta) chains.
  • The genes encoding alpha (TCRA) and beta (TCRB) chains are composed of multiple non-contiguous gene segments which include variable (V), diversity (D), and joining (J) segments for TCRB gene and variable (V) and joining (J) for TCRA gene.
  • The enormous diversity of T cell repertoires is generated by random combinations of germ line gene segments (combinatorial diversity) and by random addition or deletion at the junction site of the segments that have been joined (junctional diversity).
  • The sequence encoded by the V(D)J junction is called complementarity determining region 3 or CDR3. This sequence has the highest variability in both alpha and beta chains and determines the ability of a T cell to recognize an antigen.
  • The total number of possible combinations is estimated to exceed 10^18.
  • The diversity of TCRs carried by naïve T cells is referred to as the T cell repertoire
  • Exposure to an antigen drives a rapid clonal expansion of cells carrying identical TCRs to generate a population of “effector cells.”
  • After antigen clearance, a reduced number of these cells remain in the blood as “memory cells.”
Fig 4.9: Somatic V(D)J arrangement in the alpha and beta chains. (A) Genomic organization and somatic recombination of TCRB and TCRA loci. Antigen repertoire diversity is guaranteed by a recombination step that progressively rearranges V, D, and J segments for T cell receptor (TCR) beta chains and V and J segments for TCR alpha chains. This variability (combinatorial diversity) is further increased by addition or deletion of nucleotides at the junction sites (junctional diversity). (B) Productive arrangements of beta and alpha transcripts. (C) Organization of TCR. TCR is composed of two subunits, TCR alpha and TCR beta, each organized into a constant region and a variable region responsible for antigen recognition.
Fig 4.10: Differences between BCR and TCR receptors.

4.4 Antibodies

Fig 4.11: Structure of an antibody
  • Antibodies are secreted by plasma B cells
  • Consist of 2 identical heavy and 2 identical light chains
    • Light chain is either kappa or lambda. There are 4 subtypes of lambda with lambda 1 as most common.
    • Heavy chain is either mu, gamma, alpha, delta or epsilon
      • Heavy chain defines the isotype (IgG, IgA etc)
  • Each chain has a variable domain and constant domain
    • Light chain has 50% variable and 50% constant domain
    • Heavy chain has 25% variable and 75% constant domain
    • The antigen-binding site formed by the variable domains is called the paratope; it interacts with epitopes on the antigen
      • The variable region has 3 CDRs (complementarity determining regions) that are hypervariable. CDR3 is the most variable.
      • Peritope?
  • The variable region of the heavy chain is encoded by V, D and J gene segments
  • The variable region of the light chain is encoded by V and J gene segments
  • Antibody isotypes (classes) are defined by the heavy chain: IgG, IgD, IgE, IgA (dimer), IgM (pentamer)
  • Types of antibodies (further work)

Source: Kuby immunology

  • Chap 6: The organisation and expression of lymphocyte receptor genes
  • Chap 8: T-cell development
  • Chap 9: B-cell development
  • Chap 10: T-cell activation, helper subset differentiation and memory
  • Chap 11: B-cell activation, differentiation and memory generation
  • Chap 12: Effector responses: antibody and cell-mediated immunity

4.5 AIRR

Single cell immune repertoire analysis overview Irac et al. (2024)

4.5.1 Experimental design

  • In addition to V(D)J recombination, random nucleotides are inserted at the V(D)J junctions. B cells further diversify their receptors through random mutations introduced by somatic hypermutation.
  • Source of B and T cells should be carefully considered
    • Usually PBMCs or specific cell types
  • Consider DNA vs RNA from cells
    • DNA does not give isotype information due to lack of splicing
    • Plasma cells have more RNA than naive cells, so if mixed together, plasma cells may be overrepresented.
  • Assignment of clones
    • Assignment of B cells to their progenitor B cell is called clonotype family assignment. Clonal lineage is determined based on TCR/BCR sequence. Sequences are grouped by V(D)J gene usage and CDR3 length, followed by calculating the Hamming distance between them.
    • All cells that are derived from the same naive B cell are grouped as one clonotype
    • Hamming distance cut-offs are applied among sequences sharing the same V and J genes
    • Optionally CDR3 length is also taken into account
    • Other k-mer based clustering methods are also used
  • Sampling depth and replicates
    • Expect low clonal overlap between subsequent sampling from same individual
    • Saturation curves (Species accumulation curve) to estimate sampling depth
    • Deeper sequencing helps to discover more clones
      • Sequence 5-10X reads relative to the number of cells sampled
    • Ensure reliability with biological replicates
  • Results can vary between sequencing protocols
  • UMIs help to resolve PCR duplicates and sequencing errors.
    • UMIs can be used for error correction and estimating sample contamination.
    • Low sequencing depth will result in very few reads per UMI
    • With UMIs, sequence 100X or more reads relative to the number of cells sampled
  • D region reconstruction is unreliable
  • A single B cell lineage (clone) should have only one V gene
  • Bulk AIRR loses information on VH-VL pairing
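The clone assignment steps in the list above (group by V gene, J gene and CDR3 length, then apply a Hamming distance cut-off) can be sketched as follows. This is a minimal illustration, not the method of any particular tool: the record keys (`id`, `v_gene`, `j_gene`, `cdr3`), the single-linkage strategy and the 0.15 normalised threshold are assumptions chosen for the example, and the sequences are made up.

```python
from collections import defaultdict


def hamming(a, b):
    """Number of mismatching positions between two equal-length strings."""
    if len(a) != len(b):
        raise ValueError("Hamming distance needs equal-length sequences")
    return sum(x != y for x, y in zip(a, b))


def assign_clones(records, threshold=0.15):
    """Return {sequence id: clone id} using single-linkage clustering.

    records: iterable of dicts with keys id, v_gene, j_gene, cdr3 (nucleotides).
    threshold: maximum length-normalised Hamming distance for linking two
               sequences (illustrative value, not a recommendation).
    """
    # Step 1: partition by V gene, J gene and CDR3 length.
    groups = defaultdict(list)
    for rec in records:
        groups[(rec["v_gene"], rec["j_gene"], len(rec["cdr3"]))].append(rec)

    clone_of = {}
    next_clone = 0
    # Step 2: within each partition, link sequences within the threshold.
    for key in sorted(groups):
        members = groups[key]
        parent = list(range(len(members)))  # union-find forest

        def find(i):
            while parent[i] != i:
                parent[i] = parent[parent[i]]  # path compression
                i = parent[i]
            return i

        for i in range(len(members)):
            for j in range(i + 1, len(members)):
                dist = hamming(members[i]["cdr3"], members[j]["cdr3"])
                if dist / max(len(members[i]["cdr3"]), 1) <= threshold:
                    parent[find(i)] = find(j)

        # Step 3: give every connected component its own clone id.
        root_to_clone = {}
        for i, rec in enumerate(members):
            root = find(i)
            if root not in root_to_clone:
                root_to_clone[root] = next_clone
                next_clone += 1
            clone_of[rec["id"]] = root_to_clone[root]
    return clone_of


# Made-up example records.
records = [
    {"id": "s1", "v_gene": "IGHV1-2", "j_gene": "IGHJ4", "cdr3": "TGTGCGAGAGATTACTGG"},
    {"id": "s2", "v_gene": "IGHV1-2", "j_gene": "IGHJ4", "cdr3": "TGTGCGAGAGACTACTGG"},
    {"id": "s3", "v_gene": "IGHV1-2", "j_gene": "IGHJ4", "cdr3": "TGTGCAAAAGGGTTTTGG"},
    {"id": "s4", "v_gene": "IGHV3-23", "j_gene": "IGHJ4", "cdr3": "TGTGCGAGAGATTACTGG"},
]
clones = assign_clones(records)
print(clones)  # {'s1': 0, 's2': 0, 's3': 1, 's4': 2}
```

Here s1 and s2 differ at one CDR3 position and are merged, s3 is too distant, and s4 is kept apart because it uses a different V gene even though its CDR3 is identical to s1.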

AIRR-Seq experiment guide and best practices Eugster et al. (2022), Antibody repertoire challenges and high throughput sequencing Georgiou et al. (2014), Bioinformatic analysis of adaptive immune repertoires Greiff et al. (2015), Ecosystem of machine learning analysis of adaptive immune repertoires Pavlović et al. (2021), BCR AIRR guidelines Yaari & Kleinstein (2015), Benchmarking of VDJ methods Barennes et al. (2021), Breden et al. (2017), Vander Heiden et al. (2018)
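The role of UMIs in removing PCR duplicates and sequencing errors, noted in the experimental design list, can be illustrated with a small consensus-building sketch. It assumes reads have already been trimmed to equal length within a UMI, and the `min_reads` cut-off and example reads are hypothetical; production pipelines handle alignment, quality scores and UMI errors, which are ignored here.

```python
from collections import Counter, defaultdict


def umi_consensus(reads, min_reads=2):
    """Collapse reads sharing a UMI into one consensus sequence.

    reads: iterable of (umi, sequence) pairs; sequences within a UMI are
           assumed to have equal length (already aligned/trimmed).
    min_reads: UMIs supported by fewer reads are dropped (illustrative cut-off).
    """
    by_umi = defaultdict(list)
    for umi, seq in reads:
        by_umi[umi].append(seq)

    consensus = {}
    for umi, seqs in by_umi.items():
        if len(seqs) < min_reads:
            continue  # too few reads to build a reliable consensus
        # Majority vote per position removes sporadic PCR/sequencing errors.
        consensus[umi] = "".join(
            Counter(column).most_common(1)[0][0] for column in zip(*seqs)
        )
    return consensus


# Made-up reads: UMI "AAC" has one read with a single-base error.
reads = [
    ("AAC", "TGTGCGAGA"),
    ("AAC", "TGTGCGAGA"),
    ("AAC", "TGTGCTAGA"),
    ("GTT", "TGTGCCAGC"),
    ("GTT", "TGTGCCAGC"),
    ("CCA", "TGTAAAAGA"),
]
collapsed = umi_consensus(reads)
print(collapsed)
```

Six reads collapse to two molecules: the error in the third "AAC" read is outvoted, and the singleton UMI "CCA" is discarded. This also shows why low depth is a problem, since with few reads per UMI most UMIs fail the support cut-off.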

4.5.2 Diversity analyses

  • Compare immune repertoire diversity
    • One of the aims of AIRR analyses is to quantify and compare diversity of immune repertoires
    • Repertoires are not comparable based on frequency because Ab sequences do not overlap
    • Diversity indices solve this problem by mapping frequency to a new coordinate system (Renyi entropy)
      • Many different diversity indices: Species richness, Shannon entropy, Simpson’s index, Berger-Parker index
      • Single diversity indices are insufficient to capture sequence frequency space
      • Renyi entropy is difficult to interpret biologically
      • Hill diversity?
      • Diversity profiles better than indices?
  • Public clones are clonotypes shared between samples
    • Are certain clones emerging by chance or by antigen interaction?
    • IGoR: Inference and Generation of Repertoires
    • Predicting TCR public clones by modelling VDJ recombination
  • How similar are sequences within a repertoire?
    • Network analysis based on a distance matrix (Levenshtein distance)
    • During an immune response, the degree distribution of the repertoire network follows a power law (a few nodes have many connections)
      • naive repertoires follow exponential distribution
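The relationship between the indices listed above can be made concrete with Hill numbers, which express diversity as an "effective number of clones" for each order q; the natural log of a Hill number is the Rényi entropy of the same order. The sketch below computes a small diversity profile from clone counts. The example counts and the chosen orders are assumptions for illustration only.

```python
import math


def hill_number(counts, q):
    """Hill diversity of order q for a list of clone counts.

    q = 0 gives species richness, q = 1 the exponential of Shannon entropy,
    q = 2 the inverse Simpson index and q = inf the inverse Berger-Parker index.
    """
    total = sum(counts)
    p = [c / total for c in counts if c > 0]
    if q == 1:  # limit q -> 1
        return math.exp(-sum(x * math.log(x) for x in p))
    if math.isinf(q):  # limit q -> infinity
        return 1 / max(p)
    return sum(x ** q for x in p) ** (1 / (1 - q))


def diversity_profile(counts, orders=(0, 1, 2, math.inf)):
    """Hill numbers across several orders; plotting these gives a diversity profile."""
    return {q: hill_number(counts, q) for q in orders}


# Made-up clone counts: an even repertoire and one dominated by a single clone.
even = [25, 25, 25, 25]
skewed = [97, 1, 1, 1]
even_profile = diversity_profile(even)
skewed_profile = diversity_profile(skewed)
print(even_profile)
print(skewed_profile)
```

Both repertoires have the same richness (q = 0), but the skewed one drops towards an effective size of about one clone as q increases, which is why a profile over several orders is more informative than any single index.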

4.5.3 Lineage tracing

  • Phylogenetics and lineage tracing
  • Infer antibody evolution in response to antigen
  • Detect selection on B cell lineage
  • Quantify dynamics of affinity maturation
  • Reconstruct evolution of broadly neutralizing antibodies
  • Comparison of tree topologies is challenging

Tracing antibody repertoire evolution by phylogeny Yermanos et al. (2018), Using B cell receptor lineage to predict affinity Ralph & Matsen IV (2020), Phylogenetic analysis of migration, differentiation and class switching in B cells Hoehn et al. (2022)
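As a rough intuition for lineage tracing, the sketch below connects the members of one clone into a tree rooted at the germline sequence, attaching each sequence to its nearest already-placed relative (a minimum spanning tree built with Prim's algorithm on Hamming distances). This is a deliberately simple stand-in and not what the cited methods do; they use parsimony or likelihood-based phylogenetics. The sequences and names are made up and assumed to be aligned and of equal length.

```python
def hamming(a, b):
    """Number of mismatching positions between two aligned sequences."""
    return sum(x != y for x, y in zip(a, b))


def lineage_tree(germline, sequences):
    """Return {child: parent} edges of a minimum spanning tree rooted at germline.

    germline: inferred unmutated sequence of the clone.
    sequences: dict mapping a name to an aligned sequence of the same length.
    """
    nodes = {"germline": germline, **sequences}
    in_tree = ["germline"]
    remaining = sorted(sequences)
    parent = {}
    while remaining:
        # Prim's step: attach the remaining sequence closest to the tree.
        # Ties are broken alphabetically by child, then by parent name.
        _, child, par = min(
            (hamming(nodes[c], nodes[t]), c, t)
            for c in remaining
            for t in in_tree
        )
        parent[child] = par
        in_tree.append(child)
        remaining.remove(child)
    return parent


germline = "ACGTACGTAC"
clone_members = {
    "a": "ACGTACGTAT",  # 1 mutation from germline
    "b": "ACGTACGAAT",  # 1 further mutation on top of a
    "c": "TCGTACGTAC",  # independent single mutation
}
tree = lineage_tree(germline, clone_members)
print(tree)  # {'a': 'germline', 'b': 'a', 'c': 'germline'}
```

The resulting edges suggest that b descends from a while c forms a separate branch, which is the kind of structure used to reason about the order in which mutations accumulated during affinity maturation.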

4.6 Tutorials

4.6.1 Videos

The YouTube channels AIRR Community and The Antibody Society are good sources of information.

4.7 Tools

Tools related to VDJ analyses.

Collection of VDJ analyses tools

Overview of methods for TCR repertoire analysis Rosati et al. (2017), Benchmarking of BCR reconstruction from single-cell tools Andreani et al. (2022), Benchmarking immunoinformatic tools for the analysis of antibody repertoire sequences Smakaj et al. (2020)

General

  • Enclone (10X, Shell) SHM-aware clonotyping, phylogenetic/lineage analysis, multiple sequence alignment
  • Immcantation (Shell) QC, UMI processing, clonal clustering, germline reconstruction, lineage topology, repertoire diversity, mutation profiling
  • airrflow nf-core nextflow airr pipeline using Immcantation
Fig 4.12: AIRR nextflow pipeline using Immcantation
  • ImmuneML (Shell/GUI) Ecosystem for machine learning analysis of adaptive immune receptor repertoires (Pavlović et al. (2021))

Reads to VDJ data

Fig 4.13: MiXCR workflow

Downstream analyses

4.7.1 Immunarch

  • Exploratory data analyses
    • Number of clones per sample, filter low quality clones/samples?
    • Distribution of CDR3 lengths by sample (nucleotide or AA)
    • Distribution of abundances
    • Downsample data to make samples comparable. Only downsample datasets with sufficient depth
    • Public clonotype analysis
      • Estimate similarity of samples using number of shared clonotypes
      • Number of shared clonotypes (best for downsampled data)
      • Jaccard index
      • Morisita-Horn index
      • Visualised using heatmaps
  • Clonality analyses
    • Estimate and compare differences in abundances of clonotypes between samples
    • Compare proportions of the most and least abundant clonotypes
    • Relative abundance
  • Diversity analyses
    • Compare diversity of clonotypes in and between samples
    • Rarefaction (models, extrapolates data)
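The overlap measures named in the public clonotype analysis above can be written out directly. The sketch below is plain Python, not the immunarch API (immunarch is an R package); it assumes each sample is a dict mapping a clonotype (here a made-up CDR3 amino acid string) to its count. Jaccard uses only presence or absence, while Morisita-Horn also weighs abundances.

```python
def jaccard(a, b):
    """Shared clonotypes divided by all distinct clonotypes in both samples."""
    shared = set(a) & set(b)
    union = set(a) | set(b)
    return len(shared) / len(union) if union else 0.0


def morisita_horn(a, b):
    """Morisita-Horn similarity (0 = no overlap, 1 = identical proportions)."""
    total_a, total_b = sum(a.values()), sum(b.values())
    if total_a == 0 or total_b == 0:
        return 0.0
    cross = sum(a[k] * b[k] for k in set(a) & set(b))
    simpson_a = sum(v * v for v in a.values()) / total_a ** 2
    simpson_b = sum(v * v for v in b.values()) / total_b ** 2
    return 2 * cross / ((simpson_a + simpson_b) * total_a * total_b)


# Made-up samples: clonotype -> count.
sample_1 = {"CASSL": 10, "CASSP": 5, "CAWSV": 1}
sample_2 = {"CASSL": 8, "CASSQ": 8}

j = jaccard(sample_1, sample_2)
mh = morisita_horn(sample_1, sample_2)
print(f"Jaccard: {j:.2f}, Morisita-Horn: {mh:.2f}")
```

In this example only one of four clonotypes is shared, so Jaccard is low, but the shared clonotype is abundant in both samples, so Morisita-Horn is considerably higher. A matrix of such pairwise values across samples is what would be shown as a heatmap.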

AIRR analyses using immunarch

4.8 Databases

References

Andreani, T., Slot, L. M., Gabillard, S., Struebing, C., Reimertz, C., Yaligara, V., Bakker, A. M., Olfati-Saber, R., Toes, R. E., Scherer, H. U., et al. (2022). Benchmarking computational methods for b-cell receptor reconstruction from single-cell RNA-seq data. NAR Genomics and Bioinformatics, 4(3), lqac049. https://academic.oup.com/nargab/article/4/3/lqac049/6643029
Barennes, P., Quiniou, V., Shugay, M., Egorov, E. S., Davydov, A. N., Chudakov, D. M., Uddin, I., Ismail, M., Oakes, T., Chain, B., et al. (2021). Benchmarking of t cell receptor repertoire profiling methods reveals large systematic biases. Nature Biotechnology, 39(2), 236–245. https://www.nature.com/articles/s41587-020-0656-3
Breden, F., Luning Prak, E. T., Peters, B., Rubelt, F., Schramm, C. A., Busse, C. E., Vander Heiden, J. A., Christley, S., Bukhari, S. A. C., Thorogood, A., et al. (2017). Reproducibility and reuse of adaptive immune receptor repertoire data. Frontiers in Immunology, 8, 1418. https://www.frontiersin.org/journals/immunology/articles/10.3389/fimmu.2017.01418/full
Collins, A. M., Ohlin, M., Corcoran, M., Heather, J. M., Ralph, D., Law, M., Martínez-Barnetche, J., Ye, J., Richardson, E., Gibson, W. S., et al. (2024). AIRR-c IG reference sets: Curated sets of immunoglobulin heavy and light chain germline genes. Frontiers in Immunology, 14, 1330153. https://www.frontiersin.org/journals/immunology/articles/10.3389/fimmu.2023.1330153/full
De Simone, M., Rossetti, G., & Pagani, M. (2018). Single cell t cell receptor sequencing: Techniques and future challenges. Frontiers in Immunology, 9, 1638. https://www.frontiersin.org/journals/immunology/articles/10.3389/fimmu.2018.01638/full
Eugster, A., Bostick, M. L., Gupta, N., Mariotti-Ferrandiz, E., Kraus, G., Meng, W., Soto, C., Trück, J., Stervbo, U., Prak, E. T. L., et al. (2022). AIRR community guide to planning and performing AIRR-seq experiments. Methods and Protocols, 261. https://library.oapen.org/bitstream/handle/20.500.12657/57008/1/978-1-0716-2115-8.pdf
Georgiou, G., Ippolito, G. C., Beausang, J., Busse, C. E., Wardemann, H., & Quake, S. R. (2014). The promise and challenge of high-throughput sequencing of the antibody repertoire. Nature Biotechnology, 32(2), 158–168. https://www.nature.com/articles/nbt.2782
Greiff, V., Miho, E., Menzel, U., & Reddy, S. T. (2015). Bioinformatic and statistical analysis of adaptive immune repertoires. Trends in Immunology, 36(11), 738–749. https://www.cell.com/trends/immunology/fulltext/S1471-4906(15)00223-9
Hoehn, K. B., Pybus, O. G., & Kleinstein, S. H. (2022). Phylogenetic analysis of migration, differentiation, and class switching in b cells. PLoS Computational Biology, 18(4), e1009885. https://journals.plos.org/ploscompbiol/article?id=10.1371/journal.pcbi.1009885
Irac, S. E., Soon, M. S. F., Borcherding, N., & Tuong, Z. K. (2024). Single-cell immune repertoire analysis. Nature Methods, 1–16. https://www.nature.com/articles/s41592-024-02243-4
Kavathas, P. B., Krause, P. J., & Ruddle, N. H. (2019). Organization and cells of the immune system. In P. J. Krause, P. B. Kavathas, & N. H. Ruddle (Eds.), Immunoepidemiology (pp. 21–38). Springer International Publishing. https://doi.org/10.1007/978-3-030-25553-4_2
Lewis, D. E., & Blutt, S. E. (2019). 2 - organization of the immune system. In R. R. Rich, T. A. Fleisher, W. T. Shearer, H. W. Schroeder, A. J. Frew, & C. M. Weyand (Eds.), Clinical immunology (fifth edition) (Fifth Edition, pp. 19–38.e1). Elsevier. https://doi.org/10.1016/B978-0-7020-6896-6.00002-8
Morgan, D., & Tergaonkar, V. (2022). Unraveling b cell trajectories at single cell resolution. Trends in Immunology, 43(3), 210–229. https://doi.org/10.1016/j.it.2022.01.003
Pavlović, M., Scheffer, L., Motwani, K., Kanduri, C., Kompova, R., Vazov, N., Waagan, K., Bernal, F. L., Costa, A. A., Corrie, B., et al. (2021). The immuneML ecosystem for machine learning analysis of adaptive immune receptor repertoires. Nature Machine Intelligence, 3(11), 936–944. https://www.nature.com/articles/s42256-021-00413-z
Primorac, D., Vrdoljak, K., Brlek, P., Pavelić, E., Molnar, V., Matišić, V., Erceg Ivkošić, I., & Parčina, M. (2022). Adaptive immune responses and immunity to SARS-CoV-2. Frontiers in Immunology, 13, 2035. https://www.frontiersin.org/journals/immunology/articles/10.3389/fimmu.2022.848582/full
Ralph, D. K., & Matsen IV, F. A. (2020). Using b cell receptor lineage structures to predict affinity. PLOS Computational Biology, 16(11), e1008391. https://journals.plos.org/ploscompbiol/article?id=10.1371/journal.pcbi.1008391
Rosati, E., Dowds, C. M., Liaskou, E., Henriksen, E. K. K., Karlsen, T. H., & Franke, A. (2017). Overview of methodologies for t-cell receptor repertoire analysis. BMC Biotechnology, 17(1), 1–16. https://bmcbiotechnol.biomedcentral.com/articles/10.1186/s12896-017-0379-9
Smakaj, E., Babrak, L., Ohlin, M., Shugay, M., Briney, B., Tosoni, D., Galli, C., Grobelsek, V., D’Angelo, I., Olson, B., et al. (2020). Benchmarking immunoinformatic tools for the analysis of antibody repertoire sequences. Bioinformatics, 36(6), 1731–1739. https://academic.oup.com/bioinformatics/article/36/6/1731/5686386
Vander Heiden, J. A., Marquez, S., Marthandan, N., Bukhari, S. A. C., Busse, C. E., Corrie, B., Hershberg, U., Kleinstein, S. H., Matsen IV, F. A., Ralph, D. K., et al. (2018). AIRR community standardized representations for annotated immune repertoires. Frontiers in Immunology, 9, 2206. https://www.frontiersin.org/journals/immunology/articles/10.3389/fimmu.2018.02206/full
Yaari, G., & Kleinstein, S. H. (2015). Practical guidelines for b-cell receptor repertoire sequencing analysis. Genome Medicine, 7, 1–14. https://link.springer.com/article/10.1186/s13073-015-0243-2
Yermanos, A. D., Dounas, A. K., Stadler, T., Oxenius, A., & Reddy, S. T. (2018). Tracing antibody repertoire evolution by systems phylogeny. Frontiers in Immunology, 9, 2149. https://www.frontiersin.org/articles/10.3389/fimmu.2018.02149/full