2  Gene expression

Single-cell RNA-Seq experiments and analyses

2.1 Overview

Current best practices in scRNA-Seq

  • Perform QC by finding outlier peaks in the number of genes, the count depth and the fraction of mitochondrial reads. Consider these covariates jointly instead of separately.
  • Be as permissive of QC thresholding as possible, and revisit QC if downstream clustering cannot be interpreted.
  • If the distributions of QC covariates differ between samples, determine QC thresholds separately for each sample to account for differences in sample quality.
  • We recommend scran for normalization of non-full-length datasets. An alternative is to evaluate normalization approaches via scone especially for plate-based datasets. Full-length scRNA-seq protocols can be corrected for gene length using bulk methods.
  • There is no consensus on scaling genes to 0 mean and unit variance. We prefer not to scale gene expression.
  • Normalized data should be log(x+1)-transformed for use with downstream analysis methods that assume data are normally distributed.
  • Regress out biological covariates only for trajectory inference and if other biological processes of interest are not masked by the regressed out biological covariate.
  • Regress out technical and biological covariates jointly rather than serially.
  • Plate-based dataset pre-processing may require regressing out counts, normalization via non-linear normalization methods or downsampling.
  • We recommend performing batch correction via ComBat when cell type and state compositions between batches are consistent
  • Data integration and batch correction should be performed by different methods. Data integration tools may over-correct simple batch effects.
  • Users should be cautious of signals found only after expression recovery. Exploratory analysis may be best performed without this step.
  • We recommend selecting between 1,000 and 5,000 highly variable genes depending on dataset complexity.
  • Feature selection methods that use gene expression means and variances cannot be used when gene expression values have been normalized to zero mean and unit variance, or when residuals from model fitting are used as normalized expression values. Thus, one must consider what pre-processing to perform before selecting HVGs.
  • Dimensionality reduction methods should be considered separately for summarization and visualization.
  • We recommend UMAP for exploratory visualization; PCA for general purpose summarization; and diffusion maps as an alternative to PCA for trajectory inference summarization.
  • PAGA with UMAP is a suitable alternative to visualize particularly complex datasets.
  • Use measured data for statistical testing, corrected data for visual comparison of data and reduced data for other downstream analysis based on finding the underlying biological data manifold.
  • We recommend clustering by Louvain community detection on a single-cell KNN graph.
  • Clustering does not have to be performed at a single resolution. Subclustering particular cell clusters is a valid approach to focus on more detailed substructures in a dataset.
  • Do not use marker gene P-values to validate a cell-identity cluster, especially when the detected marker genes do not help to annotate the community. P-values may be inflated.
  • Note that marker genes for the same cell-identity cluster may differ between datasets purely due to dataset cell type and state compositions.
  • If relevant reference atlases exist, we recommend using automated cluster annotation combined with data-derived marker-gene-based manual annotation to annotate clusters.
  • Consider that statistical tests over changes in the proportion of a cell-identity cluster between samples are dependent on one another.
  • Inferred trajectories do not have to represent a biological process. Further sources of evidence should be collected to interpret a trajectory.
  • DE testing should not be performed on corrected data (denoised, batch corrected, etc.), but instead on measured data with technical covariates included in the model.
  • Users should not rely on DE testing tools to correct models with confounded covariates. Model specification should be performed carefully ensuring a full-rank design matrix.
  • We recommend using MAST or limma for DE testing.
  • Users should be wary of uncertainty in the inferred regulatory relationships. Modules of genes that are enriched for regulatory relationships will be more reliable than individual edges.

Luecken & Theis (2019)
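The following is a minimal scanpy sketch of the workflow these recommendations describe. The input file, thresholds, and parameter values are illustrative assumptions, and the scran/ComBat steps are replaced by simpler defaults for brevity.

```python
# Minimal scanpy sketch of the workflow above: joint QC covariates, permissive
# thresholds, log1p transformation, HVG selection, PCA, KNN graph, Louvain
# community detection, and UMAP. File name and cutoffs are assumptions.
import scanpy as sc

adata = sc.read_h5ad("counts.h5ad")  # hypothetical raw count matrix

# QC covariates: number of genes, count depth, fraction of mitochondrial reads
adata.var["mt"] = adata.var_names.str.startswith("MT-")
sc.pp.calculate_qc_metrics(adata, qc_vars=["mt"], percent_top=None,
                           log1p=False, inplace=True)
adata = adata[(adata.obs["n_genes_by_counts"] > 200)
              & (adata.obs["pct_counts_mt"] < 20)].copy()  # permissive cutoffs

# Normalization (scran is recommended in the text; simple library-size scaling
# is shown here) followed by log(x+1) transformation
sc.pp.normalize_total(adata)
sc.pp.log1p(adata)

# 1,000-5,000 highly variable genes depending on dataset complexity
sc.pp.highly_variable_genes(adata, n_top_genes=2000)

# PCA for summarization, KNN graph, Louvain clustering (requires the louvain
# package), UMAP for exploratory visualization
sc.pp.pca(adata, n_comps=50)
sc.pp.neighbors(adata, n_neighbors=15)
sc.tl.louvain(adata)
sc.tl.umap(adata)
```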

Best practices for single-cell analysis across modalities

Heumos et al. (2023)

What information should be included in an scRNA-Seq publication?

Füllgrabe et al. (2020)

Open problems in single-cell analysis

Luecken et al. (2025)

2.2 Experimental design

Experimental Considerations for Single-Cell RNA Sequencing Approaches

Overview of step-wise approach to designing single-cell analysis workflows. RNA integrity number (RIN); Reads per cell (RPC).


Nguyen et al. (2018)

How many reads are needed per cell? Sequencing depth?

Given a fixed budget, sequencing as many cells as possible at approximately one read per cell per gene is optimal, both theoretically and experimentally.

Zhang et al. (2020)
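A back-of-envelope illustration of this rule, with an assumed budget and gene count:

```python
# Budget split under the "one read per cell per gene" rule (Zhang et al. 2020).
# The budget and gene count are illustrative assumptions.
total_reads = 1_000_000_000          # sequencing budget: 1e9 reads
n_genes = 20_000                     # approximate genes in the reference

reads_per_cell = n_genes             # ~1 read per cell per gene
n_cells = total_reads // reads_per_cell
print(f"~{n_cells:,} cells at ~{reads_per_cell:,} reads/cell")  # ~50,000 cells
```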

2.2.1 Batch design, number of cells

Avoid technical biases.

Experimental design examples. In the confounded design, cells are isolated from each sample onto separate plates, processed at potentially different times and the two groups (indicated by different colors) are sequenced on separate lanes of the sequencer. In the balanced design on the right, all samples are evenly distributed across all stages of the experiment, thus reducing the sources of technical variation in the experiment.


Deciding appropriate cell numbers

Estimate of cells required for experiments with various parameters. (A) The plot shows the log10(#Cells) required to capture at least 50 cell types based on the parameters on the X- and Y-axes. (B) The plot shows the log10(#Cells) required to capture the number of cells on the Y-axis if the population consists of 20 cell types.


Baran-Gale et al. (2018)

2.2.2 Sequencing depth

While 250,000 reads per cell are sufficient for accuracy, 1 million reads per cell is a good target for saturated gene detection.

Svensson et al. (2017)

2.3 Methods and kits

Common methods for single-cell RNA-seq are based on microfluidics, droplets, microwells, or FACS sorting into plates. The most popular platforms are 10x Genomics Chromium, Drop-seq, inDrop, Seq-Well, and SMART-seq2/3. Parse Biosciences Evercode, which uses combinatorial split-pool barcoding rather than droplets, has recently emerged as an instrument-free alternative.

Droplet-based methods are high-throughput and cost-effective, but they typically capture only the 3’ or 5’ end of transcripts and have lower sensitivity per cell. Plate-based methods like SMART-seq2/3 provide full-length transcript coverage and higher sensitivity but are lower throughput and more expensive per cell.

How do 10X Genomics and Parse compare?

  • Hardware/cost: 10x needs a Chromium Controller; Parse doesn’t (lower setup cost) but is more hands‑on. 10x prone to GEM wetting/clogging; Parse avoids this.
  • Workflow: 10x ≈3 days and more flexible; Parse ≥4 days with a long uninterrupted day. Parse supports fixation/storage without extra hardware; 10x fixation needs another instrument.
  • Multiplexing: Built‑in with Parse; 10x requires add‑on reagents/steps (e.g., hashing).
  • Input/recovery: Parse needs ≥100k cells/sample; 10x can run ~800 cells. 10x generally higher, more consistent recovery—better for rare populations.
  • Read composition/ambient RNA: Parse slightly higher mitochondrial %; 10x much higher ribosomal/lncRNA (more ambient RNA). Parse washes reduce ambient RNA.
  • Doublets: This study—10x with hashing ~14%; Parse ~31% (WT) and ~21% (mini); literature mixed by tissue.
  • Sensitivity/reproducibility: Parse detects ~2× more genes at similar depth but shows higher variability and batch effects; 10x more reproducible and simpler downstream (less batch correction).
  • Cell‑type resolution (thymus): 10x cleanly resolves major subsets and DP subtypes; Parse struggled (aberrant Cd3d/Cd3g, missed DP‑A, poor SP‑CD4 vs SP‑CD8 separation).
  • Bottom line: Choose 10x for reliability, recovery, and annotation with low input; choose Parse for no instrument, integrated multiplexing, fixation flexibility, and higher gene detection at the cost of longer workflow and variable data.

Filippov et al. (2024)

  • Overall: Both produced high-quality PBMC data with consistent replicates.
  • Efficiency: 10x ≈2× higher cell recovery and ~13% more valid reads; Parse needs more input and deeper sequencing (more invalid barcodes).
  • Multiplets/QC: Parse had lower multiplet rates → fewer cells discarded than 10x.
  • Sensitivity (20k reads/cell): Parse detected ~1.2× more genes in T/NK/B; no advantage in monocytes.
  • Biases: 10x enriched for ribosomal protein-coding genes; 10x GC bias varies by chemistry.
  • Downstream: Parse improved clustering/rare-cell detection but showed weaker marker-gene expression—reference-based annotation recommended.
  • Use-cases: Parse for high-throughput multiplexing and low-expression genes; 10x for higher recovery/valid reads and robust marker quantification; overall quality comparable.

2.4 Mapping and Quantification

2.4.1 CellRanger

  • Process Chromium data
  • BCL to FASTQ
  • FASTQ to cellxgene counts
  • Feature barcoding
  • CellRanger

2.4.2 Kallisto Bustools

  • 10x, inDrop and Drop-seq
  • Generate cellxgene, cellxtranscript matrix
  • RNA velocity data
  • Feature barcoding
  • QC reports
  • BUSTools

Melsted et al. (2019)

2.4.3 Salmon Alevin

  • Drop-seq, 10x-Chromium v1/2/3, inDropV2, CELSeq 1/2, Quartz-Seq2, sci-RNA-seq3
  • Generate cellxgene matrix
  • Alevin

2.4.4 Nextflow nf-core rnaseq

  • Bulk RNA-Seq, SMART-Seq
  • QC, trimming, UMI demultiplexing, mapping, quantification
  • cellxgene matrix
  • nf-core scrnaseq

2.5 Background correction

Identification and correction of cell-free (ambient) RNA background contamination in single-cell RNA-seq data.

Accuracy of computational background noise estimation. A Estimated background noise levels per cell based on genetic variants (gray) and different computational tools. B Taking the genotype-based estimates as ground truth, root mean squared logarithmic error (RMSLE) and Kendall rank correlation serve as evaluation metrics for cell-wise background noise estimates of different methods. Low RMSLE values indicate high similarity between estimated values and the assumed ground truth. High values of Kendall’s τ correspond to a good representation of cell-to-cell variability in the estimated values.


Janssen et al. (2023)

Tools

CellBender is slow when run on a CPU; running it on a GPU is recommended.

2.6 Doublet detection

Summary of doublet detection tools.


The methods include doubletCells, Scrublet, cxds, bcds, hybrid, Solo, DoubletDetection, DoubletFinder, and DoubletDecon. Evaluation was conducted using 16 real scRNA-seq datasets with experimentally annotated doublets and 112 synthetic datasets.

  • Evaluation Metrics
    • Detection Accuracy: Assessed using the area under the precision-recall curve (AUPRC) and the area under the receiver operating characteristic curve (AUROC).
    • Impact on Downstream Analyses: Effects on differential expression (DE) gene analysis, highly variable gene identification, cell clustering, and cell trajectory inference.
    • Computational Efficiency: Considered aspects such as speed, scalability, stability, and usability.
  • Key Findings
    • Detection Accuracy: DoubletFinder achieved the highest detection accuracy among the methods.
    • Downstream Analyses: Removal of doublets generally improved the accuracy of downstream analyses, with varying degrees of improvement depending on the method.
    • Computational Efficiency: cxds was found to be the most computationally efficient method, particularly excelling in speed and scalability.
  • Performance Factors
    • Doublet Rate: Higher doublet rates improved the accuracy of all methods.
    • Sequencing Depth: Greater sequencing depth led to better performance.
    • Number of Cell Types: More cell types generally resulted in better detection accuracy, except for cxds, bcds, and hybrid methods.
    • Cell-Type Heterogeneity: Higher heterogeneity between cell types improved the detection accuracy for most methods.

Overall Conclusion: DoubletFinder is recommended for its high detection accuracy and significant improvement in downstream analyses, while cxds is highlighted for its computational efficiency.

Xi & Li (2021)
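As an example, Scrublet (one of the benchmarked methods) can be run in a few lines; the input path and expected doublet rate below are assumptions to adapt to your platform and loading.

```python
# Minimal Scrublet sketch; set expected_doublet_rate according to your platform
# and loading (see the 10x table below). The input path is hypothetical.
import scipy.io
import scrublet as scr

counts = scipy.io.mmread("matrix.mtx").T.tocsc()  # cells x genes sparse matrix
scrub = scr.Scrublet(counts, expected_doublet_rate=0.06)
doublet_scores, predicted_doublets = scrub.scrub_doublets(min_counts=2,
                                                          min_cells=3)
```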

For 10x data, the expected doublet rate is 0.8% per 1,000 cells for the 10x 3’ CellPlex kit and 0.4% per 1,000 cells for the high-throughput (HT) 3’ v3.1 assay.

For the Parse WT kit, the reported doublet rate is 3% per 100,000 cells.

Rate (%)   Cells loaded   Cells recovered
0.4        825            500
0.8        1,650          1,000
1.6        3,300          2,000
2.4        4,950          3,000
3.2        6,600          4,000
4.0        8,250          5,000
4.8        9,900          6,000
5.6        11,550         7,000
6.4        13,200         8,000
7.2        14,850         9,000
8.0        16,500         10,000
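The table is approximately linear, so the expected rate can be interpolated; a small hypothetical helper:

```python
# Linear interpolation of the table above (~0.8% doublets per 1,000 recovered
# cells for the 10x 3' CellPlex kit); illustrative helper, not vendor software.
def expected_doublet_rate(cells_recovered: int, pct_per_1000: float = 0.8) -> float:
    """Expected doublet rate in percent for a given number of recovered cells."""
    return pct_per_1000 * cells_recovered / 1000

print(expected_doublet_rate(5_000))  # -> 4.0, matching the 5,000-cell row
```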

2.7 Cell type prediction

Performance comparison of supervised classifiers for cell identification using different scRNA-seq datasets. Heatmap of the a median F1-scores and b percentage of unlabeled cells across all cell populations per classifier (rows) per dataset (columns). Gray boxes indicate that the corresponding method could not be tested on the corresponding dataset.


Summary of the performance of all classifiers during different experiments. For each experiment, the heatmap shows whether a classifier performs good, intermediate, or poor. Light gray indicates that a classifier could not be tested during an experiment. The gray boxes to the right of the heatmap indicate the four different categories of experiments: intra-dataset, inter-dataset, rejection, and timing. Experiments themselves are indicated by the row labels.

  • Benchmarked 22 supervised classification methods for automatic cell identification in scRNA-seq, including both single-cell-specific tools and general-purpose ML classifiers.
  • Used 27 public scRNA-seq datasets spanning different sizes, technologies, species, and annotation complexity.
  • Evaluated two main scenarios: within-dataset (intra-dataset) and across-dataset (inter-dataset) prediction.
  • Scored methods by accuracy, fraction of unclassified cells, and computation time, and also tested sensitivity to feature selection and number of cells per population.
  • Found most methods perform well broadly, but accuracy drops for complex datasets with overlapping classes or very fine (“deep”) annotation levels.
  • Reported that a general-purpose support vector machine (SVM) achieved the best overall performance across their experiments.
  • Released code and a Snakemake workflow to reproduce/extend the benchmark (new methods and datasets).

Abdelaal et al. (2019)
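A toy sketch of this SVM-with-rejection strategy using scikit-learn; the data, labels, and confidence threshold are all invented for illustration.

```python
# Toy sketch of a linear-SVM classifier with a rejection option, the strategy
# that performed best overall in Abdelaal et al. (2019). All data and the
# confidence threshold here are synthetic assumptions.
import numpy as np
from sklearn.calibration import CalibratedClassifierCV
from sklearn.svm import LinearSVC

rng = np.random.default_rng(0)
X_ref = rng.poisson(1.0, size=(300, 50)).astype(float)    # reference: cells x genes
y_ref = rng.choice(["T", "B", "NK"], size=300)            # reference labels
X_query = rng.poisson(1.0, size=(100, 50)).astype(float)  # query cells

clf = CalibratedClassifierCV(LinearSVC())  # calibrate SVM scores to probabilities
clf.fit(X_ref, y_ref)

proba = clf.predict_proba(X_query)
labels = clf.classes_[proba.argmax(axis=1)].astype(object)
labels[proba.max(axis=1) < 0.7] = "Unassigned"  # rejection option for uncertain cells
```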

Identification of cell types can be completely automated (by comparing to reference data/databases) or semi-automated (reference data + marker genes).

Summary of performance of the automatic cell-type identification methods. Bar graphs of the automatic cell-type identification methods with six evaluation criteria indicated. For each evaluation criteria, the length of the bars shows the performance of the automatic method: poor, median or good. The automatic methods are sorted based on the mean performance of the evaluation criteria. No bar: not evaluated.

  • Compares 32 methods using common performance criteria such as prediction accuracy, F1-score, “unlabeling” rate (cells left unassigned), and computational efficiency.
  • Organizes methods by major strategy families, including marker/gene-set–based approaches, reference-based label transfer, and supervised machine-learning classifiers.
  • Highlights that method performance is dataset-dependent, with challenges increasing when cell types are highly similar, labels are very fine-grained, or references are incomplete.
  • Emphasizes practical selection factors beyond accuracy, especially whether a method can leave cells “unknown/unassigned,” how sensitive it is to batch effects, and how well it scales to large datasets.

Xie et al. (2021)

Overall ranking of methods

  • Compared 8 supervised and 10 unsupervised scRNA-seq cell type identification methods across 14 real public datasets (different tissues, protocols, species).
  • Main result: supervised methods usually outperform unsupervised methods, except when the goal is identifying unknown/novel cell types (where unsupervised tends to do better).
  • Supervised methods work best when the reference is high-quality, low-complexity, and similar to the query; performance drops as reference bias increases (different individuals/conditions/batches/species).
  • Dataset complexity is a major driver: when complexity is low, supervised wins clearly; when complexity is high, supervised vs unsupervised performance becomes more similar and can even reverse under strong reference bias.
  • More training cells generally improve supervised performance until a saturation point; unsupervised results can vary strongly with sample size because cluster-number estimation changes with dataset size.
  • Sequencing depth helps both categories up to a saturation point; deeper data improves results most when baseline depth is low.
  • Batch-effect correction was often not necessary and could hurt performance; most supervised methods did not improve after correction, with CHETAH being a notable exception due to fewer “unassigned” calls.
  • Increasing the number of cell types and stronger cell-type similarity makes the task harder; unsupervised methods are particularly sensitive when the inferred cluster number disagrees with the true number.
  • With imbalanced populations, supervised methods are generally more robust if the training set contains enough examples of rare types; unsupervised performance is affected by imbalance and cluster-number errors.
  • Compute/scalability: unsupervised methods are generally faster; among fast methods, several could handle ~50k cells quickly, and experiments on ~600k cells showed similar trends to smaller datasets.
  • Method-level takeaways highlighted in the paper: among supervised methods, Seurat v3 mapping and SingleR were top overall (Seurat mapping favored for large datasets due to speed), and among unsupervised methods, Seurat v3 clustering was strongest overall, with SHARP recommended for ultra-large datasets.

Sun et al. (2022)

It is also important that cell types are labelled consistently across datasets and studies; referring to a cell type ontology (e.g., the Cell Ontology) helps standardize labels.

Summary of the classification performance in each evaluation criteria. Each column is a method and each row is an evaluation criterion from intra-dataset and inter-dataset prediction (intra/inter), cell–cell similarity (DE scale), increased cell type classes, downsampling of gene count, downsampling of read depth, rare cell type detection, unknown cell type detection (rejection option), as well as runtime and memory utilization. The heatmap shows the rank of individual methods based on averaged metrics over overall accuracy, ARI, and V-measure for each evaluation indicated in the left row. Rare cell type detection was ranked by averaged cell type-specific accuracy for classifying cell types < 1.70% in population. Unknown cell type detection was ranked by the averaged accuracy of assigning “unknown” to the leave-out group. Runtime and memory were ranked by utilization efficiency. Gray box indicates that the method was not included in the evaluation. The methods in the heatmap are arranged in ascending order by their average rank over intra-dataset and inter-dataset predictions.

  • Benchmarked 10 R packages for automated scRNA-seq cell-type annotation: Seurat, scmap, SingleR, CHETAH, SingleCellNet, scID, Garnett, SCINA, plus two repurposed methylation deconvolution methods (CP, RPC).
  • Evaluated accuracy on real datasets (PBMC, pancreas, Tabula Muris full and lung subsets) and multiple simulation suites; metrics included overall accuracy, ARI, and V-measure.
  • Overall top performers were Seurat, SingleR, CP, RPC, and SingleCellNet; Seurat was best at annotating major cell types in both intra-dataset and inter-dataset tests.
  • Inter-dataset annotation is harder and performance is dataset-dependent; PBMC is particularly challenging due to highly similar immune subtypes (e.g., CD4 vs CD8 T cells).
  • For highly similar cell types (low DE simulations), all methods degrade, but SingleR and RPC were most robust; Seurat was weaker under the hardest similarity setting.
  • As the number of cell-type classes increases (10→50), most methods drop in accuracy; SingleR stays extremely robust and RPC is consistently second-best, while Seurat deteriorates faster after ~30 classes.
  • With gene/feature downsampling, Seurat and SingleR remain most stable (high ARI across reduced feature sets), whereas some methods (e.g., Garnett, scID, scmap) are more sensitive.
  • Rare cell types: Seurat and SingleCellNet lose accuracy when rare groups get very small (≤50 cells in their rare-type simulations), while SingleR/CP/RPC are more robust.
  • “Unknown”/rejection option: among methods that can label “unknown” (Garnett, SCINA, scmap, CHETAH, scID), SCINA had a relatively better balance for rejecting absent types, but rejection-enabled tools were not top overall for accuracy/robustness.
  • Compute trade-offs: SingleCellNet and CP were fastest/most memory-efficient among top-accuracy tools; Seurat can be memory-heavy at large scale (reported up to ~100 GB at 50k cells) and RPC can be slow (hours) at large scale.
  • Practical guidance: use Seurat for general annotation of separable major types; prefer SingleR/RPC/CP when expecting rare populations, high similarity, or many labels (given a good reference).

Huang et al. (2021)

2.8 Differential expression

  • Comparison of DGE tools for single-cell data

All of the tools perform well when there is little or no multimodality in the data, and all perform better when sparsity (the fraction of zero counts) is lower. For data with a high level of multimodality, methods that consider the behavior of each individual gene, such as DESeq2, EMDomics, Monocle2, DEsingle, and SigEMD, show better TPRs. If the level of multimodality is low, however, SCDE, MAST, and edgeR can provide higher precision.

In general, non-parametric methods that can capture multimodality perform better than model-based methods designed to handle zero counts, although a model-based method that models drop-out events well can perform better in terms of true positive and false positive rates. Methods developed specifically for scRNA-seq data do not show significantly better performance than methods designed for bulk RNA-seq, and methods that consider the behavior of each individual gene (rather than all genes) when calling DE genes outperform the other tools.

Effect of sample size (number of cells) on detecting DE genes. The sample size is on the horizontal axis, from 10 to 400 cells in each condition. Effect of sample size on a TPR, b FPR, c accuracy (=(TP + TN)/(TP + FP + TN + FN)), and precision (=TP/(TP + FP)). A threshold of 0.05 is used for FDR or adjusted p-value.


T. Wang et al. (2019)
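For orientation, the snippet below runs a per-gene Wilcoxon test in scanpy on a bundled example dataset; this is not one of the benchmarked tools, but it illustrates per-gene testing with FDR-adjusted P-values.

```python
# Per-gene Wilcoxon DE test on a bundled example dataset. Not one of the tools
# benchmarked by T. Wang et al. (2019); shown only to illustrate per-gene
# testing with FDR-adjusted p-values.
import scanpy as sc

adata = sc.datasets.pbmc68k_reduced()
sc.tl.rank_genes_groups(adata, groupby="bulk_labels", method="wilcoxon")
de = sc.get.rank_genes_groups_df(adata, group="CD14+ Monocyte")
print(de[["names", "logfoldchanges", "pvals_adj"]].head())
```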

2.9 Data Integration

  • Single-cell data integration challenges

Overview of common data integration methods classified according to their anchor choice.


a–c, Depending on the anchor choice, three types of data integration strategies can be considered: horizontal integration with features as the anchors (a), vertical integration with cells as the anchors (b) and diagonal integration with no anchors in high-dimensional space (c). The left column shows the data modalities extracted, while the right column illustrates the resulting data matrices to be integrated, depending on the anchor choice.


Mosaic integration. a, Overview of an experimental design where different data modalities (each block in the rows) are profiled in different subsets of cells (each block in the columns). Transparent matrices denote missing information. b, Resulting data matrices after applying a mosaic integration approach aimed at imputing missing data modalities.


Argelaguet et al. (2021): Computational principles and challenges in single-cell data integration.

  • Comparison of data integration methods

a, Overview of top and bottom ranked methods by overall score for the human immune cell task. Metrics are divided into batch correction (blue) and bio-conservation (pink) categories. Overall scores are computed using a 40/60 weighted mean of these category scores (see Methods for further visualization details and Supplementary Fig. 2 for the full plot). b,c, Visualization of the four best performers on the human immune cell integration task colored by cell identity (b) and batch annotation (c). The plots show uniform manifold approximation and projection layouts for the unintegrated data (left) and the top four performers (right).


a, Scatter plot of the mean overall batch correction score against mean overall bio-conservation score for the selected methods on RNA tasks. Error bars indicate the standard error across tasks on which the methods ran. b, The overall scores for the best performing method, preprocessing and output combinations on each task as well as their usability and scalability. Methods that failed to run for a particular task were assigned the unintegrated ranking for that task. An asterisk after the method name (scANVI and scGen) indicates that, in addition, cell identity information was passed to this method. For ComBat and MNN, usability and scalability scores corresponding to the Python implementation of the methods are reported (Scanpy and mnnpy, respectively).


Luecken et al. (2022)

Qualitative evaluation of 14 batch-effect correction methods using UMAP visualization for dataset 9 of human cell atlas. The 14 methods are organized into two panels, with the top panel showing UMAP plots of raw data, Seurat 2, Seurat 3, Harmony, fastMNN, MNN Correct, ComBat, and limma outputs, while the bottom panel shows the UMAP plots of scGen, Scanorama, MMD-ResNet, ZINB-WaVE, scMerge, LIGER, and BBKNN outputs. Cells are colored by batch.


We tested 14 state-of-the-art batch correction algorithms designed to handle single-cell transcriptomic data. Each batch-effect removal method has its advantages and limitations, with no clearly superior method. Based on our results, LIGER, Harmony, and Seurat 3 are the top batch-mixing methods. Harmony performed well on datasets with common cell types and also across different technologies; its comparatively low runtime also makes it suitable for initial exploration of large datasets. LIGER likewise performed well, especially on datasets with non-identical cell types; its main drawback is a longer runtime than Harmony, though this is acceptable given its performance. Seurat 3 can also handle large datasets, albeit with a 20–50% longer runtime than LIGER; given its good batch-mixing results with multiple batches, it is recommended for such scenarios. To improve recovery of DEGs in batch-corrected data, we recommend scMerge for batch correction.
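A minimal Harmony run via scanpy's harmonypy wrapper is sketched below; the dataset and the toy batch labels are assumptions for illustration.

```python
# Minimal Harmony sketch via scanpy's harmonypy wrapper (assumes harmonypy is
# installed). The dataset and batch labels are toy assumptions; a real analysis
# would use its own AnnData with true batch annotations in adata.obs["batch"].
import numpy as np
import scanpy as sc

adata = sc.datasets.pbmc68k_reduced()
adata.obs["batch"] = np.where(np.arange(adata.n_obs) % 2 == 0, "b1", "b2")

sc.pp.pca(adata, n_comps=50)                      # Harmony corrects the PC space
sc.external.pp.harmony_integrate(adata, key="batch")
sc.pp.neighbors(adata, use_rep="X_pca_harmony")   # build graph on corrected PCs
sc.tl.umap(adata)
```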

Feature selection methods affect the performance of integration. Zappia et al. (2025)

Comparison of multi-omics integration methods. Xiao et al. (2024)

Tran et al. (2020)

2.10 Trajectory

Fig 2.1: Comparison of trajectory inference methods.

Saelens et al. (2019)

Standard trajectory tools

Multiomic trajectory tools

  • Tempora Trajectory inference for time-series data
  • VITAE (Python) Joint Trajectory Inference for Single-cell Genomics Using Deep Learning with a Mixture Prior
  • CellRank2 (Python) Multimodal trajectory inference
  • Moscot (Python) Multimodal spatial trajectory inference
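For a quick baseline, the sketch below computes diffusion pseudotime (DPT) in scanpy; it is not one of the benchmarked wrappers, and the root cell choice is an arbitrary assumption.

```python
# Hedged sketch: diffusion pseudotime (DPT) in scanpy as a simple baseline
# trajectory. The root cell choice is an arbitrary assumption here.
import scanpy as sc

adata = sc.datasets.paul15()           # myeloid progenitor differentiation data
sc.pp.recipe_zheng17(adata)            # quick normalization/HVG recipe
sc.pp.neighbors(adata, n_neighbors=15)
sc.tl.diffmap(adata)                   # diffusion map summarization
adata.uns["iroot"] = 0                 # assumed root cell (index 0)
sc.tl.dpt(adata)                       # pseudotime in adata.obs["dpt_pseudotime"]
```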

2.11 RNA velocity

Core Concepts and Mechanism

  • RNA velocity is a computational method used to predict the future state of individual cells by analyzing the balance between unspliced (nascent) and spliced (mature) mRNA
  • The method exploits the causal relationship between these two species: an abundance of unspliced mRNA suggests a gene is being upregulated, while its depletion suggests downregulation
  • By combining these measurements across thousands of genes, researchers can reconstruct directed differentiation trajectories from “snapshot” single-cell data without needing prior knowledge of cell lineages
  • The primary signal for RNA velocity is derived from the curvature in a “phase portrait,” which reflects the temporal delay between transcription, splicing, and degradation

Workflow for RNA velocity analysis. (A) Raw scRNA-seq data acquisition. (B) Quantification of unspliced and spliced transcript abundances. (C) Count matrices preprocessing, data normalization, and neighborhood smoothing are included in the classic workflow. (D) Estimation of RNA velocities by fitting spliced and unspliced counts to biophysical models, also yielding kinetic parameters and latent variables. (E) Visualization of high-dimensional velocity vectors in low-dimensional space via methods such as streamline plots and grid-averaged vector fields. (F) Downstream analyses.


Methodological Paradigms

The sources categorize RNA velocity computational tools into three primary classes based on how they infer transcriptional kinetics:

  • Steady-state Methods (e.g., Velocyto): These assume that gene expression reaches an equilibrium between synthesis and degradation; they are often faster and simpler but can be inaccurate if the system has not reached a steady state
  • Trajectory Methods (e.g., scVelo Dynamical, UniTVelo): These fit the full transcriptional cycle using systems of ordinary differential equations (ODEs) to estimate latent time and gene-specific kinetic parameters
  • State Extrapolation Methods (e.g., cellDancer, DeepVelo): These focus on local cell-specific kinetics by leveraging neighboring cell information to capture subtle variations across heterogeneous populations

RNA velocity methods are categorized into three classes based on their paradigms in learning transcriptional dynamics. (A, B) Steady-state methods, include linear regression based on the steady-state ratio and inference based on minimizing Kullback–Leibler (KL) divergence between observed and predicted distributions. (C, D) Trajectory-based methods, where either cell-shared or cell-specific latent trajectories are used to reconstruct cellular dynamics by minimizing the sum of displacements between observed and estimated states. (E, F) State extrapolation methods, which infer future states by minimizing cosine similarity or distance in phase portrait space or high-dimensional gene space.

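As a concrete example of the trajectory-based class, a minimal scVelo run of the dynamical model; the dataset and parameter values follow the scVelo tutorial defaults and are shown here as assumptions.

```python
# Minimal scVelo sketch of the dynamical (trajectory-based) model named above.
# Dataset and parameters are scVelo tutorial defaults, used as assumptions.
import scvelo as scv

adata = scv.datasets.pancreas()                    # spliced/unspliced counts
scv.pp.filter_and_normalize(adata, min_shared_counts=20, n_top_genes=2000)
scv.pp.moments(adata, n_pcs=30, n_neighbors=30)    # KNN smoothing (see caveats)
scv.tl.recover_dynamics(adata)                     # fit ODE kinetics per gene
scv.tl.velocity(adata, mode="dynamical")
scv.tl.velocity_graph(adata)
scv.pl.velocity_embedding_stream(adata, basis="umap")
```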

Biological Applications

RNA velocity has provided quantitative insights across three major biological scenarios:

  • Developmental Biology: It helps resolve complex lineage hierarchies and temporal sequences in systems like embryonic development, neural stem cell differentiation, and retinal maturation
  • Diseased and Injured Environments: The technique identifies abnormal cellular transitions in conditions such as Alzheimer’s disease, systemic lupus erythematosus, and impaired tissue regeneration
  • Tumor Microenvironments: Researchers use it to dissect cancer cell plasticity, immune cell exhaustion trajectories, and the dynamic interactions between malignant cells and their surroundings

Critical Challenges and Limitations

Despite its utility, the sources highlight significant technical and theoretical hurdles:

  • Biophysical Inconsistency: Current binary models (spliced vs. unspliced) oversimplify biology, as most human genes have multiple introns and complex alternative splicing mechanisms
  • Inaccurate Assumptions: Many models rely on constant kinetic rates, which fail to account for “transcriptional boosts” or multi-rate regimes where splicing or degradation speeds vary over time
  • Preprocessing Pitfalls: Common steps like normalization and k-nearest neighbor (KNN) smoothing can introduce distortions, create false signals, or obscure the stochastic nature of gene expression
  • Visualization Artifacts: Projecting high-dimensional velocity vectors onto 2D embeddings (like UMAP or t-SNE) frequently distorts local and global relationships, leading to misleading biological interpretations

Future Directions and Proposed Solutions

To improve the reliability of RNA velocity, the sources propose several innovations:

  • Stochastic and Discrete Modeling: Moving away from continuous ODEs toward discrete Markov models (e.g., using the Chemical Master Equation) to better handle low-copy-number transcripts and “bursty” transcription
  • Multi-modal Integration: Incorporating other data layers, such as chromatin accessibility (ATAC-seq), metabolic labeling, or protein abundance, to provide a more comprehensive view of cellular dynamics
  • State-Variable Kinetics: Developing models that allow kinetic rates to change as a cell moves through different biological states

Gorin et al. (2022); Bergen et al. (2021); Y. Wang et al. (2025)

Preprocessing choices affect RNA velocity results for droplet scRNA-seq data. Soneson et al. (2021)

  • Velocyto (Python, R)
  • scVelo (Python)
  • SDEvelo (Python) Multivariate stochastic modeling
  • MultiVelo (Python) Velocity Inference from Single-Cell Multi-Omic Data
  • UniTVelo (Python) Temporally Unified RNA Velocity
  • DeepVelo (Python) Deep learning for RNA velocity
  • VeloAE (Python) Low-dimensional Projection of Single Cell Velocity
  • GeneTrajectory (R) R implementation of GeneTrajectory
  • TFvelo Gene regulation inspired RNA velocity estimation
  • DeepCycle (Python) Cell cycle inference in single-cell RNA-seq
  • velocycle (Python) Bayesian model for RNA velocity estimation of periodic manifolds

2.12 Metacells

Metacells increase profile coverage and save computational resources, while preserving biologically relevant heterogeneity in single-cell genomics data.


Main conceptual steps in the metacell construction workflow. Starting from a single-cell profile matrix, space and metrics are first defined for identifying cells displaying high similarity in their profiles (e.g., high transcriptomic similarity in scRNA-seq data). Second, highly similar cells are grouped into metacells. Third, single-cell profiles within each metacell are aggregated to create a metacell profile matrix. Dots represent single cells colored by cell type.

Graining level of metacell partition. (A) tSNE representation of a peripheral blood mononuclear cells (PBMCs) scRNA-seq dataset (see Appendix) at different graining levels. Each dot represents a single cell, a metacell or a cluster, depending on the graining level. Colors represent cell types. (B) Distribution of graining levels in different studies using metacells. Colors represent different metacell construction tools. (C) Graining levels used for datasets of different sizes. Colors represent different metacell construction tools. (D) Example of single-cell RNA-seq datasets with different levels of complexity (T cells, cord blood mononuclear cells (CBMCs) and bone marrow datasets). (E) Number of cell types recovered at different graining levels in the three examples of panel (D). (F) Example of single-cell RNA-seq datasets with different sizes. (G) Number of cell types recovered at different graining levels in the three examples of panel (F).

Metacell quality metrics. (A) Purity is defined as the proportion of cells from the most abundant cell type in a metacell. Higher purity corresponds to higher proportion of cells of the same cell type within a metacell. Purity can also be defined based on other annotations/categories than cell types. (B) Compactness is defined as the average variance of latent space component within a metacell. Better compactness corresponds to lower variance in the latent space components within cells grouped into a metacell. (C) Separation is defined as the distance to the closest metacell. Better separation corresponds to more distant metacells in the latent space. (D) Inner normalized variance is defined as the mean normalized gene variance within a metacell. Better inner normalized variance corresponds to lower variance of the single-cell profiles within a metacell. (E) Metacell size distribution is defined as the distribution of the number of cells in each metacell. Better metacell size distribution corresponds to more homogeneous metacell sizes. (F) Representativeness corresponds to the ability of metacells to faithfully represent the global structure of the single-cell dataset. Better representation corresponds to more uniform coverage of the dataset (black stars represent the centroid of each metacell). (G) Conservation of the downstream analyses at the metacell level is defined as the ability of metacells to preserve the results of the single-cell analysis.

Relationships between metacells and sketching or imputation. Metacells combine the reduction in size of sketching approaches and the reduction in sparsity of imputation strategies.

Limitations of metacells. (A) Example of limitations in metacells when aggregating cells of different cell types (i.e., impure metacell_3 in the example). Such impure metacells can lead to mixed profiles and artifacts in gene co-expression analyses. (B) Correlation between the size of metacells and the number of detected genes. (C) Computational cost of metacell construction (using MC2, SuperCell, and SEACells at a graining level of 75). Time (CPU time) is represented in minutes and memory (max RSS) in GB as a function of the cell numbers contained in the dataset being analyzed. Colors and shapes highlight the tool used for metacells construction. The y-axis is displayed on a log10 scale. All tasks were run on a machine with 500 GB and a time limit of 20 h with 1 CPU except for the run of MC2 with multithreading (10 CPUs). (D) Schematic representation of the integration strategy recommended to analyze large datasets with multiple samples using metacells: (i) constructing the metacells for each sample, (ii) integrating the samples at the metacell level, (iii) performing downstream analyses on the integrated metacell atlas. (E) Computational cost of metacell construction (using MC2, SuperCell, and SEACells at a graining level of 75), metacell construction + downstream analysis, and single-cell analysis (with and without using BPCells in Seurat). Time (CPU time) is represented in minutes and memory (max RSS) in GB as a function of the cell numbers contained in the dataset being analyzed. Following the approach described in panel (D), metacells were built on a per embryo basis and in parallel using 15 CPUs. After samples integration, downstream analyses included dimensionality reduction, clustering, and differential analysis. Colors and shapes highlight the tool used for metacells construction. The y-axis is displayed on a log10 scale.

Concepts that share similarities with metacells. (A) Example of nested communities. (B) Example of graph abstraction. (C) Example of neighborhoods. (D) Example of sample-specific pseudobulks. (E) Example of cell-type-/sample-specific pseudobulks. (F) Example of pseudocells. (G) Example of pseudobulks of pseudoreplicates.
Fig 2.2: An overview of metacells.

Metacells are defined as a partition of single-cell data into disjoint homogeneous groups of highly similar cells followed by aggregation of their profiles. This concept relies on the assumption that most of the variability within metacells corresponds to technical noise and not to biologically relevant heterogeneity. As such, metacells aim at removing some of the noise while preserving the biological information of the single-cell data and improving interpretability.

  • Choice of graining level depends on both the complexity and the size of the data:
    • For large, low-complexity data, a relatively high graining level may be used.
    • For higher complexity or smaller datasets, use lower graining levels to preserve the underlying heterogeneity.
  • Choose the graining level such that the resulting number of metacells is at least ten times larger than the expected number of cell subtypes; in practice, graining levels fall somewhere between 10 and 50 (see the sketch after this list).
  • Optimal graining is hard to evaluate using measures such as modularity or the silhouette coefficient.
  • Number of nearest neighbors (k):
    • Increasing k results in a more uniform distribution of metacell sizes.
    • Excessively large values of k (e.g., ~100) may lead to the merging of rare cell types.
    • A reasonable range of values is 5–30.
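A hedged sketch of the idea, approximating metacells with a very-high-resolution Leiden clustering and aggregating raw counts per group; dedicated tools (MC2, SuperCell, SEACells) implement more principled constructions.

```python
# Hedged sketch: approximate metacells as a very-high-resolution Leiden
# clustering (requires leidenalg) followed by per-group aggregation of raw
# counts. The resolution value is an illustrative assumption.
import numpy as np
import scanpy as sc

adata = sc.datasets.pbmc3k()
adata.layers["counts"] = adata.X.copy()            # keep raw counts
sc.pp.normalize_total(adata)
sc.pp.log1p(adata)
sc.pp.pca(adata)
sc.pp.neighbors(adata, n_neighbors=15)             # k in the 5-30 range above
sc.tl.leiden(adata, resolution=10.0, key_added="metacell")  # high resolution

# Aggregate raw counts within each metacell into a metacell profile matrix
groups = adata.obs["metacell"]
profiles = np.stack([
    np.asarray(adata.layers["counts"][(groups == g).values].sum(axis=0)).ravel()
    for g in groups.cat.categories
])
```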

2.12.1 Metrics

  • Purity: fraction of cells from the most abundant cell type in a metacell (a computation sketch follows this list).
    • Used to check that metacells do not mix cells from different cell types.
  • Compactness: a measure of a metacell’s homogeneity that helps flag low-quality metacells for review; its value depends on the latent space used.
    • SEACells and SuperCell, which use PCA space, will perform better than MetaCell and MC2, which use normalized gene space.
  • Separation: Euclidean distance between the centroids of metacells.
    • There is a clear trade-off between separation and compactness: metacells from dense regions have better compactness but worse separation, while metacells from sparse regions have better separation but worse compactness.
  • INV: mean normalized variance of features within a metacell; lower is better.
    • The normalization accounts for the fact that a gene’s variance is expected to scale with its mean.
  • Size: number of single cells per metacell.
    • To ensure balanced downstream analyses, prefer a homogeneous metacell size distribution and avoid significant outliers.
  • Representativeness: a good metacell partition should reproduce the overall structure (i.e., the manifold) of the single-cell data by uniformly representing its latent space.
    • More uniform representation of the manifold leads to increased variation in metacell sizes, compensating for the inherent over- and under-representation of different cell types.
  • Conservation of downstream analyses:
    • Clustering assignments obtained at the metacell and single-cell levels can be compared using the adjusted Rand index (ARI) or adjusted mutual information (AMI).
    • The metacell concept is also used to enhance signal for GRN construction.
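A small sketch computing purity, continuing from the clustering sketch above; the `cell_type` annotation column is an assumption.

```python
# Purity per metacell: fraction of cells from the most abundant cell type.
# Continues the sketch above; "cell_type" is an assumed annotation column.
import pandas as pd

df = pd.DataFrame({
    "metacell": adata.obs["metacell"],
    "cell_type": adata.obs["cell_type"],   # hypothetical existing annotation
})
purity = df.groupby("metacell", observed=True)["cell_type"].agg(
    lambda s: s.value_counts(normalize=True).iloc[0]
)
print(purity.describe())  # low-purity metacells mix distinct cell types
```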

By aggregating information from several highly similar cells, metacells reduce the size of the dataset while preserving, and possibly even enhancing, the biological signal. This simultaneously addresses two main challenges of single-cell genomics data analysis: the large size of the single-cell data and its excessive sparsity.

Trade-off between topology-preserving downsampling (sketching) and imputation.

2.12.2 Limitations

  • The metacell partition may be considered a very high-resolution clustering.
  • Metacells do not guarantee global convergence.
  • Cells of distinct types may be grouped within a single metacell (impure metacells).
  • Such artifacts can lead to misleading interpretations, including non-existing intermediate states or spurious gene co-expression.
  • Rare cell types could be missed entirely if aggregated with a more abundant cell type into a single metacell.
  • A mitigation is to build metacells in a supervised manner, constructing them for each cell type separately.

Consider adding the number of cells per metacell as a covariate in downstream models.

Bilous et al. (2024)

Gfeller Lab tutorial

2.13 Cell communication

Cell-cell communication and interaction.

Review of tools: Almet et al. (2021)

2.14 Databases

2.14.1 Data

Single-cell data repositories.

2.14.2 Markers

Curated list of marker genes by organism, tissue and cell type.

2.15 Tools

2.15.1 CLI frameworks

2.15.2 Interactive analysis/visualisation

2.15.2.1 Open source

Ouyang et al. (2021)

Overview of the visualization tools and their capabilities


Cakir et al. (2020)

2.15.2.2 Commercial

Enterprise GUI solutions for single-cell data analysis.

2.16 Learning

References

Abdelaal, T., Michielsen, L., Cats, D., Hoogduin, D., Mei, H., Reinders, M. J., & Mahfouz, A. (2019). A comparison of automatic cell identification methods for single-cell RNA sequencing data. Genome Biology, 20, 1–19. https://doi.org/10.1186/s13059-019-1795-z
Almet, A. A., Cang, Z., Jin, S., & Nie, Q. (2021). The landscape of cell–cell communication through single-cell transcriptomics. Current Opinion in Systems Biology, 26, 12–23. https://www.sciencedirect.com/science/article/pii/S2452310021000081
Andreatta, M., Hérault, L., Gueguen, P., Gfeller, D., Berenstein, A. J., & Carmona, S. J. (2024). Semi-supervised integration of single-cell transcriptomics data. Nature Communications, 15(1), 872. https://www.nature.com/articles/s41467-024-45240-z
Argelaguet, R., Cuomo, A. S., Stegle, O., & Marioni, J. C. (2021). Computational principles and challenges in single-cell data integration. Nature Biotechnology, 39(10), 1202–1215. https://www.nature.com/articles/s41587-021-00895-7
Baran-Gale, J., Chandra, T., & Kirschner, K. (2018). Experimental design for single-cell RNA sequencing. Briefings in Functional Genomics, 17(4), 233–239. https://doi.org/10.1093/bfgp/elx035
Bergen, V., Soldatov, R. A., Kharchenko, P. V., & Theis, F. J. (2021). RNA velocity—current challenges and future perspectives. Molecular Systems Biology, 17(8), e10282. https://doi.org/10.15252/msb.202110282
Bilous, M., Hérault, L., Gabriel, A. A., Teleman, M., & Gfeller, D. (2024). Building and analyzing metacells in single-cell genomics data. Molecular Systems Biology, 1–23. https://doi.org/10.1038/s44320-024-00045-6
Cakir, B., Prete, M., Huang, N., Van Dongen, S., Pir, P., & Kiselev, V. Y. (2020). Comparison of visualization tools for single-cell RNAseq data. NAR Genomics and Bioinformatics, 2(3), lqaa052. https://pmc.ncbi.nlm.nih.gov/articles/PMC7391988/
Filippov, I., Philip, C. S., Schauser, L., & Peterson, P. (2024). Comparative transcriptomic analyses of thymocytes using 10x genomics and parse scRNA-seq technologies. BMC Genomics, 25(1), 1069. https://doi.org/10.1186/s12864-024-10976-x
Füllgrabe, A., George, N., Green, M., Nejad, P., Aronow, B., Fexova, S. K., Fischer, C., Freeberg, M. A., Huerta, L., Morrison, N., et al. (2020). Guidelines for reporting single-cell RNA-seq experiments. Nature Biotechnology, 38(12), 1384–1386. https://doi.org/10.1038/s41587-020-00744-z
Gorin, G., Fang, M., Chari, T., & Pachter, L. (2022). RNA velocity unraveled. PLOS Computational Biology, 18(9), e1010492. https://doi.org/10.1371/journal.pcbi.1010492
Heumos, L., Schaar, A. C., Lance, C., Litinetskaya, A., Drost, F., Zappia, L., Lücken, M. D., Strobl, D. C., Henao, J., Curion, F., et al. (2023). Best practices for single-cell analysis across modalities. Nature Reviews Genetics, 1–23. https://doi.org/10.1038/s41576-023-00586-w
Huang, Q., Liu, Y., Du, Y., & Garmire, L. X. (2021). Evaluation of cell type annotation r packages on single-cell RNA-seq data. Genomics, Proteomics & Bioinformatics, 19(2), 267–281. https://doi.org/10.1016/j.gpb.2020.07.004
Janssen, P., Kliesmete, Z., Vieth, B., Adiconis, X., Simmons, S., Marshall, J., McCabe, C., Heyn, H., Levin, J. Z., Enard, W., et al. (2023). The effect of background noise and its removal on the analysis of single-cell expression data. Genome Biology, 24(1), 140. https://doi.org/10.1186/s13059-023-02978-x
Li, W. V., & Li, J. J. (2019). A statistical simulator scDesign for rational scRNA-seq experimental design. Bioinformatics, 35(14), i41–i50. https://doi.org/10.1093/bioinformatics/btz321
Luecken, M. D., Büttner, M., Chaichoompu, K., Danese, A., Interlandi, M., Müller, M. F., Strobl, D. C., Zappia, L., Dugas, M., Colomé-Tatché, M., et al. (2022). Benchmarking atlas-level data integration in single-cell genomics. Nature Methods, 19(1), 41–50. https://www.nature.com/articles/s41592-021-01336-8
Luecken, M. D., Gigante, S., Burkhardt, D. B., Cannoodt, R., Strobl, D. C., Markov, N. S., Zappia, L., Palla, G., Lewis, W., Dimitrov, D., et al. (2025). Defining and benchmarking open problems in single-cell analysis. Nature Biotechnology, 1–6. https://doi.org/10.1038/s41587-025-02694-w
Luecken, M. D., & Theis, F. J. (2019). Current best practices in single-cell RNA-seq analysis: A tutorial. Molecular Systems Biology, 15(6), e8746. https://doi.org/10.15252/msb.20188746
Melsted, P., Booeshaghi, A. S., Gao, F., Beltrame, E., Lu, L., Hjorleifsson, K. E., Gehring, J., & Pachter, L. (2019). Modular and efficient pre-processing of single-cell RNA-seq. BioRxiv, 673285. https://doi.org/10.1038/s41587-021-00870-2
Nguyen, Q. H., Pervolarakis, N., Nee, K., & Kessenbrock, K. (2018). Experimental considerations for single-cell RNA sequencing approaches. Frontiers in Cell and Developmental Biology, 6, 108. https://doi.org/10.3389/fcell.2018.00108
Ouyang, J. F., Kamaraj, U. S., Cao, E. Y., & Rackham, O. J. (2021). ShinyCell: Simple and sharable visualization of single-cell gene expression data. Bioinformatics, 37(19), 3374–3376. https://academic.oup.com/bioinformatics/article/37/19/3374/6198103
Saelens, W., Cannoodt, R., Todorov, H., & Saeys, Y. (2019). A comparison of single-cell trajectory inference methods. Nature Biotechnology, 37(5), 547–554. https://doi.org/10.1038/s41587-019-0071-9
Soneson, C., Srivastava, A., Patro, R., & Stadler, M. B. (2021). Preprocessing choices affect RNA velocity results for droplet scRNA-seq data. PLoS Computational Biology, 17(1), e1008585. https://doi.org/10.1371/journal.pcbi.1008585
Sun, X., Lin, X., Li, Z., & Wu, H. (2022). A comprehensive comparison of supervised and unsupervised methods for cell type identification in single-cell RNA-seq. Briefings in Bioinformatics, 23(2), bbab567. https://academic.oup.com/bib/article/23/2/bbab567/6502554
Svensson, V., Natarajan, K. N., Ly, L.-H., Miragaia, R. J., Labalette, C., Macaulay, I. C., Cvejic, A., & Teichmann, S. A. (2017). Power analysis of single-cell RNA-sequencing experiments. Nature Methods, 14(4), 381–387. https://doi.org/10.1038/nmeth.4220
Tran, H. T. N., Ang, K. S., Chevrier, M., Zhang, X., Lee, N. Y. S., Goh, M., & Chen, J. (2020). A benchmark of batch-effect correction methods for single-cell RNA sequencing data. Genome Biology, 21, 1–32. https://doi.org/10.1186/s13059-019-1850-9
Wang, T., Li, B., Nelson, C. E., & Nabavi, S. (2019). Comparative analysis of differential gene expression analysis tools for single-cell RNA sequencing data. BMC Bioinformatics, 20(1), 1–16. https://doi.org/10.1186/s12859-019-2599-6
Wang, Y., Li, J., Zha, H., Liu, S., Huang, D., Fu, L., & Liu, X. (2025). Paradigms, innovations, and biological applications of RNA velocity: A comprehensive review. Briefings in Bioinformatics, 26(4), bbaf339. https://doi.org/10.1093/bib/bbaf339
Xi, N. M., & Li, J. J. (2021). Benchmarking computational doublet-detection methods for single-cell RNA sequencing data. Cell Systems, 12(2), 176–194. https://doi.org/10.1016/j.cels.2020.11.008
Xiao, C., Chen, Y., Meng, Q., Wei, L., & Zhang, X. (2024). Benchmarking multi-omics integration algorithms across single-cell RNA and ATAC data. Briefings in Bioinformatics, 25(2), bbae095. https://doi.org/10.1093/bib/bbae095
Xie, B., Jiang, Q., Mora, A., & Li, X. (2021). Automatic cell type identification methods for single-cell RNA sequencing. Computational and Structural Biotechnology Journal, 19, 5874–5887. https://www.sciencedirect.com/science/article/pii/S2001037021004499
Zappia, L., Richter, S., Ramı́rez-Suástegui, C., Kfuri-Rubens, R., Vornholz, L., Wang, W., Dietrich, O., Frishberg, A., Luecken, M. D., & Theis, F. J. (2025). Feature selection methods affect the performance of scRNA-seq data integration and querying. Nature Methods, 1–11. https://doi.org/10.1038/s41592-025-02624-3
Zhang, M. J., Ntranos, V., & Tse, D. (2020). Determining sequencing depth in a single-cell RNA-seq experiment. Nature Communications, 11(1), 774. https://doi.org/10.1038/s41467-020-14482-y