CDesk: 2. scRNA pipeline

Our CDesk scRNA module comprises of 8 function submodules. Here we present you the CDesk scRNA working pipeline and how to use it to analyze your scRNA data.

2.1 scRNA: Preprocess

CDesk preprocess module integrates preprocessing tools for single-cell RNA-seq data and supports four distinct analysis workflows: cellranger, drseq, singleron, and dnbc. Below is a brief overview of each:

  1. cellranger: The official 10x Genomics pipeline for single-cell RNA-seq analysis. This workflow checks the input CSV file for sample names and FASTQ file paths, creates symbolic links to the cellranger/input directory, and then runs cellranger count for each sample using the specified reference genome.
  2. singleron: Based on Singleron’s Celescope pipeline. This workflow links FASTQ files to the singleron/data directory, generates a sample mapping file (mapfile), and runs multi-sample analysis using the multi_rna command, followed by execution of the generated shell script.
  3. dnbc: A single-cell RNA-seq analysis pipeline developed by MGI for the DNBelab C4 platform, powered by dnbc4tools. It requires cDNA and oligo FASTQ files as input and performs data processing using the dnbc4tools rna run command with a user-specified genome index.
  4. drseq: Use the quality control and analysis software DrSeq for Drop-seq data. It takes two sequencing files as input: data_1.fastq (containing barcode information) and data_2.fastq (containing read sequences).
  5. smartseq: The preprocess pipeline is the same as bulkRNA preproces.

1. cellranger

Here is an example about how to use the CDesk preprocess cellranger module.

CDesk scRNA preprocess cellranger \
-i /.../input.csv \
-s mm10 -o /.../output_directory
Parameters(*necessary) Description Default value
-i,--input* The input fq information file
-o,--output* The output directory
-s,--species* Specify the species
-t,--thread Number of threads 8

If the pipeline runs successfully, there would be a cellranger running output in the output directory.

What should the input file look like?

=== cellranger.csv ===
sample,fq1,fq2
pbmc,/mnt/linzejie/CDesk_test/data/2.scRNA/1.preprocess/10x/pbmc_1.fastq.gz,/mnt/linzejie/CDesk_test/data/2.scRNA/1.preprocess/10x/pbmc_2.fastq.gz

- sample: The sample name to assign as prefix
- fq1: The fastq read1 compressed file
- fq2: The fastq read2 compressed file

2. singleron

Here is an example about how to use the CDesk preprocess singleron module.

CDesk scRNA preprocess singleron \
-i /.../input.csv \
-s mm10 -o /.../output_directory
Parameters(*necessary) Description Default value
-i,--input* The input fq information file
-o,--output* The output directory
-s,--species* Specify the species
-t,--thread Number of threads 8

If the pipeline runs successfully, there would be a celescope running output in the output directory.

What should the input file look like?

=== celescope.csv ===
sample,fq1,fq2
D0,/mnt3/linzejie/CDesk_test/data/2.scRNA/singleron/D0_1.fq.gz,/mnt3/linzejie/CDesk_test/data/2.scRNA/singleron/D0_2.fq.gz
D4,/mnt3/linzejie/CDesk_test/data/2.scRNA/singleron/D4_1.fq.gz,/mnt3/linzejie/CDesk_test/data/2.scRNA/singleron/D4_2.fq.gz

- sample: The sample name to assign as prefix
- fq1: The fastq read1 compressed file
- fq2: The fastq read2 compressed file

3. dnbc

Here is an example about how to use the CDesk preprocess dnbc module.

CDesk scRNA preprocess dnbc \
-i /.../input.csv \
-s mm10 -o /.../output_directory
Parameters(*necessary) Description Default value
-i,--input* The input fq information file
-o,--output* The output directory
-s,--species* Specify the species
-t,--thread Number of threads 8

If the pipeline runs successfully, there would be a dnbc4tools running output in the output directory.

What should the input file look like?

=== dnbc.csv ===
sample,cDNAfq1,cDNAfq2,oligofq1,oligofq2
PHCAR_Day4,/mnt3/linzejie/CDesk_test/data/2.scRNA/dnbc/huada/PHCAR_Day4_R1_001.fastq.gz,/mnt3/linzejie/CDesk_test/data/2.scRNA/dnbc/huada/PHCAR_Day4_R2_001.fastq.gz,/mnt3/linzejie/CDesk_test/data/2.scRNA/dnbc/huada/PHCAR_Day4-oligo_R1_001.fastq.gz,/mnt3/linzejie/CDesk_test/data/2.scRNA/dnbc/huada/PHCAR_Day4-oligo_R2_001.fastq.gz
PHCAR_Day8,/mnt3/linzejie/CDesk_test/data/2.scRNA/dnbc/huada/PHCAR_Day8_R1_001.fastq.gz,/mnt3/linzejie/CDesk_test/data/2.scRNA/dnbc/huada/PHCAR_Day8_R2_001.fastq.gz,/mnt3/linzejie/CDesk_test/data/2.scRNA/dnbc/huada/PHCAR_Day8-oligo_R1_001.fastq.gz,/mnt3/linzejie/CDesk_test/data/2.scRNA/dnbc/huada/PHCAR_Day8-oligo_R2_001.fastq.

- sample: The sample name to assign as prefix
- cDNAfq1: The cDNA fastq read1 compressed file
- cDNAfq2: The cDNA fastq read2 compressed file
- oligofq1: The oligo fastq read1 compressed file
- oligofq2: The oligo fastq read2 compressed file

4. drseq

Here is an example about how to use the CDesk preprocess drseq module.

CDesk scRNA preprocess drseq \
-i /.../input.csv \
-s mm10 -o /.../output_directory
Parameters(*necessary) Description Default value
-i,--input* The input fq information file
-o,--output* The output directory
-s,--species* Specify the species
-t,--thread Number of threads 8
--barcode The barcode range 1:12
--umi The UMI code range 13:20

If the pipeline runs successfully, there would be a DrSeq running output in the output directory.

What should the input file look like?

=== DrSeq.csv ===
sample,fq1,fq2
pbmc,/mnt/linzejie/CDesk_test/data/2.scRNA/1.preprocess/10x/pbmctest_1.fastq,/mnt/linzejie/CDesk_test/data/2.scRNA/1.preprocess/10x/pbmctest_2.fastq

- sample: The sample name to assign as prefix
- fq1: The fastq read1 compressed file
- fq2: The fastq read2 compressed file

5. smartseq

The same as CDesk bulk preprocess.

2.2 scRNA: cluster

CDesk scRNA cluster module automates scRNA-seq clustering analysis by providing two analytical modes: Seurat (R) and Scanpy (Python). The workflow includes standard scRNA-seq analysis steps—data loading, quality control (filtering of genes and cells, mitochondrial gene content assessment), normalization, feature selection, dimensionality reduction (PCA, UMAP, t-SNE), and clustering. It supports multiple input formats, including H5, H5AD, RDS, and 10x Genomics data, and generates comprehensive PDF reports containing clustering results and visualizations.

Here is an example about how to use the CDesk scRNA clustering module.

CDesk scRNA cluster \
-i /.../input \
-o /.../output_directory \
-n test --mode seurat(scanpy)
Parameters(*necessary) Description Default value
-i,--input* The input scRNA data (.h5,.txt/csv/tsv.gz,10x input directory,.rds for seurat mode,.h5ad for scanpy mode)
-o,--output* The output directory
-n,--name* The output prefix name
--mode* Analysis mode {seurat,scanpy}
--min_cells Genes filtration minimum cells threshold 3
--min_features Cells filtration minimum features threshold 200
--nFeature_RNA_min cells feature minimum filter threshold 200
--nFeature_RNA_max cells feature maximum filter threshold 2500
--mt_percent cells mitochondria percent filter threshold 5
--variable_features clustering variable features 2000
--dim_prefer clustering dimesions 30
--res clustering resolution 1
--width Plot width 10
--height Plot height 8

If the pipeline runs successfully, there would be clustering plots and h5ad output file in the output directory for scanpy mode. There would be elbow plots, cluste clustering plots, cluster tree plots and rds output file in the output directory for seurat mode.

CDesk scRNA scanpy cluster example result
CDesk scRNA seurat cluster example result

2.3 scRNA: annotation

CDesk scRNA annotation module realizes the automation of annotation of single-cell data. It offers two annotation strategies: marker gene-based annotation (marker mode) and reference-guided label transfer (transfer mode). In marker mode, cell types are assigned by comparing the expression levels and detection rates of known marker genes across clusters. In transfer mode, cell type labels from a reference dataset are transferred to the query dataset using Seurat’s integration framework. Both methods produce annotated cell type labels and automatically generate visualizations of the annotations projected onto low-dimensional embeddings.

1. marker mode

Cell type annotation is performed using user-provided marker genes that are specifically associated with defined cell types. Three key thresholds are applied to determine cell type identity:

  1. The marker gene must have an average expression level in the cluster exceeding the threshold expression_thre (log1p normalized expression).
  2. The gene must be expressed in a proportion of cells within the cluster greater than percentage_thre.
  3. For a candidate cell type to be considered, a minimum fraction (marker_percentage) of its marker genes must simultaneously meet both of the above criteria.

Finally, the average proportion score is calculated for each qualifying cell type, and the cell type with the highest score is assigned as the final annotation for each cluster.

Here is an example about how to use the CDesk scRNA annotation marker module.

CDesk scRNA annotation marker \
-i /.../input.rds \
--marker /.../marker.csv \
--meta meta \
-o /.../output_directory
Parameters(*necessary) Description Default value
-i,--input* The input .rds file to annotate
-o,--output* The output directory
--marker* The marker file with cell marker information
--meta* The meta colname of reference data used
--cluster Plot clustering umap
--expression_thre The average expression threshold 1.5
--percentage_thre The express > 0 percentage threshold 0.7
--marker_thre The proportion threshold for passing the screening 0.7
--width Plot width 10
--height Plot height 8

If the pipeline runs successfully, there would be a cell annotation file and a plot of the annotations projected onto low-dimensional embeddings.

What should the input file look like?

=== marker.csv ===
Celltype,Marker
C2,Zscan4a
C2,Zscan4b
C2,Eif1a
C2,Dux
C4,Obox4
C4,Khdc1b
C4,Usp17l1
C4,Zfp352
C8,Cdx2
C8,Pou5f1
C8,Nanog
C8,Eomes
C8,Tead4
TE,Cdx2
TE,Eomes
TE,Gata3
TE,Krt8
TE,Krt18
TE,Elf5
PrE,Gata6
PrE,Gata4
PrE,Sox17
PrE,Pdgfra
PrE,Dab2
PrE,Lrp2

- Celltype: The celltypes to specify
- Marker: The corresponding markers

2. transfer mode

It enables reference-based cell type label transfer. It first loads the reference and query datasets, identifies shared genes, and performs independent preprocessing—including normalization, variable feature selection, scaling, and PCA. Anchors between datasets are then identified using Seurat’s FindTransferAnchors in PCA space, followed by label transfer via TransferData. Cell type annotations from the reference (specified by ref_meta) are predicted and added to the query dataset’s metadata.

Here is an example about how to use the CDesk scRNA annotation transfer module.

CDesk scRNA annotation transfer \
--query /.../qry.rds \
--ref /.../ref.rds \
--meta meta \
-o /.../output_directory
Parameters(*necessary) Description Default value
--query* The query .rds data to annotate
--ref* The reference .rds data
-o,--output* The output directory
--meta* The meta colname of reference data used
--cluster Plot type umap
--nfeatures The number of variable features 5000
--dims The number of PC dimensions used 30
--width Plot width 10
--height Plot height 8

If the pipeline runs successfully, there would be a cell annotation file and a plot of the annotations projected onto low-dimensional embeddings.

CDesk scRNA annotation example result (left: marker, right: transfer)

2.4 scRNA: marker

CDesk marker module supports two modes of differential expression analysis for single-cell RNA-seq data:

  1. All-clusters mode (all): Uses FindAllMarkers to automatically identify differentially expressed genes (DEGs) for each cluster compared to all others, and generates a heatmap visualizing the top 10 marker genes per cluster.
  2. Two-group mode (2group): Uses FindMarkers to perform pairwise comparison between two user-specified cell types or clusters, identifying their specific DEGs.

Both modes allow customization of key parameters, including log-fold change threshold, adjusted p-value cutoff, and minimum expression percentage. Significantly differentially expressed genes are exported to CSV files for downstream analysis.

1. all

Here is an example about how to use the CDesk scRNA marker all module.

CDesk scRNA marker all \
-i /.../input.rds -o /.../output_directory \
--meta meta -m 0.25 -p 0.05 -f 0.5
Parameters(*necessary) Description Default value
-i,--input* The input Seurat object rds file
-o,--output* The output directory
--meta* The meta colname of reference data used
-fc Log Fold Change threshold 0.5
-p Adjusted p-value threshold 0.05
-m Minimum percentage of expressed cells 0.25
--width Plot width 16
--height Plot height 14

If the pipeline runs successfully, there would be a cell type marker file and a heatmap plot showing the top differential marker expression in all cell types.

2. 2 group

Here is an example about how to use the CDesk scRNA marker 2group module.

CDesk scRNA marker 2group \
-i /.../input.rds -o /.../output_directory \
--meta meta --type1 type1 --type2 type2 \
-f 0.5 -p 0.05 -m 0.25
Parameters(*necessary) Description Default value
-i,--input* The input Seurat object rds file
-o,--output* The output directory
-m,--meta* The meta colname of reference data used
--type1* Specify the first cell type
--type2* Specify the second cell type
-fc Log Fold Change threshold 0.5
-p Adjusted p-value threshold 0.05
-m Minimum percentage of expressed cells 0.25
--width Plot width 16
--height Plot height 14

If the pipeline runs successfully, there would be a cell type marker file and a heatmap plot showing the marker expression in two cell types.

CDesk scRNA marker example result (left: all, right: 2 group)

2.5 scRNA: trajectory

CDesk trajectory module supports four single-cell transcriptomic analysis functions for trajectory inference and RNA velocity:

  1. cstreet: Performs cell trajectory inference using the CStreet tool.
  2. diffusion: Constructs developmental trajectories via diffusion map embedding and computes pseudotime.
  3. monocle: Implements advanced trajectory inference with Monocle3, enabling differential gene analysis and co-expression module detection.
  4. velocity: Conducts RNA velocity analysis to predict cell fate transitions by integrating spliced and unspliced mRNA dynamics, revealing the directionality of cellular state changes.

All modules support parameterized input and generate visual outputs, providing a comprehensive suite for transitioning from static clustering to dynamic developmental analysis of single-cell data.

1. cstreet

Here is an example about how to use the CDesk scRNA trajectory cstreet module.

CDesk scRNA trajectory cstreet \
--expression /.../ExpressionMatrix_list.txt \
--state /.../CellStates_list.txt \
-n test -o /.../output_directory
Parameters(*necessary) Description Default value
--expression* Input expression matrixes list file
--state* Input cell state list file
-o,--output* The output directory
-n,--name* The meta colname of reference data used

If the pipeline runs successfully, there would be CStreet result in the output directory.

CDesk scRNA trajectory cstreet example result

2. diffusion

Here is an example about how to use the CDesk scRNA trajectory diffusion module.

CDesk scRNA trajectory diffusion \
-i /.../input.rds \
-o /.../output_directory \
-m meta --start celltype
Parameters(*necessary) Description Default value
-i,--input* Input seurat rds file
--start* Cell type start point for trajectory analysis
-o,--output* The output directory
-m,--meta* Meta column for trajectory analysis
--width Plot width 8
--height Plot height 8

If the pipeline runs successfully, it would generate a PDF file containing all diffusion map embeddings and rds object with diffusion map.

CDesk scRNA trajectory diffusion example result

3. monocle

Here is an example about how to use the CDesk scRNA trajectory monocle module.

CDesk scRNA trajectory monocle \
i /.../input.rds \
-m meta --start celltype \
-o /.../output_directory
Parameters(*necessary) Description Default value
-i,--input* Input seurat rds file
--start* Cell type start point for trajectory analysis
-o,--output* The output directory
-m,--meta* Meta column for trajectory analysis
--width Plot width 8
--height Plot height 8

If the pipeline runs successfully, it generates the following outputs:

CDesk scRNA trajectory monocle pseudotime trajectory visualization example result
CDesk scRNA trajectory monocle feature plots for the top 10 DEGs example result
CDesk scRNA trajectory monocle feature co-expression module heatmap example result

4. velocity

Here is an example about how to use the CDesk scRNA trajectory velocity module.

CDesk scRNA trajectory velocity \
--cellranger /.../cellranger \
--rds /.../input.rds \
--genes /.../genes.txt \
-o /.../output_directory \
--meta meta -s speices
Parameters(*necessary) Description Default value
--cellranger* Input cellranger output directory
-s,--species Specify the species
--rds* Input seurat rds file
-o,--output* The output directory
-m,--meta* Meta column for trajectory analysis
--genes Interested genes file
-t,--thread Number of threads 10
--width Plot width 10
--height Plot height 10

If the pipeline runs successfully, it would generate:

CDesk scRNA trajectory velocity example result

2.6 scRNA: similarity

CDesk similarity module assesses gene expression similarity between cell clusters from two single-cell RNA-seq datasets. It supports multiple input formats, reads in both datasets, and performs gene filtering. Using Seurat’s workflow, the module normalizes data, selects highly variable features, and integrates the datasets into a shared embedding space via anchor-based integration. For user-specified pairs of cell clusters, it computes the Pearson correlation coefficient of average gene expression profiles between each cluster pair. The results are visualized as a heatmap annotated with correlation values, enabling intuitive comparison of transcriptional similarity across datasets.

Here is an example about how to use the CDesk scRNA similarity transfer module.

CDesk scRNA similarity \
--sample1 /.../scRNA1 \
--sample2 /.../scRNA2 \
--meta1 meta1 --meta2 meta2 \
--type1 /.../type1.txt \
--type2 /.../type2.txt \
-o /.../output_directory
Parameters(*necessary) Description Default value
--sample1* The first scRNA data (.h5,.txt/csv/tsv.gz,10x input directory,.rds for seurat mode,.h5ad for scanpy mode)
--sample2* The second scRNA data
--meta1* Meta column used in data 1
--meta2* Meta column used in data 2
--type1* Data 1 group file
--type2* Data 2 group file
-o,--output* The output directory
--nfeatures_used The number of variable features 2000
--integrate_features The number of features for integration 2000
--width Plot width 10
--height Plot height 8

If the pipeline runs successfully, there would be a correlation heatmap showing the cross-sample cell type similarity analysis in the outputdirectory.

CDesk scRNA similarity example result
What should the input file look like?

=== type1.txt  ===
a1
a2
a3
a4
=== type2.txt  ===
b1
b2
b3
b4

2.7 scRNA: interaction

CDesk interaction module provides two approaches for analyzing cellular interactions in single-cell data:

  1. CellChat: Performs cell-cell communication network analysis by leveraging ligand-receptor interaction databases (CellChatDB). It takes a Seurat object as input, infers signaling probabilities between cell clusters, and reconstructs intercellular communication networks. The method generates visualizations including signaling network graphs, ligand-receptor interaction counts, and signaling strength plots, enabling comprehensive analysis of intercellular crosstalk.

  2. PySCENIC: Analyzes gene regulatory networks (GRNs) by inferring transcription factor (TF)-target gene co-expression relationships within cells. The pipeline consists of three steps: (1) GRN inference using GRNBoost2, (2) cis-regulatory motif enrichment analysis with RcisTarget, and (3) regulon activity scoring via AUCell. Additionally, it computes Regulon Specificity Scores (RSS) and generates corresponding heatmaps, along with visualizations of the inferred gene regulatory networks.

1. Cellchat

Here is an example about how to use the CDesk scRNA cellchat module.

CDesk scRNA interaction cellchat \
-i /.../input.rds \
-o /.../output_directory \
--s human --group meta
Parameters(*necessary) Description Default value
-i,--input* Input Seurat object rds file
-o,--output* The output directory
-s,--species* Species (human or mouse)
--group* Meta column for grouping
-t,--thread Number of threads 16
--width Plot width 8
--height Plot height 8

If the pipeline runs successfully, it would generate:

CDesk scRNA interaction cellchat example result

2. Pyscenic

Here is an example about how to use the CDesk scRNA cellchat module.

CDesk scRNA interaction pyscenic \
-i /.../input.data \
-o /.../output_directory \
--tf /.../tfs.txt \
--db /.../db_dir \
--species hg38
Parameters(*necessary) Description Default value
-i,--input* Input scRNA file, h5ad/csv/tsv format
-o,--output* The output directory
--tf* TF list file
--db* RcisTarget database directory
-s,--species* Species (human/mouse/hg*/mm*)
-t,--thread Number of threads 16
--auc AUC threshold 0.05
--nes Normalized Enrichment Score threshod 3
--normalize Normalize scRNA data or not (True/False) False
--scale Scale scRNA data or not (True/False) False
--log Log scRNA data or not (True/False) False
--top_regulons Show top N regulons 20
--top_targets Show top N targets genes for each TF 20
--plot_tf Draw network for specific TF

If the pipeline runs successfully, it would generate:

CDesk scRNA interaction pyscenic example result
What should the input file look like?

=== tfs.txt  ===
BATF
BCL6
EGR1
EGR2
ETS1
FOS
FOXP3
GATA1
GATA3
IRF1
IRF4
IRF8
JUN
JUNB
JUND
LEF1
MYC
NFATC1
NFKB1
NFKB2
PAX5
PRDM1
RELA
RUNX1
RUNX3
STAT1
STAT3
STAT4
STAT5A
=== db_dir  ===
hg38_10kbp_up_10kbp_down_full_tx_v10_clust.genes_vs_motifs.rankings.feather
mm10_10kbp_up_10kbp_down_full_tx_v10_clust.genes_vs_motifs.rankings.feather
motifs-v10nr_clust-nr.hgnc-m0.001-o0.0.tbl
motifs-v10nr_clust-nr.mgi-m0.001-o0.0.tbl
motifs-v9-nr.hgnc-m0.001-o0.0.tbl
motifs-v9-nr.hgnc-m0.001-o0.0.tbl.1
motifs-v9-nr.mgi-m0.001-o0.0.tbl

Genomic region database file (.feather)
Transcription factor motif annotation file (.tbl)
HGNC refers to human gene symbols; MGI refers to mouse gene symbols.

2.8 scRNA: integrate

The integrate module provides two strategies for single-cell data integration: Seurat-based integration and bulk + scRNA-seq integration.

  1. The Seurat integration mode supports three algorithms—merge, CCA (Canonical Correlation Analysis), and Harmony—to correct batch effects across multiple single-cell datasets, enabling seamless data integration and visualization.
  2. The bulk + scRNA-seq integration mode innovatively aligns bulk RNA-seq data with single-cell data in a shared gene space, allowing joint visualization in a reduced-dimensional space. This interactive visualization facilitates comparative analysis of the two data types, revealing their relationships and transcriptional similarities.

1. seurat

Here is an example about how to use the CDesk scRNA integrate seurate module.

CDesk scRNA integrate seurat \
-i /.../input.csv \
--mode merge(CCA/harmony) \
-o /.../output_directory
Parameters(*necessary) Description Default value
-i,--input* Seurat data list file (multiple formats acceptable)
-o,--output* The output directory
--mode* Choose a integration method {merge,harmony,CCA}
--min_cells minimum cells threshold for data filtration preprocess 3
--min_features minimum feature threshold for data filtration preprocess 200
--nfeatures_used The number of variable features 2000
--integrate_features The number of features for integration 2000
--pc_num PCA dimensional reduction dimensions 50
--pc_reduce UMAP dimensional reduction dimensions 30
-max_iter_harmony Harmony maximal iteration 20
--k_anchor k.anchor for CCA integrate 20
--width Plot width 10
--height Plot height 10

If the pipeline runs successfully, there would be a merged Seurat object and a UMAP plot for batch effect assessment of the integrated data.

CDesk scRNA integrate seurat example result (merge,harmony,CCA)

What should the input file look like?

=== input.csv  ===
file,group
/mnt/linzejie/CDesk_test/data/2.scRNA/8.integrate/GSM5050521_G1counts.csv.gz,sample1
/mnt/linzejie/CDesk_test/data/2.scRNA/8.integrate/GSM5050523_G2counts.csv.gz,sample2

- Celltype: The celltypes to specify
- Marker: The corresponding markers

2. bulk

Here is an example about how to use the CDesk scRNA integrate bulk module.

CDesk scRNA integrate bulk \
--bulk /.../bulk_data_list.txt \
--scRNA /.../scRNA_data_list.txt \
--meta /.../meta.csv \
-o /.../output_directory
Parameters(*necessary) Description Default value
--bulk* bulkRNA data list file
--scRNA* scRNA data list file
-m,--meta* The meta file
-o,--output* The output directory
--nfeatures_used The number of variable features 2000
--integrate_features The number of features for integration 2000
--pc_num PCA dimensional reduction dimensions 50
--pc_reduce UMAP dimensional reduction dimensions 30
--width Plot width 10
--height Plot height 10

If the pipeline runs successfully, there would be a merged Seurat object and and static as well as interactive visualization plots (including PCA, UMAP, and t-SNE) for integrated data.

CDesk scRNA integrate bulk example result (PCA,UMAP,t-SNE)

What should the input file look like?

=== bulk_data_list.txt ===
/mnt/linzejie/CDesk_test/data/2.scRNA/8.integrate/data/GeneSymbol_counts.inhouse.txt
/mnt/linzejie/CDesk_test/data/2.scRNA/8.integrate/data/GeneSymbol_counts.txt
/mnt/linzejie/CDesk_test/data/2.scRNA/8.integrate/data/GeneSymbol_counts.a378_a388.txt
=== scRNA_data_list.txt ===
/mnt/linzejie/CDesk_test/data/2.scRNA/8.integrate/data/GeneSymbol_counts.a4_CRR271345_expmat.txt
/mnt/linzejie/CDesk_test/data/2.scRNA/8.integrate/data/GeneSymbol_counts.a4_CRR271346_expmat.txt
/mnt/linzejie/CDesk_test/data/2.scRNA/8.integrate/data/GeneSymbol_counts.a4_CRR271347_expmat.txt
/mnt/linzejie/CDesk_test/data/2.scRNA/8.integrate/data/GeneSymbol_counts.a4_CRR271348_expmat.txt
/mnt/linzejie/CDesk_test/data/2.scRNA/8.integrate/data/GeneSymbol_counts.a4_CRR271349_expmat.txt
/mnt/linzejie/CDesk_test/data/2.scRNA/8.integrate/data/GeneSymbol_counts.a4_CRR271350_expmat.txt
/mnt/linzejie/CDesk_test/data/2.scRNA/8.integrate/data/GeneSymbol_counts.a4_CRR271351_expmat.txt
/mnt/linzejie/CDesk_test/data/2.scRNA/8.integrate/data/GeneSymbol_counts.a4_CRR271352_expmat.txt
/mnt/linzejie/CDesk_test/data/2.scRNA/8.integrate/data/GeneSymbol_counts.a4_CRR271353_expmat.txt
/mnt/linzejie/CDesk_test/data/2.scRNA/8.integrate/data/GeneSymbol_counts.a4_CRR271354_expmat.txt
/mnt/linzejie/CDesk_test/data/2.scRNA/8.integrate/data/GeneSymbol_counts.a4_CRR271355_expmat.txt
/mnt/linzejie/CDesk_test/data/2.scRNA/8.integrate/data/GeneSymbol_counts.a4_CRR271356_expmat.txt
/mnt/linzejie/CDesk_test/data/2.scRNA/8.integrate/data/GeneSymbol_counts.a4_CRR271357_expmat.txt
/mnt/linzejie/CDesk_test/data/2.scRNA/8.integrate/data/GeneSymbol_counts.a4_CRR271358_expmat.txt
/mnt/linzejie/CDesk_test/data/2.scRNA/8.integrate/data/GeneSymbol_counts.a4_CRR271359_expmat.txt
/mnt/linzejie/CDesk_test/data/2.scRNA/8.integrate/data/GeneSymbol_counts.a4_CRR271360_expmat.txt
/mnt/linzejie/CDesk_test/data/2.scRNA/8.integrate/data/GeneSymbol_counts.a4_CRR271361_expmat.txt
/mnt/linzejie/CDesk_test/data/2.scRNA/8.integrate/data/GeneSymbol_counts.a4_CRR271362_expmat.txt
/mnt/linzejie/CDesk_test/data/2.scRNA/8.integrate/data/GeneSymbol_counts.a4_CRR271363_expmat.txt
/mnt/linzejie/CDesk_test/data/2.scRNA/8.integrate/data/GeneSymbol_counts.a4_CRR271364_expmat.txt
=== meta.csv ===
sample,tag
P1.AAACATCG,2C_E2
P1.AACAACCA,zygote_E1
P1.AACCGAGA,zygote_E1
P1.AACGCTTA,zygote_E1
P1.AACGTGAT,2C_E2
......
A379,LACIX5
A380,LACIX10
A381,LACIX10
A382,LACIX5_N2B27
A383,LACIX5_N2B27
A384,LACIX10_N2B27
A385,LACIX10_N2B27
A386,LXP10
A387,LXP10
A388,LXP10

- sample: The sample(column) name of bulk/scRNA data
- tag: The tag name to specify

CDesk handbook