API#
Python API#
Processing#
Calculate ambient profile
Calculate ambient profile for relevant features
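As a rough illustration of what an ambient profile is, the sketch below pools counts from empty droplets and normalizes them into a 1D probability vector over features. It is not scar's implementation: the helper name `ambient_profile_from_empty` and the toy matrix are hypothetical, and plain Python lists are used only to keep the example dependency-free.

```python
def ambient_profile_from_empty(empty_counts):
    """Return per-feature ambient frequencies from empty-droplet counts.

    empty_counts: list of rows (droplets x features) of non-negative counts.
    Hypothetical helper for illustration; not part of the scar API.
    """
    if not empty_counts:
        raise ValueError("need at least one empty droplet")
    n_features = len(empty_counts[0])
    # Pool the counts of each feature across all empty droplets.
    pooled = [0] * n_features
    for row in empty_counts:
        if len(row) != n_features:
            raise ValueError("all droplets must have the same number of features")
        for j, count in enumerate(row):
            pooled[j] += count
    total = sum(pooled)
    if total == 0:
        raise ValueError("empty droplets contain no counts")
    # Normalize so the profile sums to 1 (one entry per feature).
    return [c / total for c in pooled]


# Toy input: 3 empty droplets x 4 features.
empty = [
    [2, 0, 1, 1],
    [1, 0, 3, 0],
    [1, 0, 0, 1],
]
profile = ambient_profile_from_empty(empty)
print(profile)
```

Features that never appear in empty droplets get a frequency of zero, so they would contribute no ambient signal under this simple estimate.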
Training#
The core module of scar
The scar model
Synthetic_dataset#
Generate synthetic datasets (scRNAseq, CITE-seq, scCRISPRseq) with ambient contamination
Generate synthetic single-cell RNAseq data with ambient contamination
Generate synthetic ADT count data for CITE-seq with ambient contamination
Generate synthetic sgRNA count data for scCRISPRseq with ambient contamination
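To make the idea of "ambient contamination" concrete, here is a toy sketch that adds ambient counts, drawn according to an ambient profile, to a cell's native counts. It is an assumption-laden illustration rather than the generative model used by scar: the function `simulate_contaminated_cell`, its parameters, and the example values are all hypothetical.

```python
import random


def simulate_contaminated_cell(native_counts, ambient_profile, n_ambient, rng):
    """Add n_ambient ambient molecules to one cell's native counts.

    Hypothetical toy model: each ambient molecule is assigned to a feature
    with probability given by ambient_profile.
    """
    n_features = len(native_counts)
    # Sample the feature index of every ambient molecule.
    draws = rng.choices(range(n_features), weights=ambient_profile, k=n_ambient)
    ambient_counts = [0] * n_features
    for j in draws:
        ambient_counts[j] += 1
    # Observed counts are the sum of native and ambient counts.
    observed = [n + a for n, a in zip(native_counts, ambient_counts)]
    return observed, ambient_counts


rng = random.Random(0)          # seeded for reproducibility
native = [10, 0, 0, 5]          # toy native counts for 4 features
profile = [0.4, 0.0, 0.4, 0.2]  # toy ambient profile (sums to 1)
observed, ambient = simulate_contaminated_cell(native, profile, n_ambient=20, rng=rng)
print(observed, ambient)
```

In this toy setup, a feature with zero native counts can still show observed counts purely from ambient signal, which is the kind of noise a denoising model aims to remove.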
Plotting#
Plotting functions (under development).
Reporting#
Generate denoising reports (under development).
Command Line Interface#
scAR (single cell Ambient Remover): denoising droplet-based single-cell omics data
usage: scar [-h] [--version] [-ap AMBIENT_PROFILE] [-ft FEATURE_TYPE]
[-o OUTPUT] [-m COUNT_MODEL] [-sp SPARSITY] [-hl1 HIDDEN_LAYER1]
[-hl2 HIDDEN_LAYER2] [-ls LATENT_DIM] [-epo EPOCHS] [-d DEVICE]
[-s SAVE_MODEL] [-batchsize BATCHSIZE]
[-batchsize_infer BATCHSIZE_INFER] [-adjust ADJUST]
[-cutoff CUTOFF] [-round2int ROUND2INT] [-clip_to_obs CLIP_TO_OBS]
[-moi MOI] [-verbose VERBOSE]
count_matrix [count_matrix ...]
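A minimal sketch of assembling such a command line from Python, using only flags listed in the usage synopsis above. The helper `build_scar_command` and the file and directory names are hypothetical, and the `subprocess.run` call is left commented out because it assumes `scar` is installed and on `PATH`.

```python
import shlex
# import subprocess  # uncomment together with the run() call below


def build_scar_command(count_matrix, ambient_profile=None, feature_type=None,
                       output=None, epochs=None):
    """Assemble an argument list for the scar CLI (hypothetical helper)."""
    cmd = ["scar", count_matrix]
    # Only pass the options the caller set; scar applies its own defaults otherwise.
    if ambient_profile is not None:
        cmd += ["-ap", ambient_profile]
    if feature_type is not None:
        cmd += ["-ft", feature_type]
    if output is not None:
        cmd += ["-o", output]
    if epochs is not None:
        cmd += ["-epo", str(epochs)]
    return cmd


cmd = build_scar_command(
    "filtered_feature_bc_matrix.h5",  # placeholder input file
    feature_type="mRNA",
    output="results",                 # placeholder output directory
    epochs=400,
)
print(shlex.join(cmd))
# subprocess.run(cmd, check=True)  # assumes the scar executable is available
```

Passing the arguments as a list avoids shell-quoting problems with paths that contain spaces.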
Positional Arguments#
- count_matrix
The file containing the raw count matrix as a 2D array (cells x genes), or the path to a filtered_feature_bc_matrix.h5 file
Named Arguments#
- --version
show program’s version number and exit
- -ap, --ambient_profile
The file containing the ambient profile obtained from empty droplets, as a 1D array
- -ft, --feature_type
The feature types, e.g. mRNA, sgRNA, ADT, tag, CMO and ATAC
Default: “mRNA”
- -o, --output
Output directory
- -m, --count_model
Count model
Default: “binomial”
- -sp, --sparsity
The sparsity of expected native signals
Default: 0.9
- -hl1, --hidden_layer1
Number of neurons in the first layer
Default: 150
- -hl2, --hidden_layer2
Number of neurons in the second layer
Default: 100
- -ls, --latent_dim
Dimension of latent space
Default: 15
- -epo, --epochs
Training epochs
Default: 800
- -d, --device
Device used for training, either ‘auto’, ‘cpu’, or ‘cuda’
Default: “auto”
- -s, --save_model
Save the trained model
Default: False
- -batchsize, --batchsize
Batch size for training; use a smaller value if an out-of-memory error occurs
Default: 64
- -batchsize_infer, --batchsize_infer
Batch size for inference; use a smaller value if an out-of-memory error occurs
Default: 4096
- -adjust, --adjust
Only used when calculating Bayes factors, to improve performance.
‘micro’: adjust the estimated native counts per cell (default).
‘global’: adjust the estimated native counts globally.
False: no adjustment; use the model-returned native counts.
Default: “micro”
- -cutoff, --cutoff
Cutoff for Bayes factors. See [Ly2020]
Default: 3
- -round2int, --round2int
Round the counts
Default: “stochastic_rounding”
- -clip_to_obs, --clip_to_obs
Clip the predicted native counts to the observed counts. Use with caution, as it may lead to overestimation of the overall noise.
Default: False
- -moi, --moi
Multiplicity of infection (MOI). If set, it enables optimized thresholding, which tests a series of cutoffs and selects the best one based on the distribution of infections under the given MOI. See [Dixit2016] for details. Under development.
- -verbose, --verbose
Whether to print the logging messages
Default: True
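The default for `-round2int` is “stochastic_rounding”. As a sketch of the general technique (not necessarily how scar implements it), stochastic rounding rounds a value up with probability equal to its fractional part and down otherwise, so the rounded values are unbiased on average. The function name `stochastic_round` is hypothetical.

```python
import math
import random


def stochastic_round(x, rng=random):
    """Round x to an integer, rounding up with probability frac(x).

    Hypothetical illustration of stochastic rounding; not taken from scar.
    """
    lower = math.floor(x)
    frac = x - lower
    # rng.random() is uniform on [0, 1), so this is True with probability frac.
    return lower + (1 if rng.random() < frac else 0)


rng = random.Random(0)  # seeded for reproducibility
samples = [stochastic_round(2.3, rng) for _ in range(10_000)]
mean = sum(samples) / len(samples)
print(sorted(set(samples)), mean)
```

Unlike deterministic rounding, which would always map 2.3 to 2, the average of many stochastically rounded values stays close to the original 2.3, which helps preserve small fractional counts in aggregate.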