API#
Python API#
Processing#
Calculate ambient profile
Calculate ambient profile for relevant features
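To make the term concrete, here is a minimal sketch of one generic way to build an ambient profile: pool the counts from cell-free droplets and normalize them into per-feature frequencies. This is an illustration under stated assumptions, not necessarily how scar's processing function works. The function name `ambient_profile_from_empty_droplets` and the toy data are hypothetical.

```python
def ambient_profile_from_empty_droplets(empty_counts):
    """Pool counts over empty droplets and normalize to frequencies.

    empty_counts: list of rows (droplets x features) of non-negative counts.
    Hypothetical helper for illustration only; not part of the scar API.
    """
    n_features = len(empty_counts[0])
    totals = [0] * n_features
    for row in empty_counts:
        for j, count in enumerate(row):
            totals[j] += count
    grand_total = sum(totals)
    if grand_total == 0:
        raise ValueError("empty droplets contain no counts")
    # Each entry is the fraction of all ambient counts assigned to that feature.
    return [t / grand_total for t in totals]


# Toy data: 3 empty droplets x 3 features.
empty = [[2, 0, 1], [1, 1, 0], [3, 0, 2]]
profile = ambient_profile_from_empty_droplets(empty)
```

The result is a 1D vector that sums to 1, which matches the shape the command line interface expects for its ambient profile input.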
Training#
The core module of scar
The scar model
Synthetic_dataset#
Generate synthetic datasets (scRNAseq, CITE-seq, scCRISPRseq) with ambient contamination
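The idea of "ambient contamination" can be sketched with a toy simulation: each droplet's observed counts are the sum of counts drawn from a cell-specific (native) frequency vector and counts drawn from a shared ambient frequency vector. The code below is a simplified assumption for illustration and does not use or reproduce scar's synthetic dataset module. The function `simulate_cell` and all numbers are hypothetical.

```python
import random


def simulate_cell(native_freq, ambient_freq, n_native, n_ambient, rng):
    """Return (native, ambient, observed) count vectors for one droplet.

    Hypothetical helper for illustration only; not part of the scar API.
    """
    n_features = len(native_freq)

    def draw(freq, n):
        # Draw n feature indices according to freq and tally them.
        counts = [0] * n_features
        for j in rng.choices(range(n_features), weights=freq, k=n):
            counts[j] += 1
        return counts

    native = draw(native_freq, n_native)
    ambient = draw(ambient_freq, n_ambient)
    observed = [a + b for a, b in zip(native, ambient)]
    return native, ambient, observed


rng = random.Random(0)  # fixed seed for reproducibility
native, ambient, observed = simulate_cell(
    native_freq=[0.7, 0.3, 0.0],   # the cell does not express feature 2
    ambient_freq=[0.2, 0.3, 0.5],  # but the ambient pool contains it
    n_native=80,
    n_ambient=20,
    rng=rng,
)
```

In this toy example any count observed for the third feature comes from the ambient pool alone, which is the kind of signal a denoising model aims to remove.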
Plotting#
Plotting functions (under development).
Reporting#
Generate denoising reports (under development).
Command Line Interface#
scAR (single-cell Ambient Remover) is a deep learning model for removing ambient signals in droplet-based single-cell omics.
usage: scar [-h] [--version] [-ap AMBIENT_PROFILE] [-ft FEATURE_TYPE]
            [-o OUTPUT] [-m COUNT_MODEL] [-sp SPARSITY] [-bk BATCHKEY]
            [-cache CACHECAPACITY] [-gnf GET_NATIVE_FREQUENCIES]
            [-hl1 HIDDEN_LAYER1] [-hl2 HIDDEN_LAYER2] [-ls LATENT_DIM]
            [-epo EPOCHS] [-d DEVICE] [-s SAVE_MODEL] [-batchsize BATCHSIZE]
            [-batchsize_infer BATCHSIZE_INFER] [-adjust ADJUST]
            [-cutoff CUTOFF] [-round2int ROUND2INT] [-clip_to_obs CLIP_TO_OBS]
            [-moi MOI] [-verbose VERBOSE]
            count_matrix [count_matrix ...]
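As a hedged illustration of how the usage line translates into an actual invocation, the sketch below assembles a `scar` command from Python using only flags listed in the usage above. The helper `build_scar_command`, the output directory `results`, and the chosen values are hypothetical. The command is only built and printed here, not executed.

```python
import shlex


def build_scar_command(count_matrix, output_dir, ambient_profile=None,
                       feature_type="mRNA", epochs=800):
    """Assemble an argument list for the scar CLI.

    Hypothetical helper; it only uses flags shown in the usage line.
    """
    cmd = ["scar", count_matrix,
           "-ft", feature_type,
           "-o", output_dir,
           "-epo", str(epochs)]
    if ambient_profile is not None:
        # Placeholder path; the expected file format is not stated here.
        cmd += ["-ap", ambient_profile]
    return cmd


cmd = build_scar_command("filtered_feature_bc_matrix.h5", "results",
                         feature_type="ADT", epochs=400)
print(shlex.join(cmd))
```

With scar installed, the resulting list could be passed to `subprocess.run(cmd, check=True)`. Building the command as a list avoids shell-quoting problems with paths that contain spaces.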
Positional Arguments#
- count_matrix
The file containing the raw count matrix, a 2D array (cells x genes), or the path to a filtered_feature_bc_matrix.h5 file
Named Arguments#
- --version
show program’s version number and exit
- -ap, --ambient_profile
The file containing the ambient profile obtained from empty droplets, a 1D array
- -ft, --feature_type
The feature types, e.g. mRNA, sgRNA, ADT, tag, CMO and ATAC
Default:
'mRNA'
- -o, --output
Output directory
- -m, --count_model
Count model
Default:
'binomial'
- -sp, --sparsity
The sparsity of expected native signals
Default:
0.9
- -bk, --batchkey
The batch key for batch correction
- -cache, --cachecapacity
The capacity of cache for batch correction
Default:
20000
- -gnf, --get_native_frequencies
Whether to get native frequencies: 0 (do not get them) or 1 (get them)
Default:
0
- -hl1, --hidden_layer1
Number of neurons in the first layer
Default:
150
- -hl2, --hidden_layer2
Number of neurons in the second layer
Default:
100
- -ls, --latent_dim
Dimension of latent space
Default:
15
- -epo, --epochs
Training epochs
Default:
800
- -d, --device
Device used for training, either ‘auto’, ‘cpu’, or ‘cuda’
Default:
'auto'
- -s, --save_model
Save the trained model
Default:
False
- -batchsize, --batchsize
Batch size for training. Set a smaller value if an out-of-memory error occurs.
Default:
64
- -batchsize_infer, --batchsize_infer
Batch size for inference. Set a smaller value if an out-of-memory error occurs.
Default:
4096
- -adjust, --adjust
Only used for calculating Bayes factors, to improve performance:
‘micro’ – adjust the estimated native counts per cell. Default.
‘global’ – adjust the estimated native counts globally.
False – no adjustment, use the model-returned native counts.
Default:
'micro'
- -cutoff, --cutoff
Cutoff for Bayes factors. See [Ly2020].
Default:
3
- -round2int, --round2int
Round the counts
Default:
'stochastic_rounding'
- -clip_to_obs, --clip_to_obs
Clip the predicted native counts to the observed counts. Use with caution, as it may lead to overestimation of overall noise.
Default:
False
- -moi, --moi
Multiplicity of infection (MOI). If assigned, it enables optimized thresholding, which tests a series of cutoffs to find the best one based on the distribution of infections under the given MOI. See [Dixit2016] for details. Under development.
- -verbose, --verbose
Whether to print the logging messages
Default:
True
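Two of the options above, `-round2int` (default 'stochastic_rounding') and `-clip_to_obs`, describe post-processing of predicted native counts. The sketch below shows the textbook definition of stochastic rounding (round up with probability equal to the fractional part) and a simple element-wise clip to the observed counts. It is an assumption about what these option names mean, not scar's actual implementation. Both function names are hypothetical.

```python
import math
import random


def stochastic_round(x, rng):
    """Round x down or up so that the expected value equals x.

    Hypothetical helper for illustration only; not part of the scar API.
    """
    floor = math.floor(x)
    # Round up with probability equal to the fractional part of x.
    return floor + (1 if rng.random() < x - floor else 0)


def clip_to_observed(predicted, observed):
    """Cap each predicted native count at the corresponding observed count."""
    return [min(p, o) for p, o in zip(predicted, observed)]


rng = random.Random(0)  # fixed seed for reproducibility
samples = [stochastic_round(2.3, rng) for _ in range(10000)]
mean_rounded = sum(samples) / len(samples)  # close to 2.3 on average

clipped = clip_to_observed([3, 1, 5], [2, 4, 5])
```

Unlike rounding to the nearest integer, stochastic rounding preserves the expected value of fractional counts. Clipping forces the predicted native counts to never exceed the observed counts.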