Run consensus NMF
RunCNMF(
  counts.fn,
  K,
  out.path = NULL,
  run.name = NULL,
  n.iter = 100,
  n.var.genes = 2000,
  genes.fn = NULL,
  seed = 1024,
  cores = -1,
  n.top.genes = 100,
  local.density.cutoff = 0.5,
  local.neighborhood.size = 0.3,
  show.clustering = FALSE
)
counts.fn: path to the cell x gene counts file. This is expected to be a tab-delimited text file or an `anndata` object saved in the h5ad format.
K: int, number of components (topics or dimensions) for cNMF.
out.path: the output directory into which all results will be placed. Default: current path.
run.name: a subdirectory out.path/run.name will be created and all output files will have run.name as their prefix. Default: current timestamp.
n.iter: number of NMF replicates to run for each K. Default: 100.
n.var.genes: (optional) the number of highest-variance genes that will be used for running the factorization. Default: 2000.
genes.fn: (optional) path to a file listing the over-dispersed genes to be used for the factorization steps, one gene per line. If not provided, over-dispersed genes are calculated automatically and the number of genes to use can be set with the `n.var.genes` parameter. Default: NULL.
seed: the master seed that will be used to generate the individual seed for each NMF replicate. Default: 1024.
cores: how many cores can be used in parallel. Default: -1, which uses all available cores detected by `parallel::detectCores()`.
n.top.genes: number of genes with the highest loadings to report for each gene expression program (GEP). Default: 100.
local.density.cutoff: the threshold on the average distance to the nearest neighbors, used for local density filtering. The number of neighbors is set by `local.neighborhood.size`. A value of 2.0 or above means that nothing will be filtered out. Default: 0.5.
local.neighborhood.size: fraction of replicates to consider as nearest neighbors for local density filtering. E.g. if you run 100 replicates and set this to 0.3, 30 nearest neighbors will be used for outlier detection. Default: 0.3.
show.clustering: whether or not the clustergram image is output. Default: FALSE.
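
The input files described above can be prepared in base R. The following is a minimal sketch, not an official example: the toy matrix, the file names, and the run name are hypothetical, and the exact text layout expected for the counts file (cell IDs as row names, gene names in the header) is an assumption. The call itself is guarded so the snippet also runs in a session where `RunCNMF` is not loaded.

```r
# Build a tiny cell x gene count matrix (toy data, for illustration only).
set.seed(1)
n.cells <- 50
n.genes <- 20
counts <- matrix(
  rpois(n.cells * n.genes, lambda = 2),
  nrow = n.cells,
  dimnames = list(
    paste0("cell", seq_len(n.cells)),
    paste0("gene", seq_len(n.genes))
  )
)

# Save as tab-delimited text. Assumed layout: cells in rows, genes in columns.
counts.fn <- file.path(tempdir(), "counts.tsv")
write.table(counts, counts.fn, sep = "\t", quote = FALSE, col.names = NA)

# Optional gene list for genes.fn: one gene per line.
genes.fn <- file.path(tempdir(), "genes.txt")
writeLines(colnames(counts)[1:10], genes.fn)

# Call RunCNMF only if it is available in the current session.
if (exists("RunCNMF")) {
  RunCNMF(
    counts.fn = counts.fn,
    K = 3,
    out.path = tempdir(),
    run.name = "example_run",  # hypothetical run name
    n.iter = 20,
    genes.fn = genes.fn,
    seed = 1024,
    cores = 1
  )
}
```

Because `genes.fn` is supplied here, `n.var.genes` is left at its default.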
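
To make the `seed`, `cores`, and `local.neighborhood.size` descriptions concrete, here is an illustrative sketch of how such arguments could be resolved. The helper names `replicate_seeds`, `resolve_cores`, and `n_neighbors` are hypothetical and are not part of the package; the seed-drawing scheme and the rounding rule are assumptions, so the actual per-replicate seeds used by `RunCNMF` will likely differ.

```r
# Hypothetical helper: derive one seed per NMF replicate from a master seed.
replicate_seeds <- function(seed, n.iter) {
  set.seed(seed)
  sample.int(.Machine$integer.max, n.iter)
}

# Hypothetical helper: a negative value means "use all detected cores".
resolve_cores <- function(cores) {
  if (cores < 0) parallel::detectCores() else cores
}

# Hypothetical helper: convert the neighborhood fraction to a neighbor count.
# Rounding to the nearest integer is an assumption.
n_neighbors <- function(n.iter, local.neighborhood.size) {
  round(n.iter * local.neighborhood.size)
}

seeds <- replicate_seeds(seed = 1024, n.iter = 100)
length(seeds)            # one seed per replicate
resolve_cores(4)         # an explicit core count is used as given
n_neighbors(100, 0.3)    # 30, matching the example in the text
```

Deriving every replicate seed from a single master seed keeps a run reproducible while still giving each replicate a different random initialization.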
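
The local density filter controlled by `local.density.cutoff` and `local.neighborhood.size` can be sketched as follows. This is a simplified illustration under stated assumptions, not the package's implementation: the helper `local_density_filter` is hypothetical, and the choice of Euclidean distance on L2-normalized rows is an assumption. Under that assumption no pairwise distance can exceed 2, which is consistent with a cutoff of 2.0 filtering nothing.

```r
# Hypothetical helper: flag rows whose mean distance to their nearest
# neighbors is below the cutoff. Rows of `spectra` are replicate components.
local_density_filter <- function(spectra, n.neighbors, cutoff) {
  # L2-normalize each row so pairwise Euclidean distances lie in [0, 2].
  unit <- spectra / sqrt(rowSums(spectra^2))
  d <- as.matrix(dist(unit))
  # Mean distance to the n.neighbors nearest other rows
  # (the first sorted entry is the zero distance of a row to itself).
  avg.dist <- apply(d, 1, function(x) mean(sort(x)[2:(n.neighbors + 1)]))
  avg.dist < cutoff
}

# Toy data: two tight groups of 10 components each, plus one outlier.
set.seed(42)
base <- rbind(c(1, 0, 0), c(0, 1, 0))
spectra <- abs(base[rep(1:2, each = 10), ] + matrix(rnorm(60, sd = 0.01), 20))
spectra <- rbind(spectra, c(1, 1, 1))  # outlier, far from both groups

keep <- local_density_filter(spectra, n.neighbors = 3, cutoff = 0.5)
sum(keep)  # the 20 grouped components are kept, the outlier is dropped

keep.all <- local_density_filter(spectra, n.neighbors = 3, cutoff = 2.0)
all(keep.all)  # a cutoff of 2.0 keeps everything
```

Lowering the cutoff makes the filter stricter, while a larger neighborhood makes the average distance less sensitive to a few close neighbors.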