Run consensus NMF

RunCNMF(
  counts.fn,
  K,
  out.path = NULL,
  run.name = NULL,
  n.iter = 100,
  n.var.genes = 2000,
  genes.fn = NULL,
  seed = 1024,
  cores = -1,
  n.top.genes = 100,
  local.density.cutoff = 0.5,
  local.neighborhood.size = 0.3,
  show.clustering = FALSE
)

Arguments

counts.fn

path to the cell x gene counts file. This is expected to be a tab-delimited text file or an `anndata` object saved in the h5ad format.
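As a rough illustration of the tab-delimited layout, the sketch below writes a toy cell x gene matrix with base R and reads it back. The gene and cell names, the values, and the assumption that the first column holds cell identifiers are all made up for this example and are not taken from the package.

```r
# Toy cell x gene count matrix: rows are cells, columns are genes.
# All names and values here are invented for illustration.
counts <- matrix(
  c(3L, 0L, 1L,
    0L, 5L, 2L,
    4L, 1L, 0L,
    0L, 0L, 7L),
  nrow = 4, byrow = TRUE,
  dimnames = list(paste0("cell", 1:4), c("GeneA", "GeneB", "GeneC"))
)

# Write as tab-delimited text. col.names = NA leaves an empty corner
# field in the header so the row names get their own column.
counts.fn <- file.path(tempdir(), "counts.tsv")
write.table(counts, counts.fn, sep = "\t", quote = FALSE, col.names = NA)

# Read the file back to confirm the orientation survived the round trip.
check <- as.matrix(read.delim(counts.fn, row.names = 1, check.names = FALSE))
```

The resulting path in `counts.fn` is the kind of value this argument expects, assuming the function accepts a leading column of cell identifiers.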

K

int, number of components (topics or dimensions) for cNMF.

out.path

the output directory into which all results will be placed. Default: current path.

run.name

a subdirectory `out.path/run.name` will be created and all output files will have `run.name` as their prefix. Default: current timestamp.

n.iter

number of NMF iterations (replicates) to run for each K. Default: 100.

n.var.genes

(optional) the number of highest-variance genes that will be used for running the factorization. Default: 2000.

genes.fn

(optional) path to a file listing the over-dispersed genes to be used for the factorization steps, one gene per line. If not provided, over-dispersed genes are calculated automatically, and the number of genes to use can be set with the `n.var.genes` parameter. Default: NULL.
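A minimal sketch of preparing such a gene list with base R follows. The gene names are placeholders, and the file name is arbitrary.

```r
# Placeholder gene names; substitute the over-dispersed genes you selected.
genes <- c("GeneA", "GeneB", "GeneC")

# writeLines() emits one element per line, matching the
# "one gene per line" layout described above.
genes.fn <- file.path(tempdir(), "genes.txt")
writeLines(genes, genes.fn)

# Reading the file back returns the same character vector.
readLines(genes.fn)
```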

seed

the master seed that will be used to generate the individual seed for each NMF replicate. Default: 1024
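One common way to turn a single master seed into per-replicate seeds is sketched below. This is an assumption about the general technique, not the package's actual implementation, and `derive_seeds` is a hypothetical helper name.

```r
# Hypothetical helper: derive one integer seed per replicate from a
# master seed. The package may generate its seeds differently.
derive_seeds <- function(master.seed, n.iter) {
  set.seed(master.seed)
  # Draw n.iter distinct integers from the full positive integer range.
  sample.int(.Machine$integer.max, n.iter)
}

replicate.seeds <- derive_seeds(master.seed = 1024, n.iter = 100)

# The same master seed always yields the same replicate seeds,
# which is what makes a run reproducible.
same.again <- identical(replicate.seeds, derive_seeds(1024, 100))
```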

cores

specifies how many cores can be used in parallel. Default: -1, which uses all available cores detected by `parallel::detectCores()`.
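The sketch below shows how a value such as `-1` could be resolved to an actual core count. `resolve_cores` is a hypothetical helper, and capping the request at the detected count is a choice made for this example rather than documented behavior.

```r
# Hypothetical helper: map the `cores` argument to a usable core count.
resolve_cores <- function(cores = -1) {
  available <- parallel::detectCores()
  # detectCores() can return NA on some platforms; fall back to one core.
  if (is.na(available)) available <- 1L
  # Non-positive values are treated as "use everything available".
  if (cores < 1) return(available)
  # Assumption for this sketch: never request more cores than detected.
  min(as.integer(cores), available)
}

resolve_cores(-1)  # all detected cores
resolve_cores(1)   # a single core
```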

n.top.genes

number of genes with the highest loadings to select in each gene expression program (GEP). Default: 100.

local.density.cutoff

the threshold on the average distance to the K nearest neighbors, used for local density filtering. A value of 2.0 or above means that nothing will be filtered out. Default: 0.5.

local.neighborhood.size

fraction of replicates to consider as nearest neighbors for local density filtering. For example, if you run 100 replicates and set this to 0.3, the 30 nearest neighbors will be used for outlier detection. Default: 0.3.
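To make the interaction between `local.neighborhood.size` and `local.density.cutoff` concrete, here is a small sketch of local density filtering on toy data. It assumes Euclidean distances between unit-length vectors, which would be consistent with 2.0 acting as an upper bound but is not confirmed by this page. The toy `spectra` matrix and every variable name below are illustrative only.

```r
set.seed(1)
n.replicates <- 100
local.neighborhood.size <- 0.3
local.density.cutoff <- 0.5

# Number of neighbors: 0.3 * 100 replicates = 30.
n.neighbors <- as.integer(round(local.neighborhood.size * n.replicates))

# Toy data: 95 rows near one direction plus 5 rows near another.
base1 <- c(1, 0, 0, 0, 0)
base2 <- c(0, 0, 0, 0, 1)
spectra <- rbind(
  matrix(rep(base1, 95), ncol = 5, byrow = TRUE),
  matrix(rep(base2, 5), ncol = 5, byrow = TRUE)
) + matrix(abs(rnorm(100 * 5, sd = 0.02)), ncol = 5)

# Scale each row to unit length, so pairwise distances cannot exceed 2.
spectra <- spectra / sqrt(rowSums(spectra^2))

# Average distance from each row to its n.neighbors nearest neighbors.
d <- as.matrix(dist(spectra))
local.density <- apply(d, 1, function(row) {
  mean(sort(row)[2:(n.neighbors + 1)])  # position 1 is the self-distance (0)
})

# Rows whose neighborhood is too sparse are treated as outliers.
keep <- local.density < local.density.cutoff
sum(keep)
```

In this toy setup the 5 isolated rows have most of their 30 nearest neighbors far away, so they fall above the cutoff and are dropped, while a cutoff of 2.0 would keep every row.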

show.clustering

whether to output the clustergram image. Default: FALSE.
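A hedged example call, using only the arguments documented above, might look like the following. The file and directory names and the choice of `K = 10` are placeholders, and the call is guarded so that the snippet does nothing unless `RunCNMF` is available and the input file exists.

```r
# Placeholder inputs; replace with real paths and settings before running.
args <- list(
  counts.fn = "counts.tsv",
  K = 10,
  out.path = "cnmf_out",
  run.name = "example_run",
  n.iter = 100,
  n.var.genes = 2000,
  seed = 1024,
  cores = 4,
  n.top.genes = 100,
  local.density.cutoff = 0.5,
  local.neighborhood.size = 0.3,
  show.clustering = TRUE
)

# Only attempt the call when the package providing RunCNMF is attached
# and the placeholder input file actually exists.
if (exists("RunCNMF") && file.exists(args$counts.fn)) {
  result <- do.call(RunCNMF, args)
}
```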