Find optimal components or topics for cNMF

FindOptimalK(
  counts.fn,
  components,
  tpm.fn = NULL,
  out.path = NULL,
  run.name = NULL,
  n.iter = 100,
  n.var.genes = 2000,
  genes.fn = NULL,
  seed = 1024,
  cores = -1
)

Arguments

counts.fn

path to the cell x gene counts file. This is expected to be a tab-delimited text file or an AnnData object saved in the h5ad format.
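As a minimal sketch of the tab-delimited layout, the snippet below writes a toy cell x gene matrix with base R and reads it back. The matrix values, the cell and gene names, and the file name `counts.tsv` are made up for illustration, and the header layout (an empty first field above the cell names) is an assumption about what the reader accepts.

```r
# Toy cell x gene count matrix: 3 cells (rows) by 4 genes (columns).
counts <- matrix(
  c(0, 3, 1, 0,
    2, 0, 0, 5,
    1, 1, 4, 0),
  nrow = 3, byrow = TRUE,
  dimnames = list(paste0("cell", 1:3), paste0("gene", 1:4))
)

# Write as tab-delimited text; col.names = NA adds an empty header field
# above the row names so the header and data rows have the same width.
counts.fn <- file.path(tempdir(), "counts.tsv")
write.table(counts, file = counts.fn, sep = "\t", quote = FALSE, col.names = NA)

# Read the file back to confirm that cells are rows and genes are columns.
check <- as.matrix(read.delim(counts.fn, row.names = 1, check.names = FALSE))
dim(check)
```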

components

vector of K values (numbers of components) to test with cNMF.

tpm.fn

path to a TPM file. If provided, TPM data are loaded from this file; otherwise they are computed from the counts file. Default: NULL

out.path

the output directory into which all results will be placed. Default: the current working directory.

run.name

name of the run. A subdirectory out.path/run.name will be created and all output files will have run.name as their prefix. Default: current timestamp.

n.iter

number of NMF iterations to run for each K. Default: 100.

n.var.genes

(optional) the number of highest variance genes that will be used for running the factorization. Default: 2000

genes.fn

(optional) path to a file listing the over-dispersed genes to be used for the factorization steps, one gene per line. If not provided, over-dispersed genes are selected automatically and the number of genes to use can be set with the `n.var.genes` parameter. Default: NULL
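A gene list in the one-gene-per-line form described above can be written with base R. This is only a sketch: the gene names and the file name `genes.txt` are placeholders, and the names are assumed to match the gene columns of the counts file.

```r
# Placeholder gene names; in practice these should match the gene
# identifiers used as columns in the counts file.
genes <- c("GeneA", "GeneB", "GeneC")

# writeLines() writes one element per line, which gives one gene per line.
genes.fn <- file.path(tempdir(), "genes.txt")
writeLines(genes, genes.fn)

# Inspect the result.
readLines(genes.fn)
```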

seed

the master seed that will be used to generate the individual seed for each NMF replicate. Default: 1024

cores

specifies how many cores can be used in parallel. Default: -1, which uses all available cores as detected by `parallel::detectCores()`.
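Examples

The call below is a sketch of how the arguments fit together, not a tested recipe. It assumes the package that provides `FindOptimalK` is already loaded; the file name `counts.tsv`, the output directory `cnmf_out`, the run name and the range of K values are hypothetical. The call is guarded so the snippet does nothing when the function or the input file is missing.

```r
# Candidate numbers of components (K) to test; the range is illustrative.
components <- 5:10

# Leave one core free; detectCores() can return NA, so fall back to 1.
n.cores <- max(1L, parallel::detectCores() - 1L, na.rm = TRUE)

# Hypothetical input path; replace with a real cell x gene counts file.
counts.fn <- "counts.tsv"

# Run only if FindOptimalK is available and the input file exists.
if (exists("FindOptimalK") && file.exists(counts.fn)) {
  FindOptimalK(
    counts.fn   = counts.fn,
    components  = components,
    out.path    = "cnmf_out",    # hypothetical output directory
    run.name    = "example_run", # hypothetical run name
    n.iter      = 100,
    n.var.genes = 2000,
    seed        = 1024,
    cores       = n.cores
  )
}
```

Fixing `seed` and `run.name` explicitly, rather than relying on the timestamp default, makes it easier to find and rerun the same analysis later.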