A wrapper function for addPerCellQC that calculates general quality control metrics for each cell in the count matrix.

runPerCellQC(
  inSCE,
  useAssay = "counts",
  collectionName = NULL,
  geneSetList = NULL,
  geneSetListLocation = "rownames",
  geneSetCollection = NULL,
  percent_top = c(50, 100, 200, 500),
  use_altexps = FALSE,
  flatten = TRUE,
  detectionLimit = 0,
  BPPARAM = BiocParallel::SerialParam()
)

Arguments

inSCE

Input SingleCellExperiment object.

useAssay

A string specifying which assay in the SCE to use. Default "counts".

collectionName

Character. Name of a GeneSetCollection obtained by using one of the importGeneSet* functions. Default NULL.

geneSetList

List of gene sets to be quantified. The genes in the assays will be matched to the genes in the list based on geneSetListLocation. Default NULL.

geneSetListLocation

Character or numeric vector. If set to 'rownames', then the genes in 'geneSetList' will be looked up in rownames(inSCE). If another character string is supplied, then genes will be looked up in the column of rowData(inSCE) with that name. A character vector with the same length as geneSetList can be supplied if the IDs for different gene sets are found in different places, including a mixture of 'rownames' and columns of rowData(inSCE). An integer or integer vector can be supplied to denote the column index in rowData(inSCE). Default 'rownames'.

geneSetCollection

Object of class GeneSetCollection from the package GSEABase. The location of the gene IDs in inSCE should be in the description slot of each gene set and should follow the same notation as geneSetListLocation. The function getGmt can be used to read in gene sets from a GMT file. If reading a GMT file, the second column for each gene set should be the description denoting the location of the gene IDs in inSCE. These gene sets will be included with those from geneSetList if both parameters are provided. Default NULL.

percent_top

An integer vector. Each element is treated as a number of top genes to compute the percentage of library size occupied by the most highly expressed genes in each cell.

use_altexps

Logical scalar indicating whether QC statistics should be computed for the alternative Experiments in inSCE. If TRUE, statistics are computed for all alternative Experiments. Alternatively, an integer or character vector specifying which alternative Experiments to use, or NULL, in which case alternative Experiments are not used. Default FALSE.

flatten

Logical scalar indicating whether the nested DataFrame-class in the output should be flattened.

detectionLimit

A numeric scalar specifying the lower detection limit for expression.

BPPARAM

A BiocParallelParam object specifying whether the QC calculations should be parallelized.
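To make percent_top and detectionLimit more concrete, the following base R sketch reproduces the two ideas on a toy matrix. It is only an illustration under stated assumptions: the matrix is made up, percentTop is a hypothetical helper that is not part of singleCellTK or scater, and the real implementation may differ in details such as tie handling.

```r
# Toy count matrix: 5 genes (rows) x 2 cells (columns). Values are made up.
counts <- matrix(
  c(10, 0, 5, 3, 2,   # cellA
     1, 1, 1, 1, 1),  # cellB
  nrow = 5,
  dimnames = list(paste0("gene", 1:5), c("cellA", "cellB"))
)

# Hypothetical helper (not a singleCellTK/scater function): percentage of a
# cell's total counts contributed by its n most highly expressed genes.
percentTop <- function(x, n) {
  top <- sort(x, decreasing = TRUE)[seq_len(min(n, length(x)))]
  100 * sum(top) / sum(x)
}

# Analogue of percent_top = 2: share of library size held by the top 2 genes.
pctTop2 <- apply(counts, 2, percentTop, n = 2)

# Analogue of detectionLimit = 0: genes with counts strictly above the limit.
detectionLimit <- 0
detected <- colSums(counts > detectionLimit)

print(pctTop2)
print(detected)
```

A cell dominated by a few genes (cellA) gets a high top-gene percentage, while a cell with evenly spread counts (cellB) gets a low one.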

Value

A SingleCellExperiment object with cell QC metrics added to the colData slot. If geneSetList or geneSetCollection is provided, then the rownames for each gene set will be saved in metadata(inSCE)$scater$addPerCellQC$geneSets.
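The way a gene set is resolved to rownames depends on geneSetListLocation. The base R sketch below mimics that lookup with toy stand-ins for rownames(inSCE) and rowData(inSCE). The identifiers, the annotation table, and the helper resolveGeneSet are all hypothetical and only approximate what the function might do internally.

```r
# Toy stand-ins for rownames(inSCE) and rowData(inSCE); not real data.
featureIds <- c("ENSG01", "ENSG02", "ENSG03")
rowAnnot <- data.frame(
  feature_name = c("MT-CO1", "ACTB", "MT-ND1"),
  row.names = featureIds
)

# Hypothetical helper: map the IDs in a gene set to row names, looking them up
# either in the row names themselves ("rownames") or in an annotation column
# given by name or by integer index.
resolveGeneSet <- function(genes, location) {
  lookup <- if (identical(location, "rownames")) {
    featureIds
  } else {
    rowAnnot[[location]]
  }
  featureIds[lookup %in% genes]
}

bySymbol <- resolveGeneSet(c("MT-CO1", "MT-ND1"), "feature_name")
byIndex <- resolveGeneSet(c("MT-CO1", "MT-ND1"), 1L)
byRowname <- resolveGeneSet("ENSG02", "rownames")

print(bySymbol)
print(byRowname)
```

Whichever location is used, the result is expressed as row names, which matches the description of what is stored in the metadata.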

Examples

data(scExample, package = "singleCellTK")
mito.ix = grep("^MT-", rowData(sce)$feature_name)
geneSet <- list("Mito"=rownames(sce)[mito.ix])
sce <- runPerCellQC(sce, geneSetList = geneSet)
#> Thu Oct 28 15:47:20 2021 ... Running 'perCellQCMetrics'