MNN (mutual nearest neighbors) correction is designed for batch correction of single-cell RNA-seq data where the batches are partially confounded with biological conditions of interest. It works by identifying pairs of mutual nearest neighbors (MNNs) between batches in the high-dimensional log-expression space. For each MNN pair, a pairwise correction vector is computed; the correction vector for each cell is then obtained by applying a Gaussian smoothing kernel with bandwidth `sigma` to these pairwise vectors.
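The idea described above can be illustrated with a small base R sketch. It is a toy under stated assumptions, not the code that runMNNCorrect() runs: the data are simulated, the helper names (`crossDist2`, `findMNN`, `smoothCorrection`) are hypothetical, the matrices are laid out as cells by genes for convenience, and the exact kernel form `exp(-d^2 / (2 * sigma^2))` is an assumption.

```r
# Toy sketch of MNN-based correction in base R (not the package's implementation).
set.seed(1)

# Two simulated batches: rows are cells, columns are genes.
# batch2 is drawn from the same population plus a constant batch effect.
nGenes <- 5
batch1 <- matrix(rnorm(30 * nGenes), ncol = nGenes)
shift  <- c(2, -1, 0.5, 0, 1)
batch2 <- sweep(matrix(rnorm(40 * nGenes), ncol = nGenes), 2, shift, "+")

# Squared Euclidean distance between every row of `a` and every row of `b`.
crossDist2 <- function(a, b) {
  outer(rowSums(a^2), rowSums(b^2), "+") - 2 * a %*% t(b)
}

# Cell i of `a` and cell j of `b` form an MNN pair when j is among the k
# nearest `b` cells to i AND i is among the k nearest `a` cells to j.
findMNN <- function(a, b, k) {
  d2 <- crossDist2(a, b)
  nnAB <- t(apply(d2, 1, function(r) rank(r, ties.method = "first") <= k))
  nnBA <- apply(d2, 2, function(cc) rank(cc, ties.method = "first") <= k)
  which(nnAB & nnBA, arr.ind = TRUE)  # columns "row" (cell in a), "col" (cell in b)
}

# Gaussian-kernel weighted average of the pairwise vectors for each cell.
# Subtracting the row minimum before exponentiating avoids underflow to zero.
smoothCorrection <- function(cells, anchors, vecs, sigma) {
  d2 <- crossDist2(cells, anchors)
  d2 <- d2 - apply(d2, 1, min)
  w <- exp(-d2 / (2 * sigma^2))
  w <- w / rowSums(w)
  w %*% vecs
}

pairs <- findMNN(batch1, batch2, k = 5)

# Pairwise correction vectors: difference between the paired profiles.
pairVecs <- batch1[pairs[, "row"], , drop = FALSE] -
  batch2[pairs[, "col"], , drop = FALSE]

# Smooth the pairwise vectors over all batch2 cells and apply them.
# sigma = 1 is used only because this toy data is not cosine-normalized.
anchors <- batch2[pairs[, "col"], , drop = FALSE]
corrVecs <- smoothCorrection(batch2, anchors, pairVecs, sigma = 1)
batch2Corrected <- batch2 + corrVecs
```

A smaller `sigma` makes each cell follow its closest MNN pairs more tightly, while a larger one averages the correction over more pairs.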

runMNNCorrect(
  inSCE,
  useAssay = "logcounts",
  batch = "batch",
  assayName = "MNN",
  k = 20L,
  propK = NULL,
  sigma = 0.1,
  cosNormIn = TRUE,
  cosNormOut = TRUE,
  varAdj = TRUE,
  BPPARAM = BiocParallel::SerialParam()
)

Arguments

inSCE

Input SingleCellExperiment object

useAssay

A single character indicating the name of the assay requiring batch correction. Default "logcounts".

batch

A single character indicating a field in colData that annotates the batches of each cell; or a vector/factor with the same length as the number of cells. Default "batch".

assayName

A single character. The name for the corrected assay, which will be saved to assay(inSCE, assayName). Default "MNN".

k

An integer scalar specifying the number of nearest neighbors to consider when identifying MNNs. See "See Also". Default 20.

propK

A numeric scalar in (0, 1) specifying the proportion of cells in each dataset to use for mutual nearest neighbor searching. See "See Also". Default NULL.

sigma

A numeric scalar specifying the bandwidth of the Gaussian smoothing kernel used to compute the correction vector for each cell. See "See Also". Default 0.1.

cosNormIn

A logical scalar indicating whether cosine normalization should be performed on the input data prior to calculating distances between cells. See "See Also". Default TRUE.

cosNormOut

A logical scalar indicating whether cosine normalization should be performed prior to computing corrected expression values. See "See Also". Default TRUE.

varAdj

A logical scalar indicating whether variance adjustment should be performed on the correction vectors. See "See Also". Default TRUE.

BPPARAM

A BiocParallelParam object specifying whether the PCA and nearest-neighbor searches should be parallelized. Default BiocParallel::SerialParam().
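To make the cosNormIn and cosNormOut arguments more concrete, here is a minimal base R sketch of cosine normalization, assuming it means scaling each cell's expression vector to unit Euclidean length. The helper `cosineNorm` and the toy matrix are illustrative only and are not taken from the package.

```r
# Cosine-normalize each cell (column) of a genes-by-cells matrix by
# dividing the column by its Euclidean (L2) norm.
cosineNorm <- function(x) {
  norms <- sqrt(colSums(x^2))
  norms[norms == 0] <- 1  # leave all-zero cells unchanged instead of dividing by zero
  sweep(x, 2, norms, "/")
}

# Toy log-expression matrix: 3 genes x 2 cells.
# The second cell has the same profile as the first at twice the scale.
logExpr <- matrix(c(1, 2, 2,
                    2, 4, 4), nrow = 3)
normed <- cosineNorm(logExpr)

# After normalization both cells have unit length and identical profiles,
# so the scaling difference no longer contributes to their distance.
colSums(normed^2)
```

Under this assumption, distances between cells after normalization depend on the direction of the expression profiles rather than their overall magnitude.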

Value

The input SingleCellExperiment object with assay(inSCE, assayName) updated.

References

Haghverdi L, Lun ATL, et al., 2018

See also

Examples

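library(singleCellTK)
data('sceBatches', package = 'singleCellTK')
# Create the "logcounts" assay that runMNNCorrect() reads by default
logcounts(sceBatches) <- log1p(counts(sceBatches))
# The corrected values are stored in assay(sceCorr, "MNN")
sceCorr <- runMNNCorrect(sceBatches)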