SingleCellTK (SCTK) is equipped with the generic heatmap plotting functionality, which helps visualizing the feature expression matrices, with flexible methods of annotation and grouping on either cells or features. Our base function is a wrapper of ComplexHeatmap
[1].
To view detailed instructions on how to use heatmap, please select ‘Interactive Analysis’ for using heatmap in shiny application or ‘Console Analysis’ for using it on R console from the tabs below:
Entry of the Panel
The UI for plotting a generic heatmap is constructed with 5 sections: data input, data subsetting, annotation adding, parameter setting, and finally, plotting.
Basic Workflow
The data used for plotting a heatmap can only be a feature expression matrix, technically called assay
in R language. This will be set at the selection input “Assay to Plot”.
There is a second selection called “Import from analysis” which is used for the fast setting of plotting a differential expression (DE) analysis or marker detection specific style. The detail of this feature will be introduced later.
Similar to the condition definition in the DE analysis panel, here SCTK also adopts a data table to do a filtering and select the cells and features, in order to maintain the maximum flexibility. By default, all the cell annotation available will be displayed as shown in the screenshot above, as well as the feature annotation.
Generally, to make a selection, users will first find the annotations which can define the cells or features of interests, apply the filter, and then press “Add all filtered” to finish the selection. When there is no applicable annotation, we also allow manually entering a list of cell/feature identifiers to specify the contents to be plotted.
To make the operation cleaner, user can choose to only display the annotation classes of interests at the multi-selection input “Columns to display”. The outcome will look like the screenshot above. When the annotation of interests is categorical (in terms of R language, a character
or logical
vector or a factor
), filters can be applied by selecting the categories of interest, at the boxes under the variable names.
The screenshots above demonstrates how to filer by limiting a range to a continuous data (in R, a numeric
or integer
vector).
Finally, if there is no applicable annotation to use, SCTK provides a third tab that allows direct pasting a list of unique cell/feature identifiers. As shown above.
The identifiers pasted to the text box should be able to uniquely identify the cells or features, and they have to be able to be found in the background information (i.e. the background SingleCellExperiment
(SCE) object). However, they do not necessarily have to be the default identifiers. That is to say, identifiers that can be found in existing annotation are also acceptable. To accomplish this, users only need to select what class of annotation they intend to use at the selection input “Match input cell/feature identifiers by”. Note that the option “Row Names” indicates the default identifier when no annotation is available.
For example, usually in a “cellRanger” style data, the default feature identifiers would usually be “Ensembl IDs,” while “gene symbols” will be embedded in the feature annotation. When users only have a list of genes symbol, they can directly use this list, and select the annotation for “gene symbol,” instead of going for some other external tools to transfer the “symbols” into “IDs.”
In this sub-panel, users can attach the annotations to be shown in the heatmap legend, and changes in colors are allowed. As mentioned previously, there are two types of values - categorical and continuous.
For categorical information, the color setting will look like the screenshot above. Each category will be displayed with a color input. Users can click on the color box to change the color for a specific category if needed.
Note that when categorical annotation is wanted but too many categories are detected, SCTK will not display any color selection but set it with default color scheme, in order to avoid a UI “overflow.”
While for a continuous value, users will still be asked if the value is categorical or continuous. The reason is that, technically, sometimes there are annotations generated in a continuous numeric data type but indicating categorical information, such as clusters labeled with integers. If users select “Categorical,” the similar color selections will be displayed as in the previous screenshot (Refer to the previous folded paragraph). If “Continuous” is selected, a different color selection UI will be shown, as the screenshot above. Here users will be asked to set two colors for a color gradient.
This is the last sub-panel for the generic heatmap setting before users can generate a plot.
After going over all the settings shown in the screenshots of the previous sections, users can confirm to generate a heatmap by clicking on the button “Plot Heatmap”. A heatmap in the screenshot above will be generated at the bottom region of the UI.
Additional Usage
The documentation above is for the general usage of SCTK Heatmap page. While heatmap is an important visualization method for many types of analysis, SCTK allows importing some types analysis results. Currently, DE analysis and marker detection results are supported. Here, we will present an example of importing the DE analysis result.
First, users need to go back to the top of the Heatmap panel, and find “Import from analysis” selection input. On selecting "Differential Expression"
, the session will scan the background data and show a list of the analysis information for users to choose from. After selecting the analysis of interests, click “Import”.
The import functionality will automatically fill most of the options afterwards, including making up some temporary DE specific annotations and subset the data accordingly.
Note that the features automatically selected are all of the ones stored to the result. In other words, this is affected by the parameters used when running the DE analysis. However, sometimes users might not want to plot all of them but apply some further filters instead. For example, using features with the absolute value of Log2FC greater than 1
usually produces cleaner figure, than setting that bound to 0.25
. To apply this filter in the UI, users need to click on the input box below the column title in the table. If the column is for categorical information, then just click on the categories of interests; else if it is numeric, users can either manually drag the sliders or type a “formula” into the input box. The formula follows the rule: {low}...{high}
, where one of the two bounds can be omitted for a one-end range. For example if users want to apply a filter described above for Log2FC values, they need to first click “Clear selection”, type 1...
, click “Add all filtered”, then type ...-1
, and click “Add all filtered” again.
Other heatmap settings will also be automatically filled for a DE specific heatmap.
To present the usage of plotSCEHeatmap()
, we would like to use a small example provided with SCTK.
library(singleCellTK)
data("scExample") # This imports SCE object "sce"
sce
## class: SingleCellExperiment
## dim: 200 390
## metadata(0):
## assays(1): counts
## rownames(200): ENSG00000251562 ENSG00000205542 ... ENSG00000204472
## ENSG00000133872
## rowData names(2): feature_ID feature_name
## colnames(390): pbmc_4k_GACCAATTCCTAGAAC-1 pbmc_4k_AAATGCCGTTTCGCTC-1
## ... pbmc_4k_CTGAAGTTCCACTCCA-1 pbmc_4k_CATCGAACAGCCAGAA-1
## colData names(4): cell_barcode column_name sample type
## reducedDimNames(0):
## altExpNames(0):
“Raw” plotting
The minimum setting for plotSCEHeatmap()
is the input SCE object and the data matrix to plot (default "logcounts"
). In this way, all cells and features will be presented while no annotation or legend (except the main color scheme) will be shown.
plotSCEHeatmap(sce, useAssay = "counts")
Subsetting
SCTK allows relatively flexible approaches to select the cells/features to plot.
The basic way to subset the heatmap is to directly use an index vector that can subset the input SCE object to featureIndex
and cellIndex
, including numeric
, and logical
vectors, which are widely used, and character
vector containing the row/col names. Of course, user can directly use a subsetted SCE object as input.
# Make up random downsampling numeric vector
featureSubset <- sample(nrow(sce), 50)
cellSubset <- sample(ncol(sce), 50)
plotSCEHeatmap(inSCE = sce, useAssay = "counts", featureIndex = featureSubset, cellIndex = cellSubset)
rowData
/colData
In a more complex situation, where users might only have a set of identifiers which are not inside the row/col names (i.e. unable to directly subset the SCE object), we provide another approach. The subset, in this situation, can be accessed via specifying a vector that contains the identifiers users have, to featureIndexBy
or cellIndexBy
. This specification allows directly giving one column name of rowData
or colData
.
subsetFeatureName <- sample(rowData(sce)$feature_name, 50)
subsetCellBarcode <- sample(sce$cell_barcode, 50)
plotSCEHeatmap(inSCE = sce, useAssay = "counts", featureIndex = subsetFeatureName, featureIndexBy = "feature_name", cellIndex = subsetCellBarcode, cellIndexBy = "cell_barcode")
Adding Annotations
As introduced before, we allow directly using column names of rowData
or colData
to attach color bar annotations. To make use of this functionality, pass a character
vector to rowDataName
or colDataName
.
# Make up arbitrary annotation,
rowRandLabel <- c(rep('aa', 100), rep('bb', 100))
rowData(sce)$randLabel <- rowRandLabel
colRandLabel <- c(rep('cc', 195), rep('dd', 195))
colData(sce)$randLabel <- colRandLabel
plotSCEHeatmap(inSCE = sce, useAssay = "counts", featureIndex = featureSubset, cellIndex = cellSubset, rowDataName = "randLabel", colDataName = c("type", "randLabel"))
Fully customized annotation is also supported, though it can be complexed for users. For the labeling, it is more recommanded to insert the information into rowData
or colData
and then make use. For coloring, information should be passed to featureAnnotationColor
or cellAnnotationColor
. The argument must be a list
object with names matching the annotation classes (such as "randLabel"
and "type"
); each inner object under a name must be a named vector, with colors as the values and existing categories as the names. The working instance looks like this:
## [1] "For row annotation"
## $randLabel
## aa bb
## "#FF4D4D" "#4DFFFF"
## [1] "For column annotation"
## $sample
## pbmc_4k
## "#FF4D4D"
##
## $type
## Singlet Doublet EmptyDroplet
## "#4DFFFF" "#FFC04D" "#4D4DFF"
##
## $randLabel
## cc dd
## "#FFFF4D" "#BA4DFF"
Others
1. Grouping/Splitting In some cases, it might be better to do a “semi-heatmap” (i.e. split the rows/columns first and cluster them within each group) to visualize some expression pattern, such as evaluating the differential expression. For this need, use rowSplitBy
or colSplitBy
, and the arguments must be a character
vector that is a subset of the specified annotation.
plotSCEHeatmap(inSCE = sce, useAssay = "counts", featureIndex = featureSubset, cellIndex = cellSubset, rowDataName = "randLabel", colDataName = c("type", "randLabel"), rowSplitBy = "randLabel", colSplitBy = "type")
2. Cell/Feature Labeling Text labels of features or cells can be added via rowLabel
or colLabel
. Use TRUE
or FALSE
to specify whether to show the rownames
or colnames
of the subsetted SCE object. Additionally, giving a single string of a column name of rowData
or colData
can enable the labeling of the annotation. Furthermore, users can directly throw a character vector to the parameter, with the same length of either the full SCE object or the subsetted.
3. Dendrograms The dendrograms for features or cells can be removed by passing FALSE
to rowDend
or colDend
.
4. Row/Column titles The row title ("Genes"
) and column title ("Cells"
) can be changed or removed by passing a string or NULL
to rowTitle
or colTitle
, respectively.
plotSCEHeatmap(inSCE = sce, useAssay = "counts", featureIndex = featureSubset, cellIndex = cellSubset, rowLabel = "feature_name", colLabel = as.character(1:50), colDend = FALSE, rowTitle = "Downsampled features")
There are still some parameters not mentioned here, but they are not frequently used. Please refer to ?plotSCEHeatmap
as well as ?ComplexHeatmap::Heatmap
.