Hi,
I'm using Seurat v3 (dev version) and having issues with plotting a heatmap of my genes of interest. I am trying to input gene names that I have stored in a dataframe. There are no repeats of the gene symbols, and the DotPlot function works to plot the expression of all the genes of interest.
DoHeatmap(object = data, features=receptor_symbol$receptor_symbol) + scale_fill_gradientn(colors = c("blue", "white", "red"))

As you can tell, only a few genes from my list of ~300 genes are being plotted. Not sure what NA is.
Here are the warning messages that I get in my console:
Scale for 'fill' is already present. Adding another scale for 'fill', which will replace the existing scale.
Warning message:
In DoHeatmap(object = Sca1pos, features = receptor_symbol$receptor_symbol) :
"The following features were omitted as they were not found in the scale.data slot for the RNA assay: Apcdd1, Fzd9, Kel, Cd109, Art1, Gp1ba, Ccr4, Bdkrb2, Fzd10, Cx3cr1, Epha8, Musk, Alk, Dcc, Ntrk1, Drd4, Il7r, Sell, Gpr1, Cmklr1, Sele, Tlr1, Fgfr4, Acvr1c, S1pr5, P2ry12, Oprm1, Mtnr1a, Drd2, Cxcr3, Chrm1, Adra2b, Grm5, Grm1, Ephb1, Epha6, Epha3, Klrg1, Ptch2, Gpc5, Agtr2, Gpr162, Gp9, Cd19, Marco, Cd3g, Cd3d, Ptgdr2, Cxcr6, Erbb4, Hfe2, Itgb8, Itgam, Il2ra, Lrp8, Chrna4, Tlr9, Grin2d, Ghsr, Ghrhr, Mc5r, Mc4r, Mc3r, Ptgir, Ptgdr, Lhcgr, Glp1r, Crhr1, Avpr2, Fpr1, Selp"
The warning message returns genes that are not found in my dataset, which is fine. However, I am still missing a large portion of the genes that I'm trying to plot in the heatmap. Is there an issue with the DoHeatmap function in v3? Just wondering if there will be a solution when v3 is officially released next week. Thanks!
The genes need to be present in scale.data in order to be plotted in the heatmap.
What happens if you run:
sum(receptor_symbol$receptor_symbol %in% rownames(GetAssayData(data, slot = 'scale.data')))
You can scale all genes by doing:
data <- ScaleData(object = data, features = rownames(data))
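To see which features are missing rather than only how many are present, the same check can be written with intersect and setdiff. This is a minimal base R sketch: the small matrix scaled is a made-up stand-in for what GetAssayData(object, slot = 'scale.data') would return, and the feature list is illustrative only.

```r
# Stand-in for the scale.data matrix (genes x cells); values are made up.
scaled <- matrix(
  c(0.5, -1.2, 0.7, 1.1, -0.3, -0.8),
  nrow = 3,
  dimnames = list(c("Cd19", "Itgam", "Sell"), c("cell1", "cell2"))
)

# Hypothetical features of interest; in this thread it would be
# as.character(receptor_symbol$receptor_symbol).
features <- c("Cd19", "Sell", "Fzd9")

# Features that have a row in the scaled matrix, and those that do not.
present <- intersect(features, rownames(scaled))
absent <- setdiff(features, rownames(scaled))

print(present)
print(absent)
```

If the absent list matches the genes named in the DoHeatmap warning, scaling is probably not what is hiding the remaining genes.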
When I run the first line of suggested code, I get:
Error in UseMethod(generic = "GetAssayData", object = object) :
no applicable method for 'GetAssayData' applied to an object of class "function"
However, I already scaled my data earlier on during my analysis. I'm not sure what the issue would be?
data <- ScaleData(object = data, features = all.genes, vars.to.regress = c("nCount_RNA", "percent.mt"))
What is all.genes? Can you try using rownames(object = data) instead? That will guarantee it scales all the genes in your dataset.
If your object isn't called data, replace data with whatever your object is called. You're getting that error because there is no Seurat object named data in your session, so R resolves data to the function data() from the utils package.
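One way to confirm this is to inspect the class of whatever a name resolves to before passing it to a Seurat accessor. This is a minimal base R sketch; check_is_seurat is a hypothetical helper written for illustration and is not part of Seurat.

```r
# With no user-created object called `data`, the name resolves to utils::data.
resolved_class <- class(utils::data)
print(resolved_class)

# Hypothetical guard: stop early with a readable message if the input
# is not a Seurat object.
check_is_seurat <- function(x) {
  if (!inherits(x, "Seurat")) {
    stop("expected a Seurat object, got class: ",
         paste(class(x), collapse = ", "))
  }
  invisible(TRUE)
}

# Calling the guard on the utils function fails; capture the message.
guard_result <- tryCatch(
  check_is_seurat(utils::data),
  error = function(e) conditionMessage(e)
)
print(guard_result)
```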
Thanks for the suggestion. I scaled my data again using rownames(object = Sobject) but the figure that is generated from DoHeatmap remains the same and is still missing many of the genes from my gene list.
When I run your other suggested line of code, I get an output of 236:
sum(receptor_symbol$receptor_symbol %in% rownames(GetAssayData(Sobject, slot = 'scale.data')))
236
@irwon I think I know what's going on. You have scaled your data twice, which you don't need to do; scale your data once. Each call to ScaleData replaces the contents of scale.data, and when features is not specified it scales only your variable genes. This is a problem because your variable genes and the differentially expressed genes used for your heatmap may not be the same.
This is what I run (no previous scaling, and no scaling afterwards):
sample.data <- ScaleData(object = sample.data, vars.to.regress = 'percent.mt', features = rownames(sample.data), block.size = 2000, do.par = TRUE, num.cores = 8)
You can set do.par to FALSE and change num.cores to 1 if it gives you an error (these are just parallel computing parameters).
Somehow the following doesn't work for me (it only scales the 2000 genes selected as variable genes):
all.genes <- rownames(x = sample.data)
pbmc <- ScaleData(object = sample.data, features = all.genes)
Hope that helps you!!!
@timoast Why does scaling default to the variable genes only, and does scaling twice "over-scale" the data? For instance, suppose you first scale all the data and then scale again to regress out cell cycle or other metadata variables of interest. Or should you instead save the cell cycle scores, erase scale.data, and start again? Version 3 needs a lot of polishing.
Thanks for your feedback. We discuss the reasons for scaling only the variable genes by default here: https://satijalab.org/seurat/v3.0/pbmc3k_tutorial.html. You are welcome to scale all genes; it's up to the user.
I just came across this issue of having very strange heatmaps. It turns out that the problem for me was that the vector I was passing as the features argument to DoHeatmap was of type 'factor' instead of 'character'.
I had to convert using as.character() and then I got all of the genes I was looking for and no bizarre NA values. This is an example:
DoHeatmap(fCite.small, assay = "RNA", group.by = group, features = as.character(top100$gene)) + NoLegend()
So @irwon if you change features=receptor_symbol$receptor_symbol to features=as.character(receptor_symbol$receptor_symbol) you should get a more reasonable heatmap.
FWIW I did use the link posted by @satijalab to make sure that I was scaling with all genes instead of only the most variable genes. That was helpful, but it doesn't solve the plotting issue completely, only partially.
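For anyone wondering why a factor could select the wrong rows: in base R, indexing with a factor uses its integer codes rather than its labels, and data.frame() converted strings to factors by default before R 4.0.0. Whether DoHeatmap hits exactly this path internally is an assumption and is not confirmed in this thread; the sketch below only shows the base R behaviour on a made-up matrix.

```r
# Made-up expression matrix (genes x cells) for illustration only.
expr <- matrix(
  1:6,
  nrow = 3,
  dimnames = list(c("Cd19", "Itgam", "Sell"), c("cell1", "cell2"))
)

# Levels are sorted alphabetically: Cd19 = 1, Sell = 2,
# so this factor carries the integer codes 2, 1.
genes_factor <- factor(c("Sell", "Cd19"))

# Indexing by the factor uses the codes, i.e. rows 2 and 1 of expr.
rows_from_factor <- rownames(expr[genes_factor, , drop = FALSE])

# Converting to character indexes by row name, as intended.
rows_from_character <- rownames(expr[as.character(genes_factor), , drop = FALSE])

print(rows_from_factor)
print(rows_from_character)
```

Here the factor version returns Itgam and Cd19 instead of Sell and Cd19, which is why wrapping the gene column in as.character() is a cheap safeguard.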