I am analyzing a single-cell RNA-seq dataset in which one of the cell types of interest has notably lower RNA content than the other cell types, and this difference is of biological importance. To be able to (approximately) study differences in gene expression while accounting for this, we were hoping to normalize our data to our ERCC spike-ins.
As an approach, I am trying to use scran, which does provide size factors based on ERCCs. I want to be able to take this normalized matrix and use it as input to Seurat. Suppose I have this normalized matrix norm. Is it sufficient for me to do pbmc@data@x = as.numeric(log(norm + 1))? Or am I misunderstanding how the normalized data is stored in Seurat?
Hi @skannan4,
You have to replace your object@data slot with the desired gene expression matrix as follows:
pbmc@data = log(x = norm + 1)
Two details worth considering:
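This overwrites the normalized data computed by Seurat. But if you want to keep it, you can always store it in object@misc before replacing the slot, as follows: pbmc@misc[["seurat_data"]] <- as.matrix(x = pbmc@data)
Make sure that the normalized data from scran is not log transformed before computing log values.
Best,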
Leon
Also, if the scran normalized data is log transformed, make sure that the values are in natural log, and not log2. Seurat assumes that the normalized data is log transformed using natural log (some functions in Seurat will convert the data using expm1 for some calculations).
Best,
Leon
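If the scran output has already been log2 transformed (an assumption here; check how the normalization was run), it can be rescaled to natural log by multiplying by log(2), since ln(x + 1) = log2(x + 1) * ln(2). A minimal base R sketch with made-up values; norm_counts and the other variable names are hypothetical:

```r
# Hypothetical normalized (non-log) expression values
norm_counts <- c(0, 1, 9, 99)

# What a log2-based pipeline would hand over
log2_values <- log2(norm_counts + 1)

# Rescale log2(x + 1) to natural log: ln(x + 1) = log2(x + 1) * ln(2)
ln_values <- log2_values * log(2)

# The rescaled values agree with log1p() applied to the non-log values
matches_log1p <- isTRUE(all.equal(ln_values, log1p(norm_counts)))

# expm1() only recovers the original values from the natural-log version
recovered_ln <- expm1(ln_values)
recovered_log2 <- expm1(log2_values)
```

Here recovered_ln equals norm_counts, while recovered_log2 does not, which is why log2 values should be converted before they are handed to Seurat.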
Hi,
I am writing to seek your help with using a TMM normalized matrix as input to create a Seurat object.
I get an error when I follow the example below to use the zinbwave function:
https://github.com/drisso/zinbwave/issues/17
se <- SummarizedExperiment(as.matrix(input)) # put in the colData() part of the object at least batch
zinb <- zinbwave(se[[email protected],],K=10, epsilon=1000)
Error in .local(Y, ...) :
The input matrix should contain only whole numbers.
Could you please help with this error?
Thanks
Sharvari
> Also, if the scran normalized data is log transformed, make sure that the values are in natural log, and not log2. Seurat assumes that the normalized data is log transformed using natural log (some functions in Seurat will convert the data using expm1 for some calculations).
> Best,
> Leon
Thank you for this information. I would like to know which functions in Seurat use expm1.
Thank you in advance!
Best,
Pernille
Hi @PernilleYR, a lookup of the function name reveals where it is used:
https://github.com/satijalab/seurat/search?q=expm1&type=Code
src/data_manipulation.cpp: FastCov(), FastExpMean(), FastLogVMR()
R/utilities.R: ExpVar(), ExpSD(), ExpMean(), AverageExpression(), AddSmoothedScore()
R/differential_expression.R: FindMarkers()
R/plotting.R: DotPlot()
So it seems to be used quite widely. Also be aware of indirect calls from functions that are not listed.
Best wishes
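To see why the log base matters for functions like these, here is a rough base R sketch of the kind of calculation involved: values are taken back to non-log space with expm1, averaged, and log transformed again. The helper exp_mean is hypothetical and only illustrates the idea; it is not copied from Seurat.

```r
# Hypothetical helper: average in non-log space, then return to log space
exp_mean <- function(x) log1p(mean(expm1(x)))

# Made-up normalized (non-log) values for one gene across four cells
norm_counts <- c(1, 3, 7, 15)

ln_values <- log1p(norm_counts)        # natural log, as Seurat expects
log2_values <- log2(norm_counts + 1)   # log2, as some pipelines produce

# Back-transform the averaged values to compare them with mean(norm_counts)
mean_from_ln <- expm1(exp_mean(ln_values))
mean_from_log2 <- expm1(exp_mean(log2_values))
```

With natural log input, mean_from_ln matches mean(norm_counts). With log2 input the result is inflated, because expm1 does not undo log2.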
Hi @skannan4, based on @leonfodoulian's snippets this works for me (_data_ is the Seurat object and _sce_ is the SingleCellExperiment object):
Normalize with scran
sce <- SingleCellExperiment(assays = list(counts = as.matrix(x = data@data))) # read data from Seurat
clusters = quickCluster(sce, min.size=100)
sce = computeSumFactors(sce, cluster=clusters)
sce = normalize(sce, return_log = FALSE) # without(!) log transform
Normalize with Seurat (backup elsewhere) and replace with scran normalization
data = NormalizeData(object = data, normalization.method = "LogNormalize", scale.factor = 10000)
data@misc[["seurat_norm_data"]] = as.matrix(x = data@data) # backup Seurat's norm data
data@data = log(x = assay(sce, "normcounts") + 1)
I am trying to do the same thing as @skannan4, but there seems to be a problem with classes. class(norm) gives
[1] "dgCMatrix"
attr(,"package")
[1] "Matrix"
but when I try to replace the original matrix with pbmc[["RNA"]]@data = log(x = norm + 1) I get an error stating
Error in (function (cl, name, valueClass) :
assignment of an object of class “dgeMatrix” is not valid for @‘data’ in an object of class “Assay”; is(value, "AnyMatrix") is not TRUE
which is confusing, because is(norm, "AnyMatrix") gives me TRUE. I did a quick Google search and apparently, BioC is built on S4 where the sparse matrix is dgeMatrix and Seurat is built on S3 where the sparse matrix is dgCMatrix. I tried converting the scran output matrix back and forth with
library(Matrix)
norm = as.matrix(norm)
norm = Matrix(norm, sparse=TRUE)
but no success.
Does anybody have an idea for a workaround? @tilofreiwald @andrewwbutler @sharvari14 @satijalab
Hi @marcmuellerETHZ, I did not experience this error myself but you could try two things:
M1 <- as(m, "dgCMatrix")
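One possible explanation, offered as an assumption rather than a confirmed diagnosis: dgeMatrix is the dense matrix class of the Matrix package, and norm + 1 turns every zero into a one, so the result of log(norm + 1) is no longer sparse even when norm itself is a dgCMatrix. A small sketch with a made-up matrix (norm here is a hypothetical stand-in for the scran output) shows the difference and two ways to end up with a dgCMatrix:

```r
library(Matrix)

# Small hypothetical sparse matrix standing in for the scran output
norm <- Matrix(c(0, 2, 0, 0, 5, 0), nrow = 2, sparse = TRUE)

# Adding 1 fills in every zero, so the result is a dense dgeMatrix
dense_log <- log(norm + 1)

# log1p(0) is 0, so the zeros are kept and the result stays a dgCMatrix
sparse_log <- log1p(norm)

# An already dense result can be converted back to a column-compressed sparse matrix
converted <- as(dense_log, "CsparseMatrix")
```

Either sparse_log or converted could then be tried in the assignment to the data slot. log1p(norm) avoids building the dense intermediate, which matters for large matrices.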