Hello,
I have a simple question.
Is it possible to get the variance explained by each PC (e.g. PC1, PC2, ...) in Seurat?
Or should I compute it manually?
Thank you.
You can easily get the sdev, and thus the variance explained, of the PCs from the Seurat object:
pca = SeuratObj@dr$pca
eigValues = (pca@sdev)^2 ## EigenValues
varExplained = eigValues / sum(eigValues)
Note that sum(varExplained) = 1 = Total Variance
Sorry if this message sounds blunt, but this answer is wrong.
One way to show the error is to compute only the first principal component with RunPCA(..., npcs = 1) and follow @sansense's instructions. According to those instructions, that single component would explain 100% of the variance, which it does not.
sum(eigValues) will only be the total variance of the data when pca@sdev includes all the components. If you extract the first 50 components in RunPCA(..., npcs = 50) then sum(eigValues) will contain only the variance captured by those first 50 components and not the total variance of the data.
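To make the point concrete, here is a minimal sketch in base R that uses prcomp on made-up random data instead of Seurat. The matrix `x` and the choice of keeping 3 components are assumptions for illustration only. Note that here the variables are in columns, whereas Seurat's scaled data has features in rows.

```r
# Toy data: 100 observations x 10 variables (illustrative only).
set.seed(1)
x <- matrix(rnorm(100 * 10), nrow = 100, ncol = 10)

# Run a full PCA, then pretend only the first 3 components were computed.
pca_full <- prcomp(x, center = TRUE, scale. = FALSE)
eig_all <- pca_full$sdev^2   # eigenvalues of all 10 components
eig_top <- eig_all[1:3]      # eigenvalues of the retained components

# Total variance of the data: sum of the per-variable variances.
total_variance <- sum(apply(x, 2, var))

# Dividing by the sum of the retained eigenvalues sums to 1 by construction.
wrong <- eig_top / sum(eig_top)

# Dividing by the total variance gives the actual fraction explained.
right <- eig_top / total_variance

sum(wrong)  # 1, regardless of how many components were kept
sum(right)  # less than 1, because 7 components were dropped
```

Only when all components are kept does sum(eig_all) match total_variance, which is why the two normalisations agree in that case and only in that case.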
If you want the total variance, you can get it from the scaled data:
# On Seurat 2:
SeuratObj <- RunPCA(SeuratObj, verbose = FALSE)
mat <- SeuratObj@scale.data
pca <- SeuratObj@dr$pca
# On Seurat 3:
SeuratObj <- RunPCA(SeuratObj, verbose = FALSE)
mat <- Seurat::GetAssayData(SeuratObj, assay = "RNA", slot = "scale.data")
pca <- SeuratObj[["pca"]]
# Get the total variance:
total_variance <- sum(matrixStats::rowVars(mat))
eigValues = (pca@sdev)^2 ## EigenValues
varExplained = eigValues / total_variance
Minor point: in Seurat 3 the slot is named stdev, so use pca@stdev instead of pca@sdev.
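If the same script has to run against both versions, one option is a small accessor that checks which slot exists. This is only a sketch: `get_pc_sdev` is a hypothetical helper name, not part of Seurat, and it assumes the reduction object is an S4 object exposing either a `stdev` or an `sdev` slot as described above.

```r
library(methods)

# Hypothetical helper (not a Seurat function): return the per-PC standard
# deviations from a reduction object, whichever slot name it uses.
get_pc_sdev <- function(dr) {
  if (.hasSlot(dr, "stdev")) {
    dr@stdev          # slot name reported for Seurat 3
  } else if (.hasSlot(dr, "sdev")) {
    dr@sdev           # slot name used in the Seurat 2 snippets above
  } else {
    stop("no 'stdev' or 'sdev' slot found on this object")
  }
}
```

With that in place, the eigenvalue line becomes eigValues = get_pc_sdev(pca)^2 on either version.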