Seurat: SCTransform: Error in PrepDR; told to run FindVariableFeatures after SCTransform?

Created on 13 Apr 2020 · 1 Comment · Source: satijalab/seurat

Hello.

I am having an issue similar to #2511, with RunPCA after SCTransform. I have 3 samples; I ran SCTransform on each one individually and then merged all 3 into a single Seurat object. I then ran RunPCA and got the following error:

Error in PrepDR(object = object, features = features, verbose = verbose) : Variable features haven't been set. Run FindVariableFeatures() or provide a vector of feature names.

The active assay is 'SCT', and I understand that I do not need to run FindVariableFeatures since it's part of SCTransform, so I'm unsure why I am getting this error. Below is the code I used:

# Import SCE objects that have been QC-ed using scater
sample_list <- list()
sample_list[[1]] <- contA.data.sce_filtered
sample_list[[2]] <- contB.data.sce_filtered
sample_list[[3]] <- cont1.data.sce_filtered
names(sample_list) <- c("Sample 1", "Sample 2", "Sample 3")

# Remove genes not found in ALL datasets
sample_common.genes <- intersect(rownames(sample_list[[1]]), rownames(sample_list[[2]]))
sample_common.genes <- intersect(sample_common.genes, rownames(sample_list[[3]]))
for (i in names(sample_list)){
  sample_list[[i]] <- (sample_list[[i]])[sample_common.genes,]
}

# Import datasets into Seurat
sample1.seurat <- as.Seurat(sample_list[[1]], counts = "counts", data = NULL, project="sample1")
sample2.seurat <- as.Seurat(sample_list[[2]], counts = "counts", data = NULL, project="sample2")
sample3.seurat <- as.Seurat(sample_list[[3]], counts = "counts", data = NULL, project="sample3")

# Run SCTransform on each Seurat object
sample1.seurat <- SCTransform(sample1.seurat, vars.to.regress = "subsets_Mito_percent")
sample2.seurat <- SCTransform(sample2.seurat, vars.to.regress = "subsets_Mito_percent")
sample3.seurat <- SCTransform(sample3.seurat, vars.to.regress = "subsets_Mito_percent")

# Merge all samples into one Seurat object
sample_seurat <- merge(x = sample1.seurat, y = c(sample2.seurat, sample3.seurat), add.cell.ids = c("sample1", "sample2", "sample3"), project = "sample", merge.data = TRUE)
sample_seurat <- AddMetaData(sample_seurat, [email protected], col.name = "orig.ident")
sample_treatment.groups <- rep("sample", 11483)
sample_seurat <- AddMetaData(sample_seurat, sample_treatment.groups, col.name = "treatment")

# Clustering
sample_cluster <- sample_seurat
sample_cluster <- RunPCA(sample_cluster)

Another question: before SCTransform the object had 31054 features and 11483 cells; after SCTransform it has 14092 features and 11483 cells. Is this loss of features expected after SCTransform?

Thank you.

Analysis Question

Most helpful comment

Hi,

You are getting the first error because the variable feature slot is wiped after the merge, since each original object may have different variable features. By default, RunPCA uses those features, and because the slot is empty you get an error. You can either set the variable features of the merged SCT assay yourself (to something like the intersection or union of the individual objects' variable features) or pass that vector of features to RunPCA directly.
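
Here is a minimal sketch of both options, reusing the object names from the question. `combine_variable_features` is a hypothetical helper written for this example, not a Seurat function. The Seurat part is guarded so it only runs when Seurat and the merged object are available. It assumes the per-sample objects still carry their SCT variable features, and that the chosen features are present in the merged assay's scaled data, which is worth checking for your Seurat version.

```r
# Hypothetical helper (not part of Seurat): combine per-sample variable
# features into one vector, using either their union or their intersection.
combine_variable_features <- function(feature_lists,
                                      mode = c("union", "intersection")) {
  mode <- match.arg(mode)
  combine <- if (mode == "union") union else intersect
  Reduce(combine, feature_lists)
}

# Usage with the objects from the question. Skipped when Seurat or the
# merged object is not available in the current session.
if (requireNamespace("Seurat", quietly = TRUE) && exists("sample_seurat")) {
  library(Seurat)

  features <- combine_variable_features(
    list(
      VariableFeatures(sample1.seurat),
      VariableFeatures(sample2.seurat),
      VariableFeatures(sample3.seurat)
    ),
    mode = "intersection"  # the more conservative choice; "union" also works
  )

  # Option 1: store the features on the merged SCT assay, then run PCA.
  VariableFeatures(sample_seurat[["SCT"]]) <- features
  sample_cluster <- RunPCA(sample_seurat)

  # Option 2: leave the slot empty and pass the features explicitly.
  sample_cluster <- RunPCA(sample_seurat, features = features)
}
```

The intersection keeps only features that were variable in every sample, so it is more likely to be fully covered by the merged data. The union keeps more features but may include some that were not scaled in every sample.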

There can be a loss of features after running SCTransform. I would read over the docs for ?sctransform::vst for more information. This will depend on your data, but one explanation could be the min_cells parameter, which filters out features that aren't detected in at least that many cells.
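
One rough way to check whether min_cells explains the drop is to count how many features are detected in at least that many cells in the raw counts. The sketch below uses a hypothetical helper, `features_passing_min_cells`, which only approximates the filter described above and is not sctransform's own code. The default of 5 is taken from the sctransform::vst documentation as I understand it. The guarded usage assumes a Seurat v3/v4-style object with the raw counts in an assay named "RNA", which may differ in your setup.

```r
# Hypothetical helper: return the features detected (count > 0) in at least
# `min_cells` cells. Works on a dense matrix or a sparse Matrix object.
features_passing_min_cells <- function(counts, min_cells = 5) {
  detected <- counts > 0
  cells_per_feature <- if (inherits(detected, "Matrix")) {
    Matrix::rowSums(detected)
  } else {
    rowSums(detected)
  }
  rownames(counts)[cells_per_feature >= min_cells]
}

# Usage with an object from the question. Skipped when Seurat or the
# object is not available in the current session.
if (requireNamespace("Seurat", quietly = TRUE) && exists("sample1.seurat")) {
  # Assumption: raw counts live in the "RNA" assay (Seurat v3/v4 accessor).
  counts <- Seurat::GetAssayData(sample1.seurat, assay = "RNA", slot = "counts")
  kept <- features_passing_min_cells(counts, min_cells = 5)
  message(length(kept), " of ", nrow(counts), " features pass min_cells = 5")

  # To retain more features, a lower threshold can be tried. This assumes
  # SCTransform forwards extra arguments to sctransform::vst:
  # sample1.seurat <- SCTransform(sample1.seurat,
  #                               vars.to.regress = "subsets_Mito_percent",
  #                               min_cells = 1)
}
```

If the count reported here is close to the number of features in the SCT assay, the min_cells filter is the likely cause. If not, other vst settings may be involved.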
