Seurat: Anchor Feature & Can't find genes in integrated data

Created on 5 Mar 2019 · 11 Comments · Source: satijalab/seurat

Dear Seurat Team,

I'm trying to use Seurat v3 to integrate my replicates.

  1. I noticed that the default is anchor.features = 2000 for FindIntegrationAnchors. What were the criteria for setting this number to 2000, and under which circumstances should we change it?

  2. Another issue I'm encountering: when I try to make a violin plot using the integrated data, I cannot retrieve the plots for certain genes. I get this error:
    Could not find WNT3 in the default search locations, found in RNA assay instead
    Error in `[.data.frame`(data, , x, drop = FALSE) :
    undefined columns selected
    When I set the assay to "RNA", the gene shows up, but only for the violin plot. When I try DoHeatmap with the assay set to "RNA", it returns: "Error in DoHeatmap(gast44h.integrated, assay = "RNA", features = "WNT3") : No requested features found in the scale.data slot for the RNA assay."
    I'm confused about when I'm supposed to use "integrated", "RNA", "scale.data", or "data".

Really appreciate all your help.

Thanks!

All 11 comments

Hi,

  1. We found that 2,000 features worked well across a diverse range of data sets (and we keep this fixed in all of our analyses in the pre-print). We don't generally recommend altering this parameter but users can certainly try a range of values if they want to explore this.

  2. What is likely happening when the violin plot can't find certain genes is that those genes weren't included in the set of features.to.integrate in the IntegrateData function. By default this is set to only the features used in finding the anchors for efficiency (for large datasets, integrating everything results in a very large, non-sparse matrix). If memory isn't an issue for your analysis, feel free to set this to be all features or provide a vector of features to features.to.integrate that includes all genes you want to follow up on.
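The second point can be sketched roughly as follows. This is a minimal sketch, not the Seurat team's own code: `choose_features_to_integrate` and `integrate_with_extra_genes` are hypothetical helper names, the `anchor.features` slot of the anchor set is assumed to be available as in Seurat v3, and `object.list` is assumed to hold Seurat objects that are already normalized with variable features identified. The idea is to keep the default anchor features for finding anchors and only widen the set of genes that gets integrated.

```r
# Pure helper (base R only): pick the genes to pass to features.to.integrate.
# `gene.lists` is a list of character vectors, one per dataset.
choose_features_to_integrate <- function(gene.lists, anchor.features,
                                         genes.of.interest) {
  common <- Reduce(intersect, gene.lists)   # genes present in every dataset
  wanted <- union(anchor.features, genes.of.interest)
  intersect(wanted, common)                 # drop genes missing from any dataset
}

# Hypothetical wrapper around the Seurat v3 calls named in this thread.
integrate_with_extra_genes <- function(object.list, genes.of.interest) {
  # Anchors are still found with the default anchor.features = 2000
  anchors <- Seurat::FindIntegrationAnchors(object.list = object.list)
  to.integrate <- choose_features_to_integrate(
    gene.lists        = lapply(object.list, rownames),
    anchor.features   = anchors@anchor.features,  # assumed slot name (Seurat v3)
    genes.of.interest = genes.of.interest
  )
  Seurat::IntegrateData(anchorset = anchors,
                        features.to.integrate = to.integrate)
}

# Toy check of the helper with made-up gene lists
toy <- choose_features_to_integrate(
  gene.lists        = list(c("WNT3", "SOX2", "TBXT"), c("SOX2", "WNT3", "GATA6")),
  anchor.features   = "SOX2",
  genes.of.interest = c("WNT3", "GATA6")
)
print(toy)  # "SOX2" "WNT3" (GATA6 is dropped: it is absent from the first dataset)
```

Genes that are missing from any one dataset are dropped here on purpose, since they cannot be integrated across all of them.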

For a more detailed discussion about the difference between "data" and "scale data", please see FAQ 7. As for "RNA" vs "integrated", these are separate Assays. The "RNA" Assay stores all of the original "uncorrected" data, whereas "integrated" stores the "corrected" data returned from the integration procedure.

The DoHeatmap error you're getting is likely because "WNT3" wasn't scaled when you ran ScaleData on the data before integration (the "RNA" assay). By default, this function scales only the variable features identified via FindVariableFeatures.
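One possible way to act on this, assuming the Seurat v3 signatures of ScaleData, VlnPlot and DoHeatmap (each accepts an `assay` argument), is to scale every gene in the "RNA" assay before plotting. `scale_rna_assay` is a hypothetical helper name, and scaling all genes produces a dense scale.data matrix, so it may be too memory-hungry for very large datasets.

```r
# Hypothetical helper: fill the RNA assay's scale.data slot for every gene,
# so DoHeatmap(assay = "RNA") can find genes outside the variable features.
scale_rna_assay <- function(obj) {
  Seurat::ScaleData(
    obj,
    assay    = "RNA",
    features = rownames(obj[["RNA"]]),  # all genes, not only variable features
    verbose  = FALSE
  )
}

# Usage with the object from the question (not run here):
# gast44h.integrated <- scale_rna_assay(gast44h.integrated)
# Seurat::VlnPlot(gast44h.integrated, features = "WNT3", assay = "RNA")
# Seurat::DoHeatmap(gast44h.integrated, features = "WNT3", assay = "RNA")
```

Passing only `features = "WNT3"` to ScaleData would also work for a single plot, but it replaces the existing scale.data with that one gene, which is why the sketch scales all genes instead.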

Thank you so much! It's really helpful!

How can I provide a vector of features to features.to.integrate that includes all genes? It might be a naive question but I really wanna know.

I would like to also know how to do this as well. Thank you.

How can I provide a vector of features to features.to.integrate that includes all genes? It might be a naive question but I really wanna know.

You could put your genes of interest into a vector (e.g. my.genes <- c("BRCA1", "BRCA2")) and then call IntegrateData(..., features.to.integrate = my.genes, ...)

Not so sure if this answers your question.

How can I provide a vector of features to features.to.integrate that includes all genes? It might be a naive question but I really wanna know.

You could put your genes of interest into a vector (e.g. my.genes <- c("BRCA1", "BRCA2")) and then call IntegrateData(..., features.to.integrate = my.genes, ...)

Hope this could be helpful.

I would like to also know how to do this as well. Thank you.

How can I provide a vector of features to features.to.integrate that includes all genes? It might be a naive question but I really wanna know.

An easy way to get a vector of all genes in Seurat v3 is to use rownames(x = object).

An easy way to integrate all genes:

# find anchors between samples
x <- FindIntegrationAnchors(...)
# create list of common genes to keep
to_integrate <- Reduce(intersect, lapply(x@object.list, rownames))
# integrate data and keep full geneset
y <- IntegrateData(anchorset = x, features.to.integrate = to_integrate, ...)

In my opinion it is surprising for the user that all non-anchor genes are "thrown away" while performing the integration. I think it might be helpful to either change the default behavior or to add a parameter (integrate_all_genes=T) to keep all common genes.

Hi there,

Is it more RAM-expensive to perform integration with all genes?

I'm running with only 128GB RAM and working with some hundreds of thousands of cells.
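As a rough guide to this question, the size of the integrated matrix can be estimated by treating it as a dense double-precision matrix (8 bytes per value), in line with the earlier remark that integrating everything yields a very large, non-sparse matrix. The gene and cell counts below are assumed, illustrative numbers, and peak memory during integration may well be higher than this estimate.

```r
# Rough size of a dense double-precision matrix (8 bytes per value), in GB.
dense_matrix_gb <- function(n.genes, n.cells) {
  n.genes * n.cells * 8 / 1e9
}

# Assumed, illustrative numbers: 300,000 cells; 2,000 vs. 20,000 genes.
dense_matrix_gb(n.genes = 2000,  n.cells = 3e5)  # 4.8
dense_matrix_gb(n.genes = 20000, n.cells = 3e5)  # 48
```

Under these assumptions, integrating all genes needs about ten times the memory of the default 2,000 features for the integrated matrix alone.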

In my opinion it is surprising for the user that all non-anchor genes are "thrown away" while performing the integration. I think it might be helpful to either change the default behavior or to add a parameter (integrate_all_genes=T) to keep all common genes.

I'm running into the same problem. If I use the default parameters, I lose the majority of my genes of interest. When integrating with all features, the cells from different datasets don't intermingle nicely. It would be very useful to have the option to keep all features regardless of the number used for integration.

@dpleone we have that option: you can set the features to be integrated using the features.to.integrate parameter in IntegrateData. This is separate from the features used to find anchors.
