Hi, thanks for providing the new vignette on integration after performing SCT.
Can I ask for clarification on the workflow:
Is this correct?
I'm also referencing some closed issues related to this.
thanks.
I have a related question. After the sctransform-based integration approach, the object has 'SCT' and 'integrated' assays. What's the difference between them? Where in the workflow do I do QC corrections, such as removing low-quality genes/samples? At what point is sequencing depth (number of reads/number of UMIs) corrected for? And where in this workflow do I correct for mito.percent, cell cycle, etc.? Should I run sctransform again on 'integrated' using the argument 'vars.to.regress'?
In reference to @royfrancis's question, I performed the QC and scaling before integration. My current workflow is:
1) Create Seurat object
2) QC by filtering out cells based on percent.mito and nFeature_RNA
3) SCT normalize each dataset specifying the parameter vars.to.regress = percent.mito
4) Integrate all datasets
5) Run PCA, UMAP, FindNeighbors, FindClusters (on the default assay, which is "integrated")
6) Change default assay to "RNA"; normalize then generate FeaturePlots and perform differential expression analysis
I'm hoping Seurat developers can clarify if my workflow is correct. In addition to that, I wanted to ask if we should perform another round of SCTransform on the integrated dataset, either in the standard workflow or the SCTransform integration workflow.
Thanks - yes the workflow @nicodemus88 lists is the one we recommend. Do not run a second round of SCTransform on the integrated assay.
The only additional point I would make is that for point 6, you can also use the SCT assay instead of the RNA assay (this represents the SCT normalized values for each dataset, prior to integration).
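For anyone landing here later, the confirmed workflow can be sketched roughly as below. This is only a sketch under assumptions, not the official vignette code: `counts_list` and `run_sct_integration` are hypothetical names, the "^MT-" pattern assumes human gene symbols, and the QC thresholds, `nfeatures` and `dims` values are placeholders to adjust for your data.

```r
library(Seurat)

# counts_list: hypothetical named list of gene-by-cell count matrices, one per dataset.
# max_mito / min_features: placeholder QC thresholds, not recommended values.
run_sct_integration <- function(counts_list, max_mito = 10, min_features = 200,
                                final_assay = "RNA") {
  objs <- lapply(names(counts_list), function(nm) {
    # 1) Create Seurat object
    obj <- CreateSeuratObject(counts = counts_list[[nm]], project = nm)

    # 2) QC: filter cells on percent.mito and nFeature_RNA
    #    ("^MT-" assumes human mitochondrial gene names)
    obj[["percent.mito"]] <- PercentageFeatureSet(obj, pattern = "^MT-")
    keep <- colnames(obj)[obj$percent.mito < max_mito &
                            obj$nFeature_RNA > min_features]
    obj <- subset(obj, cells = keep)

    # 3) SCT normalize each dataset, regressing out percent.mito
    SCTransform(obj, vars.to.regress = "percent.mito", verbose = FALSE)
  })

  # 4) Integrate all datasets using the SCT-normalized values
  features <- SelectIntegrationFeatures(object.list = objs, nfeatures = 3000)
  objs <- PrepSCTIntegration(object.list = objs, anchor.features = features)
  anchors <- FindIntegrationAnchors(object.list = objs,
                                    normalization.method = "SCT",
                                    anchor.features = features)
  integrated <- IntegrateData(anchorset = anchors, normalization.method = "SCT")

  # 5) Dimensional reduction and clustering on the "integrated" assay.
  #    No second round of SCTransform is run here.
  DefaultAssay(integrated) <- "integrated"
  integrated <- RunPCA(integrated, verbose = FALSE)
  integrated <- RunUMAP(integrated, dims = 1:30)
  integrated <- FindNeighbors(integrated, dims = 1:30)
  integrated <- FindClusters(integrated)

  # 6) Switch to "RNA" (and log-normalize) or to "SCT" for plotting and DE
  DefaultAssay(integrated) <- final_assay
  if (final_assay == "RNA") {
    integrated <- NormalizeData(integrated)
  }
  integrated
}
```

With the returned object you would then call something like `FeaturePlot(obj, features = ...)` and `FindMarkers(obj, ident.1 = ...)`. Passing `final_assay = "SCT"` corresponds to the alternative mentioned above for point 6.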