I know y'all are probably working on a vignette for this, but until then I just want to clarify the current workflow using SCTransform and integration. Much of this I got from reading other issues, though I don't remember most of the issue numbers (from here and the sctransform repo) to reference.
1) Normalize with vst (NOT SCT) and find hvg on each dataset (because SCT-normalized data can't currently be used for integration, according to #1611). Possibly also scale it to regress out appropriate features.
2) Find integration anchors with the RNA assay and integrate the data. Do this without scaling if scaling was done previously.
3) Scale the integrated assay.
4) At some point after this, perform SCTransform.
5) Run PCA, dimensionality reduction, and clustering using the integrated assay.
6) Use the SCT assay for DEG calculation, as well as for plots such as feature plots and violin plots.
Would you say this sounds correct?
Thanks,
Gervaise
I would also like to confirm the current workflow for sctransform & integration. My workflow is similar to that of @GHAStVHenry, as detailed below:
1) Normalize data with vst and find hvg in each dataset
2) Find integration anchors with RNA assay
3) Integrate data
4) Perform scaling on integration assay
5) Perform SCTransform
6) Run PCA and dimensionality reduction & clustering using SCT assay
7) Change to RNA assay for DEG calculations
Does this workflow sound reasonable?
I was wondering whether SCTransform is necessary after integration, and whether I can perform PCA & dimensionality reduction using the SCT assay. My results look totally different from the workflow without SCTransform, where I run PCA & dimensionality reduction on the integrated assay after scaling. Although I should say, the clustering after SCTransform looks much better.
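For readers following along, the steps asked about so far can be sketched roughly as below. This is only an illustration of the workflow as described in the questions, not a recommended or confirmed pipeline. It assumes the Seurat v3-style API; `integrate_then_sct`, `objs` (a list of Seurat objects, one per dataset), `cluster_assay`, and `dims` are hypothetical names introduced here. The `cluster_assay` argument switches between the two variants in the thread: PCA on the integrated assay, or PCA on the SCT assay.

```r
# Sketch only: the "integrate first, SCTransform afterwards" workflow as asked
# about in this thread. Assumes Seurat v3-style functions; `objs` is a
# hypothetical list of Seurat objects, one per dataset.
integrate_then_sct <- function(objs,
                               cluster_assay = c("integrated", "SCT"),
                               dims = 1:30) {
  cluster_assay <- match.arg(cluster_assay)

  # Log-normalize each dataset and select variable features with "vst"
  objs <- lapply(objs, function(x) {
    x <- Seurat::NormalizeData(x, verbose = FALSE)
    Seurat::FindVariableFeatures(x, selection.method = "vst",
                                 nfeatures = 2000, verbose = FALSE)
  })

  # Find anchors on the RNA assay and integrate
  anchors  <- Seurat::FindIntegrationAnchors(object.list = objs, dims = dims)
  combined <- Seurat::IntegrateData(anchorset = anchors, dims = dims)

  # Scale the integrated assay
  combined <- Seurat::ScaleData(combined, assay = "integrated", verbose = FALSE)

  # Run SCTransform on the RNA counts; this adds an "SCT" assay
  combined <- Seurat::SCTransform(combined, assay = "RNA", verbose = FALSE)

  # PCA, UMAP, and clustering on the chosen assay
  combined <- Seurat::RunPCA(combined, assay = cluster_assay, verbose = FALSE)
  combined <- Seurat::RunUMAP(combined, dims = dims)
  combined <- Seurat::FindNeighbors(combined, dims = dims)
  combined <- Seurat::FindClusters(combined)
  combined
}
```

Differential expression would then be run on whichever assay is preferred, for example `Seurat::FindAllMarkers(combined, assay = "RNA")` or the same call with `assay = "SCT"`.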
We will be releasing a vignette in the near future demonstrating our recommended workflow for this task, which will involve first running SCTransform and then directly integrating the Pearson residuals. I apologize for the delay; these projects were run independently in the lab and happened to be completed at around the same time, so it took us some more time to combine the workflows.
We do see improved results with the combined approach, and hope that you will see this in your data as well.
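Until the vignette is available, the combined approach described above (SCTransform first, then integration of the Pearson residuals) might look something like the sketch below. It is an assumption about how this is exposed in Seurat 3.1 and later, using `PrepSCTIntegration` and the `normalization.method = "SCT"` argument; the thread itself does not confirm these calls. `sct_then_integrate`, `objs`, `nfeatures`, and `dims` are hypothetical names chosen for illustration.

```r
# Sketch only: SCTransform each dataset, then integrate the Pearson residuals.
# Assumes Seurat >= 3.1; `objs` is a hypothetical list of Seurat objects.
sct_then_integrate <- function(objs, nfeatures = 3000, dims = 1:30) {
  # Normalize each dataset independently with SCTransform
  objs <- lapply(objs, function(x) Seurat::SCTransform(x, verbose = FALSE))

  # Choose features for integration and make sure their residuals are present
  features <- Seurat::SelectIntegrationFeatures(object.list = objs,
                                                nfeatures = nfeatures)
  objs <- Seurat::PrepSCTIntegration(object.list = objs,
                                     anchor.features = features)

  # Find anchors and integrate, telling Seurat the data are SCT-normalized
  anchors <- Seurat::FindIntegrationAnchors(object.list = objs,
                                            normalization.method = "SCT",
                                            anchor.features = features)
  combined <- Seurat::IntegrateData(anchorset = anchors,
                                    normalization.method = "SCT")

  # No ScaleData step here: the integrated residuals are used directly
  combined <- Seurat::RunPCA(combined, verbose = FALSE)
  combined <- Seurat::RunUMAP(combined, dims = dims)
  combined <- Seurat::FindNeighbors(combined, dims = dims)
  Seurat::FindClusters(combined)
}
```

Compared with the workflows asked about earlier, this version drops the separate log-normalization and the scaling of the integrated assay, since SCTransform already produces the values that are integrated.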
Thanks, I totally understand the wait. I know I speak for a lot of people when I say we appreciate all y'all's work. We are anxiously awaiting this!