Seurat: Batch effect correction for multiple conditions and replicates

Created on 11 Mar 2019 · 2 Comments · Source: satijalab/seurat

Hi,
I am writing to seek advice on how best to integrate datasets that come from different conditions and experiments. For example, COND1 had Exp1.1 and Exp1.2, and COND2 had Exp2.1 and Exp2.2. The process I followed is:

  • merge COND1/Exp1.1 and COND1/Exp1.2
  • after the usual pre-processing of the merged object for COND1, correct for batch in ScaleData using the experiment id
  • do the same for COND2
  • then merge the two objects, COND1 and COND2, for a combined analysis.
    The problem is that after merging COND1 and COND2 in the last step I have to normalize and run ScaleData again, which would lose the batch-corrected expression values.

Could you please suggest the best way to analyze this kind of data set?
Is using CCA an option here? I thought it was not, because it would automatically correct for batch across all conditions, which would remove the biological differences between conditions.

Thanks,

Pankaj

All 2 comments

Hi Pankaj, my advice would be to first integrate all the datasets together using the integration methods in Seurat v3 (see https://satijalab.org/seurat/pancreas_integration_label_transfer.html). You can then perform clustering on the integrated data to identify common cell states across conditions and replicates. You can find differentially expressed genes between clusters or between control/treatment within a cluster using the uncorrected data, using logistic regression with replicate as a latent variable (FindMarkers with test.use="LR", and latent.vars="replicate").
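
The workflow described above could be sketched roughly as follows with the Seurat v3 API. This is a minimal sketch under stated assumptions, not a definitive pipeline: the function name `integrate_and_compare`, the metadata column names `condition` and `replicate`, the labels "COND1"/"COND2", and the parameter values (`dims = 1:30`, `nfeatures = 2000`, `resolution = 0.5`) are illustrative choices that do not come from the thread. It also assumes that `replicate` is coded the same way in both conditions (for example, hypothetical labels "rep1" and "rep2" in each), so that it does not simply duplicate the condition label.

```r
library(Seurat)

# Hypothetical helper: `exp_list` is assumed to be a list of Seurat objects,
# one per experiment (Exp1.1, Exp1.2, Exp2.1, Exp2.2), each carrying the
# metadata columns `condition` and `replicate` (assumed names).
integrate_and_compare <- function(exp_list, cluster_id = 0, dims = 1:30) {
  # Normalize each dataset and pick variable features independently
  exp_list <- lapply(exp_list, function(obj) {
    obj <- NormalizeData(obj, verbose = FALSE)
    FindVariableFeatures(obj, selection.method = "vst",
                         nfeatures = 2000, verbose = FALSE)
  })

  # Integrate all experiments together, across conditions and replicates
  anchors <- FindIntegrationAnchors(object.list = exp_list, dims = dims)
  combined <- IntegrateData(anchorset = anchors, dims = dims)

  # Cluster on the integrated (corrected) assay to find shared cell states
  DefaultAssay(combined) <- "integrated"
  combined <- ScaleData(combined, verbose = FALSE)
  combined <- RunPCA(combined, npcs = max(dims), verbose = FALSE)
  combined <- FindNeighbors(combined, dims = dims)
  combined <- FindClusters(combined, resolution = 0.5)

  # Differential expression within one cluster on the uncorrected RNA assay,
  # using logistic regression with replicate as a latent variable
  cluster_cells <- subset(combined, idents = cluster_id)
  Idents(cluster_cells) <- "condition"
  de <- FindMarkers(cluster_cells,
                    ident.1 = "COND2", ident.2 = "COND1",
                    assay = "RNA",
                    test.use = "LR", latent.vars = "replicate")

  list(object = combined, markers = de)
}
```

Calling `integrate_and_compare(list(exp1.1, exp1.2, exp2.1, exp2.2))` with four pre-built objects (hypothetical variable names) would return the integrated object together with the COND2 versus COND1 markers for cluster 0. The integrated assay is used only for clustering; the test itself runs on the RNA assay, as the comment recommends.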

@timoast, would there be any reason not to make use of sctransform:::compare_expression? It would require some extra work compared to FindMarkers, but it seems (?) to integrate more directly with the SCTransform results.
