Hi,
Recently I have been using Seurat to process single-cell RNA data. Its FindMarkers function finds differentially expressed genes between two defined clusters, but I am confused by the “avg_logFC” values it reports. I tried to calculate them independently from the raw data, data, and scaled data stored in the Seurat object, and I can't reproduce the FindMarkers results with any of them. I would like to know how FindMarkers works on a Seurat object, ideally with the underlying math. Thanks very much!
@QuKunLab: Have a look here:
https://github.com/satijalab/seurat/blob/96d07d80bc4b6513b93e9c10d8a9d57ae7016f9f/R/differential_expression.R#L176-L178
Note that genes.use is used to consider only a subset of genes:
https://github.com/satijalab/seurat/blob/96d07d80bc4b6513b93e9c10d8a9d57ae7016f9f/R/differential_expression.R#L160-L164
Maybe adjusting min.pct gives you what you want:
https://github.com/satijalab/seurat/blob/96d07d80bc4b6513b93e9c10d8a9d57ae7016f9f/R/differential_expression.R#L82
Additionally, cells.1 and cells.2 are used to subset the cells, too:
https://github.com/satijalab/seurat/blob/96d07d80bc4b6513b93e9c10d8a9d57ae7016f9f/R/differential_expression.R#L103-L120
Finally, your implementation might not apply the same pseudocount:
https://github.com/satijalab/seurat/blob/96d07d80bc4b6513b93e9c10d8a9d57ae7016f9f/R/differential_expression.R#L91
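To make the points above concrete, here is a minimal sketch in plain Python (not Seurat code) of how an average log fold change of this kind can be computed. It rests on assumptions that should be checked against the linked source lines: that the normalized data slot holds natural-log log1p values, that each group's mean is taken on the un-logged scale via expm1, and that a pseudocount of 1 is added before taking the natural log again. The function names avg_logfc and pct_detected and the expression values are hypothetical.

```python
import math


def avg_logfc(expr_1, expr_2, pseudocount=1.0):
    """Natural-log fold change of one gene between two groups of cells.

    expr_1, expr_2: log1p-normalized expression values, one per cell.
    ASSUMPTION: the mean is taken after undoing the log (expm1), and the
    pseudocount is added to each group mean before logging again.
    """
    mean_1 = sum(math.expm1(x) for x in expr_1) / len(expr_1)
    mean_2 = sum(math.expm1(x) for x in expr_2) / len(expr_2)
    return math.log(mean_1 + pseudocount) - math.log(mean_2 + pseudocount)


def pct_detected(expr):
    """Fraction of cells with non-zero expression (the pct.1 / pct.2 idea)."""
    return sum(1 for x in expr if x > 0) / len(expr)


# Hypothetical log-normalized values for one gene in two groups of cells.
group_1 = [0.0, 0.0, 2.0, 2.5]
group_2 = [0.5, 0.6, 0.4, 0.0]

# A naive difference of the log-scale means, for comparison.
naive_diff = sum(group_1) / len(group_1) - sum(group_2) / len(group_2)

print("avg_logFC (sketch):", avg_logfc(group_1, group_2))
print("naive mean difference:", naive_diff)
print("pct.1, pct.2:", pct_detected(group_1), pct_detected(group_2))
```

The two printed fold-change values differ, which is one way an independent calculation on the data slot can disagree with FindMarkers even when the same cells and genes are used.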
Hi,
I would appreciate feedback on whether my interpretation of the following FindMarkers result is correct. First, the average expression of both genes is higher in group 1 cells than in group 2 cells. Second, expression is detected in only about 50% of group 1 cells, but I assume the average is calculated over all group 1 cells. If so, the average among the group 1 cells with detectable expression is about double the reported group average, while the other 50% of group 1 cells have no detectable expression and are therefore lower than group 2 cells, where expression is detected in nearly all cells.
gene      p_val     avg_logFC    pct.1  pct.2  p_val_adj
MARCKSL1  1.27E-08  0.389350674  0.465  0.925  2.53E-05
CD82      2.33E-07  0.268748748  0.549  0.95   0.000465365
Chenyi
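The arithmetic behind this interpretation can be sketched in plain Python with made-up numbers. The values below are hypothetical and are not taken from the dataset in this post; the sketch also assumes the group average is taken over all cells on the un-logged (expm1) scale. Under that assumption, the average over all cells equals the detected fraction times the average among detecting cells, so a gene detected in fewer cells can still have the higher group average.

```python
import math


def summarize(expr):
    """Return (pct detected, mean over all cells, mean over detecting cells).

    expr: log1p-normalized values of one gene, one per cell (hypothetical).
    ASSUMPTION: means are taken on the un-logged scale via expm1.
    """
    detected = [x for x in expr if x > 0]
    pct = len(detected) / len(expr)
    mean_all = sum(math.expm1(x) for x in expr) / len(expr)
    mean_detected = (
        sum(math.expm1(x) for x in detected) / len(detected) if detected else 0.0
    )
    return pct, mean_all, mean_detected


# Hypothetical data: group 1 detects the gene in half of its cells at a high
# level; group 2 detects it in 90% of its cells at a low level.
group_1 = [0.0] * 5 + [2.0] * 5
group_2 = [0.0] * 1 + [0.8] * 9

pct_1, mean_all_1, mean_det_1 = summarize(group_1)
pct_2, mean_all_2, mean_det_2 = summarize(group_2)

print("group 1:", pct_1, mean_all_1, mean_det_1)
print("group 2:", pct_2, mean_all_2, mean_det_2)
```

With a detected fraction of 0.5, the mean among detecting cells is exactly twice the mean over all cells, which matches the "about doubled" reasoning when the average is computed this way.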