Hi,
I am trying to use a list of genes as canonical markers for several cell types that are present in my data. However, these markers are not among the top 10 genes per cluster in the FindAllMarkers output. The cells of these types (each type has 4-5 signature genes) are dispersed and do not segregate with the clustering. Is there a way to use this signature gene list to identify the cell types and then use that information to label the tSNE?
Thanks,
Bibaswan
Hello Bibaswan,
I think you should consider a supervised clustering approach in your case. To do that, create a character vector containing those genes and pass it to the pc.genes argument of RunPCA(), before running FindClusters() and RunTSNE().
Best,
Leon
Thanks Leon. I should have 9 cell types, each with its own signature genes. Should I pass all the genes to pc.genes together, or as separate character vectors?
The easiest is to simply create a single character vector with all the genes. Otherwise you can combine the individual vectors when passing them to pc.genes, using c(vector_1, vector_2, ...). The idea is to use all these genes for dimensionality reduction and clustering. Also, make sure to adjust the pcs.compute argument in RunPCA() depending on the number of genes you are passing to the function.
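As a rough sketch of the above: the gene names and the `sig_a`, `sig_b`, `sig_c`, `marker_genes`, `n_pcs` and `seurat_obj` names below are placeholders for illustration, not anything from your data. The RunPCA() call is left as a comment because it needs an existing Seurat object.

```r
# Hypothetical signature vectors, one per cell type (placeholder gene names).
sig_a <- c("GENE1", "GENE2", "GENE3")
sig_b <- c("GENE4", "GENE5")
sig_c <- c("GENE5", "GENE6", "GENE7")  # GENE5 is shared with sig_b

# Combine into a single character vector and drop duplicated genes.
marker_genes <- unique(c(sig_a, sig_b, sig_c))

# Keep the number of PCs below the number of genes passed in.
n_pcs <- min(20, length(marker_genes) - 1)

# The combined vector would then go to pc.genes (not run here):
# seurat_obj <- RunPCA(seurat_obj, pc.genes = marker_genes, pcs.compute = n_pcs)
```

Using unique() is my own addition; it only matters if the same gene appears in more than one signature.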
Best,
Leon
Hi Leon,
Thanks again. Just to clarify, how should pcs.compute vary with the number of genes in pc.genes?
Bibaswan
Hello Bibaswan,
If you input only 10 genes for PCA dimensionality reduction, you cannot compute 20 PCs, as PCA is used to reduce the dimension of your input data. That is, the number of input genes should be higher than the number of PCs to be computed.
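To illustrate the limit, here is a small sketch using base R's prcomp() on simulated data instead of Seurat. The matrix `expr` and its gene names are made up for the example, and Seurat's own PCA routine may enforce the limit differently.

```r
set.seed(1)

# Simulated expression matrix: 50 cells (rows) x 10 genes (columns).
expr <- matrix(rnorm(50 * 10), nrow = 50, ncol = 10,
               dimnames = list(NULL, paste0("GENE", 1:10)))

# PCA on the 10 genes.
pca <- prcomp(expr, center = TRUE, scale. = TRUE)

# Number of principal components returned; it cannot exceed the number of genes.
n_components <- ncol(pca$x)
n_components
```

With only 10 input genes there are at most 10 components, so asking for 20 PCs cannot work.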
Best,
Leon
Okay understood. Thanks a lot.
Hi!!
Thanks for asking this question ghoshal.
I am actually trying a supervised clustering approach to identify subpopulations.
I created a single character vector with all the canonical markers, then ran RunPCA, FindClusters and RunTSNE, supplying that gene list in place of the default genes.use (for FindClusters and RunTSNE) or pc.genes (for RunPCA).
Then, when I run the FindMarkers function with the genes as a vector, I get:
_Error in intI(i, n = d[1], dn[[1]], give.dn = FALSE) : invalid character indexing_
and when I read the same list of genes from a .txt file, I get:
_Error in data.use[genes.use, cells.1, drop = F] : invalid or not-yet-implemented 'Matrix' subsetting_
What should I do to run the FindMarkers function?
thanks all!