Thanks for the great software. I have a question about it.
How can I extract Ensembl IDs from a Seurat object? I found that one gene name can correspond to two IDs. For example, ATXN7 corresponds to both ENSG00000285258 and ENSG00000163635, and the Seurat object contains both ATXN7 and ATXN7.1, so I don't know which ID corresponds to ATXN7 and which to ATXN7.1.
Hi, the function Read10X has an argument gene.column that can be used to switch between using the gene name and the Ensembl ID when reading in 10x Genomics datasets. You can also use the use.names parameter when using Read10X_h5.
Alternatively, you can work out which gene name in the Seurat object corresponds to which Ensembl ID by loading the features.tsv file for your dataset and running make.unique on the gene names. For example:
> genes <- read.table("/home/stuartt/data/10x_scrna/pbmc10k_v3/filtered_feature_bc_matrix/features.tsv", stringsAsFactors = FALSE)
> genes[genes$V2 == "ATXN7", ]
V1 V2 V3 V4
6094 ENSG00000285258 ATXN7 Gene Expression
6095 ENSG00000163635 ATXN7 Gene Expression
> genes$V2 <- make.unique(genes$V2)
> head(genes)
V1 V2 V3 V4
1 ENSG00000243485 MIR1302-2HG Gene Expression
2 ENSG00000237613 FAM138A Gene Expression
3 ENSG00000186092 OR4F5 Gene Expression
4 ENSG00000238009 AL627309.1 Gene Expression
5 ENSG00000239945 AL627309.3 Gene Expression
6 ENSG00000239906 AL627309.2 Gene Expression
> genes[genes$V2 == "ATXN7.1", ]
V1 V2 V3 V4
6095 ENSG00000163635 ATXN7.1 Gene Expression
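The same idea can be turned into a reusable lookup. The following is a minimal sketch in base R, not part of the original answer: it uses a small in-memory data frame as a stand-in for features.tsv (the rows are illustrative, and the name ensembl_by_name is hypothetical), and it assumes the gene names in the object were produced by make.unique on the file's gene-name column in file order.

```r
# Toy stand-in for features.tsv: V1 = Ensembl ID, V2 = gene name.
# In practice this would come from read.table() as in the transcript above.
genes <- data.frame(
  V1 = c("ENSG00000285258", "ENSG00000163635", "ENSG00000186092"),
  V2 = c("ATXN7", "ATXN7", "OR4F5"),
  stringsAsFactors = FALSE
)

# Named character vector: de-duplicated gene name -> Ensembl ID.
# make.unique() appends ".1", ".2", ... to repeated names in order of
# appearance, so the second ATXN7 row becomes "ATXN7.1".
ensembl_by_name <- setNames(genes$V1, make.unique(genes$V2))

# Look up the IDs for any set of de-duplicated gene names.
ids <- unname(ensembl_by_name[c("ATXN7", "ATXN7.1")])
print(ids)
```

Names that are not in the lookup return NA, which makes it easy to spot features that were renamed by some other step.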
@timoast I have a similar question. I have a merged Seurat object and I want to convert the gene names to Ensembl gene IDs. However, because the gene names have been modified, I can't simply use biomaRt, for example, as it does not recognize some of the gene names. Do you have any suggestions? Thank you.
I have the same question as @kaizen89 and am running into the same issue.