Dear,
I am using Seurat for scRNA-seq analysis in R and I am getting the following error:
Running CCA
Merging objects
Error in MergeSeurat(object1 = object, object2 = object2, do.scale = FALSE, :
Duplicate cell names, please provide 'add.cell.id1' and/or 'add.cell.id2' for unique names
Calls: RunCCA -> MergeSeurat
Execution halted
After a thorough investigation, I found that the error is triggered by the following line (line 29 in the attached file):
Data.combined <- RunCCA(iPSCs171, treat180, genes.use = genes.use, num.cc = 30)
This error occurs only with these two specific datasets. As far as I understand, it is caused by duplicate cell names that appear in both datasets.
Please could you advise me?
Attached is the R script.
library(Seurat)
iPSCs171.data <- Read10X(data.dir = "/media/ibex_scratch/171220_D00658_0041_AH3LLWBCX2-Lane2-iPSCs/outs/filtered_gene_bc_matrices/GRCh38")
treat180.data <- Read10X(data.dir = "/media/ibex_scratch/180221_D00658_0046_AH3LG3BCX2-iPSCs/outs/filtered_gene_bc_matrices/GRCh38")
iPSCs171 <- CreateSeuratObject(raw.data = iPSCs171.data, project = "iPSCs171_data", min.cells = 5)
iPSCs171@meta.data$run2 <- "iPSCs171"
iPSCs171 <- FilterCells(iPSCs171, subset.names = "nGene", low.thresholds = 500, high.thresholds = Inf)
iPSCs171 <- NormalizeData(iPSCs171)
iPSCs171 <- ScaleData(iPSCs171, display.progress = F)
treat180 <- CreateSeuratObject(raw.data = treat180.data, project = "treat180_data", min.cells = 5)
treat180@meta.data$run2 <- "treat180"
treat180 <- FilterCells(treat180, subset.names = "nGene", low.thresholds = 500, high.thresholds = Inf)
treat180 <- NormalizeData(treat180)
treat180 <- ScaleData(treat180, display.progress = F)
iPSCs171 <- FindVariableGenes(iPSCs171, do.plot = F)
treat180 <- FindVariableGenes(treat180, do.plot = F)
g.1 <- head(rownames(iPSCs171@hvg.info), 1000)
g.2 <- head(rownames(treat180@hvg.info), 1000)
genes.use <- unique(c(g.1, g.2))
genes.use <- intersect(genes.use, rownames(iPSCs171@scale.data))
genes.use <- intersect(genes.use, rownames(treat180@scale.data))
Data.combined <- RunCCA(iPSCs171, treat180, genes.use = genes.use, num.cc = 30)
Did you check if there are duplicate cell names? If that is the case, just go ahead and add a suffix/prefix, e.g. xxxxx-iPSC or xxxxx-treat. You can do it either manually before merging the data sets or during the merging by using the parameters mentioned in the error message: add.cell.id1 and/or add.cell.id2
Keep in mind that the cell names are simply the cell barcodes which come from a known (IIRC ~750k) library so it's not unlikely to have an overlap of barcodes between two data sets.
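To make the check concrete, here is a minimal base-R sketch of how one might look for shared barcodes and prefix them by hand. The two small matrices and their barcodes are made up for illustration; with the script above, the same calls would presumably be applied to colnames(iPSCs171.data) and colnames(treat180.data) before creating the Seurat objects.

```r
# Toy stand-ins for two expression matrices (genes x cells).
# The barcodes below are hypothetical, not taken from the real data.
m1 <- matrix(1:6, nrow = 2,
             dimnames = list(c("geneA", "geneB"),
                             c("AAAC-1", "AAAG-1", "AACT-1")))
m2 <- matrix(1:6, nrow = 2,
             dimnames = list(c("geneA", "geneB"),
                             c("AAAG-1", "AACT-1", "AATT-1")))

# Barcodes that occur in both runs; a non-empty result explains the error
shared <- intersect(colnames(m1), colnames(m2))
length(shared)

# Give every cell a run-specific prefix so the names become unique
colnames(m1) <- paste("iPSCs171", colnames(m1), sep = "_")
colnames(m2) <- paste("treat180", colnames(m2), sep = "_")

# anyDuplicated() returns 0 when no name is repeated
n_dup <- anyDuplicated(c(colnames(m1), colnames(m2)))
n_dup
```

Renaming the raw matrices before CreateSeuratObject avoids having to touch the individual slots of an existing object afterwards.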
Yes, the two datasets do have duplicated cell names. Thank you very much, it is now working.
I replaced the RunCCA call with the following:
Data.combined <- RunCCA(iPSCs171, treat180, add.cell.id1 = "iPSCs171", add.cell.id2 = "treat180", genes.use = genes.use, num.cc = 30)
Hi @farhanma ,
I had the same duplicate cell names issue today and solved it with the following code:
colnames(data1@data) <- paste("S1", colnames(data1@data), sep = "_")
colnames(data1@raw.data) <- paste("S1", colnames(data1@raw.data), sep = "_")
colnames(data1@scale.data) <- paste("S1", colnames(data1@scale.data), sep = "_")
rownames(data1@meta.data) <- paste("S1", rownames(data1@meta.data), sep = "_")
data1@cell.names <- paste("S1", data1@cell.names, sep = '_')
# and do the same for the other dataset, replace the 'S1' prefix with something like 'S2'