Seurat: Visualize CCA after merge

Created on 30 Oct 2017 · 15 Comments · Source: satijalab/seurat

Putting together two 10X datasets. Seurat 2.1. Following Seurat Alignment Tutorial. Got past the "add.cell.id" issue. RunCCA seems to have worked (after allotting more RAM). But I'm getting a persistent DimPlot error. See below:

> MergP2_16 <- RunCCA(object = mp2.data,object2 = me16p5.data, add.cell.id1 = 'mp2.data', add.cell.id2 = 'me16p5.data', genes.use = hvg.union)
Running CCA
Merging objects
Performing log-normalization
0%   10   20   30   40   50   60   70   80   90   100%
|----|----|----|----|----|----|----|----|----|----|
**************************************************|
[1] "Scaling data matrix"
  100%
> p1 <- DimPlot(object = MergP2_16, reduction.use = "cca", group.by = "protocol", pt.size = 0.5, do.return = TRUE)
Error in `$<-.data.frame`(`*tmp*`, "pt.size", value = 0.5) : 
  replacement has 1 row, data has 0
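
The message itself is a generic base R error and can be reproduced without Seurat: it appears when a single value is assigned as a new column to a data frame that has zero rows. The sketch below only illustrates where the wording comes from. The data frame `plot_data` is a hypothetical stand-in, not Seurat's internal object, so it does not prove what went wrong inside DimPlot.

```r
# Hypothetical stand-in for a plotting data frame that ended up with no rows.
plot_data <- data.frame(CC1 = numeric(0), CC2 = numeric(0))

# Assigning a length-1 value as a new column fails on a zero-row data frame.
msg <- tryCatch(
  {
    plot_data$pt.size <- 0.5
    "no error"
  },
  error = function(e) conditionMessage(e)
)
print(msg)
```

If the same message shows up in a real session, a reasonable first check is whether the data being plotted contains any cells at all.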

All 15 comments

Here are the other scripts in this project, up to the DimPlot error.

library(Seurat)
library(dplyr)
library(Matrix)
library(cowplot)
me16p5.data <- CreateSeuratObject(raw.data = me16p5.data)
me16p5.data <- NormalizeData(object = me16p5.data)
me16p5.data <- ScaleData(object = me16p5.data)
me16p5.data <- FindVariableGenes(object = me16p5.data, do.plot = FALSE)
mp2.data <- CreateSeuratObject(raw.data = mp2.data)
mp2.data <- NormalizeData(object = mp2.data)
mp2.data <- ScaleData(object = mp2.data)
mp2.data <- FindVariableGenes(object = mp2.data, do.plot = FALSE)
hvg.mp2.data <- rownames(x = head(x = mp2.data@hvg.info, n = 2000))
hvg.me16p5.data <- rownames(x = head(x = me16p5.data@hvg.info, n = 2000))
hvg.union <- union(x = hvg.mp2.data, y = hvg.me16p5.data)
mp2.data@meta.data[, "protocol"] <- "Mp2"
me16p5.data@meta.data[, "protocol"] <- "Me16.5"
MergP2_16 <- RunCCA(object = mp2.data,object2 = me16p5.data, add.cell.id1 = 'mp2.data', add.cell.id2 = 'me16p5.data', genes.use = hvg.union)
p1 <- DimPlot(object = MergP2_16, reduction.use = "cca", group.by = "protocol", pt.size = 0.5, do.return = TRUE)

I note that #184 shows the same error, and it was commented on that this was a bug that is fixed in "develop branch". Please explain how I implement that fix. thx

Dear AndyR2,

Please follow the instructions on this page to install the development version of Seurat. Once a new version of Seurat is released, it will include the fix, and you can then reinstall Seurat from CRAN by following the instructions on this page.

Best,
Leon

Dear AndyR2,

I have received an e-mail from you in which you state that installing the development version of Seurat did not fix your issue. However, I don't see the post here on GitHub. Did the issue get fixed?

Regarding the version of Seurat: the development version still reports itself as Seurat v2.1.0.

Best,
Leon

Leon,
I deleted that note because I didn't have the install completed correctly at the time. btw, the install instructions should add the line "library(devtools)" in order to be complete.

Anyway, I did get the install done and re-ran the project. Getting past DimPlot command, but now getting an error at the next line: VlnPlot. Pasting that chunk of scripts below.

> mp2.data@meta.data[, "protocol"] <- "Mp2"
> me16p5.data@meta.data[, "protocol"] <- "Me16.5"
> MergP2_16 <- RunCCA(object = mp2.data,object2 = me16p5.data, add.cell.id1 = 'mp2.data', add.cell.id2 = 'me16p5.data', genes.use = hvg.union)
Running CCA
Merging objects
Performing log-normalization
0%   10   20   30   40   50   60   70   80   90   100%
|----|----|----|----|----|----|----|----|----|----|
**************************************************|
[1] "Scaling data matrix"
  |=============================================| 100%
Warning messages:
1: In rownames(cca.data) == [email protected] :
  longer object length is not a multiple of shorter object length
2: In rownames(cca.data) == [email protected] :
  longer object length is not a multiple of shorter object length
3: In rownames(cca.data) == [email protected] :
  longer object length is not a multiple of shorter object length
4: In rownames(cca.data) == [email protected] :
  longer object length is not a multiple of shorter object length
> p1 <- DimPlot(object = MergP2_16, reduction.use = "cca", group.by = "protocol", pt.size = 0.5, do.return = TRUE)
> p2 <- VlnPlot(object = MergP2_16, features.plot = "CC1", group.by = "protocol", do.return = TRUE)
Error in data.frame(..., check.names = FALSE) : 
  arguments imply differing number of rows: 6946, 3464
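
Both messages above are generic base R messages, so they can be reproduced without Seurat. The warning comes from comparing two vectors of different lengths with `==`, and the error comes from `data.frame()` receiving columns of different lengths. The sketch below uses made-up cell names (`meta_cells` and `cca_cells` are hypothetical) to show the two messages and one way to list the names that do not match.

```r
# Hypothetical cell names: metadata covers 5 cells, the embedding only 3.
meta_cells <- paste0("sampleA_", c("AAAC", "AAAG", "AACT", "AAGT", "ACGT"))
cca_cells <- meta_cells[c(1, 3, 4)]

# Comparing vectors whose lengths are not multiples of each other recycles
# the shorter one and raises the warning seen in the RunCCA output.
warn <- tryCatch(
  meta_cells == cca_cells,
  warning = function(w) conditionMessage(w)
)
print(warn)

# Building a data frame from columns of different lengths raises the
# same kind of error that VlnPlot reported.
err <- tryCatch(
  data.frame(cell = meta_cells, CC1 = seq_along(cca_cells)),
  error = function(e) conditionMessage(e)
)
print(err)

# setdiff() lists the names present in one set but not in the other.
missing_cells <- setdiff(meta_cells, cca_cells)
print(missing_cells)
```

Applied to a real object, the same `setdiff()` comparison between the cell names in the metadata and the row names of the CCA embedding would show which cells are out of sync.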

Yes, I'm getting exactly the same issue as AndyR2, with the development version.
I installed the development version to fix the described bug with the RunCCA function.
My code will now run the DimPlot function, but fails on the VlnPlot function.
Since AndyR2 is reporting exactly the same problem, I'm assuming the problem is with the Seurat source code, and not my own code, but it's always possible that I'm making a mistake.
Thank you.

> # We set the sample name in each dataset for easy identification.
> # Later, it will be transferred to the merged object in RunCCA.
> data.sample.1@meta.data[, "sample"] <- sample.1

> data.sample.2@meta.data[, "sample"] <- sample.2

> data.merged <- RunCCA(object = data.sample.1, object2 = data.sample.2, genes.use = hvg.union,                                                                                                                                                                                 
+              add.cell.id1=sample.1, add.cell.id2=sample. .... [TRUNCATED]                                                                                                                                                                                                     
Running CCA                                                                                                                                                                                                                                                                     
Merging objects                                                                                                                                                                                                                                                                 
Performing log-normalization                                                                                                                                                                                                                                                    
0%   10   20   30   40   50   60   70   80   90   100%                                                                                                                                                                                                                          
|----|----|----|----|----|----|----|----|----|----|                                                                                                                                                                                                                             
**************************************************|                                                                                                                                                                                                                             
[1] "Scaling data matrix"                                                                                                                                                                                                                                                       
  |======================================================================================================================================================================| 100%                                                                                                 

> # Visualize results of CCA: plot CC1 versus CC2 and look at a violin plot.
> p1 <- DimPlot(object=data.merged, reduction.use="cca", group.by="sample" .... [TRUNCATED]                                                                                                                                                                                     

> p2 <- VlnPlot(object=data.merged, features.plot="CC1", group.by="sample", do.return=TRUE)                                                                                                                                                                                     
Error in data.frame(..., check.names = FALSE) :                                                                                                                                                                                                                                 
  arguments imply differing number of rows: 8871, 7471                                                                                                                                                                                                                          
In addition: Warning messages:                                                                                                                                                                                                                                                  
1: In rownames(cca.data) == [email protected] :                                                                                                                                                                                                                                 
  longer object length is not a multiple of shorter object length                                                                                                                                                                                                               
2: In rownames(cca.data) == [email protected] :                                                                                                                                                                                                                                 
  longer object length is not a multiple of shorter object length                                                                                                                                                                                                               
3: In rownames(cca.data) == [email protected] :                                                                                                                                                                                                                                
  longer object length is not a multiple of shorter object length                                                                                                                                                                                                               
4: In rownames(cca.data) == [email protected] :                                                                                                                                                                                                                                
  longer object length is not a multiple of shorter object length

I've attached a file with all my code, in case I'm doing something foolish.

alignment_R_Code.txt

Thanks for the quick fix attempt, but I'm still getting an error message.
If we look at it on the bright side, the difference between the numbers of rows is lower now, so we're making progress?

> p1 <- DimPlot(object=data.merged, reduction.use="cca", group.by="sample" .... [TRUNCATED]

> p2 <- VlnPlot(object=data.merged, features.plot="CC1", group.by="sample", do.return=TRUE)
Error in data.frame(..., check.names = FALSE) :       
  arguments imply differing number of rows: 8871, 8857

I should mention that I have underscores in my sample names, in case that poses a problem.

> head(data.merged@meta.data)
                                       nGene  nUMI    orig.ident                                                                                                                                                                                                       
GCRC1735_Control_1of2_AAACCTGAGATTACCC  4387 26353 SeuratProject                                                                                                                                                                                                       
GCRC1735_Control_1of2_AAACCTGAGCGAAGGG  2192  6626 SeuratProject                                                                                                                                                                                                       
GCRC1735_Control_1of2_AAACCTGAGGGTTTCT  3792 19108 SeuratProject                                                                                                                                                                                                       
GCRC1735_Control_1of2_AAACCTGAGTAACCCT  4834 25026 SeuratProject                                                                                                                                                                                                       
GCRC1735_Control_1of2_AAACCTGAGTGTACTC  2919 13846 SeuratProject                                                                                                                                                                                                       
GCRC1735_Control_1of2_AAACCTGCACAAGACG  2693 15357 SeuratProject                                                                                                                                                                                                       
                                                      sample                                                                                                                                                                                                           
GCRC1735_Control_1of2_AAACCTGAGATTACCC GCRC1735_Control_1of2                                                                                                                                                                                                           
GCRC1735_Control_1of2_AAACCTGAGCGAAGGG GCRC1735_Control_1of2                                                                                                                                                                                                           
GCRC1735_Control_1of2_AAACCTGAGGGTTTCT GCRC1735_Control_1of2
GCRC1735_Control_1of2_AAACCTGAGTAACCCT GCRC1735_Control_1of2
GCRC1735_Control_1of2_AAACCTGAGTGTACTC GCRC1735_Control_1of2
GCRC1735_Control_1of2_AAACCTGCACAAGACG GCRC1735_Control_1of2
> tail(data.merged@meta.data)
                                       nGene  nUMI    orig.ident
GCRC1735_Control_2of2_TTTGGTTGTCAGATAA  3686 18866 SeuratProject
GCRC1735_Control_2of2_TTTGGTTGTGACTACT  4771 32442 SeuratProject
GCRC1735_Control_2of2_TTTGGTTTCTCCAGGG  4138 18704 SeuratProject
GCRC1735_Control_2of2_TTTGTCAGTATTACCG  2003  7919 SeuratProject
GCRC1735_Control_2of2_TTTGTCAGTGTCAATC  4316 28769 SeuratProject
GCRC1735_Control_2of2_TTTGTCATCCGCGTTT  3733 19250 SeuratProject
                                                      sample
GCRC1735_Control_2of2_TTTGGTTGTCAGATAA GCRC1735_Control_2of2
GCRC1735_Control_2of2_TTTGGTTGTGACTACT GCRC1735_Control_2of2
GCRC1735_Control_2of2_TTTGGTTTCTCCAGGG GCRC1735_Control_2of2
GCRC1735_Control_2of2_TTTGTCAGTATTACCG GCRC1735_Control_2of2
GCRC1735_Control_2of2_TTTGTCAGTGTCAATC GCRC1735_Control_2of2
GCRC1735_Control_2of2_TTTGTCATCCGCGTTT GCRC1735_Control_2of2

blancha,
hi. thx for getting involved in fixing this issue. I guess I am in over my head, but how did you implement the fix provided by mohaveazure? Did you modify the Seurat code, or download a modified development version?

Ok. Assumed the right course was to reinstall Seurat dev. version. That went well. Re-ran the project and Voila! It worked through the plot_grid(p1, p2) command with no errors or warnings. Thx. Moving on to next step, hopefully.

Great. Works for me too.
Thanks!

On an unrelated subject, I'm still not quite clear on how the algorithm works exactly, so I'd love an opportunity to better understand the algorithm. If you ever provide any other information other than the published article, like a filmed conference presentation or a video-conference, I'd love to know.

@AndyR2
As you found out, you can just reinstall the package to squash the previous version.
If you want, you can also keep different versions in different folders, in case you don't want the latest version to squash the older one, or you don't want the development version to replace the master version.

library(devtools)
dir.create("~/R_alternative_packages")
withr::with_libpaths(new = "~/R_alternative_packages", install_github("satijalab/seurat", ref = "develop"), action="prefix")
withr::with_libpaths(new = "~/R_alternative_packages", library("Seurat"), action="prefix")
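
When several copies are installed side by side, it can help to check which version sits in which library. The following is a minimal sketch using only base R and `utils`; the helper name `seurat_version_in` is hypothetical, and the directory name is simply the one used in the commands above.

```r
# Directory name taken from the commands above; adjust as needed.
alt_lib <- path.expand("~/R_alternative_packages")

# Hypothetical helper: report the Seurat version found in one library
# directory, or NA if the directory or the package is absent.
seurat_version_in <- function(lib) {
  if (!dir.exists(lib)) {
    return(NA_character_)
  }
  pkgs <- rownames(utils::installed.packages(lib.loc = lib))
  if (!"Seurat" %in% pkgs) {
    return(NA_character_)
  }
  as.character(utils::packageVersion("Seurat", lib.loc = lib))
}

# Compare the alternative library with the default library paths.
print(seurat_version_in(alt_lib))
print(vapply(.libPaths(), seurat_version_in, character(1)))
```

Since the development version may report the same version number as the release, this check is most useful for confirming that a copy exists in the alternative folder at all.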

Hurray! Made it through clustering and tsne plotting. Thx for the help folks!

I do have a related Q: Suppose I have THREE data sets that I want to merge. Do I put them all in together at the start, e.g. expanding the number of initial Seurat objects created to 3, or do something like taking my new MergP2_16 dataset and adding another data set to that?

Good to see this is working now. See also #176

Hi Leon

I tried to install the development version of Seurat but got the error below. What should I do? Thx.

library(devtools)
install_github("satijalab/seurat", ref = "develop")
Installation failed: Problem with the SSL CA cert (path? access rights?)

Bests,
Na

@xnaer: This is not related to Seurat (see https://github.com/COMBINE-lab/wasabi/issues/6 and https://github.com/Fuzzy-Logix/AdapteR/issues/13 as other examples).

However,

install.packages(c("curl", "httr"))

might help (according to https://github.com/hadley/devtools/issues/1079#issuecomment-182390147).
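
Before and after reinstalling those packages, it may be worth checking what the local R installation reports about its curl and SSL support. This is only a diagnostic sketch under the assumption that the certificate error originates in the curl layer; it does not fix anything by itself.

```r
# Base R: TRUE if this R build was compiled with libcurl support.
has_libcurl <- capabilities("libcurl")
print(has_libcurl)

# If the curl package is installed, show which SSL backend it was built against.
if (requireNamespace("curl", quietly = TRUE)) {
  print(curl::curl_version()$ssl_version)
} else {
  message("Package 'curl' is not installed.")
}
```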


In case this helps, here is how I found out about it. :wink:

@mschilli87 : Thanks for your kind reply. I changed the command to install_git("git://github.com/satijalab/seurat.git", branch = "develop")
and it works.
