Seurat: nCount_integrated & nFeature_integrated column value "NA"

Created on 5 Apr 2019 · 11 Comments · Source: satijalab/seurat

I am using Seurat v3 to integrate two datasets. After "IntegrateData" and "RunUMAP", I got the UMAP with cell clusters.
But when I checked the meta data of data.combined, I found the columns of nCount_integrated & nFeature_integrated filled with NA. I am wondering if it is normal to have NA values in these two columns?

Thanks a lot

Most helpful comment

Hi,

Sorry for the progress bar confusion. I've copied your example above and indicated where the progress bar starts and stops. Because of how printing to the console is handled in R, the progress bar essentially gets interrupted and continues on the lines below the messages about "Merging objects", "Finding neighborhoods", etc.

Computing 2000 integration features
Scaling features for provided objects
   |++++++++++++++++++++++++++++++++++++++++++++++++++| 100% elapsed = 03s
Finding all pairwise anchors
   |                                                  | 0 % ~calculating  Running CCA ### START
Merging objects
Finding neighborhoods
Finding mutual nearest neighborhoods
    Found 8967 anchors
Filtering Anchors
    Retained 7622 anchors
Extracting within-dataset neighbors!
   |++++++++++++++++++++++++++++++++++++++++++++++++++| 100% elapsed = 32s ### END
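
The interleaving marked above can be reproduced with base R alone. The following is only a sketch: it uses utils::txtProgressBar as a stand-in for the progress bar that Seurat draws, and slow_step is a made-up function that plays the role of the anchor-finding work. The bar is drawn at 0% first, the status messages land after it, and the bar only reaches 100% once the work has finished.

```r
# Made-up stand-in for one long-running step that prints its own status line.
slow_step <- function(i) {
  message("Finding neighborhoods (step ", i, ")")
  Sys.sleep(0.1)
  i^2
}

steps <- 1:3
# Creating the bar draws it immediately at 0%.
pb <- txtProgressBar(min = 0, max = length(steps), style = 3)
results <- numeric(length(steps))
for (i in steps) {
  results[i] <- slow_step(i)  # the message is printed after the 0% bar
  setTxtProgressBar(pb, i)    # the bar is then redrawn with the new percentage
}
close(pb)
```

A bar stuck at 0% followed by status lines is therefore not a sign of failure on its own; what matters is that the run ends with a completed bar and no error.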

If you see the integrated assay in your Seurat object, this should indicate the function ran without error.

There shouldn't be any counts in the integrated assay because it is no longer "count" data: after integration the values are normalized/corrected, so we fill the "data" slot instead. It's also expected that there are no meta.features immediately after running integration, as these are filled in by other functions (like FindVariableFeatures). You'll notice similar behavior when you first create a Seurat object (the meta.features of the RNA assay will be empty initially).

As for the NAs in the nCount_integrated and nFeature_integrated columns of meta.data, these can be ignored. It looks like by default we're calculating these based on the counts slot whenever an assay is added, which doesn't make much sense for the integrated assay because it has nothing in counts (hence the NAs).
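
If the all-NA columns get in the way (for example when exporting the metadata), they can be found and dropped with plain data frame operations. This is a minimal sketch on a made-up table: meta and its values are hypothetical stand-ins for the cell-level metadata, which in Seurat v3 is stored as a data frame in the meta.data slot of the object.

```r
# Hypothetical stand-in for the metadata of an integrated object.
meta <- data.frame(
  nCount_RNA          = c(1200, 950, 3010),
  nFeature_RNA        = c(600, 480, 1100),
  nCount_integrated   = NA_real_,
  nFeature_integrated = NA_real_,
  row.names = c("WT_AAAC", "WT_AAAG", "KO_AAAC")
)

# Flag the columns in which every single value is NA.
all_na <- vapply(meta, function(col) all(is.na(col)), logical(1))
names(all_na)[all_na]

# Keep only the informative columns in a cleaned copy.
meta_clean <- meta[, !all_na, drop = FALSE]
```

The per-cell totals for the original data are untouched by this; nCount_RNA and nFeature_RNA remain the columns to use for QC.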

All 11 comments

I have also found another problem in my run, shown below:
> Sample.combined <- FindIntegrationAnchors(object.list = list(WT, KO), dims = 1:50)
Computing 2000 integration features
Scaling features for provided objects
   |++++++++++++++++++++++++++++++++++++++++++++++++++| 100% elapsed = 04s
Finding all pairwise anchors
   |                                                  | 0 % ~calculating  Running CCA
Merging objects
Finding neighborhoods
Finding mutual nearest neighborhoods
    Found 10464 anchors
Filtering Anchors
    Retained 6835 anchors
Extracting within-dataset neighbors!
   |++++++++++++++++++++++++++++++++++++++++++++++++++| 100% elapsed = 01m 22s
The line with "Running CCA" is shown as "0 %". Does that mean it failed to find pairwise anchors?

Thank you

I have the same problem.

I'm using Seurat (v 3.0.0.9000 - downloaded this morning (9.4.2019) using devtools) with R (v3.5.2). Below is a condensed version of my code. I create the seurat object, and normalize using SCTransform as indicated in this post.

## Normalize using SCTransform after CreateSeuratObject() and subset()
seu_con <- SCTransform(object = seu_con, vars.to.regress = "percent.mt", verbose = FALSE)
seu_exp <- SCTransform(object = seu_exp, vars.to.regress = "percent.mt", verbose = FALSE)

## Perform integration
seuAnchors <- FindIntegrationAnchors(object.list = list(seu_con,seu_exp), assay = c("SCT","SCT"), verbose = TRUE, dims = 1:15)

For me, Running CCA also ends with 0%; however, I do get the following output, and the slots "anchors", "offsets", and "anchor.features" all have something in them. So it seems like FindIntegrationAnchors() is probably working.

Computing 2000 integration features
Scaling features for provided objects
   |++++++++++++++++++++++++++++++++++++++++++++++++++| 100% elapsed = 03s
Finding all pairwise anchors
   |                                                  | 0 % ~calculating  Running CCA
Merging objects
Finding neighborhoods
Finding mutual nearest neighborhoods
    Found 8967 anchors
Filtering Anchors
    Retained 7622 anchors
Extracting within-dataset neighbors!
   |++++++++++++++++++++++++++++++++++++++++++++++++++| 100% elapsed = 32s

Running the following also gives me metadata columns filled with NAs.

integ.data <- IntegrateData(anchorset = seuAnchors, new.assay.name = 'integrate', dims = 1:15)

I've tried looking through the integration.R script, but I'm having trouble figuring out what is saved where and why (when I try to run the code line by line, some things aren't called in correctly, so this probably isn't the best way to debug :) ).

One other note, there are no counts or meta features in my integrated assay. Is this expected? Could you suggest a way for us to check whether or not it worked correctly, or is a new "integrated assay" in the output enough to say that it worked?

Thank you!

> sessionInfo()
R version 3.5.2 (2018-12-20)
Platform: x86_64-apple-darwin15.6.0 (64-bit)
Running under: macOS Mojave 10.14.4

Matrix products: default
BLAS: /Library/Frameworks/R.framework/Versions/3.5/Resources/lib/libRblas.0.dylib
LAPACK: /Library/Frameworks/R.framework/Versions/3.5/Resources/lib/libRlapack.dylib

locale:
[1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
[1] cowplot_0.9.4     ggplot2_3.1.0     Seurat_3.0.0.9000

loaded via a namespace (and not attached):
 [1] httr_1.4.0          tidyr_0.8.3         jsonlite_1.6       
 [4] viridisLite_0.3.0   splines_3.5.2       lsei_1.2-0         
 [7] R.utils_2.8.0       gtools_3.8.1        Rdpack_0.10-1      
[10] assertthat_0.2.1    ggrepel_0.8.0       globals_0.12.4     
[13] pillar_1.3.1        lattice_0.20-38     reticulate_1.11.1  
[16] glue_1.3.1          digest_0.6.18       RColorBrewer_1.1-2 
[19] SDMTools_1.1-221    colorspace_1.4-1    htmltools_0.3.6    
[22] Matrix_1.2-17       R.oo_1.22.0         plyr_1.8.4         
[25] pkgconfig_2.0.2     bibtex_0.4.2        tsne_0.1-3         
[28] listenv_0.7.0       purrr_0.3.2         scales_1.0.0       
[31] RANN_2.6.1          gdata_2.18.0        Rtsne_0.15         
[34] tibble_2.1.1        withr_2.1.2         ROCR_1.0-7         
[37] pbapply_1.4-0       lazyeval_0.2.2      survival_2.44-1.1  
[40] magrittr_1.5        crayon_1.3.4        R.methodsS3_1.7.1  
[43] future_1.12.0       nlme_3.1-137        MASS_7.3-51.4      
[46] gplots_3.0.1.1      ica_1.0-2           tools_3.5.2        
[49] fitdistrplus_1.0-14 data.table_1.12.0   gbRd_0.4-11        
[52] stringr_1.4.0       plotly_4.8.0        munsell_0.5.0      
[55] cluster_2.0.7-1     irlba_2.3.3         compiler_3.5.2     
[58] rsvd_1.0.0          caTools_1.17.1.2    rlang_0.3.3        
[61] grid_3.5.2          ggridges_0.5.1      htmlwidgets_1.3    
[64] igraph_1.2.4        bitops_1.0-6        npsurv_0.4-0       
[67] gtable_0.3.0        codetools_0.2-16    R6_2.4.0           
[70] zoo_1.8-5           dplyr_0.8.0.1       future.apply_1.2.0 
[73] KernSmooth_2.23-15  metap_1.1           ape_5.3            
[76] stringi_1.4.3       parallel_3.5.2      Rcpp_1.0.1         
[79] png_0.1-7           tidyselect_0.2.5    lmtest_0.9-36      

Thanks so much for your answer! That makes sense and it's good to know that I don't need to worry about it.

Thank you very much. I really appreciate the answer and discussion from all
of you.
All the best!

Hi,

I am having the same trouble as @allie-burns. Don't be surprised that there are not many anchors: I have very few cells (629 and 112). I was able to run SCTransform beforehand on my two datasets, and after running the integration command I ended up with the output below.
In addition, I got a warning that I have duplicate cell names in my data. How can I check that in Seurat or in my Cell Ranger version 3.0.0 files (barcodes.tsv)?

Computing 2000 integration features
Scaling features for provided objects
|++++++++++++++++++++++++++++++++++++++++++++++++++| 100% elapsed = 03s
Finding all pairwise anchors
| | 0 % ~calculating Running CCA
Merging objects
Finding neighborhoods
Finding mutual nearest neighborhoods
Found 555 anchors
Filtering Anchors
Error in nn2(data = cn.data2[nn.cells2, ], query = cn.data1[nn.cells1, :
Cannot find more nearest neighbours than there are points
In addition: Warning message:
In CheckDuplicateCellNames(object.list = object.list) :
Some cell names are duplicated across objects provided. Renaming to enforce unique cell names.

I would be very glad for any advice.

Cheers,
Michael

Here is my Sessioninfo:
R version 3.5.3 (2019-03-11)
Platform: x86_64-w64-mingw32/x64 (64-bit)
Running under: Windows >= 8 x64 (build 9200)

Matrix products: default

attached base packages:
[1] stats graphics grDevices utils datasets methods base

other attached packages:
[1] sctransform_0.0.0.900 cowplot_0.9.4 ggplot2_3.1.0 networkD3_0.4 tidyr_0.8.3
[6] beepr_1.3 usethis_1.4.0 devtools_2.0.1 Matrix_1.2-15 dplyr_0.8.0.1
[11] Seurat_3.0.0.9000

Sorry, I forgot to include my script:

# Integrating Sample 1 vs. 2 (Seurat V3)

# sample 1
Gr1 <- CreateSeuratObject(counts = Gr1.data, min.cells = 3, min.features = 200, project = "Gr1")
Gr2 <- CreateSeuratObject(counts = Gr2.data, min.cells = 3, min.features = 200, project = "GR2")

# store mitochondrial percentage in object meta data
Gr1 <- PercentageFeatureSet(object = Gr1, pattern = "^mt-", col.name = "percent.mt")
Gr2 <- PercentageFeatureSet(object = Gr2, pattern = "^mt-", col.name = "percent.mt")

# run sctransform
Gr1 <- SCTransform(object = Gr1, vars.to.regress = "percent.mt", verbose = TRUE)
Gr2 <- SCTransform(object = Gr2, vars.to.regress = "percent.mt", verbose = TRUE)

# Perform Integration
immune.anchors <- FindIntegrationAnchors(object.list = list(Gr1, Gr2), assay = c("SCT","SCT"), dims = 1:20, verbose = TRUE)

Hi,

I managed to get rid of the first error by

Gr1 <- RenameCells(Gr1, add.cell.id = "Gr1", for.merge = FALSE)
Gr2 <- RenameCells(Gr2, add.cell.id = "Gr2", for.merge = FALSE)

but after running SCTransform and then the integration:
immune.anchors <- FindIntegrationAnchors(object.list = list(Gr1, Gr2), assay = c("SCT","SCT"), dims = 1:20, verbose = TRUE)
I still get this error:
Computing 2000 integration features
Scaling features for provided objects
|++++++++++++++++++++++++++++++++++++++++++++++++++| 100% elapsed = 03s
Finding all pairwise anchors
| | 0 % ~calculating Running CCA
Merging objects
Finding neighborhoods
Finding mutual nearest neighborhoods
Found 555 anchors
Filtering Anchors
Error in nn2(data = cn.data2[nn.cells2, ], query = cn.data1[nn.cells1, : Cannot find more nearest neighbours than there are points

Does anyone have an idea?
I would be very grateful!
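
On the earlier question of how to check for duplicated cell names: the names are just character vectors (the column names of each object, or the lines of each barcodes.tsv), so base R set operations are enough. The sketch below uses made-up barcodes as stand-ins for colnames(Gr1) and colnames(Gr2). Overlap between samples is normal for 10x data, since every run draws its barcodes from the same whitelist, and prefixing each name with a sample ID removes it, which mirrors what RenameCells with add.cell.id does.

```r
# Made-up barcodes standing in for colnames(Gr1) and colnames(Gr2).
cells_gr1 <- c("AAACCTGAGAAACCAT-1", "AAACCTGAGAAACCGC-1", "AAACCTGCAGTCAGCC-1")
cells_gr2 <- c("AAACCTGAGAAACCAT-1", "AAACGGGTCTTGTACT-1")

# Names present in both samples.
shared <- intersect(cells_gr1, cells_gr2)
length(shared)

# TRUE if any name occurs more than once across the two samples.
has_dups <- anyDuplicated(c(cells_gr1, cells_gr2)) > 0

# Prefix each name with its sample ID, joined by an underscore.
cells_gr1_new <- paste("Gr1", cells_gr1, sep = "_")
cells_gr2_new <- paste("Gr2", cells_gr2, sep = "_")
still_dups <- anyDuplicated(c(cells_gr1_new, cells_gr2_new)) > 0
```

The warning itself is harmless, because the cells are renamed automatically when duplicates are found; renaming up front just makes the names predictable.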

@MichaelStber Hi, I guess you have fewer than 100 cells in one of your inputs, as the default k.weight parameter is 100, i.e. it considers the 100 closest neighboring cells during its analysis. Perhaps changing this parameter to a lower value will work.

Hi @FarzanT
Indeed, I unfortunately have only 96 cells in my second sample.
That's a good idea. Which parameter do you mean?

Is it k.anchor of the FindIntegrationAnchors command?

Jipiiiiee :)

Now it worked for me too!! I set k.filter = 95
Thank you @FarzanT for your help!!

Computing 2000 integration features
Scaling features for provided objects
|++++++++++++++++++++++++++++++++++++++++++++++++++| 100% elapsed = 03s
Finding all pairwise anchors
| | 0 % ~calculating Running CCA
Merging objects
Finding neighborhoods
Finding anchors
Found 478 anchors
Filtering anchors
Retained 476 anchors
Extracting within-dataset neighbors
|++++++++++++++++++++++++++++++++++++++++++++++++++| 100% elapsed = 03s
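
For anyone hitting the same nn2 error: in Seurat v3, k.filter is an argument of FindIntegrationAnchors (default 200), while k.weight (default 100) belongs to IntegrateData, so the error during "Filtering Anchors" points at k.filter. A neighbor search cannot return more neighbors than there are cells, so the value has to fit the smallest dataset. The sketch below shows one way to pick it; cap_k is a hypothetical helper, not a Seurat function, and staying one below the smallest cell count is a conservative assumption that reproduces the 95 used above for 96 cells.

```r
# Hypothetical helper (not part of Seurat): keep a neighbor count below the
# size of the smallest dataset, and never above the package default.
cap_k <- function(n_cells, default_k) {
  min(default_k, min(n_cells) - 1L)
}

# Cell counts as reported in this thread; with real objects these would be
# ncol(Gr1) and ncol(Gr2).
n_cells <- c(Gr1 = 629, Gr2 = 96)
k_filter <- cap_k(n_cells, default_k = 200)
k_filter

# The value is then passed on, for example:
# immune.anchors <- FindIntegrationAnchors(object.list = list(Gr1, Gr2),
#                                          dims = 1:20, k.filter = k_filter)
```

With fewer than 100 cells in one sample, k.weight may need the same treatment at the IntegrateData step.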
