Seurat: Read10X failing

Created on 18 Mar 2020  ·  14 Comments  ·  Source: satijalab/seurat

Hello,

I am using Read10X to read from a directory with the following 3 files as needed: barcodes.tsv, genes.tsv, matrix.mtx

When I use Read10X I get the following error -

Error in dimnamesGets(x, value) :
invalid dimnames given for “dgTMatrix” object

The matrix.mtx file looks like this -

%%MatrixMarket matrix coordinate integer general
54574 66187 78830323
76 1 1
111 1 1
194 1 1
333 1 1
358 1 1
401 1 1

genes.tsv looks like this -

ENSEMBL SYMBOL CHR START END STRAND BIOTYPE
ENSG00000223972 DDX11L1 1 11869 14409 + transcribed_unprocessed_pseudogene
ENSG00000278267 MIR6859-1 1 17369 17436 - miRNA
ENSG00000284332 MIR1302-2 1 30366 30503 + miRNA
ENSG00000237613 FAM138A 1 34554 36081 - lincRNA
ENSG00000268020 OR4G4P 1 52473 53312 + unprocessed_pseudogene
ENSG00000240361 OR4G11P 1 62948 63887 + unprocessed_pseudogene
ENSG00000186092 OR4F5 1 69091 70008 + protein_coding

barcodes.tsv looks like this -

AAACCTGAGAGCCCAA-1-D20171109_A
AAACCTGGTGCTAGCC-1-D20171109_A
AAACGGGAGAGACTAT-1-D20171109_A
AAACGGGAGCGTAATA-1-D20171109_A
AAACGGGAGCTGAAAT-1-D20171109_A
AAACGGGAGGCCATAG-1-D20171109_A
AAACGGGAGGGAACGG-1-D20171109_A

sessionInfo() returns:

R version 3.6.1 (2019-07-05)
Platform: x86_64-w64-mingw32/x64 (64-bit)
Running under: Windows 10 x64 (build 17763)

Matrix products: default

locale:
[1] LC_COLLATE=Chinese (Simplified)_China.936 LC_CTYPE=Chinese (Simplified)_China.936
[3] LC_MONETARY=Chinese (Simplified)_China.936 LC_NUMERIC=C
[5] LC_TIME=Chinese (Simplified)_China.936

attached base packages:
[1] stats graphics grDevices utils datasets methods base

other attached packages:
[1] Seurat_3.1.0 dplyr_0.8.3

loaded via a namespace (and not attached):
[1] nlme_3.1-140 tsne_0.1-3 bitops_1.0-6 RcppAnnoy_0.0.12
[5] RColorBrewer_1.1-2 httr_1.4.0 sctransform_0.2.0 tools_3.6.1
[9] backports_1.1.5 R6_2.4.0 irlba_2.3.3 KernSmooth_2.23-15
[13] uwot_0.1.3 lazyeval_0.2.2 colorspace_1.4-1 npsurv_0.4-0
[17] tidyselect_0.2.5 gridExtra_2.3 compiler_3.6.1 plotly_4.9.0
[21] caTools_1.17.1.2 scales_1.0.0 lmtest_0.9-37 ggridges_0.5.1
[25] pbapply_1.4-2 stringr_1.4.0 digest_0.6.20 R.utils_2.9.0
[29] pkgconfig_2.0.2 htmltools_0.3.6 bibtex_0.4.2 htmlwidgets_1.3
[33] rlang_0.4.0 rstudioapi_0.10 zoo_1.8-6 jsonlite_1.6
[37] ica_1.0-2 gtools_3.8.1 R.oo_1.22.0 magrittr_1.5
[41] Matrix_1.2-17 Rcpp_1.0.2 munsell_0.5.0 ape_5.3
[45] reticulate_1.13 lifecycle_0.1.0 R.methodsS3_1.7.1 stringi_1.4.3
[49] gbRd_0.4-11 MASS_7.3-51.4 gplots_3.0.1.1 Rtsne_0.15
[53] plyr_1.8.4 grid_3.6.1 parallel_3.6.1 gdata_2.18.0
[57] listenv_0.7.0 ggrepel_0.8.1 crayon_1.3.4 lattice_0.20-38
[61] cowplot_1.0.0 splines_3.6.1 SDMTools_1.1-221.1 zeallot_0.1.0
[65] pillar_1.4.2 igraph_1.2.4.1 future.apply_1.3.0 reshape2_1.4.3
[69] codetools_0.2-16 leiden_0.3.1 glue_1.3.1 packrat_0.5.0
[73] lsei_1.2-0 metap_1.1 RcppParallel_4.4.3 data.table_1.12.2
[77] vctrs_0.2.0 png_0.1-7 Rdpack_0.11-0 gtable_0.3.0
[81] RANN_2.6.1 purrr_0.3.2 tidyr_1.0.0 future_1.14.0
[85] assertthat_0.2.1 ggplot2_3.2.1 rsvd_1.0.2 survival_3.1-6
[89] viridisLite_0.3.0 tibble_2.1.3 cluster_2.1.0 globals_0.12.4
[93] fitdistrplus_1.0-14 ROCR_1.0-7

I would greatly appreciate it if someone could help me pinpoint what I missed.

Thank you,
Forest Lee

All 14 comments

The code and file structures seem to be correct; there is likely an issue with one of your input files that we can't see here. Can you e-mail all 3 files to [email protected]? We'll take a look. @k3yavi

OK, I have sent the email. Thank you.

Hi @restore1997 ,

Thanks for reaching out. A couple of things are inconsistent with the expected file format.
1.) genes.tsv is not expected to contain a header line; the file parser ends up reading one extra line, which produces the error you observe above.
2.) The expected 10x format for the genes.tsv file has only 3 columns: gene ID, gene name, and feature type (the last is optional and was introduced in V3).

I'd suggest running the following script on your genes.tsv file to resolve the issue.

tail -n +2 genes.tsv | cut -f1,2 > tmpfile ; mv tmpfile genes.tsv

Feel free to reopen or ask again if the issue isn't resolved.
Hope it helps.
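
Before rewriting genes.tsv it can help to look at what the file actually contains. The following is a minimal shell sketch, not part of Seurat or Cell Ranger: the function name `genes_tsv_summary` is made up for illustration, and it assumes an uncompressed, tab-separated file. It prints the first line (so a header is easy to spot) and how many lines have each column count.

```shell
# genes_tsv_summary FILE   (hypothetical helper, for illustration only)
# Prints the first line of FILE, then one row per distinct number of
# tab-separated columns together with how many lines have that many columns.
genes_tsv_summary() {
    printf 'first line: %s\n' "$(head -n 1 "$1")"
    awk -F '\t' '{ n[NF]++ }
         END { for (k in n) printf "%d column(s): %d line(s)\n", k, n[k] }' "$1" | sort -n
}
```

Usage: `genes_tsv_summary genes.tsv`. A first line that reads like column titles, or a column count other than 2 or 3, points to the two problems described above.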

Thank you very much!

I am getting the same error when I use Read10X:

Error in dimnamesGets(x, value) :
invalid dimnames given for “dgTMatrix” object

The matrix.mtx file:
%%MatrixMarket matrix coordinate integer general
%
31053 10070 15464435
7 1 1
10 1 2
16 1 3
24 1 4
33 1 1
37 1 2
39 1 1
52 1 1
54 1 1

Barcodes file:
AAACCCAAGGATTTGA-1
AAACCCAAGGGAGATA-1
AAACCCAAGTGGCAGT-1
AAACCCACACACTTAG-1
AAACCCACATCCGTTC-1
AAACCCAGTACTGACT-1
AAACCCAGTGGCAACA-1
AAACCCAGTGTCACAT-1
AAACCCAGTTGCTGAT-1

Genes file:
ENSMUSG00000051951 Xkr4
ENSMUSG00000089699 Gm1992
ENSMUSG00000102343 Gm37381
ENSMUSG00000025900 Rp1
ENSMUSG00000025902 Sox17

sessionInfo()
R version 3.6.2 (2019-12-12)
Platform: x86_64-w64-mingw32/x64 (64-bit)
Running under: Windows 10 x64 (build 18362)

Matrix products: default

locale:
[1] LC_COLLATE=English_United States.1252 LC_CTYPE=English_United States.1252
[3] LC_MONETARY=English_United States.1252 LC_NUMERIC=C
[5] LC_TIME=English_United States.1252

attached base packages:
[1] parallel stats4 stats graphics grDevices utils datasets methods base

other attached packages:
[1] dplyr_0.8.4 HiTC_1.30.0 rtracklayer_1.46.0 GenomicRanges_1.38.0
[5] GenomeInfoDb_1.22.0 IRanges_2.20.2 S4Vectors_0.24.3 BiocGenerics_0.32.0
[9] Seurat_3.1.4

loaded via a namespace (and not attached):
[1] TH.data_1.0-10 Rtsne_0.15 colorspace_1.4-1
[4] ggridges_0.5.2 XVector_0.26.0 leiden_0.3.3
[7] listenv_0.8.0 npsurv_0.4-0 ggrepel_0.8.1
[10] mvtnorm_1.1-0 codetools_0.2-16 splines_3.6.2
[13] mnormt_1.5-6 lsei_1.2-0 TFisher_0.2.0
[16] jsonlite_1.6.1 Rsamtools_2.2.3 ica_1.0-2
[19] cluster_2.1.0 png_0.1-7 uwot_0.1.5
[22] sctransform_0.2.1 compiler_3.6.2 httr_1.4.1
[25] assertthat_0.2.1 Matrix_1.2-18 lazyeval_0.2.2
[28] htmltools_0.4.0 tools_3.6.2 rsvd_1.0.3
[31] igraph_1.2.4.2 gtable_0.3.0 glue_1.3.1
[34] GenomeInfoDbData_1.2.2 RANN_2.6.1 reshape2_1.4.3
[37] rappdirs_0.3.1 Rcpp_1.0.3 Biobase_2.46.0
[40] Biostrings_2.54.0 vctrs_0.2.3 multtest_2.42.0
[43] gdata_2.18.0 ape_5.3 nlme_3.1-144
[46] gbRd_0.4-11 lmtest_0.9-37 stringr_1.4.0
[49] globals_0.12.5 lifecycle_0.1.0 irlba_2.3.3
[52] gtools_3.8.1 XML_3.99-0.3 future_1.16.0
[55] MASS_7.3-51.5 zlibbioc_1.32.0 zoo_1.8-7
[58] scales_1.1.0 SummarizedExperiment_1.16.1 sandwich_2.5-1
[61] RColorBrewer_1.1-2 reticulate_1.14 pbapply_1.4-2
[64] gridExtra_2.3 ggplot2_3.2.1 stringi_1.4.6
[67] mutoss_0.1-12 plotrix_3.7-7 caTools_1.18.0
[70] BiocParallel_1.20.1 bibtex_0.4.2.2 matrixStats_0.55.0
[73] Rdpack_0.11-1 rlang_0.4.4 pkgconfig_2.0.3
[76] bitops_1.0-6 lattice_0.20-40 ROCR_1.0-7
[79] purrr_0.3.3 GenomicAlignments_1.22.1 patchwork_1.0.0
[82] htmlwidgets_1.5.1 cowplot_1.0.0 tidyselect_1.0.0
[85] RcppAnnoy_0.0.15 plyr_1.8.5 magrittr_1.5
[88] R6_2.4.1 gplots_3.0.3 multcomp_1.4-12
[91] DelayedArray_0.12.2 pillar_1.4.3 sn_1.5-5
[94] fitdistrplus_1.0-14 survival_3.1-8 RCurl_1.98-1.1
[97] tibble_2.1.3 future.apply_1.4.0 tsne_0.1-3
[100] crayon_1.3.4 KernSmooth_2.23-16 plotly_4.9.2
[103] grid_3.6.2 data.table_1.12.8 metap_1.3
[106] digest_0.6.25 tidyr_1.0.2 numDeriv_2016.8-1.1
[109] RcppParallel_4.4.4 munsell_0.5.0 viridisLite_0.3.0

Genes file (continued):
ENSMUSG00000104328 Gm37323
ENSMUSG00000033845 Mrpl15
ENSMUSG00000025903 Lypla1
ENSMUSG00000104217 Gm37988

Can you remove the second line of the mtx file, the one that contains only %?

Removed. Now, I am getting

Error: Error in FUN(X[[i]], ...) : subscript out of bounds

Can you double-check whether you mistakenly flipped rows and columns (i.e., transposed the matrix)?
That's how I ran into this bug.
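
A transposed matrix can be spotted without loading anything into R by comparing the size line of the .mtx file with the line counts of the other two files. Below is a minimal shell sketch under stated assumptions: the files are uncompressed, and the function name `mtx_orientation` is hypothetical (it is not part of Seurat).

```shell
# mtx_orientation MTX FEATURES BARCODES   (hypothetical helper)
# Prints "ok" if the matrix is features x barcodes, "transposed" if it is
# barcodes x features, and "mismatch" if neither orientation fits.
mtx_orientation() {
    # The first non-comment, non-empty line holds: rows cols nonzero_entries
    dims=$(awk '!/^%/ && NF { print $1, $2; exit }' "$1")
    rows=${dims% *}
    cols=${dims#* }
    n_feat=$(awk 'END { print NR }' "$2")
    n_bc=$(awk 'END { print NR }' "$3")
    if [ "$rows" -eq "$n_feat" ] && [ "$cols" -eq "$n_bc" ]; then
        echo ok
    elif [ "$rows" -eq "$n_bc" ] && [ "$cols" -eq "$n_feat" ]; then
        echo transposed
    else
        echo mismatch
    fi
}
```

Usage: `mtx_orientation matrix.mtx genes.tsv barcodes.tsv`. For the matrix posted above, whose size line is `31053 10070 15464435`, "ok" would require 31053 lines in the genes file and 10070 lines in the barcodes file. If both files happen to have the same number of lines, this check cannot tell the two orientations apart.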

Hi, I'm also getting the Read10X error:
Error in dimnamesGets(x, value) :
invalid dimnames given for “dgTMatrix” object

These are the files I'm trying to load:

matrix.mtx.gz file:
%%MatrixMarket matrix coordinate integer general
33694 138672 221020078
33 1 2
84 1 1
102 1 1
154 1 2
164 1 1
198 1 4

features.tsv.gz file:
ENSG00000243485 RP11-34P13.3
ENSG00000237613 FAM138A
ENSG00000186092 OR4F5
ENSG00000238009 RP11-34P13.7
ENSG00000239945 RP11-34P13.8
ENSG00000239906 RP11-34P13.14
ENSG00000241599 RP11-34P13.9
ENSG00000279928 FO538757.3
ENSG00000279457 FO538757.2

barcodes.tsv.gz file:
cell_ids
24_Day.AAACCTGAGACAATAC-1
24_Day.AAACCTGAGACTAGGC-1
24_Day.AAACCTGAGCACGCCT-1
24_Day.AAACCTGAGCCCGAAA-1
24_Day.AAACCTGAGGCAATTA-1
24_Day.AAACCTGAGGTACTCT-1

If you have any advice it would be greatly appreciated, thank you!
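
One thing visible in the listing above is that barcodes.tsv.gz begins with a `cell_ids` line. If that line is a header rather than a barcode (an assumption; nobody in the thread confirms it), the barcodes file would hold one more line than the matrix has columns, much like the genes.tsv header in the original post. A minimal shell sketch for dropping the first line of a gzipped file follows; the function name `strip_first_line_gz` is hypothetical.

```shell
# strip_first_line_gz IN.gz OUT.gz   (hypothetical helper)
# Writes a gzipped copy of IN.gz without its first line.
# IN.gz and OUT.gz must be different paths.
strip_first_line_gz() {
    gzip -dc "$1" | tail -n +2 | gzip > "$2"
}
```

Usage: `strip_first_line_gz barcodes.tsv.gz barcodes.fixed.tsv.gz`, then check the result before replacing the original file.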

Yes, transposing fixed it for me: it runs well after rewriting Read10X to call t(data) after readMM(data).
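
The comment above fixes the orientation inside R with t(). As an alternative that leaves Read10X untouched, a plain-text coordinate file can be transposed on disk by swapping the first two fields of every non-comment line. This is a different technique from the one the commenter used, shown only as a sketch: the function name `transpose_mtx` is hypothetical, and it assumes an uncompressed `coordinate ... general` file like the ones posted in this thread.

```shell
# transpose_mtx IN.mtx > OUT.mtx   (hypothetical helper)
# Comment lines (starting with %) pass through unchanged. On every other
# line the first two fields are swapped, which turns the size line
# "rows cols nnz" into "cols rows nnz" and each entry "i j value" into
# "j i value".
transpose_mtx() {
    awk '/^%/ { print; next }
         NF >= 2 { tmp = $1; $1 = $2; $2 = tmp }
         { print }' "$1"
}
```

Usage: `transpose_mtx matrix.mtx > matrix.transposed.mtx`. The entries are no longer sorted by column afterwards, which the coordinate format does not require.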

Hi @k3yavi, I'm getting the same error as above (invalid dimnames given for “dgTMatrix” object). I think my file formats are correct, but I guess I am missing something. I tried what you suggested above and it doesn't work for me. Can I send the files over? It would be very useful for me. Thank you in advance.

Sorry for the late reply @bmanzato. I saw you closed the issue https://github.com/satijalab/seurat/issues/3821 , so hopefully the issue was resolved? Let us know if you have any questions.

Hi,
thank you for the reply. I solved it using
https://mojaveazure.github.io/seurat-disk/articles/convert-anndata.html
Still, I'd like to know what is wrong with my files, since I think I'll need Read10X again in the future. Please find my files attached. Thank you.
read10x_data.zip
https://drive.google.com/file/d/1TVvqx9ViQnqnDM_3BePpRiLHLbn-BcJY/view?usp=drive_web

I've shared an item with you:

matrix.mtx
https://drive.google.com/file/d/1kBp1yzwZVsxgE1_YyQVQGImAxEEgOuaX/view?usp=sharing&invite=CJaD5NoP&ts=5fda1a0c

