Hi there,
I'm trying to read an h5 file from a published data set (available on GEO under accession GSM2561498) using the Read10X_h5 function, but I keep getting the following error.
> ley.ctrl.data <- Read10X_h5('GSM2561498.h5')
Error in x$exists(name) : HDF5-API Errors:
error #000: H5L.c in H5Lexists(): line 879: unable to get link info
class: HDF5
major: Symbol table
minor: Object not found
error #001: H5L.c in H5L__exists(): line 2962: path doesn't exist
class: HDF5
major: Symbol table
minor: Object already exists
error #002: H5Gtraverse.c in H5G_traverse(): line 867: internal path traversal failed
class: HDF5
major: Symbol table
minor: Object not found
error #003: H5Gtraverse.c in H5G_traverse_real(): line 594: can't look up component
class: HDF5
major: Symbol table
minor: Object not found
error #004: H5Gobj.c in H5G__obj_lookup(): line 1156: can't locate object
class: HDF5
major: Symbol table
minor: Object not found
error #005: H5Gstab.c in H5G__stab_lookup(): line 886: can't read message
class: HDF5
major: Symbol table
minor: Unrecognized message
Any insight into what might be going on would be hugely helpful!
Thanks so much,
Emily
Hi Emily,
It looks like that file isn't consistent with 10X's documentation on how the H5 output file should be structured and therefore the Read10X_h5 function isn't going to work here. However, you can still read in the file with
library(hdf5r)
infile <- H5File$new("GSM2561498.h5")
Alternatively, you could try cellrangerRkit from 10X as they recommend on that documentation page.
Hi Andrew,
Thank you so much for the reply. Clearly, I am not very familiar with the h5 format. I was able to read in the file with the command you suggested above, but it is unclear to me how to get from there to a Seurat object, or whether that is even possible. I would very much like to keep using Seurat, since I want to use RunMultiCCA as well as additional packages that require a Seurat object, but passing infile as the raw.data for CreateSeuratObject yields the following error:
> library(hdf5r)
> infile <- H5File$new("GSM2561498.h5")
> library(Seurat)
> ley.ctrl <- CreateSeuratObject(raw.data = infile, project = "Ley.Ctrl", min.cells = 3, min.genes = 200)
Error in object.raw.data > is.expr :
comparison (6) is possible only for atomic and list types
Thank you so much for your help!
Emily
Hi Emily,
You need to convert the data in the H5 file into a matrix before passing that to CreateSeuratObject.
You can read a little more about how to use hdf5 files in R here. For specific details on that particular dataset, I would recommend emailing the contact on the GEO page as that gets a bit beyond the scope of Seurat.
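As a rough sketch of that conversion step: the layout of GSM2561498.h5 isn't known from this thread, so the example below writes a small made-up dense matrix to a temporary H5 file under a hypothetical dataset name, "counts", then lists the file's contents and reads the dataset back as an ordinary R matrix. With a real file, the listing is what tells you where the counts actually live.

```r
library(hdf5r)

# Build a tiny stand-in file; "counts" is a hypothetical dataset name
path <- tempfile(fileext = ".h5")
toy_counts <- matrix(1:6, nrow = 3, ncol = 2)
out <- H5File$new(path, mode = "w")
out[["counts"]] <- toy_counts
out$close_all()

# Reopen read-only and look at what the file contains
infile <- H5File$new(path, mode = "r")
contents <- infile$ls(recursive = TRUE)
print(contents$name)

# Read the whole 2-D dataset into an in-memory matrix
counts <- infile[["counts"]][, ]
infile$close_all()

is.matrix(counts)
```

An in-memory matrix like counts (genes as rows, cells as columns, with row and column names set) is the kind of object CreateSeuratObject can take as raw.data; the H5File handle itself is not.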
Thanks so much for all your help, Andrew. I will follow up with the authors if I can't get it to work on my own.
Thanks again,
Emily
Hi, andrewwbutler.
I got an error when using Read10X_h5 to read the h5 file output by cellranger-3.0.0 count. The error said that data["matrix/gene_names"] does not exist. I found that the gene names in the Cell Ranger h5 file are stored in data["matrix/features/name"]. I didn't test data["matrix/genes"], but I don't think it will work either.
Below is the Cell Ranger h5 data structure. According to it, neither "genes" nor "gene_names" is contained in the Cell Ranger h5 file. Am I right?
(root)
└── matrix [HDF5 group]
    ├── barcodes
    ├── data
    ├── indices
    ├── indptr
    ├── shape
    └── features [HDF5 group]
        ├─ _all_tag_keys
        ├─ feature_type
        ├─ genome
        ├─ id
        ├─ name
        ├─ pattern [Feature Barcoding only]
        ├─ read [Feature Barcoding only]
        └─ sequence [Feature Barcoding only]
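One way to check which layout a given file uses is to look at its object paths before reading anything. The sketch below uses a hypothetical helper, find_gene_name_path, that works on a plain character vector of paths (with hdf5r, something like the name column of h5file$ls(recursive = TRUE) could supply it). It assumes the older layout keeps gene names under a genome-named group as "<genome>/gene_names"; "hg19" is only an example genome name.

```r
# Hypothetical helper: given the object paths found in a 10x-style H5 file,
# guess where the gene names are stored.
find_gene_name_path <- function(paths) {
  # Layout shown in the tree above (Cell Ranger 3.0.0 output)
  v3_path <- "matrix/features/name"
  if (v3_path %in% paths) {
    return(v3_path)
  }
  # Assumed older layout: "<genome>/gene_names"
  v2_hits <- grep("/gene_names$", paths, value = TRUE)
  if (length(v2_hits) > 0) {
    return(v2_hits[1])
  }
  # Neither layout recognised
  NA_character_
}

v3_paths <- c("matrix", "matrix/barcodes", "matrix/data",
              "matrix/features", "matrix/features/name")
v2_paths <- c("hg19", "hg19/barcodes", "hg19/gene_names", "hg19/genes")

find_gene_name_path(v3_paths)
find_gene_name_path(v2_paths)
find_gene_name_path(c("something", "something/else"))
```

Returning NA for an unrecognised layout makes it easy to stop early with a clear message instead of hitting an HDF5 "path doesn't exist" error later.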
Hi Emily, I got the same error when I tried to read molecule_info.h5 files instead of gene-barcode matrices. You can re-generate the gene-barcode matrices with the cellranger aggr command.
JB
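I just faced the same issue and came up with this solution. Maybe this will help somebody, even though the same function in Seurat v3 works fine for me.
# Open the Cell Ranger v3 matrix file read-only
h5_data <- hdf5r::H5File$new('filtered_feature_bc_matrix.h5', mode = 'r')
# Rebuild the sparse counts matrix from its compressed-column parts
feature_matrix <- Matrix::sparseMatrix(
  i = h5_data[['matrix/indices']][],  # zero-based row indices
  p = h5_data[['matrix/indptr']][],   # column pointers
  x = h5_data[['matrix/data']][],     # non-zero counts
  dimnames = list(
    h5_data[['matrix/features/name']][],  # rows: features (genes)
    h5_data[['matrix/barcodes']][]        # columns: cell barcodes
  ),
  dims = h5_data[['matrix/shape']][],
  index1 = FALSE  # row indices in the file are zero-based
)
# Release the file handle once the matrix is in memory
h5_data$close_all()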
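To make the roles of indices, indptr, and data in the call above more concrete, here is a small in-memory sketch that needs no H5 file. All values and names (counts, geneA, cell1, and so on) are made up for illustration; only the way the three vectors fit together mirrors the compressed-column layout.

```r
library(Matrix)

# Toy stand-ins for the H5 datasets (values are made up)
counts  <- c(5, 1, 2, 7)       # non-zero counts, stored column by column
indices <- c(0L, 2L, 1L, 2L)   # zero-based row index of each count
indptr  <- c(0L, 2L, 3L, 4L)   # offset where each column starts, plus the total
shape   <- c(3L, 3L)           # rows (genes) x columns (cells)

toy <- sparseMatrix(
  i = indices,
  p = indptr,
  x = counts,
  dims = shape,
  dimnames = list(
    c("geneA", "geneB", "geneC"),
    c("cell1", "cell2", "cell3")
  ),
  index1 = FALSE  # treat the row indices as zero-based
)

# Dense view, only sensible for a toy example this small
as.matrix(toy)
```

Here column 1 (cell1) owns the first two entries of counts, so geneA gets 5 and geneC gets 1; cell2 owns the third entry (geneB = 2) and cell3 the fourth (geneC = 7).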