In my environment, the md5 value of meta.bin used in the ImageNet dataset is different from the value defined in imagenet.py.
meta.bin is generated by torch.save in the code. I found that Python 2 and Python 3 generate different files.
The md5sum hashes are as follows:
7e0d3cf156177e4fc47011cdd30ce706
a36fd93cf3900286d99e24ad0a73ce04
ca981e8aac175178e80e7949d90ee85c
cc @pmeier
This might be due to the fact that the information in meta is a Python dict, which doesn't guarantee any order of how the keys are stored.
So I think we should not just have different hashes per Python version, because it could also be system-dependent.
Maybe the simplest fix is to remove the md5 check for meta.bin?
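To illustrate why key order would matter for the hash, here is a minimal sketch. It assumes torch.save goes through pickle (its default pickle module) and uses plain pickle plus hashlib in its place. The dict contents and the helper name md5_of_pickle are placeholders, not the real contents of meta.bin. On Python 3.7+, where dicts keep insertion order, two equal dicts built in a different order serialize to different bytes:

```python
import hashlib
import pickle


def md5_of_pickle(obj, protocol=2):
    """Return the MD5 hex digest of the pickled bytes of obj (hypothetical helper)."""
    return hashlib.md5(pickle.dumps(obj, protocol=protocol)).hexdigest()


# Two dicts that compare equal but were built in a different key order.
# The keys and values are placeholders for illustration only.
meta_a = {"first": 1, "second": 2}
meta_b = {"second": 2, "first": 1}

same_content = meta_a == meta_b
same_hash = md5_of_pickle(meta_a) == md5_of_pickle(meta_b)

print("equal content:", same_content)
print("equal md5:", same_hash)
```

The point is only that the md5 is computed over the serialized bytes, not over the logical content, so anything that changes the byte stream changes the hash.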
> So I think we should not just have different hashes per Python version, because it could also be system-dependent.
Agreed.
> This might be due to the fact that the information in meta is a Python dict, which doesn't guarantee any order of how the keys are stored.
I will check whether we still get different hashes if we use an OrderedDict. If that is still the case:
> [...] remove the md5 checking from meta.bin
Using OrderedDicts does not fix this. I'll send a fix soon.
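The thread does not say why the bytes still differ with a fixed key order. One possibility, offered here as an assumption rather than a confirmed diagnosis, is that the pickle stream itself depends on the interpreter and its settings. The sketch below uses plain pickle on Python 3 with placeholder data to show two such effects: the protocol version is written into the stream, and text and byte strings are encoded differently even under the same protocol.

```python
import hashlib
import pickle


def md5_hex(data):
    """MD5 hex digest of a bytes object."""
    return hashlib.md5(data).hexdigest()


meta = {"key": "value"}  # placeholder content, not the real meta.bin data

# Same object, different pickle protocol: the byte stream differs.
digest_by_protocol = {
    protocol: md5_hex(pickle.dumps(meta, protocol=protocol))
    for protocol in (2, 3)
}

# Same protocol, but a text string versus a byte string with the same characters.
text_digest = md5_hex(pickle.dumps("value", protocol=2))
bytes_digest = md5_hex(pickle.dumps(b"value", protocol=2))

print(digest_by_protocol[2] == digest_by_protocol[3])
print(text_digest == bytes_digest)
```

If something like this is the cause, a single hard-coded md5 for a locally generated file cannot match on every setup, which is consistent with the suggestion to drop the check for meta.bin.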
@fmassa Is the logic in check_integrity correct?
Shouldn't we check whether fpath exists regardless of the md5 check? Otherwise this function does no checking at all if md5=None.
@pmeier yes, it seems that this logic is flawed in check_integrity. Can you send a PR fixing the order of the checks?
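The order of checks being discussed could look roughly like the sketch below. This is not the torchvision implementation: check_integrity_sketch and file_md5 are hypothetical names, and only the ordering (existence first, md5 second) is taken from the thread.

```python
import hashlib
import os


def file_md5(fpath, chunk_size=1024 * 1024):
    """Compute the MD5 hex digest of a file, reading it in chunks."""
    md5 = hashlib.md5()
    with open(fpath, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            md5.update(chunk)
    return md5.hexdigest()


def check_integrity_sketch(fpath, md5=None):
    """Hypothetical reordering of the checks.

    The file must exist in every case. The md5 comparison is only
    performed when an expected hash is given.
    """
    if not os.path.isfile(fpath):
        return False
    if md5 is None:
        return True
    return file_md5(fpath) == md5
```

With this ordering, passing md5=None still verifies that the file is present, so dropping the hash for meta.bin does not turn the function into a no-op.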