Notes: Converting IPFS hash format to SHA-256

Created on 14 Oct 2017 · 4 Comments · Source: ipfs/notes

I was looking into IPFS's design and noticed that it doesn't use SHA-256 as the final hash, but rather "Qm"+stuff, which appears to be a base58-encoding of a merkle dag (with a SHA-256 hash in there somewhere).

Anyways, I was wondering if it is possible to only have knowledge of the IPFS hash and not the file and still obtain the SHA-256 hash via base58-decoding and then reversing whatever operations that IPFS does. I know this probably wouldn't work in the case that the IPFS hash is that of a directory, but maybe it would work for the IPFS hash of a specific file?

All 4 comments

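IPFS uses multihash to address data. The leading "Qm" is what the multihash prefix of a sha256 hash looks like once it is base58-encoded (the prefix bytes are 0x12 0x20: the sha2-256 code followed by the 32-byte digest length).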
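To make the multihash layout concrete, here is a minimal Python sketch that base58-decodes a "Qm" hash and pulls out the embedded sha256 digest. The helper names (`b58decode`, `b58encode`, `sha256_from_qm`) are made up for illustration and are not part of any IPFS library, and the block bytes are a stand-in rather than real IPFS output.

```python
import hashlib

# Bitcoin-style base58 alphabet (no 0, O, I, l), as used by base58btc.
B58_ALPHABET = "123456789ABCDEFGHJKLMNPQRSTUVWXYZabcdefghijkmnopqrstuvwxyz"


def b58decode(s: str) -> bytes:
    """Decode a base58btc string into bytes."""
    n = 0
    for ch in s:
        n = n * 58 + B58_ALPHABET.index(ch)
    body = n.to_bytes((n.bit_length() + 7) // 8, "big")
    pad = len(s) - len(s.lstrip("1"))  # each leading '1' is a zero byte
    return b"\x00" * pad + body


def b58encode(b: bytes) -> str:
    """Encode bytes as a base58btc string."""
    n = int.from_bytes(b, "big")
    out = ""
    while n:
        n, r = divmod(n, 58)
        out = B58_ALPHABET[r] + out
    pad = len(b) - len(b.lstrip(b"\x00"))
    return "1" * pad + out


def sha256_from_qm(qm: str) -> bytes:
    """Return the 32-byte sha256 digest stored inside a 'Qm...' multihash."""
    raw = b58decode(qm)
    code, length, digest = raw[0], raw[1], raw[2:]
    # 0x12 = sha2-256, 0x20 = 32-byte digest length
    if code != 0x12 or length != 0x20 or len(digest) != 32:
        raise ValueError("not a sha2-256 multihash")
    return digest


# Stand-in for the bytes of a block as IPFS would store it (an assumption).
block = b"bytes of a stored block"
qm = b58encode(b"\x12\x20" + hashlib.sha256(block).digest())
print(qm, sha256_from_qm(qm).hex())
```

The digest recovered this way is the sha256 of the stored block, which is not necessarily the sha256 of the file that was added.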

When you add a file to IPFS, it gets chunked and wrapped into unixfs protobufs - https://github.com/ipfs/go-ipfs/blob/master/unixfs/pb/unixfs.pb.go, so the hash won't be the direct sha256 of the file you added.

There is a --raw-leaves option for ipfs add. Adding a file smaller than 256k with it yields a multibase-base58 CIDv1 that contains the raw sha256 of the file.

So no, by default it's not possible to derive the file's sha256 from the hashes returned by add. --raw-leaves makes that possible for smaller files.
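The --raw-leaves case can be sketched in Python as follows. This assumes the file fits in a single raw block and that the CID is a base58btc-encoded CIDv1 (multibase prefix "z", version 1, raw codec 0x55, then a sha2-256 multihash). It does not call ipfs; `raw_leaf_cid` builds the CID by hand to mimic that layout, and all function names are hypothetical.

```python
import hashlib

B58_ALPHABET = "123456789ABCDEFGHJKLMNPQRSTUVWXYZabcdefghijkmnopqrstuvwxyz"


def b58encode(b: bytes) -> str:
    """Encode bytes as a base58btc string."""
    n = int.from_bytes(b, "big")
    out = ""
    while n:
        n, r = divmod(n, 58)
        out = B58_ALPHABET[r] + out
    pad = len(b) - len(b.lstrip(b"\x00"))
    return "1" * pad + out


def b58decode(s: str) -> bytes:
    """Decode a base58btc string into bytes."""
    n = 0
    for ch in s:
        n = n * 58 + B58_ALPHABET.index(ch)
    body = n.to_bytes((n.bit_length() + 7) // 8, "big")
    pad = len(s) - len(s.lstrip("1"))
    return b"\x00" * pad + body


def raw_leaf_cid(data: bytes) -> str:
    """Build a base58btc CIDv1 for a single raw block (assumed layout)."""
    multihash = b"\x12\x20" + hashlib.sha256(data).digest()
    # 0x01 = CID version 1, 0x55 = raw codec
    return "z" + b58encode(b"\x01\x55" + multihash)


def sha256_from_raw_cid(cid: str) -> bytes:
    """Recover the sha256 digest from a base58btc CIDv1 of a raw block."""
    if not cid.startswith("z"):
        raise ValueError("expected multibase base58btc prefix 'z'")
    raw = b58decode(cid[1:])
    if raw[:4] != b"\x01\x55\x12\x20" or len(raw) != 36:
        raise ValueError("not a CIDv1 raw block with a sha2-256 multihash")
    return raw[4:]


data = b"contents of a small file"
cid = raw_leaf_cid(data)
# For a single raw block the embedded digest is the sha256 of the file itself.
print(cid, sha256_from_raw_cid(cid) == hashlib.sha256(data).digest())
```

For files larger than one chunk the root is a node linking to several leaves, so this shortcut no longer recovers the sha256 of the whole file.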

Thanks for the quick reply. Answered my question so I'll close this issue!

One thing we've been hoping for is that the IPLD produced by files.add would start including as a field the multihash of the content so we could check that what we got back is what was intended.

Apart from adding a confirmatory check that we got back the expected exact file, it would also allow for better tracking of IPFS bugs such as in https://github.com/ipfs/js-ipfs/issues/1049.

Unfortunately, that would double the amount of hashing we'd have to do and hashing is already one of our more expensive operations. At the end of the day, it's IPFS's job to verify the file's hash and give you the correct file.
