Go-ipfs: Forbid non-cryptographic hashes within /ipfs namespace

Created on 5 Nov 2017  路  23Comments  路  Source: ipfs/go-ipfs

It just occured to me that by supporting non-cryptographic hashes (such as murmur3) in multihash, these are also available for CIDs, and can thus be used within the /ipfs namespace (example: ipfs/ipfs-companion#305).

This is not good! /ipfs paths MUST be cryptographically verifiable, it's their primary property. A lack of cryptographic verifiability seriously breaks so many assumptions on so many levels.

This would probably be appropriate to enforce within the respective IPLD modules.

P1 kinbug neecommunity-input topisecurity

Most helpful comment

Here is my action plan to switching to Strict CIDs. At the same time we will add support for recognizing identity hashes and being able to use them without storing them in the datastore (see https://github.com/ipfs/go-ipfs/issues/4766#issuecomment-374726490)

Note I used gx-go link to test how badly https://github.com/ipfs/go-cid/pull/40 breaks things by returning errors in the NewCid functions. It turns out not too much. There where one or two fixes I had to make outside of this repo. The main pain will be bubbling up a new version of go-cid.

@whyrusleeping @Kubuxu @Stebalien (and anyone else) sound good?

All 23 comments

cc @Stebalien @Kubuxu @whyrusleeping

Giving this a bug label. ipfs.io frontpage says:

Here's how IPFS works

Each file and all of the blocks within it are given a unique fingerprint called a cryptographic hash.

[...]

Yeah... we also have to check:

  1. Hash lengths.
  2. For sha1, we'd want to verify that the hash isn't in the set of class of known vulnerable hashes. Note: we can't do this by just looking at the hash; we need the data as well.

We may want to introduce a new type/constructor to enforce this. Personally, I'd rename the current Multihash type to InsecureMultihash or something scary like that and introduce an InsecureSum method etc.

My idea of solving this issue, as well as issue of short hashes is:

For each hash function define its # of security bits ratio. As an example, SHA1 has length of 160 bits, if it was perfect hash function it would have 80 bits of security for collision but SHA1 is partially broken, it currently is deemed to have 61 bits of security so its security bits ratio is 61/80 = 0.76 (which could be further reduced because of SHAttered).

Then every time we see hash for content addressing we evaluate how many bits of security it really have (formula would be mhLenInBits / 2 * secBitRatio) and compared with required number of security bits. This value would by default set to allow SHA1 and could be increased in config.

Full round variant of SHA256 is yet to be successfully cryptoanalysed so its security bit factor would be 1 same as many other cryptographically sound hash functions.

Murmur would be cut off for two reasons, low bit count and low security factor (0 ??), same with other hash functions that are currently obsolete.

  • I like the ideas for calculating the bit security, specially because can easily calculate what length of a hash would still be secure.
  • But the problem is that relies on having up-to-date knowledge on all attacks and increases surface for run-time attacks (trick the program into calculating incorrectly).
  • It would be best if we just had a whitelist of the hashes we think are secure, and minimum lengths. We can keep rolling that up and decommissioning hashes. This could also be altered by people recompiling go-ipfs for their own use cases, but our standard distribution should be fixed (and not allow an editable text config to compromise security)

This is a bit urgent, so ideally a partial solution (like denying all hashes not in a short whitelist or with lengths < 256 bits) could be deployed in the next release.a

(and not allow an editable text config to compromise security)

That is why in my writeup I only mentioned increasing number of security bits. Decreasing bellow some set minimum (or default) would not be possible.

Team, we need this. To make it simpler for now, go with:

  • Have a _compiled_ whitelist of the hashes we think are secure, w/ minimum lengths. We can keep rolling that up and decommissioning hashes. We can always expand the list should we think t ok. We can always change to a smarter function in the future.

Allowed hashes from https://github.com/multiformats/multihash/blob/master/hashtable.csv:

identity (once #4697 lands)
sha2-*
dbl-sha2-256  (lol, i'm sot surprised)
sha3-*
shake-*   ? (confirm someone)
keccak-*
blake2*
  • As said above, this could also be altered by people recompiling go-ipfs for their own use cases, but our standard distribution should be fixed (and not allow an editable text config to compromise security)
  • Let's leave the "smart hash picker" @Kubuxu described for the future.
  • For old content, we could have something like /ipfs-insecure/<hash> that allows insecure hashes. This will make sure old content is still retrievable and viewable locally, especially during flag days. let's leave this (or an alternate, better solution) for the future

For old content, we could have something like...

Do note that there already is such "old content" from a test I did back in 2016 ( 17 cases, specifically everything in the rows v1pb-sha1-20 and v1raw-sha1-20 and the column v1pb-sha1-20 ).

You can use this matrix both for validating "yep, we no longer resolve this link" and for a possible future "yes, we still can open it under /ipfs-insecure/..."

Started a solution here: https://github.com/ipfs/go-cid/pull/40

My approach will be to restrict the creation of invalid cids. This is a more surefire way restricting content than trying to police all the different places data can get into the system.

cc @Stebalien @magik6k @Kubuxu @kevina

I am not sure whether this has been fully fixed or not, but there is improvement: I now get a 504 after a massive timeout ( 10 minutes or something, should likely be lowered ) if I try to access <redacted>

Kudos to the team!

What is under this hash?

It was solved by: https://github.com/ipfs/go-ipfs/pull/4751
but we want to move to a permanent solution of ipfs/go-cid#40

What is under this hash?

On an unpatched go-ipfs:

~$ ipfs cat QmVpsktzNeJdfWEpyeix93QJdQaBSgRNxebSbYSo9SQPGx
This is some text self-referentially addressable via https://ipfs.io/ipfs/QmVpsktzNeJdfWEpyeix93QJdQaBSgRNxebSbYSo9SQPGx
I can of course just as easily replace it with anything else at any point in the future :)

From my email to security@ back in November:

This works because QmVpsktzNeJdfWEpyeix93QJdQaBSgRNxebSbYSo9SQPGx is a linknode with a sole pointer to https://ipfs.io/ipfs/z9iepse, which is the CID:

01 - version
55 - type raw
12 - sha256
01 - hash subset length
6d - first byte of hash

The attack would be mounted by providing some legitimate content with the intent to replace benign short-addressed nodes within the DAG with malicious ones later.

Looking over #4751 it seems there is no work related to the DHT. I still strongly recommend the following:

I recommend limiting the length attribute of CidV1 to at least 160 bits, and adding code ensuring the DHT will not relay anything shorter ( to protect clients which are unable/unwilling to upgrade ).

@mib-kd743naq my plan for ipfs/go-cid#40 is to forbid any creation of CID with the number of security bits lower than SHA1.


I wonder why it times out.

my plan ... is to forbid any creation of CID

You can not really do this in an open p2p system: you can not control what shows up on the network. Case in point: I assembled all of the above without any of the go-ipfs libraries.

All you can realistically do is harden various readers/repeaters.

Yes, that what I meant. If we are not able to create an insecure CID on our node then we won't be able to propagate info about it.

Here is my action plan to switching to Strict CIDs. At the same time we will add support for recognizing identity hashes and being able to use them without storing them in the datastore (see https://github.com/ipfs/go-ipfs/issues/4766#issuecomment-374726490)

Note I used gx-go link to test how badly https://github.com/ipfs/go-cid/pull/40 breaks things by returning errors in the NewCid functions. It turns out not too much. There where one or two fixes I had to make outside of this repo. The main pain will be bubbling up a new version of go-cid.

@whyrusleeping @Kubuxu @Stebalien (and anyone else) sound good?

@kevina sounds reasonable to me.

Btw, for any readers of this issue, The discussion now is about a 'more correct' fix to the problem. Currently, we just prevent the use of these hashes via the blockstore in ipfs. This effectively solves the problem, but its not a clean solution.

Was this page helpful?
0 / 5 - 0 ratings

Related issues

JesseWeinstein picture JesseWeinstein  路  4Comments

daviddias picture daviddias  路  3Comments

lidel picture lidel  路  3Comments

jonchoi picture jonchoi  路  3Comments

funkyfuture picture funkyfuture  路  3Comments