go-ipfs version: 0.4.13-
Repo version: 6
System version: amd64/darwin
Golang version: go1.9.2
Bug (or just my misunderstanding)
So I don't think I've actually found a real collision or weakness (just seems so unlikely) - but here's the data anyway.
I have at least 1 other file that also does this (besides the {} base case), FYI
$ echo "{}" | shasum -a 256
ca3d163bab055381827226140568f3bef7eaac187cebd76878e0b63e9e442356
$ echo "{}" | ipfs dag put -f raw --input-enc raw --hash sha2-256
zb2rhkFjaEEfsGTTeHVAGXB3qq7RzLHJojqpucfYFzoL2gB9P
# for comparison
$ echo "{}" | ipfs block put
Qmbx76B2S21SWTRwBAktMwSeficnbKR6v8pbEb7DZkhzwf
$ node
> d = s => require("bs58").decode(s).toString('hex')
[Function: d]
> d("zb2rhkFjaEEfsGTTeHVAGXB3qq7RzLHJojqpucfYFzoL2gB9P")
'82c1fdecaf85bde607d66fee097164d691cb4a4e3618fe71857fc2f0a4eefec8e6a6f1c5'
block put: 1220ca3d163bab055381827226140568f3bef7eaac187cebd76878e0b63e9e442356
original: ca3d163bab055381827226140568f3bef7eaac187cebd76878e0b63e9e442356
dag put: 82c1fdecee918eeaf57f8d872bba2c96ba09040f1a2164bae9bd9bd01ed9b63e9e442356
Nothing lines up except for the last 6 bytes.
Needless to say I'm a bit weirded out by this.
So, actually, all three of those hashes are identical. The part that's tripping you up is the multibase prefix. When you wrote echo "{}" | ipfs block put, we spit out a v0 CID. v0 CIDs are just raw multihashes. However, when you wrote echo "{}" | ipfs dag put ..., we spit out a v1 CID (next-gen). These CIDs use something we call multibase. That is, we prefix them with a symbol indicating the base. For base58, that symbol is z. So, to properly decode it, you need to strip off the z before decoding. If you do that, you get the same hash (ca3d163...).
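To make the fix above concrete, here is a rough Node sketch of the decoding steps. It is not how go-ipfs does it internally. The helper names (base58Decode, extractMultihash, digestHex) are made up for illustration, and the hand-rolled base58 decoder stands in for the bs58 package so the snippet has no dependencies. It assumes the CID version and codec each fit in a single byte, which holds for a v1 raw CID but not for every codec. After stripping the z, the decoded v1 bytes are the CID version, the codec, and then the multihash, so two more bytes are skipped before comparing.

```javascript
// Bitcoin-style base58 alphabet (no 0, O, I, l).
const ALPHABET = "123456789ABCDEFGHJKLMNPQRSTUVWXYZabcdefghijkmnopqrstuvwxyz";

// Decode a base58btc string into a Buffer.
function base58Decode(str) {
  let n = 0n;
  for (const ch of str) {
    const idx = ALPHABET.indexOf(ch);
    if (idx < 0) throw new Error(`invalid base58 character: ${ch}`);
    n = n * 58n + BigInt(idx);
  }
  let hex = n === 0n ? "" : n.toString(16);
  if (hex.length % 2) hex = "0" + hex;
  // Each leading '1' in the input stands for one leading zero byte.
  let zeros = 0;
  while (zeros < str.length && str[zeros] === "1") zeros++;
  return Buffer.concat([Buffer.alloc(zeros), Buffer.from(hex, "hex")]);
}

// Return the multihash bytes embedded in a base58 CID string.
function extractMultihash(cid) {
  // v0 CID: a bare base58btc multihash, 46 characters starting with "Qm".
  if (cid.length === 46 && cid.startsWith("Qm")) {
    return base58Decode(cid);
  }
  // v1 CID in base58btc: the multibase prefix "z" must be removed first.
  if (cid.startsWith("z")) {
    const bytes = base58Decode(cid.slice(1));
    // Assumption: version and codec are single-byte varints (< 0x80).
    if (bytes[0] !== 1 || bytes[1] >= 0x80) {
      throw new Error("unsupported CID layout for this sketch");
    }
    return bytes.subarray(2); // skip <version><codec>
  }
  throw new Error("unsupported CID encoding for this sketch");
}

// Multihash layout: <hash function code><digest length><digest>.
// 0x12 0x20 means sha2-256 with a 32-byte digest.
function digestHex(multihash) {
  return multihash.subarray(2).toString("hex");
}

const v0 = "Qmbx76B2S21SWTRwBAktMwSeficnbKR6v8pbEb7DZkhzwf";
const v1 = "zb2rhkFjaEEfsGTTeHVAGXB3qq7RzLHJojqpucfYFzoL2gB9P";
console.log("block put digest:", digestHex(extractMultihash(v0)));
console.log("dag put digest:  ", digestHex(extractMultihash(v1)));
```

Both printed digests can then be compared against the shasum output. The sketch also suggests why the last 6 bytes lined up in the original comparison: a stray leading z only adds a multiple of 58^48 to the decoded number, and 58^48 is divisible by 2^48, so the low 48 bits are unchanged.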
Ahh! Neato. The more you know!