Protobuf: sha256 changes on v3.3.0, v3.3.1?

Created on 11 Sep 2017 · 13 Comments · Source: protocolbuffers/protobuf

Hello,

We're using protobuf v3.3.0 in our project, exactly like this example in tink, but the sha256 of the archive seems to have changed. Specifically, we (and tink) pin 94c414775f275d876e5e0e4a276527d155ab2d0da45eed6b7734301c330be36e, but we're now getting 9a36bc1265fa83b8e818714c0d4f08b8cec97a1910de0754a321b11e66eb76de instead.

The build error:

ERROR: /home/admin/app/BUILD:66:1: no such package '@com_google_protobuf_well_known_protos//': Error downloading [https://github.com/google/protobuf/archive/v3.3.0.tar.gz] to /home/admin/.cache/bazel/_bazel_admin/61e657d6884c63362b8b441914a1bc68/external/com_google_protobuf_well_known_protos/v3.3.0.tar.gz: 
Checksum was 9a36bc1265fa83b8e818714c0d4f08b8cec97a1910de0754a321b11e66eb76de but wanted 94c414775f275d876e5e0e4a276527d155ab2d0da45eed6b7734301c330be36e and referenced by '//app:test_proto'.

Similarly, if we just wget the file and sha256sum on it:

% wget https://github.com/google/protobuf/archive/v3.3.0.tar.gz
--2017-09-11 18:39:50--  https://github.com/google/protobuf/archive/v3.3.0.tar.gz
Resolving github.com (github.com)... 192.30.255.112, 192.30.255.113
Connecting to github.com (github.com)|192.30.255.112|:443... connected.
HTTP request sent, awaiting response... 302 Found
Location: https://codeload.github.com/google/protobuf/tar.gz/v3.3.0 [following]
--2017-09-11 18:39:51--  https://codeload.github.com/google/protobuf/tar.gz/v3.3.0
Resolving codeload.github.com (codeload.github.com)... 192.30.255.121, 192.30.255.120
Connecting to codeload.github.com (codeload.github.com)|192.30.255.121|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: 4336644 (4.1M) [application/x-gzip]
Saving to: ‘v3.3.0.tar.gz’

v3.3.0.tar.gz                                                                              100%[==========================================================================================================================================================================================================================================>]   4.14M  --.-KB/s   in 0.07s

2017-09-11 18:39:51 (59.7 MB/s) - ‘v3.3.0.tar.gz’ saved [4336644/4336644]

% sha256sum v3.3.0.tar.gz
9a36bc1265fa83b8e818714c0d4f08b8cec97a1910de0754a321b11e66eb76de  v3.3.0.tar.gz

Further, we noticed the same for v3.3.1, where we expect df77b0e60afcd3d90b2654cd305e61ae8ae2e2281b4d6540c7093da4c4245d75 but get c280314bd8aea07855178419881c53ec66c63c6ca0d1fa82918e9ee52c12b014:

% wget https://github.com/google/protobuf/archive/v3.3.1.tar.gz
--2017-09-11 18:45:18--  https://github.com/google/protobuf/archive/v3.3.1.tar.gz
Resolving github.com (github.com)... 192.30.255.113, 192.30.255.112
Connecting to github.com (github.com)|192.30.255.113|:443... connected.
HTTP request sent, awaiting response... 302 Found
Location: https://codeload.github.com/google/protobuf/tar.gz/v3.3.1 [following]
--2017-09-11 18:45:18--  https://codeload.github.com/google/protobuf/tar.gz/v3.3.1
Resolving codeload.github.com (codeload.github.com)... 192.30.255.120, 192.30.255.121
Connecting to codeload.github.com (codeload.github.com)|192.30.255.120|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: 4336995 (4.1M) [application/x-gzip]
Saving to: ‘v3.3.1.tar.gz’

v3.3.1.tar.gz                                                                              100%[==========================================================================================================================================================================================================================================>]   4.14M  --.-KB/s   in 0.07s

2017-09-11 18:45:18 (55.4 MB/s) - ‘v3.3.1.tar.gz’ saved [4336995/4336995]

% sha256sum v3.3.1.tar.gz
c280314bd8aea07855178419881c53ec66c63c6ca0d1fa82918e9ee52c12b014  v3.3.1.tar.gz
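The manual check above (download the archive, then run `sha256sum`) can also be scripted. The following is a minimal Python sketch, not part of the original report; the helper names `sha256_of_stream` and `check_pin` are hypothetical, and the demo hashes the in-memory bytes `b"abc"` so that it runs without network access.

```python
import hashlib
import io


def sha256_of_stream(stream, chunk_size=1 << 20):
    """Hash a binary stream in chunks so large archives need not fit in memory."""
    digest = hashlib.sha256()
    for chunk in iter(lambda: stream.read(chunk_size), b""):
        digest.update(chunk)
    return digest.hexdigest()


def check_pin(stream, expected_sha256):
    """Return (matches, actual_hex) for a stream checked against a pinned sha256."""
    actual = sha256_of_stream(stream)
    return actual == expected_sha256.lower(), actual


# Demo on in-memory bytes (hypothetical stand-in for a downloaded archive).
demo_ok, demo_actual = check_pin(
    io.BytesIO(b"abc"),
    "ba7816bf8f01cfea414140de5dae2223b00361a396177a9cb410ff61f20015ad",
)
```

For a downloaded file, one would open it in binary mode, for example `with open("v3.3.0.tar.gz", "rb") as f: check_pin(f, pinned_value)`, where `pinned_value` is whatever checksum the build configuration records.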

Additionally, here's an example in grpc/grpc-java, which still references the old sha256, df77b0e60afcd3d90b2654cd305e61ae8ae2e2281b4d6540c7093da4c4245d75.

All 13 comments

We're seeing this from rules_go as well, which pins to v3.2.0: https://github.com/bazelbuild/rules_go/blob/release/0.4.2/proto/go_proto_library.bzl#L312

That sha no longer matches.

I am seeing this as well. It appears that GitHub's tgz archives changed in a way that leaves the unpacked files identical but alters the archive block headers. The difference is in the tar part, not the gz part, and is related to how file paths longer than 100 characters are represented.
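To illustrate how this can happen, here is a small Python sketch (an illustration only, not a reconstruction of what GitHub actually changed). It writes the same file, under a path longer than 100 characters, into two tar archives that differ only in header format. GNU and PAX formats are chosen here as an assumption because Python's `tarfile` encodes long paths differently in each; the path, payload, and helper names are hypothetical.

```python
import hashlib
import io
import tarfile

# Hypothetical path longer than 100 characters, and a hypothetical payload.
LONG_PATH = "protobuf-3.3.0/" + "d" * 60 + "/" + "f" * 60 + ".txt"
PAYLOAD = b'syntax = "proto3";\n'


def build_tar(fmt):
    """Build an uncompressed tar in memory using the given header format."""
    buf = io.BytesIO()
    with tarfile.open(fileobj=buf, mode="w", format=fmt) as tar:
        info = tarfile.TarInfo(name=LONG_PATH)
        info.size = len(PAYLOAD)
        info.mtime = 0  # fixed timestamp so only the header format varies
        tar.addfile(info, io.BytesIO(PAYLOAD))
    return buf.getvalue()


def content_digest(raw):
    """Hash member names and file bytes only, ignoring how headers are encoded."""
    digest = hashlib.sha256()
    with tarfile.open(fileobj=io.BytesIO(raw), mode="r:") as tar:
        for member in sorted(tar.getmembers(), key=lambda m: m.name):
            if member.isfile():
                digest.update(member.name.encode())
                digest.update(b"\0")
                digest.update(tar.extractfile(member).read())
    return digest.hexdigest()


gnu = build_tar(tarfile.GNU_FORMAT)
pax = build_tar(tarfile.PAX_FORMAT)

# The archive bytes differ, so a pinned sha256 of the archive breaks...
archive_hashes_differ = hashlib.sha256(gnu).hexdigest() != hashlib.sha256(pax).hexdigest()
# ...even though the unpacked names and contents are identical.
contents_match = content_digest(gnu) == content_digest(pax)
```

A checksum pinned on the archive bytes is sensitive to this kind of re-encoding, whereas a digest over the unpacked names and contents is not.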

This is pervasive for many tagged source releases, not just 3.3.x etc., and is pervasive for other projects hosted on github as well.

FYI, it's not specific to this particular repository. It's Github who is making this change.

Protobuf releases don't have a download equivalent to the "full archive" provided by GitHub; they only have language-specific subsets. Could this be mitigated by adding full archives to the old releases?

Attached is a cached copy of the 3.2.0 archive tarball.
v3.2.0.tar.gz

See also bazelbuild/bazel#3722 and tensorflow/tensorflow#12979.

See: https://github.com/libgit2/libgit2/issues/4343#issuecomment-328631745

It turns out the source .tar.gz is generated by GitHub, and its checksum isn't guaranteed to stay the same.

The https://github.com/google/protobuf/releases page has "Source code" links to the automatically generated tarballs, whose checksums can change over time. This seems misleading at best. Perhaps there is still an issue to be fixed here: the release process could also publish a stable source tarball and its sha256 file, instead of relying on GitHub to generate it.

@jwnimmer-tri We never wanted to rely on GitHub to generate tarballs. Actually, I have wanted to get rid of these two "Source code" links since the first release we made on GitHub, because they are just not the tarballs we want to provide as part of a protobuf release. But they are created by GitHub automatically, and there doesn't seem to be a way to disable them.

I understand that GitHub's misleading links can't be removed. That's sad, but it's nothing Google can change. But it seems like Google could still publish a full source archive as it exists in git, in addition to the language-specific subsets that are already offered as individual tarballs as part of the release?

We certainly can publish the full archive tarball as well. It was not done so because we assumed most users only need one or two languages rather than all. This incident seems a good reason for us to do a full archive. I'll make sure that's covered in future releases.

@xfxyjwf Can you also publish tarballs for existing releases? I'm sure users would be happy to recover them from build caches if it means not having to rebuild everything. The sha256 checksums can be used to verify they're the same data as before.

@jmillikin-stripe If we publish our own full archive tarballs for existing releases, the URLs for these tarballs will be different from those of the tarballs produced automatically by GitHub. It's also very unlikely for these tarballs to have the same sha256. From what I can tell, it won't help the build cache problem people are having after the GitHub checksum change.

@xfxyjwf Changing the URL isn't a big problem IMO, as long as the sha256 is the same. Bazel makes it easy to add additional URLs to an existing artifact.
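The idea of several URLs sharing one pinned checksum can be sketched in Python as follows. This is not Bazel's implementation or its configuration syntax; `fetch_url` and `download_pinned` are hypothetical names, and the sketch only shows why the URL matters less than the digest: the first URL whose bytes match the pin wins, and a source serving regenerated bytes is skipped.

```python
import hashlib
import urllib.request


def fetch_url(url):
    """Default fetcher: return the response body for a URL."""
    with urllib.request.urlopen(url) as response:
        return response.read()


def download_pinned(urls, expected_sha256, fetch=fetch_url):
    """Return bytes from the first URL whose sha256 matches the pin.

    Raises ValueError if no URL yields matching bytes.
    """
    failures = []
    for url in urls:
        try:
            data = fetch(url)
        except OSError as exc:  # urllib's URLError is a subclass of OSError
            failures.append((url, f"fetch failed: {exc}"))
            continue
        actual = hashlib.sha256(data).hexdigest()
        if actual == expected_sha256:
            return data
        failures.append((url, f"checksum was {actual}"))
    raise ValueError(f"no URL matched {expected_sha256}: {failures}")
```

The `fetch` parameter exists only so the logic can be exercised with an in-memory stand-in instead of real network requests.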

You can upload an "old" tarball recovered from build caches. I posted the one for 3.2.0 above -- its sha256 can be used to verify it's the same as what Github's artifact link used to serve.
