Kaniko: --reproducible leads to different layers in resulting image

Created on 3 Dec 2019 · 5 Comments · Source: GoogleContainerTools/kaniko

Actual behavior
Building an image twice using the --reproducible flag leads to different layers in the final images.

Expected behavior
Image layers should have the same hash, since the content is identical and the --reproducible flag was used.

To Reproduce

mkdir ctx && cd ctx
echo "Foo" >foo
echo "Fooo" >fooo
echo "Bar" >bar

cat >Dockerfile <<'EOF'
FROM gcr.io/distroless/java
USER nonroot
COPY foo* /
COPY bar /
EOF

docker run -it -e http_proxy=$http_proxy -e https_proxy=$https_proxy -e no_proxy=$no_proxy --rm -v $(pwd):/data  \
         -v ~/.docker/config.json:/kaniko/.docker/config.json \
         gcr.io/kaniko-project/executor:v0.14.0 \
              --reproducible \
              --context /data \
              --dockerfile Dockerfile \
              --destination myrepo.com/img:foo

docker run -it -e http_proxy=$http_proxy -e https_proxy=$https_proxy -e no_proxy=$no_proxy --rm -v $(pwd):/data  \
         -v ~/.docker/config.json:/kaniko/.docker/config.json \
         gcr.io/kaniko-project/executor:v0.14.0 \
              --reproducible \
              --context /data \
              --dockerfile Dockerfile \
              --destination myrepo.com/img:bar 

Layers in myrepo.com/img:foo and myrepo.com/img:bar have different hashes.

Additional Information
see steps to reproduce

I also described this in #710; this issue, however, is not related to caching.

Triage Notes for the Maintainers

| Description | Yes/No |
|----------------|---------------|
| Please check if this is a new feature you are proposing | - [ ] |
| Please check if the build works in docker but not in kaniko | - [ ] |
| Please check if this error is seen when you use --cache flag | - [ ] |
| Please check if your dockerfile is a multistage dockerfile | - [ ] |

area/behavior kind/bug priority/p3

All 5 comments

Having the same issue. I ran all checks supplied by container-diff, but they didn't reveal any differences. Yet the sha256 still differs.

As mentioned in #710, I have already debugged this: the differing hash results from the different order of the files in this tar-gz JSON diff layer information. I think the files listed in that JSON file need to be in a consistent order.
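
To illustrate why file order alone changes the hash, here is a minimal Go sketch. It is not kaniko's code: the `entry` type and `layerDigest` function are hypothetical names made up for this example. It writes the same two files into an in-memory tar archive in two different orders and compares the SHA-256 of the resulting bytes.

```go
package main

import (
	"archive/tar"
	"bytes"
	"crypto/sha256"
	"fmt"
)

// entry is a hypothetical stand-in for one file in a layer.
type entry struct {
	name, body string
}

// layerDigest writes the entries to an in-memory tar archive in the given
// order and returns the hex-encoded SHA-256 of the archive bytes.
func layerDigest(entries []entry) (string, error) {
	var buf bytes.Buffer
	tw := tar.NewWriter(&buf)
	for _, e := range entries {
		hdr := &tar.Header{
			Typeflag: tar.TypeReg,
			Name:     e.name,
			Mode:     0644,
			Size:     int64(len(e.body)),
		}
		if err := tw.WriteHeader(hdr); err != nil {
			return "", err
		}
		if _, err := tw.Write([]byte(e.body)); err != nil {
			return "", err
		}
	}
	if err := tw.Close(); err != nil {
		return "", err
	}
	return fmt.Sprintf("%x", sha256.Sum256(buf.Bytes())), nil
}

func main() {
	a, _ := layerDigest([]entry{{"foo", "Foo\n"}, {"fooo", "Fooo\n"}})
	b, _ := layerDigest([]entry{{"fooo", "Fooo\n"}, {"foo", "Foo\n"}})
	// Same files, different order: the archives differ byte for byte.
	fmt.Println("same digest:", a == b)
}
```

The contents are identical in both archives, yet the digests differ, which matches what container-diff not reporting any differences would suggest.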

This map in filesWithParentDirs causes that function to return files in a different order on each invocation. I refactored it a bit to preserve order, and now my layers remain consistent, but only with a warm cache. The initial image built w/ cold cache differs.

Why is this influenced by caching? I had a look at the code. I am not a Go expert, but do those maps preserve order? Wouldn't it be sufficient to just sort the resulting array of files in lexicographical order?
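
For reference, Go maps do not guarantee any iteration order, so the order can change from run to run. A minimal sketch of the sorting idea suggested above, assuming the files are held as keys of a map (the `sortedPaths` helper is a hypothetical name, not a kaniko function):

```go
package main

import (
	"fmt"
	"sort"
)

// sortedPaths collects the keys of a path set and returns them in
// lexicographical order, independent of the map's iteration order.
func sortedPaths(set map[string]struct{}) []string {
	paths := make([]string, 0, len(set))
	for p := range set {
		paths = append(paths, p)
	}
	sort.Strings(paths)
	return paths
}

func main() {
	set := map[string]struct{}{"/fooo": {}, "/bar": {}, "/foo": {}}
	fmt.Println(sortedPaths(set)) // prints [/bar /foo /fooo]
}
```

Whether sorting alone is enough depends on everything else that feeds into the layer, so this only addresses the ordering part of the question.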

There is still a case that leads to different layers that I observed today in one of our builds:

  • If files are deleted between layers, the deleted files are collected into a slice by iterating over a map, whose iteration order is unstable
  • The whiteouts written to the tar are therefore not in a stable order, which results in a tar file with the same contents but a different entry order

The code should be quite easy to fix by adding sorting for the whiteouts.
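
A rough sketch of what such a fix could look like, assuming the deleted paths are held as keys of a map. The `whiteoutNames` helper is hypothetical and not taken from the kaniko source; the `.wh.` prefix on the base name is the whiteout convention from the OCI image layer format.

```go
package main

import (
	"fmt"
	"path"
	"sort"
)

// whiteoutNames turns a set of deleted paths into whiteout entry names
// (".wh." prefixed to the base name) and returns them sorted, so the
// tar writer sees them in the same order on every build.
func whiteoutNames(deleted map[string]struct{}) []string {
	names := make([]string, 0, len(deleted))
	for p := range deleted {
		dir, base := path.Split(p)
		names = append(names, dir+".wh."+base)
	}
	sort.Strings(names)
	return names
}

func main() {
	deleted := map[string]struct{}{
		"/tmp/b":        {},
		"/etc/old.conf": {},
		"/tmp/a":        {},
	}
	for _, n := range whiteoutNames(deleted) {
		fmt.Println(n)
	}
}
```

Sorting after collecting keeps the change local: the map can stay as it is, and only the slice handed to the tar writer becomes deterministic.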
