Kaniko: Subsequent Builds fail when using cache

Created on 17 Aug 2019 · 4 Comments · Source: GoogleContainerTools/kaniko

Actual behavior
When building with --cache, the first build succeeds, but subsequent builds fail.

Expected behavior
If a build succeeded once, I expect it to succeed every time, as long as the code has not changed.

To Reproduce
Dockerfile, build context, logs, and other materials can be found here

Additional Information

The error is always as follows (only the folder name differs):

INFO[0001] Found cached layer, extracting to filesystem 
error building image: error building stage: extracting fs from image: open /usr/bin/somefolder: not a directory. 
ERROR
ERROR: build step 0 "gcr.io/kaniko-project/executor:debug" failed: exit status 1

Kaniko Image

This problem was reproduced with both images:

gcr.io/kaniko-project/executor:debug@sha256:a54d167d7c4b7ce0c7a622f17dcf473c652b29341b321ca507425c8fa3525842
gcr.io/kaniko-project/executor:latest@sha256:78d44ec4e9cb5545d7f85c1924695c89503ded86a59f92c7ae658afa3cff5400
area/caching kind/bug priority/p1

All 4 comments

@BenHizak Thank you for the detailed report along with the files to reproduce this.
@donmccasland and I will take a look at this soon.

Thanks
Tejal

I have been able to reproduce this issue on my GCP project using a local build.

I'm finding interesting behavior. If I clean the cache between runs I get different errors:
error building image: error building stage: extracting fs from image: mkdir /usr/bin: not a directory

OR

DEBU[0005] creating file /usr/bin/reindexdb
error building image: error building stage: extracting fs from image: error removing /usr/bin/reindexdb to make way for new file.: unlinkat /usr/bin/reindexdb: not a directory

OR

DEBU[0005] creating file /usr/bin/reindexdb
error building image: error building stage: extracting fs from image: error removing /usr/bin/pw_receivewal to make way for new file.: unlinkat /usr/bin/pw_receivewal: not a directory

If I pull the cache and look in the image's overlay2 I see all mentioned files/directories:
donmccasland@rocinante:~$ docker pull gcr.io/projectgut-215417/kaniko_example/cache@sha256:1e0163ed3c2dddcd1dfa196249c3203d70023bcdae2acc9c3136b1e11e588607
sha256:1e0163ed3c2dddcd1dfa196249c3203d70023bcdae2acc9c3136b1e11e588607: Pulling from projectgut-215417/kaniko_example/cache
9fb30a5620e7: Pull complete
Digest: sha256:1e0163ed3c2dddcd1dfa196249c3203d70023bcdae2acc9c3136b1e11e588607
Status: Downloaded newer image for gcr.io/projectgut-215417/kaniko_example/cache@sha256:1e0163ed3c2dddcd1dfa196249c3203d70023bcdae2acc9c3136b1e11e588607
donmccasland@rocinante:~$ docker inspect gcr.io/projectgut-215417/kaniko_example/cache@sha256:1e0163ed3c2dddcd1dfa196249c3203d70023bcdae2acc9c3136b1e11e588607
[
    {
        "Id": "sha256:cc27e34c29c5d4b3ef4ec5f7383f267d28539f57b5b8f014d46e88a9d52d4991",
        "RepoTags": [],
        "RepoDigests": [
            "gcr.io/projectgut-215417/kaniko_example/cache@sha256:1e0163ed3c2dddcd1dfa196249c3203d70023bcdae2acc9c3136b1e11e588607"
        ],
        "Parent": "",
        "Comment": "",
        "Created": "2019-09-05T22:30:49.536867558Z",
        "Container": "",
        "ContainerConfig": {
            "Hostname": "",
            "Domainname": "",
            "User": "",
            "AttachStdin": false,
            "AttachStdout": false,
            "AttachStderr": false,
            "Tty": false,
            "OpenStdin": false,
            "StdinOnce": false,
            "Env": null,
            "Cmd": null,
            "Image": "",
            "Volumes": null,
            "WorkingDir": "",
            "Entrypoint": null,
            "OnBuild": null,
            "Labels": null
        },
        "DockerVersion": "",
        "Author": "",
        "Config": {
            "Hostname": "",
            "Domainname": "",
            "User": "",
            "AttachStdin": false,
            "AttachStdout": false,
            "AttachStderr": false,
            "Tty": false,
            "OpenStdin": false,
            "StdinOnce": false,
            "Env": null,
            "Cmd": null,
            "Image": "",
            "Volumes": null,
            "WorkingDir": "",
            "Entrypoint": null,
            "OnBuild": null,
            "Labels": null
        },
        "Architecture": "",
        "Os": "linux",
        "Size": 14631176,
        "VirtualSize": 14631176,
        "GraphDriver": {
            "Data": {
                "MergedDir": "/usr/local/google/docker/overlay2/89a45e2f4e71b0f6b905290b09427df30f202dc6f7479abb4088a46d07e10637/merged",
                "UpperDir": "/usr/local/google/docker/overlay2/89a45e2f4e71b0f6b905290b09427df30f202dc6f7479abb4088a46d07e10637/diff",
                "WorkDir": "/usr/local/google/docker/overlay2/89a45e2f4e71b0f6b905290b09427df30f202dc6f7479abb4088a46d07e10637/work"
            },
            "Name": "overlay2"
        },
        "RootFS": {
            "Type": "layers",
            "Layers": [
                "sha256:349a69f58d831f623948be7307c246f5cb3706f34ef91112d73c31fe673b8c37"
            ]
        },
        "Metadata": {
            "LastTagTime": "0001-01-01T00:00:00Z"
        }
    }
]
donmccasland@rocinante:~$ sudo su
[sudo] password for donmccasland:
root@rocinante:/usr/local/google/home/donmccasland# cd /usr/local/google/docker/overlay2/89a45e2f4e71b0f6b905290b09427df30f202dc6f7479abb4088a46d07e10637
root@rocinante:/usr/local/google/docker/overlay2/89a45e2f4e71b0f6b905290b09427df30f202dc6f7479abb4088a46d07e10637# ls
diff link
root@rocinante:/usr/local/google/docker/overlay2/89a45e2f4e71b0f6b905290b09427df30f202dc6f7479abb4088a46d07e10637# ls diff/usr/bin
clusterdb   dropdb            jq             pg_dumpall     pg_recvlogical  pip2              psql
createdb    dropuser          pg_basebackup  pg_isready     pg_restore      pip2.7            reindexdb
createuser  easy_install-2.7  pg_dump        pg_receivewal  pip             postgres_monitor  vacuumdb

After investigating this, it turns out the issue is NOT with caching. The problem is in how we interpret the Dockerfile.

The command in the example Dockerfile:
COPY postgres_monitor /usr/bin
incorrectly copies postgres_monitor OVER the directory /usr/bin, turning /usr/bin into a regular file. Every later attempt to extract a cached layer into /usr/bin then fails with "not a directory".
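
The "not a directory" errors reported in this thread are what a POSIX filesystem returns when a path component that should be a directory is a regular file. The following is a minimal sketch of that effect in Python, not kaniko's code: it works in a temporary directory, writes a regular file where usr/bin should be (standing in for the faulty COPY), and then tries the kinds of operations seen in the logs. The function name simulate_clobbered_bin is made up for this illustration, and the result assumes a POSIX system such as Linux.

```python
import os
import tempfile


def simulate_clobbered_bin():
    """Put a regular file where usr/bin should be, then try to extract into it."""
    errors = {}
    with tempfile.TemporaryDirectory() as root:
        usr = os.path.join(root, "usr")
        os.mkdir(usr)

        # Stand-in for the faulty COPY: usr/bin ends up as a regular file.
        usr_bin = os.path.join(usr, "bin")
        with open(usr_bin, "w") as f:
            f.write("contents of postgres_monitor")

        # A path that only makes sense if usr/bin is a directory.
        child = os.path.join(usr_bin, "reindexdb")

        # Each operation treats usr/bin as a directory and is rejected by the OS.
        operations = [
            ("open", lambda: open(child, "w")),
            ("unlink", lambda: os.unlink(child)),
            ("makedirs", lambda: os.makedirs(child)),
        ]
        for name, op in operations:
            try:
                op()
            except OSError as exc:
                errors[name] = type(exc).__name__
    return errors


if __name__ == "__main__":
    print(simulate_clobbered_bin())
```

All three operations fail with the same ENOTDIR condition; which one surfaces first in a real build presumably depends on which step of the extraction touches /usr/bin first.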

Changing the Dockerfile to:
COPY postgres_monitor /usr/bin/
allows the example to work correctly.

Doing a docker build of the Dockerfile with:
COPY postgres_monitor /usr/bin
behaves correctly, so our work is cut out for us: kaniko's COPY behaviour needs to be corrected to match.
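
To make the intended behaviour concrete, here is a hedged sketch of a destination-resolution rule consistent with what is described above: when the destination ends with a slash, or already exists as a directory, a single-file COPY should place the file inside it instead of replacing it. This is an illustration in Python under that assumption. It is neither kaniko's nor Docker's implementation, and the names resolve_copy_dest, copy_file, and rootfs are hypothetical.

```python
import os
import shutil
import tempfile


def resolve_copy_dest(src, dest, rootfs):
    """Return the path under rootfs that a single-file COPY should write to.

    Assumed rule: a trailing slash, or an existing directory at dest, means
    "copy into dest"; otherwise dest names the target file itself.
    """
    target = os.path.join(rootfs, dest.lstrip("/"))
    if dest.endswith("/") or os.path.isdir(target):
        return os.path.join(target, os.path.basename(src))
    return target


def copy_file(src, dest, rootfs):
    """Copy src to the resolved destination, creating parent directories."""
    final = resolve_copy_dest(src, dest, rootfs)
    os.makedirs(os.path.dirname(final), exist_ok=True)
    shutil.copyfile(src, final)
    return final


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as rootfs, tempfile.TemporaryDirectory() as ctx:
        # The image filesystem already has /usr/bin as a directory.
        os.makedirs(os.path.join(rootfs, "usr", "bin"))
        src = os.path.join(ctx, "postgres_monitor")
        with open(src, "w") as f:
            f.write("#!/bin/sh\n")

        # Both spellings should resolve to the same file inside usr/bin.
        for dest in ("/usr/bin", "/usr/bin/"):
            final = copy_file(src, dest, rootfs)
            print(dest, "->", os.path.relpath(final, rootfs))
```

Under this rule both spellings of the destination leave /usr/bin as a directory, which is the outcome the thread reports for docker build.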
