Actual behavior
I have a two-stage Docker image that first builds Sphinx docs and then copies them into a second, nginx-based image:
FROM python:3.7.5-alpine3.10 AS builder
RUN apk add --no-cache build-base libxml2-dev libxslt-dev graphviz
COPY requirements.txt /
RUN pip install --upgrade -r /requirements.txt
WORKDIR /build
COPY . ./
RUN make html
########################
FROM nginx:1.17.5-alpine AS runner
COPY --from=builder /build/_build/html /usr/share/nginx/html
I build it on Google Cloud Build using the gcr.io/kaniko-project/executor:latest builder.
The problem is the last COPY instruction: even though the generated HTML files differ from the previously built revision, kaniko still uses the cached layer:
Step #0: build succeeded.
Step #0:
Step #0: The HTML pages are in _build/html.
Step #0:
Step #0: Build finished. The HTML pages are in _build/html.
Step #0: INFO[0062] Taking snapshot of full filesystem...
Step #0: INFO[0064] Pushing layer gcr.io/xxx/docs-server/cache:cd602b7af9c972414b3241eb3a192a606755174cf665f9806155078915975932 to cache now
Step #0: INFO[0068] Saving file /build/_build/html for later use.
Step #0: INFO[0068] Deleting filesystem...
Step #0: INFO[0068] Downloading base image nginx:1.17.5-alpine
Step #0: INFO[0069] Error while retrieving image from cache: getting file info: stat /cache/sha256:b4c0378c841cd76f0b75bc63454bfc6fe194a5220d4eab0d75963bccdbc327ff: no such file or directory
Step #0: INFO[0069] Downloading base image nginx:1.17.5-alpine
Step #0: INFO[0069] Checking for cached layer gcr.io/xxx/docs-server/cache:5de22bf6a7763f10cd0ffa0592b8f51c876286a1d0f50564cf2f21bfd6262dc4...
Step #0: INFO[0069] Using caching version of cmd: COPY --from=builder /build/_build/html /usr/share/nginx/html
Step #0: INFO[0069] Skipping unpacking as no commands require it.
Step #0: INFO[0069] Taking snapshot of full filesystem...
Step #0: INFO[0070] COPY --from=builder /build/_build/html /usr/share/nginx/html
Step #0: INFO[0070] Found cached layer, extracting to filesystem
Expected behavior
Detect that the files in the stage 0 image have changed and discard the cache for the COPY --from directive in stage 1.
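To illustrate the expected behavior, here is a minimal sketch of a cache key that depends on the content of the copied files as well as on the instruction text. The `copy_cache_key` helper is hypothetical and is not how kaniko computes its keys; it also assumes `sha256sum` is available (coreutils or busybox).

```shell
#!/bin/sh
# Hypothetical helper (not kaniko's implementation): derive a cache key for a
# COPY --from step from the instruction text AND the content of its sources.
copy_cache_key() {
    cmd="$1"   # instruction text, e.g. 'COPY --from=builder /a /b'
    src="$2"   # file or directory that the instruction copies
    {
        printf '%s\n' "$cmd"
        # Hash every file under src in a stable order so the key depends
        # only on the paths and their contents.
        find "$src" -type f | LC_ALL=C sort | while IFS= read -r f; do
            sha256sum "$f"
        done
    } | sha256sum | cut -d' ' -f1
}

# Demo: same instruction, different stage 0 output.
demo_dir=$(mktemp -d)
copy_cmd="COPY --from=builder /build/_build/html /usr/share/nginx/html"

echo "first build" > "$demo_dir/index.html"
key_before=$(copy_cache_key "$copy_cmd" "$demo_dir")

echo "second build" > "$demo_dir/index.html"
key_after=$(copy_cache_key "$copy_cmd" "$demo_dir")

if [ "$key_before" != "$key_after" ]; then
    echo "cache key changed, the COPY step would be re-run"
fi
rm -rf "$demo_dir"
```

A key built from the instruction text alone would be identical for both builds, which matches the stale result described above.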
To Reproduce
Steps to reproduce the behavior:
mkdir /tmp/bug; cd /tmp/bug
python3.7 -mvenv /tmp/bug-env && source /tmp/bug-env/bin/activate
mkdir docs
echo 'Sphinx==2.2.1' >> docs/requirements.txt
pip install -r docs/requirements.txt
cd docs
echo no | sphinx-quickstart -p foo -a foo -v 0.0.1 -r 0.0.1 -l en
Then add the above Dockerfile under docs/.
steps:
- name: gcr.io/kaniko-project/executor:latest
args:
- --cache=true
- --cache-ttl=336h # 2 weeks
- --context=/workspace/docs
- --dockerfile=/workspace/docs/Dockerfile
- --destination=gcr.io/$PROJECT_ID/docs-server:$BRANCH_NAME-$COMMIT_SHA
Edit docs/index.rst, commit, trigger a new build, and see that the resulting image still has the old version of the docs.
Additional Information
Triage Notes for the Maintainers
| Description | Yes/No |
|----------------|---------------|
| Please check if this is a new feature you are proposing |
--cache flag |
I experience this problem too.
Current workaround for me is using gcr.io/kaniko-project/executor:v0.13.0.
I just found this while hitting the exact same issue using kaniko on Cloud Build. I will try the suggested workaround and roll back.
This appears to be the same underlying problem as #845 and #589.
@cvgw Any movement on getting that fix merged?
@mcfedr hoping to get merged this week
I can also confirm this does not occur in 0.13.0, but I'm not sure why.
I hope the fixes can get merged and released soon.
This is how I recreated the issue locally using a test Dockerfile and AWS ECR as a Cache repo
We can replicate this bug on our machine using Docker
Dockerfile
# set some variables
docker_image="gcr.io/kaniko-project/executor:debug-v0.14.0"
ecr_endpoint="YOUR_AWS_ACCOUNT_ID.dkr.ecr.us-east-1.amazonaws.com"
ecr_repo="YOUR_ECR_REPO_NAME"
# create our test Dockerfile
mkdir -p ./tmp/
cat <<'EOF' > ./tmp/Dockerfile
FROM busybox:latest as builder
COPY ./date.txt /tmp/date.txt
FROM busybox:latest
COPY --from=builder /tmp/date.txt /tmp/test.txt
EOF
# run kaniko debug image with some env vars
docker run \
--tty \
--interactive \
--rm \
--volume "$(pwd)/tmp:/app" \
--volume "$HOME/.aws":/root/.aws \
--env "AWS_DEFAULT_REGION=us-east-1" \
--env "ECR_ENDPOINT=${ecr_endpoint}" \
--env "ECR_REPO=${ecr_repo}" \
--entrypoint "" \
"${docker_image}" \
sh
# create ECR config file for kaniko
mkdir -p /kaniko/.docker/
cat <<EOF > /kaniko/.docker/config.json
{
"credHelpers": {
"${ECR_ENDPOINT}": "ecr-login"
}
}
EOF
# Create first image
echo $(date) > /app/date.txt
ecr_tag="test-$(date +%s)"
/kaniko/executor \
--context "/app" \
--dockerfile "/app/Dockerfile" \
--destination "${ECR_ENDPOINT}/${ECR_REPO}:${ecr_tag}" \
--cache-repo "${ECR_ENDPOINT}/${ECR_REPO}/cache" \
--cache=true \
--cleanup
echo "docker run -it --rm ${ECR_ENDPOINT}/${ECR_REPO}:${ecr_tag} cat /tmp/test.txt > image1.txt"
# Create 2nd Image
echo $(date) > /app/date.txt
ecr_tag="test-$(date +%s)"
/kaniko/executor \
--context "/app" \
--dockerfile "/app/Dockerfile" \
--destination "${ECR_ENDPOINT}/${ECR_REPO}:${ecr_tag}" \
--cache-repo "${ECR_ENDPOINT}/${ECR_REPO}/cache" \
--cache=true \
--cleanup
echo "docker run -it --rm ${ECR_ENDPOINT}/${ECR_REPO}:${ecr_tag} cat /tmp/test.txt > image2.txt"
# Create 3rd Image w/o Cache
echo $(date) > /app/date.txt
ecr_tag="test-$(date +%s)"
/kaniko/executor \
--context "/app" \
--dockerfile "/app/Dockerfile" \
--destination "${ECR_ENDPOINT}/${ECR_REPO}:${ecr_tag}" \
--cache-repo "${ECR_ENDPOINT}/${ECR_REPO}/cache" \
--cache=false \
--cleanup
echo "docker run -it --rm ${ECR_ENDPOINT}/${ECR_REPO}:${ecr_tag} cat /tmp/test.txt > image3.txt"
$ cat image{1..3}.txt
Tue Dec 10 14:30:10 UTC 2019
Tue Dec 10 14:30:10 UTC 2019
Tue Dec 10 15:09:24 UTC 2019
image3.txt gave us an updated date.txt file because it did not use the cache. Let's try the build again, this time with the Docker image gcr.io/kaniko-project/executor:debug-v0.13.0.
0.13.0:
# Run Kaniko Image 0.13.0
docker_image="gcr.io/kaniko-project/executor:debug-v0.13.0"
# run kaniko debug image with some env vars
docker run \
--tty \
--interactive \
--rm \
--volume "$(pwd)/tmp:/app" \
--volume "$HOME/.aws":/root/.aws \
--env "AWS_DEFAULT_REGION=us-east-1" \
--env "ECR_ENDPOINT=${ecr_endpoint}" \
--env "ECR_REPO=${ecr_repo}" \
--entrypoint "" \
"${docker_image}" \
sh
# create ECR config file for kaniko
mkdir -p /kaniko/.docker/
cat <<EOF > /kaniko/.docker/config.json
{
"credHelpers": {
"${ECR_ENDPOINT}": "ecr-login"
}
}
EOF
echo $(date) > /app/date.txt
ecr_tag="test-$(date +%s)"
/kaniko/executor \
--context "/app" \
--dockerfile "/app/Dockerfile" \
--destination "${ECR_ENDPOINT}/${ECR_REPO}:${ecr_tag}" \
--cache-repo "${ECR_ENDPOINT}/${ECR_REPO}/cache" \
--cache=true \
--cleanup
echo "docker run -it --rm ${ECR_ENDPOINT}/${ECR_REPO}:${ecr_tag} cat /tmp/test.txt > image4.txt"
echo $(date) > /app/date.txt
ecr_tag="test-$(date +%s)"
/kaniko/executor \
--context "/app" \
--dockerfile "/app/Dockerfile" \
--destination "${ECR_ENDPOINT}/${ECR_REPO}:${ecr_tag}" \
--cache-repo "${ECR_ENDPOINT}/${ECR_REPO}/cache" \
--cache=true \
--cleanup
echo "docker run -it --rm ${ECR_ENDPOINT}/${ECR_REPO}:${ecr_tag} cat /tmp/test.txt > image5.txt"
$ cat image{4..5}.txt
Tue Dec 10 15:32:45 UTC 2019
Tue Dec 10 15:35:38 UTC 2019
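The comparison between the two runs can be scripted. The sketch below uses a hypothetical `is_stale` helper and assumes the `imageN.txt` files hold the output of `cat /tmp/test.txt` from each built image, as in the commands above. The demo values are the timestamps reported in this comment.

```shell
#!/bin/sh
# Hypothetical check: two consecutive builds were made from different
# date.txt files, so identical output means the COPY --from layer was stale.
is_stale() {
    # $1, $2: files holding `cat /tmp/test.txt` output from two images
    cmp -s "$1" "$2"
}

demo_dir=$(mktemp -d)
# Values observed with v0.14.0 (image1, image2) and v0.13.0 (image4, image5).
echo "Tue Dec 10 14:30:10 UTC 2019" > "$demo_dir/image1.txt"
echo "Tue Dec 10 14:30:10 UTC 2019" > "$demo_dir/image2.txt"
echo "Tue Dec 10 15:32:45 UTC 2019" > "$demo_dir/image4.txt"
echo "Tue Dec 10 15:35:38 UTC 2019" > "$demo_dir/image5.txt"

if is_stale "$demo_dir/image1.txt" "$demo_dir/image2.txt"; then
    echo "v0.14.0: second build reused stale content"
fi
if ! is_stale "$demo_dir/image4.txt" "$demo_dir/image5.txt"; then
    echo "v0.13.0: second build picked up the new content"
fi
rm -rf "$demo_dir"
```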
Once https://github.com/GoogleContainerTools/kaniko/issues/899 is fixed this issue should be resolved
This should be fixed as of master@https://github.com/GoogleContainerTools/kaniko/commit/a675098b452d020bf678063a3ac2d3be84b0e545
Some assistance verifying that it is fixed would be appreciated. Thank you
Is there a prebuilt image from master that I can test with? It looks like gcr.io/kaniko-project/executor:latest is pinned to the last release.
@mcfedr I think you can use the git hash, like gcr.io/kaniko-project/executor:a675098b452d020bf678063a3ac2d3be84b0e545
@mcfedr you can check image versions in GCR https://console.cloud.google.com/gcr/images/kaniko-project/GLOBAL/executor?gcrImageListsize=30
@cvgw fix is working on the latest version for me.
Dockerfile
FROM maven:3.6.2-jdk-11-slim AS build
WORKDIR /usr/src/app
COPY pom.xml .
RUN mvn dependency:go-offline
COPY src ./src
RUN mvn clean package
FROM openjdk:11.0.4-jre-slim
WORKDIR /home/user
COPY --from=build /usr/src/app/target/app-0.0.1.jar /home/user/app.jar
RUN chmod +x /home/user/app.jar
ENTRYPOINT ["java", "-jar", "app.jar"]
EXPOSE 8080
kaniko output after running multiple builds with changed source code:
INFO[0390] No cached layer found for cmd COPY --from=build /usr/src/app/target/app-0.0.1.jar /home/user/app.jar
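One way to verify the fix across several builds is to scan saved build logs for the cache-hit message on COPY --from steps. This is a rough sketch: `count_cached_copy_from` is a hypothetical helper, and it assumes the log wording matches the messages quoted in this thread, which may differ between kaniko versions.

```shell
#!/bin/sh
# Hypothetical helper: count COPY --from steps that kaniko served from cache,
# based on the "Using caching version of cmd" message quoted in this thread.
count_cached_copy_from() {
    # $1: path to a saved kaniko build log
    # grep -c prints 0 but exits non-zero when nothing matches, hence || true.
    grep -c 'Using caching version of cmd: COPY --from=' "$1" || true
}

demo_dir=$(mktemp -d)
# Log line from the original report (affected build).
cat > "$demo_dir/broken.log" <<'EOF'
INFO[0069] Using caching version of cmd: COPY --from=builder /build/_build/html /usr/share/nginx/html
EOF
# Log line from the build that includes the fix.
cat > "$demo_dir/fixed.log" <<'EOF'
INFO[0390] No cached layer found for cmd COPY --from=build /usr/src/app/target/app-0.0.1.jar /home/user/app.jar
EOF

broken_hits=$(count_cached_copy_from "$demo_dir/broken.log")
fixed_hits=$(count_cached_copy_from "$demo_dir/fixed.log")
echo "cached COPY --from steps: broken=$broken_hits fixed=$fixed_hits"
rm -rf "$demo_dir"
```

A non-zero count after changing the files that stage 0 produces suggests the stale-cache behavior is still present.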
@cvgw works for me too. Thanks!
Thanks everyone for the help solving and verifying this issue. Closing, but feel free to reopen if it isn't 100% fixed
Any info on when the release version lands (0.15.0?)? I can test a commit-based version, but I won't be able to use a non-release version in production.
@stroncium plan is to release v0.15.0 this week