Kaniko: Design caching

Created on 16 Apr 2018 · 16 comments · Source: GoogleContainerTools/kaniko

priyawadhwa

👍8

Most helpful comment

In my CI setup, which currently uses dind, I use docker build --cache-from for caching: I pull the current version from the registry before starting the build, which greatly improves build times.

The cache covers all the layers, not just the base, which is important because many layers don't change with every PR.

It would be great to see kaniko support something similar. Personally, I think the Docker registry is the best place to store the cache.

mcfedr on 23 Aug 2018

👍3

All 16 comments

these PRs might be helpful to look at:
https://github.com/GoogleCloudPlatform/container-diff/pull/118/files
https://github.com/kubernetes/minikube/pull/1881/files

aaron-prindle on 17 Apr 2018

👍1

Caching what?

r2d4 on 17 Apr 2018

@r2d4 I'm guessing the layers, as that was my first thought, though I haven't yet looked under the hood.

shadycuz on 18 Apr 2018

Yup, the layers of the base image

priyawadhwa on 18 Apr 2018

Where? :)

It wouldn't make sense to cache locally when running on k8s. Caching remotely would be the same as pushing/pulling to GCR.

r2d4 on 18 Apr 2018

Caching "locally" in a PD could make sense.

dlorenc on 19 Apr 2018

As we work on caching, I'd like to focus on cleanly separating base image layer caching from RUN command caching.

I think most users want base image layers cached as part of a CI build, but probably don't want RUN commands to be cached (at least by default). Kaniko is designed to run in a CI system, not interactively during development.

Caching RUN commands can lead to unpredictable builds when dealing with multiple build hosts (not knowing whether that apt-get upgrade command will actually run or not).

In all of the images I maintain, I explicitly build them on clean hosts each time (using GCB) or force the build to run with a cleared build cache.

dlorenc on 19 Apr 2018

👍2
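
The separation described above can be pictured as a per-instruction policy: base image layers are eligible for caching by default, while RUN results are reused only when a user opts in. The following is a minimal sketch under that assumption. `CachePolicy`, `cache_run`, and `is_cacheable` are hypothetical names chosen for illustration; they are not kaniko flags or APIs.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CachePolicy:
    """Hypothetical policy object; the field names are illustrative only."""
    cache_base_layers: bool = True   # base image layers are immutable once pulled
    cache_run: bool = False          # RUN output can depend on the outside world


def is_cacheable(instruction: str, policy: CachePolicy) -> bool:
    """Decide whether one Dockerfile instruction may be served from a cache."""
    keyword = instruction.strip().split(maxsplit=1)[0].upper()
    if keyword == "FROM":
        return policy.cache_base_layers
    if keyword == "RUN":
        return policy.cache_run
    # Other instructions are left uncached in this sketch.
    return False


dockerfile = ["FROM debian:stable", "RUN apt-get update && apt-get -y upgrade"]
default_decisions = [is_cacheable(line, CachePolicy()) for line in dockerfile]
opt_in_decisions = [is_cacheable(line, CachePolicy(cache_run=True)) for line in dockerfile]
```

With the default policy only the FROM line is cacheable, so the apt-get upgrade step always runs; turning on `cache_run` makes both lines eligible.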

There are very few PV types that support ReadWriteMany, which I think would be required for caching.

https://kubernetes.io/docs/concepts/storage/persistent-volumes

r2d4 on 19 Apr 2018

Caching the base layers you use ahead of time on a ReadOnlyMany GCE PD would work, though.

r2d4 on 19 Apr 2018

Can we at least enable caching the layers for the FROM some-image:xyz? In our CI, we already have an awslinux+java+maven image that has many layers and takes 400-500 MB; downloading them each time would be slow and unnecessary. Another example: because Maven is very slow, we cache the dependencies by periodically preparing a base Docker image with the dependencies already present. While these could be solved differently, e.g. by mounting .m2 from outside, downloading the layers for every CI run takes some time even when the layers are fetched from fast, free storage such as S3.

mustafaakin on 5 May 2018

👍2
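
A base image layer cache of the kind requested here could work roughly as follows: before a build, check which layers of the FROM image already exist in a cache directory (for example a mounted volume) and download only the missing ones. This is a sketch under stated assumptions, not how kaniko does it. `ensure_layers` and `fake_download` are hypothetical, and the download callable stands in for a real registry client.

```python
import tempfile
from pathlib import Path
from typing import Callable, List


def ensure_layers(digests: List[str], cache_dir: Path,
                  download: Callable[[str], bytes]) -> List[str]:
    """Make every layer present in cache_dir; return the digests that were downloaded."""
    cache_dir.mkdir(parents=True, exist_ok=True)
    fetched = []
    for digest in digests:
        # Store each layer under a file name derived from its digest.
        blob = cache_dir / digest.replace(":", "_")
        if not blob.exists():
            blob.write_bytes(download(digest))
            fetched.append(digest)
    return fetched


def fake_download(digest: str) -> bytes:
    """Stand-in for a registry pull; returns placeholder bytes."""
    return digest.encode()


with tempfile.TemporaryDirectory() as tmp:
    base_layers = ["sha256:aaa", "sha256:bbb"]  # made-up digests
    first_run = ensure_layers(base_layers, Path(tmp), fake_download)
    second_run = ensure_layers(base_layers, Path(tmp), fake_download)
```

The first call downloads both layers; the second finds them in the directory and downloads nothing, which is the saving a warm base image cache would give a CI job.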

As an alternative cache storage, could an object store like S3/GCS/minio be considered? I think it solves both the ReadWriteMany PV complexity/availability problem (as caching should not require a full-blown filesystem) and the problem of ephemeral build pods in K8s.
An object store could be quite performant if provided by the cloud or run locally alongside the build pods.

And if necessary, local transient caching between command runs in the same container (for example, if building 2 similar images in a single CI step) could be implemented independently.

Also, I would argue that caching layers in an object store is not the same as pushing them into a container registry. First of all, a registry implies a certain structure/metadata and visibility of the content, and those should not be needed for a temporary layer cache. For example, in a CI system I'd like to avoid pushing image layers that result from building PRs to a central image registry where regular builds from master/tags are published. But I'd like to have a way to cache these layers in a fast object store local to the CI system to reduce PR build times and CI load.

himikof on 11 May 2018
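
To illustrate why a flat object store is enough for this, the sketch below models a layer cache as plain key/value PUT and GET operations with no manifests, tags, or repository structure. It is an assumption-laden illustration: `ObjectStoreLayerCache`, its `prefix`, and the example keys are hypothetical, and the in-memory dict stands in for a bucket in S3, GCS, or minio.

```python
from typing import Dict, Optional


class ObjectStoreLayerCache:
    """Layer cache over a flat key/value store (a dict stands in for a bucket)."""

    def __init__(self, prefix: str = "layer-cache/") -> None:
        self._prefix = prefix
        self._objects: Dict[str, bytes] = {}

    def put(self, cache_key: str, layer: bytes) -> None:
        # A real client would issue an object PUT here.
        self._objects[self._prefix + cache_key] = layer

    def get(self, cache_key: str) -> Optional[bytes]:
        # A real client would issue an object GET and map "not found" to None.
        return self._objects.get(self._prefix + cache_key)


cache = ObjectStoreLayerCache()
cache.put("key-for-run-mvn-package", b"layer bytes from a PR build")
hit = cache.get("key-for-run-mvn-package")
miss = cache.get("key-for-unseen-step")
```

Because nothing here is published as an image, layers from PR builds can be cached without appearing in the registry that holds master and tagged builds.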

Will there be support for HTTP_PROXY_HOST ENV for accelerating apt-get installs?

diclophis on 5 Jul 2018

To @himikof's point, caching in GCS/S3 would help CI jobs tremendously. That, combined with caching the layers locally in a persistent volume, would be perfect!

ekimia on 5 Aug 2018

👍3

@mcfedr I agree with you. kaniko could even check for cache hits _before_ pulling the layers and fetch them one by one until there is a mismatch. With docker, you first pull everything and then use --cache-from.

An important feature would be to be able to --cache-from multiple images (e.g. current branch / master), which can be useful in PRs.

Here's what I currently do in many of my GitLab CI pipelines (just to give folks a flavour of what I mean):

build image:
  stage: build-image
  image: docker:stable
  services:
    - docker:dind
  script:
    - echo $CI_REGISTRY_PASSWORD | docker login --username $CI_REGISTRY_USER $CI_REGISTRY --password-stdin
    - docker pull $CI_REGISTRY_IMAGE:$CI_COMMIT_REF_SLUG || docker pull $CI_REGISTRY_IMAGE:master || true
    - docker build -t $CI_REGISTRY_IMAGE:$CI_COMMIT_SHA --cache-from=$CI_REGISTRY_IMAGE:$CI_COMMIT_REF_SLUG --cache-from=$CI_REGISTRY_IMAGE:master --shm-size 512M .
    - docker push $CI_REGISTRY_IMAGE:$CI_COMMIT_SHA
    - docker tag $CI_REGISTRY_IMAGE:$CI_COMMIT_SHA $CI_REGISTRY_IMAGE:$CI_COMMIT_REF_SLUG
    - docker push $CI_REGISTRY_IMAGE:$CI_COMMIT_REF_SLUG

See script rows 2 and 3. $CI_COMMIT_REF_SLUG is our current branch, $CI_REGISTRY_IMAGE is something like registry.gitlab.com/group/repo. || true means that if the images I want to re-use cannot be pulled, the build carries on instead of crashing.

kachkaev on 23 Aug 2018

👍2
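
The idea of checking for cache hits layer by layer, across several cache sources, and stopping at the first mismatch can be sketched as below. This is a simplified model and not Docker's or kaniko's actual algorithm: `chain_keys` and `cached_prefix_length` are hypothetical helpers, each cache source is modelled as a set of known keys, and real keys for COPY/ADD would also need to cover file contents.

```python
import hashlib
from typing import Iterable, List, Set


def chain_keys(base_digest: str, instructions: Iterable[str]) -> List[str]:
    """Return one key per instruction; each key depends on everything before it."""
    keys = []
    parent = base_digest
    for instruction in instructions:
        parent = hashlib.sha256(f"{parent}\n{instruction}".encode()).hexdigest()
        keys.append(parent)
    return keys


def cached_prefix_length(keys: List[str], sources: Iterable[Set[str]]) -> int:
    """Count leading keys found in at least one source, stopping at the first miss."""
    sources = list(sources)
    hits = 0
    for key in keys:
        if not any(key in source for source in sources):
            break
        hits += 1
    return hits


base = "sha256:base"  # made-up base image digest
master_cache = set(chain_keys(base, ["RUN a", "RUN b", "RUN c"]))
branch_cache = set(chain_keys(base, ["RUN a", "RUN b", "RUN d"]))
new_build = chain_keys(base, ["RUN a", "RUN b", "RUN d", "RUN e"])

both_sources = cached_prefix_length(new_build, [branch_cache, master_cache])
master_only = cached_prefix_length(new_build, [master_cache])
```

Here the branch cache lets the new build reuse three steps, while master alone covers only two, which is why being able to name more than one cache source helps in PRs. Only the layers for the matched prefix would need to be fetched.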

I talked with @sharifelgamal, and we feel that the caching work done so far (see the README: https://github.com/GoogleContainerTools/kaniko#caching) covers this particular issue. We can open more issues if we want to add more specific caching (e.g. ADD command caching).