Hi,
I found out yesterday that in docker 1.10.1 the parent chain of an image isn't pulled anymore.
This means that on my bamboo agent I can no longer pull down the build cache for the images, so all my jobs rebuild every time and I get no time savings (sbt really does take a long time to run; I'd rather have the build cache for when the deps haven't changed).
Is it possible to reintroduce the ability to pull not just the top layer but the parent chain as well, please?
Thanks
Stephen.
With the new content-addressable storage, there is no "parent chain"; an image is a collection of layers, and those layers are directly linked to the image (i.e. no need to traverse the parent-images to collect the dependent layers).
AFAIK, the build cache of an image is now separate, i.e. you can only make use of the build cache on the machine that actually _built_ the image, because the build cache depends on both the instructions in the Dockerfile _and_ the build context (the files used during the build).
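The dependence on both inputs can be illustrated with a simplified model (this is not Docker's actual cache-key algorithm, just a sketch of the idea): any change to either the instruction or the files it references produces a different key, so a machine that never saw the original inputs cannot reproduce the cache entry.

```python
import hashlib

# Simplified illustration (NOT Docker's real algorithm): a build-cache key
# depends on the parent key, the Dockerfile instruction, and the content of
# the files the instruction pulls in from the build context.
def cache_key(parent_key: str, instruction: str, context_files: dict) -> str:
    h = hashlib.sha256()
    h.update(parent_key.encode())
    h.update(instruction.encode())
    for name in sorted(context_files):   # deterministic file order
        h.update(name.encode())
        h.update(context_files[name])    # raw file bytes
    return h.hexdigest()

base = cache_key("", "FROM ubuntu:15.10", {})
k1 = cache_key(base, "COPY . /app", {"app.py": b"print(1)"})
k2 = cache_key(base, "COPY . /app", {"app.py": b"print(2)"})
assert k1 != k2  # changed context file -> cache miss
assert k1 == cache_key(base, "COPY . /app", {"app.py": b"print(1)"})
```

Same instruction, different file contents: the keys diverge, which is why the cache cannot simply be transferred by pulling image layers.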
Did this change for you? I.e. were you previously able to docker pull an image on an empty machine, and see docker build skipping lines with "using cache..."?
Yeah, our build agents depend on the "using cache" behaviour to be fast. The build agents get destroyed at night and recreated in the morning, so they need to recreate the cache by pulling down the latest version of the images.
Just chiming in here to say that we (Highland team) also depended on this functionality and are doing really ugly things to retain caching features in face of this change.
+1 this makes CI builds very, very painful.
@dustinlacewell I'm curious about your _really ugly things_. Is that something you can share? I opened https://github.com/docker/docker/issues/20380 but I suppose the root cause is the same and I'd like to know a way to use cache even if it's very ugly.
+1 this broke caching functionality for us, and for our customers, any workarounds and fix ETA would be appreciated
+1. I'm one of those customers.
+1 another customer
Any thoughts @thaJeztah? Is there a way we can masquerade the build context? Previously if the relevant files for an ADD/COPY were identical it used the cache, even on another machine. How does the new image cache prevent caches from other machines from being used?
This is starting to make sense, so is there a way we can specify for image build layers to be included in the pull?
+1 another customer
+1 as this does increase build times significantly, and also slows down day-to-day work for anyone who spins up lots of disposable VMs during development. @dustinlacewell I would also be very interested in hearing your _very ugly things_. We attempted to tag and push each individual layer during the build process and push/pull those as a way to recreate the cache, but to no avail.
Following some internal convo here at Docker —
This is addressing a security issue; the associated threat model is: "as an attacker, I know that you are going to do FROM ubuntu and then RUN apt-get update in your build, so I'm going to trick you into pulling an image that _pretends_ to be the result of ubuntu + apt-get update, so that next time you build, you will end up using my fake image as a cache instead of the legit one."
With that in mind, we can start thinking about an alternate solution that doesn't compromise security.
That makes sense. It seems like we should be able to come up with a sensible middle ground that does not compromise security using notary, or at least, in the meantime, allow users to bypass the security protections in situations where they are confident of the source of the layers.
Surely if an attacker has access to where you are building your images, you have bigger problems? Also, if they can fake an intermediate image, what stops them faking the final image?
After a discussion with a friend at work, I can see it from a different viewpoint.
So let's say we pull down evil/foo which is FROM ubuntu followed by RUN apt-get update except with a small surprise included in the image.
Subsequent builds using those same commands will be compromised.
Now if we base a build on evil/foo we get the same problem regardless of the intermediate images, but in that case we are trusting evil/foo as a whole; we shouldn't have to worry that merely downloading evil/foo will negatively affect subsequent builds of other images.
So, my proposal is: can we put trust at the per-registry level?
So I can say to docker: I trust that only I can put images into this specific registry, so you may download intermediate images from it as well, because only I have the ability to put them there in the first place. For public registries, only download the final image.
I propose adding support for loading parent chains in the load endpoint (this already works in legacy mode). I think docker load has somewhat different security properties than pull. Then we can provide an external tool for loading/saving build cache metadata without restarting the daemon. So in CI, you could do docker pull and then apply the build cache data on top of it.
Are there any known workarounds for this issue currently?
+1 We were using tar to be sure that our build context is always the same, so we could share cached layers. Now it's useless and we have to build everything from scratch on every machine. We are using the Jenkins EC2 plugin, so this means a complete rebuild of all our images multiple times per day. We are using a private registry with SSL, so we are sure what the layers are.
This broke caching functionality for us as well. Is the attack vector registry poisoning or a MITM on docker during a FROM? AFAIK docker securely pulls from the registry.
The security concern is understandable, but an option to allow trusting or pulling the cache would be a big win. People in this thread don't seem to have an issue with missing cache history for images pulled from the Docker Hub so much as between their own machines. Being able to pull the build cache from another docker host (or push and pull it from an internal registry, for example) would solve most people's problems. It would also avoid the security issue, as you are simply pulling it from a trusted host that originally followed the security practices. I do not see any security decrease in this practice.
For more details:
This is a complicated issue but one that most likely needs some sort of shared caching solution otherwise it requires more aggressive tagging and complex client infrastructure.
Right now we build all the images and distribute them to various machines. Some clusters use additional images, so those get built in that cluster. Before, it would pull down everything from the hub, then run build on all the Dockerfiles for the images on that cluster. 99% of these images were built previously, so it uses the cache and finishes quickly. Then it pushes all images to the registry; again, 99% were previously pushed, so very few new bits are pushed.
With the new system, obviously every build on any cluster that didn't do the originating build ends up rebuilding every image. It then also ends up re-pushing every image to the registry (which is a lot of overhead).
This could possibly be changed by updating the framework to build and push only the images that have changed, rather than all of them. This is not easy, however, as it means additional tagging for every Dockerfile revision or a client-side database of the 'current' image id. Essentially, to replicate the old behavior one needs a client-side build tool that hashes the Dockerfile build context the way the build daemon does.
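A hypothetical client-side check along those lines might hash the build context and only rebuild when the digest changes (the function names and approach here are illustrative, not an existing tool):

```python
import hashlib
import os

# Hypothetical client-side check: hash the build context (relative paths plus
# file bytes) and only rebuild/push when the digest has changed since the
# last recorded build of this image.
def context_digest(context_dir: str) -> str:
    h = hashlib.sha256()
    for root, _dirs, files in sorted(os.walk(context_dir)):
        for name in sorted(files):                       # deterministic order
            path = os.path.join(root, name)
            h.update(os.path.relpath(path, context_dir).encode())
            with open(path, "rb") as f:
                h.update(f.read())
    return h.hexdigest()

def needs_rebuild(context_dir: str, last_digest: str) -> bool:
    # Compare against the digest stored after the previous successful build.
    return context_digest(context_dir) != last_digest
```

A CI job could store the digest next to the pushed tag and skip the build entirely when nothing in the context changed, sidestepping the cache question for the unchanged-image case.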
This issue is effectively stopping me from using docker. Please add to "docker pull" an option to also pull build cache/intermediate images (at our own risk).
I'm new to docker, so I'm probably wrong, but I don't see any security issue here. If the image hash is now a secure hash instead of a UUID (for example SHA-2 512), the probability of collision is almost zero.
If an attacker could trick your docker client into pulling their image instead of yours, it is because it computes the same hash: either both are identical byte for byte, or they spent years of brute force finding a collision. Someone who could do that could also forge SSL certificates, for example, or full images. I don't see why this would be an attack on intermediate images but not a concern for final images.
Anyway, I'm all for an option that enables this kind of pull "at our own risk". Please.
i don't see any security issue there. If the image hash now is a secure hash instead of a UUID
Build cache is not based on the image IDs but uses a different method that tries to map Dockerfile commands to configurations.
But yes, for some cases like CI that only download a single image from a trusted source there is no issue. We have merged #21385 that will ship in v1.11 that can be used by external tools to import chains of image configurations. I'll try to start working on one of such tools soon.
@tonistiigi this is important for us; let us know if we can assist with the development of such tooling.
@tonistiigi would the behavior be to docker pull, docker save | docker load ? Or would we have to docker save from a hub and use docker load on the machine?
Build cache is not based on the image IDs but uses a different method that tries to map Dockerfile commands to configurations.
@tonistiigi Why doesn't the build cache use image IDs?
This problem is affecting us too. I hope this comment isn't OT, but I'd like to share the work-around that makes things somewhat manageable when using docker as part of a CI system.
In a nutshell we have a small cluster of EC2 instances that each run a Buildkite daemon. We've defined a build pipeline, basically a list of steps to be performed either in sequence or in parallel. The first step builds a docker image that we run the tests within, and uploads it to a private registry. All the other steps wait for it to finish, and then run in parallel: each of them downloads the image and runs the tests within a container.
Here's an example of our pipeline config:
https://github.com/MountainRoseHerbs/spree/blob/master/.buildkite/pipeline.yml#L2-L11
We use Buildkite's named job queues to have one worker listening to a queue for building containers (mrh-build-container), while all the other workers listen to the normal queue (mrh-build). You would think that one "builder worker" would be a bottleneck, but since it can use the cache layers aggressively we can often build new versions of containers in a few seconds. Also, pushing and pulling isn't usually that time consuming, since only the last few layers of the test image change between revisions.
Yes, it does take time to push the initial container and then pull it for each job but it's still not as time consuming as rebuilding for each job. I really wish we still had the pre-1.10 cache behaviour though, it was much nicer not having to pin the build job to a single worker.
I don't know if this thread is the appropriate place to answer further questions about this, so if anyone has any questions feel free to email me at the address in my profile.
I'll try to start working on one of such tools soon.
@tonistiigi Any news on this? :)
We have merged #21385 that will ship in v1.11 that can be used by external tools to import chains of image configurations. I'll try to start working on one of such tools soon.
Pardon if this is obvious, but how would one use this to share the build cache between machines?
Also, any updates on the tool mentioned?
@itamaro The patch allows the load endpoint to load in multiple image configurations, restoring their parent/child relationships. The parent chain is used by the build cache to find images for configuration comparison. You can also use the load endpoint to load configurations without layers if the layers were already downloaded with pull.
I've noticed a problem though on getting the image configurations from the API. I thought save endpoint would allow this but apparently there seems to be a bug on exporting multiple images. Because of the pre-v1.10 compatibility in this endpoint if you save multiple images they don't seem to share duplicate layers, making the whole process inefficient.
Thanks @tonistiigi !
So, sorry for being dense, but I'm still not sure I understand what using this would look like.
You mentioned the load & save endpoints & API's - does this mean that exporting and restoring parent/child relationships is something that can be done only when working with some low level API's, as opposed to docker save and docker load CLI commands?
Also, what is the relation between "parent chains" and the "build cache"? Does a "parent chain" contain the "build cache", or is it something that I need to copy between hosts somehow separately?
Does the bug you refer to affect only efficiency, or the ability to make this work altogether in Docker 1.10 / 1.11?
I'm sure I didn't get something right, because the following experiment I did failed:
On host A:
mkdir foo && cd foo
cat <<EOF>Dockerfile
FROM ubuntu:15.10
ARG DEBIAN_FRONTEND=noninteractive
RUN apt-get update -y
RUN apt-get install -y --no-install-recommends curl wget autoconf
RUN echo "Hello, world"
EOF
docker pull ubuntu:15.10
docker build -t silly/test:foo .
docker save -o silly.test.tar silly/test
Transfer the tar and Dockerfile to host B, then on host B:
docker load -i silly.test.tar
docker pull ubuntu:15.10
cd foo
docker build -t silly/test:foo .
This still resulted in a full rebuild on host B...
Both hosts are Linux hosts (ubuntu 14.04.4) with Docker 1.11.1.
@itamaro docker save doesn't save the parent images, but if you save multiple images, e.g. docker save img1 img1parent, then loading that tar back will create both images and restore the parent-child relationship. Build cache essentially works by predicting an image configuration and finding an existing image that has the same configuration. The extra requirement is that the previous Dockerfile command must be cache-matched with the parent of that image.
The bug only affects efficiency, but it seems quite bad: every layer would need to be exported separately. Maybe the solution is to use the data in the graph directory directly instead of going through the remote API.
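The prediction-and-match behaviour described here can be sketched as a toy model (illustrative only, not Docker's actual implementation): each lookup keys on the parent image and the command, and the chain requirement means a hit for one step becomes the parent for the next lookup.

```python
# Toy model of build-cache matching (not Docker's real code): for each
# Dockerfile step, predict the config (parent image + command) and look for
# a local image whose config matches.
def find_cached(images, parent_id, command):
    """images: list of dicts like {'id', 'parent', 'command'}."""
    for img in images:
        if img['parent'] == parent_id and img['command'] == command:
            return img['id']
    return None

local_images = [
    {'id': 'img-base', 'parent': None, 'command': 'FROM busybox'},
    {'id': 'img-a', 'parent': 'img-base', 'command': 'RUN mkdir this-is-a-test'},
    {'id': 'img-b', 'parent': 'img-a', 'command': 'RUN echo "hello world"'},
]

steps = ['RUN mkdir this-is-a-test', 'RUN echo "hello world"']
parent = 'img-base'
for step in steps:
    hit = find_cached(local_images, parent, step)
    if hit is None:
        break       # cache miss: this step and all later ones rebuild
    parent = hit    # chain requirement: next lookup uses the hit as parent
assert parent == 'img-b'  # both steps were found in the cache
```

This is why a broken parent link anywhere in the chain (a `<missing>` parent after load) invalidates every later step, even when the layer data itself is present.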
@tonistiigi thanks for clarifying how it would have to function. Certainly that would be tough to implement, and it would require a whole separate distribution channel beside the registry to move the image cache history between machines in a platform. At best, it would be good if one could run an 'export history' command on a build machine (exporting a JSON/XML file with the graph data and corresponding image names), and then, as long as the target machine has pulled the same images down, use that history to match the graph entries and restore the history on the target machine. While this requires infrastructure to distribute the 'export history' file, it would at least avoid having to transfer all the image data through a separate channel as well.
I've tried to use EBS to cache docker build artefacts, but docker doesn't seem to pick them up either. Does anyone know what directory to cache and when to mount it?
@dustinlacewell what are the ugly things you're doing?
+1 it would be great to get some official clarity on what is happening with this issue. It seems very strange to me to break functionality that users depend on, and for this issue to still be open with no conclusions 3 months later. It's one thing to lock down default behaviour in the name of security, but it is not OK not to provide a workaround when people control their own infrastructure.
@tonistiigi I have to wonder what the point of the save and load commands is if caching is not preserved. I have to admit I don't follow your comments at all about how you actually save and load all the image layers; can you provide a real example? Even if it is inefficient, it may still be better than all of us having to rebuild our complete images on every build.
@netproteus Here is a quick example of how this works
Given this Dockerfile
FROM busybox
RUN mkdir this-is-a-test
RUN echo "hello world"
run docker build -t caching-test .
Then we can see the layers comprising the image with docker history caching-test
3e4a484f0e67 About an hour ago /bin/sh -c echo "hello world" 0 B
6258cdec0c4b About an hour ago /bin/sh -c mkdir this-is-a-test 0 B
47bcc53f74dc 9 weeks ago /bin/sh -c #(nop) CMD ["sh"] 0 B
<missing> 9 weeks ago /bin/sh -c #(nop) ADD file:47ca6e777c36a4cfff 1.113 MB
The change to save/load in 1.11 preserves the relationship between parent and child layers, but only when they are saved together via docker save. We can see the parent of the final test image by running docker inspect caching-test | grep Parent.
$ docker inspect caching-test | grep Parent
"Parent": "sha256:6258cdec0c4bef5e5627f301b541555883e6c4b385d0798a7763cb191168ce09",
This is the second-to-top layer from our Docker history output.
In order to recreate the cache using save and load, you need to save out all of the images and layers that are referenced as parents. In practice this typically means that you need to save each layer, as well as the FROM image, in the same command.
docker save caching-test 6258cdec0c4b busybox > caching-test.tar -- note that we can also give the layer names instead of IDs to the save command.
Let's purge everything and then reload the image from the tar file.
docker rmi $(docker images -q). Confirm that no images exist.
Then run docker load -i caching-test.tar. If you look at the images, you'll see busybox, and then caching-test. Running docker history caching-test will show you the exact same output as when the image was initially built. This is because the parent/child relationships were preserved via save and load. You can even run docker inspect caching-test | grep Parent and see the exact same ID given as the parent layer.
And running a rebuild of the same Dockerfile will show you that the cache is being used.
Sending build context to Docker daemon 5.391 MB
Step 1 : FROM busybox
---> 47bcc53f74dc
Step 2 : RUN mkdir this-is-a-test
---> Using cache
---> 6258cdec0c4b
Step 3 : RUN echo "hello world"
---> Using cache
---> 3e4a484f0e67
Successfully built 3e4a484f0e67
If we changed the last line in the Dockerfile to echo "Hello, world!" instead, you will see the cache used for the mkdir command, and then a new layer ID for the last command.
It's a bit of a pain, but you can get nearly the same result with save/load as with the remote registry pull.
You can save the .tar file to S3 or EBS, and then share it among build machines. I've tested this with S3 and machines in multiple AZs on Digital Ocean and had no issues.
The problem I see is that this requires a whole secondary infrastructure to do what the registry already does. For example, what if the host you want to restore history on already has half of the layers? You are now transferring a bunch of extra data to the host. In addition, when you go from one image to several dozen, the process becomes more complex: you have to make sure everything is restored correctly with minimal duplication (or just ship one giant tar). If you change only a small thing on the master builder and then need to deploy all the images to the rest, unless you do smart diffing you are now transferring gigs of data for each change.
It would seem that the ability to simply export the build cache metadata on one host and then import it on multiple other hosts that already have all the images (through a registry) would be very data-efficient and effective (the metadata itself is probably not very big).
I'm working on a tool to generate stripped history files:
# Requires the docker-py client library (docker.Client is the pre-2.0 API).
import json
import tarfile
import itertools
import io
import codecs

from docker import utils as docker_utils
import docker

writer = codecs.getwriter('utf8')


def add_file(tar_file, name, obj):
    # Serialize obj as JSON and add it to the tar under the given name.
    info = tarfile.TarInfo(name)
    with io.BytesIO() as f:
        json.dump(obj, writer(f))
        info.size = f.getbuffer().nbytes
        f.seek(0)
        tar_file.addfile(info, f)


def write_tar(cli, tar_file):
    # Walk every local image's history and dump each ancestor's
    # configuration (but no layer data) into the tar.
    manifest = []
    count = itertools.count()
    for image in cli.images():
        for parent_image in cli.history(image['Id']):
            parent_image_id = parent_image['Id']
            if parent_image_id == '<missing>':
                continue
            inspection = cli.inspect_image(parent_image_id)
            json_file = "{}.json".format(next(count))
            add_file(tar_file, json_file, inspection)
            manifest_entry = {
                'Config': json_file,
                'RepoTags': None,
                'Layers': [],
            }
            parent = inspection.get('Parent')
            if parent:
                manifest_entry['Parent'] = parent
            manifest.append(manifest_entry)
    add_file(tar_file, 'manifest.json', manifest)


def main():
    cli = docker.Client(**docker_utils.kwargs_from_env())
    with tarfile.open('out.tar', 'w') as tar_file:
        write_tar(cli, tar_file)


if __name__ == "__main__":
    main()
docker save caching-test $(sudo docker history -q caching-test | tail -n +2 | grep -v \<missing\> | tr '\n' ' ') > caching-test.tar
nice @pmbauer, super handy. It's probably helpful to also strip out all the <missing> layers from the layer ID list as well, otherwise docker save will return an error.
@pmbauer I don't want the actual images, I can distribute those through a container registry
@rheinwein yep; grep -v as appropriate. But from a build server, I wouldn't expect missing layers.
edit: actually, that statement of mine is wrong, fixed
I appreciate everyone working on this. Both approaches are interesting.
From a UX point of view, it would be nice if it was just an extra flag for push, pull, save, and load.
I'm not super worried about the security since this should be something pushed to a docker registry under my control.
Maybe if the builder generated/loaded a key pair and signed the history it could be distributed through the registry by default. Then you just add the keys of trusted builders to other builders, or use a trust network.
I actually think that trying to get better cache usage for CI agent image builders by pulling, building and pushing is the wrong approach for anything other than the latest tag. Imagine you are running builds which get a sequential build number. When a build succeeds, the image is pushed to the registry and tagged with that build number. To get proper cache usage, your agent would need to find the previous build that succeeded, pull that image, then build & push the new image - i.e. the source tag of the image cache layers and the destination tag are different.
I'm actually interested in doing some work on having an image cache that can be shared between all CI workers. When building an image on a worker, at each step of the dockerfile, after docker checks the local graph for a matching image, it could also check this remote service for it (and if found, pull it). If it didn't find it, it could build it itself and then push the image it created.
Doing something like this, you wouldn't need to implement any silly logic in your CI builds to pull down totally unrelated images for the sole purpose of getting cache layers.
I think there'd be an interesting discussion to have as to whether the central cache is itself a docker registry, or something else. But you would specify the location of this "cache server" yourself when issuing something like docker build --cache-from https://cache-registry.ci.mycompany.com, so there is no risk that someone evil has gone and uploaded a cache layer which claims to be "FROM ubuntu; RUN apt-get update".
The main difference between the existing docker registry and what this hypothetical cache server would need to do is essentially what GetCachedImage in daemon/image.go is doing. If people think this is a good idea, I'd like to have a swing at prototyping something.
@KJTsanaktsidis I think making it a registry would be quite limiting, and having it as a separate service would be better, as you say lifecycle is very different as you can always reconstruct it.
Except the build cache and images are very closely related, so separating them into different services seems not only overly complicated but not very worthwhile. The build cache essentially says: a Dockerfile from image X, when command Y was run, resulted in image Z. The layers of those images are already stored in the registry (X/Z), so largely we are talking about that metadata. Whatever method is decided on should not require transferring images through a third-party channel other than the registry; the registry is there for a reason, and it already handles de-duplication, pulling only changes, etc. If you require the entire image to be transferred another way, you end up duplicating all that work. Instead, if the metadata were easy to export/import (aside from the image itself), that would be a great start. If you then want to build a metadata service, that's fine, but it becomes more complex (which also means more time to code, and more time before a fix is in place).
@mitchcapper this does make sense as a separate service: it just maintains history and image ids, while the main registry holds the actual data. This also lets me deploy this cache tool next to an existing registry that doesn't have this feature.
Hey all, I wrote a blog post on a method to distribute cache post docker 1.10, hope you find it helpful!
Distributing Docker Cache Across Hosts
@anandkumarpatel Good post and workarounds for the current version! Hopefully the future will see this fully resolved.
I may be coming from left field but what I would like to see is the ability to add an identifier to a Dockerfile that remains unchanged as the Dockerfile grows and is rebuilt, such that the identifier defines a caching context that can be reused between builds; e.g.,
FROMCACHE myaccount/repository:tag
This way independent Dockerfiles that do not relate to each other would not share cache entries and could not interfere.
The cache could be extended into the remote repository and delivered with pull as a way of reusing the cache.
Alternatively, I would be happy to have a command switch on docker build that allowed me to specify an image name to use as the layer cache for the build
@anandkumarpatel - there seems to be one MAJOR gotcha: you need to be able to save the complete image chain, and unless you built ALL image layers for a given image on the same machine, you can't recreate the build chain. I.e. if you are building an image from any of the official images, you are out of luck 👎
EDIT: I read further up this thread and found the post from @pmbauer - so it looks like you can just filter out the <missing> tags and the docker save/load method works 👍
We've been negatively affected by this change as well. Please provide an option to restore the previous caching behavior thru docker pull.
Big thanks to @pmbauer - I've been able to automate loading the build cache into my workflow using S3 to store the build cache and then updating the build cache after each run.
But would be much nicer if we could just docker pull...
I just want to add that Travis-CI is still using Docker 1.8.2, so most projects are not yet affected by this issue. I actually found this issue when my build times went from 5 minutes to 1h+ after upgrading to Docker 1.10.1 on Travis.
If/when they upgrade everyone to >=1.10.1, this will cause massive pain to many, many users because up until now it was standard practice to pull then build.
The hacks using save/load are really ugly, make Docker much less practical to use, and may not be much better for security anyway. I think something really needs to be done here, either a new flag on docker pull or any other simple workaround that could make it possible to pull caches easily again, at least on CI systems.
Codeship engineer here -- we also stayed on 1.9 for much longer than we normally would have to avoid breaking caching for our users. We completely rewrote our remote caching system to rely on save and load instead of a registry, using the behavior introduced in https://github.com/docker/docker/pull/21385. After this change was in place, we were able to update our build machines to 1.11 with no problem.
Relying on save and load is very clunky and not very performant, as previously mentioned. In the best case, it is on par with the registry pull, and in the worst case, the save/load can take an extra few minutes per build.
Allowing parent layer metadata to be saved for a layer, regardless of whether the parent layer is in the save command, would be a huge win for those of us working on CI/remote systems. I'm not sure if this goes against the security/naming conventions that have been outlined for 1.10+.
This change would allow us to save the FROM image separately from the rest of the image, and give us more flexibility for optimizing saving and loading.
For example, given this Dockerfile:
FROM busybox
RUN mkdir test-dir
RUN echo "Hello World!"
We get this image:
IMAGE CREATED CREATED BY SIZE COMMENT
4956c22042b4 11 seconds ago /bin/sh -c echo "Hello World!" 0 B
ee0bec568dfb 12 seconds ago /bin/sh -c mkdir test-dir 0 B
2b8fd9751c4c 3 weeks ago /bin/sh -c #(nop) CMD ["sh"] 0 B
<missing> 3 weeks ago /bin/sh -c #(nop) ADD file:9ca60502d646bdd815 1.093 MB
The busybox layer is saved as the parent of ee0bec568dfb
$ docker inspect ee0bec568dfb | grep Parent
"Parent": "sha256:2b8fd9751c4c0f5dd266fcae00707e67a2545ef34f9a29354585f93dac906749",
But if busybox is saved as busybox.tar, and the other two layers are saved independently, the parent metadata is not saved for ee0bec568dfb, even though it was present in the original image I saved. Because of this "missing link", the rebuild of the image can't use the local cache.
```
$ docker load -i busybox.tar
Loaded image: busybox:latest
$ docker load -i cache-test.tar
d2700560cfb2: Loading layer [==================================================>] 1.536 kB/1.536 kB
Loaded image ID: sha256:4956c22042b480a46cf29553748426c5d0b5d7d416db3c0f8ac5b475a16a522c
Loaded image ID: sha256:ee0bec568dfb9cb23e64307503827b47e27ecfc8abb71446a43d9861406774c5

$ docker history 4956c22042b4
IMAGE               CREATED             CREATED BY                                      SIZE       COMMENT
4956c22042b4        6 minutes ago       /bin/sh -c echo "Hello World!"                  0 B
ee0bec568dfb        6 minutes ago       /bin/sh -c mkdir test-dir                       0 B
<missing>           3 weeks ago         /bin/sh -c #(nop) CMD ["sh"]                    0 B
<missing>           3 weeks ago         /bin/sh -c #(nop) ADD file:9ca60502d646bdd815   1.093 MB

$ docker inspect ee0bec568dfb | grep Parent
"Parent": "",
```
It would be awesome to rely on Docker for this behavior instead of having to write some custom implementation, as I think a lot of users would benefit from it.
cc @chesleybrown
FYI it happened today, Travis rolled forward (unannounced) to Docker 1.12:
https://blog.travis-ci.com/2016-08-09-outage-gce-images
This is going to suck for many users until there is a reliable way to pull caches.
Edit: #24711 looks like the solution!
Re Travis, I've decided to simply use `docker save ci-image:$TAG | gzip > $TAG.tar.gz` and make use of the Travis cache for now. It won't cover all cases, but should be sufficient in the meantime.
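For anyone copying that approach, the restore half is a matching gunzip piped into `docker load`. A sketch of both halves, assuming a CI-persisted cache directory (`CACHE_DIR`, `TAG`, and the `ci-image` name are placeholders; the `run` wrapper only echoes the commands when no docker CLI is available):

```shell
#!/bin/sh
# Sketch: persist a built image into a cached directory and restore it
# at the start of the next build.
CACHE_DIR="${CACHE_DIR:-${TMPDIR:-/tmp}/docker-cache}"  # placeholder cache location
TAG="${TAG:-latest}"

run() {
    echo "+ $*"
    if command -v docker >/dev/null 2>&1; then "$@" || true; fi
}

mkdir -p "$CACHE_DIR"
# Save step (end of a successful build):
run sh -c "docker save ci-image:$TAG | gzip > $CACHE_DIR/$TAG.tar.gz"
# Restore step (start of the next build):
run sh -c "gunzip -c $CACHE_DIR/$TAG.tar.gz | docker load"
```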
https://github.com/tonistiigi/buildcache is a tool that allows exporting the build cache so that it can be restored later after `docker pull`. It works on v1.12 through the API only, and on v1.11 if you can point it to a graph directory.
There is a --cache-from proposal in https://github.com/docker/docker/pull/24711#issuecomment-237666954 that we think can be implemented without compromising on cache poisoning and content addressability.
@tonistiigi Thanks for a great looking easy to use tool! Will try to get this into our build flow shortly.
@tonistiigi's tool works very well :) We store the cache files by the image's ID hash (`docker images -q image:tag`), so there is no confusion in the automation if one is missing. The old build conveniences are back. Thanks!
Going to close this as #26839 was merged. Please check it out and report issues if you find any.
@tonistiigi I might be misunderstanding things, but how would #26839 solve not having a build cache on a CI machine? All the build jobs would need to parse the Dockerfile to determine which image is in the FROM line and append it to the docker build command?
@simonvanderveldt The image specified in `--cache-from` is the image from the previous CI build. It doesn't have anything to do with the FROM line in the Dockerfile. CI can pull it with a regular `docker pull`, like it could before v1.10.
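Spelled out, the CI flow being described is roughly the following (a sketch; the `myorg/app` image name is hypothetical, and the `run` wrapper only echoes the commands when no docker CLI is available):

```shell
#!/bin/sh
# Sketch of a --cache-from CI flow (Docker 1.13+):
#  1. pull the image produced by the previous CI run,
#  2. build with --cache-from pointing at it,
#  3. push the result so the next run can use it as cache.
IMAGE="myorg/app:latest"   # hypothetical image name

run() {
    echo "+ $*"
    if command -v docker >/dev/null 2>&1; then "$@" || true; fi
}

run docker pull "$IMAGE"
run docker build --cache-from "$IMAGE" -t "$IMAGE" .
run docker push "$IMAGE"
```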
@tonistiigi OK, thanks for the clarification. We'll give it a try, let's see how well it works :)
I think it would be much better to whitelist registries, e.g.

```
--history-whitelist 'mycustom.example.com/frontend' 'ubuntu' 'mycustom.example.com/backend'
```
That's a cool feature, but I don't understand how it doesn't conflict with your earlier security concerns... (maybe I need more coffee?)
I agree with Thomas above that this ticket would be better served by whitelisting registries (with a massive warning if you whitelist Docker Hub!).
And, as a bonus, in a way that is global to Docker, so I don't have to change all projects.
It used to be great that I was able to select a layer from any image and use it as a starting point. Currently, I am given an image that has 4 layers to be stripped off to get to the original base image. The original image is not reconstructable in any other way.
I'll go back to Docker 1.9 and do it there by simply tagging the given layer.
I couldn't find a way to do this with this new system. Does anyone have some advice?
@andrask you still can: if the image was built locally, the intermediate layers are still stored locally as images during `docker build`. Those images are not _distributed_, though, when doing `docker push`:
```
$ docker build -t foo .
Sending build context to Docker daemon 2.048 kB
Step 1/3 : FROM alpine
 ---> baa5d63471ea
Step 2/3 : RUN echo "step-two" > foobar
 ---> Running in dac42a660616
 ---> 8d8f7ba114a1
Removing intermediate container dac42a660616
Step 3/3 : RUN echo "step-three" > foobar
 ---> Running in fc1292ec6183
 ---> 401c84521cea
Removing intermediate container fc1292ec6183
Successfully built 401c84521cea

$ docker run --rm 8d8f7ba114a1 cat foobar
step-two
$ docker run --rm 401c84521cea cat foobar
step-three
```
@thaJeztah Unfortunately, this image was built months ago. No one has the build any more. We are left with a descendant image that has all the original content but on lower layers.
With docker 1.8.3 I have just downloaded the image from the registry and tagged the given layer.
Is anything like this possible with the new setup?
@andrask the _layers_ are all there; if you run `docker save -o image.tar <image>`, you'll get an archive containing all the image and layer data. I'm not sure how easy it is to reconstruct an image from previous layers; I haven't tried.
@thaJeztah Thanks for the info. Though, I (and probably many others) would highly appreciate if there was some guidance on how this can be done. Creating the image descriptor by hand, removing the unneeded layer info from the json configs would probably work, I guess.
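As a starting point for that manual route: a `docker save` archive contains a `manifest.json` that maps each image config to its layer tars, so you can at least enumerate what's there before hand-editing anything (a sketch; `image.tar` is a placeholder path, and the extraction only runs if the file exists):

```shell
#!/bin/sh
# Sketch: inspect the contents of a `docker save` archive.
ARCHIVE="${ARCHIVE:-image.tar}"   # placeholder path to a saved image

if [ -f "$ARCHIVE" ]; then
    tar -tf "$ARCHIVE"                # layer dirs, per-image config json, manifest.json
    tar -xOf "$ARCHIVE" manifest.json # which layer tars belong to which image
else
    echo "no archive at $ARCHIVE; run 'docker save -o $ARCHIVE <image>' first"
fi
```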
@andrask you are talking about a small corner case, I believe. Using the tool above, you can already save the metadata for an image after it is built, with all the layers, and transfer that to any system you need. Your case is one where you have lost the original build cache and do not want to rebuild it locally on one machine to get the history needed for distribution. Most of the pain points described above don't match that case.
@mitchcapper I'm not sure if this is a corner case. I think it easily falls in line with one of the comments above:
> Allowing parent layer metadata to be saved for a layer, regardless if the parent layer is in the save command, would be a huge win for those of us working on CI/remote systems.
Reusing parent layers used to be ridiculously easy. It would be good if we could get some comparably easy way to do it now.
PS: just to make my case clearer
It is impossible to rebuild the base from the Dockerfile, as the 3rd-party dependencies have changed significantly since the base was last built eight months ago. The tags for my base image have been overwritten, and I can only restore them from a descendant image.
With Docker 1.8 I simply pulled the descendant image, tagged the base layer and I was done.
With Docker 1.10+ I'd need to save, then manually construct the base image descriptor and reload it. Doable but sad that it's far more complex.
I tried to use the mentioned method with --cache-from, however it doesn't work as expected.
It does indeed get some more steps from the cache, but not all of them.
The case for me is that there are a few people working on an image (weighing about 2 GB at the moment), and re-building most of the image on any change by any of those people, then re-pushing it, is a big pain.
@andrask obviously it's too late, but it's good practice to keep 3rd-party dependencies mirrored in your own infrastructure :) There is NO GUARANTEE that even a huge site (like Launchpad, for downloading DEBs) won't go down for a period of time. Plus it obviously saves a lot of time when doing `wget -O - 500mb-sources-file.tar.gz | tar xzf - && configure && make/install && rm -rf sources` in one step.
@kbiernat

> I tried to use the mentioned method with --cache-from, however it doesn't work as expected.
> It does indeed get some more steps from the cache, but not all of them.
Have you opened an issue for this?
@tonistiigi no :) I'll try to investigate the reasons and will open an issue with full description of what the switch changes for me, if needed,
thank you for your interest.
I have been trying to make use of the `--cache-from` option in Docker 1.13, with no luck so far; any help would be really appreciated. These are the steps I have been following:

Dockerfile1:

```dockerfile
FROM centos:7
RUN echo "Step1"
RUN echo "step2"
```

```
docker build -t kish0509/test1:latest .
docker push kish0509/test1:latest
```
Dockerfile2:

```dockerfile
FROM centos:7
RUN echo "Step1"
RUN echo "step2"
RUN echo "step3"
RUN echo "step4"
```
On the second machine I have been trying to make use of the cache:

```
docker pull kish0509/test1:latest
docker build --cache-from kish0509/test1:latest -t kish0509/test2:latest .
```

It doesn't make use of the cache for the first two instructions; it runs each instruction again. Am I missing anything?
@kish3007 What is the output of

```
docker history --no-trunc registry/test1:latest
docker history --no-trunc registry/test2:latest
```
```
$ docker history --no-trunc kish0509/cachtest:v1-latest
IMAGE                                                                     CREATED        CREATED BY                                                                                         SIZE
sha256:3fa7c14f1eb1c5fdd9175de84dcbd730d809febc5b7ee400c211e291e16124e0   2 hours ago    /bin/sh -c echo "step 2"                                                                           0 B
<missing>                                                                 2 hours ago    /bin/sh -c echo "step1"                                                                            0 B
<missing>                                                                 2 months ago   /bin/sh -c #(nop) CMD ["/bin/bash"]                                                                0 B
<missing>                                                                 2 months ago   /bin/sh -c #(nop) LABEL name=CentOS Base Image vendor=CentOS license=GPLv2 build-date=20161214     0 B
<missing>                                                                 2 months ago   /bin/sh -c #(nop) ADD file:940c77b6724c00d4208cc72169a63951eaa605672bcc5902ab2013cbae107434 in /   192 MB
<missing>                                                                 6 months ago   /bin/sh -c #(nop) MAINTAINER https://github.com/CentOS/sig-cloud-instance-images                   0 B

$ docker history --no-trunc kish0509/cachtest:v2-latest
IMAGE                                                                     CREATED          CREATED BY                                                                                         SIZE
sha256:31919b97f2dcd7409d72aa18727eb72acd4a4565a639d016a3866c27fc8d8a07   30 seconds ago   /bin/sh -c echo "step 4"                                                                           0 B
sha256:21a9a56449d5ee4f5c6f927b9891fb400243609920d6388dd2a3a96625a21a83   32 seconds ago   /bin/sh -c echo "step 3"                                                                           0 B
sha256:14d89a0f37b04ce8dc20d2286d7698f7c8f45469a1a1279ead0960bd75df7fd7   34 seconds ago   /bin/sh -c echo "step 2"                                                                           0 B
sha256:08e6a91d1b4c163fc3c1241cf88903d811b830837026276c42300d019af10198   36 seconds ago   /bin/sh -c echo "step1"                                                                            0 B
sha256:67591570dd29de0e124ee89d50458b098dbd83b12d73e5fdaf8b4dcbd4ea50f8   2 months ago     /bin/sh -c #(nop) CMD ["/bin/bash"]                                                                0 B
<missing>                                                                 2 months ago     /bin/sh -c #(nop) LABEL name=CentOS Base Image vendor=CentOS license=GPLv2 build-date=20161214     0 B
<missing>                                                                 2 months ago     /bin/sh -c #(nop) ADD file:940c77b6724c00d4208cc72169a63951eaa605672bcc5902ab2013cbae107434 in /   192 MB
<missing>                                                                 6 months ago     /bin/sh -c #(nop) MAINTAINER https://github.com/CentOS/sig-cloud-instance-images                   0 B
```
@kish3007 This seems to be #31189 . It should be fixed when you update to v17.03.0-ce
Thank you very much. This issue is fixed in v17.03.0-ce; however, is there a way to pull the image on the fly during `docker build`? Like this: `docker build --cache-from kish0509/cachetest:1 --pull -t kish0509/cachetest:2 .`
Right now we pull the version-1 image separately, and then we build the new image.
In case someone is going nuts with reusing layers as I did, the "trick" is to pass to `--cache-from` the image you are rebuilding (and have it pulled already) and ALSO the image it uses as base in the FROM.
Example: Dockerfile for image custom-gource:0.1:

```dockerfile
FROM base_image:2.2.1
RUN apt-get update && apt-get install gource
COPY myscript.sh /myscript.sh
```
In order to rebuild on another host without doing the apt-get again, you'll need to:

```
docker pull custom-gource:0.1
docker build --cache-from=base_image:2.2.1,custom-gource:0.1 . -t custom-gource:0.2
```
It might seem too obvious, but I'd been struggling with this for a long time until I realized that you need to include the base image too.