_Copied from https://github.com/docker/compose/issues/3148._
I discovered a few days ago that when building images, a cache generated from docker build will not be used by docker-compose build or vice versa. This happens both on my Mac laptop (Docker Machine, Docker v1.10.3 installed through Homebrew) and on an Ubuntu 14.04 server (Docker v1.10.2 installed through apt-get).
Version info for my laptop:
docker-compose version 1.6.2, build unknown
docker-py version: 1.7.2
CPython version: 2.7.10
OpenSSL version: OpenSSL 0.9.8zg 14 July 2015
Version info on my server:
docker-compose version 1.6.2, build 4d72027
docker-py version: 1.7.2
CPython version: 2.7.9
OpenSSL version: OpenSSL 1.0.1e 11 Feb 2013
It is simple to reproduce. The directory _tiny-app_ in https://github.com/SamirTalwar/docker-build-weirdness shows the issue:
$ docker build --tag=tiny-app .
Sending build context to Docker daemon 4.096 kB
Step 1 : FROM ruby
---> 0f58cbcb8dce
Step 2 : WORKDIR /app
---> Running in be80cb8a6324
---> 2972973d66d4
Removing intermediate container be80cb8a6324
Step 3 : COPY script ./
---> 77bc9f2b16eb
Removing intermediate container dc9beae186af
Step 4 : CMD ./script
---> Running in ea5f91334ea4
---> d69db9b758e0
Removing intermediate container ea5f91334ea4
Successfully built d69db9b758e0
$ docker-compose build
Building tiny-app
Step 1 : FROM ruby
---> 0f58cbcb8dce
Step 2 : WORKDIR /app
---> Using cache
---> 2972973d66d4
Step 3 : COPY script ./
---> abb5045392d4
Removing intermediate container 1b1450dd66eb
Step 4 : CMD ./script
---> Running in a0c7889ca68a
---> 354c8fcf4876
Removing intermediate container a0c7889ca68a
Successfully built 354c8fcf4876
From the COPY operation onwards, it no longer uses the cache.
I don't know for certain, but I'm pretty sure this is because the tarballs sent to the Docker server are different. Also in that repo are the uploaded tarballs and their hex dumps (hexdump -C), from both the CLI and Compose (through docker-py), captured through a fake HTTP server, _server.py_ in the repo. The only real differences seem to be the inclusion of the owner user and group names in the latter, and a flag in the file mode header (grep for 0100644 vs. 0000644) which I cannot for the life of me find documentation on.
I could, of course, be way off. And even if I'm on track, I can't be sure it'll be considered a bug here as opposed to the server.
Whatever the reason, I'd love it if we could reconcile these so the cache is used. Cheers!
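For anyone who wants to repeat the comparison without reading hex dumps, here is a minimal sketch that lists the header fields on which two tarballs disagree. It uses only Python's standard tarfile module; the names FIELDS, header_fields and diff_headers are made up for this example and are not part of docker-py or Compose.

```python
import io
import tarfile

# Header fields worth comparing; chosen for this example only.
FIELDS = ("mode", "uid", "gid", "uname", "gname", "mtime", "size")


def header_fields(archive):
    """Map each member name to the header fields of interest."""
    with tarfile.open(fileobj=io.BytesIO(archive), mode="r") as tar:
        return {m.name: {f: getattr(m, f) for f in FIELDS} for m in tar}


def diff_headers(left, right):
    """Yield (member, field, left value, right value) for each mismatch."""
    a, b = header_fields(left), header_fields(right)
    for name in sorted(set(a) & set(b)):
        for field in FIELDS:
            if a[name][field] != b[name][field]:
                yield name, field, a[name][field], b[name][field]
```

With the two captured tarballs read into bytes objects, list(diff_headers(cli_bytes, compose_bytes)) should show which fields differ for each file. Members that exist in only one archive are skipped here, so this is a starting point rather than a full diff.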
@SamirTalwar we are having the same issue -- did you find any workaround?
@fixe: I'm afraid not. Either the server or docker-py will need to be updated to accommodate the variance in the tarball.
docker-py uses the Python tarfile module to generate tarballs. It's possible there is an inconsistency between the two implementations (docker-py and the Docker engine). It could be interesting to try GNU tar.
I think the OP was able to track down the differences already:
The only real differences seem to be the inclusion of the owner user and group names in the latter, and a flag in the file mode header (grep for 0100644 vs. 0000644)
That would be a good place to start. If we can make those consistent I think it's very likely that the tarballs will match.
So we need to set tarinfo.uname and tarinfo.gname to the empty string (https://docs.python.org/2/library/tarfile.html#tarfile.TarInfo.uname), and figure out how to set that filemode flag correctly.
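A minimal sketch of the owner-name half of that idea, assuming nothing about how docker-py actually builds its context: blank uname and gname on each TarInfo before it is written. The function names strip_owner_names and build_context, and the sample owner values, are invented for this example.

```python
import io
import tarfile


def strip_owner_names(tarinfo):
    """Blank the symbolic owner fields so they are not written to the header."""
    tarinfo.uname = ""
    tarinfo.gname = ""
    return tarinfo


def build_context(files):
    """Pack a {name: bytes} mapping into an in-memory tar archive."""
    buf = io.BytesIO()
    with tarfile.open(fileobj=buf, mode="w", format=tarfile.GNU_FORMAT) as tar:
        for name, data in sorted(files.items()):
            info = tarfile.TarInfo(name=name)
            info.size = len(data)
            info.mode = 0o644
            # Hypothetical owner names, standing in for what a file on disk
            # would carry before normalisation.
            info.uname = "someuser"
            info.gname = "somegroup"
            tar.addfile(strip_owner_names(info), io.BytesIO(data))
    return buf.getvalue()
```

When adding files from disk, the same function can be passed as tar.add(path, filter=strip_owner_names). This only addresses the user and group names; it does nothing about the file mode difference.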
Shouldn't comment when I'm tired, sorry about that.
After reading http://www.gnu.org/software/tar/manual/html_node/Standard.html it seems that the additional flags of mode are suid, sgid and sticky bit:
The mode field provides nine bits specifying file permissions and three bits to specify the Set UID, Set GID, and Save Text (sticky) modes. Values for these bits are defined above. When special permissions are required to create a file with a given mode, and the user restoring files from the archive does not hold such permissions, the mode bit(s) specifying those special permissions are ignored. Modes which are not supported by the operating system restoring files from the archive will be ignored. Unsupported modes should be faked up when creating or updating an archive; e.g., the group permission could be copied from the other permission.
It looks like the engine's API doesn't specify which tar format should be used (https://docs.docker.com/engine/reference/api/docker_remote_api_v1.22/#build-image-from-a-dockerfile). According to the tarfile docs, it uses the GNU tar format by default. What about the Go implementation the engine is using: can it produce different formats?
@TomasTomecek: That describes the fourth digit from the right:
/* Bits used in the mode field, values in octal. */
#define TSUID 04000 /* set UID on execution */
#define TSGID 02000 /* set GID on execution */
#define TSVTX 01000 /* reserved */
/* file permissions */
#define TUREAD 00400 /* read by owner */
#define TUWRITE 00200 /* write by owner */
#define TUEXEC 00100 /* execute/search by owner */
#define TGREAD 00040 /* read by group */
#define TGWRITE 00020 /* write by group */
#define TGEXEC 00010 /* execute/search by group */
#define TOREAD 00004 /* read by other */
#define TOWRITE 00002 /* write by other */
#define TOEXEC 00001 /* execute/search by other */
All those #defines would imply that mode is four octal digits long, and therefore a null-terminated char[5], with the left-most digit being the set-UID/set-GID/"reserved" value. However, the field actually holds 7 digits plus a terminator (char mode[8]). That "0"/"1" discrepancy is in byte 1 (0-indexed), and the user/group/other flags are in bytes 4–6. I would expect the UID/GID/VTX flags to live in byte 3 in this situation, which is definitely a "0" in both cases, so that seems right.
In short, I really don't know what's going on, but I don't think it's as simple as that. The only thing I can think of is that three bytes of the mode are to be ignored for alignment purposes and that perhaps the server is incorrectly comparing the whole 7 characters instead of just the last 4.
Of course, I don't know that this is what's causing the problem at all. It could be the username/group.
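To look at the mode discrepancy directly, here is a small sketch that pulls the raw mode field out of the first header of an archive and splits it into file-type and permission bits. The offset follows the standard tar header layout (char mode[8] starting at byte 100); read_mode_field, split_mode and sample_archive are names made up for this example. It is only an assumption, not something confirmed in this thread, that the extra 0100000 is the regular-file type bit (S_IFREG in stat terms) written into the mode field by one side and left out by the other.

```python
import io
import stat
import tarfile

MODE_OFFSET, MODE_LENGTH = 100, 8  # position of char mode[8] in a tar header


def read_mode_field(archive):
    """Return the raw octal text of the first member's mode field."""
    raw = archive[MODE_OFFSET:MODE_OFFSET + MODE_LENGTH]
    return raw.rstrip(b"\0 ").decode("ascii")


def split_mode(octal_text):
    """Split a mode into (file-type bits, permission bits)."""
    value = int(octal_text, 8)
    return stat.S_IFMT(value), stat.S_IMODE(value)


def sample_archive():
    """Build a one-file archive with Python's tarfile, for inspection."""
    buf = io.BytesIO()
    with tarfile.open(fileobj=buf, mode="w", format=tarfile.GNU_FORMAT) as tar:
        info = tarfile.TarInfo(name="script")
        info.mode = 0o644
        tar.addfile(info)
    return buf.getvalue()
```

On the two values from the hex dumps, split_mode("0100644") and split_mode("0000644") give the same permission bits and differ only in the file-type part. If that reading is right, the permissions themselves agree and only the upper bits of the field would need to be reconciled.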
Did anyone ever find a work-around for this?
Is there any update on this?
I'm working on a fix for this problem. Unfortunately, this requires a fix in golang and a Docker daemon built with the fixed golang. See pull request #1582.
Is there a workaround while we wait for the patch?
Still seeing this in 2020, is there a workaround?
See https://github.com/docker/docker-py/issues/2230#issuecomment-723593257 for a package that fixes the issue, though it will only be open source next year.