Moby: Extend docker cp to permit copying from images

Created on 4 Sep 2015 · 45 Comments · Source: moby/moby

Permit docker cp to copy from an image but fail when copying to an image.

Perhaps a [docker cp] command that targets an image doesn't need to fail.

Motivation

Although declined, this feature was later implemented as the bash GitHub project dkrcp.

kind/feature

All 45 comments

This would make images mutable objects, and the whole point of images is to be immutable. I'm pretty :-1: on this, but I'd like to hear what other people think about it, especially @jlhawn.

@calavera I think he wants to just copy stuff from images, not into them - so images would still be immutable. It's an interesting feature - I'd like to hear more about the use cases for it, though.

You can already get the desired behavior by simply creating a container using the image first.

docker create ubuntu:14.04, for example, will create a container from the image but not start it. You can copy to/from the container in this state. You just have to remember to docker rm that container when you are done.
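For reference, that workaround amounts to three commands; the image name `ubuntu:14.04` and the file path here are just placeholders:

```shell
# Create a stopped container from the image; no process is started.
CID=$(docker create ubuntu:14.04)

# Copy a file out of (or into) the container's filesystem.
docker cp "$CID":/etc/os-release ./os-release

# Clean up the throwaway container.
docker rm "$CID"
```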

@calavera

As mentioned in the request, docker cp would fail if it attempted to target an image. This failure would be analogous to traditional cp semantics that reject the command due to the target location's attributes, like a read-only directory/filesystem or DAC permission conflicts.

Although initially proposed to fail when the target argument referred to an image, what if docker cp accepted an image reference as a target argument? An image's immutability could be preserved by having docker cp create a new image when the specified target:

  • _is absent:_ A new image is created by simply adding the source files to an empty file system. Similar to the following Dockerfile where the build context mirrors the docker cp source file tree:

FROM scratch
COPY . /

  • _identifies an existing image:_ This situation would generate a new image. Similar to the following Dockerfile where the build context mirrors the docker cp source file tree:

FROM <ExistingImageReference>
COPY . /

The naming conventions for the new image would assume the semantics applied by the docker build command.

In other words, a docker cp command that targets an image would be implemented as a docker build in which the build context mirrors docker cp's source file tree; the needed Dockerfile can be generated from docker cp's arguments and an added -t option.
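As a sketch of that equivalence (all names here are hypothetical, and the expanded command does not exist), a docker cp targeting an image could expand to a build whose context is the copy source and whose Dockerfile is generated on the fly:

```shell
# Hypothetical expansion of: docker cp ./src-tree newimage:latest:/
# The source tree becomes the build context; the Dockerfile is generated.
DF=$(mktemp)
printf 'FROM scratch\nCOPY . /\n' > "$DF"
docker build -f "$DF" -t newimage:latest ./src-tree
rm "$DF"
```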

@jlhawn

You can already get the desired behavior by simply creating a container using the image first.

Not always. Strictly speaking, docker create doesn't fire ONBUILD triggers. Yes, I understand that I could run docker build first to create an image using the ONBUILD triggers and then use docker create, but given the near "equivalence" of containers to images, I thought it would be "easy" to implement this feature. It's also been discussed on Stack Overflow.

Also, docker create requires that an initiating process be defined (CMD/ENTRYPOINT) in order to create a container, and images aren't required to define one. In this situation, docker create issues the message 'Error response from daemon: No command specified', preventing the creation of a container and aborting the copy process. Of course, there are methods to circumvent this too; however, I would suggest it's preferable to provide a simple interface that both directly communicates the ability to perform an image copy operation and encapsulates/hides its implementation.

@duglin

This proposal, especially with the enhanced semantics permitting docker cp to target images, provides:

  • a means to separate build and run-time concerns when constructing new images, eliminating the pollution of the run-time image by build tool chain packages and intermediate build artifacts.
  • a composition mechanism for constructing images that complements the inheritance mechanism implemented by FROM.
  • a means to encode a custom build system employing docker images using, for example, a GNU makefile method.

I'm not hearing a lot of traction on this one. Given the relatively easy solution that @jlhawn mentioned (just create a container, even if you have to give it a dummy command to run, and then use docker cp to pull the files from the container), I think we should close this one. If a new use case is presented that makes this solution totally unbearable for the user, then I think we should reconsider it.

Since I found it useful, the semantics of creating/evolving images by simply copying files were incorporated into a bash script called dkrcp that's available as a GitHub project.

+1
Proposal: create a new command called docker extract that only extracts files from an image to the host.
The image doesn't mutate at all.

Use case: I am building my application with docker and then I want to extract the binaries to the outside.

Current solution: create a Dockerfile, build it, and then run a container for only a few seconds to extract the files with docker cp.

Another proposal: extend COPY with the ability to extract from the image being built to a working folder on the host.

I'm looking for this exact feature to implement a two phase docker build (see https://github.com/docker/docker/issues/7115):

  1. first Dockerfile will build from source, and as such include sources, test, and all sort of intermediate files
  2. second Dockerfile will package resulting binary in a clean image with runtime

to articulate those, I need to cp a binary from the image built in step 1. The ability to run docker cp image:path . would make this trivial. Today I have to create a container, run docker cp, then destroy the container. Not a big deal, but it introduces complexity.

@mercuriete @ndeloof

Just in case it went unnoticed, there's a bash script named dkrcp, available as a GitHub project, that wraps several Docker CLI commands to implement this feature. It also supports multiple copy sources and an option to specify Dockerfile commands, and it will gracefully terminate image creation if the copy fails, cleaning up running containers or deleting a newly created image from the local repository.

@WhisperingChaos
Thank you very much. I will have a look at this script.
I have exactly the same use case as @ndeloof:
1) use docker to build Java artifacts or other language binaries.
2) include these binaries in a smaller runtime image.

Cheers!

@mercuriete
You're welcome.
If you find problems with the script, please open a GitHub issue. I just fixed a problem with tilde expansion.

Enjoy!

As shown here you could run _cat_ or _tar_:

docker run <image-name> tar -c -C /my/directory subfolder | tar x

@wedesoft this assumes the image has the tar command, which one can't guarantee.

Now that we have an official way to run Docker on a Raspberry Pi, I would like this to be re-opened. We have a use case where we have software running on PCs and RPis. Given that they have different architectures, I CANNOT simply start a container using our image. However, there are some files in the image I would like to extract so we do not have to create multiple (PC and RPi) versions of all of our containers.

@falnos24865

Unfortunately, one of my earlier posts suggested that dkrcp operated on running containers to implement the copy operation. It doesn't. Instead, it executes docker create to produce a container to act as a source and/or target for the copy command. docker cp doesn't require running containers in order to operate upon them. Once dkrcp completes, it will remove the temporary containers it created.

I've been using dkrcp with Docker 1.12, within an AMD64 Ubuntu VM to generate minimal images containing a golang executable targeted for the Pi.

Lastly, if you encounter a problem with dkrcp, let me know and I'll address it.

this feature is available when used in a Dockerfile multi-stage build; it seems to me it could be exposed to the CLI as well.

@ndeloof see the discussion on https://github.com/moby/moby/issues/16079

@thaJeztah you linked to the thread we are already in. I'm interested to see where you intended to link :)

Oh, lol, wrong link on my clipboard; meant to link to https://github.com/moby/moby/issues/30449

docker run --rm <image-name> cat /path/to/file > file_from_image

I like this method :D

This assumes cat is available, which _might_ not be the case :P
Please just consider that this feature IS available from multi-stage builds; I wonder why the API doesn't let us play with it.

No need to use cat;

docker create --name foo <image-name>
docker cp foo:/path/to/file file_from_image
docker rm foo

For build servers etc., it would probably be best to assign a unique container name in order to prevent clashes:

IMG_ID=$(dd if=/dev/urandom bs=1k count=1 2> /dev/null | LC_CTYPE=C tr -cd "a-z0-9" | cut -c 1-22)
docker create --name ${IMG_ID} <image-name>
docker cp ${IMG_ID}:/path/to/file file_from_image
docker rm ${IMG_ID}

@binarytemple, I agree. Another way to do this is to simply piggy-back off of container IDs by dropping the --name option altogether:

IMG_ID=$(docker create <image-name>)
...

@rbi13 - Thanks for the pointer. I just thought I'd share my Makefile workaround for building parameterized images and copying out build artifacts. It's pretty ugly (and completely untrustworthy; shell exploit, anyone?), but it works for now.

.PHONY: build copy_out_rpm

DOCKER_CMD = $(shell which docker)
IMAGE_NAME = riak_rpm_builder
RPM_PATH=/root/riak/distdir/packages/
RPM_FILE=riak-${RIAK_VER}-1.el7.centos.x86_64.rpm

guard-%:
  @ if [ -z '${${*}}' ]; then echo 'Environment variable $* not set' && exit 1; fi

build: guard-RIAK_VER
  ${DOCKER_CMD} -D build --build-arg riak_ver=${RIAK_VER} -t riak_centos:${RIAK_VER} -f Dockerfile .

copy_out_rpm: guard-RIAK_VER build
  $(shell TMP_IMG=$(shell ${DOCKER_CMD} create ${IMAGE_NAME});${DOCKER_CMD} cp $$TMP_IMG:${RPM_PATH}${RPM_FILE} ./${RPM_FILE}; ${DOCKER_CMD} rm $$TMP_IMG)

I'm concerned that this multi-step process is quite a learning curve for someone getting started with docker. It takes research. Everyone's doing it differently. Docker is already hard to learn.

Why not make it easier for newcomers? What's the downside?

I'm not convinced that the reasons for closing this ticket are adequate enough to discount the potential benefit to the community. I strongly advocate reopening this issue.

My use case is modifying a Tomcat 8 (Java EE server) base image for a new application. I would like a simple command to extract a file from the image, modify it, and use it in a Dockerfile.

## mockup command use-case for clarity
## custom script modifies xml file at runtime such as adding missing jdbc datasources,
## everything else in xml config file is used as-is
## dockerfile inserts a modified tmp/server.xml to a new image
docker copy basetomcat8:latest /usr/local/tomcat/conf/server.xml /tmp/server.xml
./dothemagic.sh /tmp/server.xml
docker build -t myapp-test:1 ./dockertest

@Murmur

Docker's relatively new multi-stage build feature obsolesces the need for this enhancement when using docker build to construct a resultant image. Although I welcome the concept of multi-stage builds, I myself avoid using Docker's current implementation of this feature due to its technical flaws: pathological coupling and poor instruction orthogonality described in great detail by #33206 and this comment. That said, if you can stomach these flaws, you can implement what you describe above using multi-stage build.

I was reaching for the same thing (in my case to add a userspace so I could docker exec and investigate a scratch + go binary container (with no userspace otherwise)) so I wrote up a little script to do this

It essentially does:

  • container_id = docker create $img
  • docker export -o tar.gz $container_id
  • tar -xf tar.gz
  • for each top-level entry in the extracted tar: docker cp $toplevel $container:/
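The steps above can be sketched in shell; the image name and paths are placeholders:

```shell
# Export the image's whole filesystem through a throwaway container...
CID=$(docker create myimage:latest)
docker export -o rootfs.tar "$CID"
docker rm "$CID"

# ...then unpack it locally; individual entries can afterwards be
# copied into a running container with `docker cp`.
mkdir -p rootfs
tar -xf rootfs.tar -C rootfs
```

Note that `docker export` emits a plain tar archive of the container's filesystem, not a compressed one.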

@Murmur

Multi-stage builds obsolete this feature for the purpose of copying into an image (although it seems impossible to do so and also copy files to /). However, multi-stage builds definitely do not obsolete the need for this feature for copying out of an image. The image continues to be immutable for the purpose of copying a file out.

Note, 'docker export' also does not work for copying out of an image.

Considering the following file, 'hello.txt':

hi!

Consider the following Dockerfile:

FROM scratch
COPY hello.txt /

Build an image from the above Dockerfile.

$ docker build -t trivial .
Sending build context to Docker daemon  3.072kB
Step 1/2 : FROM scratch
 ---> 
Step 2/2 : COPY hello.txt /
 ---> 3a34db2e35d3
Removing intermediate container c84cd3c72f3a
Successfully built 3a34db2e35d3
Successfully tagged trivial:latest
# now remove hello.txt
rm hello.txt

How would you suggest one extract the file hello.txt from the image (i.e. not from a running container)?

$ docker images trivial
REPOSITORY          TAG                 IMAGE ID            CREATED             SIZE
trivial             latest              3a34db2e35d3        4 minutes ago       4B
$ docker cp 3a34db2e35d3:/hello.txt hello
Error response from daemon: No such container: 3a34db2e35d3

My solution was the following, which is a complete pain in the ass and could potentially have unintended consequences. I create a stopped container from the image with an entry point of "/" (obviously running the container will fail). I then extract the file from the stopped container and delete the container.

$ CONTAINER="$(docker create --name trivial-latest trivial:latest /)"
$ docker cp ${CONTAINER}:/hello.txt .
$ sudo docker rm ${CONTAINER}
$ cat hello.txt
hi!

IMHO, creating a stopped container just to copy files out of an image is a waste of resources, but that is just my HO.

IMHO, creating a stopped container just to copy files out of an image is a waste of resources, but that is just my HO.

Copying from an image directly would likely do the same, as the image's filesystem still has to be mounted before anything can be copied.

I use the following commands to copy a single file to local disk: they create a non-running container from the image, copy a file, and then delete the named temporary container. This trick is not my discovery; this thread and also Stack Overflow helped me do it.

$ docker create --name temp1 docker-registry.customer.com/repo/tomcat8-jre
$ docker ps --all
$ docker cp temp1:/usr/local/tomcat/conf/server.xml ./build/server.xml
$ docker rm temp1

To copy all files from the image, I use a similar command sequence.

$ docker create --name temp1 docker-registry.customer.com/repo/tomcat8-jre
$ mkdir tomcat8-jre
$ docker cp temp1:/  ./tomcat8-jre
$ docker rm temp1
$ docker ps --all

Not a very intuitive way to achieve a simple task, but once you know this black-magic trick you move on to the next challenges in life.

Extra steps. I guess it just means life is less convenient for users :P

I think it would make sense to be able to copy files to a volume. A use case can be a multi-stage build that checks out code and builds it into an image; just like the COPY --from=build src dest functionality, I should be able to do something like:

docker run -it -v <image-name>:/usr/src <other-image-name> command, or the equivalent in a docker-compose file

@duglin @thaJeztah @tonistiigi

I understand this can be done by combining docker create, docker cp, and docker rm, but can we consider reopening this for better UX?

If adding this feature is unacceptable for docker cp, I think we can implement as docker image blahblahblah which internally calls the equivalent of docker create; docker cp; docker rm.

@AkihiroSuda discussing with @tonistiigi - if we understand the use case correctly, the primary use would be to copy build artefacts from a built image to the host. Given that BuildKit would allow exporting the artefact directly to the host, perhaps we should have a look at that (i.e. integrating BuildKit?)

In addition to that, I think this proposal is useful for copying binaries from pulled images. WDYT?

@thaJeztah I think so, my use-case would be to copy a static site out of an integration-tested image to submit to a CDN.

To elaborate: We use MultiStage builds to take a website -> minify etc. -> make a minimal nginx image with the static site. This image is used for integration testing. But then when that is done the actual hosting is done by a CDN. So I want to get the static site out of the image, zip the files and submit to the CDN. But I don't want to include the CDN API / curl in the "integration testing" container and run the command inside the container.

I'll add one more use case. We use a Docker multi-stage build to build an Ember app and then host it under an nginx container.

So we have two build steps. But we also run QUnit tests for the Ember app inside the first build image, and we want to save the unit-test results on the actual host so the CI agent can pick them up, parse them, and display the test results in the CI's UI. docker create/cp/rm would help here, but if the cp command could do it right away, it would be easier.

I agree, docker should support such a feature: copying a file/folder from an image to the host.
This feature is very useful and I use it a lot, especially for retrieving base configuration files from different services, modifying them accordingly, and then mounting them into a container of the very same image.

So, instead of currently doing (as @binarytemple suggested):

IMG_ID=$(dd if=/dev/urandom bs=1k count=1 2> /dev/null | LC_CTYPE=C tr -cd "a-z0-9" | cut -c 1-22)
docker create --name ${IMG_ID} <image-name>
docker cp ${IMG_ID}:/path/to/file file_from_image
docker rm ${IMG_ID}

We need to have something like:

docker cpimage [docker-run-options] /etc/some-service.conf ./ <image-name>[:<tag>]

Or copying an entire folder:

docker cpimage [docker-run-options] /etc/some-service.d ./ <image-name>[:<tag>]

Basically, the following syntax could work:

docker cpimage [docker-run-options] <source-path-on-image> <dest-path-on-host> <image-name>[:<tag>]

I agree with @slavikme.

My use case involves using Docker to build a static binary that is then run on bare metal. This way, any Linux distribution supporting docker can be used to build the application. I am currently resorting to:

CID=$(docker create image-name:latest)
docker cp ${CID}:/opt/binary-name ./binary-name
docker rm ${CID}

This approach, however, is not optimal: apart from requiring scripting, it is prone to subtle breakages, because the three-command sequence is not atomic and performs side effects on the machine it runs on.

Having a single command that extracts one or more files from an image would really be a better solution.
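Until such a command exists, the non-atomicity can at least be mitigated with a shell trap that removes the temporary container even when the copy fails mid-way (a sketch; the image name and paths are placeholders):

```shell
#!/bin/sh
set -e

# Create the throwaway container and guarantee its removal,
# whether or not the copy below succeeds.
CID=$(docker create image-name:latest)
trap 'docker rm -f "$CID" >/dev/null' EXIT

docker cp "$CID":/opt/binary-name ./binary-name
```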

@muxator does docker build --output /some/dir work for you? It was added in Docker 19.03. It needs DOCKER_BUILDKIT=1 to be set.

Hi @AkihiroSuda, thanks for the suggestion!

The Docker version on my Ubuntu 20.04 is 19.03.8, so it is compatible with your hint. The documentation at https://docs.docker.com/engine/reference/commandline/build/#custom-build-outputs mentions it, too. This is probably a case of getting lost in old material through search engines.

For future reference, a minimal working example would be:

FROM ubuntu:20.04 as build-stage

# placeholder for commands that perform the build and put the
# statically linked binary in <release_path>/<binary_name>
RUN <build_commands>

# Copy just the statically linked artifact in the root directory
# of an empty container
FROM scratch
COPY --from=build-stage <release_path>/<binary_name> /<binary_name>

To run the build and put the generated binary in a directory, this command would then be sufficient:

DOCKER_BUILDKIT=1 docker build --output <dest_path> <dockerfile_path>

In general, that command puts the whole generated image contents in <dest_path>. But since the last stage is a scratch image containing just one file, it results in putting just that file in the desired position.

Thank you very much.

p.s.: as a side note, it seems that the BuildKit subsystem is completely separate from the "normal" docker workflow. For example, the first time I ran the build, docker had to re-run the whole Dockerfile (including preliminary package installs that supposedly were already in the cache)
