The usual way to create a Docker-based deployment (e.g. for deploying on Kubernetes) for a Python application looked something like this, using a requirements.txt produced by pip freeze or pip-compile:
FROM python:3.6
RUN mkdir /app
WORKDIR /app
# Only copy application dependencies to take advantage of image layer caching,
# i.e. if the "requirements.txt" file doesn't change, the layer is cached
COPY requirements.txt requirements.txt
RUN pip install -r requirements.txt
# Install the actual application code now. Usually, this is the part of the code
# that changes, thus invalidating the image layer cache.
ADD . /app
# We actually need to do this because our application code might have a "setup.py"
# which defines "entry_points". If that wasn't the case, it could be optional
RUN pip install .
# Run the app...
CMD ["python", "-m", "myapp.run", ... ]
Using pipenv, I would imagine the equivalent would be something like:
FROM python:3.6
# Install "pipenv"
RUN pip install pipenv
RUN mkdir /app
WORKDIR /app
# In a similar fashion as before if the "Pipfile.lock" doesn't change, the
# image layer is going to be cached.
COPY Pipfile Pipfile
COPY Pipfile.lock Pipfile.lock
RUN pipenv install --deploy --system
ADD . /app
RUN pip install .
# Run the app...
CMD ["python", "-m", "myapp.run", ... ]
If this is something that others have come across and consider a best practice, I believe it would be useful to make it part of the official documentation, since pipenv is meant to be a solution for applications.
For example, before that, I thought the logical thing would be to do something like:
# Install my application dependencies
pipenv install requests flask celery ...
pipenv install --dev pytest ...
# ...develop my app...
# Add the app to the Pipfile
pipenv install -e .
# commit everything
git add .
git commit ...
This would make it easy for someone to create an application that can be installed locally with just a pipenv install --dev. The problem is that now that the application package is part of the Pipfile, Docker layer caching is thrown out of the window (i.e. one has to do ADD . /app much earlier in order for pipenv install --system --deploy to be able to find the application's setup.py).
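To illustrate (package names are hypothetical), the resulting Pipfile ends up with an editable entry for the app itself, which is exactly what forces the full source tree to be copied in before the dependency install:
[[source]]
name = "pypi"
url = "https://pypi.org/simple"
verify_ssl = true

[packages]
requests = "*"
flask = "*"
celery = "*"
myapp = {editable = true, path = "."}

[dev-packages]
pytest = "*"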
(This issue is in no way meant to be a complaint or a back in the "pip"-days I used to...-kind of rant. I'm really just hoping for a good discussion with practical advice and seeing how others tackle this issue using "pipenv")
I'm pretty sure the answer you want is to use a .dockerignore file:
.git
.svn
.cvs
.hg
README*.md
!README*.md
And yeah, we can definitely use a documentation note about this. I'm sure we have some Docker info in the docs already; feel free to propose the changes you want to see whenever you work out the best way to structure things. (I don't use Docker directly, so I'll leave verification to the other maintainers.)
@slint, I personally still find it easier to do it in two steps, as per your initial post.
One of the reasons is the outstanding pipenv bug https://github.com/pypa/pipenv/issues/3148
More details on how I do it in my Dockerfile: https://tech.zarmory.com/2018/09/docker-multi-stage-builds-for-python-app.html#pipenv
Hi @haizaar, sorry for the bother but I'm really interested in what you achieved, i.e. slim & lean Docker images with pipenv through multi-stage builds.
I tried to adapt the example you linked, but without success.
In particular, it's not clear to me where the step is that includes the app sources inside the last stage of your build. Am I missing something? 🤔
Btw my setup only includes a Pipfile & a Pipfile.lock in the root of my repo, alongside the app sources that reside in the app directory.
Would using the ONBUILD convention outlined in https://github.com/pypa/pipenv/blob/master/Dockerfile work for your use case?
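For context, an ONBUILD image defers some instructions until a downstream image builds FROM it; roughly something like this (a sketch, not the verbatim contents of the linked file):
FROM python:3.7
RUN pip install pipenv
WORKDIR /app
ONBUILD COPY Pipfile Pipfile.lock /app/
ONBUILD RUN pipenv install --deploy --system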
I don't think so @Fongshway as my goal is to "cook" a lean & custom image with just what I need to run my application :(
@fusillicode Maybe you've missed the part where your own app has a proper setup.py that installs under $PYROOT together with its dependencies - this is what pip install --user . does. So eventually both your app and its dependencies end up under $PYROOT, and you just grab them in the final stage.
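A minimal sketch of what I mean (the myapp name and python:3.6 tags are illustrative; see the linked post for the full version):
FROM python:3.6 AS builder
ENV PYROOT=/pyroot
ENV PYTHONUSERBASE=$PYROOT
RUN pip install pipenv
WORKDIR /app
COPY Pipfile Pipfile.lock ./
# PIP_USER=1 makes the pip underneath pipenv install into $PYTHONUSERBASE
RUN PIP_USER=1 pipenv install --system --deploy
COPY . .
# The app itself installs under $PYROOT too, thanks to its setup.py
RUN pip install --user .

FROM python:3.6-slim
ENV PYROOT=/pyroot
ENV PYTHONUSERBASE=$PYROOT
# Grab the app and all of its dependencies in one go
COPY --from=builder $PYROOT $PYROOT
CMD ["python", "-m", "myapp.run"]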
I prefer to have a setup.py for my app, since it ensures a clean install - just copying files has a high chance of including irrelevant pieces, e.g. tests. Yes, .dockerignore is one way to solve it, but it's still tedious for all developers to remember about .dockerignore; and .dockerignore is ignored if you decide to tar your context because you want symbolic links resolved (https://github.com/moby/moby/issues/18789#issuecomment-165985865).
Hope this helps.
P.S. @fusillicode, if you really want to go lean with under 40MB final images, you can use python-minimal images.
FYI @haizaar I didn’t forget your issue; it just wound up being super complicated on the resolver side for some reason. I am very interested in any updates to docker documentation as I am using pipenv in docker myself now and am not that great with it yet :)
Dan, do you think it's worth contributing my best practices to the official pipenv docs? (Or maybe to the Hitchhiker's Guide to Python?)
@haizaar I'm really sorry for the lateness of my reply 🙇‍♂️
Thanks a lot for pointing out the possible problem and solution, but even more for sharing your experience in this matter!
Actually, I've solved my problem by generating the requirements.txt via pipenv inside the "builder" stage of my multi-stage Dockerfile and then using it directly with pip. It seems to be working pretty well, considering that I end up with a slimmer image than my first production one :)
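Roughly, the shape of it is this (simplified, with made-up names; older pipenv releases expose the conversion as pipenv lock -r):
FROM python:3.6 AS builder
RUN pip install pipenv
WORKDIR /app
COPY Pipfile Pipfile.lock ./
# Export the locked dependencies in requirements.txt format
RUN pipenv lock -r > requirements.txt

FROM python:3.6-slim
WORKDIR /app
COPY --from=builder /app/requirements.txt .
RUN pip install -r requirements.txt
COPY . .
CMD ["python", "-m", "myapp.run"]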
If it would be helpful I'll share the full Dockerfile :)
Thanks a lot for your support, I really appreciate it! 🍻
Another workflow we have been investigating is building two images:
A dependencies-only image, tagged with the _meta.hash.sha256 key from the Pipfile.lock:
# ./Dockerfile.deps
FROM python:3.6
RUN pip install --upgrade pip pipenv setuptools wheel
RUN mkdir /app
WORKDIR /app
COPY Pipfile Pipfile
COPY Pipfile.lock Pipfile.lock
RUN pipenv install --deploy --system
To build the image:
$ deps_version=$(jq -r ._meta.hash.sha256 Pipfile.lock)
$ docker build -t myapp-deps:$deps_version -f Dockerfile.deps .
This image can be built e.g. on a regular basis by some cronjob/CI workflow, since dependencies typically change less often than the actual application code.
For the application image now, we can use ARG before the first FROM to specify the exact tag of our dependencies image:
# ./Dockerfile
ARG DEPS_VERSION=latest
FROM myapp-deps:${DEPS_VERSION}
COPY . /app
RUN pip install .
CMD ["python", "-m", "myapp.run"]
To build the image we need to pass the dependencies version via --build-arg:
$ deps_version=$(jq -r ._meta.hash.sha256 Pipfile.lock)
$ docker build -t myapp:1.2.0 --build-arg DEPS_VERSION=$deps_version .
One caveat of this method is that the specific dependencies image might not have been built already, so you have to check your registry and trigger a build if needed before building the application image.
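A sketch of that check (the plain myapp-deps name implies the default registry; adjust to yours):
deps_version=$(jq -r ._meta.hash.sha256 Pipfile.lock)
if ! docker pull myapp-deps:$deps_version; then
  docker build -t myapp-deps:$deps_version -f Dockerfile.deps .
  docker push myapp-deps:$deps_version
fi
docker build -t myapp:1.2.0 --build-arg DEPS_VERSION=$deps_version .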
@fusillicode You are welcome. May I ask why you ended up with the Pipfile -> reqs.txt conversion? I guess you do pipenv lock -r | pip install -r /dev/stdin; if so, you know you can do pipenv install --system --deploy, or, using my side-install method, PIP_USER=1 pipenv install --system --deploy, right?
The bonus of using pipenv is when you have installs from private repos - pipenv will install them for you, while converting to pip will require additional credentials/configuration setup.
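For example, a Pipfile pulling one package from a private index looks something like this (URLs and names are made up), and pipenv resolves it transparently:
[[source]]
name = "pypi"
url = "https://pypi.org/simple"
verify_ssl = true

[[source]]
name = "private"
url = "https://pypi.example.com/simple"
verify_ssl = true

[packages]
internal-lib = {version = "*", index = "private"}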
@slint I like the meta-hash trick.
Though, what benefits does this approach have compared to doing both steps in a single Dockerfile? If the dependencies, i.e. Pipfile.lock, do not change, then the Docker cache will be reused, yielding practically the same build speed. Do you see a difference?
@haizaar This is an additional optimization in case you're not building your images locally but on e.g. a CI/CD service, which runs your docker build ... scripts on a random VM/container every time and thus cannot benefit from image layer caching. If you are always building/pushing your images from your local machine (which is perfectly fine), then your multi-step build is actually faster and less complex to execute.
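(Some CI services can partially work around this by pulling a previously pushed image and seeding the build cache from it, e.g.:
docker pull myapp-deps:latest || true
docker build --cache-from myapp-deps:latest -t myapp-deps:latest -f Dockerfile.deps .
but support for this varies between providers.)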
Thanks @slint, I see your point.
I'd recommend multistage builds with docker and pipenv
FROM python:3.7 AS base
ENV LC_ALL C.UTF-8
ENV LANG C.UTF-8
WORKDIR /src
FROM base AS build
RUN pip install pipenv
...
# -- Adding Pipfiles, changes should rebuild whole layer
COPY Pipfile Pipfile
COPY Pipfile.lock Pipfile.lock
# -- Install dependencies: --deploy aborts if Pipfile.lock is out of date with the Pipfile, or if the Python version is wrong
RUN pipenv install --dev --deploy --system
# use alpine for production to keep the image size small
FROM python:3.7-alpine AS release
...
# install dependencies from Pipfile.lock for reproducible builds
COPY Pipfile.lock Pipfile.lock
RUN pipenv install --system --deploy --ignore-pipfile
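One handy detail with multi-stage files like the above: you can build just an intermediate stage with --target, e.g.:
docker build --target build -t myapp:build .   # stop after the "build" stage
docker build -t myapp:release .                # full build, ends at the "release" stage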