Cookiecutter-django: Improve the docker image.

Created on 5 May 2017 · 9 Comments · Source: pydanny/cookiecutter-django

The docker image could be improved with a few tricks:

Reduce the number of layers: each instruction in the Dockerfile creates a new layer.
Do not cache pip data: there is no point in caching packages inside a container, so --no-cache-dir should be used.
Remove clutter files.
Add the code as one of the last steps: every step in the Dockerfile is cached, so a step is only re-executed when it or an earlier step changes. Since the code is what changes most often, copying it must be one of the last steps.

Here's an example of the optimized Dockerfile:

FROM python:3.6
ENV PYTHONUNBUFFERED 1

# Requirements have to be pulled and installed here, otherwise caching won't work
COPY ./requirements /requirements
COPY ./compose/django/gunicorn.sh ./compose/django/entrypoint.sh /

RUN pip install --no-cache-dir -r /requirements/production.txt \
    && groupadd -r django \
    && useradd -r -g django django \
    && sed -i 's/\r//' /entrypoint.sh \
    && sed -i 's/\r//' /gunicorn.sh \
    && chmod +x /entrypoint.sh \
    && chown django /entrypoint.sh \
    && chmod +x /gunicorn.sh \
    && chown django /gunicorn.sh \
    && rm -rf /requirements

COPY . /app
RUN chown -R django /app

USER django

WORKDIR /app

ENTRYPOINT ["/entrypoint.sh"]

If you would like those improvements I can create a pull request.

Darkheir

All 9 comments

@Darkheir, I agree with all your suggestions except the first one. To my mind, every step of Docker image creation should be self-sufficient and, in a sense, isolated from all the others: for instance, all entrypoint-related operations should sit closer to each other than to operations unrelated to the entrypoint, and the same goes for every other Dockerfile artifact. On the other hand, if there are more compelling reasons to keep the number of Docker image layers to a minimum, we'd be glad to hear them.

webyneter on 5 May 2017

Yeah, my example above was maybe a bit extreme. I agree with the idea that every step should be self-sufficient.

Here's another version:

FROM python:3.6
ENV PYTHONUNBUFFERED 1

# Requirements have to be pulled and installed here, otherwise caching won't work
COPY ./requirements /requirements

RUN pip install --no-cache-dir -r /requirements/production.txt \
    && rm -rf /requirements \
    && groupadd -r django \
    && useradd -r -g django django

COPY ./compose/django/gunicorn.sh ./compose/django/entrypoint.sh /
RUN sed -i 's/\r//' /entrypoint.sh \
    && sed -i 's/\r//' /gunicorn.sh \
    && chmod +x /entrypoint.sh \
    && chown django /entrypoint.sh \
    && chmod +x /gunicorn.sh \
    && chown django /gunicorn.sh

ENTRYPOINT ["/entrypoint.sh"]

WORKDIR /app

COPY . /app

RUN chown -R django /app

USER django

Here are some links about Docker optimization for reference:

PS: I didn't mention the Alpine base image because it can cause a lot of problems when we want to install OS-level packages.

Darkheir on 5 May 2017

@Darkheir, LGTM, except for the django group/user part. It's also conventional to place ENTRYPOINT at the very end, except when you also have CMD specified. By the way, I've made minor adjustments to the order of commands as well. (FYI, we've not yet switched to python:3.6, so I'm sticking to python:3.5.)

FROM python:3.5

ENV PYTHONUNBUFFERED 1

# Requirements have to be pulled and installed here, otherwise caching won't work
COPY ./requirements /requirements

RUN pip install --no-cache-dir -r /requirements/production.txt \
    && rm -rf /requirements

COPY ./compose/django/gunicorn.sh ./compose/django/entrypoint.sh /
RUN sed -i 's/\r//' /entrypoint.sh \
    && sed -i 's/\r//' /gunicorn.sh \
    && chmod +x /entrypoint.sh \
    && chown django /entrypoint.sh \
    && chmod +x /gunicorn.sh \
    && chown django /gunicorn.sh

COPY . /app

RUN groupadd -r django \
    && useradd -r -g django django
RUN chown -R django /app

USER django

WORKDIR /app

ENTRYPOINT ["/entrypoint.sh"]

webyneter on 5 May 2017

@Darkheir, if the latest version is OK with you, feel free to provide a PR.

webyneter on 5 May 2017

Hmm, just one thing about this part:

RUN groupadd -r django \
    && useradd -r -g django django

What is the point of recreating the user and the group each time the sources are modified?

Apart from this, I agree.

The 3.6 is just because I forgot to switch back to the version you are using.
WORKDIR and ENTRYPOINT at the end are OK for me.

Darkheir on 5 May 2017

@Darkheir, you're right about the user being recreated. How about that?

FROM python:3.5

ENV PYTHONUNBUFFERED 1

RUN groupadd -r django \
    && useradd -r -g django django

# Requirements have to be pulled and installed here, otherwise caching won't work
COPY ./requirements /requirements
RUN pip install --no-cache-dir -r /requirements/production.txt \
    && rm -rf /requirements

COPY ./compose/django/gunicorn.sh ./compose/django/entrypoint.sh /
RUN sed -i 's/\r//' /entrypoint.sh \
    && sed -i 's/\r//' /gunicorn.sh \
    && chmod +x /entrypoint.sh \
    && chown django /entrypoint.sh \
    && chmod +x /gunicorn.sh \
    && chown django /gunicorn.sh

COPY . /app

RUN chown -R django /app

USER django

WORKDIR /app

ENTRYPOINT ["/entrypoint.sh"]

webyneter on 5 May 2017

Perfect for me!

Darkheir on 5 May 2017

I was able to remove hundreds of megabytes from my Docker image by replacing:

RUN chown -R django /app

with

COPY --chown=django:django on all COPY statements

iMerica on 25 Mar 2018
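
For illustration, here is a minimal sketch of how the COPY --chown suggestion might look when applied to the last Dockerfile agreed on earlier in this thread. It is not the change that was actually merged. It assumes Docker 17.09 or newer, where COPY accepts --chown, and it assumes the django user and group are created before the first COPY that names them. The size difference comes from the fact that a RUN chown -R in its own layer stores a second copy of every file whose ownership it changes, whereas setting ownership at copy time does not.

```dockerfile
FROM python:3.5

ENV PYTHONUNBUFFERED 1

# The user and group must exist before any COPY --chown that refers to them by name
RUN groupadd -r django \
    && useradd -r -g django django

# Requirements have to be pulled and installed here, otherwise caching won't work
COPY ./requirements /requirements
RUN pip install --no-cache-dir -r /requirements/production.txt \
    && rm -rf /requirements

COPY --chown=django:django ./compose/django/gunicorn.sh ./compose/django/entrypoint.sh /
# The chown is kept here as a cheap safeguard (assumption: the two scripts are tiny,
# so re-owning them after sed -i rewrites them costs almost nothing)
RUN sed -i 's/\r//' /entrypoint.sh \
    && sed -i 's/\r//' /gunicorn.sh \
    && chmod +x /entrypoint.sh /gunicorn.sh \
    && chown django /entrypoint.sh /gunicorn.sh

# Ownership is set while copying, so no separate "RUN chown -R django /app" layer is needed
COPY --chown=django:django . /app

USER django

WORKDIR /app

ENTRYPOINT ["/entrypoint.sh"]
```

Running docker history on the image built before and after such a change should show whether the large chown layer has disappeared.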


@iMerica, that's a dramatic improvement. Would you mind sending a PR?

webyneter on 25 Mar 2018
