The goal of this issue is to update the README to explain how to use Yarn with Docker.
I want a docker image to build my node projects, and I want to use yarn to install the packages, as the installation is much faster (and more deterministic) than with npm.
One reason why yarn is fast is of course the local yarn cache. So the docker image needs to mount the yarn cache directory when building the projects. Any other hints on how to use docker and yarn?
For actually installing Yarn in a Docker or Vagrant image, you can use the Debian package repo (assuming you're using an Ubuntu or Debian Docker image). Enabling the package repo and then doing apt-get install yarn will also install Node.js as a dependency.
As for mounting the cache, that's a pretty good idea, though I'm not too sure how to do it (I'm not very familiar with Docker myself).
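A minimal sketch of the Debian package repo install described above. The base image tag, the keyring path and the use of a signed-by keyring are assumptions for illustration, not something stated in this thread; the repo URL is the one Yarn 1.x publishes.

```dockerfile
# Assumption: a Debian-based image; the bookworm-slim tag is only an example.
FROM debian:bookworm-slim

# Tools needed to download and unpack the repository signing key.
RUN apt-get update \
 && apt-get install -y --no-install-recommends ca-certificates curl gnupg \
 && rm -rf /var/lib/apt/lists/*

# Enable the Yarn package repo, then install yarn from it.
# /usr/share/keyrings/yarn.gpg is a path chosen for this example.
RUN curl -sS https://dl.yarnpkg.com/debian/pubkey.gpg \
      | gpg --dearmor -o /usr/share/keyrings/yarn.gpg \
 && echo "deb [signed-by=/usr/share/keyrings/yarn.gpg] https://dl.yarnpkg.com/debian/ stable main" \
      > /etc/apt/sources.list.d/yarn.list \
 && apt-get update \
 && apt-get install -y yarn \
 && rm -rf /var/lib/apt/lists/*
```

If the base image already ships the Node.js version you want (for example an official node image), installing yarn this way may pull in a second Node.js from the distribution, so check what ends up on the PATH.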
You wouldn't mount the Yarn cache directory. Instead, you should make sure you take advantage of Docker's image layer caching.
These are the commands I am using:
COPY package.json yarn.lock ./
RUN yarn --pure-lockfile
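For context, here is a sketch of how those two lines might sit in a complete Dockerfile so that layer caching pays off. The base image, working directory and start command are assumptions borrowed from other examples in this thread.

```dockerfile
# Assumption: a base image that already has yarn on the PATH.
FROM node:8.9.1-alpine
WORKDIR /usr/src/app

# Copy only the manifest and lockfile first. Docker reuses the cached
# install layer below as long as these two files are unchanged.
COPY package.json yarn.lock ./
RUN yarn --pure-lockfile

# Copy the rest of the project afterwards, so that editing source files
# does not invalidate the dependency layer.
COPY . .

CMD ["node", "./server"]
```

A .dockerignore that excludes node_modules is probably needed here, otherwise COPY . . would overwrite the freshly installed modules with whatever is on the host.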
It would depend on the environment and the approach you want to take to building your assets.
@kyteague's approach is one you could take if you didn't want to use a global cache and instead just cache the project's dependencies in a higher docker layer (i.e. if your development is going to be running docker build over and over without changing dependencies). If you change the package.json, you lose the cache at the higher layer and have to do a full reinstall.
A more sophisticated approach for development is to run a development container (node:6 or similar with yarn installed) and mount the cache in to do the install. Note that the following uses a .docker-yarn-cache intended to be used only with docker, because libs like node-sass can have c-lib issues if you install them on OSX and then try to use them on Debian, etc. Something like:
docker run -itv ~/.docker-yarn-cache:/root/yarn-cache -v `pwd`:/opt/project --workdir /opt/project node:6 yarn
Typically I combine the docker run approach with some bash and mount the caches in as volumes. Then at the end I copy my assets out of the container and ship them to S3, etc. or COPY them into my production docker image.
You could also run yarn on the host if it's similar to your container OS (debian/debian for example) and then write your Dockerfile to COPY the node_modules folder in with the rest of the project. This would give you access to your host's .yarn-cache for speed, and then you don't have to deal with installing in the docker build.
If your project has a "watch mode" script, you can use a docker-compose file to alleviate some of the concerns of the "repeated docker build" approach by running something along the lines of yarn && yarn run watch as the command, with the same volume mounts as the "manual docker run" approach.
So really it depends on your goals and build environment ("watch" development with Docker for Mac/CI from a dev image to an alpine-based prod image/etc).
@kyteague and @ChristopherBiscardi, great comments! I think it would be valuable to add a page to the documentation around best practices for using Yarn in Docker. Would you like to write a page about "Using Yarn in Docker" for our documentation? The website is in a separate repo: https://github.com/yarnpkg/website
Ideally, docker could use an external cache but there's some resistance to that ( https://github.com/docker/docker/issues/17745 ), so adding a --no-cache option would be good so the docker image isn't made needlessly large by a cache that won't ever be used.
This is the fastest setup I've got so far, as yarn-cache can be reused by many containers:
Dockerfile
FROM node:6.7.0
RUN curl -o- -L https://yarnpkg.com/install.sh | bash
RUN mkdir -p /usr/src/app
WORKDIR /usr/src/app
ARG NODE_ENV
ENV NODE_ENV $NODE_ENV
COPY . /usr/src/app
EXPOSE 8000
ENTRYPOINT ["sh", "./entrypoint.sh"]
CMD ["node", "./server"]
entrypoint.sh
$HOME/.yarn/bin/yarn install --pure-lockfile
exec "$@"
docker-compose.yml
app:
  build: .
  volumes_from:
    - yarn-cache
yarn-cache:
  image: busybox
  volumes:
    - /root/.yarn-cache
@rstuven That will do the yarn install at run time rather than build time, which means that the docker image is not self-contained or fully reproducible (which is the reason why we, and I believe many others, use docker).
@daveisfera Yes, that's why I stressed its "fastest" quality. I neglected to point out that this is meant for a development workflow, where fast iterations matter most. On the other hand, the yarn.lock file should guarantee the reproducible aspect, but yes, it's not enough.
Another approach: https://medium.com/@mfornasa/using-yarn-with-docker-c116ad289d56
Am I wrong to assume that Yarn's global cache stores the package zips (which contain cross-platform code by which I mean it will be compiled at install-time)?
If the global cache only contains the downloaded archives and no build artifacts, would we not be able to at least mount the host's global cache so that the Docker container wouldn't have to download them?
@kyteague --pure-lockfile will not generate yarn.lock, which means that if I have changed package.json and rebuild the image, the old yarn.lock will be copied into the image and will be out of sync with the modified package.json?
Looks like there are plenty of ways now.
If anyone wants to submit a good way to do this, feel free to send a PR for the docs website https://github.com/yarnpkg/website
Another way is to MITM yarn traffic with a caching proxy and a self-signed cert, using the cafile option.
Here's a crude example: https://github.com/komlevv/docker-squid-cache
It has 2 services: a caching proxy and a root certificate server.
COPY package.json yarn.lock ./
RUN yarn install --frozen-lockfile --no-cache --production
Note: You don't want dev dependencies in a production image. You also need to make sure that Yarn's cache folder is not bundled into the image.
COPY package.json yarn.lock ./
RUN yarn install --frozen-lockfile
Note: In a test / CI environment you still want to install NPM modules via the Docker builder in order to utilize Docker layer caching. The next time your image is built on a CI server, these two steps will be skipped in favor of an existing layer, unless either package.json or yarn.lock was changed.
COPY package.json yarn.lock ./
Note: In development mode (locally) it would be faster to install NPM modules at run-time; this way you can attach a volume with the Yarn cache to your container.
The approach above can be implemented by using a single Dockerfile:
FROM node:8.9.1-alpine
ARG NODE_ENV=production
ENV NODE_ENV=$NODE_ENV
# Set a working directory
WORKDIR /usr/src/app
# Install native dependencies
# RUN set -ex; \
# apk add --no-cache ...
# Install Node.js dependencies
COPY package.json yarn.lock ./
RUN set -ex; \
    if [ "$NODE_ENV" = "production" ]; then \
        yarn install --no-cache --frozen-lockfile --production; \
    elif [ "$NODE_ENV" = "test" ]; then \
        yarn install --no-cache --frozen-lockfile; \
    fi;
...
Note: It's better to install native dependencies, if any, via a separate RUN command coming before yarn install.
docker-compose.yml:
version: '3'
volumes:
  yarn:
services:
  api:
    image: api
    build:
      context: ./
      args:
        NODE_ENV: "development"
    volumes:
      - yarn:/home/node/.cache/yarn
      - ./src:/usr/src/app/src
      - ./package.json:/usr/src/app/package.json
      - ./yarn.lock:/usr/src/app/yarn.lock
...
You can also do it in a multi stage Dockerfile for production, like the following for something that runs as a static front end and doesn't need node/yarn at runtime:
FROM node:alpine
WORKDIR /usr/src/app
COPY . /usr/src/app/
# We don't need to do this cache clean, I guess it wastes time / saves space: https://github.com/yarnpkg/rfcs/pull/53
RUN set -ex; \
    yarn install --frozen-lockfile --production; \
    yarn cache clean; \
    yarn run build
FROM nginx:alpine
WORKDIR /usr/share/nginx/html
COPY --from=0 /usr/src/app/build/ /usr/share/nginx/html
Note: Maybe with --no-cache, since that seems to have been added now, and then we can skip the cache clean.
The only way I can find to not have an extra 100MB of cache is to do this on the latest version of yarn (1.5.1):
RUN yarn install --frozen-lockfile --production && yarn cache clean
Just in case: there's no --no-cache, not yet. So use yarn cache clean for now.
https://github.com/yarnpkg/rfcs/pull/53#issuecomment-399678507
From this comment, I like the concise approach of using /dev/shm as volatile storage for the cache.
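A rough sketch of the /dev/shm idea, assuming Yarn 1.x (which accepts --cache-folder) and a base image with yarn preinstalled. The directory name yarn-cache is made up for the example.

```dockerfile
# Assumption: a base image that already has yarn on the PATH.
FROM node:8.9.1-alpine
WORKDIR /usr/src/app

COPY package.json yarn.lock ./

# /dev/shm is an in-memory filesystem while the RUN step executes, so
# whatever Yarn caches there is not written into the image layer.
RUN yarn install --frozen-lockfile --production \
      --cache-folder /dev/shm/yarn-cache
```

One caveat to check before relying on this: /dev/shm is small by default in Docker, so a large dependency tree might run out of space unless the build is given a bigger shared-memory size.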
@jeremejevs to prevent the yarn cache from winding up in docker layers, we would need to have them together in a single RUN yarn install && yarn cache clean command in our Dockerfile, right?
I don't know for sure, but I assume a RUN yarn cache clean command on its own would just mark the cache dir as deleted in a new docker layer, while the earlier RUN yarn install layer would still contain the entire cache.
@jedwards1211 That is correct, yes.
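To make the difference concrete, here is a small sketch of the two variants discussed above. The base image and working directory are assumptions; the yarn commands are the ones quoted in this thread.

```dockerfile
# Assumption: a base image that already has yarn on the PATH.
FROM node:8.9.1-alpine
WORKDIR /usr/src/app
COPY package.json yarn.lock ./

# Variant to avoid: the cache is committed with the first layer, and the
# second layer only records that the files were deleted afterwards.
# RUN yarn install --frozen-lockfile --production
# RUN yarn cache clean

# Install and clean up in the same RUN, so the cache is already gone
# when the layer is committed.
RUN yarn install --frozen-lockfile --production && yarn cache clean
```

Running docker history on the resulting image should show the size of each layer, which is a quick way to compare the two variants.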
A tad off topic, but I'm doing RUN yarn install --frozen-lockfile --production --no-cache && yarn build and I get a big delay before it transitions to the next layer. If I appended && rm -rf ./node_modules to the command (as I have no need of them after the build), would that reduce the delay or produce a leaner layer?
AFAICT yarn still doesn't have a --no-cache option.
Yes, yarn cache clean is the only way to avoid putting the cache in your layer. There's also an experimental feature that allows you to mount a cache directory during build (but I haven't had much success with making it work effectively yet), and multi-stage builds have a lot of promise but are still difficult to use.
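As an illustration of the build-time cache mount mentioned above, here is a sketch that assumes a Docker version with BuildKit enabled and Yarn 1.x. The cache path /tmp/yarn-cache is arbitrary and chosen for the example; --cache-folder is passed explicitly so the sketch does not depend on Yarn's default cache location.

```dockerfile
# syntax=docker/dockerfile:1
# Assumption: BuildKit is enabled and the base image has yarn on the PATH.
FROM node:8.9.1-alpine
WORKDIR /usr/src/app

COPY package.json yarn.lock ./

# The cache mount persists between builds on the same builder but is not
# part of the image, so no separate "yarn cache clean" is needed here.
RUN --mount=type=cache,target=/tmp/yarn-cache \
    yarn install --frozen-lockfile --production \
      --cache-folder /tmp/yarn-cache
```

On older Docker releases this likely needs DOCKER_BUILDKIT=1 set for docker build, and the syntax line may have to point at the experimental frontend instead.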