Do you want to request a feature or report a bug?
Bug
What is the current behavior?
We usually structure our Dockerfiles to take advantage of the Docker cache mechanism: we first copy package.json and yarn.lock, then run yarn --pure-lockfile, and then copy the rest of the package. This way the dependencies, which change less often than the package source code, can be cached by Docker, and we can often avoid having Yarn run on every build.
With Yarn workspaces, you can't run yarn if you have just the package.json and the yarn.lock files, because it will complain about the missing workspace packages.
If the current behavior is a bug, please provide the steps to reproduce.
Have a Dockerfile like this:
FROM node:9-alpine
WORKDIR /app
# Copy only the manifests first, so the install layer below can be cached
COPY package.json .
COPY yarn.lock .
# This is where it fails: the workspace packages haven't been copied yet
RUN yarn --pure-lockfile
COPY . .
RUN yarn test
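For completeness, a minimal root package.json that triggers the error might look like this (the name and the packages/* glob are illustrative; any workspace layout behaves the same):

{
  "name": "monorepo-root",
  "private": true,
  "workspaces": ["packages/*"]
}

With only this file and yarn.lock present in the image, yarn --pure-lockfile aborts because the directories matched by the workspaces globs don't exist yet.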
What is the expected behavior?
There should be a way to tell Yarn to install all the dependencies but defer the package linking to a later step, once the whole source tree has been copied. That way the step that repeats on every build is reduced to just the package linking, while the fetching and installation of the dependencies stays cached.
I am currently struggling with caching in a multi-stage build. Since it's not possible to use COPY to match files with a glob pattern and preserve the directory layout, I've decided to manually copy each individual package.json from my workspace packages, do a yarn install, and then copy everything into the final image.
Are you asking to install the dependencies listed in the yarn.lock rather than from the package.json? At least in our case, the root-level package.json doesn't have any dependencies; they are all listed in the other packages in the workspace.
Given a workspace containing appA, appB and moduleC, my Dockerfile looks like:
FROM node:8
WORKDIR /app
# With multiple sources, COPY's destination must be a directory ending in /
COPY package.json yarn.lock ./
COPY appA/package.json appA/yarn.lock appA/
COPY moduleC/package.json moduleC/yarn.lock moduleC/
RUN yarn
COPY appA/ appA/
COPY moduleC/ moduleC/
RUN cd appA && yarn run build
If I understand your suggestion correctly, that could change to:
FROM node:8
WORKDIR /app
COPY package.json yarn.lock ./
RUN yarn --install-from-yarn-lock
COPY appA/ appA/
COPY moduleC/ moduleC/
RUN cd appA && yarn install --link-only && yarn run build
But I'd get all of the dependencies for appB in my Docker image.
Don't necessarily take this as criticism, though. If I didn't mind adding all the dependencies and code for appB to the image, your suggestion would make the Dockerfile much simpler and much less fragile. I like it; I'm just still struggling with how to put things together correctly myself, because as you can imagine we have a lot more than just appA, appB & moduleC. :)
This seems like it's still an unresolved challenge, yes?
It's not pretty, but I use a bash script docker-prepare.sh to copy all package.json files from all packages first, then run yarn in the root to install all dependencies, and then I copy everything else.
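For illustration, a minimal sketch of what such a script could look like, assuming the workspace packages live under packages/ and the manifests are staged into a docker-build/ directory (both names are assumptions, not from the comment above):

#!/usr/bin/env bash
# docker-prepare.sh -- stage only the manifests, preserving the workspace
# layout so a single COPY in the Dockerfile keeps the paths intact.
set -euo pipefail

DEST=docker-build          # staging directory (illustrative name)
rm -rf "$DEST"
mkdir -p "$DEST"
cp package.json yarn.lock "$DEST/"

# Mirror every workspace package.json into the staging directory,
# skipping anything under node_modules.
find packages -path '*/node_modules' -prune -o -name package.json -print |
while read -r manifest; do
  mkdir -p "$DEST/$(dirname "$manifest")"
  cp "$manifest" "$DEST/$manifest"
done

The Dockerfile can then COPY docker-build/ first, run yarn, and copy the real sources afterwards, so the install layer only invalidates when a manifest changes.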
There should be some official way to do this, even if it's just a recommended, well-designed script file to execute.
+1 I'm thinking of adding a flag like yarn --no-workspace, followed later by yarn --link-workspace, so that the prior command can be cached.
If this seems right, I'll raise a PR for the same.
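To make the idea concrete, here is how those proposed flags (hypothetical, not existing Yarn options) might slot into the Dockerfile from the original report:

FROM node:9-alpine
WORKDIR /app
COPY package.json yarn.lock ./
# Hypothetical flag: fetch/install without linking workspaces;
# cached as long as the manifests are unchanged
RUN yarn --no-workspace
COPY . .
# Hypothetical flag: cheap linking step that reruns on every source change
RUN yarn --link-workspace
RUN yarn test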
Any ideas/solutions on that?
Theoretically, there is yarn --focus that could help with this issue, but it requires the packages to have been published to a package registry. So... this can't be the solution for private-only packages inside a monorepo.
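For context, a focused install is run from inside a single workspace package, which is also where the registry limitation comes from (appA is just the example name from earlier in the thread):

# Sibling workspace packages are fetched as their published versions from
# the registry instead of being linked locally, which is why unpublished
# private packages can't use this.
cd appA
yarn install --focus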
We were running into the same issue until we came across this. After implementing this suggestion, we saw a staggering 70% reduction in build time!
There is one catch with Docker multi-stage builds and caching, though: --cache-from can only reuse layers from images you can pull, and intermediate stages aren't part of the final image. Hence it is necessary to build and push the individual stages as their own tags and pass each of them via --cache-from in subsequent builds.
Speeding up multistage builds in Docker
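Roughly, the per-stage caching dance looks like this (the myapp image name and the deps stage name are illustrative; the Dockerfile is assumed to name its dependency stage with FROM ... AS deps):

# Pull previous images so their layers are available as cache
# (|| true keeps the very first build from failing when nothing exists yet)
docker pull myapp:deps || true
docker pull myapp:latest || true

# Build and tag the intermediate stage explicitly, then the final image
docker build --target deps --cache-from myapp:deps -t myapp:deps .
docker build --cache-from myapp:deps --cache-from myapp:latest -t myapp:latest .

# Push both tags so the next CI run can pull them for --cache-from
docker push myapp:deps
docker push myapp:latest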