Hi, thanks for a very useful tool. How can I deploy updates to my Rails app in production with zero downtime?
Currently I run the following on production, but it causes about 10 seconds of downtime.
sudo fig pull web
sudo fig up -d web
My production fig.yml:
db:
  image: postgres:9.3
  volumes_from:
    - db-data
  ports:
    - 5432
web:
  image: myaccount/my_private_repo
  command: bundle exec unicorn -p 3000 -c ./config/unicorn.rb
  volumes_from:
    - gems-2.1
  ports:
    - "80:3000"
  links:
    - db
Thanks
Zero downtime cannot be achieved while restarting a container. You need to either load-balance to another container while updating this one, or update the code inside your container and send a kill -HUP to Unicorn.
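For the in-place reload route, a minimal sketch, assuming the app code is reachable inside the running container (e.g. via a volume), Unicorn runs as PID 1 (as with the command in the fig.yml above), and the container is named myapp_web_1 (a made-up name); with preload_app enabled you would need the USR2/QUIT dance instead:
# Update the code the container sees (volume, docker exec git pull, ...), then:
docker kill --signal=HUP myapp_web_1   # Unicorn master reloads and gracefully respawns its workers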
Fig doesn't provide a solution for zero-downtime restarts. We shouldn't rule it out, but it's a complicated task and needs a thorough discussion of what functionality would best serve the majority of production cases.
Thank you, @coulix and @aanand. I assumed that Fig was suitable for production out of the box without extra work (which was silly of me, I admit). Do you think a paragraph or two about this would be useful in the README? Maybe with links to tutorials and blog posts where people describe their production setups, just to make it clear that Fig cannot do zero-downtime restarts, at least at the moment.
Since Fig is explicitly a development tool at the moment, I don't feel it's urgent that we address zero-downtime restarts in the docs. As Compose shapes up, it's definitely going to _become_ a concern (see the roadmap), and at that point we'll start to talk about production use more officially.
@evgenyneu Great post! +1
A couple of separate questions:
This is possibly a stupid question. Yes, I know there are a lot of options out there. I'm pretty new to Docker and still learning the ropes a bit. It'd be nice to be able to do development work and production deployments with as similar a configuration file as possible, or a common configuration file with overriding files for each environment.
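For what it's worth, later Compose releases support exactly this pattern with multiple -f files; a sketch, assuming a shared docker-compose.yml plus hypothetical docker-compose.dev.yml and docker-compose.prod.yml overrides:
# Development: base file plus a dev override (bind mounts, debug settings, ...)
docker-compose -f docker-compose.yml -f docker-compose.dev.yml up -d
# Production: same base file, different override (pinned image tags, restart policies, ...)
docker-compose -f docker-compose.yml -f docker-compose.prod.yml up -d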
This (at least for me) would solve very simple zero-downtime use of Fig in production on a single machine. I noticed there is no way to do fig kill web_1, and fig restart web does not _rebuild_ the container even after fig build web has rebuilt the image. An example of what I have in mind is below.
# as example: i have 3 web containers running
fig recycle web
# - fig build web - rebuild our image
# - tear down/kill web_1, start new web_1 fresh from new image
# - tear down/kill web_2, start new web_2 fresh from new image
# - tear down/kill web_3, start new web_3 fresh from new image
You could remove the automatic fig build web and have the user do it manually beforehand if you wanted, i.e. make users run fig build web && fig recycle web. Also maybe add a -p, --pause option to set a pause between restarts, to give services time to come back up if you are using Rails or Java or similar.
+1, there doesn't seem to be a "best-practice" way of doing this; it looks as if there aren't a lot of people using Docker with Rails yet, at least not for production purposes. The plethora of information out there is hard to digest. It'd be nice to have zero-downtime deployments out of the box with Fig or Compose.
I guess a suitable way for simple setups is to keep the container running and restart, say, puma inside it (provided it's not being run in the foreground)? Capistrano could be used to orchestrate that, after updating a git repo on the host connected to the container via a volume. Any thoughts?
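A hypothetical sketch of that flow, assuming the app code lives on the host in /srv/myapp and is mounted into the container at /app, Puma runs in cluster mode with a pidfile (phased restarts don't work with preload_app), and the container is named myapp_web_1; all names and paths here are made up:
# Update the code the container sees via the shared volume
cd /srv/myapp && git pull origin master
# Phased restart: Puma replaces its workers one at a time, so requests keep being served
docker exec myapp_web_1 bash -c 'cd /app && bundle exec pumactl --pidfile tmp/pids/puma.pid phased-restart'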
@aanand any update on this by any chance? @fullofcaffeine's solution seems to make sense if using a process manager within the container. It would be great if there were a few recommended strategies.
We are moving our CI and Alpha environments to docker-compose.
Since we like to release there any commit from any team, we need to do dozens of updates in a single work day, most of them affecting only 1 service.
Restarting one service causes a quick partial downtime, while restarting the whole thing takes about 5-10 minutes.
Since dependency information and update checking (via image pull) are all managed via docker-compose, it is only logical that some kind of selective restart of services with updated images is handled by compose itself.
Is there any way to implement it? (or a better issue to track?)
To perform a real zero-downtime deployment you need a load balancer, and a tool to add/remove backends from the load balancer as nodes are stopped and started. If the load balancer is restarted as part of the deploy you won't have zero downtime, so it has to live outside of the compose file and be part of the infrastructure.
Since Compose doesn't manage any of that infrastructure for you, it's not really possible to do a real zero-downtime deployment without some other tooling.
I think for some cases (like dev and staging environments) what you're looking for is very-little-downtime deployments, which is something we can aim for with compose.
In 1.4.x we made "smart recreate" the default. This means that a container is only recreated if it changes, or if one of its dependencies changes. In 1.5.0 we added experimental support for the new Docker networks, which removes the need to recreate containers when only their dependencies have changed.
In 1.6.0 we should be making the new networking the default, and we can look at doing parallel restarts of all containers, which should make for relatively short downtime.
With the current release I would expect only a few seconds of downtime to recreate containers. Can you tell me more about why it takes 5 to 10 minutes?
Some related issues: #1663, #1264, #1035
@dnephin I think Compose is fast enough at starting everything from scratch.
The wasted time for us is in restarting ALL the applications inside the containers after just ONE was recently updated. Each app takes about 10-30 seconds (with high CPU) to initialize.
What I would like to have is a selective restart/rebuild of only those containers that have changes (image, dependencies). This results in a short partial downtime, rather than a full, longer one.
Something like:
docker-compose pull # checks for updates for all services
docker-compose up -d --smart-restart # restarts only the containers that have newer images or changed dependencies
> What I would like to have is a selective restart/rebuild of only those containers that have changed

That logic already exists at the service level. If all the containers for a service have the latest image and config, the service won't be restarted (unless one of its dependencies changes); it will just say "... is up-to-date". As I mentioned, if you use --x-networking it removes the need to recreate when the links change as well.
If you're looking for support at the container level, that was recently requested in #2451.
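In practice that flow looks something like the following sketch (the --x-networking flag only exists on the experimental 1.5.x releases):
docker-compose pull                  # fetch newer images while the old containers keep serving
docker-compose --x-networking up -d  # only services whose image or config changed get recreated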
I found that running docker-compose pull first saves the reload time and brings the downtime down to a minimal number of seconds (depending on how long your container takes to boot).
I just worked out another flow to achieve zero downtime (mainly for web apps).
Proxy
We use jwilder/nginx-proxy to handle routing to app servers; this will assist us in dynamically routing requests to services.
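For reference, a sketch of how that image is usually run (the proxy lives outside the per-release compose projects; the hostname is made up):
# Run the proxy once, publishing port 80 and watching the Docker socket for new containers
docker run -d -p 80:80 -v /var/run/docker.sock:/tmp/docker.sock:ro jwilder/nginx-proxy
# Each app container only needs a VIRTUAL_HOST env var for the proxy to route to it
docker run -d -e VIRTUAL_HOST=app.example.com myaccount/my_private_repo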
First Deploy
For the first deploy, run docker-compose --project-name=app-0001 up -d.
Rolling Update
We edit the docker-compose.yml with the new image id and run docker-compose --project-name=app-0002 up -d. We now have version 0.2 of the app up and running. The load balancer will already begin routing requests to it, and given we are using an nginx LB we will have zero downtime.
If you need to reach a desired scale, you can now run that command to scale up resources before you shut down the older version.
Now we can run docker-compose --project-name=app-0001 stop to close down the previous deploy. (Optionally we can run an rm to remove the data, but it might be a good idea to only remove it on the next deploy, i.e. deploy app-0003 up, app-0002 stop, app-0001 rm.)
Truly rolling update
If you have a reason to limit the number of resources running at a given time, you could simply stagger the scale, i.e.:
app-0001 scale web=10
...
app-0001 scale web=8 app-0002 scale web=2
app-0001 scale web=6 app-0002 scale web=4
app-0001 scale web=4 app-0002 scale web=6
app-0001 scale web=2 app-0002 scale web=8
app-0001 scale web=0 app-0002 scale web=10
...
app-0001 stop
Rollback
A rollback is also quite simple: run up on app-0001 and stop on app-0002.
80/20 deploy
This would be as simple as running scale web=2 on app-0001 and scale web=8 on app-0002.
Automate Everything
This also reminds me of a deploy strategy similar to Capistrano's; I might even look at using a similar tool to wrap docker-compose and save rewriting the deploy logic.
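A minimal wrapper along those lines might look like the sketch below; the release/previous numbering scheme and the sleep are assumptions (a real health check on the new containers would be better):
#!/bin/sh
# Hypothetical blue/green wrapper: bring up the new release project, wait for it to
# boot, then stop the previous one (kept around, stopped, for quick rollback).
set -e
RELEASE=$1     # e.g. 0002
PREVIOUS=$2    # e.g. 0001
docker-compose --project-name=app-$RELEASE pull
docker-compose --project-name=app-$RELEASE up -d
sleep 30       # crude stand-in for a health check on the new containers
docker-compose --project-name=app-$PREVIOUS stop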
That looks like an awesome way of handling it, @alexw23. I did not know about the --project-name param, but that looks like it cures this problem. It's also simple enough to write a wrapper around for your own deployments.
@alexw23 Awesome on the "--project-name"!
@alexw23 Thanks, very informative!
_cough_ rancher
Closing this as duplicate of #1786
Thanks for sharing the process, @alexw23. I have a couple of questions I hope you don't mind clarifying (apologies if I'm missing something obvious, I'm not too familiar with docker/compose yet):
1. Do you keep the shared pieces (nginx-proxy or any data stores) in a separate compose project? Otherwise you end up creating more copies of those other components when they are not required (for nginx-proxy in particular this might create a problem, since it publishes port 80, doesn't it?). How do you keep things organized / linked in this case?
2. Doesn't nginx-proxy start sending requests even if the container isn't ready to serve them? (e.g. Rails takes some time to load, but as far as the container is concerned it's up and running.)
If you are able to share some examples or scripts that would be great, and thanks again for sharing your solution!
Hi guys,
In case someone still needs a rolling upgrade example, I came up with the following solution for my (rocketchat) application:
# For each old container: make sure the service is at the desired scale (which brings
# up a fresh container from the current image), then stop and remove the old one.
for i in $(docker ps -f "name=<container_name>_*" -q); do
  docker-compose scale <service_name>=5
  docker stop $i
  sleep 10
  docker rm -f $i
done