We currently have some lightweight documentation about how to use Compose in production, but it could do with improvements:
- Resources
- Init scripts and service configs for systemd, so that systemd can take over keeping services running; possibly a command to generate these configs.
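To illustrate what such a generated service config could look like, here is a minimal hand-written systemd unit wrapping a Compose project. This is only a sketch: the unit name, project directory and docker-compose path are assumptions, not anything Compose produces today.

# Sketch only: unit name, WorkingDirectory and the docker-compose path are assumptions.
sudo tee /etc/systemd/system/docker-compose-myapp.service > /dev/null <<'EOF'
[Unit]
Description=myapp via docker-compose
Requires=docker.service
After=docker.service

[Service]
WorkingDirectory=/opt/myapp
ExecStart=/usr/local/bin/docker-compose up
ExecStop=/usr/local/bin/docker-compose down
Restart=always

[Install]
WantedBy=multi-user.target
EOF
sudo systemctl daemon-reload
sudo systemctl enable --now docker-compose-myapp.service

A generator command could essentially template this file from the project name and directory.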
There's some previous discussion and requests in #93, #1264, #1730, #1035. I've closed those tickets in favour of this one, but linking to them for reference.
I would like some recommendations on how to deploy config files through docker/compose/swarm.
We have a setup (that was recommended by consultants) that makes images with config files and declares volumes to export them. This looks good in principle, but it does not work as expected in a number of cases.
Putting the configs into a "config" image that just exposes a volume seems like a reasonable way to do it. It'd be great to hear more about the cases where it doesn't work, either here or in a new issue.
@dnephin If you go down the road of one single configuration container and use --volumes-from you sacrifice some security (every container sees all configs) but it looks easy to set up and nice: it's immutable and does not use any host fs path.
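For anyone trying this, a minimal sketch of that single-config-container setup with the plain CLI (the image and path names are made up for illustration; the config image's Dockerfile is assumed to declare VOLUME /etc/myapp and COPY the config files into it):

docker create --name app-config myorg/app-config:latest          # data-only container, never started
docker run -d --name app --volumes-from app-config myorg/app:latest

The catch, as described next, is that rebuilding the config image does not propagate to containers that already mounted the volume via --volumes-from; they keep their current volume until they are recreated.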
Once you operate outside localhost and do not recreate everything on each run, you start learning the subtleties of --volumes-from and compose recreation policies: the config container is updated and restarted, but client containers still mount their current volumes unless they are recreated as well for independent reasons. This took a while to notice and left us with a workaround of deleting the old config container whenever the config changes.
Another solution would seem to be giving up immutability and changing data inside the same volumes, running a cp or similar. At that point it would just be easier to pull the configs from a git repo and skip the image build altogether... which was the stateful solution I originally had in mind. If you want no fixed host path, you need a config data-only container and a config copier.
I am not 100% happy with any of the solutions. Either I am not seeing a better way or maybe some feature is still missing (like some crazy image multiple inheritance or some smarter detection of dependencies when using --volumes-from that I can't figure out).
What this needs, essentially, is to add a build step, an environment layer, without really building a new image.
I have a suggested solution for supporting the zero-downtime deployment a lot of us want.
Why not simply add a new option to docker-compose.yml like "zero_downtime:" that would work as follows:
web:
  image: sbgc (rails)
  restart: always
  links:
    - postgres
    - proxy
    - cache
  zero_downtime: 50  # delay 50 milliseconds before stopping the old container; default would be 0
I run separate containers for nginx, web(rails), postgres and cache(memcached). However, it's the application code in the web container that changes and the only one I need zero downtime on.
$ docker-compose up -d web
During "up" processing that creates the new "web" container, if the zero_downtime option is specified, start up the new container first exactly like scale web=2 would. Then stop and remove sbgc_web_1 like it currently does. Then rename sbgc_web_2 to sbgc_web_1. If a delay was specified (as in the 50 milliseconds example above) it would delay 50 milliseconds to give the new container time to come up before stopping the old one.
If there were 10 web containers already running it would start from the end and work backwards.
This is how I do zero downtime deploys today. Clunky but works: [updated]
$ docker-compose scale web=2 (start new container running as sbgc_web_2)
$ docker stop sbgc_web_1 (stop old container)
$ docker rm sbgc_web_1 (remove old container)
Update: we need a way to rename the sbgc_web_2 container to sbgc_web_1. Thought we could just use 'docker rename sbgc_web_2 sbgc_web_1' which works but then running 'docker-compose scale web=2' will produce sbgc_web_3 instead of sbgc_web_2 as expected.
What happens to links if you do that? I guess you need a load balancer container linked to the ones you launch and remove, and you can't restart it (?)
The links between containers are fine in the scenario above. Adding a load balancer in front would work, but seems like overkill if we just need to replace a running web container with a new version. I can accomplish that manually by scaling up and stopping the old container, but it leaves the new container numbered at 2. If the internals of docker-compose were changed to start the new one first, stop the old one and renumber the new one, I think this would be a pretty good solution.
In a real use case you want to wait for the second (newer) instance of the service to be ready before considering it healthy. This may include connecting to databases and performing setup work; it's very application specific. Then you want to wait for connection draining on the older copy before closing it. Again, connection draining and timeouts are application specific too. It could be a bit overkill to add support for all of that to docker-compose.
Right, the 2nd container would need time to start up which could take some time depending on the application. That is why I proposed adding a delay:
zero_downtime: 50 (delay 50 milliseconds before stopping the old container; default would be 0)
As far as stopping the original goes it wouldn't be any different than what docker-compose stop does currently.
Basically my proposal is just to start the new container first, give it time to come up if needed and then stop and remove the old container. This can be accomplished manually with the docker command line today. The only remaining piece would be to rename the new container. Also possible to do manually today except that docker compose doesn't change the internal number of the container.
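For reference, the manual sequence including the rename caveat looks roughly like this (the sleep is just a placeholder for "give the new container time to come up"; a real deployment would wait on an application-specific readiness check):

docker-compose scale web=2               # start sbgc_web_2 alongside sbgc_web_1
sleep 5                                  # placeholder delay; replace with a proper readiness check
docker stop sbgc_web_1 && docker rm sbgc_web_1
docker rename sbgc_web_2 sbgc_web_1      # works, but compose's container-number label still says 2,
                                         # so the next "scale web=2" creates sbgc_web_3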
Hey folks, I was facing the need for a zero-downtime deployment for a web service today, and tried to take the scaling approach which didn't work well for me before I realized I could do it by extending my app into 2 identical services (named service_a and service_b in my sample repo) and restarting them one at a time. Hope some of you will find this pattern useful.
https://github.com/vincetse/docker-compose-zero-downtime-deployment
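Without having dug into the repo's scripts, the shape of the rotation is essentially two interchangeable copies of the same service behind the proxy, recreated one at a time; a rough sketch (the health-wait step is whatever your setup provides):

docker-compose up -d --no-deps service_a    # recreate the first copy with the new image
# wait until service_a is healthy and receiving traffic again
docker-compose up -d --no-deps service_b    # then recreate the second copy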
It does not work for me. I have added an issue on your repo.
I just came across this ticket while deciding on whether or not to use compose with flocker and docker swarm, or whether to use ECS for scaling/deployment jobs, using the docker cli only for certain ad-hoc cluster management tasks.
I've decided to go with compose to keep things native. I'm not fond of the AWS API, and I think most developers, like me, would rather not mess about with ridiculously nested JSON objects and so on.
I then came across DevOps Toolkit by Viktor Farcic, and he uses a pretty elegant solution to implement blue-green deployments with compose and Jenkins (if you guys use Jenkins). It's pretty effective; I've tested it in staging. Otherwise, it would seem @vincetse has a pretty good solution that doesn't involve much complexity.
A very good implementation of rolling upgrades already exists in Rancher:
http://docs.rancher.com/rancher/latest/en/rancher-compose/upgrading/
Now that Docker swarm mode is native, with built-in load balancing and health check options, there's no need for haproxy/nginx. Is there a simpler solution?
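For reference, swarm mode's built-in rolling update can be driven from the plain CLI; something like the following (image name and timings are placeholders):

docker service update \
  --image myorg/web:new \
  --update-parallelism 1 \
  --update-delay 10s \
  web

That only applies to services running under swarm mode, though, not to plain docker-compose projects.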
Update: we need a way to rename the sbgc_web_2 container to sbgc_web_1. Thought we could just use 'docker rename sbgc_web_2 sbgc_web_1' which works but then running 'docker-compose scale web=2' will produce sbgc_web_3 instead of sbgc_web_2 as expected.
If anyone wonders why, that's because of a label that docker-compose adds to containers:
"Labels": {
// ...
"com.docker.compose.container-number": "3",
// ...
}
Sadly, it's not yet possible to update labels on running containers.
(Also, to save people a bit of time: trying to break docker-compose by using its labels: section to force the value of that label does not work :P )
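If you want to check that label on a running container, a docker inspect template like this shows it (the container name is just the example from above):

docker inspect -f '{{ index .Config.Labels "com.docker.compose.container-number" }}' sbgc_web_2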
Ok, I managed to automate zero downtime deploy, thanks @prcorcoran for the guidelines.
I'll describe here in more detail how to perform it when using nginx:
1. docker-compose scale web=2 (starts a new container alongside the old one)
2. Point the nginx upstream configuration at the new container, then sudo nginx -s reload (do a reload, not a restart, or it will close active connections)
3. Stop and remove the old container, then docker-compose scale web=1
useful commands
To find container ids after scaling up, I use:
docker-compose ps -q <service>
This can be used to find the new container's IP and to stop and remove the old container.
Note that the order is not guaranteed; the containers have to be inspected to know which one is the oldest.
To find container creation date:
docker inspect -f '{{.Created}}' <container id>
To find new container IP:
docker inspect -f '{{range .NetworkSettings.Networks}}{{.IPAddress}}{{end}}' <container id>
a few more considerations
As mentioned in previous comments, the number in the container name will keep incrementing: it will be e.g. app_web_1, then app_web_2, then app_web_3, etc. I didn't find that to be a problem (if there's ever a hard limit on this number, a cold restart of the app resets it). I also didn't have to rename containers manually to keep the newest container up; we just have to manually stop the old container.
You can't specify port mapping in your docker-compose file, because then you can't have two containers running at the same time (they would try to bind to the same port). Instead, you need to specify the port in nginx upstream configuration, which means you have to decide about it outside of docker-compose configuration.
The described method works when you only want a single container per service. That being said, it shouldn't be too hard to look at how many containers are running, scale to double that number, then stop/rm that number of old containers, as sketched below.
Obviously, the more services you have to rotate, the more complicated it gets.
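A rough, untested sketch of that generalization (the service name is a placeholder, and the proxy update step is left as a comment):

SERVICE=web
OLD_IDS=$(docker-compose ps -q "$SERVICE")        # containers currently running the old image
COUNT=$(echo "$OLD_IDS" | wc -l)
docker-compose scale "$SERVICE=$((COUNT * 2))"    # start as many new containers as old ones
# ...point the nginx upstreams at the new containers and reload, then retire the old ones:
for id in $OLD_IDS; do
  docker stop "$id" && docker rm "$id"
done
docker-compose scale "$SERVICE=$COUNT"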
@oelmekki The scale command has been deprecated. The recommended way to scale is now:
docker-compose up --scale web=2
@oelmekki Also, if the web container has port bindings to the host, won't running scale create a port conflict?
Bind for 0.0.0.0:9010 failed: port is already allocated
is the message I get for a container that has the following ports:
ports:
  - 9000:9000
  - 9010:9010
If you have a setup that utilizes nginx, for instance, this probably won't be an issue, since the service you're scaling is not the service that has port bindings to the host.
@jonesnc
also, if the web container has port bindings to the host, won't running scale create a port conflict?
That's why I explicitly mention not to do it :)
In my previous comment:
You can't specify port mapping in your docker-compose file, because then you can't have two containers running at the same time (they would try to bind to the same port). Instead, you need to specify the port in nginx upstream configuration, which means you have to decide about it outside of docker-compose configuration.
--
You can't bind those ports on the host, but you can bind those ports on the containers, which each have their own IP. So the job is to find the IP of the new container and replace the old container's IP with it in the nginx upstream configuration. If you don't mind reading golang code, you can see an implementation example here.
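A rough shell sketch of that swap, assuming the upstream file lives at /etc/nginx/conf.d/upstream.conf and the app listens on port 4000 (both are assumptions, adjust to your setup):

NEW_ID=$(docker-compose ps -q web | tail -n 1)    # candidate for the new container; check creation dates to be sure
NEW_IP=$(docker inspect -f '{{range .NetworkSettings.Networks}}{{.IPAddress}}{{end}}' "$NEW_ID")
sudo sed -i "s/server .*:4000;/server $NEW_IP:4000;/" /etc/nginx/conf.d/upstream.conf
sudo nginx -t && sudo nginx -s reload             # reload, not restart, to keep active connections alive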
@oelmekki oops! That part of your post didn't register in my brain, I guess.
This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.
This issue has been automatically marked as not stale anymore due to the recent activity.
How to manage releases, particularly when deploying through Docker Hub (e.g. build images in development; in production, use the image from the Hub)
This is one of the main focuses of https://github.com/docker/app, and considering how long this issue has been open without any concrete answer, I think it's better to just close it.
@ndeloof That may be the main focus of docker/app, but there is still no way to do zero-downtime or rolling updates, so I think only one part of this issue has been solved by app.
In fact, the methods listed above don't work now. If the image is updated and docker-compose scale web=2 is run, then both containers are recreated, instead of one new container with the new image.
You're perfectly right, but the purpose of this issue has been to document those deployment practices, and obviously there's no standard way to achieve this, especially considering the various platforms compose can be used for (single engine, swarm, kubernetes, ecs ...)
Thanks @oelmekki for your insights. It has been very useful and encouraging when there is so little info on rolling updates with docker-compose.
I ended up writing the following script, docker_update.sh <service_name>, which seems to work very decently. It relies on a healthcheck command, which is not mandatory (change -f "health=healthy" accordingly) but is cleaner IMHO than waiting for the container to simply be up, when it takes a little time to boot (which will be the case if you run e.g. npm install && npm start as the command).
#!/bin/bash
# Zero-downtime update of a single docker-compose service fronted by nginx.
cd "$(dirname "$0")/.."

SERVICE_NAME=${1?"Usage: docker_update <SERVICE_NAME>"}
echo "[INIT] Updating docker service $SERVICE_NAME"

# Identify the currently running container for this service.
OLD_CONTAINER_ID=$(docker ps --format "table {{.ID}} {{.Names}} {{.CreatedAt}}" | grep $SERVICE_NAME | tail -n 1 | awk -F " " '{print $1}')
OLD_CONTAINER_NAME=$(docker ps --format "table {{.ID}} {{.Names}} {{.CreatedAt}}" | grep $SERVICE_NAME | tail -n 1 | awk -F " " '{print $2}')

echo "[INIT] Scaling $SERVICE_NAME up"
docker-compose up -d --no-deps --scale $SERVICE_NAME=2 --no-recreate $SERVICE_NAME

# The new container is the one created after the old one.
NEW_CONTAINER_ID=$(docker ps --filter="since=$OLD_CONTAINER_NAME" --format "table {{.ID}} {{.Names}} {{.CreatedAt}}" | grep $SERVICE_NAME | tail -n 1 | awk -F " " '{print $1}')
NEW_CONTAINER_NAME=$(docker ps --filter="since=$OLD_CONTAINER_NAME" --format "table {{.ID}} {{.Names}} {{.CreatedAt}}" | grep $SERVICE_NAME | tail -n 1 | awk -F " " '{print $2}')

# Wait until the new container's healthcheck reports healthy.
until [[ $(docker ps -a -f "id=$NEW_CONTAINER_ID" -f "health=healthy" -q) ]]; do
    echo -ne "\r[WAIT] New instance $NEW_CONTAINER_NAME is not healthy yet ..."
    sleep 1
done
echo ""
echo "[DONE] $NEW_CONTAINER_NAME is ready!"

# Restart nginx so it picks up the new container.
echo "[DONE] Restarting nginx..."
docker-compose restart nginx

echo -n "[INIT] Killing $OLD_CONTAINER_NAME: "
docker stop $OLD_CONTAINER_ID
until [[ $(docker ps -a -f "id=$OLD_CONTAINER_ID" -f "status=exited" -q) ]]; do
    echo -ne "\r[WAIT] $OLD_CONTAINER_NAME is getting killed ..."
    sleep 1
done
echo ""
echo "[DONE] $OLD_CONTAINER_NAME was stopped."

echo -n "[DONE] Removing $OLD_CONTAINER_NAME: "
docker rm -f $OLD_CONTAINER_ID

echo "[DONE] Scaling down"
docker-compose up -d --no-deps --scale $SERVICE_NAME=1 --no-recreate $SERVICE_NAME
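Note that because of the cd "$(dirname "$0")/.." at the top, the script expects to live one directory below the compose project, so a call looks something like this (the bin/ directory name is just an example):

./bin/docker_update.sh app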
And here's my docker-compose.yml:
app:
  build: .
  command: /app/server.sh
  healthcheck:
    test: curl -sS http://127.0.0.1:4000 || exit 1
    interval: 5s
    timeout: 3s
    retries: 3
    start_period: 30s
  volumes:
    - ..:/app
  working_dir: /app
nginx:
  depends_on:
    - app
  image: nginx:latest
  ports:
    - "80:80"
  volumes:
    - "../nginx:/etc/nginx/conf.d"
    - "/var/log/nginx:/var/log/nginx"
And my nginx.conf:
upstream project_app {
    server app:4000;
}

server {
    listen 80;
    server_name example.com;

    location / {
        proxy_pass http://project_app;
    }
}
Hope it can be useful to some. I wish it would be integrated into docker-compose by default.
Thanks a lot, but since you have bound the app port to a static 4000, only one container, new or old, can bind to it, so the new and old containers can't run simultaneously. I need to test it once, since I may be wrong.