Description
Portainer seems to be unable to connect to the Docker endpoint (host) when the portainer service is scaled to more than one instance on a swarm.
I have several swarms which reproduce this issue.
Steps to reproduce the issue:
docker service create \
--name portainer \
--publish 9000:9000 \
--constraint 'node.role == manager' \
--mount type=bind,src=/var/run/docker.sock,dst=/var/run/docker.sock \
portainer/portainer \
-H unix:///var/run/docker.sock
$ docker service scale portainer=5
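After scaling, each task runs as a separate container, potentially on a different manager node. A quick way to confirm this is to list the service's tasks (a sketch; assumes the service is named portainer as above):

```shell
# List the tasks of the portainer service; each task is a separate
# container, and each container carries its own embedded database.
docker service ps portainer --format '{{.Name}}\t{{.Node}}\t{{.CurrentState}}'
```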
This opens a serious security hole: if Portainer does not propagate credentials and user accounts across the swarm managers, anyone can create an admin account on a not-yet-initialised Portainer instance.
Running only one Portainer instance is also unreliable on an AWS Docker swarm, since AWS is liable to shut down manager nodes...
Technical details:
Command used to start Portainer (docker run -p 9000:9000 portainer/portainer): see above.
Service logs:
[email protected] | 2017/09/19 17:01:01 Starting Portainer on :9000
[email protected] | 2017/09/19 17:01:01 Starting Portainer on :9000
[email protected] | 2017/09/19 16:11:34 Starting Portainer on :9000
[email protected] | 2017/09/19 16:11:34 http error: User not found (code=404)
[email protected] | 2017/09/19 16:11:38 http error: User not found (code=404)
[email protected] | 2017/09/19 17:01:01 Starting Portainer on :9000
[email protected] | 2017/09/19 17:01:02 Starting Portainer on :9000
[email protected] | 2017/09/19 17:09:30 http error: User not found (code=404)
[email protected] | 2017/09/19 17:09:30 http error: User not found (code=404)
[email protected] | 2017/09/19 17:12:20 http error: User not found (code=404)
Hi @stevelacy
When running Portainer with the following command:
docker service create \
--name portainer \
--publish 9000:9000 \
--constraint 'node.role == manager' \
--mount type=bind,src=/var/run/docker.sock,dst=/var/run/docker.sock \
portainer/portainer \
-H unix:///var/run/docker.sock
You're telling Swarm to deploy a Portainer instance on each one of your Swarm manager nodes. You're also not persisting the Portainer container's data outside of the container (see http://portainer.readthedocs.io/en/stable/deployment.html#persist-portainer-data).
Portainer uses a file-based database that is, by default, embedded inside the Portainer container. With this service create instruction, you are telling Swarm to deploy one Portainer container on each manager of your cluster, each instance with its own embedded database. When you query Portainer via Swarm, the request is automatically load-balanced to one of the Portainer instances deployed in the Swarm, so you may hit a different instance (with a different database) on each request.
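This is easy to observe from the outside: because Swarm's routing mesh load-balances each connection, repeated requests to the published port can land on different tasks, and an admin account created on one instance will not exist on the others, which is what produces the "User not found" errors in the logs above. A hedged sketch (assumes the service is published on localhost:9000 as above; the admin-check endpoint is used purely for illustration):

```shell
# Fire several requests at the published port; each one may be served
# by a different Portainer task with a different embedded database,
# so the status codes can disagree between requests.
for i in 1 2 3 4 5; do
  curl -s -o /dev/null -w '%{http_code}\n' http://localhost:9000/api/users/admin/check
done
```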
When deploying Portainer as a service inside a Swarm cluster, we recommend the following:
docker service create \
--name portainer \
--publish 9000:9000 \
--replicas=1 \
--constraint 'node.role == manager' \
--mount type=bind,src=/var/run/docker.sock,dst=/var/run/docker.sock \
--mount type=bind,src=/path/to/shared/filesystem,dst=/data \
portainer/portainer -H unix:///var/run/docker.sock
This ensures that only one instance of Portainer is running on a manager at any time. If the Portainer instance goes down, or the manager node it is running on goes down, Swarm will automatically reschedule a new Portainer container inside the cluster, ensuring high availability.
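One way to sanity-check the single-replica deployment and the rescheduling behaviour (a sketch; the node name is a placeholder for one of your real manager IDs):

```shell
# Confirm the service is pinned to exactly one replica (REPLICAS 1/1):
docker service ls --filter name=portainer

# Simulate a node failure by draining the manager currently running
# Portainer (replace node-1 with an actual node ID from `docker node ls`),
# then watch Swarm reschedule the task onto another manager:
docker node update --availability drain node-1
docker service ps portainer
```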
I'll close this issue; feel free to re-open it if you think something is missing. You can also comment on the related issue https://github.com/portainer/portainer/issues/523 if you want to discuss the subject further.