Description
Portainer seems to be unable to connect to the Docker endpoint (host) when the portainer service is scaled to more than one instance on a swarm.
I have several swarms which reproduce this issue.
Steps to reproduce the issue:
docker service create \
--name portainer \
--publish 9000:9000 \
--constraint 'node.role == manager' \
--mount type=bind,src=/var/run/docker.sock,dst=/var/run/docker.sock \
portainer/portainer \
-H unix:///var/run/docker.sock
$ docker service scale portainer=5
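After scaling, each task runs as a separate container, potentially on a different manager node. A quick way to confirm this is to list the service's tasks (a sketch; assumes the service is named portainer as above):

```shell
# List the tasks of the portainer service; each task is a separate
# container, and each container carries its own embedded database.
docker service ps portainer --format '{{.Name}}\t{{.Node}}\t{{.CurrentState}}'
```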
This opens a serious security hole: if Portainer does not propagate credentials and user accounts across the swarm managers, anyone can create an admin account on a not-yet-initialised Portainer instance.
Running only one Portainer instance is also unreliable on an AWS Docker swarm, since AWS is liable to shut down manager nodes...
Technical details:
Command used to start Portainer (docker run -p 9000:9000 portainer/portainer): see above.
Service logs:
[email protected] | 2017/09/19 17:01:01 Starting Portainer on :9000
[email protected] | 2017/09/19 17:01:01 Starting Portainer on :9000
[email protected] | 2017/09/19 16:11:34 Starting Portainer on :9000
[email protected] | 2017/09/19 16:11:34 http error: User not found (code=404)
[email protected] | 2017/09/19 16:11:38 http error: User not found (code=404)
[email protected] | 2017/09/19 17:01:01 Starting Portainer on :9000
[email protected] | 2017/09/19 17:01:02 Starting Portainer on :9000
[email protected] | 2017/09/19 17:09:30 http error: User not found (code=404)
[email protected] | 2017/09/19 17:09:30 http error: User not found (code=404)
[email protected] | 2017/09/19 17:12:20 http error: User not found (code=404)
Hi @stevelacy
When running Portainer with the following command:
docker service create \
--name portainer \
--publish 9000:9000 \
--constraint 'node.role == manager' \
--mount type=bind,src=/var/run/docker.sock,dst=/var/run/docker.sock \
portainer/portainer \
-H unix:///var/run/docker.sock
You're telling Swarm to deploy a Portainer instance on each one of your Swarm manager nodes. You're also not persisting the Portainer container's data outside of the container (see http://portainer.readthedocs.io/en/stable/deployment.html#persist-portainer-data).
Portainer uses a file-based database that is, by default, embedded inside the Portainer container. With this service create instruction, you are telling Swarm to deploy one Portainer container on each manager of your cluster, each instance with its own embedded database. When you query Portainer via Swarm, the request is automatically load-balanced to one of the Portainer instances deployed in the Swarm, so you may hit a different instance (with a different database) on each request.
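This is easy to observe from the outside: because Swarm's routing mesh load-balances each connection, repeated requests to the published port can land on different tasks, and an admin account created on one instance will not exist on the others, which is what produces the "User not found" errors in the logs above. A hedged sketch (assumes the service is published on localhost:9000 as above; the admin-check endpoint is used purely for illustration):

```shell
# Fire several requests at the published port; each one may be served
# by a different Portainer task with a different embedded database,
# so the status codes can disagree between requests.
for i in 1 2 3 4 5; do
  curl -s -o /dev/null -w '%{http_code}\n' http://localhost:9000/api/users/admin/check
done
```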
When deploying Portainer as a service inside a Swarm cluster, we recommend the following:
docker service create \
--name portainer \
--publish 9000:9000 \
--replicas=1 \
--constraint 'node.role == manager' \
--mount type=bind,src=/var/run/docker.sock,dst=/var/run/docker.sock \
--mount type=bind,src=/path/to/shared/filesystem,dst=/data \
portainer/portainer -H unix:///var/run/docker.sock
This ensures that only one instance of Portainer is running on a manager at any time. If the Portainer instance goes down, or the manager node it is running on goes down, Swarm will automatically reschedule a new Portainer container inside the cluster, ensuring high availability.
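One way to sanity-check the single-replica deployment and the rescheduling behaviour (a sketch; the node name is a placeholder for one of your real manager IDs):

```shell
# Confirm the service is pinned to exactly one replica (REPLICAS 1/1):
docker service ls --filter name=portainer

# Simulate a node failure by draining the manager currently running
# Portainer (replace node-1 with an actual node ID from `docker node ls`),
# then watch Swarm reschedule the task onto another manager:
docker node update --availability drain node-1
docker service ps portainer
```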
I'll close this issue; feel free to re-open it if you think something is missing. You can also comment on the related issue https://github.com/portainer/portainer/issues/523 if you want to discuss the subject further.