Moby: docker service create doesn't allow --privileged flag

Created on 20 Jul 2016 · 99 comments · Source: moby/moby

Output of docker version:

Client:
 Version:      1.12.0-rc4
 API version:  1.24
 Go version:   go1.6.2
 Git commit:   e4a0dbc
 Built:        Wed Jul 13 03:39:43 2016
 OS/Arch:      linux/amd64

Server:
 Version:      1.12.0-rc4
 API version:  1.24
 Go version:   go1.6.2
 Git commit:   e4a0dbc
 Built:        Wed Jul 13 03:39:43 2016
 OS/Arch:      linux/amd64

Output of docker info:

Containers: 1
 Running: 1
 Paused: 0
 Stopped: 0
Images: 54
Server Version: 1.12.0-rc4
Storage Driver: aufs
 Root Dir: /var/lib/docker/aufs
 Backing Filesystem: extfs
 Dirs: 71
 Dirperm1 Supported: false
Logging Driver: json-file
Cgroup Driver: cgroupfs
Plugins:
 Volume: local
 Network: bridge null host overlay
Swarm: active
 NodeID: 33ops9juo9ea1twbfq2dyt89y
 IsManager: Yes
 Managers: 2
 Nodes: 5
 CACertHash: sha256:cef0da32ea05dd1038a5b8ae1a3a6956b6a5efa2d2fcad535a696dd568220197
Runtimes: runc
Default Runtime: runc
Security Options: apparmor seccomp
Kernel Version: 3.13.0-86-generic
Operating System: Ubuntu 14.04.4 LTS
OSType: linux
Architecture: x86_64
CPUs: 12
Total Memory: 94.42 GiB
Name: irvm-ggallag
ID: WA3H:N54J:H7F3:CQV6:74ZX:IWIZ:U6XG:2VCB:45LP:LDD5:FHB6:7CWZ
Docker Root Dir: /var/lib/docker
Debug Mode (client): false
Debug Mode (server): false
Registry: https://index.docker.io/v1/
Insecure Registries:
 127.0.0.0/8

Additional environment details (AWS, VirtualBox, physical, etc.):
Ubuntu 14.04 VM under KVM running Docker Engine 1.12 RC4

Steps to reproduce the issue:

  1. docker service create
  2. attempt an NFS mount inside the resulting container

Describe the results you received:
I can run "docker run --privileged" to allow an NFS mount from within my container. However, there is no way to pass this --privileged flag to "docker service", and if I do not pass it, the container errors internally on the mount like:

mount: permission denied

Describe the results you expected:
I should be able to have my container mount an NFS server from within it. I do not want to do this externally or via a docker volume, for example, I am trying to drive a huge number of parallel containers running NFS mounts and I/O individually.
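To make the gap concrete, here is a sketch contrasting the two invocations (image name, server address, and export path are made up; neither command is meant to be copy-pasted):

```shell
#!/bin/sh
# Sketch only -- image name, NFS server, and export path are hypothetical.
# mount(2) inside a container needs CAP_SYS_ADMIN, which --privileged grants:
run_cmd="docker run --privileged my-nfs-client mount -t nfs nfs.example.com:/export /mnt"
# docker service create (1.12) accepts no equivalent flag, so the same
# mount inside a task fails with 'mount: permission denied':
svc_cmd="docker service create --name nfs-client my-nfs-client"
echo "$run_cmd"
echo "$svc_cmd"
```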

Additional information you deem important (e.g. issue happens only occasionally):

area/swarm kind/feature version/1.12

Most helpful comment

Damn, forgot to come back and congratulate this issue on its second birthday.

All 99 comments

I think there is a whole set of issues for these features on service create, we should probably make an issue listing them all.

I think the plan was to discuss what should be added, once 1.12 is released.

Correct, I was planning to create a tracking issue for that

I could really use this for 1.12 as well. If this is an area I could jump in and issue a PR, I'm happy to get started on it.

We need to decide first; services are not "containers", so not all options can be / should be copied to service create

@thaJeztah Another consideration - I have different needs between a replicated service and a global one. If a jobs service type is introduced, which has been discussed, those needs might be different too.

The global one I may expect to have more flags/options around, just given the nature of "other things" I might be doing with them (monitoring, networking, running containers, etc.). I suppose I could have a global service that mounts the docker socket and then runs a privileged container on each node, but that seems messy (now the tasks in my global service are managing the lifecycle of a container on each engine separately).

Hopefully that helps with some of that discussion.

If services != containers, why do you pass an image name to the create command? Seems like you would rather pass something like a manifest (maybe exactly like the docker-compose.yml file?).

For this issue, if it is a PR, I'll phrase it in user story format: As a user of swarm, I want to create services and containers which run under privileged mode. How do I do this?

I'm happy to help any way that I can!

@frellus You're right - stacks/DABs are really what I need but they're also really early and don't have a service type option associated with them yet. There's also some other little nits there I need to write a more specific issue around in compose. Ultimately it's all still a bit of a chicken and egg problem - in one I get privileged, in the other I get service types. :) It'll all shake out, for now just reporting my uses to help provide as much data as I can! :smile:

I am very interested in this, because as far as I know the Oracle DB cannot run in a container without either the --privileged option or the --shm-size flag when running a container.

Given that they are both not supported yet in API 1.24 for services AFAICS, it would be impossible to replace a standalone Docker Swarm with Docker 1.12 swarm mode to run such services.

_Edit_: Oracle just published Docker files https://github.com/oracle/docker-images/tree/master/OracleDatabase, so this big hurdle is resolved for us.

Drive-by observation: you should do --cap-add before --privileged, to encourage people to be more granular in what they need. #25885 relates.

I'd really like to see it implemented soon - there are more solutions that require --privileged flag to function properly, e.g. cAdvisor which I'm using for containers performance monitoring.

If I may add ... --privileged and/or --device* are quite critical for the case where you need to run containers using GPU/CUDA calculations ....
Placement rules to allow these kinds of containers to run on specific hosts can be used ... but not being able to actually use the GPU ..... kind of renders swarm mode useless for us ... :(

Just FYI, linking the PR for supporting "device" to this issue:
https://github.com/docker/swarmkit/issues/1244
https://github.com/docker/swarmkit/pull/1355

Even though there would be --device, I think --privileged is still attractive (e.g. for DinD)

+1

--cap-add and --cap-drop is a must. --privileged would be nice. Are there any plans to implement it?

We're missing --cap-add to use swarm mode in production; my management is pushing me to move towards Kubernetes if this option is not added soon. Do you have a plan and agenda for adding this feature, please?

+1

--cap-add would be a huge help!

Working on kind of a workaround. It will run your privileged app in a secondary container by mounting /var/run/docker.sock in your service and proxying tcp connections back to the service container with socat and unix sockets. Still needs some work though.

I'd also like to see the --cap-add on docker services. I've written a workaround, similar to @seiferteric's if anyone would like to try it out: https://github.com/calh/docker_priv_proxy

I'm using signal traps, socat, and the docker socket to pair swarm mode service containers with a local privileged mode container. It seems to work well so far!

Just another use case, I want to run keepalived with vrrp on a swarm and it needs net=host and --cap-add=NET_ADMIN, so cap-add would be great.

+1

I want to be able to run a dockerised web service on an RPi that interfaces with a microcontroller over USB and provides web-based access to it over a network. I'm blocked by this and dotnet/dotnet-docker#223.

Folks, please avoid filling up this issue with +1 😭
You could click the 👍 button instead.


Here's a specific (admittedly edge) use case: I distribute DinD containers around a cluster of host nodes (the DinD containers allow us to run mini-isolated test swarms). If I were able to use privileged services on an outer host swarm, I could take advantage of swarm for the automatic distribution of these workloads (and named access) without having to manage these details manually.

Any news on --device for services?

@Toshik #33440, it's not complete in terms of actually allocating the device, but it puts the right APIs in place.

Please don't +1 this issue, it's not helpful. If you have specific requirements to need this feature, please _describe_ the use case, which is more helpful.

At this point it is not likely that --privileged will be added as-is to docker service, as this option nullifies all security that containers provide (TL;DR: running a service/container with --privileged effectively gives root access to the host).

Instead, more fine-grained permissions are needed; while --cap-add / --cap-drop is an option, it doesn't solve everything; for example, security profiles (SecComp, SELinux, AppArmor) are also related. Setting the right combination of options is complex (and cumbersome), which is why often people "just set --privileged" to make things work (but doing so, running highly insecure - see above).
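As an illustration of that complexity, a granular variant of the NFS example from the top of this thread would look roughly like this (sketch with a hypothetical image name; CAP_SYS_ADMIN alone is usually not enough, because the default seccomp and AppArmor profiles also block mount(2)):

```shell
#!/bin/sh
# Sketch: granular alternative to --privileged for an in-container NFS
# mount (hypothetical image/server names). Each option grants one slice
# of what --privileged hands over wholesale:
cap_cmd="docker run \
  --cap-add SYS_ADMIN \
  --security-opt apparmor=unconfined \
  --security-opt seccomp=unconfined \
  my-nfs-client mount -t nfs nfs.example.com:/export /mnt"
echo "$cap_cmd"
```

Getting this combination right per workload is exactly the burden the entitlements proposal aims to remove.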

Security should be usable; therefore a proposal has been opened to work towards a design that allows people to give services the privileges they need in a more fine-grained way, without introducing the complexity of setting all options manually.

The proposal can be found here: Proposal: Entitlements in Moby, and the POC https://github.com/docker/libentitlement repository, also there's a Google document describing the proposal in more detail.

We welcome feedback on the proposal; either by commenting on that issue, or taking part in the Orchestration Security SIG meetups (see https://forums.mobyproject.org/c/sig/orchestration-security)

@2stacks
@blop
@dalefwillis
@danmanners
@dfresh613
@dl00
@Farfad
@galindro
@getvivekv
@gklbiti
@joaquin386
@Kent-H
@matansag
@MingShyanWei
@pinkal-vansia
@realcbb
@siavash9000
@soar
@spfannemueller
@TAGC
@vingrad
@voron3x
@wangsquirrel

I removed your +1 comments, as they don't add useful input to the discussion; if you don't have a particular use-case / additional information that can help getting the design right, use the :+1: emoji in the top comment to let others know you're interested in this (see @AkihiroSuda's comment https://github.com/moby/moby/issues/24862#issuecomment-296577597). If you commented just to get notified on updates; use the subscribe button on the right side of this page.

This would really be helpful for getting more detailed cluster info. In my use case, I need to get info on memory, Infiniband devices, GPUs, network health, PCI topology, and some other things. We need to periodically recheck this info. We use Swarm for our cluster-level scheduling, and being able to run something like this as a global service, so that all nodes could periodically report back info about themselves, would be a huge benefit.

ATM we are limited to what docker tells us about the system. If we want any more detailed info, we have to go on the box and run the service ourselves or get a second scheduler.

Use case :
Distributed embedded environment with multiple nodes (>100) running headless in the wild. Need to be able to remotely update applications / environments. The embedded systems operate external electrical devices (e.g. smart home / industrial automation). Example devices:

  • USB to serial (either the USB or the TTY device could be shared)
  • UARTs (TTY)
  • GPIO pins
  • I2C / SPI
  • On-board LEDs

Current plan is to run without docker service (i.e. 'manual swarm') until a workable solution exists.

Another use case:
Using a Container-as-a-service environment (so no ability to install anything on the host). We want to deploy an image (--mode=global) that can report back the stats we care about on all other docker containers running on the same host, as well as on the host itself. (Similar to @wannabesrevenge's case )

I mainly need this for bootstrapping/provisioning nodes, gathering hardware relevant data, setting ips for the host...
Currently I have a service running which then runs a script which does docker run --privileged.
That's ridiculous.

I'm in desperate need of this feature to run a mode: global service in swarm using docker-compose v3 for monitoring on the host level...

My current workaround is to drop the docker-compose.yml and use compose v2 instead and execute it with privileged: true.

To deploy containers such as sysdig. Seccomp is promoted by Docker, yet not supported for swarm services, which is unreasonable.

IPSec VPN with strongSwan/Openswan is another use-case for privileged support.

I have an ASP.NET Core web service that I've dockerised. Part of the operation of this web service involves communicating with a microcontroller over USB. I can run it on my local Windows 7 machine using Docker Toolbox by passing the USB device through to Virtualbox (so the docker VM has access to it).

On Windows 10 deployment machines it would be preferable to use Docker For Windows, which runs Docker on top of Hyper-V. However, Hyper-V apparently doesn't support USB pass-through like Virtualbox does. One idea I had to work around this would be to split my webservice into two separate ones; one that handles all the USB communication logic and the other handling everything else. That would mean I could run the former (the "device comms" service) on a Raspberry Pi and the latter would be free to run anywhere, even in a non-privileged container. The idea would be to have a Docker swarm with these two services linked to each other.

However, how can I create a "device comms" service which requires privileged access to the filesystem if --privileged is not supported? How do I make any sort of Dockerised application that needs to perform USB communication on a Windows 10 machine?

This issue has been open for 14 months and it doesn't seem like any progress is being made.

I want to deploy GlusterFS in swarm, but glusterfs doesn't work without the --privileged flag ((( Please help!

@thaJeztah

The proposal looks great, but that looks like a complete change of the way privileges work, which will take a loooooong time.

Meanwhile docker allows you to run --privileged, or to use --cap-add for containers, so why not provide the same facility for services and let the users decide if they want to run containers securely or insecurely until that proposal is implemented. This step would be simple I think, it is just a matter of passing variables when creating the containers on the other hosts.

It surprises me that security is thrown around as the reason why this has not been implemented, yet there is no inherent way to restrict egress traffic other than an external device, or by manipulating iptables.

I don't really need you to be concerned with how I run my applications.

@purplesrl and @knick-burns sum it up.

At this point it is not likely that --privileged will be added as-is to docker service, as this option nullifies all security that containers provide

I don't care about the security that containers provide. So what if a Docker container has unrestricted privileges when the host it's running on is a £30 throw-away Raspberry Pi? Why not just permit people to use --privileged, and if they shoot themselves in the foot with it, that's on them?

What I actually do want is the service discovery, self-healing, self-containment and modularity you get from using Docker swarm.

I agree this would be a great help. Mounting NFS is impossible without hacking the stuff apart.

@edward-of-clt you can always use a volume to mount nfs.

@cpuguy83 When you can point me to documentation that works for docker service and docker stack, I'll believe you. I have spent the last two days looking for NFS mount documentation and nothing I have found works.

@edward-of-clt In addition to the built-in support for low-level NFS options, there are a number of volume drivers out there as well.

For the built-in support:

services:
    foo:
        mounts:
            - type: volume
              source: mynfs
              target: /data

volumes:
    mynfs:
        driver_opts:
            type: "nfs"
            o: addr=<nfs server addr>
            device: :/path/to/share

See, I tried that. It's not working. Nothing ends up being mounted at the mount point.

And this is what I always get:

mounts Additional property mounts is not allowed

Probably need to set the "version" at the top of the stack file to 3.1 or 3.2... Can't remember what version it came out in. 3.4 is what's included with 17.09.

That's not working either.

Happy to help on the community slack.

I'm on there, but can't see anything but the messages that have been sent to me.

@edward-of-clt

From what I have researched you can use this:

https://github.com/mavenugo/swarm-exec

To execute a command across the swarm with privileged flag. There is also a tutorial for what you need here:

https://www.vip-consult.solutions/post/persistent-storage-docker-swarm-nfs

We talked offline. The linked items are really not what you want for mounting NFS... but in the interest of keeping this issue from diverging if anyone would like to discuss this, feel free to ping me on slack.
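For later readers, a complete stack file along the lines @cpuguy83 sketched might look like this (untested sketch; note that the per-service key in the compose file format is volumes, not mounts, that the long volume syntax and driver_opts need file format version 3.2 or later, and that the server address and export path below are placeholders):

```yaml
version: "3.2"

services:
  foo:
    image: alpine
    command: sleep 3600
    volumes:
      - type: volume
        source: mynfs
        target: /data

volumes:
  mynfs:
    driver: local
    driver_opts:
      type: "nfs"
      o: "addr=10.0.0.10"         # placeholder: NFS server address
      device: ":/path/to/share"   # placeholder: export path
```

Deployed with docker stack deploy, the local driver then performs the NFS mount on whichever node runs the task.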

Our use case is that we need to assign fixed IPs to services over macvlan. We could re-assign the IPs from inside the container if we had --privileged or a proper --cap-add. Nicer would be an --ip option passed to the service definition, similar to the way it is done with docker run.

The problem with macvlan is that it is unusable: when starting two services on two different worker nodes, we get the same IP for both.

@efocht actually, you need to split the subnet of the swarmed macvlan for each worker; it seems that IP allocation for macvlan networks is not distributed at the moment.

I do something like this (splitting a /16 into a sub /24 per worker):

# on worker 1
docker network create --config-only --subnet 10.140.0.0/16 --gateway 10.140.0.1 -o parent=ens160 --ip-range 10.140.1.0/24 local-network-name
# on worker 2
docker network create --config-only --subnet 10.140.0.0/16 --gateway 10.140.0.1 -o parent=ens160 --ip-range 10.140.2.0/24 local-network-name
# on worker 3
docker network create --config-only --subnet 10.140.0.0/16 --gateway 10.140.0.1 -o parent=ens160 --ip-range 10.140.3.0/24 local-network-name

# on manager
docker network create -d macvlan --scope swarm --attachable --config-from local-network-name swarm-network-name

The setup above won't allow failing over a service with a particular IP. For example a user-space NFS server. I need service1 to have ip1, service2 to have ip2, and the IPs to be kept when the services are failed over.

@efocht could you decouple the IP addressing from the container networking in your environment? In our environment we run keepalived on the hosts to provide floating IP addresses and then tcp-proxy back into the container networks. This won't work if you need to dynamically assign IP addresses, but you can proxy to different containers based on port instead of address.

@matthanley thank you for the hint, we will consider this solution. I'm a bit reluctant to add another high availability layer, though. Swarm's capabilities would actually be sufficient.

I don't want to hijack the thread to solve my issue, just wanted to report that we also have a use case for the addition of --cap-add option to services and hope that the July 20 proposal from @thaJeztah will be implemented and pushed soon.

I don't think we'll be adding --privileged or --cap-add. See #32801 for the current way we want to solve this both for swarm and k8s. There was a talk on this at DockerCon EU... struggling to find the video at the moment.

Actually I guess the talk was at Moby Summit. These videos have not been posted yet.

I need the --privileged option for running Jenkins in a swarm. Jenkins in my configuration needs to run Docker within Docker, because all builds are started in a fresh docker container.

See https://github.com/mwaeckerlin/jenkins

Advantages:

  • builds are highly reproducible
  • build packages for all kind of Linux distributions and version, by simply running a container of that version
  • no need to install build tools and library dependencies in the main jenkins container / service; instead install them in the build container (or store images that already contain the dependencies)

So for me, running Docker in Docker is an absolute must requirement to be able to migrate Jenkins from the current single-host local docker to my new docker swarm environment.

AFAIK, running Docker in Docker requires --privileged, so for me, --privileged is also an absolute must requirement for docker swarm, unless you can show me another solution for running docker containers in docker services.

@mwaeckerlin I don't have the details of what you're trying to do, but I'm thinking that you may not need docker in docker, but could start sibling containers from docker by mounting the docker socket as a volume.

Have a read at this: https://jpetazzo.github.io/2015/09/03/do-not-use-docker-in-docker-for-ci/

@jonasddev, well I don't really mind the «bad», «ugly» and «worse», because I have a working setup. But I'll give the solution mentioned there a try and just bind-mount /var/run/docker.sock into the jenkins image.

@mwaeckerlin Ditto, I don't care about it either, but I was in your situation even before Swarm and that solution solved all my perceived problems. Just let me know, for my info, if it was successful.

Thanks.

I have the same use case as @man4j. I want to use Gluster within swarm and I need privileged=true; is there any workaround to make it happen? I get a "setting extended attributes" error.

Hi all. I have a workaround. I create a swarm service from my image with next entrypoint:

#!/bin/bash

exec docker run --init --rm --privileged=true --cap-add=ALL ...etc

That's all. When service tasks are created they run my privileged container, and when tasks are destroyed, SIGTERM kills the child privileged container thanks to the --init flag.

And of course we need to map docker.sock and install docker inside image for parent service.

Dockerfile:

FROM docker
RUN apk --no-cache add bash
COPY entrypoint.sh /
RUN chmod +x /entrypoint.sh
ENTRYPOINT ["/entrypoint.sh"]

entrypoint.sh:

#!/bin/bash

exec docker run --init --rm --privileged=true --cap-add=ALL ...etc

service-run.sh:

#!/bin/bash

docker service create --name myservice --detach=false \
--mount "type=bind,source=/var/run/docker.sock,target=/var/run/docker.sock" myimage

Waiting for your thoughts!

@man4j this is the same workaround @port22 mentioned in https://github.com/moby/moby/issues/24862#issuecomment-326124626

Just mounting /var/run/docker.sock as a volume does not work, @jonasddev, because the group id of group docker is different inside the container than outside; so if the user inside the container is not root, but only a member of group docker, as it should be, then it has no access to the socket!

Inside the container:

root@docker[69686cddbd76]:/var/lib/jenkins# grep docker /etc/group
docker:x:111:jenkins
root@docker[69686cddbd76]:/var/lib/jenkins# ls -l /var/run/docker.sock
srw-rw---- 1 root 120 0 Jan  6 01:08 /var/run/docker.sock

Outside of the container:

marc@raum:~$ grep docker /etc/group
docker:x:120:marc
marc@raum:~$ ls -l /var/run/docker.sock
srw-rw---- 1 root docker 0 Jan  6 02:08 /var/run/docker.sock

Any good solution for this?

I'm facing this issue as well when trying to deploy GitLab in Docker Swarm

The package needs to set values but it can't, and the service just keeps restarting:

/opt/gitlab/embedded/bin/runsvdir-start: line 24: ulimit: pending signals: cannot modify limit: Operation not permitted
/opt/gitlab/embedded/bin/runsvdir-start: line 37: /proc/sys/fs/file-max: Read-only file system

@jonasddev, I updated my jenkins image to fix the permissions within the container; I added this to my entrypoint script:

# add user to group that has access to /var/run/docker.sock
addgroup --gid $(stat -c '%g' /var/run/docker.sock) extdock || true
usermod -a -G $(stat -c '%g' /var/run/docker.sock) jenkins || true
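Generalized slightly, that fix can be sketched as a small entrypoint fragment (assumes Debian-style addgroup/usermod in the image; the extdock/jenkins names come from the snippet above, and SOCK is made overridable purely for illustration):

```shell
#!/bin/sh
# Sketch: align the in-container group with the host gid that owns the
# bind-mounted docker socket ('extdock'/'jenkins' as in the snippet
# above; both group commands need root, hence the '|| true' guards).
SOCK="${SOCK:-/var/run/docker.sock}"

sock_gid() {
    # numeric group id of a path (GNU coreutils stat)
    stat -c '%g' "$1"
}

if [ -e "$SOCK" ]; then
    gid="$(sock_gid "$SOCK")"
    addgroup --gid "$gid" extdock 2>/dev/null || true
    usermod -a -G "$gid" jenkins 2>/dev/null || true
fi
```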

Now the next problem:

  • docker in docker, access through shared /var/run/docker.sock
  • mounting volumes does not work

This is the problem in a mwaeckerlin/jenkins container:

ubuntu$ docker run -d --restart unless-stopped -v /var/run/docker.sock:/var/run/docker.sock --name jenkins -p 8080:8080/tcp -p 50000:50000/tcp --volumes-from jenkins-volumes mwaeckerlin/jenkins
ubuntu$ docker exec -it -u jenkins jenkins bash

jenkins$ docker create -v /var/lib/jenkins/workspace/mrw-c++.rpm/distro/fedora-27:/workdir -v /var/lib/jenkins/.gnupg:/var/lib/jenkins/.gnupg -e LANG=en_US.UTF-8 -e HOME=/var/lib/jenkins -e TERM=xterm -e DEBIAN_FRONTEND=noninteractive -e DEBCONF_NONINTERACTIVE_SEEN=true -e BUILD_NUMBER=64 -w /workdir fedora:27 sleep infinity
cd93d11c2634b4e4094cc2541996e30a4c08c60923b8896e0bd7e439c7d9c673
jenkins$ docker start cd93d11c2634b4e4094cc2541996e30a4c08c60923b8896e0bd7e439c7d9c673
cd93d11c2634b4e4094cc2541996e30a4c08c60923b8896e0bd7e439c7d9c673
jenkins$ ls /var/lib/jenkins/workspace/mrw-c++.rpm/distro/fedora-27
AUTHORS                      COPYING                   mrw-c++.spec.in
autogen.sh                   debian                    NEWS
ax_check_qt.m4               demangle.h                README
ax_cxx_compile_stdcxx_11.m4  dependency-graph.sh       resolve-debbuilddeps.sh
ax_init_standard_project.m4  doc                       resolve-rpmbuilddeps.sh
bootstrap.sh                 examples                  rpmsign.exp
build-in-docker.conf         INSTALL                   sql-to-dot.sed
build-in-docker.sh           mac-create-app-bundle.sh  src
build-resource-file.sh       makefile.am               suppressions.valgrind
ChangeLog                    makefile_test.inc.am      template.sh
checkinstall.sh              mrw-c++.desktop.in        test
configure.ac                 mrw-c++-minimal.spec.in   valcheck.sh
jenkins$ docker exec -u 107 -it 033088f9008601d5f9f9034744b579910ee1f88bdf3a61d2f0354b8454ba94ed bash

docker$ ls /workdir
docker$ mount | grep /var/lib/jenkins/workspace/mrw-c++.rpm/distro/fedora-27
/dev/mapper/big-root_crypt on /workdir type btrfs (rw,relatime,space_cache,subvolid=257,subvol=/@/var/lib/jenkins/workspace/mrw-c++.rpm/distro/fedora-27)
docker$

So in a jenkins container (here named jenkins), a container (here named docker) is started. A directory from the jenkins container is mounted into the container-in-the-container, but there, it is empty. The mount is visible in the mount command.

Any idea?

BTW: #21109

The problem is described here in a comment: http://container-solutions.com/running-docker-in-jenkins-in-docker/

When using docker run inside the jenkins container with volumes, you are actually sharing a folder of the host, not a folder within the jenkins container. To make that folder "visible" to jenkins (otherwise it is out of your control), that location should have a parent location that matches the volume that was used to run the jenkins image itself.

Of course, I want to mount from the docker container into the container-in-the-container, and not from the host.

It is even a huge security hole: the docker container must not have access to the filesystem of the host where it is running!

Any simple solution for this?

Security Problem

If using -v /var/run/docker.sock:/var/run/docker.sock for docker in docker, you get full access to the host and you are no longer jailed within the container!

Here a demonstration of the security issue mounting the socket:

marc@jupiter:~/docker/dockindock$ echo 'this is the host' > /tmp/test
marc@jupiter:~/docker/dockindock$ docker run -d --name dockindock -v /var/run/docker.sock:/var/run/docker.sock -v $(which docker):/usr/bin/docker mwaeckerlin/dockindock sleep infinity
d44fbd58e44a180e388621d39aff64652bc4118973c2cbc86a1738ffb481ebaf
marc@jupiter:~/docker/dockindock$ docker exec -it dockindock bash
root@docker[d44fbd58e44a]:/# cat /tmp/test
cat: /tmp/test: No such file or directory
root@docker[d44fbd58e44a]:/# echo 'this is the outer container' > /tmp/test
root@docker[d44fbd58e44a]:/# docker ps
CONTAINER ID        IMAGE                    COMMAND             CREATED             STATUS              PORTS               NAMES
d44fbd58e44a        mwaeckerlin/dockindock   "sleep infinity"    16 minutes ago      Up 16 minutes                           dockindock
root@docker[d44fbd58e44a]:/# docker run -it --rm -v /tmp:/tmp mwaeckerlin/ubuntu-base bash
root@docker[f197131c54c7]:/# cat /tmp/test
this is the host

So, the solution of using -v /var/run/docker.sock:/var/run/docker.sock is a security issue and an absolute no-go! @jonasddev

So either we get --privileged for swarm, or there is need for any other, better solution, that solves this issue!

Wanted Behaviour

Here a demonstration of how it should work, see the restricted access:

marc@jupiter:~/docker/dockindock$ echo 'this is the host' > /tmp/test
marc@jupiter:~/docker/dockindock$ docker run -d --name dockindock --privileged mwaeckerlin/dockindock
655c67b2a6d9f06da8bf630889710ee596f006331a036dc009c86a9e04ea0201
marc@jupiter:~/docker/dockindock$ docker exec -it dockindock bash
root@docker[655c67b2a6d9]:/# cat /tmp/test
cat: /tmp/test: No such file or directory
root@docker[655c67b2a6d9]:/# echo 'this is the outer container' > /tmp/test
root@docker[655c67b2a6d9]:/# docker ps
CONTAINER ID        IMAGE               COMMAND             CREATED             STATUS              PORTS               NAMES
root@docker[655c67b2a6d9]:/# docker run -it --rm -v /tmp:/tmp mwaeckerlin/ubuntu-base bash
Unable to find image 'mwaeckerlin/ubuntu-base:latest' locally
latest: Pulling from mwaeckerlin/ubuntu-base
1be7f2b886e8: Pull complete 
6fbc4a21b806: Pull complete 
c71a6f8e1378: Pull complete 
4be3072e5a37: Pull complete 
06c6d2f59700: Pull complete 
04fca7013ee9: Pull complete 
7a66494bf7fe: Pull complete 
be1530d02718: Pull complete 
57cb4fb92cd1: Pull complete 
4170a785b84a: Pull complete 
36570a7926c8: Pull complete 
34218f1ce9d6: Pull complete 
Digest: sha256:e9207a59d15739dec5d1b55412f15c0661383fad23f8d2914b7b688d193c0871
Status: Downloaded newer image for mwaeckerlin/ubuntu-base:latest
root@docker[bcfc1e9bc756]:/# cat /tmp/test
this is the outer container

As you see, there is no access to the images from outside of the outer container, docker ps does not show the host's containers, and directories cannot be mounted from outside of the container; the view is limited to the outer container. This is real encapsulation; that's how it must be.

Conclusion

There is no solution for docker in docker in a docker swarm, unless option --privileged is supported in docker swarm!

_The option --privileged is a must_ to have docker containers in docker containers, encapsulated without access to the whole swarm!

Please add, needed for fuse!

So, the solution of using -v /var/run/docker.sock:/var/run/docker.sock is a security issue and an absolute no-go!

Correct; if you're bind-mounting the socket, you're not running docker-in-docker; you're controlling the host's daemon from inside the container; in many cases this may actually be preferable, see http://jpetazzo.github.io/2015/09/03/do-not-use-docker-in-docker-for-ci/

The option --privileged is a must to have docker containers in docker containers, encapsulated without access to the whole swarm!

First of all, only _manager_ nodes have access to the whole swarm; worker nodes can only control the worker node itself

But be aware that --privileged is equivalent to having full root access on the host; there is _no_ protection whatsoever, processes inside the container can escape the container, and have full access to the host (for example, have a look at /dev inside a privileged container, and you see it has access to all devices from the host)

@thaJeztah writes:

But be aware that --privileged is equivalent to having full root access on the host; there is no protection whatsoever, processes inside the container can escape the container, and have full access to the host (for example, have a look at /dev inside a privileged container, and you see it has access to all devices from the host)

Then let's adapt the requirement. What is a good solution to have:

  • Real independent Docker in Docker
  • Full security, no access to the host, no escape
  • Can run in a swarm

For me it is not the --privileged flag itself that is important, but real and secure docker-in-docker.

Possible use cases:

  1. Jenkins with untrusted users who can set up their own build jobs
  2. Selling a Docker playground to an unknown audience, letting them start Docker services from within Docker

Currently I am running a Jenkins server in a local Docker container using --privileged, and I would like to migrate this to a swarm. Jenkins instantiates a dedicated Docker container for every build, e.g. to cross-build for Windows or for a specific Linux distribution; to build deb packages for Ubuntu Xenial, for instance, it runs the build in a container from mwaeckerlin/ubuntu:xenial-amd64.

@mwaeckerlin

We need to get nested runc to work without root privileges first.
You may want to follow this issue: https://github.com/opencontainers/runc/issues/1658

I'd like to add that Kubernetes does support adding capabilities. For an example, check out https://caveofcode.com/2017/06/how-to-setup-a-vpn-connection-from-inside-a-pod-in-kubernetes/ (this is the exact same use case I have, and that I'd like to run on docker swarm).

Any news ?

the news is all k8!

+1

Gave up on this and switched to Kubernetes.

On Jun 15, 2018, at 19:02, Denis notifications@github.com wrote:

+1


Just first off, thank you @thaJeztah and co-devs for being attentive on this. I'm sure this is a point of contention among yourselves.

Really just came here to say that I've had success bind-mounting the Docker socket and using the Docker binaries inside a Jenkins container in a swarm (not bind-mounting the binaries). In that specific case, I've constrained Jenkins to run only on manager nodes. Make sure the permissions are set correctly, especially when working with GlusterFS; all of my Jenkins nodes run on top of GlusterFS volumes with no issues so far (about 6 months).

The --privileged flag was something I used when prototyping, but when it wasn't available I found that simply setting the correct permissions did the trick for my purposes; that is something everyone should be doing when bind-mounting anyway. My case differs from many others, so your mileage may vary.

Damn, forgot to come back and congratulate this issue on its second birthday.

zabbix-agent needs privileged mode to monitor host resources.

Privileged mode is also needed to run systemd-enabled containers, like dogtag-ca for instance.

Trying to deploy the Dell OpenManage exporter and it doesn't work. I understand that --privileged / --cap-add can be a security issue, but if I want to shoot myself in the foot, I should be able to do so.

Hey guys, I've been maintaining my own poorly-patched fork of docker for a while with support for --cap-add and --privileged in docker stack deploy, maybe we should create a proper fork?

@manvalls I think --cap-add (or --capabilities, i.e., not merging defaults, but require the full set to be specified) is something that would be accepted; have you considered contributing, and opening a pull request to discuss that option?

Hi @thaJeztah, this has already been done:
https://github.com/moby/moby/pull/26849
https://github.com/docker/swarmkit/pull/1565

Like many people here I had been closely monitoring these issues for a long time, until it became clear that there were only two options: switching to Kubernetes or forking Docker.

I'm pretty sure I'm not the only one who opted to gather all those PR together and maintain their own docker fork, and I'm really glad I did so because I, like many others, love what you guys did with docker swarm.

If adding these features to upstream Docker in a convenient way is against its principles, which really seems to be the case, it feels only natural for all those forks, which lots of people are likely using already, to unite. Maybe you guys could even help maintain it, or even own it?

@manvalls the PR you linked to was closed because that implementation did --cap-add / --cap-drop. The proposal was to have a --capabilities flag (which would override the defaults)
https://github.com/moby/moby/pull/26849#issuecomment-249176129 :

I think it is better long term if we switch to a --capabilities interface which says which you actually need, even if you might get more by default for compatibility.

If someone wants to work on that, that's a change that will likely get accepted.
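For illustration, the `--capabilities`-style interface discussed above might look like this. Note that this was a proposal at the time, not a shipped feature, so the flag name, the capability set, and the service/image names here are all hypothetical:

```shell
# Hypothetical: specify the FULL capability set for the service,
# replacing Docker's defaults instead of adding/dropping relative to them.
docker service create \
  --name nfs-client \
  --capabilities CAP_SYS_ADMIN,CAP_DAC_READ_SEARCH \
  myorg/nfs-app
```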

Also need device, privileged, cap_add / cap_drop. No progress here?

@pwFoo A workaround is to bind-mount /var/run/docker.sock and call docker run within service containers.
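A sketch of that workaround (service name and image are placeholders): the socket is bind-mounted into the service's containers, which then drive the host daemon directly, with all the security caveats discussed earlier in this thread.

```shell
# Containers of this service talk to the HOST daemon through the socket,
# so a `docker run --privileged ...` issued inside them runs on the host.
docker service create \
  --name docker-client \
  --constraint node.role==manager \
  --mount type=bind,source=/var/run/docker.sock,target=/var/run/docker.sock \
  docker:stable tail -f /dev/null
```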

Maintainers: although I agree we should support "entitlements" in the long term, since we already support bind-mounting arbitrary paths, including docker.sock, I don't see any security degradation in implementing docker service create --privileged.

@pwFoo there is the #26849 PR, which you can take as a template; just implement the requested changes to get it merged (see the comments above). That's the beauty of open source: you can do it yourself šŸ˜‰

It is still possible to get this feature into 19.03 if someone takes the step and starts implementing it.

EDIT: Looks that #26849 have been reopened today :)

I'm new to Go, so I don't think I should do it...
But maybe someone with more experience could do it.

What's the state of this issue? (Docker version 17.12.1-ce)

needs somebody to take over https://github.com/moby/moby/pull/26849

FYI. I started implementing capabilities feature. You can see plan and status on https://github.com/moby/moby/issues/25885#issuecomment-447657852

Pleeeeaaaaase add this feature in :(

Pleeeeaaaaase don't send useless messages here. It just slows down the process: a lot of people get notified, and that time is taken away from the actual implementation.

My plan is to get #38380 released as part of 19.03, and then the actual Swarm-side changes in the version that comes after that (19.06, I guess). Anyone who wants to help, please test it and leave comments on the PR, so that we can hopefully stay on that schedule.

This issue has been open since 2016; is there a chance this gets added to Docker in the next few years...? This is a must-have feature for running Docker-in-Docker on a swarm. Please, Docker team.

Sorry but this is very frustrating for me. I can't switch to a HA setup because of this :(

#39173 was merged, so this feature will ship as part of Docker 19.06 / 19.09 (not sure of the actual version yet).

Please look at my question about the CLI-side implementation at https://github.com/moby/moby/issues/25885#issuecomment-501017588
and also comment here with use cases that would actually need the --privileged switch, or can they all be handled by defining all needed capabilities? (Nothing prevents a user from listing all of them.)

Hi,
I too am looking for the possibility to attach a device such as /dev/ttyACM0 to a stack-deployed service.
I created a script, run from crontab every 5 minutes on the host of every node.
If the script sees my special device attached, it raises a node flag.
This flag is then used as a constraint for which nodes the service is allowed on.

If, let's say, node1 fails, I can move the USB dongle to node2 and the service is started there.

If you are interested in the cron script then look here:
https://github.com/SySfRaMe/Docker_ZwaveHW_Flag
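A minimal sketch of that approach (the label name `zwave` and the device path are assumptions; the script only prints the docker command, so it can be dry-run safely before wiring it into cron):

```shell
#!/bin/sh
# Decide which node-label command the cron job should run, based on
# whether the device is present on this host. Echoes the command;
# pipe it to sh (or drop the echo) once verified on your nodes.
decide_label_cmd() {
    device="$1"
    node="$2"
    if [ -e "$device" ]; then
        echo "docker node update --label-add zwave=true $node"
    else
        echo "docker node update --label-rm zwave $node"
    fi
}

decide_label_cmd "${DEVICE:-/dev/ttyACM0}" "${NODE:-$(hostname)}"
```

The service is then pinned with `--constraint node.labels.zwave==true`, so it follows the dongle when it is moved to another node.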

