Podman: Unable to bind mount inside container (mount: mounting <dir> on <dir2> failed: Permission denied)

Created on 12 May 2019 · 37 Comments · Source: containers/podman

/kind bug

Description

I require the ability to run mount --bind inside a container.

Steps to reproduce the issue:

sudo podman run -it --privileged docker.io/library/alpine:latest
/ # cd
~ # mkdir tmp
~ # mkdir tmp2
~ # mount --bind tmp tmp2/
mount: mounting tmp on tmp2/ failed: Permission denied

The same steps succeed with Docker on the same host:

docker run -it --privileged library/alpine:latest
/ # cd
~ # mkdir tmp
~ # mkdir tmp2
~ # mount --bind tmp tmp2/
~ #

Additional information you deem important (e.g. issue happens only occasionally):

$ docker info
Containers: 8
 Running: 2
 Paused: 0
 Stopped: 6
Images: 3505
Server Version: 18.09.5
Storage Driver: btrfs
 Build Version: Btrfs v4.19
 Library Version: 102
Logging Driver: json-file
Cgroup Driver: cgroupfs
Plugins:
 Volume: local
 Network: bridge host macvlan null overlay
 Log: awslogs fluentd gcplogs gelf journald json-file local logentries splunk syslog
Swarm: inactive
Runtimes: runc
Default Runtime: runc
Init Binary: docker-init
containerd version: bb71b10fd8f58240ca47fbb579b9d1028eea7c84
runc version: 2b18fe1d885ee5083ef9f0838fee39b62d653e30
init version: fec3683b971d9c3ef73f284f176672c44b448662
Security Options:
 apparmor
 seccomp
  Profile: default
Kernel Version: 5.0.7-gentoo-nulllabs-xeon-apparmor
Operating System: Gentoo/Linux
OSType: linux
Architecture: x86_64
CPUs: 31
Total Memory: 125.9GiB
Name: crucible
ID: LM6K:6A6Y:NATB:6CUG:5YJA:T5YB:JOF5:GGZZ:QNP7:MR5F:QZ5C:55AO
Docker Root Dir: /var/lib/docker
Debug Mode (client): false
Debug Mode (server): false
Registry: https://index.docker.io/v1/
Labels:
Experimental: false
Insecure Registries:
 127.0.0.0/8
Live Restore Enabled: false

Output of podman version:

Version:            1.2.0
RemoteAPI Version:  1
Go Version:         go1.12.1
Git Commit:         3bd528e583182b4249f3e6bbd8497a8831d89950
Built:              Fri Apr 12 00:08:57 2019
OS/Arch:            linux/amd64

Output of podman info --debug:

debug:
  compiler: gc
  git commit: 3bd528e583182b4249f3e6bbd8497a8831d89950
  go version: go1.12.1
  podman version: 1.2.0
host:
  BuildahVersion: 1.7.2
  Conmon:
    package: Unknown
    path: /usr/libexec/crio/conmon
    version: 'conmon version 1.13.7, commit: 42585737f5eb59273e791e47ab1643e10862d67f'
  Distribution:
    distribution: gentoo
    version: unknown
  MemFree: 24831664128
  MemTotal: 135142313984
  OCIRuntime:
    package: Unknown
    path: /usr/bin/runc
    version: |-
      runc version 1.0.0-rc6+dev
      commit: 2b18fe1d885ee5083ef9f0838fee39b62d653e30
      spec: 1.0.1-dev
  SwapFree: 0
  SwapTotal: 0
  arch: amd64
  cpus: 31
  hostname: crucible
  kernel: 5.0.7-gentoo-nulllabs-xeon-apparmor
  os: linux
  rootless: false
  uptime: 475h 40m 15.31s (Approximately 19.79 days)
insecure registries:
  registries:
  - crucible.lab:4000
registries:
  registries:
  - crucible.lab:4000
  - docker.io
  - registry.fedoraproject.org
  - registry.access.redhat.com
store:
  ConfigFile: /etc/containers/storage.conf
  ContainerStore:
    number: 22
  GraphDriverName: btrfs
  GraphOptions: null
  GraphRoot: /var/lib/containers/storage
  GraphStatus:
    Build Version: 'Btrfs v4.19 '
    Library Version: "102"
  ImageStore:
    number: 15
  RunRoot: /var/run/containers/storage
  VolumePath: /var/lib/containers/storage/volumes

Additional environment details (AWS, VirtualBox, physical, etc.):

On-prem, self-hosted

All 37 comments

It seems to work fine on Fedora 30. Is there anything specific to Gentoo that could block it?

Does dmesg show anything?

Why would this work on the same box with docker but not with libpod, both using the same runtime?

There are many reasons why this could fail: missing capabilities, seccomp, SELinux/AppArmor. Could you show what /proc/CONTAINER_PID/status and /proc/CONTAINER_PID/mountinfo look like for a container created by Docker and one created by Podman?
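One way to collect the capability lines requested above is sketched here. The helper name cap_lines and the container name "test" are illustrative assumptions, not taken from the thread. The PID to use is that of the container's init process rather than of the podman or docker client.

```shell
# Print only the capability lines (CapInh, CapPrm, CapEff, CapBnd, CapAmb)
# from a /proc/<pid>/status file passed as the first argument.
cap_lines() {
    grep '^Cap' "$1"
}

# Hypothetical usage on a host with a running container named "test"
# (the name is an assumption made for illustration):
#   pid=$(sudo podman inspect --format '{{.State.Pid}}' test)
#   cap_lines "/proc/$pid/status"
#   cat "/proc/$pid/mountinfo"
```

The same inspect format string works with docker inspect, so both runtimes can be compared side by side.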

Podman status:

Name:   podman
Umask:  0022
State:  S (sleeping)
Tgid:   23709
Ngid:   0
Pid:    23709
PPid:   23708
TracerPid:      0
Uid:    0       0       0       0
Gid:    0       0       0       0
FDSize: 64
Groups: 0 1 2 3 4 6 10 11 26 27 1158 2000
NStgid: 23709
NSpid:  23709
NSpgid: 23708
NSsid:  21930
VmPeak:  2738064 kB
VmSize:  2672528 kB
VmLck:         0 kB
VmPin:         0 kB
VmHWM:     50116 kB
VmRSS:     50116 kB
RssAnon:           26004 kB
RssFile:           24048 kB
RssShmem:             64 kB
VmData:   390104 kB
VmStk:       132 kB
VmExe:     36916 kB
VmLib:      2856 kB
VmPTE:       488 kB
VmSwap:        0 kB
HugetlbPages:          0 kB
CoreDumping:    0
THP_enabled:    1
Threads:        35
SigQ:   0/514865
SigPnd: 0000000000000000
ShdPnd: 0000000000000000
SigBlk: 0000000000000000
SigIgn: 0000000000000000
SigCgt: fffffffffffbfeff
CapInh: 0000000000000000
CapPrm: 0000003fffffffff
CapEff: 0000003fffffffff
CapBnd: 0000003fffffffff
CapAmb: 0000000000000000
NoNewPrivs:     0
Seccomp:        0
Speculation_Store_Bypass:       thread vulnerable
Cpus_allowed:   ffffffff
Cpus_allowed_list:      0-31
Mems_allowed:   00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000001
Mems_allowed_list:      0
voluntary_ctxt_switches:        136
nonvoluntary_ctxt_switches:     1

Docker status:

Name:   docker
Umask:  0022
State:  S (sleeping)
Tgid:   26430
Ngid:   0
Pid:    26430
PPid:   11417
TracerPid:      0
Uid:    1157    1157    1157    1157
Gid:    1157    1157    1157    1157
FDSize: 64
Groups: 16 81 230 234 235 244 249 250 1001 1157 1158 2000 2001
NStgid: 26430
NSpid:  26430
NSpgid: 26430
NSsid:  11417
VmPeak:  1871732 kB
VmSize:  1806196 kB
VmLck:         0 kB
VmPin:         0 kB
VmHWM:     69132 kB
VmRSS:     69132 kB
RssAnon:           36784 kB
RssFile:           32348 kB
RssShmem:              0 kB
VmData:   287704 kB
VmStk:       132 kB
VmExe:     42976 kB
VmLib:      2148 kB
VmPTE:       428 kB
VmSwap:        0 kB
HugetlbPages:          0 kB
CoreDumping:    0
THP_enabled:    1
Threads:        23
SigQ:   0/514865
SigPnd: 0000000000000000
ShdPnd: 0000000000000000
SigBlk: 0000000000000000
SigIgn: 0000000000000000
SigCgt: ffffffffffc1feff
CapInh: 0000000000000000
CapPrm: 0000000000000000
CapEff: 0000000000000000
CapBnd: 0000003fffffffff
CapAmb: 0000000000000000
NoNewPrivs:     0
Seccomp:        0
Speculation_Store_Bypass:       thread vulnerable
Cpus_allowed:   ffffffff
Cpus_allowed_list:      0-31
Mems_allowed:   00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000001
Mems_allowed_list:      0
voluntary_ctxt_switches:        73
nonvoluntary_ctxt_switches:     0

Your test works fine for me on Podman. I cannot reproduce this failure.

# rpm -q podman
podman-1.2.0-2.git3bd528e.fc30.x86_64
# podman run -it --privileged docker.io/library/alpine:latest
/ # cd
~ # mkdir tmp
~ # mkdir tmp2
~ # mount --bind tmp tmp2/
~ # exit

Oops, I see that giuseppe already stated this.
Could this be an issue with runc?

Your docker status looks wrong, since it is showing the container with no capabilities.

Gentoo does not use SELinux, so this is most likely capabilities, seccomp, or something else that we don't know about.

Is anything mentioned in dmesg or in audit.log, if Gentoo supports audit.log?

I don't have SELinux enabled; however, I did build everything with AppArmor support:

aa-status
apparmor module is loaded.
1 profiles are loaded.
1 profiles are in enforce mode.
   libpod-default-1.2.0
0 profiles are in complain mode.
11 processes have profiles defined.
11 processes are in enforce mode.
   /bin/bash (12752) libpod-default-1.2.0
   /usr/sbin/pdns_server (12766) libpod-default-1.2.0
   /usr/sbin/pdns_recursor (12785) libpod-default-1.2.0
   /bin/bash (23575) libpod-default-1.2.0
   /usr/lib64/php7.1/bin/php-fpm (23630) libpod-default-1.2.0
   /usr/sbin/nginx (23638) libpod-default-1.2.0
   /usr/sbin/nginx (23639) libpod-default-1.2.0
   /usr/lib64/php7.1/bin/php-fpm (23647) libpod-default-1.2.0
   /usr/lib64/php7.1/bin/php-fpm (23648) libpod-default-1.2.0
   /bin/busybox (23801) libpod-default-1.2.0
   /root/.go/bin/registry (26564) libpod-default-1.2.0
0 processes are in complain mode.
0 processes are unconfined but have a profile defined.

I can't seem to find the policy location for libpod-default-1.2.0 however:

/etc/apparmor.d # grep -r libpod ./
/etc/apparmor.d # 

Nothing pertinent in audit.log or syslog-ng output for /var/log/messages.

Podman:

CapPrm: 0000003fffffffff
CapEff: 0000003fffffffff
CapBnd: 0000003fffffffff

Docker:

CapPrm: 0000000000000000
CapEff: 0000000000000000
CapBnd: 0000003fffffffff

The mask includes the following capabilities:

[CHOWN DAC_OVERRIDE DAC_READ_SEARCH FOWNER FSETID KILL SETGID SETUID SETPCAP LINUX_IMMUTABLE NET_BIND_SERVICE NET_BROADCAST NET_ADMIN NET_RAW IPC_LOCK IPC_OWNER SYS_MODULE SYS_RAWIO SYS_CHROOT SYS_PTRACE SYS_PACCT SYS_ADMIN SYS_BOOT SYS_NICE SYS_RESOURCE SYS_TIME SYS_TTY_CONFIG MKNOD LEASE AUDIT_WRITE AUDIT_CONTROL SETFCAP MAC_OVERRIDE MAC_ADMIN SYSLOG WAKE_ALARM BLOCK_SUSPEND AUDIT_READ]
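The mapping from mask to names can be reproduced by hand: bit N of the hex mask corresponds to the N-th capability in the list above, counting from 0. A minimal POSIX shell sketch follows. The helper names decode_caps and has_cap are hypothetical, and the name table is simply the list above in the same order.

```shell
# Capability names in bit order (bit 0 = CHOWN ... bit 37 = AUDIT_READ),
# copied from the list in the thread.
CAP_NAMES="CHOWN DAC_OVERRIDE DAC_READ_SEARCH FOWNER FSETID KILL SETGID SETUID SETPCAP LINUX_IMMUTABLE NET_BIND_SERVICE NET_BROADCAST NET_ADMIN NET_RAW IPC_LOCK IPC_OWNER SYS_MODULE SYS_RAWIO SYS_CHROOT SYS_PTRACE SYS_PACCT SYS_ADMIN SYS_BOOT SYS_NICE SYS_RESOURCE SYS_TIME SYS_TTY_CONFIG MKNOD LEASE AUDIT_WRITE AUDIT_CONTROL SETFCAP MAC_OVERRIDE MAC_ADMIN SYSLOG WAKE_ALARM BLOCK_SUSPEND AUDIT_READ"

# decode_caps HEXMASK: print the names of all bits set in the mask,
# e.g. the value of CapEff from /proc/<pid>/status (without a 0x prefix).
decode_caps() {
    mask=$((0x$1))
    bit=0
    out=""
    for name in $CAP_NAMES; do
        if [ $(( (mask >> bit) & 1 )) -eq 1 ]; then
            out="$out${out:+ }$name"
        fi
        bit=$((bit + 1))
    done
    printf '%s\n' "$out"
}

# has_cap HEXMASK NAME: succeed if the named capability is set in the mask.
has_cap() {
    for cap in $(decode_caps "$1"); do
        [ "$cap" = "$2" ] && return 0
    done
    return 1
}
```

For example, has_cap 0000003fffffffff SYS_ADMIN succeeds, while has_cap 0000000000000000 SYS_ADMIN fails, which matches the Podman and Docker CapEff values quoted above.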

I can't seem to find the policy location for libpod-default-1.2.0 however

That profile is loaded directly from memory, so there is no file for it under /etc/apparmor.d. Can you check whether the process is mistakenly confined under this profile? The profile denies mounting, which would explain the permission error.

You could run sudo podman run --rm ubuntu ps auxZ and paste the output.

LABEL                           USER       PID %CPU %MEM    VSZ   RSS TTY      STAT START   TIME COMMAND
libpod-default-1.2.0 (enforce)  root         1  0.0  0.0  25940  1500 ?        Rs   12:53   0:00 ps auxZ
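The LABEL column in this output is either the word unconfined or a profile name followed by its mode in parentheses. A small sketch for classifying such a label is shown here; the function name apparmor_mode is hypothetical and only the two modes reported by aa-status in this thread are handled.

```shell
# apparmor_mode LABEL: classify an AppArmor label string such as
# "libpod-default-1.2.0 (enforce)" or "unconfined".
apparmor_mode() {
    case "$1" in
        unconfined)    echo "unconfined" ;;
        *"(enforce)")  echo "enforce" ;;
        *"(complain)") echo "complain" ;;
        *)             echo "unknown" ;;
    esac
}

# Hypothetical usage inside a container on an AppArmor-enabled kernel:
#   apparmor_mode "$(cat /proc/self/attr/current)"
```

A result of enforce for a container started with --privileged would indicate that the profile is still being applied.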

Did you use the --privileged flag?

Yes, ^^, but for easy reference: sudo podman run -it --privileged docker.io/library/alpine:latest

This looks like an AppArmor bug, as we shouldn't apply the profile with --privileged.

Aside from this patch/workaround, how would one alter the profile to enable mount?

Does --apparmor=unconfined fix your issue?

sudo podman run -it --privileged --apparmor=unconfined docker.io/library/alpine:latest
Error: unknown flag: --apparmor

Oops, the correct flag is --security-opt:
podman run -it --privileged --security-opt apparmor=unconfined docker.io/library/alpine:latest

sudo podman run -it --privileged --security-opt apparmor=unconfined docker.io/library/alpine:latest
WARNING: The same type, major and minor should not be used for multiple devices.
WARNING: The same type, major and minor should not be used for multiple devices.
WARNING: The same type, major and minor should not be used for multiple devices.
WARNING: The same type, major and minor should not be used for multiple devices.
WARNING: The same type, major and minor should not be used for multiple devices.
WARNING: The same type, major and minor should not be used for multiple devices.
WARNING: The same type, major and minor should not be used for multiple devices.
WARNING: The same type, major and minor should not be used for multiple devices.
WARNING: The same type, major and minor should not be used for multiple devices.
WARNING: The same type, major and minor should not be used for multiple devices.
WARNING: The same type, major and minor should not be used for multiple devices.
WARNING: The same type, major and minor should not be used for multiple devices.
/ # cd;mkdir tmp;mkdir tmp2;mount --bind tmp tmp2
mount: mounting tmp on tmp2 failed: Permission denied

However:

sudo podman run --rm --security-opt apparmor=unconfined ubuntu ps auxZ
LABEL                           USER       PID %CPU %MEM    VSZ   RSS TTY      STAT START   TIME COMMAND
unconfined                      root         1 30.0  0.0  25940  1464 ?        Rs   13:40   0:00 ps auxZ

The error message changes when seccomp is also set to unconfined via --security-opt:

sudo podman run --rm --security-opt apparmor=unconfined --security-opt seccomp=unconfined -it docker.io/library/alpine:latest
/ # cd;mkdir tmp;mkdir tmp2;mount --bind tmp tmp2
mount: permission denied (are you root?)
~ # id
uid=0(root) gid=0(root) groups=0(root),1(bin),2(daemon),3(sys),4(adm),6(disk),10(wheel),11(floppy),20(dialout),26(tape),27(video)
~ #

Also add --cap-add all.

That is what --privileged is supposed to do, along with a couple of other things.

Works!

sudo podman run --rm --security-opt apparmor=unconfined --cap-add all -it docker.io/library/alpine:latest
/ # cd
~ # mkdir mnt;mkdir mnt2
~ # mount --bind mnt mntu2
mount: mounting mnt on mntu2 failed: No such file or directory
~ # mount --bind mnt mnt2
~ #

And this does not?
sudo podman run --rm --security-opt apparmor=unconfined --privileged -it docker.io/library/alpine:latest

Correct:

sudo podman run --rm --security-opt apparmor=unconfined --privileged -it docker.io/library/alpine:latest
WARNING: The same type, major and minor should not be used for multiple devices.
WARNING: The same type, major and minor should not be used for multiple devices.
WARNING: The same type, major and minor should not be used for multiple devices.
WARNING: The same type, major and minor should not be used for multiple devices.
WARNING: The same type, major and minor should not be used for multiple devices.
WARNING: The same type, major and minor should not be used for multiple devices.
WARNING: The same type, major and minor should not be used for multiple devices.
WARNING: The same type, major and minor should not be used for multiple devices.
WARNING: The same type, major and minor should not be used for multiple devices.
WARNING: The same type, major and minor should not be used for multiple devices.
WARNING: The same type, major and minor should not be used for multiple devices.
WARNING: The same type, major and minor should not be used for multiple devices.
/ # cd
~ # mkdir mnt
~ # mkdir mnt2
~ # mount --bind mnt mnt2
mount: mounting mnt on mnt2 failed: Permission denied

It looks to me like there's a defect somewhere in the handling of the --privileged flag. Setting it seems to break what --cap-add all otherwise allows.

sudo podman run --rm --privileged docker.io/library/alpine:latest grep Cap /proc/self/status
sudo podman run --rm --cap-add all docker.io/library/alpine:latest grep Cap /proc/self/status

What is the difference?

On Fedora I see

$ sudo podman run --rm --privileged docker.io/library/alpine:latest grep Cap /proc/self/status
CapInh: 0000003fffffffff
CapPrm: 0000003fffffffff
CapEff: 0000003fffffffff
CapBnd: 0000003fffffffff
CapAmb: 0000003fffffffff
$  sudo podman run --rm --cap-add all docker.io/library/alpine:latest grep Cap /proc/self/status
CapInh: 0000003fffffffff
CapPrm: 0000003fffffffff
CapEff: 0000003fffffffff
CapBnd: 0000003fffffffff
CapAmb: 0000003fffffffff

The same here. The capabilities are equivalent; however, setting --privileged in any context appears to break bind mounts.

Perhaps looking at the output of mount between the two commands might show something, since --privileged also reverts all of the read-only mounts.
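A sketch of such a comparison is given here, assuming the mounts are read from /proc/self/mountinfo inside each container. The helper name mount_summary is hypothetical. It drops the mount IDs, which differ between runs, and keeps the mount point, the per-mount options (including ro/rw), and the filesystem type.

```shell
# mount_summary: read mountinfo lines on stdin and print
# "<mount point> <mount options> <fstype>" per mount, sorted.
# In mountinfo, field 5 is the mount point, field 6 the mount options,
# and the filesystem type is the field right after the "-" separator.
mount_summary() {
    awk '{
        fstype = "?"
        for (i = 7; i <= NF; i++) {
            if ($i == "-") { fstype = $(i + 1); break }
        }
        print $5, $6, fstype
    }' | sort
}

# Hypothetical usage (bash, for process substitution):
#   diff \
#     <(sudo podman run --rm --privileged docker.io/library/alpine:latest \
#         cat /proc/self/mountinfo | mount_summary) \
#     <(sudo podman run --rm --cap-add all docker.io/library/alpine:latest \
#         cat /proc/self/mountinfo | mount_summary)
```

Any line that appears on only one side of the diff is a mount whose presence or options differ between the two modes.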

BTW, does AppArmor even matter for this? I.e., is --cap-add=all enough to allow you to bind mount?

I would figure --cap-add SYS_ADMIN would be enough.

BTW, does AppArmor even matter for this? I.e., is --cap-add=all enough to allow you to bind mount?

Yes, because AppArmor denies the mount syscall.

I would figure --cap-add SYS_ADMIN would be enough.

I concur, SYS_ADMIN should be enough for mounting.

The weird thing is that he says --privileged with --security-opt apparmor=unconfined does not solve the issue, but --cap-add all with --security-opt apparmor=unconfined does.

Once I am back from the trip, I'll take a look and fire off some VMs with AppArmor on it.

Just tried the most recent version:

user@system /usr/portage/app-emulation/libpod $ podman --version
podman version 1.3.1
user@system /usr/portage/app-emulation/libpod $ sudo podman run -it --security-opt apparmor=unconfined --cap-add all --rm --network host --entrypoint /bin/bash --privileged --cap-add SYS_ADMIN registry/oci/build:latest

WARNING: The same type, major and minor should not be used for multiple devices.
WARNING: The same type, major and minor should not be used for multiple devices.
WARNING: The same type, major and minor should not be used for multiple devices.
WARNING: The same type, major and minor should not be used for multiple devices.
WARNING: The same type, major and minor should not be used for multiple devices.
WARNING: The same type, major and minor should not be used for multiple devices.
WARNING: The same type, major and minor should not be used for multiple devices.
WARNING: The same type, major and minor should not be used for multiple devices.
WARNING: The same type, major and minor should not be used for multiple devices.
WARNING: The same type, major and minor should not be used for multiple devices.
WARNING: The same type, major and minor should not be used for multiple devices.
WARNING: The same type, major and minor should not be used for multiple devices.
system / # cd
system ~ # mkdir tmp
system ~ # mkdir tmp2
system ~ # mount --bind tmp tmp2
mount: /root/tmp2: bind /root/tmp failed.