/kind bug
Description
When running a container with an explicitly set user, such as
https://github.com/schemaspy/schemaspy/blob/a28c9fc932cc6f85c7780050a678b3a3d7f595e9/Dockerfile#L44 the volume mounted by podman is not writable.
Steps to reproduce the issue:
Get some PostgreSQL host (192.168.4.1) and port (5432)
Create dir to mount as a volume
$ mkdir html
$ podman run -it -v "$PWD"/html:/output:Z schemaspy/schemaspy:snapshot -u postgres -t pgsql11 -host 192.168.4.1 -port 5432 -db anitya
...
INFO - Starting Main v6.1.0-SNAPSHOT on 9090a61652af with PID 1 (/schemaspy-6.1.0-SNAPSHOT.jar started by java in /)
INFO - The following profiles are active: default
INFO - Started Main in 2.594 seconds (JVM running for 3.598)
INFO - Starting schema analysis
ERROR - IOException
Unable to create directory /output/tables
INFO - StackTraces have been omitted, use `-debug` when executing SchemaSpy to see them
Describe the results you received:
The container is unable to write to /output, most likely because it is running as the java user.
Describe the results you expected:
Volumes are writable regardless of the user settings inside of the container.
Output of podman version:
$ podman version
Version: 1.5.1
RemoteAPI Version: 1
Go Version: go1.12.7
OS/Arch: linux/amd64
I assume the -u postgres in here means that your app in the container isn't running as root?
As such, it's running as a non-0 user in the container, which is mapped to a user on the host through /etc/subuid (root in a rootless container is the user that started the container, all higher UIDs and GIDs are mapped to a block on the host given by /etc/subuid and /etc/subgid). The volume looks like it's somewhere that's owned by the user starting the container - but you're running the app in the container as a different user, which means you run into permissions errors.
You may want to just remove -u postgres and run as root if you need to access volumes owned by your user. Running as root in a rootless container is already very secure (the container has no added privileges that your user does not), so the only security benefit to swapping to another user in the container is preventing the container from accessing files owned by your user - which, in this case, you need (to talk to the volume).
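A quick way to see the mapping described here (a sketch; the exact ranges depend on your /etc/subuid and /etc/subgid entries):
$ grep "$USER" /etc/subuid /etc/subgid
$ podman unshare cat /proc/self/uid_map
In uid_map, the first column is the UID inside the namespace, the second is the host UID it maps to, and the third is the length of the mapped range.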
@mheon -u is a parameter for schemaspy, the container itself defines USER java, and podman is run unprivileged without sudo. I'd expect podman to handle mounted volumes transparently without leaking low-level details about UID and filesystem mappings. Otherwise all scripts will need to contain -u root, which doesn't look very secure.
We really can't handle this ourselves - these are separate users from the perspective of the kernel, and normal filesystem permissions apply.
Is it possible to configure the filesystem layer to ignore host permissions? If a container is already isolated through its filesystem path, why impose additional UID restrictions?
Maybe it is possible to implement two-layer writes? The first layer enforces permissions, so that the container can't escape the defined path, but the final writes to disk ignore the permissions. If a container needs a separate user with a volume mapping, maybe podman could switch to the double-layer concept automatically.
You can change the ownership on the volume with podman unshare chown UID PATH
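Applied to the reproduction above, that would look something like this (the 1000:1000 is only an assumption for the java user in the schemaspy image; check the image's /etc/passwd for the real values):
$ podman unshare chown -R 1000:1000 "$PWD"/html
$ podman run -it -v "$PWD"/html:/output:Z schemaspy/schemaspy:snapshot -u postgres -t pgsql11 -host 192.168.4.1 -port 5432 -db anitya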
@rhatdan PATH is a path in my directory, not /var/..., right? How do I know the UID? What will happen to the filesystem permissions after I quit this modified user namespace?
I also checked man podman-unshare and the description sounds too low-level. Maybe it could be reworded for people who are not familiar with user namespaces yet.
podman-unshare - Run a command inside of a modified user namespace.
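To make that man-page description concrete, a quick way to see what the modified namespace looks like (the 1000 below is just a placeholder; the output depends on your own UID):
$ id -u
1000
$ podman unshare id -u
0
Inside the namespace you appear as UID 0, which is why chown on files you own works there.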
I had a thought overnight.
(root in a rootless container is the user that started the container, all higher UIDs and GIDs are mapped to a block on the host given by /etc/subuid and /etc/subgid)
When I run a container with a custom USER as non-privileged, there is no root inside anymore - is that right? If so, then why not map that custom USER, instead of root, to my UID and GID?
By default, podman run as a non-root user runs as root within the container. This means the processes in the container have full namespaced capabilities. This also means that if a container process escaped the container, it would have full access to files in your homedir (based on UID; SELinux would still block it, but I have heard that some people disable SELinux). If you run the processes within the container as a different non-root UID, then those processes will run as that UID, and if they escape they would only have world access to content in your homedir.
@abitrolly I am writing a blog based on these issues. Send me your email and I can expose an early copy to you. [email protected]
Just needs to be reviewed and then I can get it published.
@rhatdan people disable SELinux because not all build scripts add the :Z suffix to volume mounts, without which volumes on SELinux do not work, and podman doesn't add this suffix automatically.
My email is [email protected]. I sent email titled "Early copy" from this address.
Given that even with podman's unprivileged model an escaped container can steal my private SSH keys, I don't think that podman is more secure than docker anymore. Private keys are more valuable than root-level access to the OS (whose main risk is, again, stealing private keys from more boxes). Now I think that double-level filesystem access control is a must-have feature for any non-privileged process containers.
People running containers not separated by SELinux are taking a big risk, since it is the main tool protecting their filesystem from containers.
An escape from Docker allows access to all keys; an escape from rootless podman only to the user's UID.
Running rootless containers in a different User Namespace would give you more protections.
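For instance, assuming a podman build that supports --userns=auto and a sufficiently large subordinate UID range, each container can get its own non-overlapping mapping:
$ podman run --rm --userns=auto docker.io/library/alpine cat /proc/self/uid_map
With such a mapping, an escaped container does not even share UIDs with your other containers or with the files in your homedir.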
@abitrolly But the bottom line is, I tell people to always run their containers as non-root, even in a rootless container.
One thing we could consider would be to add a :U option to volumes, which would chown the directory to match the primary user of the container.
Might be something to consider.
Not sure if this is a "true" solution or more of a workaround, but wouldn't --userns handle at least some of the situations where you want to mount with non-root user permissions?
For example:
podman run --rm --userns=keep-id -v /home/hostUserName/tmp:/home/containerUserName/tmp:Z -it image_name /bin/bash
This mounts tmp inside the container at /home/containerUserName/tmp with the same UID:GID inside the container as it possesses on the host.
Perhaps --userns=ns:my_namespace could be used to mount a volume with the UID:GID corresponding to the user named my_namespace?
Note: you cannot use --user myUserName and --userns=... in the same podman run .... command, as I understand it.
This issue had no activity for 30 days. In the absence of activity or the "do-not-close" label, the issue will be automatically closed within 7 days.
One thing we could consider would be to add a :U to volumes which would chown the directory to match the primary user of the container.
@rhatdan, what's your take on this issue? Shall we pursue your proposal above?
I don't think so, I am hesitant to make this more complicated. I think it is up to the user to set up the permissions correctly on the volume.
I still don't understand what unshare does. How is that different from su <user>? How does unshare know the UID to run inside?
What is the proposed solution? Is the following correct?
1. podman unshare chown -R UID /host/path
2. podman run -v /host/path:/guest/path - /guest/path is now writable
3. chown -R UID to get permissions back
Is that right?
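If so, spelled out with a hypothetical in-container UID of 1001 and a hypothetical image name, the round trip would look like:
$ podman unshare chown -R 1001:1001 /host/path
$ podman run -v /host/path:/guest/path:Z myimage
$ podman unshare chown -R 0:0 /host/path
The last step works because UID 0 inside the namespace is your own UID on the host, so it hands ownership back to you.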
Wow, this stuff is way too complicated. I have the same issue as @abitrolly (running podman as non-root, having a user inside the container that is not "root", and I cannot write to the mounted directory). I've read every comment here and I still don't have an idea how to make this work.
So, it seems like I can make it "work" by "chown"ing (on the host) the shared directory to the user-id that the non-root container user (in my case called "jenkins", because I'm using the jenkins:jenkins image from Docker hub) is mapped to on the host system. In my case, this jenkins-user from inside the container has the UID 559751 on the host system. (Btw, what is the easiest way to find this out?). So, doing sudo chown 559751 builds on the host makes the directory writable to the user inside the container. But this has two big issues:
I need to share this image with my coworkers (who don't have root privileges), so both of these points are unacceptable.
I don't think so, I am hesitant to make this more complicated. I think it is up to the user to set up the permissions correctly on the volume.
Okay, but how?
Root should not be necessary - podman unshare sticks you into the same user namespace that the rootless container uses, which gives you access to every UID/GID that the container does. Within a podman unshare shell you should be able to chown folders/files owned by your user to the UID/GID used by Jenkins. You will need to know what IDs are in use inside the container, because podman unshare is a shell on the host (though you can mount the container with podman mount and inspect its /etc/passwd to get those). This can also potentially allow you to identify the user we're mapped to on the host (su to the right UID in the podman unshare shell and touch a file in your /home - the UID there should be the one in use).
For the second issue... That is definitely a concern, and one I don't think we have an easy solution to as yet. There is talk of adding UID/GID mappings to LDAP for use across multiple systems, but they will still be unique to the user running the container for security reasons, so not portable between users.
Running rootless containers as non-root and mounting in volumes is proving to be quite complicated. I think a review of how things are right now and a discussion of how we can improve (maybe a blog?) is definitely warranted here.
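A minimal sketch of the ID-discovery step described above, assuming the image is the stock docker.io/jenkins/jenkins, that it ships grep, and that the account is called jenkins (the 1000:1000 shown is only a placeholder; use whatever the image actually reports):
$ podman run --rm --entrypoint grep docker.io/jenkins/jenkins jenkins /etc/passwd
jenkins:x:1000:1000::/var/jenkins_home:/bin/bash
$ podman unshare chown -R 1000:1000 ./builds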
@mheon Thank you very much for your detailed comment!
Root should not be necessary
I definitely needed "sudo" to execute sudo chown 559751 builds on the host. This may be because the user accounts are centrally managed and there may be something wrong with my /etc/subuid..? I remember that it was necessary for me to create this file manually some months ago. But this may be because my workstation is old and my Fedora installation has been upgraded many times over the last six years or so.
Within a podman unshare shell you should be able to chown folders/files owned by your user to the UID/GID used by Jenkins.
Well, this seems _portable_ in the sense that I should be able to write a simple shell script to automate this process for my coworkers every time they want to use my image.
unshare seems to be a strange name for this subcommand, but I probably just do not understand the deeper meaning of this. When browsing the podman subcommands in an attempt to fix my issue I would've disregarded this subcommand immediately just because of its name. ("How does _unsharing_ something help me with those permission issues?")
I will tinker around with this some more tomorrow.
I agree that unshare is a terrible name; we named it after an existing utility that enters user namespaces (doing something very similar to what we do, but not doing many of the things we do to make sure that it matches what other podman commands are doing).
@ChristianCiach have you been able to come up with a tutorial for your colleagues?
@abitrolly Sorry for the late reply. Yes, creating a simple wrapper that calls "podman unshare" before calling "podman run" works as expected. This is good enough for my use case.
Now how do you chown the directory back to the host user though?
$ mkdir tmp
$ podman unshare chown 1001:1001 tmp
$ ls -la tmp
total 0
drwxrwxr-x. 2 101000 101000 40 Jul 30 17:20 ./
drwxrwxrwt. 54 root root 2000 Jul 30 17:20 ../
$ chown $(id -u):$(id -g) tmp
chown: changing ownership of 'tmp': Operation not permitted
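The way back is the same namespace trick in reverse: your own host UID appears as 0 inside podman unshare, so chowning to 0:0 there restores ownership to your user on the host:
$ podman unshare chown -R 0:0 tmp
$ ls -la tmp
# now owned by your host user again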
It might not work in all use cases, but another workaround is to run the command in the container with the host's UID and GID by using --userns=keep-id --user=$(id -ur):$(id -gr), e.g.:
$ mkdir project
$ podman run -it --rm -v $PWD/project:/project:z --userns=keep-id --user=$(id -ur):$(id -gr) --entrypoint=/bin/bash quay.io/quarkus/ubi-quarkus-mandrel:20.1.0.1.Alpha2-java11 -c 'id; touch /project/lala'
uid=1000(1000) gid=1000 groups=1000
while without it it fails:
$ mkdir project
$ podman run -it --rm -v $PWD/project:/project:z --entrypoint=/bin/bash quay.io/quarkus/ubi-quarkus-mandrel:20.1.0.1.Alpha2-java11 -c 'id; touch /project/lala'
uid=1001(quarkus) gid=1001(quarkus) groups=1001(quarkus)
touch: cannot touch '/project/lala': Permission denied
I wonder if giving the container the host user's UID and GID makes the container unprivileged?
Containers by default are unprivileged. (Depending on your definition of unprivileged)
Running with --userns=keep-id just changes the way the user namespace is set up; it does not change the security controls on the container.
The only difference is that instead of your UID being root inside of the container, your UID stays your UID inside of the container, and the first UID listed for you in /etc/subuid becomes UID 0 inside of the container.
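One way to see the difference is to compare the UID maps with and without the option (a sketch; the exact numbers depend on your UID and /etc/subuid range):
$ podman run --rm docker.io/library/alpine cat /proc/self/uid_map
$ podman run --rm --userns=keep-id docker.io/library/alpine cat /proc/self/uid_map
In the first map, container UID 0 corresponds to your host UID; in the second, your host UID is mapped to itself inside the container.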
I'm battling this same thing. I am using a Bitnami image of PostgreSQL from Docker Hub. It has a baked-in user ID of 1001. On my Arch Linux system my UID is 1000. I would like to make a directory in my home directory for postgres to persist its data to, and be able to poke around in that directory without having to chown it all the time when I want to.
@rhatdan what is your suggestion for people using vendor-provided images that already have a UID baked in?
Well for now you can do
$ podman unshare chown 1001:1001 PATHTODIR
We could add something to the volume command to do this, but I am not sure how ugly the syntax would be.
Well for now you can do
$ podman unshare chown 1001:1001 PATHTODIR
We could add something to the volume command to do this, but I am not sure how ugly the syntax would be.
I would love to have this feature, even if it looks ugly :)
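For reference, an option along these lines did land later as the U volume flag; assuming a podman build that supports it, usage would look like:
$ podman run -it -v "$PWD"/html:/output:U,Z schemaspy/schemaspy:snapshot -u postgres -t pgsql11 -host 192.168.4.1 -port 5432 -db anitya
which tells podman to chown the source directory to match the container's primary user.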
Thanks for writing this up, it really helped me to understand what is going on here.
Unfortunately the container I am using and want to deploy and regularly update has 43 different users, with their associated group relationships. So if I understand the situation correctly, I would need to parse out all 43 entries from /etc/passwd using podman mount, then create a wrapper script that calls podman unshare with each of those. Then when the container gets updated to add a new user, I'm broken and need to go update the script.
I know it would be complex from an implementation perspective, but it would be great if podman could inspect /etc/passwd from within the image itself and pull out the appropriate non-root users, all via the --volume option, without the need for further user options.
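A rough sketch of the manual version, assuming the image ships cat, awk is available on the host, and using a hypothetical image name:
$ podman run --rm --entrypoint cat my.registry/app /etc/passwd | awk -F: '$3 != 0 {print $3":"$4}'
That prints the non-root UID:GID pairs baked into the image, which a wrapper script could then feed to podman unshare chown for the matching volume directories.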