Origin: Wrong permissions used on pv directories in oc cluster up.

Created on 22 Mar 2017 · 33 Comments · Source: openshift/origin

When using oc cluster up (1.5.0-rc.0), the persistent volume directories are created with permission 0770. Eg:

4 drwxr-xr-x 103 root root 4096 Mar 22 02:53 .
4 drwxr-xr-x   3 root root 4096 Mar 22 02:52 ..
4 drwxrwx---   2 root root 4096 Mar 22 02:52 pv0001
4 drwxrwx---   2 root root 4096 Mar 22 02:52 pv0002
4 drwxrwx---   2 root root 4096 Mar 22 02:52 pv0003

This works fine if images are forced to run as the assigned uid, because they then default to running with group root and can still write to the directory via the group when the persistent volume is mounted.

This also works if images run as root, because they still run with group root in that case and the directory is also owned by root.

It fails, though, when using anyuid with an image that was set up to run as some other uid.

So if the image runs as UID 1001, and that maps to a valid user in the passwd file whose group is users, the process cannot see inside the directory, because o+rwx is missing on the directory.
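
For illustration, the three cases above can be modelled with a simplified version of the POSIX owner/group/other check. This is only a sketch: the function name is made up for this example, gid 100 for the users group is an assumption (it is conventional on many distributions but not guaranteed), and real kernels also consider capabilities, ACLs and SELinux labels.

```python
import stat


def can_enter_directory(mode, owner_uid, owner_gid, uid, gids):
    """Simplified POSIX check: may this process list and traverse the directory?"""
    if uid == 0:
        # root normally bypasses the permission bits (assuming default capabilities).
        return True
    if uid == owner_uid:
        needed = stat.S_IRUSR | stat.S_IXUSR
    elif owner_gid in gids:
        needed = stat.S_IRGRP | stat.S_IXGRP
    else:
        needed = stat.S_IROTH | stat.S_IXOTH
    return (mode & needed) == needed


PV_MODE = 0o770  # drwxrwx--- root root, as in the listing above

# Assigned uid with group root: allowed via the group bits.
print(can_enter_directory(PV_MODE, 0, 0, uid=1000040000, gids={0}))  # True
# Image user 1001 with group users (gid 100 assumed): falls through to "other".
print(can_enter_directory(PV_MODE, 0, 0, uid=1001, gids={100}))  # False
```

Only the last case depends on the "other" bits, which is why the directories would need o+rx at minimum (o+rwx to write) for such an image to use them.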

Version

v1.5.0-rc.0

Steps To Reproduce

Use oc cluster up.

Grant anyuid role to project by running oc adm policy add-scc-to-user anyuid -z default -n myproject.

Run any image which has USER set to be some user in the passwd file which doesn't have group of root.

Mount persistent volume into container.

Try and access the mounted directory.

$ ls -las
ls: cannot open directory .: Permission denied

Current Result

Application in container cannot access the persistent volume.

Expected Result

Application should be able to access the persistent volume.

cc @csrwng @jorgemoralespou

component/composition kind/question priority/P2

Source

GrahamDumpleton

All 33 comments

Should add that workaround is to run:

docker exec origin chmod o+rwx /var/lib/origin/openshift.local.pv/*

after oc cluster up has been started.

You need to wait a while since pv directories are created in background and that may take a little while.
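
Because the directories appear in the background, a script applying this workaround has to wait for them first. The following is a minimal sketch of that wait-then-chmod idea, not part of oc cluster up. The function name, the expected_count parameter and the timeout values are assumptions for this example, and it must run somewhere the pv directory is visible (for example inside the origin container) with enough privileges to change the modes.

```python
import os
import stat
import time


def open_pv_dirs(base_dir, expected_count, timeout=60.0, poll_interval=1.0):
    """Wait until expected_count subdirectories exist, then add o+rwx to each."""
    deadline = time.monotonic() + timeout
    while True:
        dirs = sorted(
            entry.path for entry in os.scandir(base_dir) if entry.is_dir()
        )
        if len(dirs) >= expected_count:
            break
        if time.monotonic() >= deadline:
            raise TimeoutError(
                f"only {len(dirs)} of {expected_count} directories appeared"
            )
        time.sleep(poll_interval)
    for path in dirs:
        current = stat.S_IMODE(os.stat(path).st_mode)
        # Same effect as "chmod o+rwx": keep existing bits, add rwx for other.
        os.chmod(path, current | stat.S_IRWXO)
    return dirs
```

A call such as open_pv_dirs("/var/lib/origin/openshift.local.pv", expected_count=3) would block until at least three directories exist; the count to wait for is up to the caller.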

GrahamDumpleton on 22 Mar 2017

👍2

@GrahamDumpleton the default permissions we use were a conscious choice when we added pv creation to cluster up. Nowhere else in the product do we grant o+rwx permissions to directories. I understand the use case, and I guess we need to decide whether it makes sense to support it out of the box. How common is it?

@bparees @smarterclayton thoughts?

csrwng on 22 Mar 2017

I'm not experiencing this issue. Using our jenkins image as an example, as a non-root user who exists in /etc/passwd, i can delete files owned by the root group:

sh-4.2$ cat /etc/passwd
root:x:0:0:root:/root:/bin/bash
bin:x:1:1:bin:/bin:/sbin/nologin
daemon:x:2:2:daemon:/sbin:/sbin/nologin
adm:x:3:4:adm:/var/adm:/sbin/nologin
lp:x:4:7:lp:/var/spool/lpd:/sbin/nologin
sync:x:5:0:sync:/sbin:/bin/sync
shutdown:x:6:0:shutdown:/sbin:/sbin/shutdown
halt:x:7:0:halt:/sbin:/sbin/halt
mail:x:8:12:mail:/var/spool/mail:/sbin/nologin
operator:x:11:0:operator:/root:/sbin/nologin
games:x:12:100:games:/usr/games:/sbin/nologin
ftp:x:14:50:FTP User:/var/ftp:/sbin/nologin
nobody:x:99:99:Nobody:/:/sbin/nologin
systemd-bus-proxy:x:999:998:systemd Bus Proxy:/:/sbin/nologin
systemd-network:x:192:192:systemd Network Management:/:/sbin/nologin
dbus:x:81:81:System message bus:/:/sbin/nologin
jenkins:x:998:997:Jenkins Continuous Integration Server:/var/lib/jenkins:/bin/false
default:x:1000040000:0:Default Application User:/var/lib/jenkins:/sbin/nologin

sh-4.2$ whoami
default

sh-4.2$ groups
root groups: cannot find name for group ID 1000040000
1000040000

sh-4.2$ pwd
/opt/openshift

sh-4.2$ ls -l
total 8
-rw-rw-r--. 1 1001 root  852 Mar 22 14:01 base-plugins.txt
drwxrwxr-x. 2 1001 root    6 Mar 22 13:58 configuration
-rwxrwxr-x. 1 1001 root 2153 Mar 20 14:05 password-encoder.jar

sh-4.2$ rm base-plugins.txt

sh-4.2$ ls -l
total 4
drwxrwxr-x. 2 1001 root    6 Mar 22 13:58 configuration
-rwxrwxr-x. 1 1001 root 2153 Mar 20 14:05 password-encoder.jar

maybe this issue only occurs when the user also exists in the /etc/group file?

I have some concerns about creating world writable directories on the user's host machine, especially since they don't reside in /tmp, but I would agree the current experience is bad. I'm tempted to say we should do it (these aren't production machines you're running oc cluster up on), but I'd definitely like to hear other opinions.

bparees on 22 Mar 2017

Pretty concerned about world readable. What are the implications for abuse?

smarterclayton on 22 Mar 2017

I'm more worried about world writable... someone could stuff your root filesystem.

for readable, it means anyone can read whatever data you're writing from your pods, so depending on the kind of dev you're doing, i suppose you might have interesting production DB data or something in those directories.
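
To see which directories would be exposed in the way described here, one could audit the "other" permission bits after applying the workaround. A rough sketch, assuming it runs where the pv directory is visible; the function name and the flag strings are invented for this example.

```python
import os
import stat


def world_accessible_dirs(base_dir):
    """Return (path, flags) for each immediate subdirectory open to 'other'."""
    findings = []
    for entry in sorted(os.scandir(base_dir), key=lambda e: e.name):
        if not entry.is_dir(follow_symlinks=False):
            continue
        mode = stat.S_IMODE(entry.stat(follow_symlinks=False).st_mode)
        flags = []
        if mode & stat.S_IROTH:
            flags.append("world-readable")
        if mode & stat.S_IWOTH:
            flags.append("world-writable")
        if flags:
            findings.append((entry.path, flags))
    return findings
```

With the default 0770 mode the result is empty; after chmod o+rwx every pv directory shows up with both flags, which is the trade-off being discussed.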

bparees on 22 Mar 2017

@bparees As noted in steps to reproduce, "Run any image which has USER set to be some user in the passwd file which doesn't have group of root". The default account used by RH images has group root, which is why it doesn't have the problem. This issue mainly affects third-party images, as they aren't going to create a special default user with group root.
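
As a quick way to check which case an image falls into, the primary gid of each account can be read from the fourth field of its passwd entry. The sketch below is illustrative only: the function name is made up, and the two sample lines are copied from the jenkins image listing earlier in this thread.

```python
def primary_gids(passwd_text):
    """Map each user name to its primary gid from passwd-format text."""
    result = {}
    for line in passwd_text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        fields = line.split(":")
        if len(fields) < 4:
            continue  # not a well-formed passwd entry
        result[fields[0]] = int(fields[3])
    return result


SAMPLE = """\
jenkins:x:998:997:Jenkins Continuous Integration Server:/var/lib/jenkins:/bin/false
default:x:1000040000:0:Default Application User:/var/lib/jenkins:/sbin/nologin
"""

gids = primary_gids(SAMPLE)
print(gids["default"])  # 0   -> group root, can use a 0770 root:root directory
print(gids["jenkins"])  # 997 -> not group root, would be denied
```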

As for making the directories world writable, that is principally going to be a problem on Linux, isn't it? Are there any perceived dangers of having them world writable on Windows and MacOS X, where the file system is inside the Docker VM and not the host file system?

Is there any Linux specific workaround that could be used for Linux?

GrahamDumpleton on 22 Mar 2017

This issue is mainly going to be with third party images as they aren't going to create a special default user which is group root.

right but (as you noted) that's only going to be a problem if either:

1) you're using the runasany SCC such that the image runs w/ its default user
or
2) you've added logic to the startup of the image such that it not only adds the current user to /etc/passwd but it also adds that user to /etc/group

so i would not expect this to be incredibly common.

Are there any perceived dangers of having it be world writable on Windows and MacOS X where the file system is inside of the Docker VM and not the host file system?

it certainly seems less dangerous in those environments.

Is there any Linux specific workaround that could be used for Linux?

one workaround would be for us to default to creating the directories in /tmp. At least that addresses the "fill your filesystem" concern. It doesn't solve the "read production DB data" problem though. For that part the question fundamentally comes down to "how secure does the oc cluster up environment need to be?"

For now, given the relative unlikelihood of a user getting into this situation (due to the need to use a non-default SCC), i'd be inclined to just document the issue+workaround to tell people to chmod the dir as needed (w/ appropriate caveats around the implications of doing so)

bparees on 22 Mar 2017

This doesn't involve any special startup logic in an image. It is the default scenario when people use useradd or adduser in the Dockerfile to add a separate account and then set USER so the image runs as that account. If a group isn't specified when adding the user, it will use the users group.

So this would affect all images where people realise running as root is a bad idea and set the image up not to run as root. People are starting to do at least this much, even if we can't get them to design their images so that the added user has group root and all writable directories are also group root.

So I agree it may not come up often, but more because people will try to run an image and find it fails because we force it to run as a different assigned uid. They will just give up on the product altogether at that point, and wouldn't even look at setting the anyuid role because they would not know to. So that is a bigger obstacle before they even get to the point of using a persistent volume.

GrahamDumpleton on 22 Mar 2017

Unless they are smart developers and use something like oc-cluster or similar, which lowers the entry barrier.
Still confused on what oc cluster is meant for. 😐

jorgemoralespou on 22 Mar 2017

😕1

I can't see how wrappers can help with this unless we extend the wrappers' current commands for creating additional volumes manually, so that the user/group for a labelled volume can be set. Is that what you had in mind?

GrahamDumpleton on 22 Mar 2017

This doesn't involve any special startup logic in an image

like I said, it requires one of two things.

one way to hit it is logic in the image.

the other way is what you described, but you seem to be leaving out the fact that your path requires you are using the anyuid SCC, which is not the default behavior. So either you're running your image using cluster admin, or you've granted users the anyuid permission.

if you don't have anyuid permission, the image is going to run as a random uid and have the root group and all will be well.
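
The two paths described in this comment can be summarised as a small model. This is a deliberately simplified sketch of the behaviour as discussed in this thread, not OpenShift's actual admission logic. The function and type names are invented, "restricted" stands for the default SCC, and the assigned uid is just the value seen in the jenkins example above.

```python
from typing import NamedTuple


class Identity(NamedTuple):
    uid: int
    gid: int


def container_identity(scc, image_user, assigned_uid=1000040000):
    """Simplified model of the uid/gid a container ends up running with.

    scc: "restricted" (default) or "anyuid", the two cases in this thread.
    image_user: identity from the image's USER and its passwd entry.
    assigned_uid: uid picked for the project; the default is illustrative.
    """
    if scc == "restricted":
        # The image's USER is overridden; the process runs with group root.
        return Identity(assigned_uid, 0)
    if scc == "anyuid":
        # The image's own user and primary group are kept.
        return image_user
    raise ValueError(f"unhandled SCC: {scc!r}")


image_user = Identity(uid=1001, gid=100)  # gid 100 ("users") is an assumption
print(container_identity("restricted", image_user))
print(container_identity("anyuid", image_user))
```

Only the anyuid result has a gid other than 0, which is the one combination that a 0770 root:root directory shuts out.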

bparees on 22 Mar 2017

to be clear, i'm not saying this isn't an issue, I am trying to clarify who will actually be affected by this problem and weigh that against the difficulty of solving this problem in a secure way.

bparees on 22 Mar 2017

The beauty of wrappers is that these features can be really tailored. We did provide volumes before and we can hack afterwards. I would think that sometimes it is better to give the user some choices: having admin jobs to create volumes that the user can trigger on demand, and possibly adding labels to this storage now that there are storage classes. That's overly complex to be in oc cluster, but it is something that our target developers could do: issue a command provided via a plugin.

jorgemoralespou on 22 Mar 2017

I can't see that wrappers can help with this unless extend the wrappers current commands for creating additional volumes manually to allow user/group for a labelled volume to be set. Is that what you had in mind?

I'm missing where the reference to wrappers came from.

bparees on 22 Mar 2017

Wrappers come up because we already do various things around oc cluster up to make it more usable to developers because you don't want those things done in oc cluster up.

Realistically, we will probably just end up adding stuff in our wrappers to make it possible to create additional labelled/claimed volumes with specific permissions to cover it.

GrahamDumpleton on 22 Mar 2017

If we only had a clear goal on what to expect of oc cluster, feature
requests would probably not rush and we could probably provide more mindful
thought. At least me and grumpy.

jorgemoralespou on 22 Mar 2017

So this would affect all images where people realise running as root is a bad idea and so set it up to not run as root. People are starting to do this much at least, even if we can't get them to design their images so that when adding the user they make it have group root, and then fix all writable directories to also be group root.

@GrahamDumpleton in this scenario, the user would have to:

1) realize running as root is bad
2) rewrite their Dockerfile to not run as root but somehow need to run as some other special uid
3) have their image fail on OpenShift because it will have a random non-root uid
4) determine why that is the case and how to use the AnyUID security context constraint
5) decide that OpenShift randomizing the uid somehow doesn't solve point 1 above...
6) hit this issue

I'm not sure this is really realistic -- what steps did you follow to get to this point? If you're attempting to run as non-root why is your special user better than the default OpenShift random uid?

stevekuznetsov on 23 Mar 2017

I took someone else's image from Docker Hub and tried to use it. Simple as that.

So, I know the extra hoops you have to go through to construct images so they will work with the UID assigned by OpenShift to the project, but people who know nothing about OpenShift don't. It is wrong to assume that this situation is the exception.

GrahamDumpleton on 23 Mar 2017

I took someone else's image from Docker Hub and tried to use it. Simple as that.

I think the point @bparees was trying to make before was that it is not as simple as that, as you also have to change the default SCC in OpenShift. By default, this won't happen as by default the random non-root uid the container will run with will be in the root group. Did you also make those changes? oc cluster up won't do that.

stevekuznetsov on 23 Mar 2017

"oc cluster up" is the fastest and easiest way for a developer with Docker
installed to start running an OpenShift cluster that lets them test,
develop, and iterate locally on apps.

It is not

Ansible-lite
A power user tool
Deeply configurable
Insecure by default

smarterclayton on 23 Mar 2017

👍1

There have been multiple blog posts on the OpenShift blog telling people that, when taking images from Docker Hub, they may need to use anyuid, and explaining how to set it up. This is because the images might run as root, or as some other specific user in the passwd file.

The original Origin All in One Vagrant VM image set anyuid on by default, as I am told the CDK did also. This is all because as soon as people take images from Docker Hub Registry and find they don't work, they will just give up on OpenShift. This information therefore isn't some secret lore.

GrahamDumpleton on 23 Mar 2017

And real life developers (not openshift engineers) will do that on their local laptops, as they will be trying things out.

jorgemoralespou on 23 Mar 2017

This is all because as soon as people take images from Docker Hub Registry and find they don't work, they will just give up on OpenShift.

Is this a fact or an assumption? I believe people accept this behaviour already on OpenShift Online...

gbraad on 5 May 2017

It's a fact.
The assumption is that Online users accept that behavior. Only a few people who want to run their own software on Online adapt it. The rest look for an alternative solution.

jorgemoralespou on 6 May 2017

The increasingly short attention span of developers means you have to be able to show that something can give quick results with minimal friction. People are too busy these days or aren't prepared to invest time to persist with something until it works.

One can call it the StackOverflow generation. People aren't prepared to learn stuff anymore. If they have a problem they just ask on StackOverflow. If the very first part of a blog post, or documentation for a package, doesn't give them the answer they need, they stop reading. If a piece of software doesn't work out of the box, they are more likely to go and try a different software package rather than try and work out why the first doesn't work.

This is why the user developer experience is so important these days.

GrahamDumpleton on 6 May 2017

Amen

jorgemoralespou on 6 May 2017

The solution may be to give you a way to easily create PVs that have the permissions you want. I don't believe that we will allow world-write access by default.

Creating Trello Card:
https://trello.com/c/Y5B2fKsp/1255-cluster-up-allow-creation-of-world-writeable-persistent-volumes

csrwng on 27 Jun 2017

I must agree. I've been using OpenShift for some time now and it's been rather tricky to get things to work. When things (like persistence) repeatedly don't work, you can't help but feel rather discouraged. I've continued on as I do see value, but in comparison to other platforms it needs to work more reliably. For example, moving to Nexus 3 has been a pain because of some permission issue; as a developer I just need it to work so I can progress.

I do think OpenShift has a lot to offer; I'm just frustrated at the moment!

glennodickson on 12 Nov 2017

👍1

@glennodickson This issue is about a corner case that people would in most cases never encounter. This is because it relates to using images not built to best practices.

Is there a reason or a specific problem you are having which made you think this issue was a good place to comment as you did?

GrahamDumpleton on 12 Nov 2017

👎1

As I mentioned, I have been trying to get my PV to work. I noticed that others have commented on their difficulties.

My Nexus repo doesn't seem to retain the repos I add even though there is a PV attached to Nexus, and hence I came across this post. I've tried upgrading Nexus from 2.11 to 3 and now receive an error.

My comments were intended to be constructive. Apologies if I've offended you; that was not the intention. I do think OpenShift is a good platform overall. I recognise that images may not be up to standard, and therefore I would like to enquire if there is a location where reliable images (e.g. for Nexus) reside.

glennodickson on 12 Nov 2017

But are you using oc cluster up, or minishift? This issue shouldn't come up in a normal OpenShift installation. From the little detail you give, it doesn't sound like it is related.

For help in debugging issues where you can't identify a specific reason why something does not work, better places to try to get help are:

https://lists.openshift.redhat.com/openshiftmm/listinfo

https://groups.google.com/forum/#!forum/openshift

and StackOverflow.

If you post there, you would need to give more details about what you are doing, the configuration used and the specific problem/error you are seeing.

An existing issue here in the bug tracker isn't really the best place to try and have a back and forth discussion about it.

GrahamDumpleton on 12 Nov 2017

Yes, using oc cluster up on Windows 10 via VirtualBox. What do you term "normal": not a VM, and on Linux?

OK, I'll participate in these groups, thanks.

On 12 November 2017 at 15:58, Graham Dumpleton notifications@github.com
wrote:

But are you using oc cluster up, or minishift? This issue shouldn't come
up in a normal OpenShift installation. From the little detail you give, it
doesn't sound like it is related.

For help in debugging issues where you can't identify a specific reason
why something does not work, better places to try to get help are:

https://lists.openshift.redhat.com/openshiftmm/listinfo

https://groups.google.com/forum/#!forum/openshift

and StackOverflow.

If you post there, you would need to give more details about what you are
doing, the configuration used and the specific problem/error you are seeing.

An existing issue here in the bug tracker isn't really the best place to
try and have a back and forth discussion about it.

glennodickson on 12 Nov 2017

By 'normal' I mean a proper production installation of OpenShift, not development/test systems such as oc cluster up and minishift. Those shouldn't be used for production systems.

GrahamDumpleton on 12 Nov 2017
