Enhancements: Pod Security Policy

Created on 9 May 2016  ·  93 Comments  ·  Source: kubernetes/enhancements

kind/feature sig/auth sig/node stage/beta tracked/no

All 93 comments

Admission controller code is under review in: https://github.com/kubernetes/kubernetes/pull/24600

This feature is skipping straight to Beta since it has had initial exposure in OpenShift.

It will be default disabled in kubernetes/kubernetes#24600. After that goes in, we need changes in the admission controller to link PSPs to users.

Noting https://github.com/kubernetes/kubernetes/pull/20573 as a dependency for the next step on PSP (subject level access)

What's the status of this? Is the description in the first comment up to date?

Is the description in first comment up to date

no (I don't have permissions to update). I believe all of the alpha requirements have been met. The initial types, api, and tests have been merged. The admission controller is not enabled by default.

IMO the remaining work for beta/1.4 is auth integration for permissions, updating for new fields we want to constrain (seccomp - in progress, sysctl), and any required docs/tutorials.

And an e2e test.


How about interactions with cloud providers? It would be nice to easily assign each pod different IAM roles so they can access only the subset of cloud services that they actually need. Would it be in scope or is it considered a SecurityContext detail?

@therc that should be done via ServiceAccount.

@goltermann I noticed this was marked with alpha but I believe it probably needs the beta tag based on https://github.com/kubernetes/features/issues/5#issuecomment-217939650

@erictune does beta sound right based on the @pweil- comment?

@goltermann I think technically this would have been beta in 1.3; it is not new to 1.4, though development is ongoing.

Yes, beta is correct. I was incorrect when I said alpha earlier today.

great, fixed it up

@pweil- Are the docs ready? Please update the docs in https://github.com/kubernetes/kubernetes.github.io, then add the PR numbers and check the docs box in the issue description.

@janetkuo docs PR https://github.com/kubernetes/kubernetes.github.io/pull/1150

edit: https://github.com/kubernetes/kubernetes.github.io/pull/1206 is the correct 1.4 PR

cc @kubernetes/feature-reviewers

@pweil- I take it the current PR is https://github.com/kubernetes/kubernetes.github.io/pull/1206?

correct

Issues go stale after 90d of inactivity.
Mark the issue as fresh with /remove-lifecycle stale.
Stale issues rot after an additional 30d of inactivity and eventually close.

Prevent issues from auto-closing with a /lifecycle frozen comment.

If this issue is safe to close now please do so with /close.

Send feedback to sig-testing, kubernetes/test-infra and/or @fejta.
/lifecycle stale

Work is happening in 1.10 to move PSP out of the extensions API group and into the policy group.
cc @php-coder

@erictune doc updates, please? See also the 1.10 feature tracking spreadsheet (https://docs.google.com/spreadsheets/d/17bZrKTk8dOx5nomLrD1-93uBfajK5JS-v1o-nCLJmzE/edit#gid=0). Let me know if you have any questions. We need to get a docs PR reviewed and merged by 3/9. Thanks!

@php-coder ^

@Bradamant3 @liggitt What doc updates are required?

For the changes related to API group transition, I've submitted: https://github.com/kubernetes/website/pull/7562, https://github.com/kubernetes/examples/pull/206, and https://github.com/kubernetes/examples/pull/208

I'm not the right owner for PSP Doc updates.


That's all we need. I've added the PR to the tracking spreadsheet. Thanks!

@php-coder @liggitt @tallclair
Any plans for this in 1.11?

If so, can you please ensure the feature is up-to-date with the appropriate:

  • Description
  • Milestone
  • Assignee(s)
  • Labels:
    • stage/{alpha,beta,stable}
    • sig/*
    • kind/feature

cc @idvoretskyi

@php-coder Can you respond to @justaugustus 's comment with the work that you're doing here? Are there any changes other than the policy group move?

Are there any changes other than the policy group move?

No, I worked only on this.

I hope that @liggitt will update the description when he has time (because I don't have the appropriate permissions).

Done.

@tallclair just to clarify, we're tracking stable as the target for 1.11, correct?
I've updated the label, just want to make sure though.

No, this will still be beta. I'm not sure PodSecurityPolicy will ever go to stable (i.e. it will be superseded by something else), but others might disagree with me on this.

Got it. Thanks for the update, @tallclair!

@justaugustus I'll remove this from the 1.11 milestone, as no significant progress is going to happen in the current release.

No updates planned for 1.12

@tallclair I might be able to get the RunAsGroup PSP knobs into 1.12.

Ack. This will still be in beta though. At the moment there are no plans for PSP to go to GA. There are some major usability issues that need to be addressed before we progress this (see https://github.com/kubernetes/kubernetes/issues/60001 and https://github.com/kubernetes/kubernetes/issues/56174).

/unassign

/assign @tallclair

Hi
This enhancement has been tracked before, so we'd like to check in and see if there are any plans for this to graduate stages in Kubernetes 1.13. This release is targeted to be more ‘stable’ and will have an aggressive timeline. Please only include this enhancement if there is a high level of confidence it will meet the following deadlines:

  • Docs (open placeholder PRs): 11/8
  • Code Slush: 11/9
  • Code Freeze Begins: 11/15
  • Docs Complete and Reviewed: 11/27

Please take a moment to update the milestones on your original post for future tracking and ping @kacole2 if it needs to be included in the 1.13 Enhancements Tracking Sheet

Thanks!

No changes planned in 1.13.


/remove-lifecycle stale

@tallclair Hello - I’m the Enhancements Lead for 1.14 and I’m checking in on this issue to see what work (if any) is being planned for the 1.14 release. Enhancements freeze is Jan 29th, and I want to remind you that all enhancements must have a KEP.

Nothing planned for 1.14.

What are the gaps for this to be GA? I can think of a few, but I do not see any criteria in the description.

Before this can go to GA, we need to fix these problems:

  1. Flawed authorization model - PSP usage is granted through RBAC, and can either be granted to the user, or the created pod. Granting it to the user is the intuitive approach, but is problematic (see explanation). This approach also has some security issues.
  2. Difficult to roll out - Because pods are rejected by default, we cannot roll out PSP to all clusters without breaking them. Similarly, users who would like to enable PSP need to ensure complete coverage of all workloads before they can turn it on, which makes it difficult to enable (hence the relatively low usage). Because the feature must be explicitly enabled, we do not have adequate test coverage (the feature matrix is too complex).
  3. Inconsistent API - Less of a fundamental issue, but evolution of the PSP API over a long span of k8s releases has led to a number of inconsistencies making it difficult to use. In particular, mutation is lumped together with validation, which can lead to some unexpected results when a pod has access to multiple PSPs.
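
The first two problems can be illustrated with a small Python sketch. This is a toy model rather than the admission controller's actual code: `Policy`, `GRANTS`, `usable_policies` and `admit` are hypothetical names, RBAC is reduced to a set of (subject, policy name) pairs, and a policy has a single knob.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Policy:
    name: str
    allow_privileged: bool


# Toy stand-in for RBAC: (subject, policy name) pairs that may "use" a policy.
GRANTS = {
    ("user:alice", "restricted"),
    ("serviceaccount:ci/builder", "privileged"),
}

POLICIES = {
    "restricted": Policy("restricted", allow_privileged=False),
    "privileged": Policy("privileged", allow_privileged=True),
}


def usable_policies(user, service_account):
    """Policies reachable through either the requesting user or the pod's service account."""
    subjects = {f"user:{user}", f"serviceaccount:{service_account}"}
    return [POLICIES[name] for subject, name in sorted(GRANTS) if subject in subjects]


def admit(user, service_account, pod_privileged):
    """Admit only if some usable policy allows the pod; no usable policy means rejection."""
    for policy in usable_policies(user, service_account):
        if policy.allow_privileged or not pod_privileged:
            return True
    return False
```

In this model `admit("bob", "default/default", False)` is rejected even though the pod asks for nothing special, simply because no grant exists, which is the roll-out problem in item 2. The same privileged pod is rejected for `alice` with the `default/default` service account but admitted with `ci/builder`, so the outcome depends on which subject holds the grant.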

@liggitt and I have some ideas about how to address this, but there's an open question about whether this belongs in core Kubernetes. I'd like to have a roadmap out in the next month, either a plan for going to GA or a plan for deprecation.

Thanks for sharing the information!

Because pods are rejected by default, we cannot roll out PSP to all clusters without breaking them.

I guess it is not really that. We did this by first creating a PodSecurityPolicy that is open enough (or even allows everything), and then gradually refining it.

@zhouhaibing089 a Kubernetes user can use that method, and it works because they control the policies. However, we cannot roll that out as a Kubernetes default: PodSecurityPolicy only opens the cluster up, which makes a system-controlled allow-all PSP very difficult to manage.

Hello @liggitt @tallclair , I'm the Enhancement Lead for 1.15. Is this feature going to be graduating alpha/beta/stable stages in 1.15? Please let me know so it can be tracked properly and added to the spreadsheet. The community proposal will need to be migrated to a KEP for inclusion in 1.15.

Once coding begins, please list all relevant k/k PRs in this issue so they can be tracked properly.

No changes planned for 1.15

@tallclair Would really love to see this land as GA in 1.16. Is that possible?

@lachie83 No, we're not sure we want PodSecurityPolicy to go to GA. It's not clear that this is a use case that should be solved by Kubernetes core, and we're looking into out-of-core alternatives. If you'd like to discuss it in more detail, it's a good topic for SIG-Auth.

@tallclair Would stuff like Open Policy Agent's gatekeeper be a better path to head down?

Yes, exactly. That might be the leading contender, and I'm working closely with that team to make sure it will cover these use cases.

The one thing I've been waiting for is a tool that could translate PodSecurityPolicy --> OPA Rego policy. That would make deprecating them a lot easier from your standpoint.

@tallclair appreciate the prompt response

@SEJeff agreed. We will not deprecate PodSecurityPolicy until there is a clear replacement with feature parity and migration path.

Hey @tallclair, you mentioned a roadmap to GA or a plan for deprecation. Seems like we're leaning towards the latter.

Do we have something written to help people that are looking at PSPs as a solution close the loop?

Not yet. Part of the hesitation is that we don't want to say that we'll be deprecating it in favor of something else until there is a clear replacement. Although I'm excited about gatekeeper, it doesn't have the features (or stability) we need to replace PSP yet. Another possibility is that we could move PSP out of tree, and bring it to GA as an admission webhook (the 2 options aren't mutually exclusive). We haven't formally laid out a roadmap yet though.

Wtf

Hi @tallclair looks like nothing is happening here for 1.16 as well so I'll keep it the same.

Hey there @tallclair -- 1.17 Enhancement lead here -- it looks like this is staying as is for 1.17. If that changes, please don't hesitate to give me a poke and I can add it to the tracking sheet 👍

Has there been any more discussion on a clear path for the future of PSP?

Yes, exactly. That might be the leading contender, and I'm working closely with that team to make sure it will cover these use cases.

@tallclair - we have implemented most PSP checks in Kyverno. Can you help take a look? Would love to discuss ideas and details.

https://github.com/nirmata/kyverno/blob/master/samples/README.md

The Gatekeeper project has also been looking at what a post-PSP world would look like. Our initial approach has been to break down the PSP resource into individual constraints. We were wondering what the community’s thoughts were on this approach. Maybe it would be a good time to re-imagine how these policies compose? Migration for both new users and existing PSP users will also be important.

cc @maxsmythe @sozercan @tsandall

I have some concerns about breaking the policies into individual constraints, namely that I need to create many more constraint objects. If I then need to clone or alter those for different workloads, I'm worried it will become very complex.

I think the best approach would be a user-centric one. If we can get real feedback on how PSPs are being used, and then see what a similar setup would look like under these alternative plugins, then that can help shape the design.

@tallclair one of the use cases we are pursuing is related to namespace based multi-tenancy. The intent is to use policies to enforce restrictions and ensure that namespaces are properly configured.

Before this can go to GA, we need to fix these problems:

  1. Flawed authorization model - PSP usage is granted through RBAC, and can either be granted to the user, or the created pod. Granting it to the user is the intuitive approach, but is problematic (see explanation). This approach also has some security issues.

@tallclair , I'm wondering about the above -- can you elaborate on how this approach is problematic and/or has security issues?

Can someone more informed please confirm this tweet:

https://twitter.com/TechJournalist/status/1197658440040165377

And if that's true, what should people using PSP to limit Linux capabilities today do going forward?

Hi all,
This is a very interesting discussion and we are currently looking for solutions to secure Pod creation in Kubernetes clusters.

We have taken a look at both OPA Gatekeeper and PodSecurityPolicies, as well as the effort to reimplement PSP in OPA constraints.
The fundamental issue we found with this comparison is that we are dealing with two opposite models.

  • OPA Gatekeeper follows the open-by-default model, in which everything is allowed and the admin forbids certain things with constraints.
  • PSP follows the closed-by-default model, in which everything is forbidden and the admin allows certain things with policies.

From a security perspective, I would argue that the PSP model is better, albeit more difficult to bring into existing clusters as all workloads must conform to it.
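
The difference between the two models can be sketched in a few lines of Python. This is only an illustration under simplifying assumptions: a pod is reduced to a dict of boolean "features", and `open_by_default` and `closed_by_default` are hypothetical helpers, not Gatekeeper or PSP APIs.

```python
def open_by_default(pod, forbidden):
    """Admit unless the pod uses a feature that was explicitly forbidden."""
    return not any(pod.get(feature, False) for feature in forbidden)


def closed_by_default(pod, allowed):
    """Admit only if every feature the pod uses was explicitly allowed."""
    return all(feature in allowed for feature, used in pod.items() if used)


# A pod that asks for host networking, which the admin never thought about.
pod = {"privileged": False, "hostNetwork": True}

print(open_by_default(pod, forbidden={"privileged"}))   # True: nothing forbids hostNetwork
print(closed_by_default(pod, allowed=set()))            # False: nothing allows hostNetwork
print(closed_by_default(pod, allowed={"hostNetwork"}))  # True: explicitly allowed
```

The first call shows the gap in question: in the open model a feature nobody wrote a constraint for is admitted, while in the closed model it is rejected until someone allows it.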

How do you plan to bridge this fundamental gap in architecture, between PSP and the Constraint Framework?

/cc @ritazh I'd love to hear your opinion on this, since you have worked on porting the PSP functionality to OPA.

The different approaches definitely make migration more complicated. We're exploring different ways to make the transition smoother.

In a perfect world, I agree that deny-all-by-default is the more secure approach. However, it's one of the things that makes PSP so difficult to use and roll out. In practice, I think gradually ratcheting down permissions is more feasible, and as the old adage goes "the best security is the security you use".

On a side note, we're also discussing how to opt out / exclude / get exceptions to constraints (for example to protect the kube-system namespace). Depending on how this works, you could implement a deny-by-default approach by locking everything down, and then granting exceptions. I'm not sure if that's a use case we want to design for though...

@tallclair do you expect any progress on this in 1.18? I am an enhancements shadow for the release and would like to know if we should be tracking this.

No changes planned for 1.18


/remove-lifecycle stale

@tallclair Hi Tim. Any plans for this in 1.19?

No plans for v1.19, although I'm hoping to see some movement in the v1.20 timeframe.

Just stumbled upon Kubernetes Pod Security Policies with Open Policy Agent. @tallclair, can you share what's blocking us and where help is needed? Happy to contribute as well.

can you share what's blocking us

Basically we just need to make a decision on a path forward. Currently, I think there is agreement that PSP should not go GA in its current form, but we haven't settled on a replacement yet. Options we've discussed:

  1. No replacement - recommend choosing a third-party option with an admission webhook. The recently published Pod security standards doc is trying to make this smoother by promoting equivalent functionality.
  2. Alternative built-in controls
    1. @deads2k has proposed upstreaming OpenShift's SecurityContextConstraints
    2. I've proposed a minimally configurable feature that only enforces the standard policies linked above (and recommends a third-party solution when more configurability is needed)
  3. Fix PodSecurityPolicy - Some of the issues are so core to the design that fixing them would require non-backwards-compatible changes, at which point it might as well be a new alternative under (2)

Am I reading https://github.com/kubernetes/kubernetes/pull/90603 correctly that because the pod security standards are published there's no planned replacement for PSPs in the API server and any replacement will need to be implemented as an outside system?

See https://github.com/kubernetes/enhancements/issues/5#issuecomment-637066475

The deprecation schedule for the current beta version in 1.22 is independent of whether or not an in-tree implementation of the standard pod security profiles will be provided. That has not yet been determined.

Thanks @liggitt, I was confirming that nothing had been set. I originally thought nothing would be deprecated until a replacement was available, and it wasn't clear whether a decision had been made one way or another.

The deprecation timeline is not specific to PSP and was added as part of https://github.com/kubernetes/enhancements/tree/master/keps/sig-architecture/1635-prevent-permabeta

If I'm reading this correctly, what's pushing the deprecation is that no API should be in the same beta version for more than 9 months. So PSP needs to be either promoted or deprecated, and since there won't be any new beta or GA version of PSP, it needs to get on track for deprecation even though a replacement hasn't been decided on?

If I'm reading this correctly, what's pushing the deprecation is that no API should be in the same beta version for more than 9 months

Exactly. All future beta versions of all built-in APIs will come with a prebaked deprecation and removal target when they are first introduced.

Hi @tallclair

Enhancements Lead here. Any plans for this in 1.20?

Thanks,
Kirsten

No plans for v1.20.

This situation is extremely frustrating for those of us needing to run high security clusters. Our options are:

  1. Sit and wait for PSPs to be deprecated.
  2. Outsource workload runtime security enforcement to a vendor (Styra), since OPA does not document how to run a fully restrictive PSP replacement with Rego.

So, my company has implemented fully locked down PSPs. They are not easy to implement, and debugging them is a chore, but they are highly functional, and they actually work. We even published a blog post detailing how they could be used this way, and how to handle exceptions when they happen.

IMO, the PSP beta should be merged as is into the mainline kubernetes core. My reasons are:

  1. While PSPs have flaws, they function, and have functioned for 10 releases.
  2. Kubernetes as a project should care about workload runtime security. Container escapes are far too easy. PSPs are one of the few tools which make this harder for attackers.
  3. Perfect is the enemy of Done. Merge PSPs as is, and push off the "better" implementation to policy/v2.
  4. Finally, and most important, it allows OSS developers to run higher security clusters, not just companies who can afford vendors like Styra.

@zapman449 can you clarify what you mean by a "fully restrictive PSP replacement"?

Hopefully the Gatekeeper PSP library makes it easier to enforce rules similar to those used by PSP. I'm definitely interested in any functional gaps you are seeing.

@zapman449 would you happen to have a link to that blogpost?

@maxsmythe I haven't caught up on what Gatekeeper PSP is doing, will review.

However, what I mean is:

  1. Full control over process capabilities like NET_BIND_SERVICE, SYS_ADMIN, etc
  2. Restrict UID/GID/FSGroups to non-zero values
  3. Explicit listing of host paths which can be mounted
  4. Explicit listing of types of volume mounts allowed
  5. Block Privileged, block privilege escalation
  6. Block access to host level interprocess communication primitives
  7. Block access to host networking
  8. Restrict which host ports are allowed
  9. Enforce readOnlyRootFilesystem
  10. Connection point for SELinux

These are provided today with PSPs.
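
To make a few of these restrictions concrete, here is a rough Python sketch that checks a pod spec (as a plain dict using the usual pod field names) against a subset of the list above. It is an illustration only, not how PSP or any replacement is implemented: `violations` and the two allow-lists are hypothetical, only container-level `securityContext` is inspected, and an unset `runAsUser` is treated as root for simplicity.

```python
ALLOWED_CAPABILITIES = {"NET_BIND_SERVICE"}                 # assumption: example allow-list
ALLOWED_VOLUME_TYPES = {"configMap", "emptyDir", "secret"}  # assumption: example allow-list


def violations(pod_spec):
    """Return human-readable reasons why pod_spec breaks the example rules."""
    found = []
    # Host namespaces (items 6 and 7).
    for field in ("hostNetwork", "hostIPC", "hostPID"):
        if pod_spec.get(field, False):
            found.append(f"{field} is not allowed")
    # Volume types (item 4): each volume is a dict with "name" plus one type key.
    for volume in pod_spec.get("volumes", []):
        for volume_type in sorted(set(volume) - {"name"}):
            if volume_type not in ALLOWED_VOLUME_TYPES:
                found.append(f"volume type {volume_type} is not allowed")
    for container in pod_spec.get("containers", []):
        ctx = container.get("securityContext", {})
        name = container.get("name", "?")
        # Privileged mode and privilege escalation (item 5).
        if ctx.get("privileged", False):
            found.append(f"{name}: privileged is not allowed")
        if ctx.get("allowPrivilegeEscalation", True):
            found.append(f"{name}: allowPrivilegeEscalation must be false")
        # Non-zero UID (item 2); unset is treated as root in this sketch.
        if ctx.get("runAsUser", 0) == 0:
            found.append(f"{name}: must run as a non-zero UID")
        # Read-only root filesystem (item 9).
        if not ctx.get("readOnlyRootFilesystem", False):
            found.append(f"{name}: readOnlyRootFilesystem must be true")
        # Added capabilities (item 1).
        extra = set(ctx.get("capabilities", {}).get("add", [])) - ALLOWED_CAPABILITIES
        for cap in sorted(extra):
            found.append(f"{name}: capability {cap} is not allowed")
    return found
```

A pod is acceptable under these example rules when `violations(pod_spec)` returns an empty list. Host path allow-lists, host port ranges and SELinux options are left out to keep the sketch short.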

If we're asking for a wishlist, I'd love:

  1. Smart defaults for syscalls from containers. Most containers need only a small percentage of the total Linux syscall list. Let me restrict most containers to that list, then allow me either to explicitly allow certain calls for certain pods owned by certain service accounts, or to grant carte blanche to specific service accounts.
  2. Let me dream a bit more, and I'll come up with something. ;-)
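
The first wishlist item could look roughly like the lookup below. This is a hypothetical Python sketch of the decision logic only: the syscall names in `DEFAULT_SYSCALLS`, the service account names, and `syscall_allowed` are all made up for illustration, and real enforcement would have to happen in the container runtime (for example through something like a seccomp profile), not in code like this.

```python
# Assumption: a tiny illustrative default list, not a recommended profile.
DEFAULT_SYSCALLS = frozenset({"read", "write", "openat", "close", "exit_group"})

# Assumption: hypothetical per-service-account additions on top of the default.
EXTRA_SYSCALLS = {
    "monitoring/agent": frozenset({"ptrace"}),
}

# Assumption: hypothetical service accounts that are exempt from the list.
UNRESTRICTED = {"kube-system/node-debugger"}


def syscall_allowed(service_account, syscall):
    """Decide whether a pod running as service_account may make syscall."""
    if service_account in UNRESTRICTED:
        return True  # carte blanche for explicitly trusted service accounts
    allowed = DEFAULT_SYSCALLS | EXTRA_SYSCALLS.get(service_account, frozenset())
    return syscall in allowed
```

Most workloads would fall through to the default list, `monitoring/agent` additionally gets `ptrace`, and `kube-system/node-debugger` is unrestricted.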

@zapman449 - If you haven't seen it, we discussed the future of PSPs in the last sig-auth meeting (https://docs.google.com/document/d/1woLGRoONE3EBVx-wTb4pvp4CI7tmLZ6lS26VTbosLKM/view#heading=h.hsgtsqg83z5u). We'll continue the discussion in the December 9th meeting if you're able to make it, but we also won't make any final decisions without a proposal being sent to the mailing list.

Our intention here is absolutely not to leave anyone high and dry. We know that PSPs address an important security need for Kubernetes, and the purpose of these discussions is to figure out what the best way to meet those needs in the future is.
