A while ago AWS released Permissions Boundaries to allow admins to delegate creation of roles to users while retaining control over the maximum set of permissions that can be delegated. This is needed to avoid situations where users could escalate privileges by attaching a policy with admin permissions to a role they created and then assuming it. We're using this extensively in our low-tier environments to let users play with everything they want while ensuring they can't do things such as removing key resources that we use for monitoring.
However, all of this breaks when trying to use kops, since it tries to create roles straight away without a boundary. While we could create roles and instance profiles upfront and feed them to kops as described here, this will quickly become a problem to manage should kops change the required permissions. Also, I assume additionalPolicies in the spec will stop working the moment we start using
--lifecycle-overrides IAMRole=ExistsAndWarnIfChanges,IAMRolePolicy=ExistsAndWarnIfChanges,IAMInstanceProfileRole=ExistsAndWarnIfChanges to prevent kops from creating IAM roles.
What we would really need is kops being able to apply a permission boundary on role creation starting from a variable defined in the spec.
We don't have a lot of Go experience in house, but we'd be happy to help if we could get some guidance on what to touch and how. Our assumption is that most of the changes should be in iam.go, but we're not completely sure how to do this cleanly.
Thanks for the help!
Issues go stale after 90d of inactivity.
Mark the issue as fresh with /remove-lifecycle stale.
Stale issues rot after an additional 30d of inactivity and eventually close.
If this issue is safe to close now please do so with /close.
Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
/lifecycle stale
/remove-lifecycle stale
Not really sure I can do this though :-)
Stale issues rot after 30d of inactivity.
Mark the issue as fresh with /remove-lifecycle rotten.
Rotten issues close after an additional 30d of inactivity.
If this issue is safe to close now please do so with /close.
Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
/lifecycle rotten
/remove-lifecycle rotten
Can you describe exactly what needs to change in order to support permissions boundaries? From my limited understanding of permissions boundaries, it seems like the only change is in the policy attached to the IAM entity running the kops command, not any of the policies that kops creates.
If you can paste an AWS error message or suggest how the IAM policies need to change in order to support this, that would be appreciated.
And yes depending on what needs to change exactly, it would probably either be in the iam.go you linked to or iam_builder.go
Hi @rifelpet,
the only change that needs to happen is that, when creating the various roles, kops should set a permissions boundary passed in by the user through the spec.
There's a field on the CreateRoleInput struct in the AWS Go SDK for this, called PermissionsBoundary, which accepts the ARN of the permissions boundary policy that will be attached to the role when it is created.
_No_ changes are required in the policies themselves, sorry for the confusion.
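For illustration, here's a minimal sketch of the change described above. To keep it self-contained, the relevant fields of the SDK's struct are mirrored locally rather than importing github.com/aws/aws-sdk-go/service/iam, and buildCreateRoleInput is a hypothetical helper, not an existing kops function:

```go
package main

import "fmt"

// CreateRoleInput mirrors the relevant fields of the AWS Go SDK's
// iam.CreateRoleInput for illustration; the real struct has many more fields.
type CreateRoleInput struct {
	RoleName                 *string
	AssumeRolePolicyDocument *string
	PermissionsBoundary      *string // ARN of the permissions boundary policy
}

// buildCreateRoleInput sketches the change kops would need: only set
// PermissionsBoundary when the cluster spec actually provides one.
func buildCreateRoleInput(roleName, trustPolicy, boundaryARN string) *CreateRoleInput {
	in := &CreateRoleInput{
		RoleName:                 &roleName,
		AssumeRolePolicyDocument: &trustPolicy,
	}
	if boundaryARN != "" {
		in.PermissionsBoundary = &boundaryARN
	}
	return in
}

func main() {
	in := buildCreateRoleInput("nodes.example.com", `{"Version":"2012-10-17"}`,
		"arn:aws:iam::123456789012:policy/devs-boundary")
	fmt.Println(*in.PermissionsBoundary)
}
```

Leaving the field nil when no boundary is configured keeps the existing behavior unchanged for everyone who doesn't use boundaries.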
Ah ok, that makes more sense. Supporting that should be pretty simple. The main challenge will be what the API for it looks like.
InstanceGroupSpec has the IAMProfileSpec where we can specify an AWS IAM Profile ARN. I don't think we could add a PermissionsBoundary field here because it's a property of the IAM Role and not the instance group, and we have many instance groups per IAM Role.
The ClusterSpec has an IAMSpec that defines settings that apply to all the IAM roles kops creates. We could add a new field there, but it would either need to be a single PermissionsBoundary that applies to all roles, or we'd need to structure the new field to support per-role boundaries (for example, a map of role name -> permissions boundary, or a list of objects each containing a role name and a permissions boundary).
Would users ever want different permissions boundaries per role? That seems technically possible, but I don't know how common a use case it would be. If we choose not to support that and only support a single permissions boundary for all roles in the cluster, I think it would make the most sense to add a PermissionsBoundary field to the IAMSpec.
Thoughts?
I think the chance of users requiring multiple permissions boundaries is rather low. All the setups I've seen so far have a single permissions boundary they use for creating all their IAM roles, and I honestly can't see many people complicating their lives with different boundaries per instance profile within a single cluster. Maybe one per cluster, but in that case a single PermissionsBoundary field in IAMSpec would still be enough.
Feels like a PermissionsBoundary field in the IAMSpec would cover the vast majority of use cases out there - very close to all, IMHO - and building a more complicated solution into kops to support edge cases doesn't make much sense at the moment.
Should a super-edge case arise, the old path of creating the roles out-of-band and feeding them into kops would still be available, after all.
Agreed, if users want per-role boundaries we can recommend that they manage their own roles.
In that case I think we can move forward with adding a PermissionBoundary field to IAMSpec. I'm happy to do this when I get around to it but I think it would be a simple enough addition that if anyone else wants to take on the work I can provide assistance.
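To make the proposal concrete, a rough sketch of the shape the IAMSpec addition could take. This is purely illustrative - the final field name and surrounding fields are up to the maintainers, and the struct here is a stand-in, not kops's actual api type:

```go
package main

import "fmt"

// IAMSpec is a stand-in for kops's cluster-level IAM settings; only
// PermissionsBoundary is the proposed addition, the rest is illustrative.
type IAMSpec struct {
	Legacy              bool
	PermissionsBoundary *string // ARN applied to every role kops creates
}

func main() {
	arn := "arn:aws:iam::123456789012:policy/devs-boundary"
	spec := IAMSpec{PermissionsBoundary: &arn}
	fmt.Println(*spec.PermissionsBoundary)
}
```

Using a pointer keeps the field optional, so omitting it from the spec leaves role creation exactly as it is today.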
To my mind this issue needs more attention. It is quite easy to solve, but it has nevertheless stayed open for almost a year. :(
Usage of boundaries has become more or less standard in corporate environments, and the lack of support for this feature creates substantial issues with using kops there. On the other hand, it looks like the whole issue can be solved with ~10-20 lines of code, which makes it a "quick win" for everybody.
Does anybody have half an hour to solve it? It would be very much appreciated.
I don't at the moment. But I am happy to review a PR with 10-20 lines of code :)
I have absolutely no knowledge of the kops structure and build process, but even I can find the places in the code that need to be changed. In fact, one string configuration option has to be added and passed unchanged to the AWS API call. How hard could it be for someone with good knowledge of the project?
@voroniys Even though technically you are right, you miss the bigger point. Kops maintainers are not your contractors. This is an open source project. Sending in a PR is the right way to go.
Or reaching out to them in their preferred way, e.g. here : https://kops.sigs.k8s.io/welcome/contributing/#office-hours
Sending in a PR is the right way to go.
That is what I do with projects based on technologies I'm familiar with. Unfortunately, Golang is not one of them.
This has been implemented and will be in Kops 1.19
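For anyone landing here later, the released feature is a single cluster-wide setting under the iam section of the cluster spec. A sketch of the usage (the ARN below is a placeholder):

```yaml
spec:
  iam:
    permissionsBoundary: arn:aws:iam::123456789012:policy/devs-boundary
```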
/close
@rifelpet: Closing this issue.
In response to this:
This has been implemented and will be in Kops 1.19
/close
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.