This RFC proposes a modification to the linkerd install flow, dividing Linkerd
installation into stages grouped by permission level.
Installation of the full suite of components requires three commands, with
varying combinations of flags:
linkerd install-cni
linkerd install
linkerd install --tls optional
linkerd install --tls optional --proxy-auto-inject
linkerd install-sp
All components listed above, grouped by privilege, --admin and --user:
$ linkerd install --help
Output Kubernetes configs to install Linkerd.
Usage:
linkerd install [flags]
Flags:
--admin Install components requiring cluster-wide privileges.
--user Install components requiring namespace-wide privileges.
The linkerd check command can be modified to mirror the linkerd install flow:
linkerd check --admin
linkerd check --user
Default usage:
linkerd install --admin
Possible modifier:
linkerd install --admin --cni
linkerd install --user
linkerd install shall continue to work as an atomic installation process,
assuming the user has cluster access sufficient to install all Linkerd
components.
To address the timing issue between installing CRD/serviceprofiles and the
control-plane ServiceProfiles, the ServiceProfiles shall be installed via a
job configured in the linkerd install output.
@siggy a downside of installing ServiceProfiles via a job after the fact is that we'll need extra RBAC (which is probably okay, we just need to be aware of it).
@grampelberg Re: ServiceProfiles I could go either way, adding more RBAC isn't great, but a 3rd command isn't great either. I'd rather it work one way rather than both.
- linkerd install --admin --cni as a possible opt-in modifier. Unless there is a compelling use case for installation without CNI, I'd prefer to limit the configuration surface area by always including it.
- proxy-init containers sufficient permission to modify IP tables? This may be another argument for the previous point, where we can avoid supporting different installation configs by limiting the surface area.
- Meta point: In this issue I'd like us to nail down the user-facing ergonomics of installation. The underlying implementation is important, and may influence the UX, but deeper discussions around install/inject changes should probably happen elsewhere, #2095 being a good example.
😻
When neither --user nor --admin are specified, is the result just the concatenation of the --admin and --user outputs?
I see service profiles listed under install --user. Should those be installed by a job instead?
I wonder what we can do with the naming of these commands to make it more clear that these are sequential steps rather than alternatives. I still like the idea of having the --admin flag be called something like --prepare, --setup, --init, etc. Or even a separate subcommand linkerd prepare, linkerd setup, linkerd init, etc.
Building off the previous point, it may be possible for linkerd install to be context sensitive. It can inspect the state of the cluster (linkerd check basically) and determine if the admin setup has already been completed or if the user has admin permissions:
Re: neither --user nor --admin being specified, and service profiles getting installed by a job, see the Developer installation section. ;)
I'm open to naming changes; the intent of --admin and --user was to convey the role of the person doing the install, in addition to privilege level. I'm also open to having them be separate commands. The ergonomics around a full-suite linkerd install being a concatenation of linkerd install --admin and linkerd install --user felt intuitive to me.
Context sensitive commands are interesting. I think there's a lot we could leverage in our check infrastructure when performing installs around providing feedback to the user. I'm a bit hesitant to make the output of cli commands context-dependent, as it could make scripts/docs/instructions less deterministic.
That's a good point, predictability is really important.
FYI: Everything related to TLS in this proposal will change between 2.2 and 2.3
Product goals, in priority order:
In other words: clarity first, easiness last.
What those priorities suggest to me is that it's more important to have the service profile explicitly in the command output (instead of created by a job) than it is to have a single atomic install command for developers.
This suggests putting the service profiles in the --user output and scrapping the atomic install mode.
It feels like we need to think through the check implications here at the same time.
I'm a little torn. From a workflow perspective, it could be:
$ linkerd check setup
$ linkerd install setup # --admin
# -----------
# Optional
# -----------
$ linkerd check cni
$ linkerd install cni
# -----------
$ linkerd check components
$ linkerd install components # --user
setup and cni are the wrong words, linkerd check components isn't right either. I prefer sub-commands to flags as I suspect we'll have flags that are specific to each and confusing when all mashed together.
This suggests that there's the possibility of:
$ linkerd check install
$ linkerd install
To make this work, we'd need to change the install flow from linkerd install | kubectl apply to doing it all within install and having significantly reduced configuration options. I still think the concept of outputting YAML is solid, but maybe this is a good middle ground? You can get all the YAML if you follow the advanced procedure and just have it work for the simple one. It would be possible to prompt the user and explain what's happening to their cluster in the linkerd install case as well in a more succinct version than we're able to with YAML.
Going this route, we could use some checks to suggest install modes (we noticed you don't have RBAC for CRDs, you might want to install in unprivileged mode).
Also, it feels like there's a whole other sub-command for what is --single-namespace now:
$ linkerd check unprivileged
$ linkerd install unprivileged
While it'd be possible to split this across setup and components using a flag, feels like the whole thing is overly complicated.
Just my 2c:
- linkerd install CLI wizard which performs multiple steps sequentially is a really nice user experience, but it also is in tension with the project goals of being transparent and non-magical. Having two different install methods (applying yaml and install wizard) means more surface area to support and maintain. My vote would be to not have this.
- We should standardize on if linkerd check X is intended to run before or after linkerd install x. I think running check after makes sense, which means we'd also need a linkerd check pre which can be run before any linkerd install command.
- I also like the idea of linkerd check without any further sub-command being something that runs ALL the checks in sequence and basically tells you which command to run next. i.e. "Your cluster is correctly set up, next run linkerd install control-plane" or whatever.
All together I think the workflow would be:
linkerd check pre # All good, cluster is ready for Linkerd setup
linkerd install setup # Must be run by admin
linkerd check setup # All good
linkerd check # Linkerd setup is complete, next run `linkerd install control-plane`
linkerd install control-plane # Can be run by user
linkerd check control-plane # All good
linkerd check # Linkerd control-plane installed, next inject some services
linkerd inject
linkerd check data-plane # All good
linkerd check # All good
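The idea of a bare linkerd check that reports the next step could be sketched roughly as follows. This is only an illustration under stated assumptions: the linkerd check <stage> sub-commands are the ones proposed in this thread (not an existing CLI), the function name next_step and the CHECK variable are made up for the example, and each stage's check is assumed to pass once that stage and all earlier ones are complete.

```shell
#!/bin/sh
# Hypothetical sketch: walk the install stages in order and report the first
# stage whose check fails, together with the command to run next.
# CHECK is the command used to check a single stage. The default is the
# `linkerd check <stage>` form proposed in this thread, which is an assumption.
CHECK=${CHECK:-linkerd check}

next_step() {
  for stage in pre setup control-plane data-plane; do
    # $CHECK is left unquoted on purpose so "linkerd check" splits into words.
    if ! $CHECK "$stage" >/dev/null 2>&1; then
      case "$stage" in
        pre)           echo "cluster is not ready for Linkerd; fix the failing pre-checks" ;;
        setup)         echo "next: linkerd install setup (must be run by an admin)" ;;
        control-plane) echo "next: linkerd install control-plane" ;;
        data-plane)    echo "next: linkerd inject" ;;
      esac
      return 1
    fi
  done
  echo "all good"
}
```

Because the stages are still checked one by one, the distinct linkerd check <stage> steps stay available for scripts, and the summary is only a convenience layered on top.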
How important is it that we support an unprivileged mode? I don't think it's unreasonable to make having a cluster admin run a setup command a hard prerequisite for Linkerd. This lets us remove unprivileged/single-namespace entirely, which greatly simplifies the code and reduces the number of configuration permutations.
It is pretty important that someone other than cluster admins operate the control plane. If there was some YAML that a cluster admin could run to set up cluster level things (like the CRDs) and still have the control plane restricted to a single namespace, that might be a valid workaround. I have some concerns around upgrade instructions, in particular whether it would be possible at all to do #1903.
We should standardize on if linkerd check X is intended to run before or after linkerd install x. I think running check after makes sense, which means we'd also need a linkerd check pre which can be run before any linkerd install command.
In a perfect world, I'd actually prefer to just run check pre as part of each install step. It makes the whole thing easier for users. That way, check becomes a post-run command and install becomes a little bit more predictable (with a flag to opt out of course).
I also like the idea of linkerd check without any further sub-command being something that runs ALL the checks in sequence and basically tells you which command to run next. i.e. "Your cluster is correctly set up, next run linkerd install control-plane" or whatever.
I dunno, I'm not a big fan of this. You'll need to be able to run the distinct steps anyways. Feels a little magical.
If there was some YAML that a cluster admin could run to set up cluster level things (like the CRDs) and still have the control plane restricted to a single namespace, that might be a valid workaround.
Yes, I think that is how it would work. A cluster admin would run linkerd install setup | kubectl apply -f - (which creates CRDs, etc) and create a namespace for a regular user to use: kubectl create ns alex. Then a regular user would be able to run linkerd install -l alex control-plane | kubectl apply -f -. This would install the control plane into the alex namespace without requiring any cluster level permissions.
I think this gets us everything we want without any special --single-namespace mode.
Just need to think through the ClusterRole and Role implications (which is the biggest differentiator right now).
Related to --single-namespace and down in the weeds, the control-plane components currently take a --single-namespace flag, and behave differently based on this setting.
That said, I don't love the fact that we pass this flag around, and would much prefer that the control-plane components detect and degrade gracefully when confined to a single namespace. I just wanted to call this change out to help scope.
ClusterRole vs Role is an interesting wrinkle I hadn't thought about until now. Just as a starting point for discussion, what about something like this?
linkerd install setup takes a flag which specifies which namespaces Linkerd can access (default: all) and creates a ClusterRole with permissions on those namespaces. It also takes a flag which specifies which namespaces the control-plane can be installed into (default: linkerd) and creates ClusterRoleBindings for the linkerd service accounts in those namespaces.
Then, linkerd install control-plane creates a service account, a Role that grants access within that namespace, and a RoleBinding that connects them.
This unlocks the following use-cases:
# Creates a ClusterRole that can read all namespaces and a ClusterRoleBinding for the linkerd
# service account in the Linkerd namespace.
linkerd install setup | kubectl apply -f -
# The control plane runs under the linkerd service account and can therefore read from all
# namespaces.
linkerd install control-plane | kubectl apply -f -
# Creates a ClusterRole which can read from the test1, test2, and test3 namespaces. Also creates
# a ClusterRoleBinding for the linkerd service account in the alex namespace. This command is
# run by a cluster admin.
linkerd install setup --data-plane-namespaces=test1,test2,test3 --control-plane-namespaces=alex | kubectl apply -f -
# This command is run by someone running the linkerd control plane in the alex namespace. Since
# it runs under the linkerd service account in the alex namespace, it has permissions to read from
# namespaces test1, test2, and test3. It also creates a Role and RoleBinding so that it can read from
# its own namespace.
linkerd install control-plane -l alex | kubectl apply -f -
# No ClusterRole or ClusterRoleBindings are created.
linkerd install setup --data-plane-namespaces="" --control-plane-namespaces=""
# Creates the linkerd control plane in the alex namespace. It also creates a Role and RoleBinding
# so that it can read from its own namespace. This is effectively single-namespace mode. linkerd
# does not have permissions to read from any other namespace.
linkerd install control-plane -l alex | kubectl apply -f -
As far as I can tell, the --single-namespace flag is used to control two things in the controller code:
Re: ClusterRole vs Role, I think it's an interesting idea to have linkerd install control-plane create a Role for itself.
Re --data-plane-namespaces and --control-plane-namespaces, do we need both? This seems to provide customization beyond what --single-namespace does today. What if instead we just did:
linkerd install setup
linkerd install control-plane
linkerd install setup -l foo
linkerd install control-plane -l foo
What about just removing the ClusterRoleBindings entirely from linkerd install setup with a flag?
linkerd install setup --no-rbac
You can't make a ClusterRoleBinding that is namespace specific. This feels like it'd be nicer for cluster admins as well to audit.
I'd always looked at ClusterRoleBindings and RoleBindings as an either/or kind of thing, but I like the idea of always just doing a Role/RoleBinding on the namespace level. Makes permissions a little more clear.
You can't make a ClusterRoleBinding that is namespace specific. This feels like it'd be nicer for cluster admins as well to audit.
I believe you can have a ClusterRoleBinding that has a service account (from a specific namespace) as the subject:
subjects:
- kind: ServiceAccount
name: linkerd-controller
namespace: alex
Yup, that's because service accounts are tied to namespaces. The ClusterRole would still give you access to everything in every namespace.
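One way to get namespace-scoped access for that service account, without a ClusterRoleBinding, is a RoleBinding in each target namespace that references a ClusterRole: Kubernetes then grants the role's permissions only inside the RoleBinding's namespace. The sketch below prints such bindings. It is illustrative only: the function name emit_rolebindings and the ClusterRole name linkerd-controller are assumptions, and the service account name is taken from the snippet above.

```shell
#!/bin/sh
# Sketch only: print one RoleBinding per data-plane namespace.
# Usage: emit_rolebindings CONTROL_PLANE_NS DATA_PLANE_NS...
# The ClusterRole name "linkerd-controller" is an assumption for illustration.
emit_rolebindings() {
  control_ns=$1
  shift
  for data_ns in "$@"; do
    # Each binding lives in a data-plane namespace and points back at the
    # service account in the control-plane namespace.
    cat <<EOF
---
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: linkerd-controller
  namespace: ${data_ns}
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: linkerd-controller
subjects:
- kind: ServiceAccount
  name: linkerd-controller
  namespace: ${control_ns}
EOF
  done
}
```

For example, emit_rolebindings alex test1 test2 test3 | kubectl apply -f - would let the control plane in alex read test1, test2, and test3 only, and each binding is visible to an admin auditing that namespace.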
What is the summary of "things in the admin install" versus "things in the user install"? I.e. when we add new resources, what's the question that we use to determine which phase the resource should be installed within?
Is it safe to summarize this issue as the following:
- install --admin emits CRDs, ClusterRoles, ClusterRoleBindings, and the linkerd-cni config
- install --user emits all other resources (including those previously installed by install-sp).
Is that it?
@olix0r That's basically the gist.
Another way to think about it:
- linkerd install --admin is for things that require privileges outside of a predetermined set of namespaces
- linkerd install --user is for things that only require privileges within a predetermined set of namespaces
There is still some discussion around supporting something like a --single-namespace mode, and how that would relate to a multistage install.
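As a rough illustration of the rule for deciding which phase a new resource belongs to, the split could be approximated by resource kind. This is a sketch under assumptions, not how linkerd decides: the function name stage_for_kind is made up, only the kinds named in this thread are listed, and the linkerd-cni config would need to go in the admin stage regardless of its kind.

```shell
#!/bin/sh
# Hypothetical helper: map a Kubernetes resource kind to an install stage.
# Only the kinds named in this thread are treated as admin resources;
# everything else defaults to the user stage.
stage_for_kind() {
  case "$1" in
    CustomResourceDefinition|ClusterRole|ClusterRoleBinding)
      echo admin ;;
    *)
      echo user ;;
  esac
}
```

For instance, stage_for_kind ClusterRole prints admin, while stage_for_kind Deployment prints user.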
Completed as part of #2719. Tracking subsequent work via #2337.