Linkerd2: Update 'check' command to validate HA

Created on 1 Oct 2019 · 10 Comments · Source: linkerd/linkerd2

Following https://github.com/linkerd/linkerd2/issues/3305#issuecomment-525033155, we'd like to update the linkerd check command to validate the properties of HA control planes, such as:

  1. The control plane components (except for web, grafana and prometheus) should have more than one replica.
  2. The replicated pods should be scheduled on different hosts.
  3. The FailurePolicy of the mutating and validating webhook configurations (MWC and VWC) should be set to Fail to prevent uninjected workloads from entering the service mesh.
  4. The control plane components are assigned CPU and memory resource requirements.
  5. kube-system has the inject skip annotation.
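As a rough illustration of items 1, 2 and 4, here is a minimal sketch in Go that runs over a simplified in-memory model rather than the Kubernetes API. The `Pod` and `Component` types and the `checkHA` function are hypothetical names made up for this example; they are not linkerd2's actual health-check code.

```go
package main

import "fmt"

// Pod is a simplified, hypothetical stand-in for a Kubernetes pod.
type Pod struct {
	Name             string
	Node             string
	HasCPURequest    bool
	HasMemoryRequest bool
}

// Component is a hypothetical control plane component and its pods.
type Component struct {
	Name string
	Pods []Pod
}

// Components that the issue exempts from the replica-count requirement.
var singleReplicaOK = map[string]bool{"web": true, "grafana": true, "prometheus": true}

// checkHA returns one message per violated HA property.
func checkHA(c Component) []string {
	var problems []string
	// Item 1: more than one replica, unless the component is exempt.
	if !singleReplicaOK[c.Name] && len(c.Pods) < 2 {
		problems = append(problems, fmt.Sprintf("%s: expected more than one replica, found %d", c.Name, len(c.Pods)))
	}
	seen := map[string]string{} // node name -> first pod seen on that node
	for _, p := range c.Pods {
		// Item 2: replicas should be scheduled on different hosts.
		if other, ok := seen[p.Node]; ok {
			problems = append(problems, fmt.Sprintf("%s: pods %s and %s share node %s", c.Name, other, p.Name, p.Node))
		} else {
			seen[p.Node] = p.Name
		}
		// Item 4: CPU and memory requests should be set.
		if !p.HasCPURequest || !p.HasMemoryRequest {
			problems = append(problems, fmt.Sprintf("%s: pod %s is missing a CPU or memory request", c.Name, p.Name))
		}
	}
	return problems
}

func main() {
	c := Component{Name: "controller", Pods: []Pod{
		{Name: "controller-1", Node: "node-a", HasCPURequest: true, HasMemoryRequest: true},
		{Name: "controller-2", Node: "node-a", HasCPURequest: true},
	}}
	for _, p := range checkHA(c) {
		fmt.Println(p)
	}
}
```

A real check would read the pods, their node assignments and their resource requests from the cluster instead of hard-coded structs.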
Labels: area/cli, good first issue, help wanted

All 10 comments

Let's check to make sure the kube-system namespace has the skip annotation as well.

Actually, IIRC, in order to support multi-stage installation, linkerd check currently doesn't check cluster-scoped resources, because the user running the command may not have the required RBAC permissions. So checking the MWC/VWC's FailurePolicy and the kube-system label won't work here.

I think it is fine to do a warning that says we were unable to check because of RBAC.

Hi, @grampelberg @ihcsim !
I'd like to work on this issue. Where can I start from?

@mayankshah1607 the list @ihcsim has is pretty good. You'll want to:

  1. Look up whether the control plane was installed with --ha.
  2. Introduce a new section in check for HA.
  3. Add each line from the description as a separate check.

@grampelberg @ihcsim
The PR https://github.com/linkerd/linkerd2/pull/3731 implements some parts of this issue: it adds a new section to the checks that runs when --ha is enabled, and checks whether kube-system has the inject skip annotation.
I guess I'd have to wait for https://github.com/linkerd/linkerd2/pull/3731 to get merged to prevent any overlaps. Once merged, I'll open a PR that implements the remaining checks mentioned above. Is that ok?

@mayankshah1607 sgtm; I'll keep you posted.

@ihcsim @grampelberg It seems like point 5 (kube-system annotation) has been fixed in https://github.com/linkerd/linkerd2/pull/3731. Could we remove it from this issue then?

@grampelberg I think we should close this one now that https://github.com/linkerd/linkerd2/pull/3942 has been merged :)
