hi,
right now the status field in cnp looks like this:
status:
  nodes:
    minikube:
      enforcing: true
      lastUpdated: 2018-05-25T00:09:50.623919535Z
      localPolicyRevision: 408
      ok: true
this doesn't quite tell me which policies are in effect. it would be nice to have a field that can be used to identify which policies are being enforced on each node, maybe something like a map from namespace to cnp labels/annotations.
@michi-covalent The status field is per policy object and applies to all rules. Can you clarify what you mean exactly by "doesn't quite tell me which policies are in effect"?
oh ok that's amazing. somehow i thought the status was global.
@tgraf how about exposing resourceVersion of the cnp in the status field? right now there is lastUpdated timestamp, but with resourceVersion you can check which version of cnp is in effect more reliably right?
> how about exposing resourceVersion of the cnp in the status field? right now there is lastUpdated timestamp, but with resourceVersion you can check which version of cnp is in effect more reliably right?
👍 I like it
I don't think using resourceVersion will work. See the following, in which I modified CNP to have a ResourceVersion field in its Status:
Import a CNP:
$ kubectl apply -f test/k8sT/manifests/l3_l4_policy.yaml
ciliumnetworkpolicy "rule1" created
Describe CNP:
$ kubectl describe cnp
Name: rule1
Namespace: default
Labels: <none>
Annotations: kubectl.kubernetes.io/last-applied-configuration={"apiVersion":"cilium.io/v2","description":"L3-L4 policy","kind":"CiliumNetworkPolicy","metadata":{"annotations":{},"name":"rule1","namespace":"default...
API Version: cilium.io/v2
Kind: CiliumNetworkPolicy
Metadata:
Cluster Name:
Creation Timestamp: 2018-05-25T20:20:27Z
Generation: 0
Resource Version: 2072
Self Link: /apis/cilium.io/v2/namespaces/default/ciliumnetworkpolicies/rule1
UID: 104b3215-6059-11e8-ac37-08002740ae4f
Spec:
Endpoint Selector:
Match Labels:
Any : Id: app1
Ingress:
From Endpoints:
Match Labels:
Any : Id: app2
To Ports:
Ports:
Port: 80
Protocol: TCP
Status:
Nodes:
K 8 S 1:
Enforcing: true
Last Updated: 2018-05-25T20:42:54.681213989Z
Local Policy Revision: 2
Ok: true
Resource Version: 942
Events: <none>
I used the ResourceVersion in the Status section (942) from the ObjectMeta.ResourceVersion that was in the CNP which was provided to us via K8s. However, because we update the CNP with the new Status containing the ResourceVersion being used, K8s gives the CNP a new ResourceVersion because it's been updated. So, comparing the ResourceVersion in the Status vs. in the Metadata won't work because updating the one in Status will result in a change in the one in Metadata.
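The feedback loop described above can be reproduced with a toy model. This is only a sketch, not Kubernetes or Cilium code: `FakeStore` is a made-up stand-in for an API server, and the only behaviour it assumes is that every write to an object assigns a fresh, larger version.

```python
class FakeStore:
    """Hypothetical stand-in for an API server: every write bumps a counter."""

    def __init__(self):
        self._counter = 0

    def write(self, obj):
        # Each write, including a status-only update, gets a new version.
        self._counter += 1
        stored = dict(obj)
        stored["resourceVersion"] = self._counter
        return stored


store = FakeStore()

# The user applies the policy; the agent observes this version.
seen = store.write({"spec": {"port": 80}, "status": {}})

# The agent records the version it saw in the status. That is itself a write,
# so the object's own version moves on.
after = store.write(
    {"spec": seen["spec"], "status": {"resourceVersion": seen["resourceVersion"]}}
)

status_rv = after["status"]["resourceVersion"]
metadata_rv = after["resourceVersion"]
print(status_rv, metadata_rv)
```

In this model the version stored in the status always trails the object's own version, which matches the pattern in the outputs in this thread (4026 vs. 4027, 7298 vs. 7299).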
I looked further into the K8s ObjectMeta type and found that there exists a Generation field:
However, this field never increases in value. For example, using the following policy:
$ cat foo.yaml
apiVersion: "cilium.io/v2"
kind: CiliumNetworkPolicy
description: "L3-L4 policy"
metadata:
  name: "rule1"
spec:
  endpointSelector:
    matchLabels:
      id: app1
  ingress:
  - fromEndpoints:
    - matchLabels:
        id: app2
    toPorts:
    - ports:
      - port: "84"
        protocol: TCP
After importing, generation is zero:
$ kubectl describe cnp
Name: rule1
Namespace: default
Labels: <none>
Annotations: kubectl.kubernetes.io/last-applied-configuration={"apiVersion":"cilium.io/v2","description":"L3-L4 policy","kind":"CiliumNetworkPolicy","metadata":{"annotations":{},"name":"rule1","namespace":"default...
API Version: cilium.io/v2
Kind: CiliumNetworkPolicy
Metadata:
Cluster Name:
Creation Timestamp: 2018-05-25T20:20:27Z
Generation: 0
Resource Version: 4027
Self Link: /apis/cilium.io/v2/namespaces/default/ciliumnetworkpolicies/rule1
UID: 104b3215-6059-11e8-ac37-08002740ae4f
Spec:
Endpoint Selector:
Match Labels:
Any : Id: app1
Ingress:
From Endpoints:
Match Labels:
Any : Id: app2
To Ports:
Ports:
Port: 84
Protocol: TCP
Status:
Nodes:
K 8 S 1:
Enforcing: true
Last Updated: 2018-05-25T21:21:09.592687988Z
Local Policy Revision: 6
Ok: true
Resource Version: 4026
Events: <none>
I edited the port number from 84 --> 85 in the policy:
$ cat foo2.yaml
apiVersion: "cilium.io/v2"
kind: CiliumNetworkPolicy
description: "L3-L4 policy"
metadata:
  name: "rule1"
spec:
  endpointSelector:
    matchLabels:
      id: app1
  ingress:
  - fromEndpoints:
    - matchLabels:
        id: app2
    toPorts:
    - ports:
      - port: "85"
        protocol: TCP
And imported it:
$ kubectl apply -f foo2.yaml
ciliumnetworkpolicy "rule1" configured
Generation does not increase:
$ kubectl describe cnp
Name: rule1
Namespace: default
Labels: <none>
Annotations: kubectl.kubernetes.io/last-applied-configuration={"apiVersion":"cilium.io/v2","description":"L3-L4 policy","kind":"CiliumNetworkPolicy","metadata":{"annotations":{},"name":"rule1","namespace":"default...
API Version: cilium.io/v2
Kind: CiliumNetworkPolicy
Metadata:
Cluster Name:
Creation Timestamp: 2018-05-25T20:20:27Z
Generation: 0
Resource Version: 7299
Self Link: /apis/cilium.io/v2/namespaces/default/ciliumnetworkpolicies/rule1
UID: 104b3215-6059-11e8-ac37-08002740ae4f
Spec:
Endpoint Selector:
Match Labels:
Any : Id: app1
Ingress:
From Endpoints:
Match Labels:
Any : Id: app2
To Ports:
Ports:
Port: 85
Protocol: TCP
Status:
Nodes:
K 8 S 1:
Enforcing: true
Last Updated: 2018-05-25T22:27:16.444970231Z
Local Policy Revision: 47
Ok: true
Resource Version: 7298
Events: <none>
Generation numbers are further documented here: https://github.com/kubernetes/community/blob/master/contributors/devel/api-conventions.md#concurrency-control-and-consistency
@michi-covalent discovered a GH issue around generation not being increased for CRDs: https://github.com/kubernetes/kubernetes/issues/62595#issuecomment-381398410 :
For 1.9.2, the metadata.generation for the custom resource will not change as the spec changes. It can be made to increment only from 1.10.
So, we can't really use this value for Cilium to reliably indicate what the realized version of the CNP is in the status because we support K8s versions >= 1.7 :(
@ashwinp discussed offline that we could use some type of label / annotation to the CNP in the ObjectMeta that we put in the Status once it's realized. I.e., for CNP foobar, we attach annotation baz and once it is realized in Cilium, we populate a field in Status called annotations with baz. Then, upon the next iteration of this CNP foobar, it would be annotated differently with boom instead of baz. Once it is realized in Cilium, we would update the annotations in the status with boom instead of baz. This does put the onus upon the user to update the annotations within the CNP, though, to be different upon each update of the same CNP.
> This does put the onus upon the user to update the annotations within the CNP, though, to be different upon each update of the same CNP.
thanks for looking into this @ianvernon and good catch on resourceVersion stuff. for the usecase @ashwinp and i are interested in we can live with using either a label or an annotation as an identifier.
> for the usecase @ashwinp and i are interested in we can live with using either a label or an annotation as an identifier.
The existing labels in the spec itself can be used for this already. You can add a generation=X label and just change it for each rule change and wait for the enforcing bit to become true.
The resourceVersion would work as well as k8s tells you what the new resource version is when updating any object. So yes, you can't compare it against the resourceVersion in metadata (which would be identical to just looking at the enforcing bit anyway) but you can compare it against the resourceVersion that was associated with your policy change.
@tgraf
> The existing labels in the spec itself can be used for this already. You can add a generation=X label and just change it for each rule change and wait for the enforcing bit to become true.
I just tried this by editing the CNP rule to change a port from 8080 to 8081 AND changing the generation from X to Y. The localPolicyRevision and the port in the rule were updated, but the generation was not updated.
We'd like to apply a cilium network policy via kubectl and then answer the question -- Has the policy been enforced on all the nodes in my cluster? -- without maintaining any state. That is why, intuitively, it seems that if we can update a label/annotation in the CNP, to say policyuuid: Y from policyuuid: X, and then keep checking the output of kubectl get cnp's node status section until we see that value with enforcing: true, we can achieve this in a stateless fashion.
Proposed illustration:
"status": {
    "nodes": {
        "minikube": {
            "enforcing": true,
            "lastUpdated": "2018-05-28T17:53:04.666994387Z",
            "localPolicyRevision": 20,
            "metadata": {"annotations": {"policyuuid": "Y"}},
            "ok": true
        }
    }
}
@ianvernon After moving to Annotations, things seem to be working fine:
"status": {
    "nodes": {
        "minikube": {
            "annotations": {
                "policyuuid": "http://box.local:32000/test",
                ....snip......
            },
            "enforcing": true,
            "lastUpdated": "2018-06-06T23:20:38.195459781Z",
            "localPolicyRevision": 10,
            "ok": true
        }
    }
}
}
However, updating just the Annotations in the CNP does not update the annotations in the status section. If an annotation is added or removed, the status section currently reflects the change only if there is also a substantive change in the policy rules.
Part 1 PR is: https://github.com/cilium/cilium/pull/4412
Part 2 TODO: Status should reflect changes to Annotations even when none of the Policy rules have been updated.
Part 3 TODO: Unit Tests
FYI - I will be working on the updating of annotations of rules which have equivalent spec / specs. I will also have a fix out in the same PR to use the DeepCopy of the annotations instead of the pointer to the rule passed into addCiliumNetworkPolicyV2, as the object is a pointer that could be changed by other subsystems.
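The aliasing concern behind the DeepCopy fix is not specific to Go. The Python sketch below is an analogy only, not Cilium code: the dict names are made up, and `copy.deepcopy` stands in for the Go DeepCopy mentioned above.

```python
import copy

# Annotations attached to the rule object that was passed in.
rule_annotations = {"policyuuid": "X"}

# Storing the same dict means the status shares state with the rule.
status_alias = {"annotations": rule_annotations}

# Storing a deep copy takes an independent snapshot.
status_copy = {"annotations": copy.deepcopy(rule_annotations)}

# Another subsystem later mutates the rule's annotations.
rule_annotations["policyuuid"] = "Y"

print(status_alias["annotations"]["policyuuid"])
print(status_copy["annotations"]["policyuuid"])
```

The aliased status silently follows the later mutation, while the copied one keeps the value that was current when the policy was realized.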
I'm coming to this from #4492, so I probably lack some context.
Is there a reason why we prefer the CNP status over the CEP status? In particular, the CEPs show the rule labels (I'm not sure about annotations) that are desired and enforced for that endpoint. I realise that this would require iterating over more objects but it seems a lot more reliable than assuming that the enforcing: true behaviour of the nodes means: the policy is now in effect on all endpoints on the node. This is especially true if nodes are scheduling many pods, since you have a race.
> Is there a reason why we prefer the CNP status over the CEP status? In particular, the CEPs show the rule labels (I'm not sure about annotations) that are desired and enforced for that endpoint. I realise that this would require iterating over more objects but it seems a lot more reliable than assuming that the enforcing: true behaviour of the nodes means: the policy is now in effect on all endpoints on the node. This is especially true if nodes are scheduling many pods, since you have a race.
Are the CEPs guaranteed to be updated after the datapath changes have been committed? How do you know, from looking at the CEPs, which endpoints a policy rule should apply to and which CEPs should contain the rule labels?
> Are the CEPs guaranteed to be updated after the datapath changes have been committed?
Yes, because they update every 10s or so. We also explicitly show the desired and realised policy so you can distinguish between the stages of implementation. All this is per endpoint, so it would be correct as they build.
We keep trying to optimise endpoint builds, however, so I'm not sure if similar problems to what #4492 fixes would crop up here too (probably yes).
> How do you know, from looking at the CEPs, which endpoints a policy rule should apply to and which CEPs should contain the rule labels?
Unless you have a very specific selector you won't know beforehand. You'd have to scan the whole CEP list. While that is more data moving around, the specificity isn't worse than relying on the node info: How do you know which node an endpoint is on? I realise that, in this case, we look at the policy itself but you can't assert much until all nodes report enforcing: true, and that means you have the list of nodes anyway.
Like I said, I lack context here so perhaps this is the most effective method. My primary confusion is why the node is more important than the endpoint, which is the real active unit in our system.
> Yes, because they update every 10s or so. We also explicitly show the desired and realised policy so you can distinguish between the stages of implementation. All this is per endpoint, so it would be correct as they build.
> We keep trying to optimise endpoint builds, however, so I'm not sure if similar problems to what #4492 fixes would crop up here too (probably yes).
The potential 10 second delay does not make it very attractive.
> Unless you have a very specific selector you won't know beforehand. You'd have to scan the whole CEP list. While that is more data moving around, the specificity isn't worse than relying on the node info: How do you know which node an endpoint is on?
What we have today provides the following:
This means the inspector does not need to understand endpoints at all. Waiting for every node entry to report enforcing: true with matching annotations is enough to know when a policy has been implemented.
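A minimal sketch of that check, assuming the per-node status shape shown earlier in this thread (`ok`, `enforcing`, `annotations`). The helper name `policy_realized`, the `policyuuid` values and the sample document are illustrative, not part of Cilium.

```python
import json


def policy_realized(cnp, key, expected):
    """Return True when every node entry is ok, enforcing, and carries the expected annotation."""
    nodes = cnp.get("status", {}).get("nodes", {})
    if not nodes:
        return False  # no node has reported yet
    return all(
        node.get("ok") is True
        and node.get("enforcing") is True
        and node.get("annotations", {}).get(key) == expected
        for node in nodes.values()
    )


# Hypothetical, trimmed output of `kubectl get cnp rule1 -o json`.
sample = json.loads("""
{
  "status": {
    "nodes": {
      "minikube": {
        "annotations": {"policyuuid": "Y"},
        "enforcing": true,
        "lastUpdated": "2018-06-06T23:20:38.195459781Z",
        "localPolicyRevision": 10,
        "ok": true
      }
    }
  }
}
""")

print(policy_realized(sample, "policyuuid", "Y"))
print(policy_realized(sample, "policyuuid", "X"))
```

The function only inspects the entries present under status.nodes, so the caller keeps no state beyond the annotation value it just applied. It can be re-run on fresh kubectl output until it returns True.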
ok
The initial feature has landed; https://github.com/cilium/cilium/pull/4492 is a fix on top of it, so moving this to 1.2-bugfix accordingly.
@joestringer The backport referenced above was closed with unmerged commits. Does this issue still need to be backported? If not, can you provide some insight? Thanks.