Cloud-on-k8s: Status subresource updates fail when the CRD version changes

Created on 6 Dec 2019 · 3 Comments · Source: elastic/cloud-on-k8s

Forked from https://github.com/elastic/cloud-on-k8s/pull/2184#issuecomment-562577557.

There is a bug in Kubernetes that prevents status subresources from being updated when the storedVersion of a CRD has changed.

The update is rejected with an error such as:

{"reason":"FieldValueInvalid","message":"Invalid value: \"elasticsearch.k8s.elastic.co/v1beta1\": must be elasticsearch.k8s.elastic.co/v1","field":"apiVersion"}]},"code":422

This is fixed in Kubernetes 1.15.

>bug

All 3 comments

Two potential solutions to this problem:

  1. Change the way we update the status: issue an update for the entire resource instead of updating the status subresource. It's not exactly clear to me what the benefit of updating only the status subresource is. It seems the entire resource is still part of the payload. Maybe it allows bypassing the resourceVersion optimistic locking check?

  2. Force an update of the resource to the latest version (which should be the version with storage: true) at the beginning of the reconciliation. This should be a simple no-op update with the same content. How do we know the current storedVersion of a given resource? There is no clear API for that. However, we could store it ourselves in an annotation of the resource (elasticsearch.elastic.k8s.co/version: v1), then just look at this annotation to know whether we should do the no-op update.

Number 2 has the benefit of also solving https://github.com/elastic/cloud-on-k8s/issues/2190: if we ever want to completely remove v1beta1, we need to convert existing resources first, which would be done here.
I think it could also help with breaking changes in future CRDs for cases where we cannot run conversion webhooks (Kubernetes versions older than 1.15, or webhooks disabled): we could embed the conversion logic as part of the resource update.
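
To make option 2 more concrete, here is a minimal Go sketch of the annotation check. It is an illustration only, not the operator's actual code: `Object`, `Updater`, `needsVersionUpdate` and `ensureStoredVersion` are hypothetical names, and `Updater` stands in for whatever client call would write the full resource back to the API server. The annotation key and the `v1` value are taken from the comment above.

```go
package main

import "fmt"

// Annotation key and value as proposed in the comment above.
const versionAnnotation = "elasticsearch.elastic.k8s.co/version"

// currentVersion is assumed to be the CRD version served with storage: true.
const currentVersion = "v1"

// Object is a hypothetical, minimal stand-in for a Kubernetes resource.
type Object struct {
	Annotations map[string]string
}

// Updater is a hypothetical stand-in for a full-resource update call.
type Updater func(obj *Object) error

// needsVersionUpdate reports whether the version annotation is missing or stale.
func needsVersionUpdate(obj *Object) bool {
	// Reading from a nil map is safe in Go and yields the empty string.
	return obj.Annotations[versionAnnotation] != currentVersion
}

// ensureStoredVersion issues the no-op update only when the annotation does
// not match currentVersion. It returns true if an update was sent.
func ensureStoredVersion(obj *Object, update Updater) (bool, error) {
	if !needsVersionUpdate(obj) {
		return false, nil
	}
	if obj.Annotations == nil {
		obj.Annotations = map[string]string{}
	}
	obj.Annotations[versionAnnotation] = currentVersion
	// On failure the in-memory object keeps the annotation; a real
	// reconciliation would presumably re-fetch the resource before retrying.
	if err := update(obj); err != nil {
		return false, err
	}
	return true, nil
}

func main() {
	obj := &Object{}
	calls := 0
	update := func(*Object) error { calls++; return nil }

	updated, _ := ensureStoredVersion(obj, update)
	fmt.Println(updated, calls) // true 1

	// Second reconciliation: annotation is current, so no update is sent.
	updated, _ = ensureStoredVersion(obj, update)
	fmt.Println(updated, calls) // false 1
}
```

The check runs at the start of each reconciliation, so resources created under v1beta1 would be rewritten once and then skipped on later passes.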

https://github.com/kubernetes/community/blob/master/contributors/devel/sig-architecture/api-conventions.md#spec-and-status explains the convention around the status subresource. IIUC, we should decouple updating the spec from updating the status. An update to the spec is not supposed to change the status, and the other way around.

A third approach discussed with @anyasabo:

  1. We could just catch that particular error, and when it happens, update the whole Kibana resource instead of the status only.
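
A rough Go sketch of that fallback, under stated assumptions: `apiStatusError` is a hypothetical, flattened stand-in for the 422 response quoted in the issue, and `resourceClient` is a hypothetical interface rather than the real client API. In an actual controller the detection would more likely build on `apierrors.IsInvalid` from `k8s.io/apimachinery/pkg/api/errors` and inspect the error causes for the `apiVersion` field.

```go
package main

import (
	"errors"
	"fmt"
)

// apiStatusError is a hypothetical, simplified view of the API server
// rejection quoted in the issue (code 422, FieldValueInvalid on apiVersion).
type apiStatusError struct {
	Code   int
	Reason string
	Field  string
}

func (e *apiStatusError) Error() string {
	return fmt.Sprintf("%d %s on field %q", e.Code, e.Reason, e.Field)
}

// isAPIVersionMismatch reports whether err looks like the stored-version bug.
func isAPIVersionMismatch(err error) bool {
	var se *apiStatusError
	if !errors.As(err, &se) {
		return false
	}
	return se.Code == 422 && se.Reason == "FieldValueInvalid" && se.Field == "apiVersion"
}

// resourceClient is a hypothetical abstraction over the two update paths.
type resourceClient interface {
	UpdateStatus() error // updates the status subresource only
	Update() error       // updates the whole resource
}

// updateStatusWithFallback tries the status subresource first and falls back
// to a full update only for the apiVersion mismatch; other errors are returned.
func updateStatusWithFallback(c resourceClient) error {
	err := c.UpdateStatus()
	if err != nil && isAPIVersionMismatch(err) {
		return c.Update()
	}
	return err
}

// fakeClient is a test double used to exercise the fallback.
type fakeClient struct {
	statusErr   error
	fullUpdates int
}

func (f *fakeClient) UpdateStatus() error { return f.statusErr }
func (f *fakeClient) Update() error       { f.fullUpdates++; return nil }

func main() {
	c := &fakeClient{statusErr: &apiStatusError{Code: 422, Reason: "FieldValueInvalid", Field: "apiVersion"}}
	err := updateStatusWithFallback(c)
	fmt.Println(err, c.fullUpdates) // <nil> 1
}
```

Keeping the status-only update as the default path preserves the spec/status separation from the API conventions, and the full update is only attempted for the one error this issue describes.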