Cluster-api: Problem with v1alpha2 references

Created on 23 Jan 2020 · 24 comments · Source: kubernetes-sigs/cluster-api

First off, huge shout out to @johnharris85 for finding this one and reaching out. We worked together to trace down what the problem is exactly. Strap in!

For background reading, please take a look at #2108.

Creating a cluster with the YAML in #2108 does not work. You will see the Cluster API machine controller getting stuck finding an empty dataSecretName. This is super weird because when we look at the bootstrap config object, the dataSecretName is populated correctly, but Cluster API reads it as an empty string.

Then we noticed the API version of the reference objects. The machine referenced a v1alpha2 type which it tries to fetch and read the dataSecretName from. You'll see that the v1alpha2 type has no dataSecretName. The way we check for the value will return an empty string and a false value which is ignored today.

The problem lies in the fact that when we get the object we've stored on disk with kubectl or a client, we return an upconverted hub version v1alpha3. But when Cluster API controllers fetch the version, they fetch the real version the reference is using. In this case it was v1alpha2.

We tested this hypothesis and changed the references to v1alpha3 and recreated the cluster. We got past that error and into another error where the ControlPlaneEndpoint could not be read. This was again, the fault of a reference pointing to a v1alpha2 type.

Suggestions to fix this issue

  • To mitigate the confusion, we should be checking for the false value that we are currently ignoring in most places where we call unstructured.NestedString.
  • There are a few ways we could consider fixing this. One simple solution, but not great for users, would be to tell users that the references should all point to hub versions. This would allow us to fetch the hub version instead of a spoke version that may be behind the hub version.
  • We could somehow fetch the hub version or perhaps run the spoke version manually through the conversion functions to create a hub version (but this seems very fragile).
  • ??? I'm still mulling over other solutions since I don't like any of these, but I suspect others here will have some thoughts :D

cc @ncdc @vincepri @detiber @randomvariable @noamran

/kind bug
/milestone v0.3.0
/priority important-soon


All 24 comments

One thing we've discussed in the past is to remove the use of object references where we don't specifically need them in favor of a <thing>Name field; this would avoid the version issue for those use cases. Since we already know the type for these kinds of references, we can just use the version from the types we import, and that should be sufficient to solve the problem.

For more generic external references it'll definitely be a bit more tricky.

@detiber I think that wouldn't work in this case because the references can live anywhere, not just known objects like a Secret, although that would be nice :/

I'm leaning towards having the controller fail in this case, because all the objects should be updated somehow, but I'd probably like to set up a sync chat so we can put our heads together on a better solution

FYI...:

The machine referenced a v1alpha2 type which it tries to fetch and read the dataSecretName from. You'll see that the v1alpha2 type has no dataSecretName. The way we check for the value will return an empty string and a false value which is ignored today.

To mitigate the confusion, we should be checking for the false value that we are currently ignoring in most places where we call unstructured.NestedString.

I don't disagree that it could be helpful to check the value of the returned bool; however, in this specific case (at least for KubeadmConfig), I believe because dataSecretName is optional, it won't be rendered by the apiserver unless there is a non-empty string value. For bootstrap providers, the expectation is that dataSecretName remains unpopulated until the secret exists, so it's acceptable to have an omitted dataSecretName. From the machine controller's perspective, I don't know that it can distinguish with 100% certainty and accuracy the difference between a dataSecretName that is omitted because the secret isn't ready yet, and one that is omitted because the bootstrap resource is from an older api version that does not have dataSecretName as a field.

I'm leaning towards having the controller fail in this case, because all the objects should be updated somehow, but I'd probably like to set up a sync chat so we can put our heads together on a better solution

How do they get updated, though? I'd prefer not to make it the users' responsibility to update all of their references. This is especially going to be tricky in places where we use the references as part of a hash to roll out changes (such as KubeadmControlPlane and MachineDeployment).

I'm partly wondering if we need to use a custom type that omits the apiversion and do discovery somehow to determine which version we should pull. For known types it'll be easy, less so for external provider-defined types.

This is unfortunately a super old issue that has no real resolution as of yet ☹️ xref https://github.com/kubernetes/kubernetes/pull/12951#discussion_r38799725

We could ignore the version portion and use API discovery to determine either the ServerPreferredVersion or the StorageVersion... That probably brings its own set of edge cases though.

We could ignore the version portion and use API discovery to determine either the ServerPreferredVersion or the StorageVersion... That probably brings its own set of edge cases though.

Yeah, I think that would introduce potential issues during upgrade after CRDs have been updated, but before the new controllers have been deployed.

We discussed this issue in a meeting together.

The simplest path forward will be to annotate CRDs with a label that indicates which version of the contract the CRD conforms to.

An example will make this clearer. If a machine has a reference to a v1alpha2 kubeadm bootstrap config, a label will be attached to the v1alpha2 kubeadm bootstrap config CRD that says it conforms to the "v1alpha2 bootstrap config" interface. This is the interface we defined when we released v1alpha2 of Cluster API (v0.2). If the reference contains a v1alpha3 kubeadm bootstrap config, the controller can determine that it conforms to the "v1alpha3 bootstrap config".

The reason for this label is to decouple the actual CRD API version from the interface we are expecting the object to conform to. This allows independent versioning of types. For example, another bootstrap provider could be at v1beta3 but conform to Cluster API's bootstrap provider v1alpha2. Assuming they are using the correct controller, all will be well.

Independently of this, we would like to talk with apimachinery and see if this sort of thing can be avoided in the future. @ncdc has a bunch of thoughts here, but this is mostly a longterm vision and outside the scope of this immediate issue.

Expected contracts:

  • InfrastructureCluster v1alpha2, v1alpha3
  • InfrastructureMachine v1alpha2, v1alpha3
  • Bootstrap v1alpha2, v1alpha3
  • ControlPlane v1alpha3

Then we can talk about which contracts our internal types (Machines, Clusters, MD, MS) and controllers are programmed against.

a label will be attached to the v1alpha2 kubeadm bootstrap config

Not the config itself, but the actual CustomResourceDefinition

a label will be attached to the v1alpha2 kubeadm bootstrap config

Not the config itself, but the actual CustomResourceDefinition

thanks, updated

FYI, #2183 is related

What do you think about "cluster.x-k8s.io/api-version-contract" as the label applied to the components?

/lifecycle active

What do you think about "cluster.x-k8s.io/api-version-contract" as the label applied to the components?

As good a starting place as any; we can adjust if necessary (before release) and deprecate if it changes after release.

Works for me.

Given that the label / annotation would apply to all versions in the CRD, maybe we should do something like cluster.x-k8s.io/storage-version-contract to be specific that the label only applies to the storage version?

Is there any way to have multiple values? Scenario:

  • Everything is running v1a3
  • Upgrade CRD so it now also defines v1a4
  • Some controllers still only talk v1a3
  • Some controllers are upgraded and talk v1a4

I assume we'd want the v1a3 controllers to pull v1a3, and the v1a4 controllers to pull v1a4, right?

The contract version should only apply for Cluster API, and it only serves the purpose of updating the references to external objects if a new storage version matches the current Cluster API version.

If other controllers are still behind, they should be able to request a different version, or am I missing something?

Ok, I think I'm overcomplicating things. I think you're right, but it would probably be useful to chat on Zoom after office hours.

Outcome of the meeting / notes:

  • Providers MUST set the cluster.x-k8s.io/<version> label on all Custom Resource Definitions related to Cluster API starting with v1alpha3.
  • The label is a map from an API Version of Cluster API (contract) to your Custom Resource Definition versions.

    • The value is a comma-delimited list of versions.

    • Each value MUST point to an available version in your CRD Spec.

  • The label allows Cluster API controllers to perform automatic conversions for object references; Cluster API
    picks the last available version in the list if multiple versions are found.
  • To apply the label to CRDs you can use commonLabels in your kustomization.yaml file, usually in config/crd.

In this example we show how to map a particular Cluster API contract version to your own CRD using Kustomize's commonLabels feature:

```yaml
commonLabels:
  cluster.x-k8s.io/v1alpha2: v1alpha1
  cluster.x-k8s.io/v1alpha3: v1alpha2
  cluster.x-k8s.io/v1beta1: v1alphaX,v1beta1
```
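Applied through kustomize, these commonLabels render as ordinary labels in each CRD's metadata. A hypothetical example of the result (the CRD name and group here are made up for illustration):

```yaml
apiVersion: apiextensions.k8s.io/v1
kind: CustomResourceDefinition
metadata:
  name: mybootstrapconfigs.bootstrap.example.io
  labels:
    # This provider's v1alpha2 CRD version fulfills the Cluster API
    # v1alpha3 contract.
    cluster.x-k8s.io/v1alpha3: v1alpha2
```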

If this looks good, I'll add it to the v1alpha2->v1alpha3 document

How does this work with regards to different contract types? For example, the cluster v1alpha2 and the machine v1alpha2 contracts are different but there's no way to indicate that in the current suggestion.

Thoughts on something like:

```
machine.cluster.x-k8s.io/v1alpha2
bootstrap.cluster.x-k8s.io/v1alpha3
```

etc?

Although, it might be enough to use the labels without the component prefix, since we generally know what kind of object we're working with by convention. For example, the infrastructure ref will always be an InfraMachine or InfraCluster, and since we know at the time of reading the reference whether we're looking for an InfraCluster or an InfraMachine, those could only fulfill the Cluster v1alphaX or Machine v1alphaX contract respectively.

The idea was to tackle separate contracts for infrastructure or bootstrap down the line (if there is a need to); for now we treat the Cluster API contract as a single group that includes all the extension points for providers.

To clarify, cluster.x-k8s.io/v1alpha2 includes core, infrastructure, and bootstrap contracts for v1alpha2.
