Kops: Add option --update to kops create to make it idempotent

Created on 25 Apr 2017 · 41 Comments · Source: kubernetes/kops

It would be nice for CI systems if typing
kops create ...
always gave the same result.
But running it twice returns an "(...) already exists" message:

Repro steps

  • Create the cluster:
kops create cluster \
  --name k8s.prod.eu-west-1.aws.redacted.net \
  --zones eu-west-1a,eu-west-1b,eu-west-1c \
  --state s3://k8s/kops \
  --dns-zone=redacted.net
  • Run the same command again:
kops create cluster \
  --name k8s.prod.eu-west-1.aws.redacted.net \
  --zones eu-west-1a,eu-west-1b,eu-west-1c \
  --state s3://k8s/kops \
  --dns-zone=redacted.net

It fails with:
cluster "k8s.prod.eu-west-1.aws.redacted.net" already exists; use 'kops update cluster' to apply changes

Proposal

Add an --update flag to kops create.

If the cluster already exists and this flag is present, update it instead of creating it.
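
A sketch of the proposed usage (note: --update is the flag proposed in this issue, not an existing kops option):

# hypothetical: --update does not exist yet; it is the flag this issue proposes
kops create cluster \
  --name k8s.prod.eu-west-1.aws.redacted.net \
  --zones eu-west-1a,eu-west-1b,eu-west-1c \
  --state s3://k8s/kops \
  --dns-zone=redacted.net \
  --update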

Alternative

Helm does this the other way around: it adds an --install flag to the helm upgrade command (see the helm upgrade docs) instead of an --upgrade flag on the helm install command.
I think that makes the feature harder to discover, but I guess it also works.
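
For reference, the helm pattern (release and chart names here are placeholders):

# installs the release if it is absent, upgrades it otherwise
helm upgrade --install my-release ./my-chart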

All 41 comments

Note:
kops replace might have to be used instead of kops update.

update and replace are a bit confusing at the moment: https://github.com/kubernetes/kops/issues/2148
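
Roughly, the distinction is that replace overwrites the stored spec while update reconciles cloud resources with it (a simplified summary, not official documentation):

kops replace -f cluster.yaml                  # overwrite the cluster spec in the state store
kops update cluster ${CLUSTER_NAME} --yes     # reconcile cloud resources with the stored spec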

Workaround

# check if the cluster exists; create it if not, replace it otherwise
if ! kops get cluster --name "${CLUSTER_NAME}" > /dev/null 2>&1; then
  kops create -f cluster.yaml --name "${CLUSTER_NAME}" -v 10
else
  kops replace -f cluster.yaml --name "${CLUSTER_NAME}" -v 10
fi

All of this code could be avoided with an --update flag.

I would also like an idempotent create-or-replace ... whatever the name ends up being.

Right now, to achieve idempotency in your automation, you have basically two choices, both of them bad:

  1. Implement creation via kops create cluster and modification via yaml manipulation - double the effort
  2. Implement both creation and modification via yaml manipulation, populating static yaml templates with values - all well and good until the config format changes (keys added/removed/renamed), causing anything from subtle, unexpected changes to complete breakage

Both options share the downside that, even if the format (keys and structure) remains the same, some values that kops would produce will change over time (e.g. the images in instance groups). So if you use kops only once, to create the cluster, and afterwards only modify the configs, you'll never pick up those updated values.
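
To make option 2 concrete, a minimal sketch using envsubst (the template file and variable names are hypothetical; the fragility described above lives in the template itself):

# populate a hypothetical cluster.yaml.tmpl with environment values
export CLUSTER_NAME=k8s.prod.eu-west-1.aws.redacted.net
export KUBERNETES_VERSION=1.8.2
envsubst < cluster.yaml.tmpl > cluster.yaml
# fragile: silently drifts if the spec's keys or structure change between kops versions
kops replace -f cluster.yaml --force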

There are projects where easy automation is not a big deal and at most falls into the "nice to have" category. Kubernetes is not one of them.

To support idempotency, kops could provide a patch command like kubectl's, e.g.:

kops patch cluster foo.example.com \
  -p '{"spec": {"kubernetesVersion": "1.8.2"}}'
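
kops has no patch subcommand; the suggestion borrows the semantics of kubectl patch, which does exist:

# the kubectl analogue of the proposed command (a strategic merge patch)
kubectl patch deployment nginx -p '{"spec": {"replicas": 3}}'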

The intent of this issue (idempotent actions) can be achieved with kops replace. We use this in our CI system:

kops replace --force -f cluster.yml
kops update cluster ${CLUSTER_NAME}

No bash if statements needed.
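
One detail worth noting for pipelines: kops update cluster is a dry run by default, so applying the changes requires --yes:

kops update cluster ${CLUSTER_NAME} --yes   # --yes actually applies the changes; omit it for a preview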

Can we close this issue?

@rifelpet
Does replace --force also work if the cluster does not exist?
If it does now, it didn't a year ago.

@kenden yes, it was fixed in #4275, which was released in 1.9.0 back in April. We've been using it in our CI system ever since without any issues.

@rifelpet, thanks for checking. Great! Closing this ticket then.
Fixed in 1.9.0

@rifelpet you may have skipped a few steps there ;]

What @canhnt suggested here (https://github.com/kubernetes/kops/issues/2428#issuecomment-343117114) takes exactly one line to change something.

What you're proposing requires the user to:

  • Save the cluster config to a file
  • Modify the file (not trivial in all languages, or at least requires additional tools/libraries)
  • Use kops replace (what side effects does --force have?)
  • Use kops update
  • Clean up the file

kops patch (or something similar) would require only kops and sh, and a single line to make a change; so regardless of the answer to @kenden's question, IMO this should not be closed.
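
For comparison, the multi-step workflow above sketched in shell (yq is an assumed external YAML tool; file and cluster names are illustrative):

kops get cluster foo.example.com -o yaml > cluster.yaml        # 1. save the cluster config
yq eval '.spec.kubernetesVersion = "1.8.2"' -i cluster.yaml    # 2. modify it (needs an extra tool)
kops replace -f cluster.yaml --force                           # 3. replace the stored spec
kops update cluster foo.example.com --yes                      # 4. apply the changes
rm cluster.yaml                                                # 5. clean up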

Also, the kops replace workflow is subject to race conditions.

Reopening until we know if it's solved!

I am a bit confused by this issue. What is currently the proper way to run kops in a CI/CD pipeline?

This issue seems to have moved a bit away from the original problem.
If I understand things correctly, it is now about being able to run kops in CI? If so, does the file state store solve this?

The issue hasn't moved at all and certainly not to a different subject.
The problem remains as it was in April 2017.

I don't think adding the flag will ever happen, as we try to nudge people as much as we can towards using spec files and replace. create doesn't, and isn't meant to, support all the options one normally wants for production clusters.

Can you _create_ a cluster with replace?

Yes, kops replace has a --force flag that will create resources if they don't already exist.

      --force              Force any changes, which will also create any non-existing resource

That's all I personally was looking for way back when: just a single idempotent create-or-update command. I couldn't care less whether it was called create, replace, or something else.
