Kubeadm: solve the kubeadm offline and air-gapped support issues

Created on 7 Aug 2018 · 13 Comments · Source: kubernetes/kubeadm

we have numerous reports from users in offline scenarios where kubeadm tries to fetch the latest version from the internet:
https://github.com/kubernetes/kubeadm/issues?utf8=%E2%9C%93&q=is%3Aissue+is%3Aopen+internet+
https://github.com/kubernetes/kubeadm/issues?utf8=%E2%9C%93&q=is%3Aissue+is%3Aopen+offline

the known workaround is to pass an explicit semantic version via --kubernetes-version if the subcommand supports it, or to pass a config file with kubernetesVersion defined.

under "offline support" there are a couple of use cases:

  • kubeadm commands that don't need a network interface to work at all, like kubeadm config migrate and kubeadm token generate
  • kubeadm commands that can still bootstrap an air-gapped cluster (init, join)

our docs already explain how to pull images for air-gapped scenarios, but they need a small update once this multi-arch image PR is merged:
https://github.com/kubernetes/kubernetes/pull/66960


area/UX kind/bug kind/tracking-issue lifecycle/active priority/critical-urgent

All 13 comments

@xiangpengzhao @timothysc

continuing the offline-support discussion from https://github.com/kubernetes/kubernetes/pull/62721
and experimenting with kubeadm offline. k8s master is at 3b4eb0a4a0.

for commands such as:

kubeadm init
kubeadm config print-default
kubeadm config images list
kubeadm alpha phase kubeconfig [any-sub-cmd]

this is returned:

*: No default routes.

the error seems to come from:
https://github.com/kubernetes/kubernetes/blob/master/staging/src/k8s.io/apimachinery/pkg/util/net/interface.go#L350

investigating why api machinery needs internet.

kubeadm init
kubeadm config print-default
kubeadm config images list

@neolit123 I intentionally didn't set a mock version for the above three commands in https://github.com/kubernetes/kubernetes/pull/62721. So while these commands run, ConfigFileAndDefaultsToInternalConfig is invoked, and that func needs an internet connection. Before connecting, it invokes ChooseBindAddress, which finally errors out with *: No default routes.

For kubeadm alpha phase kubeconfig [any-sub-cmd], IIRC, I set a mock version for its sub-commands, so it shouldn't try to connect to the internet. Not sure why it still tries to connect...

i see where the problem with that is.
will post an update later.

i'm taking a look at this now, will update with a pr or comments about how to close this

I think we've been conflating offline and air-gapped. We want to support air-gapped environments, while truly offline environments are not something we need to spend too much time on. In a truly offline environment very few commands will actually work.

@kad @chuckha @xiangpengzhao et al.

i did send that PR:
https://github.com/kubernetes/kubernetes/pull/67397

for some commands i would argue that offline support is needed.
but please, have a look and let's discuss.

edit: also updated the air-gapped vs offline notes in the first post.

Looks like you covered everything needed to close this issue. Thanks & nice work!

Discussed in office hours: for air-gapped we need to change the default k8s version to a real version that doesn't need internet access. Users can still use "stable-1.11" as the version, but it will require internet access.

Sorry, missed discussion during today's office hours.

IMHO, what we should do: keep "stable-1.x" as the default; the function that fetches the version from the download server should have a reasonable timeout (a few seconds); and if it doesn't return within the timeout or returns a failure (except maybe HTTP 404), it should fall back to the version of kubeadm itself.
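A minimal Go sketch of the timeout-plus-fallback idea above, not kubeadm's actual code. The function name fetchVersionOrFallback, its parameters, and the choice to report HTTP 404 as an error instead of falling back are assumptions made for illustration; the download URL is supplied by the caller.

```go
package main

import (
	"context"
	"fmt"
	"io"
	"net/http"
	"strings"
	"time"
)

// fetchVersionOrFallback is a hypothetical helper. It asks the download
// server at url for the version a label such as "stable-1.11" resolves to.
// If the request cannot be built, fails, times out, or returns an unexpected
// status or an empty body, it returns fallback (for example, the version of
// the kubeadm binary itself). HTTP 404 is treated as a real error, on the
// assumption that it means the requested label does not exist.
func fetchVersionOrFallback(url, fallback string, timeout time.Duration) (string, error) {
	ctx, cancel := context.WithTimeout(context.Background(), timeout)
	defer cancel()

	req, err := http.NewRequestWithContext(ctx, http.MethodGet, url, nil)
	if err != nil {
		return fallback, nil
	}
	resp, err := http.DefaultClient.Do(req)
	if err != nil {
		// No route, DNS failure, connection refused, or timeout.
		return fallback, nil
	}
	defer resp.Body.Close()

	if resp.StatusCode == http.StatusNotFound {
		return "", fmt.Errorf("version label not found at %s", url)
	}
	if resp.StatusCode != http.StatusOK {
		return fallback, nil
	}
	// A version string is short, so cap how much of the body is read.
	body, err := io.ReadAll(io.LimitReader(resp.Body, 256))
	if err != nil {
		return fallback, nil
	}
	if v := strings.TrimSpace(string(body)); v != "" {
		return v, nil
	}
	return fallback, nil
}
```

With this shape, an air-gapped host pays at most the timeout before it continues with the fallback version, while a mistyped label still surfaces as an error.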

For the fallback version, there are two variants:

  1. the exact kubeadm version (including alpha/beta/ci build IDs)
  2. the latest stable version that is lower than the kubeadm version.
    Examples: kubeadm v1.11.3-beta.0 should set v1.11.2. For 1.x.0-* versions it most probably makes sense to keep the version as is, like v1.11.0-rc.3, maybe with special handling for CI builds of kubeadm: v1.12.0-alpha.1.428+3956bba38418e9 should map to v1.12.0-alpha.1.
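Variant 2 could be sketched in Go roughly as follows. This is an illustration under assumptions, not the kubeadm implementation: fallbackVersion and versionRE are hypothetical names, and returning an already released version (such as v1.11.2) unchanged is an assumption the comment above does not spell out.

```go
package main

import (
	"fmt"
	"regexp"
	"strconv"
)

// versionRE captures MAJOR, MINOR, PATCH and an optional pre-release
// "-<name>.<n>" part. It is not anchored at the end, so CI build suffixes
// such as ".428+3956bba38418e9" are ignored.
var versionRE = regexp.MustCompile(`^v(\d+)\.(\d+)\.(\d+)(?:-([a-z]+)\.(\d+))?`)

// fallbackVersion is a hypothetical helper that maps the version of the
// kubeadm binary to the version it should fall back to.
func fallbackVersion(kubeadmVersion string) (string, error) {
	m := versionRE.FindStringSubmatch(kubeadmVersion)
	if m == nil {
		return "", fmt.Errorf("unrecognized version %q", kubeadmVersion)
	}
	major, minor := m[1], m[2]
	patch, err := strconv.Atoi(m[3])
	if err != nil {
		return "", err
	}
	pre, preNum := m[4], m[5]

	switch {
	case pre == "":
		// Assumption: a released version falls back to itself.
		return fmt.Sprintf("v%s.%s.%d", major, minor, patch), nil
	case patch > 0:
		// A pre-release of patch N falls back to the previous stable patch.
		return fmt.Sprintf("v%s.%s.%d", major, minor, patch-1), nil
	default:
		// 1.x.0-* has no earlier stable patch in the same minor, so keep the
		// pre-release and drop only the CI build suffix.
		return fmt.Sprintf("v%s.%s.0-%s.%s", major, minor, pre, preNum), nil
	}
}
```

Applied to the examples above, v1.11.3-beta.0 maps to v1.11.2, v1.11.0-rc.3 stays as it is, and v1.12.0-alpha.1.428+3956bba38418e9 maps to v1.12.0-alpha.1.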

i like the timeout idea. this can work.

the version parsing is going to be tricky and we need to settle on something here.
option 2 makes more sense to me, because this will map to GCR images too.
https://console.cloud.google.com/gcr/images/google-containers/GLOBAL/kube-apiserver?gcrImageListsize=50&gcrImageListsort=-uploaded

we should probably settle on:
vMAJOR.MINOR.PATCH-yyy.x
where yyy is alpha, beta, rc, etc.
but without the extra .428+3956bba38418e9 after that.

^ i can send a patch for that later today or tomorrow so that everyone can provide feedback and test this.
