kubeadm 1.9.2 doesn't work over proxy

Created on 31 Jan 2018 · 18 comments · Source: kubernetes/kubeadm

Versions

kubeadm version (use kubeadm version): &version.Info{Major:"1", Minor:"9", GitVersion:"v1.9.2", GitCommit:"5fa2db2bd46ac79e5e00a4e6ed24191080aa463b", GitTreeState:"clean", BuildDate:"2018-01-18T09:42:01Z", GoVersion:"go1.9.2", Compiler:"gc", Platform:"linux/amd64"}

Environment:

  • Kubernetes version (use kubectl version):
  • Cloud provider or hardware configuration: VMware / Proxmox
  • OS (e.g. from /etc/os-release): Debian GNU/Linux 9 (stretch)
  • Kernel (e.g. uname -a): 4.9.65-3+deb9u2
  • Others:

What happened?

I tried to execute `kubeadm init --pod-network-cidr=192.168.0.0/16` and it gets stuck at:

[init] This might take a minute or longer if the control plane images have to be pulled.

What you expected to happen?

kubeadm runs fine and I get a working cluster node.

Anything else we need to know?

The problem is that the first time I created a cluster, I did it on my VMware Player with NAT and full access to the internet. In the second try, I created VMs (two for the master on Proxmox VE (KVM) and two nodes on VMware vSphere). The network is restricted, with no direct internet connection. So I added the following to /etc/profile:

export http_proxy="http://192.168.42.214:3128"
export https_proxy="http://192.168.42.214:3128"
export no_proxy="localhost,127.0.0.1,localaddress,.localdomain.com,.example.local,192.168.0.0/16,10.96.0.0/12,172.25.50.21,172.25.50.22,172.25.50.23,172.25.50.24"
export HTTP_PROXY="http://192.168.42.214:3128"
export HTTPS_PROXY="http://192.168.42.214:3128"
export NO_PROXY="localhost,127.0.0.1,localaddress,.localdomain.com,.example.local,192.168.0.0/16,10.96.0.0/12,172.25.50.21,172.25.50.22,172.25.50.23,172.25.50.24"

In the firewall log I can see that there is still traffic to 173.194.76.82 (gcr.io) via HTTPS. That is bad. Also, kubeadm hangs forever. So I added the host to the whitelist on the firewall (NAT), and then I got:

...
[init] This might take a minute or longer if the control plane images have to be pulled.
[apiclient] All control plane components are healthy after 75.501467 seconds
[uploadconfig] Storing the configuration used in ConfigMap "kubeadm-config" in the "kube-system" Namespace
[markmaster] Will mark node ina-test-kubm-01 as master by adding a label and a taint
[markmaster] Master ina-test-kubm-01 tainted and labelled with key/value: node-role.kubernetes.io/master=""
...

Now I can go forward with the network part.
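For completeness, "the network part" means installing a pod network add-on after kubeadm init. A minimal sketch of that step, assuming a CNI plugin such as Calico (a plausible choice given the 192.168.0.0/16 pod CIDR; the manifest path below is a placeholder, not taken from this thread):

```
# As root, point kubectl at the admin kubeconfig written by kubeadm
export KUBECONFIG=/etc/kubernetes/admin.conf

# Deploy a pod network add-on; replace the placeholder with the manifest
# of the CNI plugin you chose (e.g. Calico for the 192.168.0.0/16 CIDR)
kubectl apply -f <path-or-url-to-cni-manifest.yaml>
```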

Most helpful comment

SOLVED - I had a cgroup driver mismatch between docker and kubelet. Rectified it and init completed successfully.

All 18 comments

Are you running kubeadm as sudo kubeadm .... ? If so, verify your sudoers settings for options about resetting environment variables. In many distros, proxy settings are not kept across sudo.
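If sudo is in play, one common way to keep the proxy variables is an env_keep entry in sudoers (a sketch, not from the original thread); alternatively, sudo -E preserves the caller's environment for a single run:

```
# Added via `visudo`: preserve proxy variables when commands run through sudo
Defaults env_keep += "http_proxy https_proxy no_proxy HTTP_PROXY HTTPS_PROXY NO_PROXY"
```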

hi,

no, executed directly with root permissions.

Then please verify that your session really has the environment variables set, e.g. env | grep -i _proxy.
We are using kubeadm on a daily basis in an enterprise network behind proxies. I don't see why kubeadm would not use the proxy unless the environment is not set properly.

hi,

```

env | grep -i _proxy

HTTP_PROXY=http://192.168.42.214:3128
https_proxy=http://192.168.42.214:3128
http_proxy=http://192.168.42.214:3128
no_proxy=localhost,127.0.0.1,localaddress,.localdomain.com,.localdomain.local,192.168.0.0/16,10.96.0.0/12,172.25.50.21,172.25.50.22,172.25.50.23,172.25.50.24
NO_PROXY=localhost,127.0.0.1,localaddress,.localdomain.com,.localdomain.local,192.168.0.0/16,10.96.0.0/12,172.25.50.21,172.25.50.22,172.25.50.23,172.25.50.24
HTTPS_PROXY=http://192.168.42.214:3128
```
I had the same problem on the worker nodes too. So I assume that one or more processes drop the env vars, or do not use them.

What I can imagine is that the process (dash/sh) doesn't read /etc/profile ...
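Worth noting here: variables exported in /etc/profile only reach login shells, while systemd-managed services such as docker and the kubelet take their environment from unit files and drop-ins. A quick way to check what a service actually sees (generic systemctl usage, not output from this setup):

```
# Show the environment systemd passes to the docker and kubelet services;
# proxy variables from /etc/profile will not appear here unless they are
# configured in a unit drop-in
systemctl show docker --property=Environment
systemctl show kubelet --property=Environment
```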

The network calls in kubeadm do not spawn any subprocesses and thus should be using those proxies. The control plane components also get the proxy settings propagated.
The only component that I can imagine during setup (and actually the only one which should connect to the gcr.io IP) is the docker daemon. It does not use /etc/profile and requires proxy configuration in a systemd drop-in file.

Can you check what docker info shows in your setup, please?
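For reference, a minimal sketch of such a systemd drop-in, reusing the proxy address and the CIDR-free exceptions from this thread as example values:

```
# /etc/systemd/system/docker.service.d/http-proxy.conf
[Service]
Environment="HTTP_PROXY=http://192.168.42.214:3128"
Environment="HTTPS_PROXY=http://192.168.42.214:3128"
Environment="NO_PROXY=localhost,127.0.0.1,.example.local,172.25.50.21,172.25.50.22,172.25.50.23,172.25.50.24"
```

After creating the file, run systemctl daemon-reload && systemctl restart docker and confirm with docker info that the Http Proxy / No Proxy lines show the expected values.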

I got the same issue.
I've updated /etc/sysconfig/docker to add the proxy,
and docker info now shows:

Containers: 0
 Running: 0
 Paused: 0
 Stopped: 0
Images: 1
Server Version: 1.13.1
Storage Driver: overlay2
 Backing Filesystem: xfs
 Supports d_type: true
 Native Overlay Diff: true
Logging Driver: journald
Cgroup Driver: systemd
Plugins:
 Volume: local
 Network: bridge host macvlan null overlay
 Authorization: rhel-push-plugin
Swarm: inactive
Runtimes: runc oci
Default Runtime: oci
Init Binary: /usr/libexec/docker/docker-init-current
containerd version:  (expected: aa8187dbd3b7ad67d8e5e3a15115d3eef43a7ed1)
runc version: N/A (expected: 9df8b306d01f59d3a8029be411de015b7304dd8f)
init version: N/A (expected: 949e6facb77383876aeff8a6944dde66b3089574)
Security Options:
 seccomp
  WARNING: You're not using the default seccomp profile
  Profile: /etc/docker/seccomp.json
 selinux
Kernel Version: 4.14.16-200.fc26.x86_64
Operating System: Fedora 26 (Twenty Six)
OSType: linux
Architecture: x86_64
Number of Docker Hooks: 3
CPUs: 1
Total Memory: 3.877 GiB
Name: hostnamexx
ID: 2F4B:TXXA:YXXW:NFBD:LXXR:WZNE:QFAC:Y4Y5:NO37:U5DS:I6XT:XXXX
Docker Root Dir: /var/lib/docker
Debug Mode (client): false
Debug Mode (server): false
Http Proxy: http://user:[email protected]:8887
No Proxy: localnet.net,localhost,127.0.0.1,192.168.0.0/16
Registry: https://index.docker.io/v1/
Experimental: false
Insecure Registries:
 127.0.0.0/8
Live Restore Enabled: false
Registries: docker.io (secure), registry.fedoraproject.org (secure), registry.access.redhat.com (secure), docker.io (secure)
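For readers on RPM-based distributions like the Fedora box above, the proxy entries in /etc/sysconfig/docker are typically plain environment assignments along these lines (a sketch with placeholder values; the file is read by the distribution's docker.service as an environment file):

```
# /etc/sysconfig/docker -- restart docker after editing
HTTP_PROXY=http://<proxy-host>:<port>
HTTPS_PROXY=http://<proxy-host>:<port>
NO_PROXY=localhost,127.0.0.1,<node-ips-and-local-domains>
```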

@jamalsia so, which error are you getting now?

[init] Using Kubernetes version: v1.9.3
[init] Using Authorization modes: [Node RBAC]
[preflight] Running pre-flight checks.
        [WARNING Hostname]: hostname "hostnameXX" could not be reached
        [WARNING Hostname]: hostname "hostnameXX" lookup hostnameXX on 10.0.10.150:53: server misbehaving
        [WARNING FileExisting-tc]: tc not found in system path
        [WARNING FileExisting-crictl]: crictl not found in system path
        [WARNING HTTPProxy]: Connection to "https://10.44.102.144:6443" uses proxy "http://user:[email protected]:1234". If that is not intended, adjust your proxy settings
        [WARNING HTTPProxyCIDR]: connection to "10.96.0.0/12" uses proxy "http://user:[email protected]:1234". This may lead to malfunctional cluster setup. Make sure that Pod and Services IP ranges specified correctly as exceptions in proxy configuration
        [WARNING HTTPProxyCIDR]: connection to "192.168.0.0/16" uses proxy "http://user:[email protected]:1234". This may lead to malfunctional cluster setup. Make sure that Pod and Services IP ranges specified correctly as exceptions in proxy configuration
[preflight] Some fatal errors occurred:
        [ERROR Swap]: running with swap on is not supported. Please disable swap
[preflight] If you know what you are doing, you can make a check non-fatal with `--ignore-preflight-errors=...`

But ping is working, which makes me believe that it is not related to the host's no_proxy configuration. I am running a Hyper-V virtual machine behind a proxy:

ping hostnameXXX
PING hostnameXXX(hostnameXXX (fe80::215:5dff:fe68:3300%eth0)) 56 data bytes
64 bytes from hostnameXXX (fe80::215:5dff:fe68:3300%eth0): icmp_seq=1 ttl=64 time=0.104 ms
64 bytes from hostnameXXX (fe80::215:5dff:fe68:3300%eth0): icmp_seq=2 ttl=64 time=0.076 ms
64 bytes from hostnameXXX (fe80::215:5dff:fe68:3300%eth0): icmp_seq=3 ttl=64 time=0.030 ms
64 bytes from hostnameXXX (fe80::215:5dff:fe68:3300%eth0): icmp_seq=4 ttl=64 time=0.074 ms
64 bytes from hostnameXXX (fe80::215:5dff:fe68:3300%eth0): icmp_seq=5 ttl=64 time=0.030 ms

There are 5 NICs. eth0 is the main one; the others are the virtual machines' network cards.

1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default qlen 1000
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
    inet 127.0.0.1/8 scope host lo
       valid_lft forever preferred_lft forever
    inet6 ::1/128 scope host
       valid_lft forever preferred_lft forever
2: eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq state UP group default qlen 1000
    link/ether 00:15:5d:68:33:00 brd ff:ff:ff:ff:ff:ff
    inet 10.44.102.144/24 brd 10.44.102.255 scope global dynamic eth0
       valid_lft 13928sec preferred_lft 13928sec
    inet6 fe80::215:5dff:fe68:3300/64 scope link
       valid_lft forever preferred_lft forever
3: virbr0: <NO-CARRIER,BROADCAST,MULTICAST,UP> mtu 1500 qdisc noqueue state DOWN group default qlen 1000
    link/ether 52:54:00:16:cb:b1 brd ff:ff:ff:ff:ff:ff
    inet 192.168.122.1/24 brd 192.168.122.255 scope global virbr0
       valid_lft forever preferred_lft forever
4: virbr0-nic: <BROADCAST,MULTICAST> mtu 1500 qdisc fq_codel master virbr0 state DOWN group default qlen 1000
    link/ether 52:54:00:16:cb:b1 brd ff:ff:ff:ff:ff:ff
5: docker0: <NO-CARRIER,BROADCAST,MULTICAST,UP> mtu 1500 qdisc noqueue state DOWN group default
    link/ether 02:42:36:1c:6f:f2 brd ff:ff:ff:ff:ff:ff
    inet 172.17.0.1/16 scope global docker0
       valid_lft forever preferred_lft forever

@kad

docker info
Containers: 44
 Running: 23
 Paused: 0
 Stopped: 21
Images: 14
Server Version: 17.12.0-ce
Storage Driver: overlay2
 Backing Filesystem: extfs
 Supports d_type: true
 Native Overlay Diff: true
Logging Driver: json-file
Cgroup Driver: cgroupfs
Plugins:
 Volume: local
 Network: bridge host macvlan null overlay
 Log: awslogs fluentd gcplogs gelf journald json-file logentries splunk syslog
Swarm: inactive
Runtimes: runc
Default Runtime: runc
Init Binary: docker-init
containerd version: 89623f28b87a6004d4b785663257362d1658a729
runc version: b2567b37d7b75eb4cf325b77297b140ea686ce8f
init version: 949e6fa
Security Options:
 seccomp
  Profile: default
Kernel Version: 4.9.0-5-amd64
Operating System: Debian GNU/Linux 9 (stretch)
OSType: linux
Architecture: x86_64
CPUs: 2
Total Memory: 996.3MiB
Name: ina-test-kubm-01
ID: E5US:GR7K:2SZA:OS6E:JSUN:MU7X:X4KS:VVXJ:ZLBH:HFZA:SFRC:VG23
Docker Root Dir: /var/lib/docker
Debug Mode (client): false
Debug Mode (server): false
Registry: https://index.docker.io/v1/
Labels:
Experimental: false
Insecure Registries:
 127.0.0.0/8
Live Restore Enabled: false

WARNING: No swap limit support

As a hint: I removed the proxy vars and added the hosts to a whitelist to move forward. But if I remember right, there were some lines about the proxy under /etc/kubernetes/ ....

cu denny

Greetings,

You may refer to https://docs.docker.com/config/daemon/systemd/#httphttps-proxy to set the proxy for the Docker daemon.

kubeadm init still fails, even after setting the docker proxy config (link in the above post).
It also doesn't help that using wildcards in the no_proxy env variable doesn't work like it's supposed to on Linux.

@cneginha kubeadm and all other Kubernetes code support CIDR notation in NO_PROXY.
Set NO_PROXY to `127.0.0.1,10.0.0.0/8,172.16.0.0/12,192.168.0.0/16,example.com` (replace example.com with your domain).

@jamalsia you have multiple things that you need to solve in your setup (see the sketch after this list):

  1. swap
  2. DNS and hostnames
  3. set NO_PROXY correctly
  4. check the situation with docker; the proxy settings might need to be adjusted there as well.
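A rough sketch of how those points are commonly addressed (generic commands, with the hostname and node IP from this thread reused as placeholders):

```
# 1. swap: kubeadm/kubelet refuse to run with swap enabled
swapoff -a                           # disable swap for the current boot
sed -i '/ swap / s/^/#/' /etc/fstab  # comment out swap entries so it stays off

# 2. DNS and hostnames: make sure the node's hostname resolves, e.g. via /etc/hosts
echo "10.44.102.144 hostnameXX" >> /etc/hosts

# 3. NO_PROXY: include the node IPs plus the service and pod CIDRs
export NO_PROXY=127.0.0.1,10.44.102.144,10.96.0.0/12,192.168.0.0/16
export no_proxy=$NO_PROXY

# 4. docker: set the proxy in its systemd drop-in or /etc/sysconfig/docker,
#    then reload and restart
systemctl daemon-reload && systemctl restart docker
```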

@kad after setting the NO_PROXY env variable explicitly with the IP addresses of the nodes involved, I no longer get the proxy warning. However, kubeadm init is still failing with:

====
kubeadm init
[init] Using Kubernetes version: v1.9.3
[init] Using Authorization modes: [Node RBAC]
[preflight] Running pre-flight checks.
[WARNING FileExisting-crictl]: crictl not found in system path
[preflight] Starting the kubelet service
[certificates] Generated ca certificate and key.
[certificates] Generated apiserver certificate and key.
[certificates] apiserver serving cert is signed for DNS names [kubeflow-1 kubernetes kubernetes.default kubernetes.default.svc kubernetes.default.svc.cluster.local] and IPs [10.96.0.1 10.10.10.4]
[certificates] Generated apiserver-kubelet-client certificate and key.
[certificates] Generated sa key and public key.
[certificates] Generated front-proxy-ca certificate and key.
[certificates] Generated front-proxy-client certificate and key.
[certificates] Valid certificates and keys now exist in "/etc/kubernetes/pki"
[kubeconfig] Wrote KubeConfig file to disk: "admin.conf"
[kubeconfig] Wrote KubeConfig file to disk: "kubelet.conf"
[kubeconfig] Wrote KubeConfig file to disk: "controller-manager.conf"
[kubeconfig] Wrote KubeConfig file to disk: "scheduler.conf"
[controlplane] Wrote Static Pod manifest for component kube-apiserver to "/etc/kubernetes/manifests/kube-apiserver.yaml"
[controlplane] Wrote Static Pod manifest for component kube-controller-manager to "/etc/kubernetes/manifests/kube-controller-manager.yaml"
[controlplane] Wrote Static Pod manifest for component kube-scheduler to "/etc/kubernetes/manifests/kube-scheduler.yaml"
[etcd] Wrote Static Pod manifest for a local etcd instance to "/etc/kubernetes/manifests/etcd.yaml"
[init] Waiting for the kubelet to boot up the control plane as Static Pods from directory "/etc/kubernetes/manifests".
[init] This might take a minute or longer if the control plane images have to be pulled.
[kubelet-check] It seems like the kubelet isn't running or healthy.
[kubelet-check] It seems like the kubelet isn't running or healthy.
[kubelet-check] It seems like the kubelet isn't running or healthy.
[kubelet-check] It seems like the kubelet isn't running or healthy.
[kubelet-check] It seems like the kubelet isn't running or healthy.
[kubelet-check] It seems like the kubelet isn't running or healthy.
[kubelet-check] It seems like the kubelet isn't running or healthy.
[kubelet-check] It seems like the kubelet isn't running or healthy.
[kubelet-check] It seems like the kubelet isn't running or healthy.
Unfortunately, an error has occurred:
timed out waiting for the condition

This error is likely caused by:
- The kubelet is not running
- The kubelet is unhealthy due to a misconfiguration of the node in some way (required cgroups disabled)
- There is no internet connection, so the kubelet cannot pull the following control plane images:
- gcr.io/google_containers/kube-apiserver-amd64:v1.9.3
- gcr.io/google_containers/kube-controller-manager-amd64:v1.9.3
- gcr.io/google_containers/kube-scheduler-amd64:v1.9.3

If you are on a systemd-powered system, you can try to troubleshoot the error with the following commands:
- 'systemctl status kubelet'
- 'journalctl -xeu kubelet'
couldn't initialize a Kubernetes cluster

========

docker pull of the gcr.io images works okay, though.

SOLVED - I had a cgroup driver mismatch between docker and kubelet. Rectified it and init completed successfully.

@cneginha
Can you explain it to me?

closing.

@4qv907rtet5r the cgroup drivers of docker and the kubelet were different.

You can find out which drivers are in use with:

docker info | grep -i cgroup
cat /etc/systemd/system/kubelet.service.d/10-kubeadm.conf

If they are different, edit the 10-kubeadm.conf :)

Info taken from the k8s docs: https://kubernetes.io/docs/tasks/tools/install-kubeadm/
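As an illustration of that edit (a sketch; the exact environment variable layout in 10-kubeadm.conf can differ between package versions):

```
# Make the kubelet's cgroup driver match the one reported by `docker info`,
# e.g. in /etc/systemd/system/kubelet.service.d/10-kubeadm.conf:
#   Environment="KUBELET_CGROUP_ARGS=--cgroup-driver=systemd"
# (use --cgroup-driver=cgroupfs instead if that is what docker reports)

# Apply the change
systemctl daemon-reload
systemctl restart kubelet
```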
