I followed the instructions in https://cluster-api.sigs.k8s.io/user/quick-start.html with the Docker infrastructure provider, from behind an HTTP proxy.
As expected, the workload cluster's control plane is Initialized but not yet Ready:
kubectl get kubeadmcontrolplane --all-namespaces
NAMESPACE NAME INITIALIZED API SERVER AVAILABLE VERSION REPLICAS READY UPDATED UNAVAILABLE
default capi-quickstart-control-plane true v1.19.1 1 1 1
But when I try to install the Calico CNI, the calico-node pods do not start; they are stuck pulling their images:
kubectl --kubeconfig=./capi-quickstart.kubeconfig -n kube-system get pods -l k8s-app=calico-node
NAME READY STATUS RESTARTS AGE
calico-node-4sbzg 0/1 Init:ImagePullBackOff 0 2m7s
calico-node-64qvp 0/1 Init:ImagePullBackOff 0 2m7s
calico-node-t24q4 0/1 Init:ImagePullBackOff 0 2m7s
calico-node-t84cx 0/1 Init:ImagePullBackOff 0 2m7s
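For reference, I installed Calico with the command from the quick start; the manifest URL below is indicative and may not be the exact one in the guide:
kubectl --kubeconfig=./capi-quickstart.kubeconfig apply -f https://docs.projectcalico.org/v3.15/manifests/calico.yaml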
For each of these pods, I see errors when trying to pull the calico/cni:v3.15.3 Docker image:
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Normal Scheduled 3m26s default-scheduler Successfully assigned kube-system/calico-node-64qvp to capi-quickstart-md-0-55fc4f8ccf-ncmkv
Normal Pulling 115s (x4 over 3m26s) kubelet, capi-quickstart-md-0-55fc4f8ccf-ncmkv Pulling image "calico/cni:v3.15.3"
Warning Failed 115s (x4 over 3m26s) kubelet, capi-quickstart-md-0-55fc4f8ccf-ncmkv Failed to pull image "calico/cni:v3.15.3": rpc error: code = Unknown desc = failed to pull and unpack image "docker.io/calico/cni:v3.15.3": failed to resolve reference "docker.io/calico/cni:v3.15.3": failed to do request: Head https://registry-1.docker.io/v2/calico/cni/manifests/v3.15.3: dial tcp: lookup registry-1.docker.io on 10.171.108.2:53: no such host
Warning Failed 115s (x4 over 3m26s) kubelet, capi-quickstart-md-0-55fc4f8ccf-ncmkv Error: ErrImagePull
Normal BackOff 103s (x6 over 3m26s) kubelet, capi-quickstart-md-0-55fc4f8ccf-ncmkv Back-off pulling image "calico/cni:v3.15.3"
Warning Failed 92s (x7 over 3m26s) kubelet, capi-quickstart-md-0-55fc4f8ccf-ncmkv Error: ImagePullBackOff
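The failing step is a plain DNS lookup of registry-1.docker.io, which in this environment only works through the proxy. A quick way to confirm that a node has no proxy configuration is to check its environment from the host (the container name is taken from the events above); if no proxy variables are set, this prints nothing, which is consistent with the docker inspect output further down:
docker exec capi-quickstart-md-0-55fc4f8ccf-ncmkv env | grep -i proxy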
What did you expect to happen:
I expected to be able to use the workload cluster.
Environment:
Cluster-api version:
clusterctl version: &version.Info{Major:"0", Minor:"3", GitVersion:"v0.3.12", GitCommit:"9e1dd7e8e428e05bee406602952ae269d55bdbba", GitTreeState:"clean", BuildDate:"2020-12-15T16:42:14Z", GoVersion:"go1.13.15", Compiler:"gc", Platform:"linux/amd64"}
Minikube/KIND version:
kind v0.9.0 go1.15.2 linux/amd64
Kubernetes version: (use kubectl version):
1.19.1
OS (e.g. from /etc/os-release):
NAME="Ubuntu"
VERSION="18.04.3 LTS (Bionic Beaver)"
Workaround
From the host, inspecting the Docker containers, I saw that the HTTP proxy environment variables are not set for the workload cluster nodes:
docker inspect capi-quickstart-control-plane-d2mvr --format '{{json .Config.Env }}' | jq .
[
"PATH=/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin",
"container=docker"
]
As a workaround, I rebuilt the kindest/node:v1.19.1 image, simply adding the proxy environment variables to it:
cat > Dockerfile << EOF
FROM kindest/node:v1.19.1
ENV HTTP_PROXY ${http_proxy}
ENV HTTPS_PROXY ${https_proxy}
ENV NO_PROXY ${NO_PROXY}
EOF
docker build -t kindest/node:v1.19.1-custom .
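To double-check that the variables were actually baked into the image, the same kind of inspection used above can be pointed at the custom image:
docker image inspect kindest/node:v1.19.1-custom --format '{{json .Config.Env }}' | jq .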
I then used this custom image for the workload cluster nodes:
sed -i "s| spec: {}| spec:\n customImage: kindest/node:v1.19.1-custom|" capi-quickstart.yaml
sed -i "s| extraMounts:| customImage: kindest/node:v1.19.1-custom\n extraMounts:|" capi-quickstart.yaml
With this workaround, the CNI is correctly deployed in the workload cluster:
kubectl --kubeconfig=./capi-quickstart.kubeconfig -n kube-system get pods -l k8s-app=calico-node
NAME READY STATUS RESTARTS AGE
calico-node-2d6k9 1/1 Running 0 102s
calico-node-bpvdc 1/1 Running 0 102s
calico-node-w25mc 1/1 Running 0 102s
calico-node-zc468 1/1 Running 0 102s
And the workload cluster becomes Ready:
kubectl get kubeadmcontrolplane --all-namespaces
NAMESPACE NAME INITIALIZED API SERVER AVAILABLE VERSION REPLICAS READY UPDATED UNAVAILABLE
default capi-quickstart-control-plane true true v1.19.1 1 1 1
Preferred solution
It would be nice if there was a way to automatically propagate the proxy environment variables to the workload cluster nodes.
/kind bug
/milestone v0.4.0
/area provider/docker
/priority backlog
/help
This could be addressed via documentation, or with an implementation in the Docker provider that makes it pick up the proxy env variables from the host, similarly to what kind does: https://kind.sigs.k8s.io/docs/user/quick-start/#configure-kind-to-use-a-proxy
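For context, kind reads the standard proxy variables from the environment in which the cluster is created and injects them into the node containers, roughly like this (the proxy address is a placeholder):
export HTTP_PROXY=http://proxy.example.com:3128
export HTTPS_PROXY=http://proxy.example.com:3128
export NO_PROXY=localhost,127.0.0.1
kind create cluster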
@fabriziopandini:
This request has been marked as needing help from a contributor.
Please ensure the request meets the requirements listed here.
If this request no longer meets these requirements, the label can be removed
by commenting with the /remove-help command.
In response to this:
/milestone v0.4.0
/area provider/docker
/priority backlog
/help
This could be addressed via documentation, or with an implementation in the Docker provider that makes it pick up the proxy env variables from the host, similarly to what kind does: https://kind.sigs.k8s.io/docs/user/quick-start/#configure-kind-to-use-a-proxy
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.
@tcordeu you are right, I was not aware that this is already implemented in CAPD.
So, @bgoareguer, we should go back and investigate why the env variables were not passed to the capi-quickstart-control-plane-d2mvr container.
In fact, it seems that the management cluster node does have the proxy env variables:
docker inspect kind-control-plane --format '{{json .Config.Env }}' | jq . | cut -d "=" -f 1
[
"https_proxy
"NO_PROXY
"no_proxy
"HTTP_PROXY
"http_proxy
"HTTPS_PROXY
"PATH
"container
]
but the capd-controller-manager deployment does not:
kubectl -n capd-system describe deployment.apps/capd-controller-manager
Name: capd-controller-manager
Namespace: capd-system
CreationTimestamp: Mon, 04 Jan 2021 14:25:57 +0100
Labels: cluster.x-k8s.io/provider=infrastructure-docker
clusterctl.cluster.x-k8s.io=
control-plane=controller-manager
Annotations: deployment.kubernetes.io/revision: 1
Selector: cluster.x-k8s.io/provider=infrastructure-docker,control-plane=controller-manager
Replicas: 1 desired | 1 updated | 1 total | 1 available | 0 unavailable
StrategyType: RollingUpdate
MinReadySeconds: 0
RollingUpdateStrategy: 25% max unavailable, 25% max surge
Pod Template:
Labels: cluster.x-k8s.io/provider=infrastructure-docker
control-plane=controller-manager
Containers:
kube-rbac-proxy:
Image: gcr.io/kubebuilder/kube-rbac-proxy:v0.4.0
Port: 8443/TCP
Host Port: 0/TCP
Args:
--secure-listen-address=0.0.0.0:8443
--upstream=http://127.0.0.1:8080/
--logtostderr=true
--v=10
Environment: <none>
Mounts: <none>
manager:
Image: gcr.io/k8s-staging-cluster-api/capd-manager:v0.3.12
Ports: 9443/TCP, 9440/TCP
Host Ports: 0/TCP, 0/TCP
Args:
--feature-gates=MachinePool=false
--metrics-addr=0
-v=4
Liveness: http-get http://:healthz/healthz delay=0s timeout=1s period=10s #success=1 #failure=3
Readiness: http-get http://:healthz/readyz delay=0s timeout=1s period=10s #success=1 #failure=3
Environment: <none>
Mounts:
/tmp/k8s-webhook-server/serving-certs from cert (ro)
/var/run/docker.sock from dockersock (rw)
Volumes:
cert:
Type: Secret (a volume populated by a Secret)
SecretName: capd-webhook-service-cert
Optional: false
dockersock:
Type: HostPath (bare host directory volume)
Path: /var/run/docker.sock
HostPathType:
Conditions:
Type Status Reason
---- ------ ------
Available True MinimumReplicasAvailable
Progressing True NewReplicaSetAvailable
OldReplicaSets: <none>
NewReplicaSet: capd-controller-manager-557796f4dd (1/1 replicas created)
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Normal ScalingReplicaSet 78s deployment-controller Scaled up replica set capd-controller-manager-557796f4dd to 1
So, to me, the problem is that the CAPD manifest infrastructure-components-development.yaml does not let us specify environment variables for the capd-manager container.
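If CAPD, like kind, propagates these variables from its own process environment to the node containers it creates, then a manual workaround (untested here) could be to inject them into the deployment directly, e.g.:
kubectl -n capd-system set env deployment/capd-controller-manager -c manager HTTP_PROXY=$http_proxy HTTPS_PROXY=$https_proxy NO_PROXY=$no_proxy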