Kubespray: Missing container image metrics-server/metrics-server:v0.3.7

Created on 10 Dec 2020  ·  4 Comments  ·  Source: kubernetes-sigs/kubespray

Environment:

  • Cloud provider or hardware configuration:
    VMs running on Nutanix
  • OS (printf "$(uname -srm)\n$(cat /etc/os-release)\n"):
    Linux 3.10.0-1127.8.2.el7.x86_64 x86_64
    NAME="Red Hat Enterprise Linux Server"
    VERSION="7.8 (Maipo)"
    ID="rhel"
    ID_LIKE="fedora"
    VARIANT="Server"
    VARIANT_ID="server"
    VERSION_ID="7.8"
    PRETTY_NAME="Red Hat Enterprise Linux"
    ANSI_COLOR="0;31"
    CPE_NAME="cpe:/o:redhat:enterprise_linux:7.8:GA:server"
    HOME_URL="https://www.redhat.com/"
    BUG_REPORT_URL="https://bugzilla.redhat.com/"

    REDHAT_BUGZILLA_PRODUCT="Red Hat Enterprise Linux 7"
    REDHAT_BUGZILLA_PRODUCT_VERSION=7.8
    REDHAT_SUPPORT_PRODUCT="Red Hat Enterprise Linux"
    REDHAT_SUPPORT_PRODUCT_VERSION="7.8"

  • Version of Ansible (ansible --version):
    ansible 2.9.14

  • Version of Python (python --version):
    python version = 3.6.9

Kubespray version (commit) (git rev-parse --short HEAD):
75d648ca

Network plugin used:
calico

Full inventory with variables (ansible -i inventory/sample/inventory.ini all -m debug -a "var=hostvars[inventory_hostname]"):

Command used to invoke ansible:
ansible-playbook -i inventory/test --vault-password-file=vault-password --become kubernetes-upgrade.yaml --limit tst-kube-nextcloud-km1a -vv

This is just a wrapper playbook that unlocks the docker packages and then calls the kubespray/upgrade-cluster.yml play.

  - name: Pre-upgrade tasks (all nodes)
    hosts: k8s-cluster
    become: yes
    tags: [pre]
    tasks:

      - name: Unlock docker-ce version
        command: yum versionlock delete docker-ce docker-ce-cli containerd.io
        args:
          warn: false
        register: versionlock_output
        changed_when: "'versionlock deleted: 3' in versionlock_output.stdout"
        failed_when: "versionlock_output.rc != 0 and 'no matches' not in versionlock_output.stderr"

  - import_playbook: ../kubespray/upgrade-cluster.yml

Output of ansible run:

TASK [download_container | Download image if required] *********************************
task path: /home/damo/Software/ansible/kubespray/roles/download/tasks/download_container.yml:52
FAILED - RETRYING: download_container | Download image if required (4 retries left).
FAILED - RETRYING: download_container | Download image if required (3 retries left).
FAILED - RETRYING: download_container | Download image if required (2 retries left).
FAILED - RETRYING: download_container | Download image if required (1 retries left).
fatal: [tst-kube-nextcloud-km1a -> tst-kube-nextcloud-km1a.cc.swin.edu.au]: FAILED! => {"attempts": 4, "changed": true, "cmd": ["/usr/bin/docker", "pull", "gcr.io/google-containers/metrics-server/metrics-server:v0.3.7"], "delta": "0:00:01.921387", "end": "2020-12-10 10:33:34.998386", "msg": "non-zero return code", "rc": 1, "start": "2020-12-10 10:33:33.076999", "stderr": "Error response from daemon: manifest for gcr.io/google-containers/metrics-server/metrics-server:v0.3.7 not found", "stderr_lines": ["Error response from daemon: manifest for gcr.io/google-containers/metrics-server/metrics-server:v0.3.7 not found"], "stdout": "", "stdout_lines": []}

Anything else we need to know:

Trying to upgrade from 1.17.9 to 1.18.10.

kind/bug

All 4 comments

Please check that kube_image_repo is set correctly. It should read kube_image_repo: "k8s.gcr.io"

https://github.com/kubernetes-sigs/kubespray/blob/master/roles/download/defaults/main.yml#L56-L58

I can confirm that kube_image_repo is set to "k8s.gcr.io":

damo@dm:~/Software/ansible/kubespray (release-2.14)$ grep "kube_image_repo:" roles/download/defaults/main.yml
kube_image_repo: "k8s.gcr.io"

But it still seems to be pulling from gcr.io:

fatal: [tst-kube-nextcloud-km1a -> tst-kube-nextcloud-km1a.cc.swin.edu.au]: FAILED! => {
    "attempts": 4,
    "changed": true,
    "cmd": [
        "/usr/bin/docker",
        "pull",
        "gcr.io/google-containers/metrics-server/metrics-server:v0.3.7"
    ],
    "delta": "0:00:01.968521",
    "end": "2020-12-11 08:39:01.113002",
    "invocation": {
        "module_args": {
            "_raw_params": "/usr/bin/docker pull gcr.io/google-containers/metrics-server/metrics-server:v0.3.7",
            "_uses_shell": false,
            "argv": null,
            "chdir": null,
            "creates": null,
            "executable": null,
            "removes": null,
            "stdin": null,
            "stdin_add_newline": true,
            "strip_empty_ends": true,
            "warn": true
        }
    },
    "msg": "non-zero return code",
    "rc": 1,
    "start": "2020-12-11 08:38:59.144481",
    "stderr": "Error response from daemon: manifest for gcr.io/google-containers/metrics-server/metrics-server:v0.3.7 not found",
    "stderr_lines": [
        "Error response from daemon: manifest for gcr.io/google-containers/metrics-server/metrics-server:v0.3.7 not found"
    ],
    "stdout": "",
    "stdout_lines": []
}

You still have a wrong reference to gcr.io/google-containers in your inventory. The grep you ran was against the kubespray code, which is indeed correct; the wrong reference is in YOUR inventory 😉

https://github.com/kubernetes-sigs/kubespray/pull/5764/files
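One way to track down a leftover reference like this is to scan the inventory directory for the old registry string. The following is only a minimal sketch: the function name find_stale_refs is hypothetical (it is not part of kubespray or Ansible), and it assumes the inventory is made of .yml/.yaml files in which commented-out lines can be ignored.

```python
from pathlib import Path

# Registry prefix that appears in the failing `docker pull` command.
OLD_REPO = "gcr.io/google-containers"


def find_stale_refs(inventory_dir, needle=OLD_REPO):
    """Return (path, line_number, line) for each uncommented line containing needle."""
    hits = []
    for path in sorted(Path(inventory_dir).rglob("*")):
        # Only look at YAML files; skip directories and everything else.
        if not path.is_file() or path.suffix not in {".yml", ".yaml"}:
            continue
        lines = path.read_text(encoding="utf-8").splitlines()
        for lineno, line in enumerate(lines, start=1):
            stripped = line.strip()
            if stripped.startswith("#"):
                continue  # commented-out settings have no effect
            if needle in stripped:
                hits.append((str(path), lineno, stripped))
    return hits
```

Calling find_stale_refs("inventory/test") (the directory passed with -i in the report) would list every file and line that still points at the old registry. Note that "k8s.gcr.io" does not match the search string, so correct settings are not reported.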

Yes, thank you @floryut, good pickup.

It was in group_vars/k8s-cluster/k8s-cluster.yml. I made the change below and the upgrade worked.

-kube_image_repo: "gcr.io/google-containers"
+#kube_image_repo: "gcr.io/google-containers"
+kube_image_repo: "k8s.gcr.io"
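The role default never took effect because, in Ansible's variable precedence, values from inventory group_vars override role defaults. The sketch below models that with plain dictionaries. It is an illustration only: the function name effective_image is hypothetical, and the image path template is an assumption inferred from the failing docker pull command, not copied from the kubespray code.

```python
# Role default, as grepped from roles/download/defaults/main.yml (lowest precedence).
role_defaults = {"kube_image_repo": "k8s.gcr.io"}

# Inventory value from group_vars/k8s-cluster/k8s-cluster.yml before the fix.
inventory_group_vars = {"kube_image_repo": "gcr.io/google-containers"}


def effective_image(defaults, group_vars, tag="v0.3.7"):
    """Merge variables so group_vars win over role defaults, then build the image name."""
    merged = {**defaults, **group_vars}  # later dict overrides earlier one
    # Assumed template, inferred from the pull command in the error output.
    return f"{merged['kube_image_repo']}/metrics-server/metrics-server:{tag}"


print(effective_image(role_defaults, inventory_group_vars))
# gcr.io/google-containers/metrics-server/metrics-server:v0.3.7
print(effective_image(role_defaults, {}))
# k8s.gcr.io/metrics-server/metrics-server:v0.3.7
```

With the stale inventory value in place, the merged result reproduces the image reference from the error; once it is removed or corrected, the k8s.gcr.io value is used.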

Thanks for your help.
