BUG REPORT
I changed kube_apiserver_insecure_port to 8443, but it doesn't seem to work; it still uses the default 8080:
kube_apiserver_insecure_port: 8443 # (http)
fatal: [node2]: FAILED! => {"attempts": 20, "changed": false, "content": "", "failed": true, "msg": "Status code was not [200]: Request failed: <urlopen error [Errno 111] Connection refused>", "redirected": false, "status": -1, "url": "http://localhost:8080/healthz"}
I also set system_namespace to kube-system, but that doesn't seem to work either.
[root@node1 kargo]# cat /etc/kubernetes/kube-system-ns.yml
apiVersion: v1
kind: Namespace
metadata:
  name: "{{system_namespace}}"
Did I do anything wrong?
Thanks in advance~
Environment:
[root@node1 ~]# cat /etc/os-release
NAME="CentOS Linux"
VERSION="7 (Core)"
ID="centos"
ID_LIKE="rhel fedora"
VERSION_ID="7"
PRETTY_NAME="CentOS Linux 7 (Core)"
ANSI_COLOR="0;31"
CPE_NAME="cpe:/o:centos:centos:7"
HOME_URL="https://www.centos.org/"
BUG_REPORT_URL="https://bugs.centos.org/"
CENTOS_MANTISBT_PROJECT="CentOS-7"
CENTOS_MANTISBT_PROJECT_VERSION="7"
REDHAT_SUPPORT_PRODUCT="centos"
REDHAT_SUPPORT_PRODUCT_VERSION="7"
[root@node1 ~]# uname
Linux
[root@node1 ~]#
[root@node1 ~]# ansible --version
ansible 2.3.0.0
config file = /etc/ansible/ansible.cfg
configured module search path = Default w/o overrides
python version = 2.7.5 (default, Nov 6 2016, 00:28:07) [GCC 4.8.5 20150623 (Red Hat 4.8.5-11)]
Kargo version (commit) (git rev-parse --short HEAD):
[root@node1 kargo]# git rev-parse --short HEAD
acae0fe
Network plugin used:
calico
Copy of your inventory file:
[root@node1 kargo]# cat inventory/inventory.cfg
[all]
node1 ansible_host=192.168.0.69 ip=192.168.0.69
node2 ansible_host=192.168.0.191 ip=192.168.0.191
[kube-master]
node1
node2
[kube-node]
node1
node2
[etcd]
node1
node2
[k8s-cluster:children]
kube-node
kube-master
[calico-rr]
Command used to invoke ansible:
nohup ansible-playbook -i inventory/inventory.cfg cluster.yml -b -v --private-key=~/.ssh/id_rsa > `pwd`/../install.log 2>&1 &
Output of ansible run:
fatal: [node2]: FAILED! => {"attempts": 20, "changed": false, "content": "", "failed": true, "msg": "Status code was not [200]: Request failed: <urlopen error [Errno 111] Connection refused>", "redirected": false, "status": -1, "url": "http://localhost:8080/healthz"}
Anything else do we need to know:
here is my full config k8s-cluster.yml
# Valid bootstrap options (required): ubuntu, coreos, centos, none
bootstrap_os: centos
#Directory where etcd data stored
etcd_data_dir: /var/lib/etcd
# Directory where the binaries will be installed
bin_dir: /usr/local/bin
# Kubernetes configuration dirs and system namespace.
# Those are where all the additional config stuff goes
# the kubernetes normally puts in /srv/kubernets.
# This puts them in a sane location and namespace.
# Editing those values will almost surely break something.
kube_config_dir: /etc/kubernetes
kube_script_dir: "{{ bin_dir }}/kubernetes-scripts"
kube_manifest_dir: "{{ kube_config_dir }}/manifests"
system_namespace: kube-system
# Logging directory (sysvinit systems)
kube_log_dir: "/var/log/kubernetes"
# This is where all the cert scripts and certs will be located
kube_cert_dir: "{{ kube_config_dir }}/ssl"
# This is where all of the bearer tokens will be stored
kube_token_dir: "{{ kube_config_dir }}/tokens"
# This is where to save basic auth file
kube_users_dir: "{{ kube_config_dir }}/users"
kube_api_anonymous_auth: false
## Change this to use another Kubernetes version, e.g. a current beta release
kube_version: v1.6.1
# Where the binaries will be downloaded.
# Note: ensure that you have enough disk space (about 1G)
local_release_dir: "/tmp/releases"
# Random shifts for retrying failed ops like pushing/downloading
retry_stagger: 5
# This is the group that the cert creation scripts chgrp the
# cert files to. Not really changeable...
kube_cert_group: kube-cert
# Cluster Loglevel configuration
kube_log_level: 2
# Users to create for basic auth in Kubernetes API via HTTP
kube_api_pwd: "abcdefg"
kube_users:
  kube:
    pass: "{{kube_api_pwd}}"
    role: admin
  root:
    pass: "{{kube_api_pwd}}"
    role: admin
## It is possible to activate / deactivate selected authentication methods (basic auth, static token auth)
#kube_oidc_auth: false
#kube_basic_auth: false
#kube_token_auth: false
## Variables for OpenID Connect Configuration https://kubernetes.io/docs/admin/authentication/
## To use OpenID you have to deploy additional an OpenID Provider (e.g Dex, Keycloak, ...)
# kube_oidc_url: https:// ...
# kube_oidc_client_id: kubernetes
## Optional settings for OIDC
# kube_oidc_ca_file: {{ kube_cert_dir }}/ca.pem
# kube_oidc_username_claim: sub
# kube_oidc_groups_claim: groups
# Choose network plugin (calico, weave or flannel)
# Can also be set to 'cloud', which lets the cloud provider setup appropriate routing
kube_network_plugin: calico
# Enable kubernetes network policies
enable_network_policy: true
# Kubernetes internal network for services, unused block of space.
kube_service_addresses: 10.233.0.0/18
# internal network. When used, it will assign IP
# addresses from this range to individual pods.
# This network must be unused in your network infrastructure!
kube_pods_subnet: 10.233.64.0/18
# internal network node size allocation (optional). This is the size allocated
# to each node on your network. With these defaults you should have
# room for 4096 nodes with 254 pods per node.
kube_network_node_prefix: 24
# The port the API Server will be listening on.
kube_apiserver_ip: "{{ kube_service_addresses|ipaddr('net')|ipaddr(1)|ipaddr('address') }}"
kube_apiserver_port: 6443 # (https)
kube_apiserver_insecure_port: 8443 # (http)
# DNS configuration.
# Kubernetes cluster name, also will be used as DNS domain
cluster_name: cluster.local
# Subdomains of DNS domain to be resolved via /etc/resolv.conf for hostnet pods
ndots: 2
# Can be dnsmasq_kubedns, kubedns or none
dns_mode: dnsmasq_kubedns
# Can be docker_dns, host_resolvconf or none
resolvconf_mode: docker_dns
# Deploy netchecker app to verify DNS resolve as an HTTP service
deploy_netchecker: true
# Ip address of the kubernetes skydns service
skydns_server: "{{ kube_service_addresses|ipaddr('net')|ipaddr(3)|ipaddr('address') }}"
dns_server: "{{ kube_service_addresses|ipaddr('net')|ipaddr(2)|ipaddr('address') }}"
dns_domain: "{{ cluster_name }}"
# Path used to store Docker data
docker_daemon_graph: "/opt/docker"
## A string of extra options to pass to the docker daemon.
## This string should be exactly as you wish it to appear.
## An obvious use case is allowing insecure-registry access
## to self hosted registries like so:
docker_options: "--insecure-registry={{ kube_service_addresses }} --graph={{ docker_daemon_graph }}"
docker_bin_dir: "/usr/bin"
# Settings for containerized control plane (etcd/kubelet/secrets)
etcd_deployment_type: docker
kubelet_deployment_type: docker
cert_management: script
vault_deployment_type: docker
# K8s image pull policy (imagePullPolicy)
k8s_image_pull_policy: IfNotPresent
# Monitoring apps for k8s
efk_enabled: true
# Helm deployment
helm_enabled: false
I found the problem: Kubernetes still uses port 8080 for the health check on the apiserver, and the task in question is here:
- name: Master | wait for the apiserver to be running
  uri:
    url: http://localhost:8080/healthz
  register: result
  until: result.status == 200
  retries: 20
  delay: 6
I tried to change it to the correct port, like this:
- name: Master | wait for the apiserver to be running
  uri:
    url: http://localhost:{{ kube_apiserver_insecure_port }}/healthz
  register: result
  until: result.status == 200
  retries: 20
  delay: 6
But the next error message left me baffled:
FAILED - RETRYING: Create kube system namespace (4 retries left).
FAILED - RETRYING: Create kube system namespace (3 retries left).
FAILED - RETRYING: Create kube system namespace (2 retries left).
FAILED - RETRYING: Create kube system namespace (1 retries left)
fatal: [node1]: FAILED! => {"attempts": 4, "changed": false, "cmd": ["/usr/local/bin/kubectl", "create", "-f", "/etc/kubernetes/kube-system-ns.yml"], "delta": "0:00:00.140787", "end": "2017-06-05 22:57:08.321912", "failed": true, "rc": 1, "start": "2017-06-05 22:57:08.181125", "stderr": "The connection to the server localhost:8080 was refused - did you specify the right host or port?", "stderr_lines": ["The connection to the server localhost:8080 was refused - did you specify the right host or port?"], "stdout": "", "stdout_lines": []}
I found the reason: Kargo tries to use kubectl to create the namespace and other objects, but at this point the kubectl configuration has not been written yet, so kubectl still connects to the default port 8080:
- name: Create kube system namespace
  command: "{{ bin_dir }}/kubectl create -f {{kube_config_dir}}/{{system_namespace}}-ns.yml"
  retries: 4
  delay: "{{ retry_stagger | random + 3 }}"
  register: create_system_ns
  until: create_system_ns.rc == 0
  changed_when: False
  when: kubesystem|failed and inventory_hostname == groups['kube-master'][0]
  tags: apps
......
I am sorry, I am not familiar with Ansible, so I can't complete the modification myself.
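One possible workaround (a sketch only, not the upstream fix): since no kubeconfig has been written at this point, kubectl can be pointed at the insecure port explicitly with its standard --server flag. The task shape mirrors the one above; only the command changes:

```yaml
# Sketch: force kubectl to the templated insecure port until a
# kubeconfig exists. --server is a standard kubectl global flag.
- name: Create kube system namespace
  command: >-
    {{ bin_dir }}/kubectl
    --server=http://localhost:{{ kube_apiserver_insecure_port }}
    create -f {{ kube_config_dir }}/{{ system_namespace }}-ns.yml
  retries: 4
  delay: "{{ retry_stagger | random + 3 }}"
  register: create_system_ns
  until: create_system_ns.rc == 0
  changed_when: False
```

This only helps while the insecure listener is enabled; the proper fix is templating the port everywhere it is hard-coded.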
The static port 8080 is hard-coded in three different files, so when you change the kube_apiserver_insecure_port variable, you also need to change it in those three files.
I had the same error a while ago and created a pull request that makes the playbooks use the port defined in the kube_apiserver_insecure_port variable.
Check out PR #1332 to try it out.
@gstorme thanks, that works, but a new error comes up. It seems the cluster has not started yet.
TASK [kubernetes/master : Create kube system namespace] *********************************************************************************************************************************************************************************
task path: /data/k8s/kargo/roles/kubernetes/master/tasks/main.yml:53
Thursday 08 June 2017 10:03:06 +0800 (0:00:01.548) 0:08:45.960 *********
FAILED - RETRYING: Create kube system namespace (4 retries left).
FAILED - RETRYING: Create kube system namespace (3 retries left).
FAILED - RETRYING: Create kube system namespace (2 retries left).
FAILED - RETRYING: Create kube system namespace (1 retries left).
fatal: [node1]: FAILED! => {"attempts": 4, "changed": false, "cmd": ["/usr/local/bin/kubectl", "create", "-f", "/etc/kubernetes/kube-system-ns.yml"], "delta": "0:00:00.236391", "end": "2017-06-08 10:03:26.157691", "failed": true, "rc": 1, "start": "2017-06-08 10:03:25.921300", "stderr": "error: error validating \"/etc/kubernetes/kube-system-ns.yml\": error validating data: unexpected end of JSON input; if you choose to ignore these errors, turn validation off with --validate=false", "stderr_lines": ["error: error validating \"/etc/kubernetes/kube-system-ns.yml\": error validating data: unexpected end of JSON input; if you choose to ignore these errors, turn validation off with --validate=false"], "stdout": "", "stdout_lines": []}
to retry, use: --limit @/data/k8s/kargo/cluster.retry
here is the kube-system-ns.yml
[root@node1 ~]# cat /etc/kubernetes/kube-system-ns.yml
apiVersion: v1
kind: Namespace
metadata:
  name: "kube-system"
@kaybinwong Because kubectl is not configured, it still uses port 8080 to reach the apiserver.
@mritd kubectl is not configured yet? Is there any way to fix this?
@kaybinwong If Kargo wrote the kubectl configuration first and then created the namespace, this problem would be resolved, but that ordering still needs to be changed.
@mritd I have read the source, but I still cannot find it. In which section or module can I find the related code?
@kaybinwong I also could not find where the kubectl configuration file gets created, but from the failure state it is clear that the kubectl configuration file has not been created yet.
Is there any fix for this issue?
I am getting the following:
FAILED - RETRYING: Master | wait for the apiserver to be running (1 retries left).
fatal: [ache1]: FAILED! => {"attempts": 20, "changed": false, "content": "", "failed": true, "msg": "Status code was not [200]: Request failed: <urlopen error [Errno 111] Connection refused>", "redirected": false, "status": -1, "url": "http://127.0.0.1:8080/healthz"}
Looks like http://127.0.0.1:8080/healthz is not accessible, but I can curl http://0.0.0.0:8080/healthz.
Is any configuration change needed?
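If 127.0.0.1 is refused while 0.0.0.0 answers, the apiserver's insecure listener is probably bound to a non-loopback interface. As a sketch (the variable name is taken from later Kubespray releases and may not exist in older Kargo versions), the bind address can be set in the cluster group vars:

```yaml
# Sketch, assuming your kargo/kubespray version exposes this variable:
# bind the insecure (HTTP) listener to loopback so that the
# http://localhost:<port>/healthz checks in the playbooks succeed.
kube_apiserver_insecure_bind_address: 127.0.0.1
```

Verify where the listener is actually bound on the master (e.g. with netstat/ss) before changing this.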
I had a running cluster and was trying to reprovision and got hit by this issue.
This is what the logs look like:
TASK [kubernetes-apps/cluster_roles : Kubernetes Apps | Wait for kube-apiserver] ***************************************************************************************************
FAILED - RETRYING: Kubernetes Apps | Wait for kube-apiserver (10 retries left).
FAILED - RETRYING: Kubernetes Apps | Wait for kube-apiserver (9 retries left).
FAILED - RETRYING: Kubernetes Apps | Wait for kube-apiserver (8 retries left).
FAILED - RETRYING: Kubernetes Apps | Wait for kube-apiserver (7 retries left).
FAILED - RETRYING: Kubernetes Apps | Wait for kube-apiserver (6 retries left).
FAILED - RETRYING: Kubernetes Apps | Wait for kube-apiserver (5 retries left).
FAILED - RETRYING: Kubernetes Apps | Wait for kube-apiserver (4 retries left).
FAILED - RETRYING: Kubernetes Apps | Wait for kube-apiserver (3 retries left).
FAILED - RETRYING: Kubernetes Apps | Wait for kube-apiserver (2 retries left).
FAILED - RETRYING: Kubernetes Apps | Wait for kube-apiserver (1 retries left).
fatal: [k8s-master-1]: FAILED! => {"attempts": 10, "changed": false, "content": "", "failed": true, "msg": "Status code was not [200]: Request failed: <urlopen error [Errno 111] Connection refused>", "redirected": false, "status": -1, "url": "http://127.0.0.1:8080/healthz"}
Do we know of a fix?
It was a swap problem; I solved it with:
swapoff -a
vim roles/download/tasks/download_container.yml
- name: Stop if swap enabled
  assert:
    that: ansible_swaptotal_mb == 0
  when: kubelet_fail_swap_on|default(false)
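For reference, a minimal sketch of handling the same swap cleanup as Ansible tasks, so the assert above passes (assumes root on the target hosts; task names are illustrative):

```yaml
# Sketch: disable swap before the kubelet swap check runs.
- name: Disable swap immediately so ansible_swaptotal_mb reports 0
  command: swapoff -a
  when: ansible_swaptotal_mb > 0

# Comment out swap entries so swap stays off after a reboot.
- name: Remove swap entries from /etc/fstab
  replace:
    path: /etc/fstab
    regexp: '^([^#].*\sswap\s.*)$'
    replace: '# \1'
```

Note that Ansible facts are gathered at play start, so ansible_swaptotal_mb only reflects the swapoff after facts are re-gathered.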
PLAY RECAP **********************************************************************************************************************************************************************************
localhost : ok=2 changed=0 unreachable=0 failed=0
node1 : ok=274 changed=22 unreachable=0 failed=0
node2 : ok=257 changed=18 unreachable=0 failed=0
node3 : ok=257 changed=18 unreachable=0 failed=0
node4 : ok=234 changed=10 unreachable=0 failed=0
node5 : ok=207 changed=5 unreachable=0 failed=0
Wednesday 27 December 2017 23:07:54 +0800 (0:00:00.039) 0:10:24.688 ****
===============================================================================
Please open a new issue if you still experience this problem.
Hello,
I am a beginner with k8s and I am deploying a Kargo cluster with 4 nodes (2 masters and 2 workers).
TASK [kubernetes-apps/cluster_roles : Kubernetes Apps | Wait for kube-apiserver] *****
Monday 28 January 2019 14:44:49 +0100 (0:00:00.056) 0:20:46.903 **
FAILED - RETRYING: Kubernetes Apps | Wait for kube-apiserver (10 retries left).
FAILED - RETRYING: Kubernetes Apps | Wait for kube-apiserver (9 retries left).
FAILED - RETRYING: Kubernetes Apps | Wait for kube-apiserver (8 retries left).
FAILED - RETRYING: Kubernetes Apps | Wait for kube-apiserver (7 retries left).
FAILED - RETRYING: Kubernetes Apps | Wait for kube-apiserver (6 retries left).
FAILED - RETRYING: Kubernetes Apps | Wait for kube-apiserver (5 retries left).
FAILED - RETRYING: Kubernetes Apps | Wait for kube-apiserver (4 retries left).
FAILED - RETRYING: Kubernetes Apps | Wait for kube-apiserver (3 retries left).
FAILED - RETRYING: Kubernetes Apps | Wait for kube-apiserver (2 retries left).
FAILED - RETRYING: Kubernetes Apps | Wait for kube-apiserver (1 retries left).
Do you have any suggestions, please? I have been blocked on this problem for days.
Has the aforementioned problem been fixed, please? I think the 8080 problem was resolved and committed; however, I still get the issue on port 6443 for the apiserver. In my configuration the insecure port is disabled.