While deploying Kibana on K8S >= 1.16:
Events:
Type     Reason     Age               From                            Message
----     ------     ----              ----                            -------
Normal   Scheduled  <unknown>         default-scheduler               Successfully assigned e2e-b5dae-venus/test-cross-ns-assoc-cc85-kb-7cdfb5f694-2klgl to eck-e2e-control-plane
Normal   Pulled     85s               kubelet, eck-e2e-control-plane  Container image "docker.elastic.co/kibana/kibana:7.3.0" already present on machine
Normal   Created    85s               kubelet, eck-e2e-control-plane  Created container kibana
Normal   Started    85s               kubelet, eck-e2e-control-plane  Started container kibana
Warning  Unhealthy  67s               kubelet, eck-e2e-control-plane  Readiness probe failed: Get https://10.244.0.10:5601/login: net/http: request canceled while waiting for connection (Client.Timeout exceeded while awaiting headers)
Warning  Unhealthy  62s               kubelet, eck-e2e-control-plane  Readiness probe failed: HTTP probe failed with statuscode: 503
Warning  Unhealthy  2s (x6 over 51s)  kubelet, eck-e2e-control-plane  Readiness probe errored: the read limit is reached
It seems to be an issue with the amount of data allowed when doing an HTTP probe; the response body is now limited to 10 kB: https://github.com/kubernetes/kubernetes/blob/acc57be085cf5414f924680c1c740378cb712915/pkg/probe/http/http.go#L36
We might need to update the docs for the released 0.9 versions to note that it will not work.
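To make the failure mode concrete, here is a minimal Go sketch of a bounded body read that errors out once a response exceeds a limit. It is an illustration only, not the kubelet's actual code: `maxBody`, `readBounded`, `probeBody` and the error value are hypothetical names, and the 12000-byte body is a made-up size standing in for a large page.

```go
package main

import (
	"errors"
	"fmt"
	"io"
	"strings"
)

// maxBody mirrors the 10 kB figure quoted above (hypothetical name).
const maxBody = 10 * 1024

// errLimitReached reuses the wording of the probe error for illustration.
var errLimitReached = errors.New("the read limit is reached")

// readBounded reads at most limit bytes from r and returns an error if r
// holds more than that. Illustration only, not the kubelet implementation.
func readBounded(r io.Reader, limit int64) ([]byte, error) {
	// Read one byte past the limit so an oversized body can be detected.
	data, err := io.ReadAll(io.LimitReader(r, limit+1))
	if err != nil {
		return nil, err
	}
	if int64(len(data)) > limit {
		return data[:limit], errLimitReached
	}
	return data, nil
}

// probeBody is a convenience wrapper that reports how many bytes were kept.
func probeBody(body string, limit int64) (int, error) {
	data, err := readBounded(strings.NewReader(body), limit)
	return len(data), err
}

func main() {
	// 8452 is the /api/status size reported in this thread; 12000 is made up.
	for _, size := range []int{8452, 12000} {
		n, err := probeBody(strings.Repeat("x", size), maxBody)
		fmt.Printf("body=%d bytes, kept=%d, err=%v\n", size, n, err)
	}
}
```

With this behavior any endpoint whose body grows past the limit turns a healthy pod into a failing probe, regardless of the HTTP status code.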
In case we don't find a right endpoint to query, we could still move away from the builtin k8s http healthcheck to a custom command healthcheck, doing the curl ourselves like we do with Elasticsearch.
But better ask Kibana team first if there's a better endpoint to request :)
A TCP healthcheck might be an option as well?
Some feedback from the Kibana team:
/api/status should work
- /api/status will respond with 503 until the server is ready and able to talk to Elasticsearch and run migrations
- If Kibana loses communication with ES and status.allowAnonymous is not set to true, then you will get 401 from Kibana on this endpoint
- If Kibana loses communication with ES and status.allowAnonymous is set to true, then you will get 200 from Kibana on this endpoint with a status.overall.state property set to red
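The three cases above could be interpreted by a custom check along these lines. This is a rough sketch under assumptions: `kibanaReady` and `statusBody` are hypothetical names, only the `status.overall.state` field mentioned above is modelled, and `green` in the sample is an assumed value for a healthy state.

```go
package main

import (
	"encoding/json"
	"fmt"
	"net/http"
)

// statusBody models only the status.overall.state field mentioned above.
type statusBody struct {
	Status struct {
		Overall struct {
			State string `json:"state"`
		} `json:"overall"`
	} `json:"status"`
}

// kibanaReady interprets an /api/status response code and body.
// Hypothetical helper: the mapping follows the three cases listed above.
func kibanaReady(code int, body []byte) (bool, string) {
	switch code {
	case http.StatusServiceUnavailable:
		return false, "server is not ready yet"
	case http.StatusUnauthorized:
		return false, "unauthorized, state cannot be determined"
	case http.StatusOK:
		var s statusBody
		if err := json.Unmarshal(body, &s); err != nil {
			return false, "response body could not be parsed"
		}
		state := s.Status.Overall.State
		if state == "" {
			return false, "overall state is missing"
		}
		if state == "red" {
			return false, "overall state is red"
		}
		return true, "overall state is " + state
	default:
		return false, fmt.Sprintf("unexpected status code %d", code)
	}
}

func main() {
	// "green" is an assumed sample value, not taken from this thread.
	samples := []struct {
		code int
		body string
	}{
		{503, ""},
		{401, ""},
		{200, `{"status":{"overall":{"state":"red"}}}`},
		{200, `{"status":{"overall":{"state":"green"}}}`},
	}
	for _, s := range samples {
		ok, reason := kibanaReady(s.code, []byte(s.body))
		fmt.Printf("code=%d ready=%v (%s)\n", s.code, ok, reason)
	}
}
```

Note that the 401 case is ambiguous by design: without anonymous access the check cannot tell a healthy Kibana from one that lost Elasticsearch.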
The response from that endpoint came in at just 7.5 kB in a single test I ran.
I think requiring anonymous access sounds like it's complicating things unnecessarily.
I think I am 👍 on @charith-elastic's suggestion to use a simple TCP health check.
Furthermore, the response of /api/status includes the status of each plugin. With more plugins, the payload of the response might reach 10 kB.
On my side, a single test using apm_es_kibana.yaml gives a response of 8.5 kB.
> curl -u $user:$password https://$kibana_ip:5601/api/status -skI | grep length
content-length: 8452
So I tend to +1 to use a simple TCP health check. https://github.com/tevino/tcp-shaker could be a good option (already used in the past and works great).
👍 for the TCP check as described in https://kubernetes.io/docs/tasks/configure-pod-container/configure-liveness-readiness-probes/#define-a-tcp-liveness-probe _(should be applicable to readinessProbe as well)_
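For reference, what a TCP check verifies can be sketched in a few lines of Go. This is a minimal illustration, not the kubelet's probe code: `tcpReady` and `selfCheck` are hypothetical names, and the check runs against a throwaway local listener rather than a real Kibana on port 5601.

```go
package main

import (
	"fmt"
	"net"
	"time"
)

// tcpReady reports whether a TCP connection to addr can be opened
// within timeout. It says nothing about what the server would answer.
func tcpReady(addr string, timeout time.Duration) bool {
	conn, err := net.DialTimeout("tcp", addr, timeout)
	if err != nil {
		return false
	}
	conn.Close()
	return true
}

// selfCheck exercises tcpReady against a temporary local listener,
// first while it is listening and then after it has been closed.
func selfCheck() (open, closed bool, err error) {
	ln, err := net.Listen("tcp", "127.0.0.1:0")
	if err != nil {
		return false, false, err
	}
	addr := ln.Addr().String()
	open = tcpReady(addr, time.Second)
	ln.Close()
	closed = tcpReady(addr, time.Second)
	return open, closed, nil
}

func main() {
	open, closed, err := selfCheck()
	if err != nil {
		fmt.Println("could not open a local listener:", err)
		return
	}
	fmt.Println("ready while listening:", open)
	fmt.Println("ready after close:", closed)
}
```

The check succeeds as soon as the port accepts connections, so it sidesteps the response-size limit entirely but only shows that the process is listening.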
Does the TCP check fail until Kibana is ready and connected to Elasticsearch?
IDK but I doubt it. But the current health check we use does not check for that either. You can have a Kibana that cannot talk to Elasticsearch and still serves up a login page (without login form but with HTML content indicating that it cannot talk to Elasticsearch)
If I'm understanding the release notes correctly, we should be okay with doing an HTTP check against /api/status now that https://github.com/kubernetes/kubernetes/pull/82669 was merged: it should just truncate the response and not error out.
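The truncating behavior described here can be sketched in Go as follows. It is an illustration of the idea only and not the code from that pull request: `readTruncated` and `truncatedLen` are hypothetical names, and the 10240-byte limit and 12000-byte body are sample values.

```go
package main

import (
	"fmt"
	"io"
	"strings"
)

// readTruncated reads at most limit bytes from r and silently drops
// the rest, so an oversized body is no longer an error.
func readTruncated(r io.Reader, limit int64) ([]byte, error) {
	return io.ReadAll(io.LimitReader(r, limit))
}

// truncatedLen is a convenience wrapper that reports how many bytes were kept.
func truncatedLen(body string, limit int64) (int, error) {
	data, err := readTruncated(strings.NewReader(body), limit)
	return len(data), err
}

func main() {
	// 12000 bytes is a made-up body size above the sample 10240-byte limit.
	n, err := truncatedLen(strings.Repeat("x", 12000), 10240)
	fmt.Printf("kept=%d err=%v\n", n, err)
}
```

With truncation, the probe result depends only on the status code, so the size of the /api/status payload stops mattering.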
This was considered a regression by the K8S project and has been fixed.
kubectl version
Client Version: version.Info{Major:"1", Minor:"16", GitVersion:"v1.16.0", GitCommit:"2bd9643cee5b3b3a5ecbd3af49d09018f0773c77", GitTreeState:"clean", BuildDate:"2019-09-18T14:36:53Z", GoVersion:"go1.12.9", Compiler:"gc", Platform:"linux/amd64"}
Server Version: version.Info{Major:"1", Minor:"16", GitVersion:"v1.16.0", GitCommit:"2bd9643cee5b3b3a5ecbd3af49d09018f0773c77", GitTreeState:"clean", BuildDate:"2019-09-24T05:54:40Z", GoVersion:"go1.12.9", Compiler:"gc", Platform:"linux/amd64"}
--- PASS: TestSmoke (148.15s)
--- PASS: TestSmoke/K8S_should_be_accessible (0.02s)
--- PASS: TestSmoke/Elasticsearch_CRDs_should_exist (0.06s)
--- PASS: TestSmoke/Remove_Elasticsearch_if_it_already_exists (0.01s)
--- PASS: TestSmoke/K8S_should_be_accessible#01 (0.01s)
--- PASS: TestSmoke/Kibana_CRDs_should_exist (0.01s)
--- PASS: TestSmoke/Remove_Kibana_if_it_already_exists (0.01s)
--- PASS: TestSmoke/K8S_should_be_accessible#02 (0.01s)
--- PASS: TestSmoke/APM_Server_CRDs_should_exist (0.01s)
--- PASS: TestSmoke/Remove_the_resources_if_they_already_exist (0.01s)
--- PASS: TestSmoke/Creating_an_Elasticsearch_cluster_should_succeed (0.03s)
--- PASS: TestSmoke/Elasticsearch_cluster_should_be_created (0.00s)
--- PASS: TestSmoke/Creating_Kibana_should_succeed (0.01s)
--- PASS: TestSmoke/Kibana_should_be_created (0.00s)
--- PASS: TestSmoke/Creating_APM_Server_should_succeed (0.01s)
--- PASS: TestSmoke/APM_Server_should_be_created (0.00s)
--- PASS: TestSmoke/ES_certificate_authority_should_be_set_and_deployed (6.02s)
--- PASS: TestSmoke/ES_version_should_be_the_expected_one (3.01s)
--- PASS: TestSmoke/ES_pods_should_eventually_be_running (47.36s)
--- PASS: TestSmoke/ES_services_should_be_created (0.01s)
--- PASS: TestSmoke/ES_pods_should_eventually_be_ready (24.38s)
--- PASS: TestSmoke/ES_pods_should_eventually_have_a_certificate (0.02s)
--- PASS: TestSmoke/ES_services_should_have_endpoints (9.02s)
--- PASS: TestSmoke/ES_cluster_health_should_eventually_be_green (12.02s)
--- PASS: TestSmoke/ES_cluster_UUID_should_eventually_appear_in_the_ES_status (0.00s)
--- PASS: TestSmoke/Elastic_password_should_be_available (0.00s)
--- PASS: TestSmoke/Elasticsearch_data_volumes_should_be_of_the_specified_type (0.01s)
--- PASS: TestSmoke/ES_cluster_health_endpoint_should_eventually_be_reachable (0.16s)
--- PASS: TestSmoke/ES_version_should_be_the_expected_one#01 (0.03s)
--- PASS: TestSmoke/ES_endpoint_should_eventually_be_reachable (0.03s)
--- PASS: TestSmoke/ES_nodes_topology_should_eventually_be_the_expected_one (0.06s)
--- PASS: TestSmoke/Kibana_deployment_should_be_set (0.01s)
--- PASS: TestSmoke/Kibana_pods_count_should_match_the_expected_one (0.00s)
--- PASS: TestSmoke/Kibana_pods_should_eventually_be_running (0.00s)
--- PASS: TestSmoke/Kibana_services_should_be_created (0.00s)
--- PASS: TestSmoke/Kibana_services_should_have_endpoints (0.00s)
--- PASS: TestSmoke/Create_Kibana_client (0.04s)
--- PASS: TestSmoke/Kibana_should_be_able_to_connect_to_Elasticsearch (0.08s)
--- PASS: TestSmoke/ApmServer_deployment_should_be_created (0.00s)
--- PASS: TestSmoke/ApmServer_pods_count_should_match_the_expected_one (0.00s)
--- PASS: TestSmoke/ApmServer_pods_should_eventually_be_running (0.00s)
--- PASS: TestSmoke/ApmServer_services_should_be_created (0.00s)
--- PASS: TestSmoke/ApmServer_services_should_have_endpoints (0.00s)
--- PASS: TestSmoke/Every_secret_should_be_set_so_that_we_can_build_an_APM_client (0.16s)
--- PASS: TestSmoke/ApmServer_endpoint_should_eventually_be_reachable (0.01s)
--- PASS: TestSmoke/ApmServer_version_should_be_the_expected_one (0.00s)
--- PASS: TestSmoke/Events_should_be_accepted (0.00s)
--- PASS: TestSmoke/Events_should_eventually_show_up_in_Elasticsearch (12.33s)
--- PASS: TestSmoke/Deleting_Elasticsearch_should_return_no_error (0.01s)
--- PASS: TestSmoke/Elasticsearch_should_not_be_there_anymore (0.00s)
--- PASS: TestSmoke/Elasticsearch_pods_should_be_eventually_be_removed (15.04s)
--- PASS: TestSmoke/PVCs_should_eventually_be_removed (0.00s)
--- PASS: TestSmoke/Deleting_Kibana_should_return_no_error (0.01s)
--- PASS: TestSmoke/Kibana_should_not_be_there_anymore (0.00s)
--- PASS: TestSmoke/Kibana_pods_should_be_eventually_be_removed (9.02s)
--- PASS: TestSmoke/Deleting_the_resources_should_return_no_error (0.01s)
--- PASS: TestSmoke/The_resources_should_not_be_there_anymore (0.00s)
--- PASS: TestSmoke/APM_Server_pods_should_be_eventually_be_removed (9.03s)
PASS
ok github.com/elastic/cloud-on-k8s/test/e2e 148.171s
I just got this exact error on 7.7.1
K8s version:
Server Version: version.Info{Major:"1", Minor:"18", GitVersion:"v1.18.3", GitCommit:"2e7996e3e2712684bc73f0dec0200d64eec7fe40", GitTreeState:"clean", BuildDate:"2020-05-20T12:43:34Z", GoVersion:"go1.13.9", Compiler:"gc", Platform:"linux/amd64"}
What should I do?
@maxisam I am unable to reproduce it with minikube 1.18.3 and ES+Kibana 7.7.1 with ECK 1.1.2. Can you share the details of your environment, the manifests, and the specific logs and behavior you're seeing?
It works after I changed the readinessProbe. Thanks!