Dashboard: i/o timeout on heapster metrics

Created on 27 Jan 2017 · 14 comments · Source: kubernetes/dashboard

Issue details

I am deploying kube-dash and heapster/InfluxDB on my on-premise Kubernetes cluster (1 master, 6 nodes). Heapster is deployed with InfluxDB and I can query it just fine (both via the internal cluster endpoint and via the service NodePort routed through kube-proxy). The issue is that the Kubernetes dashboard gets timeouts when trying to access it, which not only makes the dashboard extremely slow but also removes much of its functionality.

Error from kubedash logs:

[2017-01-27T16:54:33Z] Incoming HTTP/1.1 GET /api/v1/pod/kube-system?itemsPerPage=10&page=1 request from 172.16.32.128:56826
Getting list of all pods in the cluster
Getting pod metrics
Skipping Heapster metrics because of error: an error on the server ("Error: 'dial tcp 172.16.40.138:8082: i/o timeout'\nTrying to reach: 'http://172.16.40.138:8082/api/v1/model/namespaces/kube-system/pod-list/centos-deployment-275141233-7yecd,elastickube-mongo-ny0eh,elastickube-server-o8psw,heapster-3in4w,influxdb-grafana-y8isc,kube-dns-v9-0fpvq,kube-dns-v9-4swto,kube-dns-v9-zbupd,kube2consul-m3vcr,kubernetes-dashboard-2929884197-cf349,nginx-deployment-2947857529-ugcst,ubuntu-deployment-4001649334-ugymt/metrics/cpu/usage_rate'") has prevented the request from succeeding (get services heapster)

Proof that the endpoint is actually accessible from within Kubernetes (using a dedicated curl container):

kubectl logs curl-deployment-675076690-s2pac --namespace=kube-system
% Total % Received % Xferd Average Speed Time Time Time Current
Dload Upload Total Spent Left Speed
{"items":[{"metrics":[{"timestamp":"2017-01-27T16:54:00Z","value":0},{"timestamp":"2017-01-27T17:01:00Z","value":0},{"timestamp":"2017-01-27T17:03:00Z","value":0},{"timestamp":"2017-01-27T17:04:00Z","value":0}],"latestTimestamp":"2017-01-27T17:04:00Z"},{"metrics":[{"timestamp":"2017-01-27T16:50:00Z","value":5},{"timestamp":"2017-01-27T16:51:00Z","value":6},{"timestamp":"2017-01-27T16:52:00Z","value":5},{"timestamp":"2017-01-27T16:53:00Z","value":5},{"timestamp":"2017-01-27T16:54:00Z","value":5},{"timestamp":"2017-01-27T16:55:00Z","value":6},
...
{"timestamp":"2017-01-27T16:51:00Z","value":0},{"timestamp":"2017-01-27T16:52:00Z","value":0},{"timestamp":"2017-01-27T16:53:00Z","value":0},{"timestamp":"2017-01-27T16:55:00Z","value":0},{"timestamp":"2017-01-27T16:56:00Z","value":0},{"timestamp":"2017-01-27T16:57:00Z","value":0},{"timestamp":"2017-01-27T16:58:00Z","value":0},{"timestamp":"2017-01-27T17:00:00Z","value":0},{"timestamp":"2017-01-27T17:02:00Z","value":0},{"timestamp":"2017-01-27T17:03:00Z","value":0}],"latestTimestamp":"2017-01-27T17:03:00Z"}]}
100 8513 0 8513 0 0 2911k 0 --:--:-- --:--:-- --:--:-- 4156k
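For reference, output like the above presumably came from a one-off curl pod. A minimal sketch of how to reproduce such a check (the image name is an assumption; the ClusterIP 172.16.40.138, port 8082, and pod name heapster-3in4w are taken from the log in this issue, so adjust for your cluster):

```shell
# Start a throwaway pod with curl in the kube-system namespace
# (radial/busyboxplus:curl is one commonly used image; any image with curl works).
kubectl run curl-test --image=radial/busyboxplus:curl -i --tty --namespace=kube-system -- sh

# From inside the pod, hit the same heapster model endpoint the dashboard queries:
curl "http://172.16.40.138:8082/api/v1/model/namespaces/kube-system/pod-list/heapster-3in4w/metrics/cpu/usage_rate"
```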

Environment
Dashboard version: 1.5.1
Kubernetes version: 1.4.8
Operating system: Centos 7.2
Node.js version: n/a
Go version: n/a
Steps to reproduce


Deploy heapster 1.2 with InfluxDB and a NodePort service.
Deploy kubedash 1.5.1.
Open the pods section of the kubedash UI, where metrics would normally be gathered.

Observed result

The dashboard is not able to communicate with the heapster endpoint.

Expected result

The dashboard should be able to reach the internal heapster endpoint.

Comments


All 14 comments

Is the connection attempt made from the kube master, by chance? My master is not part of the cluster's node networking, so it cannot resolve the internal cluster endpoint IP/port. I was just looking at the heapster documentation here:

https://github.com/kubernetes/heapster/blob/master/docs/debugging.md

I can confirm that heapster is reachable from any worker: a curl http://10.40.0.2:8082/metrics on a worker returns some data. The same command on the master runs into a timeout. Any ideas why?

It has to do with which REST client KubeDash uses. It makes that determination based on where heapster is hosted. I don't know how to fix it in KubeDash itself, but I was able to work around it by specifying my heapster host manually.

Example manifest snippet:
spec:
  containers:
  - name: kubernetes-dashboard
    image: gcr.io/google_containers/kubernetes-dashboard-amd64:v1.5.1
    imagePullPolicy: Always
    ports:
    - containerPort: 9090
      protocol: TCP
    args:
      # Uncomment the following line to manually specify the Kubernetes API server host.
      # If not specified, Dashboard will attempt to auto-discover the API server and
      # connect to it. Uncomment only if the default does not work.
      - --apiserver-host=http://:8080
      - --heapster-host=http://heapster
    livenessProbe:
      httpGet:
        path: /
        port: 9090
      initialDelaySeconds: 30
      timeoutSeconds: 30

Specifically the line: "- --heapster-host=http://heapster"

See this for the different rest client functions: https://github.com/kubernetes/dashboard/blob/1ee771c19ed387493b372a4e7379201cafb61d93/src/app/backend/client/heapsterclient.go

Great, this worked for me! Thanks.

By default, if heapster-host is not specified, we try to use the in-cluster config. That means the dashboard looks for the heapster service inside the cluster and tries to connect through the service proxy. For that to work, cluster networking has to be configured properly (DNS, possibly a network overlay). The other option is to specify heapster-host manually in the dashboard YAML file.
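That service-proxy path can be exercised by hand. A sketch, assuming kubectl is configured against the cluster (the heapster service name and the kube-system namespace match this issue):

```shell
# Open a local proxy to the API server, then query heapster through the
# same service-proxy path the dashboard's in-cluster client uses.
kubectl proxy --port=8001 &
curl "http://localhost:8001/api/v1/proxy/namespaces/kube-system/services/heapster/api/v1/model/namespaces/kube-system/pods/"
```

If this hangs or times out, the problem is in the proxy path between the API server and heapster, not in the dashboard itself.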

If cluster configuration was the issue, can we close this?

I installed the cluster using kubeadm and the Weave overlay network. Other services work fine, so I'm not sure it is a cluster setup issue. I just followed the instructions, and by default it did not work.

@floreks My heapster host is in fact in-cluster and accessible through normal in-cluster endpoints. I use flanneld as my overlay network and in-cluster DNS (which works). There is something bugged about the in-cluster REST call that I'm not smart enough to figure out.

I was having the same problem and I just solved it; here is my case:

When the dashboard tries to get heapster metrics it calls something like:
http://MASTER_HOST:MASTER_API_PORT/api/v1/proxy/namespaces/kube-system/services/heapster/

So it is the master making the metrics call (that is my guess).

So, it is the master who has to have access to heapster, not the node or the pods.
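Under that guess, one way to test it is to issue the same proxy call directly against the API server from the master. This is a sketch: http://localhost:8080 assumes an insecure local API port, and the pod name is taken from the logs earlier in this issue.

```shell
# Run on the master. If this times out the same way the dashboard does,
# the master itself cannot reach the heapster pod network.
curl --max-time 10 \
  "http://localhost:8080/api/v1/proxy/namespaces/kube-system/services/heapster/api/v1/model/namespaces/kube-system/pod-list/heapster-3in4w/metrics/cpu/usage_rate"
```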

To do that I configured flanneld on the master as well and, VERY IMPORTANT, on the node I added these two iptables rules:

iptables -I FORWARD 1 -i flannel.1 -o docker0 -j ACCEPT
iptables -I FORWARD 1 -i docker0 -o flannel.1 -j ACCEPT

As I was working with just one node for testing purposes, I didn't notice those two rules were missing, but I suppose I would have if I had added more nodes.

This is MY case; yours may be different, but this is what I did to solve it, and now it works perfectly. The dashboard is now really usable in terms of speed. Love it!

@bmhkb4 It worked for me!

@floreks I am also bh016088 (the OP; that is my work account), and yes, you can close. I would simply suggest removing the requirement that the flannel overlay network be hosted on the master. The call from kube-dash to heapster should come from the kube-dash container (which lives inside the overlay network). Thanks, guys!

We do not have any direct dependencies on any overlay network. For the in-cluster heapster configuration we assume that heapster is running inside the cluster and is accessible via the service proxy at api/v1/proxy/namespaces/kube-system/services/heapster/api/v1/....

The only requirement is that cluster networking is configured properly. You should not need to adapt iptables manually; the kube-proxy component should do that for you. I have configured clusters a couple of times and did not have to do that.

@dcodix I was having the same problem and just solved it. Thanks a lot!

Just in case it helps other people: I was having a similar issue with a slow dashboard after deploying heapster.
My problem was that a proxy was set in /etc/kubernetes/manifests/kube-apiserver.yaml. It looks like kubeadm populates it from what is set in /etc/environment.
I removed the proxy settings from kube-apiserver.yaml and everything works fine.
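A quick way to check for this condition (file path as given above; kubeadm layouts vary, so treat this as a sketch):

```shell
# Look for proxy environment variables that may have been copied into the
# apiserver's static pod manifest.
grep -inE 'http_proxy|https_proxy|no_proxy' /etc/kubernetes/manifests/kube-apiserver.yaml
```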

