Dashboard: Pod metrics not working

Created on 4 Aug 2016 · 18 Comments · Source: kubernetes/dashboard

Issue details

Environment

The dashboard is run as a replication controller in a kubernetes cluster from the image: gcr.io/google_containers/kubernetes-dashboard-amd64:v1.1.0

Dashboard version: v1.1.0
Heapster version: gcr.io/google_containers/kubernetes-dashboard-amd64:v1.1.0
Influx DB version: gcr.io/google_containers/heapster_influxdb:v0.7
Kubernetes version: 1.3
Operating system: CentOS

Steps to reproduce

Load the dashboard and go to the details page of a single pod

Observed result

Everything works apart from pod level metrics

Looking at the dashboard logs I see:

Incoming HTTP/1.1 GET /api/v1/pod/canary/cats-copper-pwl3p request from 172.16.35.0:19646
Getting details of cats-copper-pwl3p pod in canary namespace
Getting pod metrics
Skipping Heapster metrics because of error: invalid character '<' looking for beginning of value

Expected result

Pod level metrics to show as in the screenshots on the dashboard website

kind/bug

Source

chbatey

All 18 comments

I think the Heapster version you've provided is wrong :)
Heapster version: gcr.io/google_containers/kubernetes-dashboard-amd64:v1.1.0

Please update, so we can test.

floreks on 4 Aug 2016

👍1

@chbatey Can you tell us which version of Heapster you are running?

maciaszczykm on 5 Aug 2016

I am also running into the same issue with a very similar setup. All of my configurations are from the Kubernetes 1.3.3 release:
https://github.com/kubernetes/kubernetes/blob/v1.3.3/cluster/addons/dashboard
https://github.com/kubernetes/kubernetes/tree/v1.3.3/cluster/addons/cluster-monitoring/influxdb

Environment

Dashboard: gcr.io/google_containers/kubernetes-dashboard-amd64:v1.1.0
Heapster: gcr.io/google_containers/heapster:v1.1.0
Influx DB: gcr.io/google_containers/heapster_influxdb:v0.5
Kubernetes version: 1.3.3
Operating system: CentOS

Dashboard logs

Starting HTTP server on port 9090
Creating API server client for https://10.254.0.1:443
Successful initial request to the apiserver, version: v1.3.3
Creating in-cluster Heapster client
Getting application global configuration
Application configuration {"serverTime":1472135028393}
[2016-08-25T14:23:49Z] Incoming HTTP/1.0 GET /api/v1/workload?itemsPerPage=10&page=1 request from [::1]:58608
Getting lists of all workloads
Getting pod metrics
[2016-08-25T14:23:50Z] Outcoming response to [::1]:58608 with 201 status code
[2016-08-25T14:23:50Z] Incoming HTTP/1.0 GET /api/v1/namespace request from 127.0.0.1:35720
Getting namespace list
[2016-08-25T14:23:50Z] Outcoming response to 127.0.0.1:35720 with 201 status code
[2016-08-25T14:23:52Z] Incoming HTTP/1.0 GET /api/v1/workload/kube-system?itemsPerPage=10&page=1 request from [::1]:58622
Getting lists of all workloads
Getting pod metrics
Skipping Heapster metrics because of error: invalid character 'E' looking for beginning of value
[2016-08-25T14:24:52Z] Outcoming response to [::1]:58622 with 201 status code

bradwilliams-nm on 25 Aug 2016

same as @bradwilliams-nm here:

Dashboard: gcr.io/google_containers/kubernetes-dashboard-amd64:v1.1.1
Heapster: gcr.io/google_containers/heapster:v1.1.0
Influx DB: gcr.io/google_containers/heapster_influxdb:v0.5
Kubernetes version: 1.3.5

Getting lists of all workloads
Getting pod metrics
Skipping Heapster metrics because of error: invalid character 'E' looking for beginning of value
[2016-08-25T16:45:03Z] Outcoming response to 10.1.45.3:37416 with 201 status code

The dashboard is non-functional because of this. Maybe a good fallback would be to serve the dashboard with placeholders instead of actual metrics?

janwillies on 25 Aug 2016

Can you tell me what heapster returns to you?

Do:

kubectl proxy
Then open a URL like: http://localhost:8001/api/v1/proxy/namespaces/kube-system/services/heapster/api/v1/model/namespaces/empty-one/pod-list/nginx-fpjyf/metrics/cpu-usage
Replace namespaces/empty-one/pod-list/nginx-fpjyf/ with the namespace and name of a pod from one of your namespaces.
Paste here what is returned.
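
The steps above can also be scripted. This is a minimal Go sketch, assuming `kubectl proxy` is listening on localhost:8001 and Heapster runs as the `heapster` service in `kube-system`, as in the URL above. The helper name `podMetricURL` is made up for illustration and is not part of the dashboard or Heapster.

```go
package main

import (
	"fmt"
	"io"
	"net/http"
)

// podMetricURL builds the Heapster model API URL for one pod metric, routed
// through the apiserver service proxy. proxyBase is wherever `kubectl proxy`
// listens (assumed to be http://localhost:8001 in main below).
func podMetricURL(proxyBase, namespace, pod, metric string) string {
	return fmt.Sprintf(
		"%s/api/v1/proxy/namespaces/kube-system/services/heapster/api/v1/model/namespaces/%s/pod-list/%s/metrics/%s",
		proxyBase, namespace, pod, metric)
}

func main() {
	// Namespace and pod name are the placeholders from the comment above;
	// substitute a pod from your own cluster.
	url := podMetricURL("http://localhost:8001", "empty-one", "nginx-fpjyf", "cpu-usage")

	resp, err := http.Get(url)
	if err != nil {
		fmt.Println("request failed:", err)
		return
	}
	defer resp.Body.Close()

	body, err := io.ReadAll(resp.Body)
	if err != nil {
		fmt.Println("reading response failed:", err)
		return
	}

	// Print the status line and the raw body so both can be pasted into the issue.
	fmt.Println(resp.Status)
	fmt.Println(string(body))
}
```

Printing the raw body rather than decoding it is deliberate: the point of the exercise is to see exactly what comes back, whether or not it is JSON.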

bryk on 26 Aug 2016

< HTTP/1.1 503 Service Unavailable
<
Error: 'dial tcp 10.1.48.10:8082: i/o timeout'
* Connection #0 to host localhost left intact
Trying to reach: 'http://10.1.48.10:8082/api/v1/model/namespaces/infra/pod-list/nginx-782089945-wnk8q/metrics/cpu-usage'

That explains the 'E' invalid character message. Has something with heapster changed?
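
To illustrate why the log message names a single character: the wording matches the syntax error that Go's `encoding/json` produces when the input does not start with a valid JSON token. The sketch below assumes the response body is handed straight to a JSON decoder; the `metricResponse` type is a hypothetical stand-in for the response shape, and the HTML body is an invented example of what could produce the '<' variant from the original report.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// metricResponse is a hypothetical stand-in for the shape of a Heapster
// model API metrics response; it is not taken from the dashboard source.
type metricResponse struct {
	Items []struct {
		Metrics []struct {
			Timestamp string `json:"timestamp"`
			Value     int64  `json:"value"`
		} `json:"metrics"`
		LatestTimestamp string `json:"latestTimestamp"`
	} `json:"items"`
}

// decode tries to parse a response body as JSON and returns the decoder error.
func decode(body string) error {
	var r metricResponse
	return json.Unmarshal([]byte(body), &r)
}

func main() {
	bodies := []string{
		// Plain-text error from the apiserver proxy, as pasted above.
		"Error: 'dial tcp 10.1.48.10:8082: i/o timeout'",
		// Invented HTML error page (assumption) for the '<' case.
		"<html><body>Service Unavailable</body></html>",
	}
	for _, b := range bodies {
		fmt.Printf("%q\n  -> %v\n", b, decode(b))
	}
}
```

In both cases the decoder fails on the very first byte, so the character in the log message is simply the first character of whatever non-JSON body came back.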

When I query from inside a pod, I get a somewhat better result:

curl http://10.1.48.10:8082/api/v1/model/namespaces/infra/pod-list/nginx-782089945-wnk8q/metrics/cpu-usage
{"items":[{"metrics":[],"latestTimestamp":"0001-01-01T00:00:00Z"}]}

janwillies on 29 Aug 2016

Yeah, that's what I expected. Can you report a bug on the Heapster repo or talk with @piosz or @mwielgus?

bryk on 29 Aug 2016

Action item for us would be to provide better logs for such problems

bryk on 29 Aug 2016

👍1

FWIW, I seem to be getting hit by https://github.com/kubernetes/heapster/issues/1237
The workaround is to run Docker with the cgroupfs cgroup driver instead of systemd.

janwillies on 29 Aug 2016

The issue still happens, though.

when I query with kubectl proxy:

curl -v localhost:8001/api/v1/proxy/namespaces/kube-system/services/heapster/api/v1/model/namespaces/infra/pod-list/nginx-ingress-controller-37qef/metrics/cpu-usage
> GET /api/v1/proxy/namespaces/kube-system/services/heapster/api/v1/model/namespaces/infra/pod-list/nginx-ingress-controller-37qef/metrics/cpu-usage HTTP/1.1
> Host: localhost:8001
> User-Agent: curl/7.43.0
> Accept: */*
>

< HTTP/1.1 503 Service Unavailable
< Content-Length: 174
< Content-Type: text/plain; charset=utf-8
< Date: Mon, 29 Aug 2016 19:10:50 GMT
<
Error: 'dial tcp 10.1.40.5:8082: i/o timeout'
* Connection #0 to host localhost left intact
Trying to reach: 'http://10.1.40.5:8082/api/v1/model/namespaces/infra/pod-list/nginx-ingress-controller-37qef/metrics/cpu-usage'

but when I query from within a pod, heapster responds:

curl -v http://10.1.40.5:8082/api/v1/model/namespaces/infra/pod-list/nginx-ingress-controller-37qef/metrics/cpu-usage
> GET /api/v1/model/namespaces/infra/pod-list/nginx-ingress-controller-37qef/metrics/cpu-usage HTTP/1.1
[...]
{"items":[{"metrics":[{"timestamp":"2016-08-29T19:01:00Z","value":3},{"timestamp":"2016-08-29T19:02:00Z","value":3},{"timestamp":"2016-08-29T19:03:00Z","value":3},{"timestamp":"2016-08-29T19:04:00Z","value":3},{"timestamp":"2016-08-29T19:05:00Z","value":3},{"timestamp":"2016-08-29T19:06:00Z","value":3},{"timestamp":"2016-08-29T19:07:00Z","value":3},{"timestamp":"2016-08-29T19:08:00Z","value":3},{"timestamp":"2016-08-29T19:09:00Z","value":3},{"timestamp":"2016-08-29T19:10:00Z","value":3},{"timestamp":"2016-08-29T19:11:00Z","value":1}],"latestTimestamp":"2016-08-29T19:11:00Z"}]}

janwillies on 29 Aug 2016

@janwillies Dashboard UI talks to Heapster via the service proxy, even inside the cluster. This is how services usually talk to each other in Kubernetes.

bryk on 30 Aug 2016

I looked into this again, and found the issue.

For the service proxy to work, the apiserver must be able to reach pods directly (apiserver->pod communication), which in our case means deploying flannel on the master as well. BTW, dashboard-1.4 beta looks really cool!

janwillies on 6 Sep 2016

👍1

IIUC there is no issue in Heapster itself?

piosz on 6 Sep 2016

No, there isn't.

janwillies on 7 Sep 2016

Should we close this then?

bryk on 7 Sep 2016

In my case, I must put the heapster port in the proxy URL, such as /api/v1/proxy/namespaces/kube-system/services/heapster:8082 to make it work. How can I make the dashboard work with heapster in this case?

Just a thought: hard-coded endpoints via the service proxy make it easy to get things running without much configuration, but they should only be the default. There should be configuration options that allow more flexible deployments.

ntquyen on 6 Nov 2016

@ntquyen wrote: "In my case, I must put the heapster port in the proxy URL, such as /api/v1/proxy/namespaces/kube-system/services/heapster:8082 to make it work. How can I make the dashboard work with heapster in this case?"

The Dashboard UI container has a --heapster-host parameter; you can point it to, e.g., http://kubernetes.default/api/v1/proxy/namespaces/kube-system/services/heapster:8082. It should work.

bryk on 14 Nov 2016

I think it's resolved already. Closing.

floreks on 1 Jun 2017
