Dashboard: Dashboard v2 can't get metrics (but kubectl top xxx works)

Created on 12 Dec 2019 · 13 Comments · Source: kubernetes/dashboard

Environment
Installation method:
Kubernetes version: 1.16.2
Dashboard version: v2.0.0-beta8
Operating system: CentOS 7
Steps to reproduce
  1. Install metrics-server using https://github.com/helm/charts/tree/master/stable/metrics-server
  2. Install Kubernetes Dashboard v2 using this adapted helm chart: https://github.com/funkypenguin/helm-kubernetes-dashboard (AFAIK, the "official" helm chart is not yet updated for v2)
  3. Confirm it's possible to gather metrics using the Kube API, by running kubectl top nodes
Observed result
  1. Dashboard shows expected information, but per-pod CPU/memory metrics are missing.
  2. Dashboard logs show either:
2019/12/12 21:26:35 Getting pod metrics
2019/12/12 21:26:35 Internal error occurred: No metric client provided. Skipping metrics.

Or:

2019/12/12 21:31:11 Metric client health check failed: the server is currently unable to handle the request (get services dashboard-metrics-scraper). Retrying in 30 seconds.
Expected result

Dashboard correctly displays CPU/memory usage graphs per-pod

Comments

The metrics-scraper pod is logging the following, which leads me to believe that it's functioning normally (i.e., it has the necessary RBAC etc to scrape metrics).

192.168.33.24 - - [12/Dec/2019:21:33:38 +0000] "GET / HTTP/1.1" 200 6 "" "kube-probe/1.16"
192.168.33.24 - - [12/Dec/2019:21:33:48 +0000] "GET / HTTP/1.1" 200 6 "" "kube-probe/1.16"
{"level":"info","msg":"Database updated: 7 nodes, 205 pods","time":"2019-12-12T21:33:58Z"}
192.168.33.24 - - [12/Dec/2019:21:33:58 +0000] "GET / HTTP/1.1" 200 6 "" "kube-probe/1.16"
192.168.33.24 - - [12/Dec/2019:21:34:08 +0000] "GET / HTTP/1.1" 200 6 "" "kube-probe/1.16"
kind/bug

Most helpful comment

After reading the code, I think the problem lies in the following snippet from restclient.go.

// Get creates request to given path.
func (c inClusterSidecarClient) Get(path string) RequestInterface {
    return c.client.Get().
        Namespace(args.Holder.GetNamespace()).
        Resource("services").
        Name("dashboard-metrics-scraper").
        SubResource("proxy").
        Suffix(path)
}

// HealthCheck does a health check of the application.
// Returns nil if connection to application can be established, error object otherwise.
func (self inClusterSidecarClient) HealthCheck() error {
    _, err := self.client.Get().
        Namespace(args.Holder.GetNamespace()).
        Resource("services").
        Name("dashboard-metrics-scraper").
        SubResource("proxy").
        Suffix("/healthz").
        DoRaw(context.TODO())
    return err
}
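
For orientation, the request chain in Get and HealthCheck above goes through the API server's service proxy, so the service name has to match exactly. Here is a minimal sketch of the path that chain resolves to, assuming the standard core-API layout (/api/v1/namespaces/<namespace>/services/<name>/proxy/<suffix>). serviceProxyPath is a hypothetical helper written for illustration; it is not part of the Dashboard code, and the kubernetes-dashboard namespace is taken from the args shown later in this thread.

```go
package main

import (
	"fmt"
	"path"
)

// serviceProxyPath is a hypothetical helper (not Dashboard code). It builds the
// API server path that a Namespace/Resource("services")/Name/SubResource("proxy")/Suffix
// request chain is expected to resolve to for a core v1 Service.
func serviceProxyPath(namespace, service, suffix string) string {
	// path.Join cleans the result, so a leading "/" on the suffix is harmless.
	return path.Join("/api/v1/namespaces", namespace, "services", service, "proxy", suffix)
}

func main() {
	// The service name is hard-coded in the snippet above, so a chart that
	// names the Service differently would make this path point at nothing.
	fmt.Println(serviceProxyPath("kubernetes-dashboard", "dashboard-metrics-scraper", "/healthz"))
}
```

Requesting the printed path with kubectl get --raw is one way to check by hand whether the API server can reach the scraper under that exact service name.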

For whatever reason, the sidecar client that connects to the scraper could not pass the health check, so it was unavailable when the actual pod metric retrieval happened. In my case, metricClient was nil and the code errored out with "No metric client provided. Skipping metrics."

func (self *DataSelector) getMetrics(metricClient metricapi.MetricClient) (
    []metricapi.MetricPromises, error) {
    metricPromises := make([]metricapi.MetricPromises, 0)

    if metricClient == nil {
        return metricPromises, errors.NewInternal("No metric client provided. Skipping metrics.")
    }

    metricNames := self.DataSelectQuery.MetricQuery.MetricNames
    if metricNames == nil {
        return metricPromises, errors.NewInternal("No metrics specified. Skipping metrics.")
    }

    selectors := make([]metricapi.ResourceSelector, len(self.GenericDataList))
    for i, dataCell := range self.GenericDataList {
        // make sure data cells support metrics
        metricDataCell, ok := dataCell.(MetricDataCell)
        if !ok {
            log.Printf("Data cell does not implement MetricDataCell. Skipping. %v", dataCell)
            continue
        }

        selectors[i] = *metricDataCell.GetResourceSelector()
    }

    for _, metricName := range metricNames {
        promises := metricClient.DownloadMetric(selectors, metricName, self.CachedResources)
        metricPromises = append(metricPromises, promises)
    }

    return metricPromises, nil
}
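
The failure path described above can be sketched in isolation. This is a simplified model, not the Dashboard's actual wiring: healthChecker, stubClient, selectClient, and fetchMetrics are hypothetical names, and the only things taken from the thread are the nil check and the two error messages.

```go
package main

import (
	"errors"
	"fmt"
)

// healthChecker is a hypothetical stand-in for the metric client interface.
type healthChecker interface {
	HealthCheck() error
}

// stubClient simulates a sidecar client whose health check passes or fails.
type stubClient struct{ healthy bool }

func (s stubClient) HealthCheck() error {
	if s.healthy {
		return nil
	}
	return errors.New("the server is currently unable to handle the request")
}

// selectClient returns the client only if its health check passes; otherwise
// the caller is left with a nil client (assumed behavior, for illustration).
func selectClient(c healthChecker) healthChecker {
	if err := c.HealthCheck(); err != nil {
		return nil
	}
	return c
}

// fetchMetrics mirrors the nil guard at the top of getMetrics.
func fetchMetrics(c healthChecker) error {
	if c == nil {
		return errors.New("No metric client provided. Skipping metrics.")
	}
	return nil
}

func main() {
	fmt.Println(fetchMetrics(selectClient(stubClient{healthy: false})))
	fmt.Println(fetchMetrics(selectClient(stubClient{healthy: true})))
}
```

With a failing health check the first call reports the same "No metric client provided" error seen in the Dashboard logs, while a passing check returns no error.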

I simply used the --sidecar-host argument to give the Dashboard a service URL that points at the metrics-scraper:

args:
  - --auto-generate-certificates
  - --namespace=kubernetes-dashboard
  - --token-ttl=3600
  - --sidecar-host=http://dashboard-metrics-scraper:8000

All 13 comments

Did you add --kubelet-insecure-tls to your metrics-server values.yaml? Like this:

args: 
# enable this if you have self-signed certificates, see: https://github.com/kubernetes-incubator/metrics-server
 - --kubelet-insecure-tls

I've tried with and without --kubelet-insecure-tls - no difference.

It's worth pointing out that this works:

root@cn1:~# kubectl top nodes
NAME                     CPU(cores)   CPU%   MEMORY(bytes)   MEMORY%
mn1.kube-cluster.local   516m         6%     4494Mi          28%
mn2.kube-cluster.local   611m         7%     4456Mi          28%
mn3.kube-cluster.local   837m         10%    4911Mi          31%
wn1.kube-cluster.local   2328m        19%    22915Mi         35%
wn2.kube-cluster.local   1157m        9%     16994Mi         26%
wn3.kube-cluster.local   2773m        23%    20819Mi         32%
wn4.kube-cluster.local   3427m        28%    17151Mi         26%
root@cn1:~#

I'd say that it's either a faulty helm chart or a cluster config issue. Dashboard can't connect to the metrics scraper. Metrics server is working fine.

Could be the helm chart - I modified the original helm chart for v1. Any ideas where to start looking into why the dashboard can't connect to the scraper?

I'd just use our official deployment methods. We do not really support anything else. Debugging app to app connectivity is a different issue. Official K8S docs can help you debug that. It's not really a Dashboard issue.

/close

@floreks: Closing this issue.

In response to this:

/close

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

This could just be because of RBAC. https://github.com/kubernetes/dashboard/blob/master/docs/user/access-control/README.md#admin-privileges might help. For me, this was the issue.

For reference, it was to do with the name of the metrics-scraper service. I've got a working chart here now: https://github.com/funkypenguin/helm-kubernetes-dashboard

thanks! works for me!

Genius!

@jerryyanmj This helped me also. Thank you.
