I have configured my cluster metrics and I'm able to see them in the web console.
I don't use persistent storage for them, and I used auto-generated certificates:
$ oc secrets new metrics-deployer nothing=/dev/null
It looks fine: for every pod I have, I get metrics (via Heapster).
These are the logs of my Heapster pod:
Starting Heapster with the following arguments: --source=kubernetes:https://kubernetes.default.svc:443?useServiceAccount=true&kubeletHttps=true&kubeletPort=10250 --sink=hawkular:https://hawkular-metrics:443?tenant=_system&labelToTenant=pod_namespace&caCert=/hawkular-cert/hawkular-metrics-ca.certificate&user=hawkular&pass=uRXj1CFvyuQH_H8&filter=label(container_name:^/system.slice.*|^/user.slice) --logtostderr=true --tls_cert=/secrets/heapster.cert --tls_key=/secrets/heapster.key --tls_client_ca=/secrets/heapster.client-ca --allowed_users=system:master-proxy
I1209 07:16:00.166350 1 heapster.go:60] heapster --source=kubernetes:https://kubernetes.default.svc:443?useServiceAccount=true&kubeletHttps=true&kubeletPort=10250 --sink=hawkular:https://hawkular-metrics:443?tenant=_system&labelToTenant=pod_namespace&caCert=/hawkular-cert/hawkular-metrics-ca.certificate&user=hawkular&pass=uRXj1CFvyuQH_H8&filter=label(container_name:^/system.slice.*|^/user.slice) --logtostderr=true --tls_cert=/secrets/heapster.cert --tls_key=/secrets/heapster.key --tls_client_ca=/secrets/heapster.client-ca --allowed_users=system:master-proxy
I1209 07:16:00.171115 1 heapster.go:61] Heapster version 0.18.0
I1209 07:16:00.171713 1 kube_factory.go:168] Using Kubernetes client with master "https://kubernetes.default.svc:443" and version "v1"
I1209 07:16:00.171726 1 kube_factory.go:169] Using kubelet port 10250
I1209 07:16:00.172023 1 driver.go:491] Initialised Hawkular Sink with parameters {_system https://hawkular-metrics:443?tenant=_system&labelToTenant=pod_namespace&caCert=/hawkular-cert/hawkular-metrics-ca.certificate&user=hawkular&pass=uRXj1CFvyuQH_H8&filter=label(container_name:^/system.slice.*|^/user.slice) 0xc20817afc0 }
I1209 07:16:00.359077 1 heapster.go:71] Starting heapster on port 8082
I scaled my test project and it's now running 40 pods.
Now I want to create an autoscaler for my test project.
The .yaml looks like this:
apiVersion: extensions/v1beta1
kind: HorizontalPodAutoscaler
metadata:
  name: test-scaler
spec:
  scaleRef:
    kind: DeploymentConfig
    name: test # name of my dc
    apiVersion: v1
    subresource: scale
  minReplicas: 2
  maxReplicas: 30
  cpuUtilization:
    targetPercentage: 60
I know the autoscaler needs the cluster metrics, but those are working fine, so I would expect this to work. It doesn't:
[centos@autoscaler]$ oc get hpa
NAME          REFERENCE                     TARGET   CURRENT     MINPODS   MAXPODS   AGE
test-scaler   DeploymentConfig/test/scale   60%      <waiting>   2         30        13m
[centos@autoscaler]$ oc describe hpa test-scaler
Name: test-scaler
Namespace: test
Labels: <none>
CreationTimestamp: Tue, 08 Dec 2015 11:21:49 +0000
Reference: DeploymentConfig/test/scale
Target CPU utilization: 60%
Current CPU utilization: <not available>
Min replicas: 2
Max replicas: 30
Can you post your DeploymentConfig JSON/YAML? Additionally, if you have access to the logs, do you see anything there? There are a couple of issues that could be occurring (e.g. Origin might not be able to connect to Heapster, or there's some incorrect configuration on your DC -- you need to specify CPU requests on your pods for the HPA to work).
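The requested information can be gathered with commands along these lines (project and DC name `test` are assumed from the thread; adjust to your setup):

```shell
# Dump the DeploymentConfig, including the pod template's resources section
oc get dc test -n test -o yaml

# HPA failures are recorded as events in the project
oc get events -n test
```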
@DirectXMan12
A description of the pod is all I can give:
[centos@]$ oc describe pod heapster-bub0e
Name: heapster-bub0e
Namespace: openshift-infra
Image(s): openshift/origin-metrics-heapster:latest
Node: ip-10-0-0-xx.eu-west-1.compute.internal/10.0.0.xx
Start Time: Thu, 10 Dec 2015 07:24:39 +0000
Labels: metrics-infra=heapster,name=heapster
Status: Running
Reason:
Message:
IP: 10.1.1.6
Replication Controllers: heapster (1/1 replicas created)
Containers:
  heapster:
    Container ID: docker://7c9a01a0b4d1c502a770901e181c68f0c8cbee4a927cd453235ff28cbc920b01
    Image: openshift/origin-metrics-heapster:latest
    Image ID: docker://ef2c651384befe07342290c8f3a7b01c2fa0d7b4310500aa96dffd177c7e26b1
    QoS Tier:
      cpu: BestEffort
      memory: BestEffort
    State: Running
      Started: Thu, 10 Dec 2015 07:25:52 +0000
    Last Termination State: Terminated
      Reason: Error
      Exit Code: 255
      Started: Thu, 10 Dec 2015 07:25:30 +0000
      Finished: Thu, 10 Dec 2015 07:25:33 +0000
    Ready: True
    Restart Count: 2
    Environment Variables:
Conditions:
  Type    Status
  Ready   True
Volumes:
  heapster-secrets:
    Type: Secret (a secret that should populate this volume)
    SecretName: heapster-secrets
  hawkular-metrics-certificate:
    Type: Secret (a secret that should populate this volume)
    SecretName: hawkular-metrics-certificate
  hawkular-metrics-account:
    Type: Secret (a secret that should populate this volume)
    SecretName: hawkular-metrics-account
  heapster-token-pnlme:
    Type: Secret (a secret that should populate this volume)
    SecretName: heapster-token-pnlme
There is an 'error', but I think that's because I started my Origin server at that moment, so everything was recreated.
I used this template to create it all; I hope this helps:
apiVersion: "v1"
kind: "Template"
metadata:
  name: metrics-deployer-template
  annotations:
    description: "Template for deploying the required Metrics integration. Requires cluster-admin 'metrics-deployer' service account and 'metrics-deployer' secret."
    tags: "infrastructure"
labels:
  metrics-infra: deployer
  provider: openshift
  component: deployer
objects:
-
  apiVersion: v1
  kind: Pod
  metadata:
    generateName: metrics-deployer-
  spec:
    containers:
    - image: ${IMAGE_PREFIX}metrics-deployer:${IMAGE_VERSION}
      name: deployer
      volumeMounts:
      - name: secret
        mountPath: /secret
        readOnly: true
      - name: empty
        mountPath: /etc/deploy
      env:
        - name: PROJECT
          valueFrom:
            fieldRef:
              fieldPath: metadata.namespace
        - name: IMAGE_PREFIX
          value: ${IMAGE_PREFIX}
        - name: IMAGE_VERSION
          value: ${IMAGE_VERSION}
        - name: PUBLIC_MASTER_URL
          value: ${PUBLIC_MASTER_URL}
        - name: MASTER_URL
          value: ${MASTER_URL}
        - name: REDEPLOY
          value: ${REDEPLOY}
        - name: USE_PERSISTENT_STORAGE
          value: ${USE_PERSISTENT_STORAGE}
        - name: HAWKULAR_METRICS_HOSTNAME
          value: ${HAWKULAR_METRICS_HOSTNAME}
        - name: CASSANDRA_NODES
          value: ${CASSANDRA_NODES}
        - name: CASSANDRA_PV_SIZE
          value: ${CASSANDRA_PV_SIZE}
        - name: METRIC_DURATION
          value: ${METRIC_DURATION}
    dnsPolicy: ClusterFirst
    restartPolicy: Never
    serviceAccount: metrics-deployer
    volumes:
    - name: empty
      emptyDir: {}
    - name: secret
      secret:
        secretName: metrics-deployer
parameters:
-
  description: 'Specify prefix for metrics components; e.g. for "openshift/origin-metrics-deployer:v1.1", set prefix "openshift/origin-"'
  name: IMAGE_PREFIX
  value: "openshift/origin-"
-
  description: 'Specify version for metrics components; e.g. for "openshift/origin-metrics-deployer:v1.1", set version "v1.1"'
  name: IMAGE_VERSION
  value: "latest"
-
  description: "Internal URL for the master, for authentication retrieval"
  name: MASTER_URL
  value: "https://kubernetes.default.svc:443"
-
  description: "External hostname where clients will reach Hawkular Metrics"
  name: HAWKULAR_METRICS_HOSTNAME
  required: true
-
  description: "If set to true the deployer will try and delete all the existing components before trying to redeploy."
  name: REDEPLOY
  value: "false"
-
  description: "Set to true for persistent storage, set to false to use non persistent storage"
  name: USE_PERSISTENT_STORAGE
  value: "true"
-
  description: "The number of Cassandra Nodes to deploy for the initial cluster"
  name: CASSANDRA_NODES
  value: "1"
-
  description: "The persistent volume size for each of the Cassandra nodes"
  name: CASSANDRA_PV_SIZE
  value: "1Gi"
-
  description: "How many days metrics should be stored for."
  name: METRIC_DURATION
  value: "7"
and to execute:
oc process -f metrics.yaml -v \
HAWKULAR_METRICS_HOSTNAME=hawkular-metrics.apps.example.com,USE_PERSISTENT_STORAGE=false \
| oc create -f -
It's working and showing up in the Metrics tab of my web console, but it's inaccessible to my autoscaler.
The logs of my Heapster pod look different than they did a few hours ago:
Starting Heapster with the following arguments: --source=kubernetes:https://kubernetes.default.svc:443?useServiceAccount=true&kubeletHttps=true&kubeletPort=10250 --sink=hawkular:https://hawkular-metrics:443?tenant=_system&labelToTenant=pod_namespace&caCert=/hawkular-cert/hawkular-metrics-ca.certificate&user=hawkular&pass=uRXj1CFvyuQH_H8&filter=label(container_name:^/system.slice.*|^/user.slice) --logtostderr=true --tls_cert=/secrets/heapster.cert --tls_key=/secrets/heapster.key --tls_client_ca=/secrets/heapster.client-ca --allowed_users=system:master-proxy
I1210 07:25:52.969486 1 heapster.go:60] heapster --source=kubernetes:https://kubernetes.default.svc:443?useServiceAccount=true&kubeletHttps=true&kubeletPort=10250 --sink=hawkular:https://hawkular-metrics:443?tenant=_system&labelToTenant=pod_namespace&caCert=/hawkular-cert/hawkular-metrics-ca.certificate&user=hawkular&pass=uRXj1CFvyuQH_H8&filter=label(container_name:^/system.slice.*|^/user.slice) --logtostderr=true --tls_cert=/secrets/heapster.cert --tls_key=/secrets/heapster.key --tls_client_ca=/secrets/heapster.client-ca --allowed_users=system:master-proxy
I1210 07:25:52.985274 1 heapster.go:61] Heapster version 0.18.0
I1210 07:25:52.985813 1 kube_factory.go:168] Using Kubernetes client with master "https://kubernetes.default.svc:443" and version "v1"
I1210 07:25:52.985835 1 kube_factory.go:169] Using kubelet port 10250
I1210 07:25:52.986153 1 driver.go:491] Initialised Hawkular Sink with parameters {_system https://hawkular-metrics:443?tenant=_system&labelToTenant=pod_namespace&caCert=/hawkular-cert/hawkular-metrics-ca.certificate&user=hawkular&pass=uRXj1CFvyuQH_H8&filter=label(container_name:^/system.slice.*|^/user.slice) 0xc2081946c0 }
I1210 07:25:53.165210 1 heapster.go:71] Starting heapster on port 8082
W1210 09:25:53.101503 1 reflector.go:224] /tmp/gopath/src/k8s.io/heapster/sources/pods.go:173: watch of *api.Pod ended with: 401: The event in requested index is outdated and cleared (the requested history has been cleared [47713/47026]) [48712]
2015/12/10 11:20:23 http: TLS handshake error from 10.1.1.1:54211: tls: first record does not look like a TLS handshake
Logs in my webconsole:
11:20:02 AM test-scaler HorizontalPodAutoscaler FailedGetMetrics failed to get CPU consumption and request: some pods do not have request for cpu (352 times in the last 2 hours, 55 minutes)
11:20:32 AM test-scaler HorizontalPodAutoscaler FailedComputeReplicas failed to get cpu utilization: failed to get CPU consumption and request: some pods do not have request for cpu (353 times in the last 2 hours, 56 minutes)
The steps I performed to set up Heapster are exactly the ones documented here: https://docs.openshift.org/latest/install_config/cluster_metrics.html#metrics-deployer
I literally followed them, with oc secrets new metrics-deployer nothing=/dev/null and persistent storage set to false. Everything works except my HPA.
Yeah, it looks like you're missing a CPU request on your pods (to confirm, I'd need to see the output of kubectl get dc $YOUR_DC -o yaml). In order to use the CPU autoscaling, you'll need to specify a CPU request under the resources section for your pod spec (CPU autoscaling is based on a percentage of the requested CPU: https://docs.openshift.org/latest/dev_guide/pod_autoscaling.html#hpa-supported-metrics). For example:
...
spec:
  containers:
  - image: nginx
    name: nginx
    resources:
      requests:
        cpu: 400m
...
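As an alternative to editing the DC YAML by hand, a strategic merge patch can add the request in place (a sketch with hypothetical names: a dc `test` whose container is also named `test`; `oc patch` may not be available on very old clients):

```shell
oc patch dc/test -p '{"spec":{"template":{"spec":{"containers":[{"name":"test","resources":{"requests":{"cpu":"400m"}}}]}}}}'
```

The patch merges the containers list by the `name` field, so the name must match the existing container exactly.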
Thanks, you were right. This fixed it.
Hi, I've been following this issue because I have the same problem with the same configuration, but the fix above doesn't solve it for me.
I don't use persistent storage, and I use auto-generated certificates.
Heapster is running in the openshift-infra project, while the pods and the HPA are running in a different project.
This is the hpa:
oc describe hpa frontend-scaler
Name: frontend-scaler
Namespace:
Labels:
CreationTimestamp: Fri, 11 Dec 2015 08:41:14 +0000
Reference: DeploymentConfig/jupyter-requests/scale
Target CPU utilization: 70%
Current CPU utilization:
Min replicas: 1
Max replicas: 3
Logs in web-console:
9:40:04 AM HorizontalPodAutoscaler frontend-scaler FailedComputeReplicas failed to get cpu utilization: failed to get CPU consumption and request: metrics obtained for 0/1 of pods
9:40:04 AM HorizontalPodAutoscaler frontend-scaler FailedGetMetrics failed to get CPU consumption and request: metrics obtained for 0/1 of pods
This is the output of kubectl get dc:
...
spec:
  containers:
  - image: .../openshift/jupyter-python
    imagePullPolicy: IfNotPresent
    name: jupyter-requests
    ports:
    - containerPort: 8000
      protocol: TCP
    resources:
      limits:
        cpu: 200m
        memory: 400Mi
      requests:
        cpu: 100m
        memory: 200Mi
...
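Since the CPU request is set here, `metrics obtained for 0/1 of pods` suggests the metrics themselves are missing for this pod rather than the DC being misconfigured. A first diagnostic step (along the lines shown earlier in the thread) is to check the Heapster logs for errors about that pod; the pod name below is a placeholder, so list the pods first:

```shell
oc get pods -n openshift-infra
oc logs heapster-xxxxx -n openshift-infra
```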
Thanks.