On a GKE cluster that was recently upgraded to 1.16.0-gke.20. Activation works, but the HPA never kicks in. It shows 0 current replicas even when the deployment is at 1. I think the way KEDA activates may be bumping into the HPA changes in 1.16 that allow external metrics to drive scaling all the way down to zero. Not sure.
https://github.com/kubernetes/kubernetes/blob/master/CHANGELOG-1.16.md#other-notable-changes-11
To clarify, this was working on 1.15.5. In the last week GKE upgraded the cluster to 1.16, at which point none of my deployments will scale past 1, and the HPA never seems to kick in and start reading external metrics.
// cc @Aarthisk
@jeffhollan do you have an output from the HPA and metrics server?
The KEDA metrics server just has the standard few lines of startup output. I don't see any errors or info. Not sure if switching to debug logs (v=1) will show more? I couldn't figure out how to get logs from the HPA itself other than kubectl describe. Not sure if you have any tips on how I can get more helpful logs.
okay, so the metrics server seems to be ok.
As for the HPA, kubectl describe or kubectl get -o yaml should give you some output. It might be worth checking this while scaling is happening. There you could see messages about getting (or not getting) the metrics, so we could rule out problems with fetching the metrics.
Hope this helps. I've been using 1.16.3 and I can see the HPA working with KEDA. Scaling to zero and back up is working for me. This is with microk8s, not GKE.
Thanks @balchua - hard to know if this is a bug with 1.16.0, the GKE rapid channel, or something else. I did repro this on a clean cluster on 1.16.0-gke.20, though. When I start pushing load I see the metrics are working fine. I compared the APIs to a cluster on 1.15 in GKE that was working fine and everything is identical, so I'm not sure why it's having an issue.
k get --raw /apis/external.metrics.k8s.io/v1beta1/namespaces/functions/queueLength
{
  "kind":"ExternalMetricValueList",
  "apiVersion":"external.metrics.k8s.io/v1beta1",
  "metadata":{
    "selfLink":"/apis/external.metrics.k8s.io/v1beta1/namespaces/functions/queueLength"
  },
  "items":[
    {
      "metricName":"queueLength",
      "metricLabels":null,
      "timestamp":"2019-12-17T05:42:07Z",
      "value":"281"
    }
  ]
}
But the HPA doesn't seem to see the custom metrics or get updated with the current replica count (KEDA has created 1 replica when I ran this, so current should be 1).
k get --raw /apis/autoscaling/v2beta2/namespaces/functions/horizontalpodautoscalers/keda-hpa-azure-function
{
  "kind":"HorizontalPodAutoscaler",
  "apiVersion":"autoscaling/v2beta2",
  "metadata":{
    "name":"keda-hpa-azure-function",
    "namespace":"functions",
    "selfLink":"/apis/autoscaling/v2beta2/namespaces/functions/horizontalpodautoscalers/keda-hpa-azure-function",
    "uid":"3babf67b-f748-4bde-8f07-701ce179dcad",
    "resourceVersion":"1542",
    "creationTimestamp":"2019-12-17T05:40:49Z",
    "ownerReferences":[
      {
        "apiVersion":"keda.k8s.io/v1alpha1",
        "kind":"ScaledObject",
        "name":"azure-function",
        "uid":"4a5bc2f1-123a-4ee9-9a04-641b87093bf9",
        "controller":true,
        "blockOwnerDeletion":true
      }
    ]
  },
  "spec":{
    "scaleTargetRef":{
      "kind":"Deployment",
      "name":"azure-function",
      "apiVersion":"apps/v1"
    },
    "minReplicas":1,
    "maxReplicas":100,
    "metrics":[
      {
        "type":"External",
        "external":{
          "metric":{
            "name":"queueLength",
            "selector":{
              "matchLabels":{
                "deploymentName":"azure-function"
              }
            }
          },
          "target":{
            "type":"AverageValue",
            "averageValue":"5"
          }
        }
      }
    ]
  },
  "status":{
    "currentReplicas":0,
    "desiredReplicas":0,
    "currentMetrics":null,
    "conditions":null
  }
}
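To make the mismatch in the two outputs above concrete, here is a minimal Python sketch. It assumes the usual HPA behaviour for an External metric with an AverageValue target, which is roughly ceil(metric value / averageValue) clamped to minReplicas and maxReplicas; tolerance and stabilization windows are ignored, and quantities are assumed to be plain integers (no suffixes like "500m"). The helper names expected_replicas and status_is_unpopulated are made up for illustration and are not part of KEDA or Kubernetes.

```python
import json
import math


def expected_replicas(metric_value: float, target_average: float,
                      min_replicas: int, max_replicas: int) -> int:
    """Approximate the HPA result for an External metric with an AverageValue target."""
    desired = math.ceil(metric_value / target_average)
    return max(min_replicas, min(max_replicas, desired))


def status_is_unpopulated(hpa: dict) -> bool:
    """True when the HPA status carries neither conditions nor current metrics."""
    status = hpa.get("status", {})
    return not status.get("conditions") and not status.get("currentMetrics")


# Trimmed copy of the HPA object shown above.
hpa = json.loads("""
{
  "spec": {
    "minReplicas": 1,
    "maxReplicas": 100,
    "metrics": [
      {"type": "External",
       "external": {"metric": {"name": "queueLength"},
                    "target": {"type": "AverageValue", "averageValue": "5"}}}
    ]
  },
  "status": {"currentReplicas": 0, "desiredReplicas": 0,
             "currentMetrics": null, "conditions": null}
}
""")

queue_length = 281  # value returned by the external metrics API above
target = float(hpa["spec"]["metrics"][0]["external"]["target"]["averageValue"])

print(expected_replicas(queue_length, target,
                        hpa["spec"]["minReplicas"], hpa["spec"]["maxReplicas"]))
print(status_is_unpopulated(hpa))
```

Under those assumptions this prints 57 and True: a queue length of 281 with a target of 5 would call for about 57 replicas, while the empty conditions and currentMetrics suggest the HPA controller never wrote a status for this object at all.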
Did one last test, which was to create the ScaledObject with a minReplicaCount of 1 to see if it would be happier. Same issue: the HPA still reports currentReplicas of 0 even though the deployment clearly shows 1. I wonder if some other bug is going on here specific to GKE. Not sure if there is another provider with an easy way to spin up 1.16 to test.
Tested on DigitalOcean 1.16.2 and it worked fine.
We might want to document this somewhere @jeffhollan if you're up for it?
A "Known Issues" section or so, but also brings me back to https://github.com/kedacore/keda-docs/issues/49
For those who use GKE 1.16 and see this error when running kubectl describe hpa keda-hpa-[name]:
unable to fetch metrics from external metrics API: access_frequency.external.metrics.k8s.io is forbidden: User "system:vpa-recommender" cannot list resource "access_frequency" in API group "external.metrics.k8s.io" in the namespace "default": RBAC: clusterrole.rbac.authorization.k8s.io "external-metrics-reader" not found
You can fix it by creating a ClusterRole and ClusterRoleBinding as specified here:
https://raw.githubusercontent.com/GoogleCloudPlatform/k8s-stackdriver/master/custom-metrics-stackdriver-adapter/deploy/production/adapter_new_resource_model.yaml
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: external-metrics-reader
rules:
- apiGroups:
  - "external.metrics.k8s.io"
  resources:
  - "*"
  verbs:
  - list
  - get
  - watch
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: external-metrics-reader
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: external-metrics-reader
subjects:
- kind: ServiceAccount
  name: horizontal-pod-autoscaler
  namespace: kube-system
Hopefully this helps someone that is having the same problems we had
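As a sanity check on what the ClusterRole above grants, here is a rough Python sketch of RBAC rule matching. It is a simplification, not the real Kubernetes authorizer: it only handles exact matches and the "*" wildcard and ignores resourceNames, subresources, and non-resource URLs. The function names rule_allows and allowed are hypothetical.

```python
def rule_allows(rule: dict, api_group: str, resource: str, verb: str) -> bool:
    """Simplified RBAC match: an entry matches if it is "*" or equals the requested value."""
    def matches(allowed_values, wanted):
        return "*" in allowed_values or wanted in allowed_values

    return (matches(rule.get("apiGroups", []), api_group)
            and matches(rule.get("resources", []), resource)
            and matches(rule.get("verbs", []), verb))


# Rules copied from the external-metrics-reader ClusterRole above.
rules = [{
    "apiGroups": ["external.metrics.k8s.io"],
    "resources": ["*"],
    "verbs": ["list", "get", "watch"],
}]


def allowed(api_group: str, resource: str, verb: str) -> bool:
    """True if any rule in the ClusterRole permits the request."""
    return any(rule_allows(r, api_group, resource, verb) for r in rules)


print(allowed("external.metrics.k8s.io", "access_frequency", "list"))
print(allowed("external.metrics.k8s.io", "queueLength", "list"))
print(allowed("custom.metrics.k8s.io", "queueLength", "list"))
```

The first two checks pass because the wildcard covers every metric name in external.metrics.k8s.io, including the access_frequency resource from the error message; the third fails because the role does not cover any other API group.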
@Cerberussian strange, a similar ClusterRole and ClusterRoleBinding should be created during the installation of KEDA:
https://github.com/kedacore/keda/blob/master/deploy/20-metrics-cluster_role.yaml#L1-L16
https://github.com/kedacore/keda/blob/master/deploy/21-metrics-role_binding.yaml#L36-L51
Maybe keda-external-metrics-reader should be used then instead of external-metrics-reader? Is it possible KEDA creates keda-external-metrics-reader but never actually uses it, and instead uses external-metrics-reader?
The name is not important, the permissions are. Both ClusterRoles grant similar permissions. The question is why you need your own version of the ClusterRole and ClusterRoleBinding when the same objects are created during the installation. You should check whether the installation was successful and all objects were created.
I probably don't fully understand something, but the error clearly states the name of a cluster role it can't find:
RBAC: clusterrole.rbac.authorization.k8s.io "external-metrics-reader" not found
so I think the name is important.
I checked and KEDA's resources were created successfully and are in place if you do kubectl get clusterrole / clusterrolebinding. If the name were not important, I believe creating the same resource with only a different name would not have helped us solve the problem.
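One possible reading of that error, which is an assumption and not something confirmed in this thread, is that a binding already present on the cluster refers by name to a ClusterRole called external-metrics-reader that does not exist. Since a roleRef points at a role by name, that would explain why the name matters here even though the permissions are otherwise equivalent. A minimal Python sketch of how to look for such dangling references, given data shaped like the output of kubectl get clusterrole -o json and kubectl get clusterrolebinding -o json (the binding names below are made up):

```python
def dangling_role_refs(cluster_roles: dict, bindings: dict) -> list:
    """Return (binding name, missing ClusterRole name) pairs."""
    existing = {item["metadata"]["name"] for item in cluster_roles["items"]}
    missing = []
    for binding in bindings["items"]:
        ref = binding["roleRef"]
        if ref["kind"] == "ClusterRole" and ref["name"] not in existing:
            missing.append((binding["metadata"]["name"], ref["name"]))
    return missing


# Hypothetical sample data; only the two role names come from this thread.
cluster_roles = {"items": [
    {"metadata": {"name": "keda-external-metrics-reader"}},
]}
bindings = {"items": [
    {"metadata": {"name": "example-keda-binding"},
     "roleRef": {"kind": "ClusterRole", "name": "keda-external-metrics-reader"}},
    {"metadata": {"name": "example-preexisting-binding"},
     "roleRef": {"kind": "ClusterRole", "name": "external-metrics-reader"}},
]}

print(dangling_role_refs(cluster_roles, bindings))
```

With this sample data the only pair reported is the made-up example-preexisting-binding pointing at external-metrics-reader. Creating a ClusterRole with exactly that name would resolve such a reference, whereas a role with the same rules under a different name would not.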

@Cerberussian thanks for the snippet! It's missing apiVersion: rbac.authorization.k8s.io/v1 in the first line, took me some time to figure out why the role was not added :)
Good catch!
It was there, but due to a Markdown error I made it didn't show up.
Fixed it.
Happy to hear it helped you!
For posterity, this is a known issue with GKE: https://issuetracker.google.com/issues/160597676
@thefirstofthe300 thanks for the confirmation!
It might be worth adding this to the FAQ or somewhere else in the docs, WDYT @tomkerkhove ?
Sounds good to me - Either FAQ or Troubleshooting
Do you want to contribute this @thefirstofthe300?