As I have no access to the autoscaler pod on GKE, is there any way I can view the logs?
I'm trying to diagnose why a cluster is not scaling down, even with a pod disruption config set.
Neither logs nor metrics from master components are available to users. To observe what CA is doing, you can look for events published by Cluster Autoscaler for pods and nodes (such as TriggeredScaleUp, NotTriggeredScaleUp, etc.) as well as the status configmap, which should display cluster and node pool health (including any scale-up backoffs or unregistered nodes).
Thank you! The configmap in question seems to be this: kubectl describe -n kube-system configmap cluster-autoscaler-status
Yup, that'd be it.
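For anyone following along, here is a minimal sketch of how the two sources mentioned above (the status configmap and the autoscaler's events) could be queried. The function names `ca_status` and `ca_events` are hypothetical helpers, not part of kubectl or Cluster Autoscaler. The sketch assumes you have read access to the kube-system namespace and that the configmap stores its text under the `status` data key.

```shell
# Hypothetical helper: print the status text from the autoscaler's configmap.
# Assumes the text is stored under the "status" data key.
ca_status() {
  kubectl get configmap cluster-autoscaler-status -n kube-system \
    -o jsonpath='{.data.status}'
}

# Hypothetical helper: list events with a given reason across all namespaces,
# for example TriggeredScaleUp or NotTriggeredScaleUp.
ca_events() {
  kubectl get events --all-namespaces --field-selector "reason=$1"
}
```

Calling `ca_events NotTriggeredScaleUp` lists pods for which the autoscaler decided not to scale up. Events are only retained for a limited time, so an empty result does not prove that nothing happened.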
Well that seems to show me that no node candidates are available to scale down. And I guess I can't find any additional information about that in this environment?
It's really difficult, if not impossible, to understand what blocks CA from scaling down the cluster without any logs from CA.
Short of access to the logs, what would be the preferred way to expose this information?
Access through Stackdriver would be the best solution, if possible?
It would be nice to be able to configure the cluster autoscaler's behavior through a configmap in the kube-system namespace that it looks for. Does this make sense, or is there an obvious issue with such an idea? I'm eager to learn :)
What kind of information would be useful, other than exposing logs?
Configuring CA through a configmap was a feature called "dynamic autoscaler"; it was removed in #851.
I'm not sure what is in the log, but I'm after information that explains why the node pool was scaled up, left the same, or scaled down.
I.e., I have this, and want to understand why it seems so suboptimal.

My cluster was stuck for a long time in 'updating/upgrading master', and the upgrade was triggered automatically. 'kubectl describe configmap cluster-autoscaler-status -n kube-system' does not display anything about the master.
It would be useful to have some way to see, for a given operation, how much is completed and how much is left.
@shailvipx That's a feature request to GKE in general. If the status configmap is not there it means that Cluster Autoscaler isn't running at all at the moment, so there is nothing it can do to provide status.
I know this is stale, but the logs are in Stackdriver now.
logName="projects/<PROJECT_ID>/logs/container.googleapis.com%2Fcluster-autoscaler-visibility" OR (
(resource.type="k8s_pod" OR resource.type="k8s_cluster") AND (
jsonPayload.source.component="cluster-autoscaler"
)
)
That should get you started.
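If you would rather run that query from a terminal than from the Logs Explorer, a rough sketch using `gcloud logging read` could look like the following. The helper names `ca_log_filter` and `ca_logs` are hypothetical, the limit of 20 entries is an arbitrary choice, and the filter is the query above collapsed onto one line with the project ID substituted in.

```shell
# Hypothetical helper: build the filter above for a given project ID.
# "%%" is printf's escape for a literal "%", so the output contains "%2F".
ca_log_filter() {
  printf 'logName="projects/%s/logs/container.googleapis.com%%2Fcluster-autoscaler-visibility" OR ((resource.type="k8s_pod" OR resource.type="k8s_cluster") AND jsonPayload.source.component="cluster-autoscaler")\n' "$1"
}

# Hypothetical helper: fetch the 20 most recent matching entries as JSON.
# Assumes gcloud is installed and authenticated for the project.
ca_logs() {
  gcloud logging read "$(ca_log_filter "$1")" \
    --project "$1" --limit 20 --format json
}
```

Usage would be `ca_logs <PROJECT_ID>`. By default `gcloud logging read` only looks back over a recent window, so older entries may need the `--freshness` flag.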
Note that those are not raw CA logs, but an aggregated summary. It's documented in: https://cloud.google.com/kubernetes-engine/docs/how-to/cluster-autoscaler-visibility