Autoscaler: How can I access the logs on GKE?

Created on 15 Jun 2018 · 18Comments · Source: kubernetes/autoscaler

As I have no access to the autoscaler pod on GKE, is there any way I can view the logs?

I'm trying to diagnose why a cluster in not scaling down - even with the a pod disruption config set.

areprovidegcp cluster-autoscaler lifecyclrotten

Source

chrissound

👍15

Most helpful comment

It's really difficult if not impossible to understand what blocks CA from scaling down cluster without any logs from CA

kirgene on 17 Jun 2018

👍22

All 18 comments

Neither logs nor metrics from master components are available to the users. To observe what CA is doing, you can look for events published by Cluster Autoscaler for pods and nodes (such as TriggeredScaleUp, NotTriggeredScaleUp etc.) as well as configmap with status, which should display cluster and node pool health (including any scale-up backoffs or unregistered nodes.)

aleksandra-malinowska on 15 Jun 2018

Thank you! The configmap in question seems to be this: kubectl describe -n kube-system configmap cluster-autoscaler-status

chrissound on 15 Jun 2018

👍10

Yup, that'd be it.

aleksandra-malinowska on 15 Jun 2018

Well that seems to show me that no node candidates are available to scale down. And I guess I can't find any additional information about that in this environment?

chrissound on 15 Jun 2018

It's really difficult if not impossible to understand what blocks CA from scaling down cluster without any logs from CA

kirgene on 17 Jun 2018

👍22

Well that seems to show me that no node candidates are available to scale down. And I guess I can't find any additional information about that in this environment?

Short of access to the logs, what would be the preferred way to expose this information?

aleksandra-malinowska on 18 Jun 2018

Access through stack-driver would be the best solution if possible?

LarsOL on 25 Jun 2018

👍3

It would be nice to be able to configure the cluster-autoscalers behavior through a configmap in the kube-system namespace that it looks for. Does this make sense or is there an obvious issue with such an idea? I'm eager to learn :)

consideRatio on 26 Jun 2018

👍1

Access through stack-driver would be the best solution if possible?

What kind of information would be useful, other than exposing logs?

It would be nice to be able to configure the cluster-autoscalers behavior through a configmap in the kube-system namespace that it looks for. Does this make sense or is there an obvious issue with such an idea? I'm eager to learn :)

This feature was called "dynamic autoscaler" and was removed in #851.

aleksandra-malinowska on 27 Jun 2018

❤1

What kind of information would be useful, other than exposing logs?

I'm not sure what is in the log, but i'm after information that explains why the node pool was scaled up/same/down.

i.e I have this, and want to understand why it seems so suboptimal.

LarsOL on 28 Jun 2018

My cluster was stuck for the longest time with 'updating/upgrading master' and it was auto-triggered. 'kubectl describe configmap cluster-autoscaler-status -n kube-system' does not display anything about master.
Having some type of access that can tell for a given operation, how much is completed and how much is left can be useful.

shailvipx on 30 Aug 2018

@shailvipx That's a feature request to GKE in general. If the status configmap is not there it means that Cluster Autoscaler isn't running at all at the moment, so there is nothing it can do to provide status.

MaciekPytel on 30 Aug 2018

Issues go stale after 90d of inactivity.
Mark the issue as fresh with /remove-lifecycle stale.
Stale issues rot after an additional 30d of inactivity and eventually close.

If this issue is safe to close now please do so with /close.

Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
/lifecycle stale

fejta-bot on 28 Nov 2018

Stale issues rot after 30d of inactivity.
Mark the issue as fresh with /remove-lifecycle rotten.
Rotten issues close after an additional 30d of inactivity.

If this issue is safe to close now please do so with /close.

Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
/lifecycle rotten

fejta-bot on 28 Dec 2018

Rotten issues close after 30d of inactivity.
Reopen the issue with /reopen.
Mark the issue as fresh with /remove-lifecycle rotten.

Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
/close

fejta-bot on 27 Jan 2019

@fejta-bot: Closing this issue.

In response to this:

Rotten issues close after 30d of inactivity.
Reopen the issue with /reopen.
Mark the issue as fresh with /remove-lifecycle rotten.

Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
/close

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

k8s-ci-robot on 27 Jan 2019

I know this is stale, but logs are in stackdriver now.

logName="projects/<PROJECT_ID>/logs/container.googleapis.com%2Fcluster-autoscaler-visibility" OR (
(resource.type="k8s_pod" OR resource.type="k8s_cluster") AND (
jsonPayload.source.component="cluster-autoscaler"
)
)

should get you started