Autoscaler: Make cluster-autoscaler log info logs to stdout and errors to stderr

Created on 12 Sep 2019 · 11 Comments · Source: kubernetes/autoscaler

Hi,
I'm using cluster-autoscaler on EKS with these args:

    spec:
      containers:
      - command:
        - ./cluster-autoscaler
        - --cloud-provider=aws
        - --namespace=infra
        - --node-group-auto-discovery=asg:tag=k8s.io/cluster-autoscaler/enabled,k8s.io/cluster-autoscaler/production
        - --logtostderr=true
        - --stderrthreshold=ERROR
        - --v=4

I want to log everything with severity < ERROR to stdout, >= ERROR to stdout, as is common practice with most containerized workloads.

By default, cluster-autoscaler logs everything to stderr (including warning, info, and debug logs) with the flag --logtostderr=true.

How do I set it up for the desired logging? Is it documented anywhere? Is it using the google/glog library?

Thanks!

All 11 comments

👍

Issues go stale after 90d of inactivity.
Mark the issue as fresh with /remove-lifecycle stale.
Stale issues rot after an additional 30d of inactivity and eventually close.

If this issue is safe to close now please do so with /close.

Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
/lifecycle stale

/remove-lifecycle stale

This would be a helpful feature.

I'm assuming OP meant:

I want to log everything with severity < ERROR to stdout, >= ERROR to ~stdout~ stderr, as is common practice with most containerized workloads.

We're having the same issue, which results in an overload of logging on stderr, such as:

I0320 12:19:49.401822       1 static_autoscaler.go:187] Starting main loop
I0320 12:19:49.402396       1 utils.go:622] No pod using affinity / antiaffinity found in cluster, disabling affinity predicate for this loop
I0320 12:19:49.402413       1 filter_out_schedulable.go:63] Filtering out schedulables
I0320 12:19:49.402485       1 filter_out_schedulable.go:80] No schedulable pods
I0320 12:19:49.402503       1 static_autoscaler.go:334] No unschedulable pods
I0320 12:19:49.402514       1 static_autoscaler.go:381] Calculating unneeded nodes
I0320 12:19:49.402527       1 utils.go:579] Skipping ip-10-12-7-13.eu-west-1.compute.internal - node group min size reached
I0320 12:19:49.402536       1 utils.go:579] Skipping ip-10-12-28-58.eu-west-1.compute.internal - node group min size reached
I0320 12:19:49.402546       1 utils.go:579] Skipping ip-10-12-45-1.eu-west-1.compute.internal - node group min size reached
I0320 12:19:49.402553       1 utils.go:579] Skipping ip-10-12-60-44.eu-west-1.compute.internal - node group min size reached
I0320 12:19:49.402675       1 static_autoscaler.go:410] Scale down status: unneededOnly=false lastScaleUpTime=2020-03-18 11:20:27.701663087 +0000 UTC m=+20.516289144 lastScaleDownDeleteTime=2020-03-18 11:20:27.701663185 +0000 UTC m=+20.516289243 lastScaleDownFailTime=2020-03-18 11:20:27.701663277 +0000 UTC m=+20.516289334 scaleDownForbidden=false isDeleteInProgress=false
I0320 12:19:49.402702       1 static_autoscaler.go:420] Starting scale down
I0320 12:19:49.402740       1 scale_down.go:771] No candidates for scale down
I0320 12:19:59.415352       1 static_autoscaler.go:187] Starting main loop
I0320 12:19:59.488376       1 utils.go:622] No pod using affinity / antiaffinity found in cluster, disabling affinity predicate for this loop
I0320 12:19:59.488405       1 filter_out_schedulable.go:63] Filtering out schedulables
I0320 12:19:59.488504       1 filter_out_schedulable.go:80] No schedulable pods
I0320 12:19:59.488520       1 static_autoscaler.go:334] No unschedulable pods
I0320 12:19:59.488538       1 static_autoscaler.go:381] Calculating unneeded nodes
I0320 12:19:59.488553       1 utils.go:579] Skipping ip-10-12-45-1.eu-west-1.compute.internal - node group min size reached
I0320 12:19:59.488561       1 utils.go:579] Skipping ip-10-12-60-44.eu-west-1.compute.internal - node group min size reached
I0320 12:19:59.488569       1 utils.go:579] Skipping ip-10-12-7-13.eu-west-1.compute.internal - node group min size reached
I0320 12:19:59.488578       1 utils.go:579] Skipping ip-10-12-28-58.eu-west-1.compute.internal - node group min size reached
I0320 12:19:59.488787       1 static_autoscaler.go:410] Scale down status: unneededOnly=false lastScaleUpTime=2020-03-18 11:20:27.701663087 +0000 UTC m=+20.516289144 lastScaleDownDeleteTime=2020-03-18 11:20:27.701663185 +0000 UTC m=+20.516289243 lastScaleDownFailTime=2020-03-18 11:20:27.701663277 +0000 UTC m=+20.516289334 scaleDownForbidden=false isDeleteInProgress=false
I0320 12:19:59.488832       1 static_autoscaler.go:420] Starting scale down
I0320 12:19:59.488879       1 scale_down.go:771] No candidates for scale down
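
Every line in the sample above starts with `I`, the glog/klog severity letter for INFO (the others are `W`, `E`, and `F`), followed by the month and day. As a possible workaround until the logger can split streams itself, here is a minimal sketch of a filter that reads such lines and sends ERROR/FATAL lines to stderr and everything else to stdout. The names `isErrorLine` and `splitBySeverity` are hypothetical, and the sketch assumes the standard header format shown above; continuation lines of multi-line messages would end up on stdout.

```go
package main

import (
	"bufio"
	"fmt"
	"io"
	"os"
)

// isErrorLine reports whether line looks like a glog/klog header with
// severity ERROR or FATAL: the letter E or F followed by four digits (mmdd).
func isErrorLine(line string) bool {
	if len(line) < 5 || (line[0] != 'E' && line[0] != 'F') {
		return false
	}
	for i := 1; i < 5; i++ {
		if line[i] < '0' || line[i] > '9' {
			return false
		}
	}
	return true
}

// splitBySeverity copies each line from r to errOut when isErrorLine
// matches, and to out otherwise.
func splitBySeverity(r io.Reader, out, errOut io.Writer) error {
	sc := bufio.NewScanner(r)
	// Allow lines up to 1 MiB; the scale-down status lines are long.
	sc.Buffer(make([]byte, 0, 64*1024), 1024*1024)
	for sc.Scan() {
		dst := out
		if isErrorLine(sc.Text()) {
			dst = errOut
		}
		if _, err := fmt.Fprintln(dst, sc.Text()); err != nil {
			return err
		}
	}
	return sc.Err()
}

func main() {
	if err := splitBySeverity(os.Stdin, os.Stdout, os.Stderr); err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(1)
	}
}
```

To use something like this, the autoscaler's stderr would have to be piped into the filter, which in a container means wrapping the command in a shell or a small entrypoint.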

Any updates? BTW, I tried all of these in 1.15.6 and didn't see any difference:

--stderrthreshold=ERROR
--stderrthreshold=FATAL
--stderrthreshold=2
--stderrthreshold=3
--stderrthreshold=info
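
A possible explanation for why none of these values changed anything: as far as I understand the glog-style output logic, the threshold is only consulted when the logger is writing to files, and with --logtostderr=true every message goes to stderr regardless. There is also no stdout path at all in that logic. The sketch below is a simplified model of that decision, not the library's actual code; `Config` and `writesToStderr` are hypothetical names.

```go
package main

import "fmt"

// Severity mirrors the glog/klog ordering INFO < WARNING < ERROR < FATAL.
type Severity int

const (
	Info Severity = iota
	Warning
	Error
	Fatal
)

// Config holds the flags discussed in this thread (hypothetical type).
type Config struct {
	LogToStderr     bool
	AlsoLogToStderr bool
	StderrThreshold Severity
}

// writesToStderr models whether a message of severity s reaches stderr.
// Assumption: with LogToStderr set, the threshold is never consulted.
func writesToStderr(c Config, s Severity) bool {
	if c.LogToStderr {
		return true
	}
	// File-logging mode: stderr receives a copy only in these cases.
	return c.AlsoLogToStderr || s >= c.StderrThreshold
}

func main() {
	for _, th := range []Severity{Info, Warning, Error, Fatal} {
		c := Config{LogToStderr: true, StderrThreshold: th}
		fmt.Printf("threshold=%d info->stderr=%v\n", th, writesToStderr(c, Info))
	}
}
```

Under this model, an INFO message reaches stderr for every threshold value as long as --logtostderr=true, which matches what was observed here.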

/remove-lifecycle stale

This is still a valid issue IMO

/remove-lifecycle stale
If this isn't resolved yet, I'll bump it up - this would be very, very helpful.
Our CA is configured like this:
logtostderr: true
stderrthreshold: ERROR
v: 4
and we're getting ~100k logs per day.

According to cluster-autoscaler usage help:

--stderrthreshold severity: logs at or above this threshold go to stderr (default 2)
--v Level: number for the log level verbosity

So, try to use this (for example):
--logtostderr=true
--stderrthreshold=2
--v=2
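
Note that lowering --v reduces volume but does not change which stream the remaining lines go to. A minimal sketch of the verbosity gate, assuming the usual glog/klog rule that a message logged at V(level) is written only when level is at most the --v value (plain warning and error calls are not gated this way); `emitted` is a hypothetical name:

```go
package main

import "fmt"

// emitted models the V-level gate: a message logged at V(level) is written
// only when level <= v, where v is the value passed to --v.
func emitted(level, v int) bool {
	return level <= v
}

func main() {
	// Compare the suggested --v=2 with the --v=4 used earlier in the thread.
	for _, v := range []int{2, 4} {
		for level := 0; level <= 5; level++ {
			fmt.Printf("--v=%d V(%d) emitted=%v\n", v, level, emitted(level, v))
		}
	}
}
```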

Issues go stale after 90d of inactivity.
Mark the issue as fresh with /remove-lifecycle stale.
Stale issues rot after an additional 30d of inactivity and eventually close.

If this issue is safe to close now please do so with /close.

Send feedback to sig-contributor-experience at kubernetes/community.
/lifecycle stale
