Hi,
I'm using cluster-autoscaler on EKS with these args:
spec:
  containers:
  - command:
    - ./cluster-autoscaler
    - --cloud-provider=aws
    - --namespace=infra
    - --node-group-auto-discovery=asg:tag=k8s.io/cluster-autoscaler/enabled,k8s.io/cluster-autoscaler/production
    - --logtostderr=true
    - --stderrthreshold=ERROR
    - --v=4
I want to log everything with severity < ERROR to stdout and everything >= ERROR to stderr, as is common practice with most containerized workloads.
With the flag --logtostderr=true, cluster-autoscaler logs everything to stderr (including warning, info, and debug logs).
How do I set it up for the desired logging? Is it documented anywhere? Is it using the google/glog library?
Thanks!
Issues go stale after 90d of inactivity.
Mark the issue as fresh with /remove-lifecycle stale.
Stale issues rot after an additional 30d of inactivity and eventually close.
If this issue is safe to close now please do so with /close.
Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
/lifecycle stale
/remove-lifecycle stale
This would be a helpful feature.
I'm assuming OP meant:
I want to log everything with severity < ERROR to stdout, >= ERROR to ~stdout~ stderr, as is common practice with most containerized workloads.
We're having the same issue resulting in an overload of logging on stderr such as:
I0320 12:19:49.401822 1 static_autoscaler.go:187] Starting main loop
I0320 12:19:49.402396 1 utils.go:622] No pod using affinity / antiaffinity found in cluster, disabling affinity predicate for this loop
I0320 12:19:49.402413 1 filter_out_schedulable.go:63] Filtering out schedulables
I0320 12:19:49.402485 1 filter_out_schedulable.go:80] No schedulable pods
I0320 12:19:49.402503 1 static_autoscaler.go:334] No unschedulable pods
I0320 12:19:49.402514 1 static_autoscaler.go:381] Calculating unneeded nodes
I0320 12:19:49.402527 1 utils.go:579] Skipping ip-10-12-7-13.eu-west-1.compute.internal - node group min size reached
I0320 12:19:49.402536 1 utils.go:579] Skipping ip-10-12-28-58.eu-west-1.compute.internal - node group min size reached
I0320 12:19:49.402546 1 utils.go:579] Skipping ip-10-12-45-1.eu-west-1.compute.internal - node group min size reached
I0320 12:19:49.402553 1 utils.go:579] Skipping ip-10-12-60-44.eu-west-1.compute.internal - node group min size reached
I0320 12:19:49.402675 1 static_autoscaler.go:410] Scale down status: unneededOnly=false lastScaleUpTime=2020-03-18 11:20:27.701663087 +0000 UTC m=+20.516289144 lastScaleDownDeleteTime=2020-03-18 11:20:27.701663185 +0000 UTC m=+20.516289243 lastScaleDownFailTime=2020-03-18 11:20:27.701663277 +0000 UTC m=+20.516289334 scaleDownForbidden=false isDeleteInProgress=false
I0320 12:19:49.402702 1 static_autoscaler.go:420] Starting scale down
I0320 12:19:49.402740 1 scale_down.go:771] No candidates for scale down
I0320 12:19:59.415352 1 static_autoscaler.go:187] Starting main loop
I0320 12:19:59.488376 1 utils.go:622] No pod using affinity / antiaffinity found in cluster, disabling affinity predicate for this loop
I0320 12:19:59.488405 1 filter_out_schedulable.go:63] Filtering out schedulables
I0320 12:19:59.488504 1 filter_out_schedulable.go:80] No schedulable pods
I0320 12:19:59.488520 1 static_autoscaler.go:334] No unschedulable pods
I0320 12:19:59.488538 1 static_autoscaler.go:381] Calculating unneeded nodes
I0320 12:19:59.488553 1 utils.go:579] Skipping ip-10-12-45-1.eu-west-1.compute.internal - node group min size reached
I0320 12:19:59.488561 1 utils.go:579] Skipping ip-10-12-60-44.eu-west-1.compute.internal - node group min size reached
I0320 12:19:59.488569 1 utils.go:579] Skipping ip-10-12-7-13.eu-west-1.compute.internal - node group min size reached
I0320 12:19:59.488578 1 utils.go:579] Skipping ip-10-12-28-58.eu-west-1.compute.internal - node group min size reached
I0320 12:19:59.488787 1 static_autoscaler.go:410] Scale down status: unneededOnly=false lastScaleUpTime=2020-03-18 11:20:27.701663087 +0000 UTC m=+20.516289144 lastScaleDownDeleteTime=2020-03-18 11:20:27.701663185 +0000 UTC m=+20.516289243 lastScaleDownFailTime=2020-03-18 11:20:27.701663277 +0000 UTC m=+20.516289334 scaleDownForbidden=false isDeleteInProgress=false
I0320 12:19:59.488832 1 static_autoscaler.go:420] Starting scale down
I0320 12:19:59.488879 1 scale_down.go:771] No candidates for scale down
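One possible workaround, sketched below, is a small filter that routes lines by the severity letter at the start of each glog-style line (I, W, E, F, as in the sample above). This is only a sketch under assumptions: it is not part of cluster-autoscaler, the function names `isErrorLine` and `split` are made up for illustration, and continuation lines without a log header (such as stack traces) are treated as non-error output.

```go
package main

import (
	"bufio"
	"io"
	"os"
)

// isErrorLine reports whether a line looks like a glog-style ERROR or FATAL
// record: a leading 'E' or 'F' followed by the four-digit mmdd date.
func isErrorLine(line string) bool {
	if len(line) < 5 || (line[0] != 'E' && line[0] != 'F') {
		return false
	}
	for _, c := range line[1:5] {
		if c < '0' || c > '9' {
			return false
		}
	}
	return true
}

// split copies each input line to errOut when it looks like an ERROR/FATAL
// record, and to out otherwise.
func split(in io.Reader, out, errOut io.Writer) error {
	sc := bufio.NewScanner(in)
	sc.Buffer(make([]byte, 0, 64*1024), 1024*1024) // tolerate long status lines
	for sc.Scan() {
		dst := out
		if isErrorLine(sc.Text()) {
			dst = errOut
		}
		if _, err := io.WriteString(dst, sc.Text()+"\n"); err != nil {
			return err
		}
	}
	return sc.Err()
}

func main() {
	if err := split(os.Stdin, os.Stdout, os.Stderr); err != nil {
		os.Exit(1)
	}
}
```

If built as a hypothetical `logsplit` binary, it would be used by piping the autoscaler's stderr into it (for example `./cluster-autoscaler ... 2>&1 | logsplit`), which assumes the container image has a shell available; that may not be the case for the published image.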
Any updates? BTW, tried all these in 1.15.6 and didn't see differences:
--stderrthreshold=ERROR
--stderrthreshold=FATAL
--stderrthreshold=2
--stderrthreshold=3
--stderrthreshold=info
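A likely explanation for seeing no difference: as far as I understand glog-style loggers (and the klog fork Kubernetes components use), --stderrthreshold is only consulted when --logtostderr is false. With --logtostderr=true every record goes to stderr regardless of severity. The sketch below models that decision; it is my reading of the behaviour, not the library's actual source, and the names `Severity` and `writesToStderr` are illustrative.

```go
package main

import "fmt"

// Severity mirrors the glog numbering: INFO=0, WARNING=1, ERROR=2, FATAL=3.
type Severity int

const (
	Info Severity = iota
	Warning
	Error
	Fatal
)

// writesToStderr models whether a record of severity s is written to stderr.
// When logToStderr is set, the threshold is never consulted.
func writesToStderr(s Severity, logToStderr, alsoLogToStderr bool, threshold Severity) bool {
	if logToStderr {
		return true
	}
	return alsoLogToStderr || s >= threshold
}

func main() {
	// With logToStderr=true, an INFO record reaches stderr for every threshold.
	for _, th := range []Severity{Info, Warning, Error, Fatal} {
		fmt.Printf("threshold=%d info->stderr=%t\n", th, writesToStderr(Info, true, false, th))
	}
}
```

Note that in this model nothing is ever written to stdout: with --logtostderr=false the remaining output goes to log files, so the threshold alone would not give the stdout/stderr split asked for above.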
/remove-lifecycle stale
This is still a valid issue IMO
/remove-lifecycle stale
If this isn't resolved yet, I'll bump it up - this would be very, very helpful.
Our CA is configured like this:
logtostderr: true
stderrthreshold: ERROR
v: 4
and we're getting ~100k logs per day.
According to cluster-autoscaler usage help:
--stderrthreshold severity: logs at or above this threshold go to stderr (default 2)
--v Level: number for the log level verbosity
So try this, for example:
--logtostderr=true
--stderrthreshold=2
--v=2
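The reason lowering --v helps with volume: in glog-style logging, verbose messages are all INFO severity (the lines starting with "I"), and a message logged at verbosity level n is emitted only when n is at most the --v value. The sketch below models just that gate; the function name `enabled` is made up for illustration, and which verbosity level cluster-autoscaler uses for any particular message is not something this thread establishes.

```go
package main

import "fmt"

// enabled models glog-style verbosity gating: a message logged at the given
// level is emitted only when level <= v, where v is the --v flag value.
func enabled(level, v int) bool {
	return level <= v
}

func main() {
	for _, v := range []int{2, 4} {
		fmt.Printf("--v=%d: level 2 emitted=%t, level 4 emitted=%t\n",
			v, enabled(2, v), enabled(4, v))
	}
}
```

So --v reduces how many INFO lines are produced in the first place, while --stderrthreshold only concerns where records are written.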
Issues go stale after 90d of inactivity.
Mark the issue as fresh with /remove-lifecycle stale.
Stale issues rot after an additional 30d of inactivity and eventually close.
If this issue is safe to close now please do so with /close.
Send feedback to sig-contributor-experience at kubernetes/community.
/lifecycle stale