image: docker.io/timberio/vector:0.9.0-alpine

oc --context build01 get cm -n api-audit-log vector-audit-log-config -o yaml

apiVersion: v1
data:
  vector.toml: |-
    [sources.kube_apiserver_audit_logs]
    type = "file"
    include = ["/host/var/log/kube-apiserver/audit.log"]
    ignore_older = 86400

    [sources.openshift_apiserver_audit_logs]
    type = "file"
    include = ["/host/var/log/openshift-apiserver/audit.log"]
    ignore_older = 86400

    [sinks.aws_cloudwatch_logs]
    type = "aws_cloudwatch_logs"
    inputs = ["kube_apiserver_audit_logs", "openshift_apiserver_audit_logs"]
    group_name = "ci-build01-audit-logs"
    region = "us-east-1"
    stream_name = "{{ host }}"
    encoding = "json"
    batch.max_size = 500000
kind: ConfigMap
Apr 21 22:37:25.150 ERROR sink{name=aws_cloudwatch_logs type=aws_cloudwatch_logs}:request{request_id=1}: vector::sinks::util::retries: encountered non-retriable error. error=CloudwatchError::Put: Upload too large: 1048914 bytes exceeds limit of 1048576
Apr 21 22:37:25.150 ERROR sink{name=aws_cloudwatch_logs type=aws_cloudwatch_logs}: vector::sinks::util::sink: Request failed. error=CloudwatchError::Put: Upload too large: 1048914 bytes exceeds limit of 1048576
Apr 21 22:37:25.150 ERROR sink{name=aws_cloudwatch_logs type=aws_cloudwatch_logs}: vector::sinks::aws_cloudwatch_logs: Fatal cloudwatchlogs sink error: CloudwatchError::Put: Upload too large: 1048914 bytes exceeds limit of 1048576
Apr 21 22:37:25.151 ERROR vector::topology: Unhandled error
Apr 21 22:37:25.151 INFO vector: Shutting down.
It seems batch.max_size = 500000 did not take effect.
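For context (an editorial note, not part of the original report): the 1,048,576-byte limit in the error is CloudWatch Logs' documented maximum PutLogEvents batch size, where each event counts its UTF-8 message length plus a 26-byte per-event overhead. A minimal sketch of that accounting (not Vector's actual code):

```python
# CloudWatch Logs PutLogEvents caps a batch at 1,048,576 bytes, counting
# each event's UTF-8 message length plus a documented 26-byte overhead.
CLOUDWATCH_BATCH_LIMIT = 1_048_576  # bytes, per AWS documentation
PER_EVENT_OVERHEAD = 26             # bytes charged per event

def batch_size(messages):
    """Return the size CloudWatch charges against the batch limit."""
    return sum(len(m.encode("utf-8")) + PER_EVENT_OVERHEAD for m in messages)

def fits(messages):
    """True if the batch stays within the PutLogEvents limit."""
    return batch_size(messages) <= CLOUDWATCH_BATCH_LIMIT

# The failing request in the log above was 1048914 bytes, i.e. 338 bytes over:
print(1048914 - CLOUDWATCH_BATCH_LIMIT)  # 338
```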
Thanks for reporting, we'll prioritize this.
With docker.io/timberio/vector:0.8.2-alpine, we still saw:
Apr 21 22:17:07.623 ERROR sink{name=aws_cloudwatch_logs type=aws_cloudwatch_logs}: vector::sinks::util: request failed. error=CloudwatchError::Put: Upload too large: 1048631 bytes exceeds limit of 1048576
The pod kept Running, and I also checked CloudWatch: the logs are there. This differs from 0.9.0, where the pod goes into CrashLoopBackOff.
@hongkailiu using 0.9.x, can you set batch.max_events to a lower value and remove the max_size setting? From what I can tell, that option is from an older version and is telling the batcher to batch 500,000 events, so I would suggest turning that number down. I also have a fix for the docs.
On a side note I realized we could probably be smarter about batching so I opened this issue https://github.com/timberio/vector/issues/2396.
@LucioFranco
Thanks for the quick reply.
We tried batch.max_events = 500, which seems to work.
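Assembled from the values earlier in this thread, the sink section that worked would look roughly like this (a sketch, not a verified config; the source sections are unchanged):

```toml
[sinks.aws_cloudwatch_logs]
type = "aws_cloudwatch_logs"
inputs = ["kube_apiserver_audit_logs", "openshift_apiserver_audit_logs"]
group_name = "ci-build01-audit-logs"
region = "us-east-1"
stream_name = "{{ host }}"
encoding = "json"
# batch.max_size removed; on 0.9.x we cap the batch by event count instead
batch.max_events = 500
```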
Subscribed to https://github.com/timberio/vector/issues/2396 for a better solution in the future.
This should be fixed by #2916. If this is not the case, please re-open.
@bruceg thanks for fixing this! Do you know when the next release is planned so we can test this?
@alvaroaleman It is being worked on as we speak.