Describe the bug
I'm trying to push a chunk of logs from Fluentd to Loki. Fluentd puts the logs in a buffer, and on buffer flush Loki refuses to receive them.
Is there some chunk size limitation in Loki? Is that size adjustable?
To Reproduce
Send a chunk larger than 4194304 bytes (4 MiB).
Expected behavior
Loki receives all messages.
Environment:
Screenshots, Promtail config, or terminal output
Loki logs:
2020-06-27T07:04:34.242790819Z level=warn ts=2020-06-27T07:04:34.234793259Z caller=logging.go:49 traceID=64131507cd842576 msg="POST /loki/api/v1/push (500) 169.780175ms Response: \"rpc error: code = ResourceExhausted desc = grpc: received message larger than max (5229930 vs. 4194304)\\n\" ws: false; Accept: */*; Accept-Encoding: gzip;q=1.0,deflate;q=0.6,identity;q=0.3; Content-Length: 5537209; Content-Type: application/json; User-Agent: Ruby; "
2020-06-27T07:04:36.914846213Z level=warn ts=2020-06-27T07:04:36.914619292Z caller=logging.go:49 traceID=7390203342d5661a msg="POST /loki/api/v1/push (500) 170.021947ms Response: \"rpc error: code = ResourceExhausted desc = grpc: received message larger than max (5229930 vs. 4194304)\\n\" ws: false; Accept: */*; Accept-Encoding: gzip;q=1.0,deflate;q=0.6,identity;q=0.3; Content-Length: 5537209; Content-Type: application/json; User-Agent: Ruby; "
2020-06-27T07:04:39.907138126Z level=warn ts=2020-06-27T07:04:39.906926876Z caller=logging.go:49 traceID=2779ac80c54cbfa7 msg="POST /loki/api/v1/push (500) 249.833406ms Response: \"rpc error: code = ResourceExhausted desc = grpc: received message larger than max (5229930 vs. 4194304)\\n\" ws: false; Accept: */*; Accept-Encoding: gzip;q=1.0,deflate;q=0.6,identity;q=0.3; Content-Length: 5537209; Content-Type: application/json; User-Agent: Ruby; "
Loki config
auth_enabled: false
chunk_store_config:
  max_look_back_period: 0s
ingester:
  chunk_block_size: 262144
  chunk_idle_period: 3m
  chunk_retain_period: 1m
  lifecycler:
    ring:
      kvstore:
        store: inmemory
      replication_factor: 1
  max_transfer_retries: 0
limits_config:
  enforce_metric_name: false
  reject_old_samples: true
  reject_old_samples_max_age: 168h
schema_config:
  configs:
  - from: "2018-04-15"
    index:
      period: 168h
      prefix: index_
    object_store: filesystem
    schema: v9
    store: boltdb
server:
  http_listen_port: 3100
storage_config:
  boltdb:
    directory: /data/loki/index
  filesystem:
    directory: /data/loki/chunks
table_manager:
  retention_deletes_enabled: true
  retention_period: 336h
You could try increasing those limits; see https://github.com/grafana/loki/tree/master/docs/configuration#grpc_client_config and https://github.com/grafana/loki/tree/master/docs/configuration#server_config
@cyriltovena Thank you!
Are there any recommendations for the value of grpc_server_max_recv_msg_size, or can I put whatever I want here? Could changing it affect other parts of Loki, so that they need to be adjusted too?
You should be fine increasing this. Alternatively you could look at controlling the buffer sizes that fluent uses before flushing and aligning it to this config. You'll want to make sure that the send/receive max sizes are aligned across the grpc_client_config and the server_config.
We inherit some of these from our upstream dependency Cortex, but it looks like they don't align automatically.
@cyriltovena @slim-bean WDYT, should we align these defaults or PR Cortex to do so?
@owen-d
When Fluentd is the log shipper to Loki, as I understand it Fluentd is always the gRPC client and Loki is the server. The same goes for the Grafana-Loki chain (Loki is the server). Am I right?
So I wonder why we need to adjust grpc_client_config and not just server_config.
Also, I've checked the defaults:
# The maximum size in bytes the client can receive
[max_recv_msg_size: <int> | default = 104857600]
# The maximum size in bytes the client can send
[max_send_msg_size: <int> | default = 16777216]
---
# Max gRPC message size that can be received
[grpc_server_max_recv_msg_size: <int> | default = 4194304]
# Max gRPC message size that can be sent
[grpc_server_max_send_msg_size: <int> | default = 4194304]
By aligning, do you mean making them equal? I didn't really get this...
Thank you!
We vendor another project (Cortex); that is what Owen was talking about.
Fluentd actually uses HTTP, but that server is also used internally between components running in the same process, so yes, I recommend changing them all.
As long as Fluentd isn't sending payloads larger than the server's grpc_server_max_recv_msg_size, it should be fine. If you still see similar errors, I'd make sure that the client/server sizes are equal (what I meant by aligned). This is because, under the hood, Loki's separate components talk to each other within the same process via gRPC (I doubt you'd see this problem though).
Good luck!
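As a rough illustration of the alignment described above, the fragment below sets the same limit on the server side and on the internal ingester client. It is only a sketch: the 8388608 (8 MiB) value is an arbitrary example, and the placement of grpc_client_config under ingester_client is an assumption based on the linked configuration docs, so check it against the reference for your Loki version.

```yaml
# Sketch only: example values, not a recommendation.
server:
  http_listen_port: 3100
  # Should be at least as large as the biggest push the server is expected to accept.
  grpc_server_max_recv_msg_size: 8388608  # 8 MiB
  grpc_server_max_send_msg_size: 8388608  # 8 MiB

# Assumed location of grpc_client_config (used for internal component-to-component calls).
ingester_client:
  grpc_client_config:
    max_recv_msg_size: 8388608  # kept equal to the server's send limit
    max_send_msg_size: 8388608  # kept equal to the server's receive limit
```

With equal values on both sides, a message that the internal client is allowed to send is never larger than what the server is allowed to receive.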
Thank you guys!
Your explanations are much appreciated.
For now I've increased grpc_server_max_recv_msg_size and grpc_server_max_send_msg_size to 8 MB and these errors are gone. We'll see...
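For readers looking for the shape of that change, a minimal sketch follows, assuming the two settings sit under the server block as in the defaults quoted earlier. This is not the poster's actual file; 8388608 is simply 8 × 1024 × 1024 bytes.

```yaml
# Hypothetical fragment: only the server block changes, the rest of the config stays as posted.
server:
  http_listen_port: 3100
  grpc_server_max_recv_msg_size: 8388608  # 8 MiB, up from the 4194304 default
  grpc_server_max_send_msg_size: 8388608  # 8 MiB, up from the 4194304 default
```

The rejected message in the logs above was 5229930 bytes, so an 8 MiB limit leaves some headroom, but a larger Fluentd buffer flush could still exceed it.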
@nomatterz Could you please share your updated config files for both Loki and Fluentd? I am facing the same issue. Thanks a lot.