Hello, I am working on implementing a logging solution, and have a few questions that I could use help with. Here is some context:
I have searched through the docs, and I haven't found a solid answer on these questions:
Thank you for your help.
Hey @JoshFerge, thanks for asking. @ofrobots are you able to help with any of these questions?
Hi @JoshFerge.
The first thing to note is that the google-cloud/logging library already batches requests internally. Specifically, here's the relevant section of the config:
"WriteLogEntries": {
  "timeout_millis": 30000,
  "retry_codes_name": "non_idempotent",
  "retry_params_name": "default",
  "bundling": {
    "element_count_threshold": 1000,
    "request_byte_threshold": 1048576,
    "delay_threshold_millis": 50
  }
},
What this means is that, by default, the logging library will batch up to 1000 log entries, or up to 1MB of serialized log entry data, or up to a duration of 50 milliseconds – whichever happens first. This will already give you some bundling of your calls to log.write and, as a result, mitigate some of the performance and quota concerns.
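To make the "whichever happens first" rule concrete, here is a minimal sketch of a bundler driven by the same three thresholds. It is an illustration only, not the library's internal code: the `Bundler` class, the `send` callback, and the use of JSON length as a stand-in for serialized size are all assumptions made for the example.

```javascript
// Illustrative sketch only; this is NOT the library's internal implementation.
// Threshold values are copied from the config fragment above.
const THRESHOLDS = {
  elementCount: 1000,    // element_count_threshold
  requestBytes: 1048576, // request_byte_threshold (bytes)
  delayMillis: 50,       // delay_threshold_millis
};

class Bundler {
  // `send` is a hypothetical callback that receives an array of entries.
  constructor(send, thresholds = THRESHOLDS) {
    this.send = send;
    this.thresholds = thresholds;
    this.entries = [];
    this.bytes = 0;
    this.timer = null;
  }

  add(entry) {
    this.entries.push(entry);
    // Assumption: approximate the serialized size with the JSON encoding.
    this.bytes += Buffer.byteLength(JSON.stringify(entry));

    if (this.entries.length >= this.thresholds.elementCount ||
        this.bytes >= this.thresholds.requestBytes) {
      // Count or size threshold reached: send right away.
      this.flush();
    } else if (this.timer === null) {
      // Start the delay clock when the first entry of a bundle arrives.
      this.timer = setTimeout(() => this.flush(), this.thresholds.delayMillis);
    }
  }

  flush() {
    if (this.timer !== null) {
      clearTimeout(this.timer);
      this.timer = null;
    }
    if (this.entries.length === 0) return;
    const batch = this.entries;
    this.entries = [];
    this.bytes = 0;
    this.send(batch);
  }
}
```

With the default values, a burst of 1000 entries goes out as one request immediately, while a single isolated entry waits at most about 50 milliseconds before being sent on its own.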
On top of this, if you are using google-cloud/logging-bunyan as the library to interface with logging, you'll get an additional layer of buffering as the bunyan logging stream is a proper WritableStream, which will do some internal buffering as well.
What is the recommended amount of entries to batch before writing?
I think the default batching configuration should be a reasonable starting point for most applications.
Will gcloud-winston or gcloud-bunyan handle batching log entries?
The answer is yes, as per above. If the defaults are not adequate for the needs of your application, you could probably batch more in your own code to tweak the performance further.
Note, however, that there is a tradeoff here: by buffering more, you can probably get better performance, but you also increase the risk that more log entries will be lost if your application crashes before the buffer has been flushed to the network. The memory consumption of your app will increase as well.
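As a rough sketch of what batching in your own code could look like, the buffer below collects entries and hands them off in larger groups. Everything here is hypothetical: `EntryBuffer`, `writeBatch`, and the `maxEntries` default are names and values invented for the example, and `writeBatch` is assumed to be a function you supply (for instance, a thin wrapper around your logging client's write call) that takes an array of entries and returns a promise.

```javascript
// Hypothetical application-level buffer; not part of any logging library.
class EntryBuffer {
  // `writeBatch` is an assumed function: (entries: Array) => Promise.
  // `maxEntries` is an arbitrary illustrative default, not a recommendation.
  constructor(writeBatch, maxEntries = 5000) {
    this.writeBatch = writeBatch;
    this.maxEntries = maxEntries;
    this.pending = [];
  }

  // Queue one entry; flush automatically once the buffer is full.
  push(entry) {
    this.pending.push(entry);
    if (this.pending.length >= this.maxEntries) {
      return this.flush();
    }
    return Promise.resolve();
  }

  // Hand everything buffered so far to `writeBatch`.
  // Entries still in `pending` when the process dies are lost.
  flush() {
    if (this.pending.length === 0) return Promise.resolve();
    const batch = this.pending;
    this.pending = [];
    return Promise.resolve(this.writeBatch(batch));
  }
}
```

A larger `maxEntries` means fewer write calls but more entries held in memory and at risk on a crash. Calling `flush()` from a shutdown hook (for example a `SIGTERM` handler) narrows that window, but it does not cover hard crashes.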
Is there an example of batching log entries in a high throughput scenario?
You could probably follow the example used by this third-party module: bunyan-stackdriver. Their motivation was quota rather than performance.
Hope this helps. Let me know if you have additional questions.
Closing the issue as there is no bug to fix, but feel free to continue discussing.
@ofrobots thanks for the awesome and thorough response! If my understanding is correct then, the statement in the documentation under write (https://googlecloudplatform.github.io/google-cloud-node/#/docs/logging/1.0.0/logging/log?method=write)
"While you may write a single entry at a time, batching multiple entries together is preferred to avoid reaching the queries per second limit."
is erroneous?
I would agree that it is misleading. Quotas are at a project-level. If you have many applications running in the same project, all with very high QPS, it may still be possible to deplete your quota.
Batching in the application will help in those cases.
@stephenplusplus is it worth rewording this?
Yes, sounds good. PR welcome to remove the line altogether 👍
Fixed in #2409. Thanks @JoshFerge!
This issue was moved to googleapis/nodejs-logging-bunyan#6