Hi, I have a few questions that would help me understand Loki better, and I didn't find the answers in the design document:
1. Batching logs will allow better compression ratios and bigger blobs (which means lower per-operation costs), but must be balanced with the risk of data loss - what is the strategy here?
2. Can Loki handle multi-line logs? Let's say a regex matches a line that is really part of a multi-line log - will the search then return all the related lines for that log?
3. Labels are important for Loki, and I understand that the focus is on k8s first, with automatic labelling:
3.a. Exactly what labels are automatically assigned?
3.b. Is there some mechanism to add your own labels, for example to group sources by operating system, or are you limited only to those labels assigned by Loki?
3.c. If you are limited to labels assigned by Loki, how will you handle labelling as you expand out of k8s and accept logs from other sources (e.g. syslog)?
4. Logs really have _2_ timestamps associated with them: the time the log was _generated_, and the time the log was _ingested_ by Loki - will Loki be able to parse the log generation time out of logs (where it's included, and it usually is, such as with syslog), or will it only use the log ingestion time for time range searches?
Batching logs will allow better compression ratios and bigger blobs (which means lower per-operation costs), but must be balanced with the risk of data loss - what is the strategy here?
100% agree - Loki already batches entries for the same log stream into what we call a "chunk", and then flushes these chunks to S3/GCS etc. To protect against data loss, each entry is written to ~3 different replicas, and soon the replicas will maintain a write-ahead log.
Can Loki handle multi-line logs? Let's say a regex matches a line that is really part of a multi-line log - will the search then return all the related lines for that log?
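To make the trade-off concrete, here is a minimal sketch of a per-stream chunk that flushes on size or age. It is not Loki's implementation: the `Chunk` class, the `max_bytes` and `max_age_seconds` thresholds, and their default values are all hypothetical and only illustrate the idea that bigger chunks compress better while older chunks put more data at risk.

```python
import time
from dataclasses import dataclass, field


@dataclass
class Chunk:
    """Buffers entries for one log stream until it is worth flushing."""

    max_bytes: int = 1024           # hypothetical size threshold
    max_age_seconds: float = 60.0   # hypothetical age threshold
    created_at: float = field(default_factory=time.monotonic)
    entries: list = field(default_factory=list)
    size: int = 0

    def append(self, timestamp: float, line: str) -> None:
        # Track the buffered size so the flush decision stays cheap.
        self.entries.append((timestamp, line))
        self.size += len(line.encode("utf-8"))

    def should_flush(self, now: float) -> bool:
        # Flush when the chunk is large enough to be a worthwhile blob,
        # or old enough that holding it longer risks too much data.
        too_big = self.size >= self.max_bytes
        too_old = (now - self.created_at) >= self.max_age_seconds
        return too_big or too_old
```

A caller would check `chunk.should_flush(time.monotonic())` after each append and, when it returns true, write the chunk to object storage and start a new one.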
Not yet; this is something we're discussing in #74.
Exactly what labels are automatically assigned?
This is completely in the control of the end user, and is configured through our use of Prometheus service discovery and relabelling rules.
The only one that is automatically produced is the __filename__ label at the moment, although I'm sure we'll add more.
Most setups will also have a job label, which by default takes its name from the scrape config. On Kubernetes (with our example configs), the job label consists of <namespace>/<pod name>, we add an independent namespace label, and we propagate any labels from the pods themselves, such as version, app etc.
Is there some mechanism to add your own labels, for example to group sources by operating system, or are you limited only to those labels assigned by Loki?
Of course :-) You can configure promtail to extract arbitrary metadata from your chosen service discovery mechanism, plus you can configure promtail for "external labels" that are appended to every outgoing stream - useful for things like OS version, cluster name, hostname etc.
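As a rough illustration of how these label sources combine for a Kubernetes pod, here is a small sketch. The `build_labels` function is hypothetical, and the precedence it uses (discovered labels win over external labels) is an assumption made for the example, not something stated in this thread.

```python
def build_labels(namespace: str, pod_name: str,
                 pod_labels: dict, external_labels: dict) -> dict:
    """Combine the label sources described above into one label set."""
    # Start with the external labels that are attached to every stream.
    labels = dict(external_labels)
    # Propagate labels from the pod itself (e.g. app, version).
    # Assumption: these override external labels on a name clash.
    labels.update(pod_labels)
    # Independent namespace label, plus job = <namespace>/<pod name>.
    labels["namespace"] = namespace
    labels["job"] = f"{namespace}/{pod_name}"
    return labels
```

For example, `build_labels("prod", "api-1", {"app": "api"}, {"cluster": "eu-1"})` yields a label set containing `job`, `namespace`, `app` and `cluster`.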
3.c. If you are limited to labels assigned by Loki, how will you handle labelling as you expand out of k8s and accept logs from other sources (e.g. syslog)?
Good question! And I don't know. Right now we're thinking of adding the ability to extract labels from the log entry using a regular expression, and things like journald already have a fair bit of metadata.
Logs really have 2 timestamps associated with them: the time the log was generated, and the time the log was ingested by Loki - will Loki be able to parse the log generation time out of logs (where it's included, and it usually is, such as with syslog), or will it only use the log ingestion time for time range searches?
Currently we have two options, but for both of them the timestamp comes from the promtail agent, and never from the Loki server. Option (1) is for docker-style JSON logs (as produced by k8s nodes), where there is a timestamp included that is the time the log was written to the pipe by the container. Option (2) is to use the time at which promtail read the log, which, assuming there isn't a lot of catching up to do, will be pretty accurate.
We're already planning on adding the ability for promtail to extract timestamps from the log entries themselves.
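The two options can be sketched roughly as follows. This is not promtail's code: the function names are hypothetical, and the sketch assumes the Docker json-file layout with `log`, `stream` and `time` fields, where `time` is a UTC timestamp with nanosecond precision and a trailing `Z`.

```python
import json
from datetime import datetime, timezone


def parse_rfc3339_utc(value: str) -> datetime:
    """Parse a UTC timestamp such as 2019-01-08T10:15:30.123456789Z."""
    if not value.endswith("Z"):
        raise ValueError("expected a UTC timestamp ending in 'Z'")
    body = value[:-1]
    if "." in body:
        whole, frac = body.split(".", 1)
        # datetime only stores microseconds, so truncate to 6 digits.
        frac = (frac + "000000")[:6]
    else:
        whole, frac = body, "000000"
    parsed = datetime.strptime(whole, "%Y-%m-%dT%H:%M:%S")
    return parsed.replace(microsecond=int(frac), tzinfo=timezone.utc)


def entry_from_line(raw_line: str, read_time: datetime):
    """Return (timestamp, text) for one line read by the agent."""
    try:
        # Option (1): docker-style JSON carries its own timestamp.
        record = json.loads(raw_line)
        return parse_rfc3339_utc(record["time"]), record["log"]
    except (ValueError, KeyError, TypeError, AttributeError):
        # Option (2): fall back to the time the agent read the line.
        return read_time, raw_line
```

Either way the timestamp is chosen on the agent side, which matches the point above that it never comes from the server.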
Thanks for the quick and informative response!
One thing I'm still not clear on though, is the mechanics of the batching:
By 'log stream', do you mean "all logs from a particular source"? If so, is the assumption that all logs from each source always have the same, constant set of labels associated with them? Perhaps another way of phrasing this: are labels associated with a log stream, or with individual logs?
Does 1 chunk equate to one S3 blob, or just part of a larger blob?
When you talk of entries being written to replicas, do you mean each log is written to 3 different S3 buckets, or do you mean to some local caching mechanism? (sorry if this is a silly question, but I'm not familiar with any details of S3!)
By 'log stream', do you mean "all logs from a particular source"? If so, is the assumption that all logs from each source always have the same, constant set of labels associated with them? Perhaps another way of phrasing this: are labels associated with a log stream, or with individual logs?
A log stream is defined as all entries with the same labels - typically this would be all entries from a single source, but in the k8s/docker cases we split STDERR and STDOUT into two different streams for each container. When we tail local files, each file becomes a stream.
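A small sketch of that definition, where a stream is identified purely by its label set. The function names and the `stream` label used to separate STDOUT from STDERR are hypothetical and chosen only for illustration.

```python
from collections import defaultdict


def stream_key(labels: dict) -> tuple:
    """Turn a label set into a hashable, order-independent stream identity."""
    return tuple(sorted(labels.items()))


def group_into_streams(entries) -> dict:
    """Group (labels, line) pairs so identical label sets share a stream."""
    streams = defaultdict(list)
    for labels, line in entries:
        streams[stream_key(labels)].append(line)
    return dict(streams)
```

With this definition, two entries from the same container land in different streams as soon as one label differs, which is how the STDOUT/STDERR split falls out naturally.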
Does 1 chunk equate to one S3 blob, or just part of a larger blob?
Each chunk becomes a single blob; a stream is made of multiple chunks.
When you talk of entries being written to replicas, do you mean each log is written to 3 different S3 buckets, or do you mean to some local caching mechanism?
Each entry is replicated to 3 ingesters, and will appear in 3 chunks. These three chunks will be written to the same bucket.
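To illustrate why one entry ends up in three chunks, here is a simplified sketch. The thread does not say how ingesters are selected, so the hash-based selection below is purely an assumption for the example, as are the names `pick_ingesters` and `replicate`.

```python
import hashlib

REPLICATION_FACTOR = 3


def pick_ingesters(stream_id: str, ingesters: list,
                   replicas: int = REPLICATION_FACTOR) -> list:
    """Pick `replicas` distinct ingesters for a stream (illustrative only)."""
    if len(ingesters) < replicas:
        raise ValueError("not enough ingesters for the replication factor")
    # Hash the stream identity to a stable starting position.
    digest = hashlib.sha256(stream_id.encode("utf-8")).digest()
    start = int.from_bytes(digest[:8], "big") % len(ingesters)
    return [ingesters[(start + i) % len(ingesters)] for i in range(replicas)]


def replicate(stream_id: str, line: str, ingesters: list, chunks: dict) -> list:
    """Append the entry to the stream's chunk on each chosen ingester."""
    targets = pick_ingesters(stream_id, ingesters)
    for name in targets:
        # Each ingester builds its own chunk for the same stream.
        chunks.setdefault((name, stream_id), []).append(line)
    return targets
```

After one call to `replicate`, `chunks` holds three separate chunk buffers for the stream, one per ingester, each of which would later be flushed to the same bucket as its own blob.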
That was really informative, thanks!
Each entry is replicated to 3 ingesters, and will appear in 3 chunks. These three chunks will be written to the same bucket.
Do the ingesters persist to local storage while buffering chunks, or are they stateless, buffering in memory?
I'm not familiar with S3, but I'm guessing one chunk is the 'master' and the other 2 are replicas/copies that are held in different availability zones (for fault tolerance within the same region)?
Am I right in understanding that Loki just creates something like an index for the log files, and that if the log files change or are updated, the index is updated automatically? Is that how Loki avoids using too much disk space, even with 3 replicas?
@icereed to avoid confusion, I think it would be better if you removed this comment and opened a separate issue with your feature request, leaving this one free and uncluttered for the original Q&A :smile:
@cocowalla I'm closing this; feel free to continue the discussion here.