Currently attempting to insert Out of Order throws 500 errors. We also don't support inserting two lines with the same timestamp.
I dont think we need to immediately support out-of-order log streams (we can buffer and re-order on the agent side if needed) but we do need to support stream entries with the same timestamp.
It seems to be quite common for journald to have log entries with the same timestamp. This is most notable for dockerd.
eg
~# journalctl --no-pager --output json|grep "1547128356924191"
{ "__CURSOR" : "s=3eababeec1754cb8aa44cf11ab923dc8;i=21d71;b=e3043240a1e6439cb6c6e86de2cf79fa;m=41d521e463a;t=57f1ae422eb1f;x=fb21916ec5ece135", "__REALTIME_TIMESTAMP" : "1547128356924191", "__MONOTONIC_TIMESTAMP" : "4523978278458", "_BOOT_ID" : "e3043240a1e6439cb6c6e86de2cf79fa", "PRIORITY" : "6", "_UID" : "0", "_GID" : "0", "_CAP_EFFECTIVE" : "3fffffffff", "_SELINUX_CONTEXT" :"unconfined\n", "_MACHINE_ID" : "e27657d46dcdcc4a15f967e35356370a", "_HOSTNAME" : "anthony-desktop", "_SYSTEMD_SLICE" : "system.slice", "_TRANSPORT" : "stdout", "_STREAM_ID" : "188cd29e9f664bc0bc5fda3cb32e26e9", "SYSLOG_IDENTIFIER" : "dockerd", "_PID" : "3010", "_COMM" : "dockerd", "_EXE" : "/usr/bin/dockerd", "_CMDLINE" : "/usr/bin/dockerd -H fd://", "_SYSTEMD_CGROUP" : "/system.slice/docker.service", "_SYSTEMD_UNIT" : "docker.service", "_SYSTEMD_INVOCATION_ID" : "ef4cf2180e234f09b4ff66a49b76eafd", "MESSAGE" : "time=\"2019-01-10T21:52:36.924060634+08:00\" level=info msg=\"No non-localhost DNS nameservers are left in resolv.conf. Using default external servers: [nameserver 8.8.8.8 nameserver 8.8.4.4]\"" }
{ "__CURSOR" : "s=3eababeec1754cb8aa44cf11ab923dc8;i=21d72;b=e3043240a1e6439cb6c6e86de2cf79fa;m=41d521e463a;t=57f1ae422eb1f;x=d9033a1beade7d53", "__REALTIME_TIMESTAMP" : "1547128356924191", "__MONOTONIC_TIMESTAMP" : "4523978278458", "_BOOT_ID" : "e3043240a1e6439cb6c6e86de2cf79fa", "PRIORITY" : "6", "_UID" : "0", "_GID" : "0", "_CAP_EFFECTIVE" : "3fffffffff", "_SELINUX_CONTEXT" :"unconfined\n", "_MACHINE_ID" : "e27657d46dcdcc4a15f967e35356370a", "_HOSTNAME" : "anthony-desktop", "_SYSTEMD_SLICE" : "system.slice", "_TRANSPORT" : "stdout", "_STREAM_ID" : "188cd29e9f664bc0bc5fda3cb32e26e9", "SYSLOG_IDENTIFIER" : "dockerd", "_PID" : "3010", "_COMM" : "dockerd", "_EXE" : "/usr/bin/dockerd", "_CMDLINE" : "/usr/bin/dockerd -H fd://", "_SYSTEMD_CGROUP" : "/system.slice/docker.service", "_SYSTEMD_UNIT" : "docker.service", "_SYSTEMD_INVOCATION_ID" : "ef4cf2180e234f09b4ff66a49b76eafd", "MESSAGE" : "time=\"2019-01-10T21:52:36.924158301+08:00\" level=info msg=\"IPv6 enabled; Adding default IPv6 external servers: [nameserver 2001:4860:4860::8888 nameserver 2001:4860:4860::8844]\"" }
though the timestamps defined in these log MESSAGE are different, we would need custom parsing rules for each application to extract the field. Even then not all messages include a time field.
The standard approach is to just use the __REALTIME_TIMESTAMP provided by journald which is the time the message was received by the journal.
see also https://github.com/grafana/loki/pull/26#issuecomment-453391871
I believe the support of out of order logs is mandatory for the release of Loki. Here is why.
Here is a simple client library https://github.com/afiskon/promtail-client It's README includes a client example.
If you run it in a single process everything works just fine:
$ ./loki-example proto
2019/01/17 16:16:40 Error: The time is 2019-01-17 16:16:40.509173 +0300 MSK m=+0.019247359, i = 1
2019/01/17 16:16:41 Error: The time is 2019-01-17 16:16:41.512412 +0300 MSK m=+1.022477432, i = 2
2019/01/17 16:16:42 Error: The time is 2019-01-17 16:16:42.517708 +0300 MSK m=+2.027764858, i = 3
2019/01/17 16:16:43 Error: The time is 2019-01-17 16:16:43.519887 +0300 MSK m=+3.029934055, i = 4
However when I start multiple processes in different terminals logs get rejected with out-of-order errors:
2019/01/17 16:17:23 promtail.ClientProto: Unexpected HTTP status code: 500, message: rpc error: code = Unknown desc = Entry out of order
In other words currently Loki can be used only with a single process or microservice. If your system has 2+ microservices, currently you just can't use Loki.
@woodsaj I notice this dates back since 19/12 and hasn't been labelled as a bug (as per the comment above). Perhaps stick the label on so it gets some TLC :)
support for log entries with the same timestamp is being addressed in #288
Hello,
I decided to test the patch using docker-compose and recent images from DockerHub. I still get:
2019/02/08 14:34:05 promtail.ClientProto: Unexpected HTTP status code: 400, message: Entry out of order
... in the second terminal window. Images versions:
$ docker-compose images
Container Repository Tag Image Id Size
--------------------------------------------------------------------
temp_grafana_1 grafana/grafana master 8905a7bc010e 232 MB
temp_loki_1 grafana/loki master 66862f6b38aa 32.2 MB
temp_promtail_1 grafana/promtail master 42300bf69b30 50.2 MB
Docker hub says that these images were updated ~20 minutes ago, thus I believe they should include recent changes. I did docker-compose down and docker-compose pull before testing and checked that the new images were downloaded. It seem that the patch didn't work.
i am getting the same issue as afiskon
Here are exact steps to reproduce:
1) Make sure no images are running:
$ docker-compose ps
Name Command State Ports
------------------------------
2) Create a new empty directory:
mkdir loki-test
cd loki-test
3) Create docker-compose.yaml:
version: "3"
networks:
loki:
services:
loki:
image: grafana/loki:master
ports:
- "3100:3100"
command: -config.file=/etc/loki/local-config.yaml
networks:
- loki
promtail:
image: grafana/promtail:master
volumes:
- /var/log:/var/log
command:
-config.file=/etc/promtail/docker-config.yaml
networks:
- loki
grafana:
image: grafana/grafana:master
ports:
- "3000:3000"
environment:
GF_EXPLORE_ENABLED: "true"
networks:
- loki
4) Run docker-compose pull, then docker-compose up
5) Open http://localhost:3000/, login/password are admin/admin
6) Open http://localhost:3000/login, add a new Loki datasource with URL: http://loki:3100 (loki is a local Docker network)
7) Build loki-example using the source code in README.md in this repository https://github.com/afiskon/promtail-client
$ pwd
/Users/eax/go/src/go-sandbox/loki-example
$ ls
main.go
$ go build
8) Execute ./loki-example proto in one terminal window and simultaneously ./loki-example proto in the second window.
Expected behavior: Both programs terminate normally.
Actual behavior: There is a Unexpected HTTP status code: 400, message: Entry out of order error in the second terminal window.
Just to add, since the the prev comments didn't point out which images exactly, I can confirm bug still exists in grafana/loki:master-7527487 (which was built 2 hours ago).
@olafure thanks for the info! We don't support out of order, but we are supposed to support multiple entries with the same timestamp. I'll build some unit tests to see whats going on here.
We don't support out of order
Then you may want to rename issue before closing it. Otherwise, people will find it as fixed and closed, while in fact out of order entries will not be supported.
I am having issues with multiple lines same time stamp.
I built a bunch of unit tests (see #303) and a bunch more info logging (see #304) and we seem to be handling this correctly. Please check you are using an up to date image version (docker pull grafana/loki:master) and try again, but pretty sure this works now - its working in our infra.
I'm a little confused by latest comments. Can Loki currently add logs from more than one service (see the test case described above: https://github.com/grafana/loki/issues/168#issuecomment-461779539 )? If not, are there any estimates when this might be implemented and/or an open issue?
Can Loki currently add logs from more than one service
@afiskon you can send logs from multiple services, but the log streams from each service need to be unique (ie they have different labels). The simplest way to ensure that is to just add a "source" label that describes where the logs are coming from.
@woodsaj Thank you a lot for the reply! I can confirm that Loki accepts logs with no problems from multiple services if they use different source value. I did corresponding changes to the README of my client library https://github.com/afiskon/promtail-client
One more little question. We would like to use Loki instead of ELK in our project https://github.com/insolar/insolar . However we would prefer to wait a little until the official release. Unfortunately I didn't manage to find a roadmap. Is there an estimate when approximately Loki will be released?
@woodsaj any recommendation what I should do if i'd like to add client side logs (e.g. websites javascript console.log's or logs from a mobile app) to loki? it's impossible to use a user id or even device id as label.. but it's quite powerful to have user logs from the client side merged with logs from the server.. is this just out of scope for loki?
@hpoul sending logs from clients directly would be hard, especially as you have no control over the accuracy of the time on the client devices.
The easiest solution would be to just have the logs from clients sent to a central logging service that then writes to Loki, setting the timestamp of each log message to the timestamp of when the log message was received by the logging service and not the timestamp provided by the client when when the log message was created.
@woodsaj thanks for your fast response - well accuracy of the timestamp isn't really important, just that it's consistent within itself for each client.. since clients would send batches of log entries, this central logging service would set the same timestamp for possible hundreds of log lines.. would loki keep the "order" of log lines correct, if all have the same timestamp? also - could this be a feature of loki's ingester itself to tell it to rewrite timestamps? or when sending "ts": null - or a option on the stream when sending logs via the /api/prom/push? :) i really like the idea that i basically only need a single binary (plus a http proxy for auth and grafana) to capture logs..
Most helpful comment
I believe the support of out of order logs is mandatory for the release of Loki. Here is why.
Here is a simple client library https://github.com/afiskon/promtail-client It's README includes a client example.
If you run it in a single process everything works just fine:
However when I start multiple processes in different terminals logs get rejected with out-of-order errors:
In other words currently Loki can be used only with a single process or microservice. If your system has 2+ microservices, currently you just can't use Loki.