Loki: Support for JSON log lines

Created on 17 Dec 2018 · 33 Comments · Source: grafana/loki

All 33 comments

What sort of support are you looking for?

Loki is format-agnostic and ingests log lines as plain strings, whether they are access logs, logfmt key/value pairs, or JSON.

Grafana's Explore UI shows you the log lines and, if they are JSON, offers some in-browser parsing to plot distributions of values. Notice how the fields in the log line have an orange underline; that means they were parsed successfully:

[Screenshot: 2018-12-18 at 11 15 46]

Does it have the ability to search against JSON fields?

It would be great if, for example, I could filter my logs by a level field contained in the log.

@r-moiseev You can use regex as part of the query to "search" against your json fields.

@draeron I don't think this feature will be available soon. In the meantime, you can add the "key" you want to filter on as a label that is sent to the Loki server. Or you could add an additional parser stage to promtail that parses your logs as JSON.

Just to share an example, in GKE/Stackdriver I can just search for key/values in the jsonPayload:
jsonPayload.@l="Warning"
Would be awesome to have some direct support for structured logging.

Here's a reference on how Filebeat handles structured/json logs.

The 0.1.0 release includes a pipeline with a json stage, which allows JSON log data to be extracted and used in labels and/or metrics via JMESPath expressions.
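
For the "filter by a level field" case raised earlier in the thread, a minimal sketch of such a pipeline might look like the fragment below. The field name `level` is an assumption about the log format, and the fragment is illustrative rather than an official recommendation. Since every distinct label value creates a separate stream, this is probably only sensible for fields with few distinct values.

```yaml
# Hypothetical fragment of a promtail scrape config.
# Assumes each log line is a JSON object with a top-level "level" field.
pipeline_stages:
  - json:
      expressions:
        # key = name in the extracted data, value = JMESPath expression
        level: level
  - labels:
      # promote the extracted "level" value to a Loki label
      level:
```

With a label set this way, a selector such as `{level="error"}` could then be used to filter streams.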

Great!

I have a pod that emits logs in JSON format. But in Loki the logs are not displayed as nested objects; they appear as one long string (the content of the log field):

{"log":"{\"verb\":\"UPDATED\",\"event\":{\"metadata\":{\"name\":\"minio-backup.15c16e910ad17555\",\"namespace\":\"minio\",\"selfLink\":\"/api/v1/namespaces/minio/events/minio-backup.15c16e910ad17555\",\"uid\":\"75b2664e-d08a-11e9-aaf1-42010aa40066\",\"resourceVersion\":\"805300\",\"creationTimestamp\":\"2019-09-06T09:41:11Z\"},\"involvedObject\":{\"kind\":\"CronJob\",\"namespace\":\"minio\",\"name\":\"minio-backup\",\"uid\":\"a355d0fa-cf90-11e9-aaf1-42010aa40066\",\"apiVersion\":\"batch/v1beta1\",\"resourceVersion\":\"35722856\"},\"reason\":\"UnexpectedJob\",\"message\":\"Saw a job that the controller did not create or forgot: test-minio-backup\",\"source\":{\"component\":\"cronjob-controller\"},\"firstTimestamp\":\"2019-09-05T03:55:14Z\",\"lastTimestamp\":\"2019-09-06T10:46:18Z\",\"count\":1373,\"type\":\"Warning\"},\"old_event\":{\"metadata\":{\"name\":\"minio-backup.15c16e910ad17555\",\"namespace\":\"minio\",\"selfLink\":\"/api/v1/namespaces/minio/events/minio-backup.15c16e910ad17555\",\"uid\":\"75b2664e-d08a-11e9-aaf1-42010aa40066\",\"resourceVersion\":\"805295\",\"creationTimestamp\":\"2019-09-06T09:41:11Z\"},\"involvedObject\":{\"kind\":\"CronJob\",\"namespace\":\"minio\",\"name\":\"minio-backup\",\"uid\":\"a355d0fa-cf90-11e9-aaf1-42010aa40066\",\"apiVersion\":\"batch/v1beta1\",\"resourceVersion\":\"35722856\"},\"reason\":\"UnexpectedJob\",\"message\":\"Saw a job that the controller did not create or forgot: test-minio-backup\",\"source\":{\"component\":\"cronjob-controller\"},\"firstTimestamp\":\"2019-09-05T03:55:14Z\",\"lastTimestamp\":\"2019-09-06T10:41:14Z\",\"count\":1355,\"type\":\"Warning\"}}\n","stream":"stdout","time":"2019-09-06T10:46:18.681193448Z"}

Does Loki automatically handle JSON format, or is something else still missing?

Hi @minhdanh
I just found a solution for eventrouter :)

    - match: 
        selector: '{app="eventrouter"}'
        stages:
          - json:
              expressions:
                log:
          - json:
              source: log
              expressions:
                event_verb: verb
                event:
          - json:
              source: event
              expressions:
                event_reason: reason
                involvedObject:
                source:
          - json:
              source: involvedObject
              expressions:
                event_kind: kind
                event_namespace: namespace
                event_name: name
          - json:
              source: source
              expressions:
                event_source_host: host
                event_source_component: component
          - labels:
              event_verb:
              event_kind:
              event_reason:
              event_namespace:
              event_name:
              event_source_host:
              event_source_component:

Hi @Lucaber
Thanks for the solution, but it looks like it doesn't work for me.
I added your snippet to promtail's pipelineStages config: https://github.com/grafana/loki/blob/master/production/helm/promtail/values.yaml#L29

promtail:
  pipelineStages:
    - match:
        selector: '{app="eventrouter"}'
        stages:
          - json:
              expressions:
                log:
          - json:
              source: log
              expressions:
                event_verb: verb
                event:
          - json:
              source: event
              expressions:
                event_reason: reason
                involvedObject:
                source:
          - json:
              source: involvedObject
              expressions:
                event_kind: kind
                event_namespace: namespace
                event_name: name
          - json:
              source: source
              expressions:
                event_source_host: host
                event_source_component: component
          - labels:
              event_verb:
              event_kind:
              event_reason:
              event_namespace:
              event_name:
              event_source_host:
              event_source_component:

Then I deployed promtail again, but the result is still the same in Loki.

@minhdanh Loki only knows logs as byte arrays for storage; everything is basically a string.

Your log example looks like the output of a docker log line, which has json nested inside json.

I'm not quite sure what you are ultimately looking for in Grafana? But the simplest pipeline config would just include the docker stage, which will unroll the docker JSON and set the log JSON as the log line, which should then be un-escaped and appear like normal JSON.

The config @Lucaber pasted sets a series of labels from the log but does not manipulate the output sent to Loki. You must use an output pipeline stage for this (the docker stage internally is just a json, timestamp, label, and output stage).
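
To make the parenthetical above concrete, here is a sketch of what the docker stage roughly expands to, written as explicit stages. It assumes the standard docker JSON log format with `log`, `stream`, and `time` fields, as in the example line earlier in the thread. The exact internals may differ between promtail versions, so treat this as an approximation.

```yaml
# Approximate expansion of "- docker: {}" into explicit stages.
- json:
    expressions:
      output: log        # the inner (escaped) log message
      stream: stream     # stdout / stderr
      timestamp: time    # docker's own timestamp
- labels:
    stream:              # expose stdout/stderr as a label
- timestamp:
    source: timestamp    # use the extracted time as the entry timestamp
    format: RFC3339Nano
- output:
    source: output       # replace the log line with the inner message
```

The final output stage is what changes the line actually sent to Loki. A pipeline that only has json and labels stages leaves the original line untouched.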

Also @Lucaber I believe you could make your config a little more concise and probably a little faster:

promtail:
  pipelineStages:
    - match:
        selector: '{app="eventrouter"}'
        stages:
          - docker:
          - json:
              expressions:
                event_verb: verb
                event_kind: event.involvedObject.kind
                event_reason: event.reason
                event_namespace: event.involvedObject.namespace
                event_name: event.metadata.name
                event_source_host: event.source.host
                event_source_component: event.source.component
          - labels:
              event_verb:
              event_kind:
              event_reason:
              event_namespace:
              event_name:
              event_source_host:
              event_source_component:

If all your logs are docker, you could also move that outside the match:

promtail:
  pipelineStages:
    - docker:
    - match:
        selector: '{app="eventrouter"}'
        stages:
          - json:
...

The advantage of using the docker stage is that it sets the timestamp from the log line and also sets the output to the un-escaped JSON of the actual log message.

Ohh yes, I was looking for Loki labels to easily filter the logs.
I previously tried something similar:

- match:
    selector: '{app="eventrouter"}'
    stages:
      - json:
          expressions:
            event_verb: log.verb
      - labels:
          event_verb:

I also tried verb instead of log.verb, but my label was still empty (null). Maybe the docker stage does the trick; I will try this again later.

@slim-bean Thank you. Apparently I had removed docker: {} from the pipeline stages, and that's why it didn't work. I added it again and now the logs show up in Grafana in correct JSON format.

I'm not quite sure what you are ultimately looking for in Grafana?

With JSON supported by Loki, I was expecting to search/query the logs using something like object.property=value. This is possible, right?

Currently no. Neither Grafana nor LogQL has any higher-level support for JSON. If you are using logcli, you can use -o raw and pipe the output into something like jq to manipulate the JSON directly. In Grafana, your only option currently is a regex (but this will just match against the entire log line).

There are plans to include better handling of JSON in the future, but for now all logs are stored and treated the same.

@Lucaber

I just found a solution for eventrouter :)

    - match: 

Hi. Could you provide your full promtail.yaml (or helm values.yaml) for your eventrouter-promtail-loki solution? That would be great :-)

If we do not have nested JSON objects, can I expect this JSON log line

{"log":"database hrdb is not running\n","loglevel":"error","time":"2020-01-12T01:11:11.870000000-07.00"}

to be converted to the following format?

  ts                                       output                            loglevel
  ===================================================================================
  2020-01-12T01:11:11.870000000-07.00      database hrdb is not running\n    error

I am using the following config and expecting it to parse the JSON log line.

```
- job_name: logjson
  static_configs:
    - targets:
        - localhost
      labels:
        job: jsonlogs
        __path__: /tmp/log.json
  pipeline_stages:
    - json:
        expressions:
          output: log
          loglevel: loglevel
          timestamp: time
    - labels:
        loglevel:
    - timestamp:
        source: time
        format: RFC3339Nano
```

This is kind of a deal breaker for us because:

  • we advise all teams to log in a structured manner in JSON (but each team defines its own format; there is no need for a central format)
  • we don't want to manage a centralized configuration for everyone in which we transform all JSON formats, which means we push this responsibility to the "query" side. This approach is already supported by many logging solutions, without writing custom transformation logic on the "write" side.

I love the Loki design but, as for @DenisBiondic, it's tough for us to use Loki without dynamic structured logging, which is what JSON support would bring.

We are using serilog in our C# stack to log lots of fields, not labels, just fields here and there. Using labels wouldn't work since there are plenty of fields with high cardinality. Each team is responsible for keeping the field/log generation in our code in sync with the queries (basically Grafana dashboards with variables). Using regex only would be a huge step backwards from the structured logging path we took (and are very happy with).

@DenisBiondic check https://github.com/grafana/loki/pull/1848 and leave us some feedback, like @alexvaut did. This will help our internal discussion with the team.

I stumbled upon this issue, and I'm looking to introduce Loki to my team as well. As with @DenisBiondic, we are running mostly structured logging, and I'm not looking forward to doing any sort of regex to find things; that seems like a step backwards.

Other resources I've found were:
https://stackoverflow.com/questions/58564836/how-to-promtail-parse-json-to-label-and-timestamp
https://github.com/grafana/loki/blob/master/docs/clients/promtail/pipelines.md?ts=4
https://grafana.com/blog/2019/07/25/lokis-path-to-ga-adding-structure-to-unstructured-logs/

All of which point to what feels like needing to know the JSON fields ahead of time, and even converting a JSON log line back into a "structured text" line.

We're working on solving this via LogQL. You'll be able to select at query time which properties you want to show, if not all of them (though showing all of them is hard to read).

Any update here? We are attempting to roll this out company-wide, but JSON logging seems to be a blocker.

We're reviewing the final design doc. It's coming!

@cyriltovena any update?

@cyriltovena why is it closed? Any resolution here?

@slim-bean any chance to have this reopened? This issue is more about full JSON support (ingesting logs + querying logs with JSON support). Only the first part is done, with the json stage in the pipeline, and more and more people need this.

Yep, I'm working on the implementation. ETA: ObservabilityCON.

Any update on this?

Yes, there was a short demo yesterday at ObservabilityCON.

Jump to the 36:15 offset in this video to see a short overview of Loki 2.0, including the new 'json' parser stage.

I think on Wed Oct 28th at 12 pm EST, there will be a deep dive into Loki improvements. See this page.

Really looking forward to the demo'd feature. Any ETA on release?

It's in Loki 2.0.0.
