Fluent-bit: Logs using epoch/unix time

Created on 7 Feb 2018 · 13 comments · Source: fluent/fluent-bit

Situation
I have logs that store time in epoch/unix time in key log_time. For example:

"log_time": 1518034685.55049

How can I configure the parser to use this time value in Elasticsearch? The relevant parts of my config are below:

[INPUT]
    Name              tail
    Parser            twistd
    ...

[OUTPUT]
    Name            es
    Match           *
    Logstash_Format On
    Logstash_Prefix kube
    Time_Key        log_time
    ...

[PARSER]
    Name     twistd
    Format   json
    Time_Key log_time

This config does not work. If I remove Time_Key log_time from the es output, it falls back to an @timestamp based on when the log is sent.

Request
I may be wrong, but I do not believe there is currently a way to do this, so I would like to request a way to achieve it.

Labels: fixed, question

All 13 comments

@ayenter

as defined in the Parser documentation, the parser definition only takes place in the parsers.conf file:

All parsers must be defined in a parsers.conf file, not in the Fluent Bit global configuration file. The parsers file exposes all available parsers that can be used by the Input plugins that are aware of this feature.

move your parser definition to the correct file and it should be ok.

Oh. The parser is in its own file. I just pasted all of the pieces together here. The logging works if I remove the Time_Key from es. My main issue is that it doesn't pick up the Unix Time in log_time.

Question answered. If you have any further issue, feel free to comment.

@ayenter please provide example log file

ping ^

Full Config:

apiVersion: v1
kind: ConfigMap
metadata:
  name: fluent-bit-config
  namespace: kube-system
  labels:
    k8s-app: fluent-bit
data:
  # Configuration files: server, input, filters and output
  # ======================================================
  fluent-bit.conf: |
    [SERVICE]
        Flush         1
        Log_Level     info
        Daemon        off
        Parsers_File  parsers.conf

    @INCLUDE input-kubernetes.conf
    @INCLUDE filter-logging.conf
    @INCLUDE filter-kubernetes.conf
    @INCLUDE output-elasticsearch.conf

  input-kubernetes.conf: |
    [INPUT]
        Name              tail
        Tag               kube.*
        Path              /var/log/fluent-bit/*.log
        Parser            twistd
        DB                /var/log/flb_kube.db
        Mem_Buf_Limit     5MB
        Skip_Long_Lines   On
        Refresh_Interval  10

  filter-logging.conf: |
    [FILTER]
        Name record_modifier
        Match kube.*
        Remove_key log_logger

  filter-kubernetes.conf: |
    [FILTER]
        Name           kubernetes
        Match          kube.*
        Kube_URL       https://kubernetes.default.svc:443
        Merge_JSON_Log On
        Regex_Parser   kube-custom

  output-elasticsearch.conf: |
    [OUTPUT]
        Name            es
        Match           *
        Host            ${FLUENT_ELASTICSEARCH_HOST}
        Port            ${FLUENT_ELASTICSEARCH_PORT}
        Logstash_Format On
        Logstash_Prefix kube
        Retry_Limit     False

  parsers.conf: |
    [PARSER]
        Name    kube-custom
        Format  regex
        Regex   var\.log\.fluent-bit\.(?<pod_name>[a-z0-9]([-a-z0-9]*[a-z0-9])?(\.[a-z0-9]([-a-z0-9]*[a-z0-9])?)*)_(?<namespace_name>[^_]+)_(?<container_name>.+)-(?<docker_id>[a-z0-9]{64})\.log$

    [PARSER]
        Name        twistd
        Format      json
        Time_Key    log_time
        Time_Format %L
        Types       log_time:float

More of Log File:

{"log_level": {"name": "info", "__class_uuid__": "02e59486-f24d-46ad-8224-3acdf2a5732a"}, "log_logger": {"unpersistable": true}, "log_namespace": "abc.blue", "log_time": 1518034362.703279, "log_source": null, "log_format": "Generating"}
{"log_level": {"name": "info", "__class_uuid__": "02e59486-f24d-46ad-8224-3acdf2a5732a"}, "log_logger": {"unpersistable": true}, "log_namespace": "abc.red.one", "log_time": 1518034362.962467, "log_source": "abc", "deleted_id": 134456, "log_format": "Deleting"}
{"log_level": {"name": "warn", "__class_uuid__": "02e59486-f24d-46ad-8224-3acdf2a5732a"}, "log_logger": {"unpersistable": true}, "log_namespace": "abc.red.one", "log_time": 1518034362.962795, "log_source": "abc", "log_format": "Deleted"}
{"log_level": {"name": "info", "__class_uuid__": "02e59486-f24d-46ad-8224-3acdf2a5732a"}, "log_logger": {"unpersistable": true}, "log_namespace": "abc.red.one", "log_time": 1518034362.962955, "log_source": "abc", "log_format": "Getting"}
{"log_level": {"name": "info", "__class_uuid__": "02e59486-f24d-46ad-8224-3acdf2a5732a"}, "log_logger": {"unpersistable": true}, "log_namespace": "abc.blue", "log_time": 1518034362.984067, "log_source": null, "log_format": "Generating"}

This is with Twisted's logger. However, I decided to choose a different logger altogether to circumvent the issues I was having. Since I switched to Python's built-in logger, I have more control over the logging and do not need to use Unix/epoch time. But a solution to this may help somebody in the future...

@ayenter looking at your example log file I can see the timestamp field is inside a nested map, the parser expects to have it at the first level of the map.

@edsiper

"log_time": 1518034362.984067, {"log_level": {"name": "info", "__class_uuid__": "02e59486-f24d-46ad-8224-3acdf2a5732a"}, "log_logger": {"unpersistable": true}, "log_namespace": "abc.blue", "log_source": null, "log_format": "Generating"}

such as this?

@ayenter yes

Fixed as solved (log fields must be at first level of the map).

Actually there were some more issues:

  • the pattern to match 1518034685.55049 should probably be %s.%L, not just %L
  • the time key in the input JSON has to be a string (cf open issue #662). Types log_time:float in the parser configuration probably only applies to the output of the parser
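Putting those two fixes together, a working parser would look something like this (a sketch, assuming the application can be made to emit log_time as a JSON string, e.g. "log_time": "1518034685.55049"):

    [PARSER]
        Name        twistd
        Format      json
        Time_Key    log_time
        Time_Format %s.%L

Note that Types log_time:float is dropped here, since the field is consumed as the record timestamp rather than kept as a value.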

How to set the Parser for handling the time field for logs like this:

"log_processed": {
      "instant": {
        "epochSecond": 1598599671,
        "nanoOfSecond": 241587000
      },
      "thread": "SomeJavaThread01-consumer-0",
      "level": "INFO",
      "loggerName": "com.myproject.xyz.ClassName",
      "message": "Consumer fetched no records; validating Kafka connection...",
      "endOfBatch": false,
      "loggerFqcn": "org.apache.logging.slf4j.Log4jLogger",
      "contextMap": {},
      "threadId": 40,
      "threadPriority": 5
    }

This is default log format for logs printed in JSON layout using log4j2 in Java. How to specify two different fields, whose combination gives us the correct time?
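A classic [PARSER] reads a single Time_Key, so combining two JSON fields would typically need a script filter; the arithmetic itself is simple. A sketch in Python, with the field names taken from the log4j2 sample above (nanoseconds are truncated to microseconds, Python's datetime limit):

    from datetime import datetime, timezone

    instant = {"epochSecond": 1598599671, "nanoOfSecond": 241587000}

    # Whole seconds come from epochSecond; sub-second precision from nanoOfSecond.
    ts = datetime.fromtimestamp(instant["epochSecond"], tz=timezone.utc)
    ts = ts.replace(microsecond=instant["nanoOfSecond"] // 1000)
    print(ts.isoformat())  # 2020-08-28T07:27:51.241587+00:00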

Should this work?

[PARSER]
        Name   json
        Format json
        Time_Key instant
        Time_Format %s.%L
        Time_Keep On

The instant key here is a nested JSON map with epochSecond and nanoOfSecond fields.
