Situation
I have logs that store time in epoch/unix time in key log_time. For example:
"log_time": 1518034685.55049
How can I configure the parser to utilize this time input in elasticsearch? I have the important parts of my config below to better explain:
```
[INPUT]
    Name    tail
    Parser  twistd
    ...

[OUTPUT]
    Name             es
    Match            *
    Logstash_Format  On
    Logstash_Prefix  kube
    Time_Key         log_time
    ...

[PARSER]
    Name      twistd
    Format    json
    Time_Key  log_time
```
This config does not work. If I remove `Time_Key log_time` from the es output, it falls back to `@timestamp` based on when the log is sent.
Request
I may be wrong, but I do not believe there is a way to do this currently; so I would like to request a way to achieve this.
@ayenter
as defined in the Parser documentation, the parser definition only takes place in the parsers.conf file:
> All parsers must be defined in a parsers.conf file, not in the Fluent Bit global configuration file. The parsers file expose all parsers available that can be used by the Input plugins that are aware of this feature.
move your parser definition to the correct file and it should be ok.
Oh. The parser is in its own file; I just pasted all of the pieces together here. The logging works if I remove the Time_Key from es. My main issue is that it doesn't pick up the Unix time in log_time.
Question answered. If you have any issue feel free to comment.
@ayenter please provide example log file
ping ^
Full Config:
```yaml
apiVersion: v1
kind: ConfigMap
metadata:
  name: fluent-bit-config
  namespace: kube-system
  labels:
    k8s-app: fluent-bit
data:
  # Configuration files: server, input, filters and output
  # ======================================================
  fluent-bit.conf: |
    [SERVICE]
        Flush         1
        Log_Level     info
        Daemon        off
        Parsers_File  parsers.conf

    @INCLUDE input-kubernetes.conf
    @INCLUDE filter-logging.conf
    @INCLUDE filter-kubernetes.conf
    @INCLUDE output-elasticsearch.conf
  input-kubernetes.conf: |
    [INPUT]
        Name              tail
        Tag               kube.*
        Path              /var/log/fluent-bit/*.log
        Parser            twistd
        DB                /var/log/flb_kube.db
        Mem_Buf_Limit     5MB
        Skip_Long_Lines   On
        Refresh_Interval  10
  filter-logging.conf: |
    [FILTER]
        Name        record_modifier
        Match       kube.*
        Remove_key  log_logger
  filter-kubernetes.conf: |
    [FILTER]
        Name            kubernetes
        Match           kube.*
        Kube_URL        https://kubernetes.default.svc:443
        Merge_JSON_Log  On
        Regex_Parser    kube-custom
  output-elasticsearch.conf: |
    [OUTPUT]
        Name             es
        Match            *
        Host             ${FLUENT_ELASTICSEARCH_HOST}
        Port             ${FLUENT_ELASTICSEARCH_PORT}
        Logstash_Format  On
        Logstash_Prefix  kube
        Retry_Limit      False
  parsers.conf: |
    [PARSER]
        Name    kube-custom
        Format  regex
        Regex   var\.log\.fluent-bit\.(?<pod_name>[a-z0-9]([-a-z0-9]*[a-z0-9])?(\.[a-z0-9]([-a-z0-9]*[a-z0-9])?)*)_(?<namespace_name>[^_]+)_(?<container_name>.+)-(?<docker_id>[a-z0-9]{64})\.log$
    [PARSER]
        Name         twistd
        Format       json
        Time_Key     log_time
        Time_Format  %L
        Types        log_time:float
```
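As a side note, the `kube-custom` regex above can be exercised outside Fluent Bit. A minimal sketch in Python, with the `(?<name>)` capture groups rewritten to Python's `(?P<name>)` syntax and a hypothetical file tag made up for the test (the pod, namespace, container, and docker id values are assumptions, not from the original logs):

```python
import re

# The kube-custom regex from parsers.conf, with Fluent Bit's (?<name>)
# groups rewritten to Python's (?P<name>) syntax.
pattern = re.compile(
    r"var\.log\.fluent-bit\."
    r"(?P<pod_name>[a-z0-9]([-a-z0-9]*[a-z0-9])?(\.[a-z0-9]([-a-z0-9]*[a-z0-9])?)*)_"
    r"(?P<namespace_name>[^_]+)_"
    r"(?P<container_name>.+)-"
    r"(?P<docker_id>[a-z0-9]{64})\.log$"
)

# Hypothetical tag, with dots standing in for path separators as tail produces
tag = "var.log.fluent-bit.mypod-0_default_app-" + "a" * 64 + ".log"

m = pattern.search(tag)
print(m.group("pod_name"), m.group("namespace_name"), m.group("container_name"))
# mypod-0 default app
```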
More of Log File:
{"log_level": {"name": "info", "__class_uuid__": "02e59486-f24d-46ad-8224-3acdf2a5732a"}, "log_logger": {"unpersistable": true}, "log_namespace": "abc.blue", "log_time": 1518034362.703279, "log_source": null, "log_format": "Generating"}
{"log_level": {"name": "info", "__class_uuid__": "02e59486-f24d-46ad-8224-3acdf2a5732a"}, "log_logger": {"unpersistable": true}, "log_namespace": "abc.red.one", "log_time": 1518034362.962467, "log_source": "abc", "deleted_id": 134456, "log_format": "Deleting"}
{"log_level": {"name": "warn", "__class_uuid__": "02e59486-f24d-46ad-8224-3acdf2a5732a"}, "log_logger": {"unpersistable": true}, "log_namespace": "abc.red.one", "log_time": 1518034362.962795, "log_source": "abc", "log_format": "Deleted"}
{"log_level": {"name": "info", "__class_uuid__": "02e59486-f24d-46ad-8224-3acdf2a5732a"}, "log_logger": {"unpersistable": true}, "log_namespace": "abc.red.one", "log_time": 1518034362.962955, "log_source": "abc", "log_format": "Getting"}
{"log_level": {"name": "info", "__class_uuid__": "02e59486-f24d-46ad-8224-3acdf2a5732a"}, "log_logger": {"unpersistable": true}, "log_namespace": "abc.blue", "log_time": 1518034362.984067, "log_source": null, "log_format": "Generating"}
This is with using Twisted's logger. However, I just decided to choose a different logger altogether to circumvent the issues I was having. Since I switched to Python's built-in logger, I have more control over the logging and do not need to use unix/epoch time. But a solution to this may help somebody in the future...
@ayenter looking at your example log file I can see the timestamp field is inside a nested map, the parser expects to have it at the first level of the map.
@edsiper
"log_time": 1518034362.984067, {"log_level": {"name": "info", "__class_uuid__": "02e59486-f24d-46ad-8224-3acdf2a5732a"}, "log_logger": {"unpersistable": true}, "log_namespace": "abc.blue", "log_source": null, "log_format": "Generating"}
such as this?
@ayenter yes
Marked as solved (log fields must be at the first level of the map).
Actually there were some more issues:
- `Time_Format` needs to be `%s.%L`, not just `%L`
- `Types log_time:float` in the parser configuration probably only applies to the output of the parser

How to set the Parser for handling the time field for logs like this:
"log_processed": {
"instant": {
"epochSecond": 1598599671,
"nanoOfSecond": 241587000
},
"thread": "SomeJavaThread01-consumer-0",
"level": "INFO",
"loggerName": "com.myproject.xyz.ClassName",
"message": "Consumer fetched no records; validating Kafka connection...",
"endOfBatch": false,
"loggerFqcn": "org.apache.logging.slf4j.Log4jLogger",
"contextMap": {},
"threadId": 40,
"threadPriority": 5
}
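The arithmetic needed here is straightforward: combine `epochSecond` and `nanoOfSecond` into fractional epoch seconds. A minimal Python sketch, using the field values from the log4j2 sample above (a plain `[PARSER]` entry cannot merge two fields like this, so inside Fluent Bit a scripting filter such as the Lua filter would likely be needed):

```python
from datetime import datetime, timezone

# Field values copied from the log4j2 "instant" object above
instant = {"epochSecond": 1598599671, "nanoOfSecond": 241587000}

# Combine the two fields into fractional epoch seconds
epoch = instant["epochSecond"] + instant["nanoOfSecond"] / 1e9

ts = datetime.fromtimestamp(epoch, tz=timezone.utc)
print(ts.strftime("%Y-%m-%d %H:%M:%S"))  # 2020-08-28 07:27:51
```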
This is default log format for logs printed in JSON layout using log4j2 in Java. How to specify two different fields, whose combination gives us the correct time?
Should this work?
```
[PARSER]
    Name         json
    Format       json
    Time_Key     instant
    Time_Format  %s.%L
    Time_Keep    On
```
The instant key here is a nested JSON map with epochSecond and nanoOfSecond fields.