Description:
I followed https://github.com/fluent/fluent-bit/issues/729 including the nginx regex pattern but to no avail. I tried a different pattern, will be mentioned below, that works in Rubular. Log is not extracted and merged.
Steps to reproduce:
apiVersion: v1
kind: ConfigMap
metadata:
name: efk-fluent-bit-config
labels:
app: efk-fluent-bit
data:
fluent-bit.conf: |-
[SERVICE]
Flush 1
Daemon Off
Log_Level info
Parsers_File parsers.conf
[INPUT]
Name tail
Tag kube.default.nginx.*
Path /var/log/containers/*_nginx*.log
Parser docker
DB /var/log/flb_kube.db
Refresh_Interval 10
Mem_Buf_Limit 5MB
Skip_Long_Lines On
[FILTER]
Name kubernetes
Match kube.*
Kube_URL https://kubernetes.default.svc:443
Kube_CA_File /var/run/secrets/kubernetes.io/serviceaccount/ca.crt
Kube_Token_File /var/run/secrets/kubernetes.io/serviceaccount/token
Merge_Log On
K8S-Logging.Parser On
[OUTPUT]
Name es
Match *
Host efk-elasticsearch-client.default.svc.cluster.local
Port 9200
Logstash_Format On
Retry_Limit False
Type flb_type
Logstash_Prefix kubernetes_cluster
Time_Key @timestamp_es
parsers.conf: |-
[PARSER]
Name docker
Format json
Time_Key time
Time_Format %Y-%m-%dT%H:%M:%S.%L
Time_Keep On
# Command | Decoder | Field | Optional Action
# =============|==================|=================
Decode_Field_As escaped log
[PARSER]
Name nginx
Format regex
Regex ^(?<remote>\S*) (?<host>\S*) (?<user>\S*) \[(?<time>[^\]]*)\] "(?<method>\S+)(?: +(?<path>[^\"]*?)(?:\s+\S*)?)?" (?<code>\S*) (?<size>\S*)(?: "(?<referer>[^\"]*)" "(?<agent>[^\"]*)"(?:\s+(?<http_x_forwarded_for>\S+))?)?$
Time_Key time
Time_Format %d/%b/%Y:%H:%M:%S %z
And in my deployment
apiVersion: apps/v1
kind: Deployment
metadata:
name: redacted-app
labels:
app: redacted
spec:
replicas: 1
selector:
matchLabels:
app: redacted
template:
metadata:
annotations:
fluentbit.io/parser: nginx
labels:
app: redacted
And from the logs (from Kibana)
{
"_index": "kubernetes_cluster-2018.10.09",
"_type": "flb_type",
"_id": "ogP3WGYBlVl5afTGSP3L",
"_version": 1,
"_score": null,
"_source": {
"@timestamp_es": "2018-10-09T13:14:52.960Z",
"log": "172.17.0.1 - - [09/Oct/2018:13:14:52 +0000] \"POST /enterprise/info/ HTTP/1.1\" 404 104 \"http://192.168.39.251:30309/enterprise/info/\" \"Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/70.0.3538.16 Safari/537.36\" \"-\"\n",
"stream": "stdout",
"time": "2018-10-09T13:14:52.960867435Z",
"kubernetes": {
"pod_name": "redacted-app-586c454745-7ztk8",
"namespace_name": "default",
"pod_id": "d1361c4f-cbc0-11e8-abc9-7487d6dd4e5c",
"labels": {
"app": "redacted",
"pod-template-hash": "1427010301"
},
"annotations": {
"fluentbit.io/parser": "nginx"
},
"host": "minikube",
"container_name": "nginx-proxy",
"docker_id": "b6ba28183a2edc3269657037f967c7cc2d70874a109509548ef56e3f2657e1c8"
}
},
"fields": {
"@timestamp_es": [
"2018-10-09T13:14:52.960Z"
],
"time": [
"2018-10-09T13:14:52.960Z"
]
},
"highlight": {
"kubernetes.annotations.fluentbit.io/parser.keyword": [
"@kibana-highlighted-field@nginx@/kibana-highlighted-field@"
],
"kubernetes.annotations.fluentbit.io/parser": [
"@kibana-highlighted-field@nginx@/kibana-highlighted-field@"
],
"kubernetes.container_name": [
"@kibana-highlighted-field@nginx@/kibana-highlighted-field@-proxy"
]
},
"sort": [
1539090892960
]
}
The difference from above is the regex used ^(?<remote>\S*) (?<host>\S*) (?<user>\S*) \[(?<time>[^\]]*)\] "(?<method>\S+)(?: +(?<path>[^\"]*?)(?:\s+\S*)?)?" (?<code>\S*) (?<size>\S*)(?: "(?<referer>[^\"]*)" "(?<agent>[^\"]*)"(?:\s+(?<http_x_forwarded_for>\S+))?)?$
Test: http://rubular.com/r/glM0o40A9M
Q: Does the last part of the log \" \"-\"\n from the json matter ?
Expected behavior
Same expected output from https://github.com/fluent/fluent-bit/issues/729. To get all the fields that I described in the Rubular output in the fluent-bit log.
_Originally posted by @aldnav in https://github.com/fluent/fluent-bit/issue_comments#issuecomment-428192888_
Inspecting the logs yield
Fluentbit logs
[2018/10/10 06:38:44] [debug] [in_tail] file=/var/log/containers/redacted-app-586c454745-7ztk8_default_nginx-proxy-cb46caffcb63ad165faa760e5f4dd009f167d3b7afded8b3b5e36745c0f71c07.log event
[2018/10/10 06:38:44] [debug] [filter_kube] could not merge JSON log as requested
[2018/10/10 06:38:44] [debug] [input tail.0] [mem buf] size = 638
Tailing that log
root@fluent-bit-p4ng2:/# tail /var/log/containers/redacted-app-586c454745-7ztk8_default_nginx-proxy-cb46caffcb63ad165faa760e5f4dd009f167d3b7afded8b3b5e36745c0f71c07.log
{"log":"172.17.0.1 - - [10/Oct/2018:06:35:09 +0000] \"GET /api-sileo/winners/winners-page/filter/?top=0\u0026bottom=8 HTTP/1.1\" 200 12 \"http://192.168.39.251:30309/\" \"Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/70.0.3538.16 Safari/537.36\" \"-\"\n","stream":"stdout","time":"2018-10-10T06:35:09.267040602Z"}
{"log":"172.17.0.1 - - [10/Oct/2018:06:35:09 +0000] \"GET /api-sileo/winners/cf-tournament-winners/filter/?top=0\u0026bottom=8 HTTP/1.1\" 200 12 \"http://192.168.39.251:30309/\" \"Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/70.0.3538.16 Safari/537.36\" \"-\"\n","stream":"stdout","time":"2018-10-10T06:35:09.271704302Z"}
{"log":"172.17.0.1 - - [10/Oct/2018:06:35:09 +0000] \"GET /api/video/video-player/filter/?slug=how-to-win-in-redacted HTTP/1.1\" 200 14 \"http://192.168.39.251:30309/\" \"Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/70.0.3538.16 Safari/537.36\" \"-\"\n","stream":"stdout","time":"2018-10-10T06:35:09.287174688Z"}
{"log":"172.17.0.1 - - [10/Oct/2018:06:35:09 +0000] \"GET /api/video/video-player/filter/?slug=cfschool-what-is-redactedcom HTTP/1.1\" 200 14 \"http://192.168.39.251:30309/\" \"Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/70.0.3538.16 Safari/537.36\" \"-\"\n","stream":"stdout","time":"2018-10-10T06:35:09.288450651Z"}
{"log":"172.17.0.1 - - [10/Oct/2018:06:35:09 +0000] \"GET /api-sileo/home/profile/filter/?top=0 HTTP/1.1\" 200 12 \"http://192.168.39.251:30309/\" \"Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/70.0.3538.16 Safari/537.36\" \"-\"\n","stream":"stdout","time":"2018-10-10T06:35:09.298623044Z"}
{"log":"172.17.0.1 - - [10/Oct/2018:06:35:09 +0000] \"GET /api-sileo/landing/featured/filter/?size_per_request=10 HTTP/1.1\" 200 12 \"http://192.168.39.251:30309/\" \"Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/70.0.3538.16 Safari/537.36\" \"-\"\n","stream":"stdout","time":"2018-10-10T06:35:09.322164595Z"}
{"log":"172.17.0.1 - - [10/Oct/2018:06:38:43 +0000] \"GET / HTTP/1.1\" 200 40431 \"-\" \"Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/70.0.3538.16 Safari/537.36\" \"-\"\n","stream":"stdout","time":"2018-10-10T06:38:43.631043381Z"}
{"log":"172.17.0.1 - - [10/Oct/2018:06:38:43 +0000] \"GET /session-info/get-session-info/ HTTP/1.1\" 200 407 \"http://192.168.39.251:30309/\" \"Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/70.0.3538.16 Safari/537.36\" \"-\"\n","stream":"stdout","time":"2018-10-10T06:38:43.677699068Z"}
{"log":"172.17.0.1 - - [10/Oct/2018:06:38:44 +0000] \"GET /session-info/get-session-info/ HTTP/1.1\" 200 407 \"http://192.168.39.251:30309/\" \"Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/70.0.3538.16 Safari/537.36\" \"-\"\n","stream":"stdout","time":"2018-10-10T06:38:44.255370783Z"}
{"log":"172.17.0.1 - - [10/Oct/2018:06:38:44 +0000] \"GET /session-info/get-session-info/ HTTP/1.1\" 200 407 \"http://192.168.39.251:30309/\" \"Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/70.0.3538.16 Safari/537.36\" \"-\"\n","stream":"stdout","time":"2018-10-10T06:38:44.321339656Z"}
I made adjustments with the regex used to consider the backslashes.
See http://rubular.com/r/jw07YfUfTn
But this config did not work either. [filter_kube] could not merge JSON log as requested still appears and fields aren't parsed and merged.
I made a test with my local build of fluentbit
testconf.conf
[SERVICE]
Flush 1
Daemon Off
Log_Level debug
Parsers_File testparser.conf
[INPUT]
Name tail
### This input is copied straight from nginx stdout access log ###
Tag nginx.data
Path nginxlog
[INPUT]
Name dummy
Tag dummy.data
#### This input has backslash from kibana json view ###
Dummy {"log": "172.17.0.1 - - [10/Oct/2018:09:38:14 +0000] \"GET /api-sileo/landing/opentournament/filter/?status=1u0026started_date__lte=1539164292u0026prize_order=u0026top=0u0026bottom=9 HTTP/1.1\" 200 12 \"http://192.168.39.251:30309/browsing/\" \"Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/70.0.3538.16 Safari/537.36\" \"-\"\n"}
[FILTER]
Name parser
Match nginx.*
Key_Name log
Parser nginxp
[FILTER]
Name parser
Match dummy.*
Key_Name log
Parser dummy_test_2
[OUTPUT]
Name stdout
Match *
testparser.conf
[PARSER]
Name nginxp
Format regex
### clean nginx regex (get input from nginx tail) ###
Regex ^(?<remote>\S*) (?<host>\S*) (?<user>\S*) \[(?<time>[^\]]*)\] "(?<method>\S+)(?: +(?<path>[^\"]*?)(?:\s+\S*)?)?" (?<code>\S*) (?<size>\S*)(?: "(?<referer>[^\"]*)" "(?<agent>[^\"]*)"(?:\s+(?<http_x_forwarded_for>\S+))?)?$
Time_Key time
Time_Format %d/%b/%Y:%H:%M:%S %z
[PARSER]
Name dummy_test_2
Format regex
### get input from kibana json (with backslashes) ###
Regex ^(?<remote>\S*) (?<host>\S*) (?<user>\S*) \[(?<time>[^\]]*)\] \\"(?<method>\S+)(?: +(?<path>[^\"]*?)(?:\s+\S*)?)?" (?<code>\S*) (?<size>\S*)(?: \\"(?<referer>[^\"]*)\\" \\"(?<agent>[^\"]*)\\"(?:\s+\\"(?<http_x_forwarded_for>\S+)\\"\S+)?)?$
The test uses these two regex patterns:
With backslash http://rubular.com/r/glM0o40A9M
Without backslash http://rubular.com/r/jw07YfUfTn
I tested both on kubernetes hoping that the parser pipeline follows docker -> nginx but the log remains intact.
The local test has the following results.
➜ build bin/fluent-bit -c testconf.conf
Fluent-Bit v0.14.1
Copyright (C) Treasure Data
[2018/10/10 17:58:20] [ info] [engine] started (pid=5241)
[2018/10/10 17:58:20] [debug] [in_tail] inotify watch fd=19
[2018/10/10 17:58:20] [debug] [in_tail] scanning path nginxlog
[2018/10/10 17:58:20] [debug] [in_tail] add to scan queue nginxlog, offset=0
[2018/10/10 17:58:20] [debug] [router] input=tail.0 'DYNAMIC TAG'
[2018/10/10 17:58:20] [debug] [router] match rule dummy.0:stdout.0
[2018/10/10 17:58:20] [debug] [input tail.0] [mem buf] size = 1388
[2018/10/10 17:58:20] [debug] [in_tail] file=nginxlog read=1532 lines=4
[2018/10/10 17:58:20] [debug] [in_tail] file=nginxlog promote to TAIL_EVENT
[0] nginx.data: [1539164292.000000000, {"remote"=>"172.17.0.1", "host"=>"-", "user"=>"-", "method"=>"GET", "path"=>"/api-sileo/landing/featured/filter/?size_per_request=10", "code"=>"200", "size"=>"12", "referer"=>"http://192.168.39.251:30309/browsing/", "agent"=>"Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/70.0.3538.16 Safari/537.36", "http_x_forwarded_for"=>""-""}]
[2018/10/10 17:58:20] [debug] [task] created task=0x1bcd880 id=0 OK
[1] nginx.data: [1539164292.000000000, {"remote"=>"172.17.0.1", "host"=>"-", "user"=>"-", "method"=>"GET", "path"=>"/api-sileo/landing/trendingtournament/filter/?top=0&bottom=9", "code"=>"200", "size"=>"12", "referer"=>"http://192.168.39.251:30309/browsing/", "agent"=>"Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/70.0.3538.16 Safari/537.36", "http_x_forwarded_for"=>""-""}]
[2] nginx.data: [1539164293.000000000, {"remote"=>"172.17.0.1", "host"=>"-", "user"=>"-", "method"=>"GET", "path"=>"/api-sileo/landing/battlingtournament/filter/?status=3&started_date__lte=1539164292&prize_order=&top=0&bottom=9", "code"=>"200", "size"=>"12", "referer"=>"http://192.168.39.251:30309/browsing/", "agent"=>"Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/70.0.3538.16 Safari/537.36", "http_x_forwarded_for"=>""-""}]
[3] nginx.data: [1539164294.000000000, {"remote"=>"172.17.0.1", "host"=>"-", "user"=>"-", "method"=>"GET", "path"=>"/api-sileo/landing/rankingtournament/filter/?status=2&started_date__lte=1539164292&prize_order=&top=0&bottom=9", "code"=>"200", "size"=>"12", "referer"=>"http://192.168.39.251:30309/browsing/", "agent"=>"Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/70.0.3538.16 Safari/537.36", "http_x_forwarded_for"=>""-""}]
[2018/10/10 17:58:20] [debug] [input dummy.0] [mem buf] size = 416
[2018/10/10 17:58:20] [debug] [task] destroy task=0x1bcd880 (task_id=0)
[2018/10/10 17:58:20] [debug] [dyntag tail.0] 0x1bcdb90 destroy (tag=nginx.data, bytes=1388)
[0] dummy.data: [1539165501.000329612, {"remote"=>"172.17.0.1", "host"=>"-", "user"=>"-", "time"=>"10/Oct/2018:09:38:14 +0000", "method"=>"GET", "path"=>"/api-sileo/landing/opentournament/filter/?status=1u0026started_date__lte=1539164292u0026prize_order=u0026top=0u0026bottom=9", "code"=>"200", "size"=>"12", "referer"=>"http://192.168.39.251:30309/browsing/", "agent"=>"Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/70.0.3538.16 Safari/537.36", "http_x_forwarded_for"=>"-"}]
[2018/10/10 17:58:21] [debug] [task] created task=0x1bcd880 id=0 OK
[2018/10/10 17:58:21] [debug] [input dummy.0] [mem buf] size = 416
[2018/10/10 17:58:21] [debug] [task] destroy task=0x1bcd880 (task_id=0)
^C[engine] caught signal (SIGINT)
[2018/10/10 17:58:22] [ info] [input] pausing tail.0
[2018/10/10 17:58:22] [ info] [input] pausing dummy.0
The local test parsed it correctly.
The pattern is:
^(?<remote>[^ ]*) (?<host>[^ ]*) (?<user>[^ ]*) \[(?<time>[^\]]*)\] "(?<method>\S+)(?: +(?<path>[^\"]*?)(?: +\S*)?)?" (?<code>[^ ]*) (?<size>[^ ]*)(?: "(?<referer>[^\"]*)" "(?<agent>[^\"]*)")?$
But you have
Dummy {"log": "172.17.0.1 - - [10/Oct/2018:09:38:14 +0000] \"GET /api-sileo/landing/opentournament/filter/?status=1u0026started_date__lte=1539164292u0026prize_order=u0026top=0u0026bottom=9 HTTP/1.1\" 200 12 \"http://192.168.39.251:30309/browsing/\" \"Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/70.0.3538.16 Safari/537.36\" \"-\"\n"}
a trailing "-"\n
Unescaped, your log is:
172.17.0.1 - - [10/Oct/2018:09:38:14 +0000] "GET /api-sileo/landing/opentournament/filter/?status=1u0026started_date__lte=1539164292u0026prize_order=u0026top=0u0026bottom=9 HTTP/1.1" 200 12 "http://192.168.39.251:30309/browsing/" "Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/70.0.3538.16 Safari/537.36" "-"\n
So, i'm not sure why you have that "-"\n on the end. If you have changed your nginx config to use a different format in access log, you'll need to change your signature. If you feel your log is correct, the signature would be:
^(?<remote>[^ ]*) (?<host>[^ ]*) (?<user>[^ ]*) \[(?<time>[^\]]*)\] "(?<method>\S+)(?: +(?<path>[^\"]*?)(?: +\S*)?)?" (?<code>[^ ]*) (?<size>[^ ]*)(?: "(?<referer>[^\"]*)" "(?<agent>[^\"]*)")?
consider also setting "merge_log_trim" in the Kubernetes config if you are using it.
Thanks @donbowman The funny thing is I did not customize the log_format of Nginx.
I also tried "Merge_Log_Trim On". And tried your last regexp but I still get [filter_kube] could not merge JSON log as requested
If this matters:
I managed to set nginx acess_log to:
log_format test '$remote_addr $host $remote_user [$time_local] '
'"$request" $status $body_bytes_sent '
'"$http_referer" "$http_user_agent"';
access_log /var/log/nginx/access.log test;
and tailing the log file
root@fluent-bit-47p6z:/# tail /var/log/containers/redacted-app-59564c9548-c9g52_default_nginx-proxy-19a6fc87c0c06558c1fcd18039c1aad51ca1df74936c75bf202f22e79c38f42e.log
{"log":"172.17.0.1 192.168.39.251 - [16/Oct/2018:02:14:51 +0000] \"GET /session-info/get-session-info/ HTTP/1.1\" 200 221 \"http://192.168.39.251:31714/\" \"Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/70.0.3538.54 Safari/537.36\"\n","stream":"stdout","time":"2018-10-16T02:14:51.199419312Z"}
{"log":"172.17.0.1 192.168.39.251 - [16/Oct/2018:02:14:52 +0000] \"GET /session-info/get-session-info/ HTTP/1.1\" 200 221 \"http://192.168.39.251:31714/\" \"Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/70.0.3538.54 Safari/537.36\"\n","stream":"stdout","time":"2018-10-16T02:14:52.655943811Z"}
{"log":"172.17.0.1 192.168.39.251 - [16/Oct/2018:02:14:52 +0000] \"GET /session-info/get-session-info/ HTTP/1.1\" 200 221 \"http://192.168.39.251:31714/\" \"Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/70.0.3538.54 Safari/537.36\"\n","stream":"stdout","time":"2018-10-16T02:14:52.674953975Z"}
{"log":"172.17.0.1 192.168.39.251 - [16/Oct/2018:02:14:52 +0000] \"GET /?ajax=true HTTP/1.1\" 200 539 \"http://192.168.39.251:31714/\" \"Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/70.0.3538.54 Safari/537.36\"\n","stream":"stdout","time":"2018-10-16T02:14:52.785946146Z"}
{"log":"172.17.0.1 192.168.39.251 - [16/Oct/2018:02:14:52 +0000] \"GET /api-sileo/winners/winners-page/filter/?top=0\u0026bottom=8 HTTP/1.1\" 200 12 \"http://192.168.39.251:31714/\" \"Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/70.0.3538.54 Safari/537.36\"\n","stream":"stdout","time":"2018-10-16T02:14:52.835820492Z"}
{"log":"172.17.0.1 192.168.39.251 - [16/Oct/2018:02:14:52 +0000] \"GET /api-sileo/winners/cf-tournament-winners/filter/?top=0\u0026bottom=8 HTTP/1.1\" 200 12 \"http://192.168.39.251:31714/\" \"Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/70.0.3538.54 Safari/537.36\"\n","stream":"stdout","time":"2018-10-16T02:14:52.840595389Z"}
{"log":"172.17.0.1 192.168.39.251 - [16/Oct/2018:02:14:52 +0000] \"GET /api/video/video-player/filter/?slug=how-to-win-in-redacted HTTP/1.1\" 200 14 \"http://192.168.39.251:31714/\" \"Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/70.0.3538.54 Safari/537.36\"\n","stream":"stdout","time":"2018-10-16T02:14:52.866050142Z"}
{"log":"172.17.0.1 192.168.39.251 - [16/Oct/2018:02:14:52 +0000] \"GET /api-sileo/landing/featured/filter/?size_per_request=10 HTTP/1.1\" 200 12 \"http://192.168.39.251:31714/\" \"Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/70.0.3538.54 Safari/537.36\"\n","stream":"stdout","time":"2018-10-16T02:14:52.866085395Z"}
{"log":"172.17.0.1 192.168.39.251 - [16/Oct/2018:02:14:52 +0000] \"GET /api/video/video-player/filter/?slug=cfschool-what-is-redactedcom HTTP/1.1\" 200 14 \"http://192.168.39.251:31714/\" \"Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/70.0.3538.54 Safari/537.36\"\n","stream":"stdout","time":"2018-10-16T02:14:52.867881901Z"}
{"log":"172.17.0.1 192.168.39.251 - [16/Oct/2018:02:14:57 +0000] \"GET /business/?ajax=true HTTP/1.1\" 200 441 \"http://192.168.39.251:31714/\" \"Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/70.0.3538.54 Safari/537.36\"\n","stream":"stdout","time":"2018-10-16T02:14:57.244942033Z"}
and from Kibana JSON view
"_source": {
"@timestamp_es": "2018-10-16T02:14:57.244Z",
"log": "172.17.0.1 192.168.39.251 - [16/Oct/2018:02:14:57 +0000] \"GET /business/?ajax=true HTTP/1.1\" 200 441 \"http://192.168.39.251:31714/\" \"Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/70.0.3538.54 Safari/537.36\"\n",
"stream": "stdout",
"time": "2018-10-16T02:14:57.244942033Z",
"kubernetes": {
"pod_name": "redacted-app-59564c9548-c9g52",
"namespace_name": "default",
"pod_id": "07149771-d0e9-11e8-a6a1-7487d6dd4e5c",
"labels": {
"app": "redacted",
"pod-template-hash": "1512075104"
},
"annotations": {
"fluentbit.io/parser": "nginxp"
},
"host": "minikube",
"container_name": "nginx-proxy",
"docker_id": "19a6fc87c0c06558c1fcd18039c1aad51ca1df74936c75bf202f22e79c38f42e"
}
},
\n and \" are still in the log so I used the following regex pattern
^(?<remote>[^ ]*) (?<host>[^ ]*) (?<user>[^ ]*) \[(?<time>[^\]]*)\] \\"(?<method>\S+)(?: +(?<path>[^\"]*?)(?: +\S*)?)?" (?<code>[^ ]*) (?<size>[^ ]*)(?: \\"(?<referer>[^\"]*)\\" \\"(?<agent>[^\"]*)\\")?\\n$
a slight modification from the nginx parser regex. As in http://rubular.com/r/KZRaYAMrDK
However, I still get
[2018/10/16 02:14:57] [debug] [filter_kube] could not merge JSON log as requested
fluent-bit.conf
[SERVICE]
Flush 1
Daemon Off
Log_Level debug
Parsers_File parsers-wtf.conf
[INPUT]
Name tail
Tag kube.default.nginx.*
Path /var/log/containers/*_nginx*.log
Parser docker
DB /var/log/flb_kube.db
Refresh_Interval 10
Mem_Buf_Limit 5MB
Skip_Long_Lines On
[FILTER]
Name kubernetes
Match kube.*
Kube_URL https://kubernetes.default.svc:443
Kube_CA_File /var/run/secrets/kubernetes.io/serviceaccount/ca.crt
Kube_Token_File /var/run/secrets/kubernetes.io/serviceaccount/token
Merge_Log_Trim On
Merge_Log On
# Merge_Log_Key rlog
K8S-Logging.Parser On
[OUTPUT]
Name es
Match *
Host efk-elasticsearch-client.default.svc.cluster.local
Port 9200
Logstash_Format On
Retry_Limit False
Type flb_type
Logstash_Prefix kubernetes_cluster
Time_Key @timestamp_es
[OUTPUT]
Name stdout
Match *
parsers-wtf.conf
[PARSER]
Name docker
Format json
Time_Key time
Time_Format %Y-%m-%dT%H:%M:%S.%L
Time_Keep On
# Command | Decoder | Field | Optional Action
# =============|==================|=================
Decode_Field_As escaped log
Decode_Field_As escaped stream
[PARSER]
Name nginxp
Format regex
Regex ^(?<remote>[^ ]*) (?<host>[^ ]*) (?<user>[^ ]*) \[(?<time>[^\]]*)\] \\"(?<method>\S+)(?: +(?<path>[^\"]*?)(?: +\S*)?)?" (?<code>[^ ]*) (?<size>[^ ]*)(?: \\"(?<referer>[^\"]*)\\" \\"(?<agent>[^\"]*)\\")?\\n$
Time_Key time
Time_Format %d/%b/%Y:%H:%M:%S
Anyone lucky to get this working?
i don't think you need the escapes in the parser:
You have:
^(?<remote>[^ ]*) (?<host>[^ ]*) (?<user>[^ ]*) \[(?<time>[^\]]*)\] \\"(?<method>\S+)(?: +(?<path>[^\"]*?)(?: +\S*)?)?" (?<code>[^ ]*) (?<size>[^ ]*)(?: \\"(?<referer>[^\"]*)\\" \\"(?<agent>[^\"]*)\\")?\\n$
I think you want \" instead of \" etc.
You can test the pattern on https://regex101.com/ pretty simply.
I applied you log message:
172.17.0.1 192.168.39.251 - [16/Oct/2018:02:14:51 +0000] "GET /session-info/get-session-info/ HTTP/1.1" 200 221 "http://192.168.39.251:31714/" "Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/70.0.3538.54 Safari/537.36"
and the suggested pattern:
^(?<remote>[^ ]*) (?<host>[^ ]*) (?<user>[^ ]*) \[(?<time>[^\]]*)\] "(?<method>\S+)(?: +(?<path>[^\"]*?)(?: +\S*)?)?" (?<code>[^ ]*) (?<size>[^ ]*)(?: "(?<referer>[^\"]*)" "(?<agent>[^\"]*)")?
on regex101.com and I see it extract ok (and thus the merge should work i think).
if the merge doesn't work it would imply maybe a duplicate field, likely time,and you could consider a docker line w/o the time key.
@edsiper can you take a look at this please?
@donbowman That would be perfect only that the log message contains \" and \n
It seems like no regex pattern is working?
I tried setting
Decode_Field_As escaped_utf8 log
in docker parser which yields
"log": "172.17.0.1 192.168.39.251 - [16/Oct/2018:03:18:40 +0000] \"GET /api-sileo/version-info/version-info/filter/?platform=1 HTTP/1.1\" 200 12 \"http://192.168.39.251:31714/populaasdfasdfasdf/\" \"Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/70.0.3538.54 Safari/537.36\"\u0000",
then I adjusted the regex pattern, (\n becomes \u0000), to
Regex ^(?<remote>[^ ]*) (?<host>[^ ]*) (?<user>[^ ]*) \[(?<time>[^\]]*)\] \\"(?<method>\S+)(?: +(?<path>[^\"]*?)(?: +\S*)?)?" (?<code>[^ ]*) (?<size>[^ ]*)(?: \\"(?<referer>[^\"]*)\\" \\"(?<agent>[^\"]*)\\")?\\u0000$
as in http://rubular.com/r/oUqgP7DUtp
I also tried to change the time key in the pattern <rtime> to avoid duplicate when merging. Still no luck.
Thanks @donbowman I tried your pattern and it did merge fields.
However, I'm still puzzled as to why kibana json output log as
172.17.0.1 192.168.39.251 - [16/Oct/2018:03:45:20 +0000] \"GET /populaasdfasdfasdf/ HTTP/1.1\" 404 4617 \"-\" \"Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/70.0.3538.54 Safari/537.36\"\n
Question
I was under the impression that the parser flow is docker -> nginxp that's why I tested it against the above log. If that's not the case, how else would I test the log against other patterns (for other apps aside from nginx)?
if you have \" and \n in the message, it sounds like the decoder is not
running.
On Mon, 15 Oct 2018 at 23:00, Aldrin Navarro notifications@github.com
wrote:
@donbowman https://github.com/donbowman That would be perfect only that
the log message contains \" and \n—
You are receiving this because you were mentioned.
Reply to this email directly, view it on GitHub
https://github.com/fluent/fluent-bit/issues/838#issuecomment-430084050,
or mute the thread
https://github.com/notifications/unsubscribe-auth/AE5Ok9la_ryFrN87Dv2SiUBHD3ZtEbr9ks5ulUvXgaJpZM4XTS5F
.
i think that is normal for kibana (except the \n). Kibana is re-escaping to make it valid json.
E.g. I see:
{\"version\":\"e5b877e0587fff4797e0dc3a6c01514601ea2d562cf5cd7e3f927bcaaea3e7ec\",\"initContainers\":[\"istio-init\"],\"containers\":[\"istio-proxy\"],\"volumes\":[\"istio-envoy\",\"istio-certs\"],\"imagePullSecrets\":[\"regcred\"]}
to see the raw log, i run kubectl logs (or docker log, or jsut cp the file from /var/lib/containers/...).
Thanks @donbowman for clearing that. Closing issue now.