Fluent-bit: nginx-ingress 0.25.0 changes the log format

Created on 19 Jul 2019 · 6 Comments · Source: fluent/fluent-bit

Bug Report

Describe the bug
After upgrading nginx-ingress to 0.25.0, the access log contains a new placeholder, $proxy_alternative_upstream_name,
so the k8s-nginx-ingress parser has to be changed once you upgrade nginx-ingress to 0.25.0 or later:

[PARSER]
    Name        k8s-nginx-ingress
    Format      regex
    Regex       ^(?<host>[^ ]*) - \[(?<real_ip>[^ ]*)\] - (?<user>[^ ]*) \[(?<time>[^\]]*)\] "(?<method>\S+)(?: +(?<path>[^\"]*?)(?: +\S*)?)?" (?<code>[^ ]*) (?<size>[^ ]*) "(?<referer>[^\"]*)" "(?<agent>[^\"]*)" (?<request_length>[^ ]*) (?<request_time>[^ ]*) \[(?<proxy_upstream_name>[^ ]*)\] \[(?<proxy_alternative_upstream_name>[^ ]*)\] (?<upstream_addr>[^ ]*) (?<upstream_response_length>[^ ]*) (?<upstream_response_time>[^ ]*) (?<upstream_status>[^ ]*) (?<last>[^$]*)
    Time_Key    time
    Time_Format %d/%b/%Y:%H:%M:%S %z
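
As a quick sanity check outside Fluent Bit, the Time_Format above can be tried against a timestamp in the shape the ingress controller writes between the square brackets. This is only a sketch in Python, on the assumption that Python's strptime treats these directives the same way Fluent Bit's does; the timestamp value is just an illustration.

```python
from datetime import datetime, timezone

# Same strptime-style directives as the Time_Format line in the parser.
TIME_FORMAT = "%d/%b/%Y:%H:%M:%S %z"

# Illustrative timestamp, as it would appear inside [...] in an access log line.
raw_time = "13/Sep/2019:18:27:53 +0000"

# %b expects an English month abbreviation under the default C locale.
parsed = datetime.strptime(raw_time, TIME_FORMAT)
print(parsed.isoformat())  # 2019-09-13T18:27:53+00:00
```

If this raises ValueError for a real log line, the Time_Format probably needs adjusting before the parser can populate Time_Key.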

To Reproduce

  • apply the old k8s-nginx-ingress parser
  • upgrade nginx-ingress to 0.25.0

Your Environment

  • Version used: 1.0.6
  • Configuration:
  • Environment name and version (e.g. Kubernetes? What version?): Kubernetes 1.11.5
  • Server type and version: VM
  • Operating System and version: CoreOS
  • Filters and plugins:

PS: It would be great if someone could help write a better regex parser that is compatible with both nginx-ingress 0.25.0 and lower versions.

waiting-for-user

All 6 comments

Would you please supply two examples, one of the old and one of the new ingress log lines?

https://github.com/kubernetes/ingress-nginx/pull/4333

I also opened a PR against nginx-ingress so that the log format documentation gets updated.

New format example

old format

Would you please supply two examples, one of the old and one of the new ingress log lines?

New format:

99.99.88.66 - [99.99.88.66] - username1 [13/Sep/2019:18:27:53 +0000] "POST /api/infra/graphql HTTP/2.0" 200 629 "https://kibana.domain/app/infra" "Mozilla/5.0 (Windows NT 6.3; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/76.0.3809.100 Safari/537.36" 876 0.135 [logs-kibana-logs-kb-http-5601] [] 10.92.131.195:5601 646 0.136 200 13cf502bfb897683a76e0a663426ee85

Old format:

99.99.88.66 - [99.99.88.66] - username1 [13/Sep/2019:18:27:53 +0000] "POST /api/infra/graphql HTTP/2.0" 200 629 "https://kibana.domain/app/infra" "Mozilla/5.0 (Windows NT 6.3; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/76.0.3809.100 Safari/537.36" 876 0.135 [logs-kibana-logs-kb-http-5601] 10.92.131.195:5601 646 0.136 200 13cf502bfb897683a76e0a663426ee85
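
To see exactly what differs between the two lines, they can be compared field by field. A small Python sketch (the variable names are made up for illustration; the two lines are the ones quoted above):

```python
# New-format and old-format sample lines, as posted in this thread.
NEW = (
    '99.99.88.66 - [99.99.88.66] - username1 [13/Sep/2019:18:27:53 +0000] '
    '"POST /api/infra/graphql HTTP/2.0" 200 629 "https://kibana.domain/app/infra" '
    '"Mozilla/5.0 (Windows NT 6.3; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) '
    'Chrome/76.0.3809.100 Safari/537.36" 876 0.135 [logs-kibana-logs-kb-http-5601] [] '
    '10.92.131.195:5601 646 0.136 200 13cf502bfb897683a76e0a663426ee85'
)
OLD = (
    '99.99.88.66 - [99.99.88.66] - username1 [13/Sep/2019:18:27:53 +0000] '
    '"POST /api/infra/graphql HTTP/2.0" 200 629 "https://kibana.domain/app/infra" '
    '"Mozilla/5.0 (Windows NT 6.3; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) '
    'Chrome/76.0.3809.100 Safari/537.36" 876 0.135 [logs-kibana-logs-kb-http-5601] '
    '10.92.131.195:5601 646 0.136 200 13cf502bfb897683a76e0a663426ee85'
)

new_fields = NEW.split(" ")
old_fields = OLD.split(" ")

# Index of the first space-separated field where the two lines disagree.
first_diff = next(
    i for i, (o, n) in enumerate(zip(old_fields, new_fields)) if o != n
)
extra = new_fields[first_diff]
print(extra)  # []

# Dropping that single field from the new line gives back the old line.
same_otherwise = new_fields[:first_diff] + new_fields[first_diff + 1:] == old_fields
print(same_otherwise)  # True
```

So the only change is the extra bracketed field right after [$proxy_upstream_name], which is empty ("[]") in this sample.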

PS: It would be great if someone could help write a better regex parser that is compatible with both nginx-ingress 0.25.0 and lower versions.

^(?<host>[^ ]*) - \[(?<real_ip>[^ ]*)\] - (?<user>[^ ]*) \[(?<time>[^\]]*)\] \\*"(?<method>\S+)(?: +(?<path>[^\"]*?)(?: +\S*)?)?\\*" (?<code>[^ ]*) (?<size>[^ ]*) \\*"(?<referer>[^\"]*)\\*" \\*"(?<agent>[^\"]*)\\*" (?<request_length>[^ ]*) (?<request_time>[^ ]*) \[(?<proxy_upstream_name>[^ ]*)\] (\[(?<proxy_alternative_upstream_name>[^ ]*)\] )?(?<upstream_addr>[^ ]*) (?<upstream_response_length>[^ ]*) (?<upstream_response_time>[^ ]*) (?<upstream_status>[^ ]*) (?<reg_id>[^ ]*).*$
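
For anyone who wants to try this regex without deploying it, here is a rough Python check against the two sample lines posted above. Fluent Bit takes Ruby-style (?<name>...) groups while Python's re module wants (?P<name>...), so the sketch rewrites the group prefix first. That conversion is an assumption made for testing only, and Python's engine is not the one Fluent Bit uses, so treat a pass here as a hint rather than proof.

```python
import re

# Regex from the comment above, split across lines for readability.
ONIGMO_REGEX = (
    r'^(?<host>[^ ]*) - \[(?<real_ip>[^ ]*)\] - (?<user>[^ ]*) \[(?<time>[^\]]*)\] '
    r'\\*"(?<method>\S+)(?: +(?<path>[^\"]*?)(?: +\S*)?)?\\*" '
    r'(?<code>[^ ]*) (?<size>[^ ]*) \\*"(?<referer>[^\"]*)\\*" \\*"(?<agent>[^\"]*)\\*" '
    r'(?<request_length>[^ ]*) (?<request_time>[^ ]*) \[(?<proxy_upstream_name>[^ ]*)\] '
    r'(\[(?<proxy_alternative_upstream_name>[^ ]*)\] )?(?<upstream_addr>[^ ]*) '
    r'(?<upstream_response_length>[^ ]*) (?<upstream_response_time>[^ ]*) '
    r'(?<upstream_status>[^ ]*) (?<reg_id>[^ ]*).*$'
)

# Naive rewrite to Python's named-group syntax; safe here only because the
# pattern contains no lookbehind assertions such as (?<= or (?<!.
pattern = re.compile(ONIGMO_REGEX.replace("(?<", "(?P<"))

# The sample lines differ only by the "[] " field, so build both from shared parts.
HEAD = (
    '99.99.88.66 - [99.99.88.66] - username1 [13/Sep/2019:18:27:53 +0000] '
    '"POST /api/infra/graphql HTTP/2.0" 200 629 "https://kibana.domain/app/infra" '
    '"Mozilla/5.0 (Windows NT 6.3; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) '
    'Chrome/76.0.3809.100 Safari/537.36" 876 0.135 [logs-kibana-logs-kb-http-5601] '
)
TAIL = '10.92.131.195:5601 646 0.136 200 13cf502bfb897683a76e0a663426ee85'
NEW = HEAD + '[] ' + TAIL
OLD = HEAD + TAIL

new_match = pattern.match(NEW)
old_match = pattern.match(OLD)

for label, m in (("new", new_match), ("old", old_match)):
    fields = m.groupdict() if m else None
    print(label, fields and (fields["upstream_addr"],
                             fields["proxy_alternative_upstream_name"]))
```

With the optional group, the new line yields an empty proxy_alternative_upstream_name and the old line leaves that group unset, while upstream_addr is captured in both cases.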

As of 0.26, according to the ingress-nginx docs the current log format is:

log_format upstreaminfo
    '$remote_addr - $remote_user [$time_local] "$request" '
    '$status $body_bytes_sent "$http_referer" "$http_user_agent" '
    '$request_length $request_time [$proxy_upstream_name] [$proxy_alternative_upstream_name] $upstream_addr '
    '$upstream_response_length $upstream_response_time $upstream_status $req_id';
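
When writing or reviewing a parser for this format, it can help to list the variables in order, so each capture group can be checked off against them. A minimal Python sketch, with the log_format strings copied from above (the LOG_FORMAT and variables names are only for illustration):

```python
import re

# The quoted pieces of the log_format directive, joined into one string.
LOG_FORMAT = (
    '$remote_addr - $remote_user [$time_local] "$request" '
    '$status $body_bytes_sent "$http_referer" "$http_user_agent" '
    '$request_length $request_time [$proxy_upstream_name] '
    '[$proxy_alternative_upstream_name] $upstream_addr '
    '$upstream_response_length $upstream_response_time $upstream_status $req_id'
)

# nginx variables are a "$" followed by letters, digits and underscores.
variables = re.findall(r"\$(\w+)", LOG_FORMAT)

for position, name in enumerate(variables, start=1):
    print(position, name)
```

This prints 17 names, with proxy_alternative_upstream_name in position 12, directly after proxy_upstream_name.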

And here is the regexp to parse that:

^(?<remote>[^ ]*) - (?<user>[^ ]*) \[(?<time>[^\]]*)\] "(?<method>\S+)(?: +(?<path>[^\"]*?)(?: +\S*)?)?" (?<code>[^ ]*) (?<size>[^ ]*) "(?<referer>[^\"]*)" "(?<agent>[^\"]*)" (?<request_length>[^ ]*) (?<request_time>[^ ]*) \[(?<proxy_upstream_name>[^ ]*)\] \[(?<proxy_alternative_upstream_name>[^ ]*)\] (?<upstream_addr>[^ ]*) (?<upstream_response_length>[^ ]*) (?<upstream_response_time>[^ ]*) (?<upstream_status>[^ ]*) (?<req_id>[^ ]*).*$

Same regex as the one from @jtackaberry (tested with 0.26), but with the correct names for the capture groups, such as remote_addr instead of remote or host (names taken from the nginx documentation):

^(?<remote_addr>[^ ]*) - (?<remote_user>[^ ]*) \[(?<time>[^\]]*)\] "(?<method>\S+)(?: +(?<path>[^\"]*?)(?: +\S*)?)?" (?<status>[^ ]*) (?<body_bytes_sent>[^ ]*) "(?<http_referer>[^\"]*)" "(?<http_user_agent>[^\"]*)" (?<request_length>[^ ]*) (?<request_time>[^ ]*) \[(?<proxy_upstream_name>[^ ]*)\] (\[(?<proxy_alternative_upstream_name>[^ ]*)\] )?(?<upstream_addr>[^ ]*) (?<upstream_response_length>[^ ]*) (?<upstream_response_time>[^ ]*) (?<upstream_status>[^ ]*) (?<req_id>[^ ]*)$
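
One thing to watch: this regex follows the 0.26 log_format quoted above, which has no bracketed second IP field, so it should not be expected to match the sample lines posted earlier in this thread. The Python sketch below illustrates that. The 0.26-style line is hypothetical, derived from the thread's sample by removing the bracketed IP, and the (?<name>...) to (?P<name>...) rewrite is only there so Python's re module can compile the pattern.

```python
import re

# Regex from the comment above, split across lines for readability.
ONIGMO_REGEX = (
    r'^(?<remote_addr>[^ ]*) - (?<remote_user>[^ ]*) \[(?<time>[^\]]*)\] '
    r'"(?<method>\S+)(?: +(?<path>[^\"]*?)(?: +\S*)?)?" '
    r'(?<status>[^ ]*) (?<body_bytes_sent>[^ ]*) '
    r'"(?<http_referer>[^\"]*)" "(?<http_user_agent>[^\"]*)" '
    r'(?<request_length>[^ ]*) (?<request_time>[^ ]*) \[(?<proxy_upstream_name>[^ ]*)\] '
    r'(\[(?<proxy_alternative_upstream_name>[^ ]*)\] )?(?<upstream_addr>[^ ]*) '
    r'(?<upstream_response_length>[^ ]*) (?<upstream_response_time>[^ ]*) '
    r'(?<upstream_status>[^ ]*) (?<req_id>[^ ]*)$'
)
# Naive syntax rewrite; fine here because the pattern has no lookbehind.
pattern = re.compile(ONIGMO_REGEX.replace("(?<", "(?P<"))

# "New format" sample line exactly as posted earlier in the thread.
THREAD_SAMPLE = (
    '99.99.88.66 - [99.99.88.66] - username1 [13/Sep/2019:18:27:53 +0000] '
    '"POST /api/infra/graphql HTTP/2.0" 200 629 "https://kibana.domain/app/infra" '
    '"Mozilla/5.0 (Windows NT 6.3; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) '
    'Chrome/76.0.3809.100 Safari/537.36" 876 0.135 [logs-kibana-logs-kb-http-5601] [] '
    '10.92.131.195:5601 646 0.136 200 13cf502bfb897683a76e0a663426ee85'
)
# Hypothetical 0.26-style line: the same sample without the bracketed IP field.
LINE_026 = THREAD_SAMPLE.replace(" - [99.99.88.66] - ", " - ", 1)
# Hypothetical variant without the alternative-upstream field.
LINE_NO_ALT = LINE_026.replace("] [] ", "] ", 1)

thread_match = pattern.match(THREAD_SAMPLE)
match_026 = pattern.match(LINE_026)
match_no_alt = pattern.match(LINE_NO_ALT)

print("thread sample matched:", thread_match is not None)
print("0.26-style matched:", match_026 is not None)
print("without [] matched:", match_no_alt is not None)
```

If your controller still logs the bracketed IP after the remote address, the earlier regexes in this thread with host and real_ip groups are the ones to start from.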