Describe the bug
nginx-ingress 0.25.0 added a new log placeholder, $proxy_alternative_upstream_name,
so the k8s-nginx-ingress parser has to be changed once you upgrade to nginx-ingress 0.25.0 or later:
[PARSER]
    Name k8s-nginx-ingress
    Format regex
    Regex ^(?<host>[^ ]*) - \[(?<real_ip>[^ ]*)\] - (?<user>[^ ]*) \[(?<time>[^\]]*)\] "(?<method>\S+)(?: +(?<path>[^\"]*?)(?: +\S*)?)?" (?<code>[^ ]*) (?<size>[^ ]*) "(?<referer>[^\"]*)" "(?<agent>[^\"]*)" (?<request_length>[^ ]*) (?<request_time>[^ ]*) \[(?<proxy_upstream_name>[^ ]*)\] \[(?<proxy_alternative_upstream_name>[^ ]*)\] (?<upstream_addr>[^ ]*) (?<upstream_response_length>[^ ]*) (?<upstream_response_time>[^ ]*) (?<upstream_status>[^ ]*) (?<last>[^$]*)
    Time_Key time
    Time_Format %d/%b/%Y:%H:%M:%S %z
PS: It would be great if someone could help create a better regex parser that is compatible with both nginx-ingress 0.25.0 and lower versions.
Would you please supply two examples of old and new ingress log lines?
I also opened a PR against nginx-ingress asking them to update the log format documentation.
New format:
99.99.88.66 - [99.99.88.66] - username1 [13/Sep/2019:18:27:53 +0000] "POST /api/infra/graphql HTTP/2.0" 200 629 "https://kibana.domain/app/infra" "Mozilla/5.0 (Windows NT 6.3; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/76.0.3809.100 Safari/537.36" 876 0.135 [logs-kibana-logs-kb-http-5601] [] 10.92.131.195:5601 646 0.136 200 13cf502bfb897683a76e0a663426ee85
Old format:
99.99.88.66 - [99.99.88.66] - username1 [13/Sep/2019:18:27:53 +0000] "POST /api/infra/graphql HTTP/2.0" 200 629 "https://kibana.domain/app/infra" "Mozilla/5.0 (Windows NT 6.3; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/76.0.3809.100 Safari/537.36" 876 0.135 [logs-kibana-logs-kb-http-5601] 10.92.131.195:5601 646 0.136 200 13cf502bfb897683a76e0a663426ee85
Regarding a regex parser that is compatible with both nginx-ingress 0.25.0 and lower versions: this one makes the [$proxy_alternative_upstream_name] field optional and also tolerates backslash-escaped quotes:
^(?<host>[^ ]*) - \[(?<real_ip>[^ ]*)\] - (?<user>[^ ]*) \[(?<time>[^\]]*)\] \\*"(?<method>\S+)(?: +(?<path>[^\"]*?)(?: +\S*)?)?\\*" (?<code>[^ ]*) (?<size>[^ ]*) \\*"(?<referer>[^\"]*)\\*" \\*"(?<agent>[^\"]*)\\*" (?<request_length>[^ ]*) (?<request_time>[^ ]*) \[(?<proxy_upstream_name>[^ ]*)\] (\[(?<proxy_alternative_upstream_name>[^ ]*)\] )?(?<upstream_addr>[^ ]*) (?<upstream_response_length>[^ ]*) (?<upstream_response_time>[^ ]*) (?<upstream_status>[^ ]*) (?<req_id>[^ ]*).*$
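A quick way to check that the optional group really covers both formats is to run the regex against the two sample lines from this thread. The sketch below does that in Python rather than in Fluent Bit itself, so it is only an approximation: Python's re module spells named groups (?P<name>...), and the prefix is translated with a plain string replace. The names ONIGMO_REGEX, PATTERN and parse are made up for this example, and the last capture group is called req_id here, after nginx's $req_id variable.

```python
import re

# Compatibility regex from this thread, in the Onigmo syntax Fluent Bit uses.
# The last group is named req_id here, after nginx's $req_id variable.
ONIGMO_REGEX = (
    r'^(?<host>[^ ]*) - \[(?<real_ip>[^ ]*)\] - (?<user>[^ ]*) \[(?<time>[^\]]*)\] '
    r'\\*"(?<method>\S+)(?: +(?<path>[^\"]*?)(?: +\S*)?)?\\*" '
    r'(?<code>[^ ]*) (?<size>[^ ]*) \\*"(?<referer>[^\"]*)\\*" \\*"(?<agent>[^\"]*)\\*" '
    r'(?<request_length>[^ ]*) (?<request_time>[^ ]*) '
    r'\[(?<proxy_upstream_name>[^ ]*)\] (\[(?<proxy_alternative_upstream_name>[^ ]*)\] )?'
    r'(?<upstream_addr>[^ ]*) (?<upstream_response_length>[^ ]*) '
    r'(?<upstream_response_time>[^ ]*) (?<upstream_status>[^ ]*) (?<req_id>[^ ]*).*$'
)

# Python writes named groups as (?P<name>...). The plain replace is only safe
# because this pattern contains no lookbehind assertions such as (?<=...).
PATTERN = re.compile(ONIGMO_REGEX.replace("(?<", "(?P<"))

# The two sample lines from this thread differ only in the extra "[] " field.
COMMON_PREFIX = (
    '99.99.88.66 - [99.99.88.66] - username1 [13/Sep/2019:18:27:53 +0000] '
    '"POST /api/infra/graphql HTTP/2.0" 200 629 "https://kibana.domain/app/infra" '
    '"Mozilla/5.0 (Windows NT 6.3; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) '
    'Chrome/76.0.3809.100 Safari/537.36" 876 0.135 [logs-kibana-logs-kb-http-5601] '
)
COMMON_SUFFIX = '10.92.131.195:5601 646 0.136 200 13cf502bfb897683a76e0a663426ee85'

NEW_LINE = COMMON_PREFIX + '[] ' + COMMON_SUFFIX  # 0.25.0 and later
OLD_LINE = COMMON_PREFIX + COMMON_SUFFIX          # before 0.25.0


def parse(line):
    """Return the named captures as a dict, or None if the line does not match."""
    match = PATTERN.match(line)
    return match.groupdict() if match else None


new_fields = parse(NEW_LINE)
old_fields = parse(OLD_LINE)

for label, fields in (("new", new_fields), ("old", old_fields)):
    print(label, fields["proxy_alternative_upstream_name"], fields["upstream_addr"])
```

In the new format the alternative upstream name is captured as an empty string, while in the old format the group does not participate in the match at all (Python reports it as None). The remaining fields should come out the same for both lines.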
As of 0.26, according to the ingress-nginx docs the current log format is:
log_format upstreaminfo
'$remote_addr - $remote_user [$time_local] "$request" '
'$status $body_bytes_sent "$http_referer" "$http_user_agent" '
'$request_length $request_time [$proxy_upstream_name] [$proxy_alternative_upstream_name] $upstream_addr '
'$upstream_response_length $upstream_response_time $upstream_status $req_id';
And here is the regexp to parse that:
^(?<remote>[^ ]*) - (?<user>[^ ]*) \[(?<time>[^\]]*)\] "(?<method>\S+)(?: +(?<path>[^\"]*?)(?: +\S*)?)?" (?<code>[^ ]*) (?<size>[^ ]*) "(?<referer>[^\"]*)" "(?<agent>[^\"]*)" (?<request_length>[^ ]*) (?<request_time>[^ ]*) \[(?<proxy_upstream_name>[^ ]*)\] \[(?<proxy_alternative_upstream_name>[^ ]*)\] (?<upstream_addr>[^ ]*) (?<upstream_response_length>[^ ]*) (?<upstream_response_time>[^ ]*) (?<upstream_status>[^ ]*) (?<req_id>[^ ]*).*$
Same regex as the one from @jtackaberry (tested with 0.26), but with capture group names that match the nginx variables, such as remote_addr instead of remote or host (names taken from the nginx documentation):
^(?<remote_addr>[^ ]*) - (?<remote_user>[^ ]*) \[(?<time>[^\]]*)\] "(?<method>\S+)(?: +(?<path>[^\"]*?)(?: +\S*)?)?" (?<status>[^ ]*) (?<body_bytes_sent>[^ ]*) "(?<http_referer>[^\"]*)" "(?<http_user_agent>[^\"]*)" (?<request_length>[^ ]*) (?<request_time>[^ ]*) \[(?<proxy_upstream_name>[^ ]*)\] (\[(?<proxy_alternative_upstream_name>[^ ]*)\] )?(?<upstream_addr>[^ ]*) (?<upstream_response_length>[^ ]*) (?<upstream_response_time>[^ ]*) (?<upstream_status>[^ ]*) (?<req_id>[^ ]*)$
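To see what the renamed fields look like on a parsed record, here is a small Python sketch. It is not Fluent Bit itself: the named groups are translated to Python's (?P<name>...) spelling, and the sample line is an assumption, built from the new-format example earlier in this thread with the "- [real_ip] -" part dropped so that it follows the 0.26 log_format. SAMPLE_LINE, TIME_FORMAT and the other names are made up for the example. The timestamp is parsed with the same format string as the parser's Time_Format, which Python's strptime also accepts.

```python
import re
from datetime import datetime, timezone

# Regex with nginx variable names, in the Onigmo syntax Fluent Bit uses.
ONIGMO_REGEX = (
    r'^(?<remote_addr>[^ ]*) - (?<remote_user>[^ ]*) \[(?<time>[^\]]*)\] '
    r'"(?<method>\S+)(?: +(?<path>[^\"]*?)(?: +\S*)?)?" '
    r'(?<status>[^ ]*) (?<body_bytes_sent>[^ ]*) '
    r'"(?<http_referer>[^\"]*)" "(?<http_user_agent>[^\"]*)" '
    r'(?<request_length>[^ ]*) (?<request_time>[^ ]*) '
    r'\[(?<proxy_upstream_name>[^ ]*)\] (\[(?<proxy_alternative_upstream_name>[^ ]*)\] )?'
    r'(?<upstream_addr>[^ ]*) (?<upstream_response_length>[^ ]*) '
    r'(?<upstream_response_time>[^ ]*) (?<upstream_status>[^ ]*) (?<req_id>[^ ]*)$'
)

# Python writes named groups as (?P<name>...); safe here, no lookbehinds present.
PATTERN = re.compile(ONIGMO_REGEX.replace("(?<", "(?P<"))

# Same value as Time_Format in the [PARSER] section of this thread.
TIME_FORMAT = "%d/%b/%Y:%H:%M:%S %z"

# Assumed 0.26-style line: the thread's sample without the "- [real_ip] -" part.
SAMPLE_LINE = (
    '99.99.88.66 - username1 [13/Sep/2019:18:27:53 +0000] '
    '"POST /api/infra/graphql HTTP/2.0" 200 629 "https://kibana.domain/app/infra" '
    '"Mozilla/5.0 (Windows NT 6.3; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) '
    'Chrome/76.0.3809.100 Safari/537.36" 876 0.135 [logs-kibana-logs-kb-http-5601] [] '
    '10.92.131.195:5601 646 0.136 200 13cf502bfb897683a76e0a663426ee85'
)

match = PATTERN.match(SAMPLE_LINE)
record = match.groupdict() if match else {}

# Convert the captured $time_local string into a timezone-aware datetime.
timestamp = datetime.strptime(record["time"], TIME_FORMAT)

print(record["remote_addr"], record["remote_user"], record["status"])
print(timestamp.isoformat())
```

Because this regex ends in $ directly after req_id, a line with anything appended after the request ID would not match, unlike the variants above that end in .*$.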