Hello,
_make the long story short_: fluent-bit returns incorrect timestamp from format %H:%M:%S datetime. The timestamp's time is correct. But for timestamp's date I got 28 days difference.
.........
I use iotop to monitor I/O status of processes. Here is a log sample:
16:53:54 674 be/4 nginx 508.84 K/s 35125.77 K/s 0.00 % 0.00 % nginx: worker process
I also wrote a regex to parse this type of log. Here is my template from parsers.conf:
[PARSER]
Name iotop
Format regex
Regex ^(?<time>[^\ ]*)\ *(?<PID>[\d]*)\ *(?<PRIO>[^\ ]*)\ (?<USER>[^\ ]*)[^\d]*(?<DISKREAD>[\d]*\.[\d]*)[^\d]*(?<DISKWRITE>[\d]*\.[\d]*)[^\d]*(?<SWAPIN>[\d]*\.[\d]*)[^\d]*(?<IO>[\d]*\.[\d]*)[\ \%]*(?<COMMAND>.*)$
Time_Format %H:%M:%S
Time_Key time
Time_Keep On
types PID:integer DISKREAD:float DISKWRITE:float SWAPIN:float IO:float
Everything seems OK, as you see, the parsed log would be shown like this:
iotop: [1514739234.000000000, {"time"=>"16:53:54", "PID"=>674, "PRIO"=>"be/4", "USER"=>"nginx", "DISKREAD"=>508.840000, "DISKWRITE"=>35125.770000, "SWAPIN"=>0.000000, "IO"=>0.000000, "COMMAND"=>"nginx: worker process"}]
But let's check the Unix timestamp on the parsed log: 1514739234. It equals to 12/31/2017 16:53:54.
I checked the system date. It is 01/27/2018. There is a 4 week bias!
Hence If we forward such log to fluentd node, we will get the wrong date, too.
.........
I checked this on both CentOS 6 (compiled fluent-bit) and Debian 8 (td-agent-bit from repository) and I got the same result.
UPDATE
fluentbit converts any datetime format that are without date to 31 Dec 2017.
i.e. by using Time_Format %H:%M:%S it converts 16:07:21 to 2017-12-31 16:07:21.
It doesn't matter which date we are in.
thanks for reporting this problem. You are right, actually the parser only populate internally the Year if is missing but not the day and month.
which application generate this log type ?
I have the same issue with Fluent Bit 1.0.4.
My log has this format:
12:14:03 (lmgrd) Server's System Date and Time: Tue Feb 05 2019 12:14:03 UTC
I use this time format in parser: time_format %k:%M:%S
And I got this timestamp in parsed log: 1546258443.000000 (Which is Monday, 31 December 2018 12:14:03)
@edsiper why parsing of date/time string made so hard why not possible simple make gmtime_r(&time_now, tm); to fill incomplete data? This little demo demonstrate approach:
#define _XOPEN_SOURCE
#include <time.h>
#include <stdio.h>
int main(int argc, char* arvg[])
{
struct tm tmy;
time_t time_now = time(NULL);
gmtime_r(&time_now, &tmy);
strptime("2019-02-20 23:55:43", "%H:%M:%S", &tmy);
printf("int tm_sec: %d\n", tmy.tm_sec);
printf("int tm_min: %d\n", tmy.tm_min);
printf("int tm_hour: %d\n", tmy.tm_hour);
printf("int tm_mday: %d\n", tmy.tm_mday);
printf("int tm_mon: %d\n", tmy.tm_mon);
printf("int tm_year: %d\n", tmy.tm_year + 1900);
printf("int tm_wday: %d\n", tmy.tm_wday);
printf("int tm_yday: %d\n", tmy.tm_yday);
printf("int tm_isdst: %d\n", tmy.tm_isdst);
}
so there will be not any problems as described in this issue, and code can be much simpler:
src/flb_parser.c | 74 +++++++-------------------------------------------------
1 file changed, 9 insertions(+), 65 deletions(-)
diff --git a/src/flb_parser.c b/src/flb_parser.c
index 31e53d50..a95afe2f 100644
--- a/src/flb_parser.c
+++ b/src/flb_parser.c
@@ -182,29 +182,12 @@ struct flb_parser *flb_parser_create(char *name, char *format,
p->time_fmt = flb_strdup(time_fmt);
/* Check if the format is considering the year */
- if (strstr(p->time_fmt, "%Y") || strstr(p->time_fmt, "%y")) {
+ if ((strstr(p->time_fmt, "%Y") || strstr(p->time_fmt, "%y")) &&
+ strstr(p->time_fmt, "%m") &&
+ strstr(p->time_fmt, "%d"))
+ {
p->time_with_year = FLB_TRUE;
}
- else {
- size = strlen(p->time_fmt);
- p->time_with_year = FLB_FALSE;
- p->time_fmt_year = flb_malloc(size + 4);
- if (!p->time_fmt_year) {
- flb_errno();
- flb_parser_destroy(p);
- return NULL;
- }
-
- /* Append the year at the beginning */
- tmp = p->time_fmt_year;
- *tmp++ = '%';
- *tmp++ = 'Y';
- *tmp++ = ' ';
-
- memcpy(tmp, p->time_fmt, size);
- tmp += size;
- *tmp++ = '\0';
- }
/* Check if the format contains a timezone (%z) */
if (strstr(p->time_fmt, "%z") || strstr(p->time_fmt, "%Z") ||
@@ -231,32 +214,14 @@ struct flb_parser *flb_parser_create(char *name, char *format,
* - http://stackoverflow.com/questions/7114690/how-to-parse-syslog-timestamp
* - http://code.activestate.com/lists/python-list/521885/
*/
- if (p->time_with_year == FLB_TRUE) {
- tmp = strstr(p->time_fmt, "%S.%L");
- }
- else {
- tmp = strstr(p->time_fmt_year, "%s.%L");
-
- if (tmp == NULL) {
- tmp = strstr(p->time_fmt_year, "%S.%L");
- }
- }
+ tmp = strstr(p->time_fmt, "%S.%L");
if (tmp) {
tmp[2] = '\0';
p->time_frac_secs = (tmp + 5);
}
else {
/* same as above but with comma seperator */
- if (p->time_with_year == FLB_TRUE) {
- tmp = strstr(p->time_fmt, "%S,%L");
- }
- else {
- tmp = strstr(p->time_fmt_year, "%s,%L");
-
- if (tmp == NULL) {
- tmp = strstr(p->time_fmt_year, "%S,%L");
- }
- }
+ tmp = strstr(p->time_fmt, "%S,%L");
if (tmp) {
tmp[2] = '\0';
p->time_frac_secs = (tmp + 5);
@@ -673,7 +638,6 @@ int flb_parser_time_lookup(char *time_str, size_t tsize,
char *time_ptr = time_str;
char tmp[64];
char fs_tmp[32];
- struct tm tmy;
*ns = 0;
@@ -688,11 +652,6 @@ int flb_parser_time_lookup(char *time_str, size_t tsize,
* to be parsed.
*/
if (parser->time_with_year == FLB_FALSE) {
- /* Given time string is too long */
- if (time_len + 6 >= sizeof(tmp)) {
- return -1;
- }
-
/*
* This is not the most elegant way but for now it let
* get the work done.
@@ -704,26 +663,11 @@ int flb_parser_time_lookup(char *time_str, size_t tsize,
time_now = now;
}
- gmtime_r(&time_now, &tmy);
- uint64_t t = tmy.tm_year + 1900;
-
- fmt = tmp;
- u64_to_str(t, fmt);
- fmt += 4;
- *fmt++ = ' ';
-
- memcpy(fmt, time_ptr, time_len);
- fmt += time_len;
- *fmt++ = '\0';
-
- time_ptr = tmp;
- time_len = strlen(tmp);
- p = strptime(time_ptr, parser->time_fmt_year, tm);
- }
- else {
- p = strptime(time_ptr, parser->time_fmt, tm);
+ gmtime_r(&time_now, tm);
}
+ p = strptime(time_ptr, parser->time_fmt, tm);
+
if (p != NULL) {
/* Check if we have fractional seconds */
if (parser->time_frac_secs && (*p == '.' || *p == ',')) {
@tantra35 sorry for the very long reply.
@fujimotos can you take a look at this please?
@edsiper Sure, I will look after this issue.
Perhaps worth pointing out here an important note about the strptime function, from the manpage:
In principle, this function does not initialize tm but stores only the values specified. This means that tm should be initialized before the call. Details differ a bit between different UNIX systems. The glibc implementation does not touch those fields which are not explicitly specified, except that it recomputes the tm_wday and tm_yday field if any of the year, month, or day elements changed.
On the other hand it doesn't seem robust to infer current year, month, day or hour as _now_ considering that the year, month, day or hour may have rolled-over since the time of the partial timestamp.
For example 11:59pm will possibly be interpreted as the next day, depending on how long it takes to reach fluent-bit.
@edsiper I have posted #1960 that fixes the issue reported by @eXtrem0us.
With that merged, Fluent Bit will parses a time string '13:45:30' extrapolating
the date fileld from the system date. So the parser would produce (assuming
today is Feb 20 2020):
{"time":"2020-02-20 13:45:30","log":"..."}
... instead of:
{"time":"2019-12-31 13:45:30","log":"..."}
Note: #1960 is the minimal fix that solves the original issue. It should be sufficient
as long as this issue is concerned.
Some comments on the previous posts:
Fixed on #1960 , thanks, everyone.
@fujimotos pls open a new ticket for remaining issues
Most helpful comment
Note: #1960 is the minimal fix that solves the original issue. It should be sufficient
as long as this issue is concerned.
Some comments on the previous posts:
current complication, and we can drastically simplify the time handling without
losing anything. I'll do this after v1.4.0 released
handle the midnight robustly. My rough idea is to add a logic to rollback the
timestamp to yesterday if the timestamp is a future date; I think this issue would
deserve an another separate ticket.