Due to Lua's use of the double data type for numbers, and Fluent Bit's representation of timestamps as `<seconds>.<nanoseconds>`, conversion to and from the double type can cause a loss of precision that results in incorrect timestamps.
Consider this simple filter:
```lua
function noop(tag, timestamp, record)
    return 1, timestamp, record
end
```
Because it returns `1` as the code, Fluent Bit converts the timestamp from the double back to its internal representation.
Then:
1. A log line with timestamp `2020-02-19T18:00:00.001+00:00` is ingested; the stdout output plugin represents it as `1582135200.001000000`, which is as expected.
2. The `noop()` filter is called; now some precision is lost, and the stdout output shows the timestamp as `1582135200.000999927`.
3. To make matters worse, passing this through an output plugin that only supports milliseconds and which truncates rather than rounds (such as `es`) can result in a `@timestamp` with the wrong millisecond (`2020-02-19T18:00:00.000` in this example).
One solution would be to pass seconds and nanoseconds as separate parameters, treating each as a long with no fractional part. For backward compatibility, a new config parameter could be introduced to opt into the new-style function signature, which would look something like:
```lua
function filter(tag, timestamp_seconds, timestamp_nanoseconds, record)
```
Although not as complete, a simpler stop-gap would be a new return code value that tells Fluent Bit the record has changed but the timestamp has not been modified, so it can skip the conversion from the Lua timestamp.
Note:
`a.conf`:

```
[INPUT]
    Name forward

[FILTER]
    Name  stdout
    Match *

[FILTER]
    Name   lua
    Match  *
    script example.lua
    call   test

[OUTPUT]
    Name  stdout
    Match *
```
`example.lua`:

```lua
function test(tag, timestamp, record)
    record["filter"] = "lua"
    return 1, timestamp, record
end
```
command:

```shell
$ ../bin/fluent-bit -c a.conf &
$ printf "\x93\xa4hoge\xd7\x00\x5e\x4d\x77\xa0\x00\x0f\x42\x40\x81\xa4hoge\xa4hoge" | nc 127.0.0.1 24224
```
output:

```
$ ../bin/fluent-bit -c a.conf
Fluent Bit v1.5.0
* Copyright (C) 2019-2020 The Fluent Bit Authors
* Copyright (C) 2015-2018 Treasure Data
* Fluent Bit is a CNCF sub-project under the umbrella of Fluentd
* https://fluentbit.io

[2020/04/13 11:32:15] [ info] [storage] version=1.0.3, initializing...
[2020/04/13 11:32:15] [ info] [storage] in-memory
[2020/04/13 11:32:15] [ info] [storage] normal synchronization mode, checksum disabled, max_chunks_up=128
[2020/04/13 11:32:15] [ info] [engine] started (pid=31723)
[2020/04/13 11:32:15] [ info] [input:forward:forward.0] listening on 0.0.0.0:24224
[2020/04/13 11:32:15] [ info] [sp] stream processor started
[0] hoge: [1582135200.001000000, {"hoge"=>"hoge"}]
[0] hoge: [1582135200.000999927, {"filter"=>"lua", "hoge"=>"hoge"}]
```
Fixed in #2100.
@jtackaberry Thank you for reporting this issue.
I sent a patch to fix it.
The patch supports a new return code `2` to keep the timestamp unchanged.
Could you check my PR?
Thanks @edsiper for implementing this stopgap solution. It's great for Lua filters that don't need to update the timestamp, which is probably the most common use case.
I do recommend this issue be reopened though because the underlying issue still exists: it's not possible for a Lua filter to adjust timestamps without conversion issues.
The use case that led to the discovery of this issue was having a filter encode a sequence number into the nanosecond resolution of a log line, to address sequencing issues for millisecond precision log lines before they are inserted into Elasticsearch.
So consider an input `test.log` that looks like this:

```
2020-02-19T18:00:00.000+00:00 line 1
2020-02-19T18:00:00.000+00:00 line 2
2020-02-19T18:00:00.000+00:00 line 3
2020-02-19T18:00:00.001+00:00 line 1
2020-02-19T18:00:00.001+00:00 line 2
2020-02-19T18:00:00.001+00:00 line 3
```
And with this configuration:
`td-agent-bit.conf`:

```
[SERVICE]
    Flush        1
    Daemon       Off
    Log_Level    info
    Parsers_File parsers.conf
    Plugins_File /etc/td-agent-bit/plugins.conf

[INPUT]
    Name   tail
    Tag    test
    Path   test.log
    Parser simple

[FILTER]
    Name   lua
    Match  test
    script filter.lua
    call   tweak

[OUTPUT]
    Name  stdout
    Match *
```
`parsers.conf`:

```
[PARSER]
    Name        simple
    Format      regex
    Regex       ^(?<time>[^ ]*) (?<message>.+)$
    Time_Format %Y-%m-%dT%H:%M:%S.%L%z
    Time_Key    time
    Time_Keep   On
```
`filter.lua`:

```lua
offset = 0
last_timestamp = 0

function tweak(tag, timestamp, record)
    code = 2
    if last_timestamp ~= timestamp then
        offset = 0
        last_timestamp = timestamp
    else
        code = 1
        offset = offset + 0.00000001
        timestamp = timestamp + offset
    end
    return code, timestamp, record
end
```
Running Fluent Bit, the output is:

```
[0] test: [1582135200.000000000, {"message"=>"line 1", "time"=>"2020-02-19T18:00:00.000+00:00"}]
[1] test: [1582135200.000000000, {"message"=>"line 2", "time"=>"2020-02-19T18:00:00.000+00:00"}]
[2] test: [1582135200.000000000, {"message"=>"line 3", "time"=>"2020-02-19T18:00:00.000+00:00"}]
[3] test: [1582135200.001000000, {"message"=>"line 1", "time"=>"2020-02-19T18:00:00.001+00:00"}]
[4] test: [1582135200.000999927, {"message"=>"line 2", "time"=>"2020-02-19T18:00:00.001+00:00"}]
[5] test: [1582135200.000999927, {"message"=>"line 3", "time"=>"2020-02-19T18:00:00.001+00:00"}]
```
Subsequent log lines have actually gone back in time due to the loss of precision and truncation.
This can be somewhat mitigated by encoding a sequence in the microseconds digits (so adding 0.000001 to the timestamp for each log line at the same millisecond):
```
[0] test: [1582135200.000000000, {"message"=>"line 1", "time"=>"2020-02-19T18:00:00.000+00:00"}]
[1] test: [1582135200.000000953, {"message"=>"line 2", "time"=>"2020-02-19T18:00:00.000+00:00"}]
[2] test: [1582135200.000001907, {"message"=>"line 3", "time"=>"2020-02-19T18:00:00.000+00:00"}]
[3] test: [1582135200.001000000, {"message"=>"line 1", "time"=>"2020-02-19T18:00:00.001+00:00"}]
[4] test: [1582135200.001000881, {"message"=>"line 2", "time"=>"2020-02-19T18:00:00.001+00:00"}]
[5] test: [1582135200.001001834, {"message"=>"line 3", "time"=>"2020-02-19T18:00:00.001+00:00"}]
```
The new `2` return code does make at least this much possible now, but it only allows 1000 log lines per millisecond before we lose the ability to sequence. On average this is fine, but I can imagine short bursts of log lines (e.g. some form of trace dump) that could overflow this.
So I do think the proper long term solution is separate arguments for seconds and nanoseconds.
@nokute78 did the magic :) .
I am wondering if any other Lua version fixes this.
Oops, indeed, apologies @nokute78 for not giving you your due credit. And thanks. :)
I confess I'm not very well versed in Lua internals, but I don't think this behavior depends on the Lua version, since in Lua all numeric values are doubles. I'm not sure of a way to address the issue other than the option I first suggested: separating seconds and nanoseconds into different arguments when calling into Lua.
`flb_time` itself already works this way (as a wrapper of `timespec`), so it seems like a natural fit to preserve the separation into and out of Lua space. Probably the most inconvenient part of the proposal is the need for an additional config parameter to request calling into Lua with the new semantics, to preserve backward compatibility.
@jtackaberry Thank you for reporting. I agree.
The root cause is rounding error in the type conversion between `flb_time` and double.
(It affects more than just filter_lua.)
To prevent it, we need to modify the callback prototype: the timestamp would become an array or two separate variables.
That is a breaking change, so we would need a new property like `call_v2` to opt into the new prototype.
What if we pass nanoseconds since epoch?
@edsiper Hmm, then we would need to use an integer as the Lua number type.
We can see the rounding error with code like this (a pseudo version of the https://github.com/fluent/fluent-bit/issues/2015#issuecomment-627547843 case): `(double)l_tm` will be mangled.
```c
#include <stdio.h>
#include <time.h>

int main(void) {
    struct timespec tm;
    long l_tm;

    tm.tv_sec  = 1582135200;
    tm.tv_nsec = 1000000;
    l_tm = tm.tv_sec * 1000000000 + tm.tv_nsec;

    printf("tm.tv_sec =%ld\n", tm.tv_sec);
    printf("tm.tv_nsec=%ld\n", tm.tv_nsec);
    printf("l_tm      =%ld\n", l_tm);
    printf("l_tm(dbl) =%lf\n", (double)l_tm);
}
```
output is:

```
tm.tv_sec =1582135200
tm.tv_nsec=1000000
l_tm      =1582135200001000000
l_tm(dbl) =1582135200000999936.000000
```
> what if we pass nanoseconds since epoch ?

Golang supports such an API: https://golang.org/pkg/time/#Time.UnixNano
However, I think this issue is caused by the double <-> integer type cast.
Interesting, so nanoseconds is the way to go...
Should we introduce a new callback prototype and config key like `call2()`? That way we don't break our users.
I've pushed a solution for this problem, please check the commit notes here: 1204440746c6e38671407466591711746c07465e
The solution will be part of Fluent Bit v1.6
Really useful. Using the trick above to add nanoseconds to a timestamp, for example to send to the influxdb output:
```lua
offset = 0
last_timestamp = 0

-- Sequence lines that share the same second by bumping nanoseconds
function tweak(tag, timestamp, record)
    nsec_timestamp = timestamp['nsec']
    sec_timestamp = timestamp['sec']
    --sf = string.format
    --print(sf("input sec timestamp: %d.", sec_timestamp))
    --print(sf("input nanosec timestamp: %d.", nsec_timestamp))
    code = 2
    if last_timestamp ~= sec_timestamp then
        offset = 0
        last_timestamp = sec_timestamp
    else
        code = 1
        -- we add one nanosecond
        offset = offset + 1
        timestamp['nsec'] = nsec_timestamp + offset
    end
    return code, timestamp, record
end
```
> I've pushed a solution for this problem, please check the commit notes here: 1204440
> The solution will be part of Fluent Bit v1.6

Should we close the issue then? @edsiper
Thanks @l2dy, yes, this is fixed through the new `time_as_table` option in the Lua filter:
https://docs.fluentbit.io/manual/pipeline/filters/lua#config