data_format = "json"
tag_keys = ["DEVICEID"]
json_time_key = "measuredDtm"
json_time_format = "2006-01-02T15:04:05+00:00"
Telegraf 1.14 on Ubuntu 18.04
I am using Telegraf to consume the JSON input format. I have a tag key set to transform DEVICEID into an InfluxDB tag. The problem is that the tag's precision is truncated before it becomes a string.
Source JSON containing ...,"DEVICEID":2882429806056571124,... should yield a metric with the tag DEVICEID=2882429806056571124.
Instead, the same input yields the tag DEVICEID=2882429806056571000. Note the zeroes in the least significant digits.
It looks like the problem is in the Go encoding/json library, specifically in json.Unmarshal. I believe all JSON numbers are parsed into float64 by default.
It is possible to opt in to parsing them as a json.Number, but that isn't a type we support in the Telegraf metric. We could expose an option to select whether the value should be converted to an int64, uint64, or float64.
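For reference, here is a minimal standalone sketch (plain standard-library Go, not Telegraf code) of the two decoding paths:

package main

import (
	"bytes"
	"encoding/json"
	"fmt"
)

func main() {
	data := []byte(`{"DEVICEID":2882429806056571124}`)

	// Default path: json.Unmarshal stores every JSON number in an
	// interface{} as a float64, whose 53-bit mantissa cannot hold
	// this 19-digit integer, so the low-order digits are lost.
	var lossy map[string]interface{}
	_ = json.Unmarshal(data, &lossy)
	fmt.Printf("%.0f\n", lossy["DEVICEID"]) // low-order digits are gone

	// Opt-in path: UseNumber keeps the literal verbatim as a
	// json.Number (a string type), so it can become an int64 or a
	// string without ever passing through float64.
	dec := json.NewDecoder(bytes.NewReader(data))
	dec.UseNumber()
	var exact map[string]interface{}
	_ = dec.Decode(&exact)
	n := exact["DEVICEID"].(json.Number)
	fmt.Println(n.String()) // 2882429806056571124
	i, _ := n.Int64()
	fmt.Println(i) // 2882429806056571124
}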
Because I am using it as a tag, I ultimately want it parsed into a string.
But in general, I could see wanting to parse signed 64-bit integers as well, since they are supported by InfluxDB. I think it would make sense to use whatever options the JSON decoder offers to preserve information, and then convert to Influx-compatible types within parsers/json/parser.go. This would allow giving priority to lossless parsing to string for keys that have been flagged as tags or string_fields (sketched below).
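To make that concrete, here is a rough sketch of the conversion priority; convertNumber and stringKeys are hypothetical names for illustration (assuming encoding/json is imported), not existing code in parsers/json/parser.go:

// Keys flagged as tags or string_fields keep the raw JSON literal;
// other numbers try int64 first and fall back to float64 only when
// the literal is not an integer. Hypothetical helper, for illustration.
func convertNumber(key string, n json.Number, stringKeys map[string]bool) interface{} {
	if stringKeys[key] {
		return n.String() // lossless: json.Number is the original literal text
	}
	if i, err := n.Int64(); err == nil {
		return i
	}
	f, _ := n.Float64()
	return f
}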
Wondering if this is causing a loss of precision for timestamps as well?
I ran a local test and noticed the millisecond precision was dropped. It might not be related, and it's quite possible I've missed something.
Tested via the following command:
./telegraf --config jsonfile.conf --test
jsonfile.conf:
# Reload and gather from file[s] on telegraf's interval.
[[inputs.file]]
files = ["output.json"]
data_format = "json"
json_strict = true
tag_keys = ["Path", "TimeStamp"]
json_name_key = "Path"
json_time_key = "TimeStamp"
json_time_format = "unix_ms"
output.json:
[
  {
    "TimeStamp": "1590009582002",
    "Path": "processor_time",
    "Value": 10.656527371660751
  },
  {
    "TimeStamp": "1590009582002",
    "Path": "memory_committed_bytes",
    "Value": 79.157851910953838
  }
]
Results (truncated slightly to highlight the timestamp value):
Starting Telegraf
> processor_time,Value=10.65652737166075 1590009582000000000
> memory_committed_bytes,Value=79.15785191095384 1590009582000000000
The expected timestamp was 1590009582002000000.
That is probably rounded due to the agent precision setting:
[agent]
precision = "1s"
BTW, you can check whether that is the case by setting `precision = "1ms"` in the agent. This takes effect across all of Telegraf, but I'm planning to make it configurable per plugin in 1.15.
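For example, the same agent block with millisecond precision:
[agent]
precision = "1ms"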
That was the issue, aka my fault. Thanks for the quick response, and that would be a really nice addition to 1.15
@danielnelson is there any fix for this on the horizon? Alternatively, can you recommend a workaround, or a strategy I could pursue for implementing a fix in a fork? I'm happy to work on the implementation, particularly if the direction has some endorsement and the potential to make its way back into the main line.
@nathanpegram Could you try setting "DEVICEID" in json_string_fields and see whether the precision is preserved? You could then convert it to a tag, since it would be reported as a field.
The problem looks to be that the value is converted to a float64 and loses precision. As long as you skip the float step, this shouldn't be a problem: json -> int64 -> string or json -> string, just _not_ json -> float64 -> string.
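An untested sketch of that workaround (the file name is illustrative; the converter processor promotes the string field to a tag):

[[inputs.file]]
files = ["input.json"]
data_format = "json"
json_string_fields = ["DEVICEID"]

[[processors.converter]]
[processors.converter.fields]
tag = ["DEVICEID"]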
A valuable bug fix would likely be around int64 handling, possibly adding a json_int64_fields = [] option. There might be further work we could do with the JSON parser for tags.
@reimda @ssoroka