Logstash: tags collision with user-defined tags and built-in tags

Created on 1 Aug 2017 · 15 Comments · Source: elastic/logstash

I have an event with JSON content that looks like the following:

{"tags": {"key1":"value1", "key2":"value2"}}

Logstash incorrectly assumes this is the built-in 'tags' field and turns it into a JSON list. How do I avoid this, so that I can use my own event that happens to have 'tags' as an attribute that is not a JSON list?

All 15 comments

You must receive the event as a string, using something like the plain or line codec, and then parse it with the json filter, since that filter supports the target parameter.
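The suggestion above can be modelled in plain Ruby. This is only a sketch of the idea (keep the raw line as a string, parse it, store the result under a target key); it is not Logstash code, and the helper name `decode_under_target` and the `"message"`/`"doc"` keys are assumptions made for illustration.

```ruby
require "json"

# Hypothetical helper: a plain-Ruby model of "parse the raw line, then store
# the parsed result under a target key". Not Logstash's implementation.
def decode_under_target(event, source:, target:)
  parsed = JSON.parse(event[source])
  event.merge({ target => parsed })
end

raw_line = '{"tags": {"key1":"value1", "key2":"value2"}}'
event = { "message" => raw_line } # stand-in for what a plain/line codec hands over

decoded = decode_under_target(event, source: "message", target: "doc")
puts decoded["doc"]["tags"].class # Hash: the object shape survives under "doc"
puts decoded.key?("tags")         # false: nothing was written to a root-level "tags"
```

Because the parsed object lives under the target key, it never touches the root-level tags field, which is also why it does not end up at the root of the output.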

@jsvd Thanks for the reply, but what if I want the root element to be 'tags' when I output it to Elasticsearch? I'm aware I can wrap it using the target parameter, e.g.

{"doc": {"tags":{"key1":"value1","key2":"value2"}}}

But I need the end result in Elasticsearch to have 'tags' at the root level.

So the field name tags is treated as a reserved field name throughout the LS code base.

The only way I can think of to preserve your tags Map/Hash like object is to use:

add_field => { "[mytags]" => "[%{tags}]" }
remove_field => ["tags"]

in the input, then do the reverse in the last filter before the output:

add_field => { "[tags]" => "[%{mytags}]" }
remove_field => ["mytags"]

Obviously, any filters will need to manipulate mytags instead of tags.

Unfortunately, the above still doesn't help. As soon as you try to use the 'tags' field again (even if it's in the last filter), it will still be of type JSON list. The goal is to be able to send the user-defined 'tags' field to the output.

I don't know what you mean by "of type json list". JSON is a text representation; LS Events are object based. Do you mean that the tags field holds a JSON String representation of some other objects?

"Type json list" just means a JSON array, e.g.

{ "tags" : ["tag1", "tag2", "tag3"] }

https://www.w3schools.com/js/js_json_arrays.asp

As stated in my original comment, the 'tags' attribute that I need to flow thru to the output has to be a json object, not an array (list).
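The object-versus-array distinction being discussed can be checked in plain Ruby once the text is decoded. A minimal sketch, assuming only the standard json library; the helper name `tags_shape` is made up for illustration.

```ruby
require "json"

# Hypothetical helper: names the shape of the decoded "tags" value.
def tags_shape(json_text)
  case JSON.parse(json_text)["tags"]
  when Hash  then :object
  when Array then :array
  else            :other
  end
end

puts tags_shape('{"tags": {"key1":"value1", "key2":"value2"}}') # object
puts tags_shape('{ "tags" : ["tag1", "tag2", "tag3"] }')         # array
```

The first input is the shape that needs to reach the output; the second is the shape the built-in tags field normally has.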

@dukenguyen
The info in that link is only valid in a JavaScript execution environment.
In Logstash, if a line of text arrives at an input and is thought to be JSON encoded, we use the JSON codec to decode it. After decoding, there is no more JSON text in the Logstash Event. It is possible to have JSON in JSON, for example a JSON String that contains more JSON. Logstash will not double decode that extra JSON unless told to.

json_string = '{"foo":"{\"bar\":\"baz\"}"}'
=> "{\"foo\":\"{\\\"bar\\\":\\\"baz\\\"}\"}"
obj = JSON.load json_string
=> {"foo"=>"{\"bar\":\"baz\"}"}
obj.class
=> Hash
obj["foo"].class
=> String
JSON.load obj["foo"]
=> {"bar"=>"baz"}

Note that in Ruby's text representation of a Hash (roughly equivalent to a JSON Object), each key and its value are separated by =>.

@dukenguyen

I think I have found the problem that you might be trying to describe (it's deep inside the Logstash Event code).

In the current release of LS, we hold a reference to the object that the tags field points to.
So even if you rename tags to mytags, it still points to the same object. In the cache we have tags ---> obj1 memory location and mytags ---> obj1 memory location.
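The shared-reference situation described above can be illustrated with an ordinary Ruby Hash. This is a sketch of the general aliasing behaviour only, not the Logstash Event cache; the helper names `rename_by_reference` and `shares_object?` are assumptions.

```ruby
# Hypothetical helper: a "rename" that copies the reference, not the object.
def rename_by_reference(fields, from, to)
  fields[to] = fields[from]
  fields
end

# Hypothetical helper: true when both keys point at the very same object.
def shares_object?(fields, a, b)
  fields[a].equal?(fields[b])
end

fields = { "tags" => { "field1" => "value1" } }
rename_by_reference(fields, "tags", "mytags")
fields["mytags"]["field1"] = "My new message"

puts shares_object?(fields, "tags", "mytags") # true
puts fields["tags"]["field1"]                  # My new message
```

A change made through one name is visible through the other, because both keys hold the same object rather than independent copies.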

This has been fixed in master.
LS 5.5.1...

input {
  stdin{
    codec => json_lines
  }
}

filter {
  mutate {
    rename => { "tags" => "mytags" }
  }
  mutate {
    replace => { "[mytags][field1]" => "My new message" }
    add_tag => ["mutate-replace-field1"]
  }
  mutate {
    rename => { "mytags" => "tags" }
  }
}

# {"foo":["bar", "baz"], "tags":{"field1":"value1","field2":"value2"}, "colour":"red"}

output { stdout{ codec => rubydebug } }

Produces...

{
        "colour" => "red",
    "@timestamp" => 2017-08-21T14:18:12.235Z,
           "foo" => [
        [0] "bar",
        [1] "baz"
    ],
      "@version" => "1",
          "host" => "Elastics-MacBook-Pro.local",
          "tags" => [
        [0] [
            [0] "field1",
            [1] "My new message"
        ],
        [1] [
            [0] "field2",
            [1] "value2"
        ]
    ]
}

So when the mutate filter renames mytags to tags, it internally finds tags, sees that it is an array, and overwrites it with the mytags object, but converts that object to an array first (to keep the data type).
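The nested-array output shown above is what you get in plain Ruby when a Hash is coerced to an Array. The sketch below assumes the coercion behaves like Ruby's `Array()`; that is an assumption for illustration, not a statement about the actual Logstash code path, and `coerce_like_existing` is a made-up helper.

```ruby
# Hypothetical helper: if the existing value is an Array, coerce the incoming
# value to an Array as well (assumed to behave like Kernel#Array).
def coerce_like_existing(existing, incoming)
  existing.is_a?(Array) ? Array(incoming) : incoming
end

mytags = { "field1" => "My new message", "field2" => "value2" }

p coerce_like_existing([], mytags)
# [["field1", "My new message"], ["field2", "value2"]]

puts coerce_like_existing(nil, mytags).class # Hash: nothing to coerce to
```

Each key/value pair becomes a two-element array, which matches the shape of the tags field in the 5.5.1 output.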

In master, the same config produces...

{
        "colour" => "red",
    "@timestamp" => 2017-08-21T14:15:05.013Z,
           "foo" => [
        [0] "bar",
        [1] "baz"
    ],
      "@version" => "1",
          "host" => "Elastics-MacBook-Pro.local",
          "tags" => {
        "field2" => "value2",
        "field1" => "My new message"
    }
}

Note that tags is now a Hash with "field1" replaced.

@guyboertje thanks for this! That looks accurate.

Do you know which release this fix will be a part of? I tried it in version 5.5.2, and I also tried it in what I think is a nightly snapshot (https://snapshots.elastic.co/downloads/logstash/logstash-7.0.0-alpha1-SNAPSHOT.tar.gz), but neither had the fix.

Should I have seen the fix in either of those versions?

Please let me know where I can find the version that has a fix for this issue. I am also facing the same issue.

@guyboertje Could you let me know which version includes this fix? I'm also hitting this.

It is not fixed in master. The above PR will fix it.

@guyboertje Looks like this was merged into master.
https://github.com/elastic/logstash/blob/master/logstash-core/lib/logstash/filters/base.rb#L197

When will this be available in a release?

These were not backported, sorry. Backporting now.
They should be available in 5.6.11 (if released) and 6.3.1.

It's not fixed in 5.6.11 (or .12 or .13 either) as far as I can tell.
https://github.com/elastic/logstash/blob/v5.6.13/logstash-core/lib/logstash/filters/base.rb#L199
