Graylog2-server: [0.21.0-beta4] Wildcards won't match if the wildcard-ed query string contains literal uppercase characters

Created on 1 Sep 2014 · 6 Comments · Source: Graylog2/graylog2-server

The issue is that Graylog passes queries containing wildcards to Elasticsearch using the query_string query type. As explained, for example, in this thread (or the related one here), wildcard queries will not match terms that contain uppercase characters unless additional work is done.

To demonstrate the issue, let a message contain the following fields:

ua: Jakarta Commons-HttpClient/3.1
ua_two: jakarta Commons-HttpClient/3.1

Now consider the following Graylog queries:

ua:Jakarta\ Commons\-HttpClient\/3.1 # will match the message literally
ua:Jakarta\ Commons\-HttpClient\/3.* # won't match the message
ua:Jakarta\ Commons\-HttpClient\/3.1 AND ua_two:j* # will match the message
ua:Jakarta\ Commons\-HttpClient\/3.1 AND ua_two:jakarta\ Commons* # won't match the message
ua:Jakarta\ Commons\-HttpClient\/3.1 AND ua_two:jakarta\ ?ommons* # will match the message. Note the absence of an uppercase char in the wildcard string
ua:J* AND ua_two:jakarta\ ?ommons* # won't match
ua:?* AND ua_two:jakarta\ ?ommons* # will match. Again, working around uppercase characters
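The results above are consistent with a simple model, sketched here in Python. It rests on two assumptions that are not confirmed in this report: the fields are stored verbatim (not analyzed), and query_string lowercases any term containing a wildcard before matching (in Elasticsearch 1.x this appears to be governed by the `lowercase_expanded_terms` option). The helper names `unescape` and `term_matches` are hypothetical, and `fnmatchcase` stands in for Lucene's wildcard matching.

```python
import re
from fnmatch import fnmatchcase


def unescape(term):
    # Drop query-syntax escapes such as "\ " or "\/".
    # Simplification: an escaped "\*" would wrongly become a wildcard here.
    return re.sub(r"\\(.)", r"\1", term)


def term_matches(stored, query_term):
    """Model of one field:value clause against a verbatim stored term."""
    has_wildcard = re.search(r"(?<!\\)[*?]", query_term) is not None
    pattern = unescape(query_term)
    if has_wildcard:
        # Assumed behaviour: wildcard terms are lowercased, stored terms are not.
        return fnmatchcase(stored, pattern.lower())
    # No wildcard: the term is compared literally.
    return stored == pattern


ua = "Jakarta Commons-HttpClient/3.1"
ua_two = "jakarta Commons-HttpClient/3.1"

print(term_matches(ua, r"Jakarta\ Commons\-HttpClient\/3.1"))
print(term_matches(ua, r"Jakarta\ Commons\-HttpClient\/3.*"))
print(term_matches(ua_two, r"jakarta\ ?ommons*"))
```

Under this model the lowercased pattern `j*` can never match the stored term `Jakarta ...`, while `?*` can, which is the behaviour listed above.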

It would be great if the mappings could be adjusted so that these Lucene caveats don't get exposed to the ignorant user.

As a workaround, regular wildcard queries

  {
    "query": {
      "wildcard": {
        "ua": "J*"
      }
    }
  }

work as expected (albeit off-Graylog).

All 6 comments

+1, this still appears to be the case in 1.1. If nothing else, it would be nice to detect that an uppercase letter is being used in a wildcard query and alert the user. It took me a while to figure out what was happening, which is very annoying for a known issue.
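A minimal sketch of the kind of check suggested here, in Python. This is not Graylog code: the function name `risky_wildcard_terms` is hypothetical, and the tokenizer is deliberately naive (it splits on unescaped whitespace and ignores grouping, quoting and ranges).

```python
import re

# Whitespace that is not escaped with a backslash separates clauses.
_TOKEN_SPLIT = re.compile(r"(?<!\\)\s+")
# An unescaped * or ? marks a wildcard term.
_WILDCARD = re.compile(r"(?<!\\)[*?]")


def risky_wildcard_terms(query):
    """Return clauses that combine a wildcard with an uppercase letter."""
    risky = []
    for token in _TOKEN_SPLIT.split(query.strip()):
        if token in ("AND", "OR", "NOT"):
            continue
        # Keep only the value part of an optional "field:value" clause.
        value = token.split(":", 1)[-1]
        if _WILDCARD.search(value) and any(c.isupper() for c in value):
            risky.append(token)
    return risky


print(risky_wildcard_terms(r"ua:J* AND ua_two:jakarta\ ?ommons*"))
```

A search interface could show a notice whenever the returned list is non-empty.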

It has nothing to do with upper or lower case characters. All message fields except for message, full_message and source are intentionally _not_ analyzed at index time, so wildcard queries on those fields will _not_ work at all.

If you want to circumvent that intentional restriction, you'll have to use Elasticsearch index templates (https://www.elastic.co/guide/en/elasticsearch/reference/1.5/indices-templates.html) for now.
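As a rough illustration of what such a template might look like, the Python sketch below builds a template body in the Elasticsearch 1.x format. Everything specific in it is an assumption rather than something stated in this thread: the index pattern `graylog2_*`, the mapping type `message`, the analyzer name `lowercase_keyword`, and the choice of the `ua` field. The idea is to index the field as a single lowercased term so that a lowercased wildcard pattern can match it.

```python
import json

# Assumed names throughout: index pattern, mapping type, analyzer and field.
template = {
    "template": "graylog2_*",
    "settings": {
        "analysis": {
            "analyzer": {
                # Keep the whole value as one term, but lowercase it.
                "lowercase_keyword": {
                    "type": "custom",
                    "tokenizer": "keyword",
                    "filter": ["lowercase"],
                }
            }
        }
    },
    "mappings": {
        "message": {
            "properties": {
                "ua": {"type": "string", "analyzer": "lowercase_keyword"}
            }
        }
    },
}

body = json.dumps(template, indent=2)
print(body)
```

The resulting body would be sent with a PUT request to the `_template` endpoint described in the linked documentation. Templates only affect indices created after they are installed, and a lowercasing analyzer also changes the terms returned by aggregations on that field, so this is a trade-off and not a drop-in fix.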

This doesn't resolve the issue. Having to substitute a ? for any uppercase letter in a wildcard query is unexpected behavior by most standards. That _is_ the way the current interface operates. Regardless of the cause, this behavior should be addressed somehow. Anything less results in a bad user experience. For example, I can't tell a non-technical user that foo:\/blah\/SDkdf\/*.tgz won't return any results, but foo:\/blah\/??kdf\/*.tgz will return the results that they are expecting (as well as some false positives). They won't accept it. To them it looks like I'm an idiot for using Graylog when it can't do a simple search properly. I understand the problem may not be readily resolvable, but at least adding a notice to alert the user when such a search is attempted would reduce user frustration.

I am using 1.1.5, and this issue is present there as well.

+1

This is still happening with 2.2.3

+1
This is a really annoying issue. I have all my data in uppercase. So how do I search for:
field:DATA*
I can't...?!
