Elasticsearch: Term query not working for capital letters

Created on 21 Apr 2014  ·  5 Comments  ·  Source: elastic/elasticsearch

ElasticSearch 1.1.0

curl -XPOST 'http://localhost:9200/test/test' -d '{
"letters" : "ABCD"
}'
=>
{"_index":"test","_type":"test","_id":"I8X9z8S9SIahvmv3wFekzA","_version":1,"created":true}

curl -XGET 'http://localhost:9200/test/_search' -d '{
"query": {
"term": { "letters": "ABCD" }
}
}'
=>
{"took":1,"timed_out":false,"_shards":{"total":5,"successful":5,"failed":0},"hits":{"total":0,"max_score":null,"hits":[]}}

All 5 comments

This isn't working because Elasticsearch analyzes text fields at index time, but the term query does not analyze its input. The term query takes the term exactly as provided and looks for it among the analyzed terms in the index. Your example doesn't find anything because the default analyzer lowercases: the document was indexed as the term abcd, and the query looks for ABCD. You have two options:

  1. Use a match query. It analyzes the query text with the field's analyzer and then looks up the result, so a lowercase search finds capitals and vice versa.
  2. Use the mapping API to define the letters field as not_analyzed. The term query then matches the stored value exactly: capitals find only capitals and lowercase finds only lowercase.
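
The analysis step and the first option can be checked against the index from the original report. This is a minimal sketch, assuming Elasticsearch 1.x on localhost:9200 and the `test` index created in the report; the `ES_URL` and `MATCH_QUERY` variable names are only illustrative. The first request asks the standard analyzer how it tokenizes `ABCD`, and the second repeats the failing search with `match` in place of `term`.

```shell
# Assumption: Elasticsearch 1.x is reachable here; override ES_URL if not.
ES_URL="${ES_URL:-http://localhost:9200}"

# Show the tokens the standard analyzer produces for "ABCD".
# The token is expected to come back lowercased.
curl -s -XGET "$ES_URL/_analyze?analyzer=standard" -d 'ABCD'
echo

# Same search as in the report, but with match instead of term,
# so the query text goes through the field's analyzer first.
MATCH_QUERY='{
  "query": {
    "match": { "letters": "ABCD" }
  }
}'
curl -s -XGET "$ES_URL/test/_search" -d "$MATCH_QUERY"
echo
```

If the analyzer output shows a lowercased token, that is the term the original `term` query would have needed to look for.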

Might be worth adding something else to the docs because I've seen this kind of mistake quite a few times now.


Might be worth adding something else to the docs because I've seen this kind of mistake quite a few times now.

Such as this: http://www.elasticsearch.org/guide/en/elasticsearch/guide/current/_exact_values_vs_full_text.html

Ah! That covers it.

@nik9000
That seems opposite to what you said. My fields are not_analyzed in the mapping, but they still have the problem of not matching capital letters.

That seems opposite to what you said. My fields are not_analyzed in the mapping, but they still have the problem of not matching capital letters.

If they are not analyzed and you can't find them when you search for the exact same text, then I'd open up an issue with a recreation in curl (or Sense). The curl step is super important because it's kind of a lowest common denominator: we can all run the tests. It _should_ work. I mean, we test it on every compile, but it could be that you've found something very novel.
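
A curl recreation of the not_analyzed case might look roughly like the following. This is only a sketch under stated assumptions: Elasticsearch 1.x on localhost:9200, and a throwaway index named `repro` (a hypothetical name chosen here, deleted at the start of the script). The `ES_URL`, `INDEX`, and `MAPPING` variables are illustrative as well.

```shell
# Assumptions: Elasticsearch 1.x on localhost:9200; "repro" is a throwaway
# index name chosen for this sketch and is deleted first.
ES_URL="${ES_URL:-http://localhost:9200}"
INDEX="repro"

# Mapping that stores "letters" as a single, unanalyzed term.
MAPPING='{
  "mappings": {
    "test": {
      "properties": {
        "letters": { "type": "string", "index": "not_analyzed" }
      }
    }
  }
}'

# Start clean so an older, dynamically created mapping cannot interfere.
curl -s -XDELETE "$ES_URL/$INDEX"; echo
curl -s -XPUT "$ES_URL/$INDEX" -d "$MAPPING"; echo

# Index one document and refresh so it is searchable immediately.
curl -s -XPOST "$ES_URL/$INDEX/test" -d '{ "letters": "ABCD" }'; echo
curl -s -XPOST "$ES_URL/$INDEX/_refresh"; echo

# Print the mapping the index actually has, then run the term query.
curl -s -XGET "$ES_URL/$INDEX/_mapping"; echo
curl -s -XGET "$ES_URL/$INDEX/_search" -d '{
  "query": { "term": { "letters": "ABCD" } }
}'; echo
```

If the final search returns a hit here but not on the real index, comparing the `_mapping` output of the two indices is a reasonable next step before filing the issue.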
