Elasticsearch version (bin/elasticsearch --version):
7.11 / 7.x / master
Plugins installed: []
JVM version (java -version):
java version "14.0.1" 2020-04-14
Java(TM) SE Runtime Environment (build 14.0.1+7)
Java HotSpot(TM) 64-Bit Server VM (build 14.0.1+7, mixed mode, sharing)
OS version (uname -a if on a Unix-like system):
Darwin Kernel Version 18.7.0: Thu Jun 18 20:50:10 PDT 2020; root:xnu-4903.278.43~1/RELEASE_X86_64 x86_64
Description of the problem including expected versus actual behavior:
TL;DR: In 7.11/7.x/master, the expected shard failures are not returned in the response when sorting on a field that is not present in a given index. Testing the case below manually, 7.10 returns the shard failures in the response, but in 7.11/7.x/master they are missing.
I have a test which queries an index pattern myfa* and sorts on event.ingested. The sample indices + docs are as follows.
POST myfakeindex-1/_doc
{
"message": "hello world 1"
}
POST myfakeindex-2/_doc
{
"message": "hello world 2",
"event": {
"ingested": "2020-12-14T22:31:01.726Z"
}
}
POST myfakeindex-3/_doc
{
"message": "hello world 3",
"@timestamp": "2020-12-14T22:31:01.726Z"
}
Edit: updated the query to include the range clauses.
Query:
GET myfa*/_search
{
"query": {
"bool": {
"filter": [
{
"bool": {
"must": [],
"filter": [
{
"match_all": {}
}
],
"should": [],
"must_not": []
}
},
{
"bool": {
"filter": [
{
"bool": {
"should": [
{
"range": {
"event.ingested": {
"gte": "1900-01-01T00:00:00.000Z",
"format": "strict_date_optional_time"
}
}
}
],
"minimum_should_match": 1
}
},
{
"bool": {
"should": [
{
"range": {
"event.ingested": {
"lte": "2021-01-04T00:05:23.924Z",
"format": "strict_date_optional_time"
}
}
}
],
"minimum_should_match": 1
}
}
]
}
},
{
"match_all": {}
}
]
}
},
"sort": [
{
"event.ingested": {
"order": "asc"
}
}
]
}
I am expecting to receive shard failures in the query response with the message "No mapping found for [event.ingested] in order to sort on" for both myfakeindex-1 and myfakeindex-3. These failures are present in 7.10, which returns the expected response, but in 7.11/7.x/master they are missing.
Expected response
{
"took" : 6,
"timed_out" : false,
"_shards" : {
"total" : 3,
"successful" : 1,
"skipped" : 0,
"failed" : 2,
"failures" : [
{
"shard" : 0,
"index" : "myfakeindex-1",
"node" : "43l9TxYaSdKemur9313R8A",
"reason" : {
"type" : "query_shard_exception",
"reason" : "No mapping found for [event.ingested] in order to sort on",
"index_uuid" : "wqVcHWZqTuy04047wwWRpw",
"index" : "myfakeindex-1"
}
},
{
"shard" : 0,
"index" : "myfakeindex-3",
"node" : "43l9TxYaSdKemur9313R8A",
"reason" : {
"type" : "query_shard_exception",
"reason" : "No mapping found for [event.ingested] in order to sort on",
"index_uuid" : "OmMdkwBURbWPGevduRmB3w",
"index" : "myfakeindex-3"
}
}
]
},
"hits" : {
"total" : {
"value" : 1,
"relation" : "eq"
},
"max_score" : null,
"hits" : [
{
"_index" : "myfakeindex-2",
"_type" : "_doc",
"_id" : "ZeXFzXYBnhmb658UiX-I",
"_score" : null,
"_source" : {
"message" : "hello world 2",
"event" : {
"ingested" : "2020-12-14T22:31:01.726Z"
}
},
"sort" : [
1607985061726
]
}
]
}
}
Actual response
{
"took" : 5,
"timed_out" : false,
"_shards" : {
"total" : 3,
"successful" : 3,
"skipped" : 2,
"failed" : 0
},
"hits" : {
"total" : {
"value" : 1,
"relation" : "eq"
},
"max_score" : null,
"hits" : [
{
"_index" : "myfakeindex-2",
"_id" : "JIPGzXYBTQY__yEC0VlV",
"_score" : null,
"_source" : {
"message" : "hello world 2",
"event" : {
"ingested" : "2020-12-16T15:16:18.570Z"
}
},
"sort" : [
1608131778570
]
}
]
}
}
Steps to reproduce:
Index the three documents from the description above and run the same query, then compare the _shards section of the response in 7.10 versus 7.11/7.x/master.
Provide logs (if relevant):
Pinging @elastic/es-search (Team:Search)
I don't think that anything has changed in 7.11. The fact that you have skipped shards in your response indicates that the can-match phase filtered out the two shards that don't have the sort field. However, that shouldn't happen with a match_all query, so I suspect that you used a range query on the event.ingested field; I cannot reproduce this with a match_all. If that's the case, then the behavior was the same in 7.10: we don't check the sort criteria if the can-match phase detects that the shard cannot match the search request.
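To illustrate the distinction described above, here is a minimal query (assuming the three sample indices from the description) that the can-match phase cannot use to skip shards, since there is no range clause to prove a shard cannot match; with this query the sort failures should still surface on the shards missing the mapping:

```json
GET myfa*/_search
{
  "query": { "match_all": {} },
  "sort": [ { "event.ingested": { "order": "asc" } } ]
}
```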
I did omit the range query piece, but I must be missing something locally, because running Elasticsearch from the 7.10 branch I'm seeing the failures populated even with the range query present. (We have a separate issue open to remove the should clauses from this query, so maybe that would impact the presence of shard failures in the response?)
Query (run against the 7.10 branch):
GET myfa*/_search
{
"query": {
"bool": {
"filter": [
{
"bool": {
"must": [],
"filter": [
{
"match_all": {}
}
],
"should": [],
"must_not": []
}
},
{
"bool": {
"filter": [
{
"bool": {
"should": [
{
"range": {
"event.ingested": {
"gte": "1900-01-01T00:00:00.000Z",
"format": "strict_date_optional_time"
}
}
}
],
"minimum_should_match": 1
}
},
{
"bool": {
"should": [
{
"range": {
"event.ingested": {
"lte": "2021-01-04T00:05:23.924Z",
"format": "strict_date_optional_time"
}
}
}
],
"minimum_should_match": 1
}
}
]
}
},
{
"match_all": {}
}
]
}
},
"sort": [
{
"event.ingested": {
"order": "asc"
}
}
]
}
Response (7.10, shard failures present):
{
"took" : 1,
"timed_out" : false,
"_shards" : {
"total" : 3,
"successful" : 1,
"skipped" : 0,
"failed" : 2,
"failures" : [
{
"shard" : 0,
"index" : "myfakeindex-1",
"node" : "43l9TxYaSdKemur9313R8A",
"reason" : {
"type" : "query_shard_exception",
"reason" : "No mapping found for [event.ingested] in order to sort on",
"index_uuid" : "wqVcHWZqTuy04047wwWRpw",
"index" : "myfakeindex-1"
}
},
{
"shard" : 0,
"index" : "myfakeindex-3",
"node" : "43l9TxYaSdKemur9313R8A",
"reason" : {
"type" : "query_shard_exception",
"reason" : "No mapping found for [event.ingested] in order to sort on",
"index_uuid" : "OmMdkwBURbWPGevduRmB3w",
"index" : "myfakeindex-3"
}
}
]
},
"hits" : {
"total" : {
"value" : 1,
"relation" : "eq"
},
"max_score" : null,
"hits" : [
{
"_index" : "myfakeindex-2",
"_type" : "_doc",
"_id" : "ZeXFzXYBnhmb658UiX-I",
"_score" : null,
"_source" : {
"message" : "hello world 2",
"event" : {
"ingested" : "2020-12-14T22:31:01.726Z"
}
},
"sort" : [
1607985061726
]
}
]
}
}
Query: (the same query as above)
Response (7.11, no shard failures):
{
"took" : 4,
"timed_out" : false,
"_shards" : {
"total" : 3,
"successful" : 3,
"skipped" : 2,
"failed" : 0
},
"hits" : {
"total" : {
"value" : 1,
"relation" : "eq"
},
"max_score" : null,
"hits" : [
{
"_index" : "myfakeindex-2",
"_id" : "JIPGzXYBTQY__yEC0VlV",
"_score" : null,
"_source" : {
"message" : "hello world 2",
"event" : {
"ingested" : "2020-12-16T15:16:18.570Z"
}
},
"sort" : [
1608131778570
]
}
]
}
}
Ok thanks, I found the change that caused this.
I would not consider this a bug, though, because we don't validate the entire search request when skipping a shard.
The error was more a side effect of how we execute the can-match phase; there is no guarantee that we'll detect errors when a shard is skipped.
Thanks for the explanation! Our use case for these errors was more informational, from the customer's perspective, than logic required for the security solution's detection rules to execute properly.
The security solution's detection rules run queries against any indices which are ECS-compliant. Originally we required the @timestamp field to be present on any indices a customer wanted to search against. As we transition away from relying on @timestamp to event.ingested (for example, the endpoint agent utilizes event.ingested), we needed a way to search and sort on two fields that may or may not be present in the indices matching the index patterns provided to the detection rule. We also wanted to tell the customer which indices were missing which sort fields. The shard failure messages were used to tell the customer: "hey, the detection rule found some documents matching your search, but it could not search against the following indices [failed indices] because they are missing the sort fields."
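As an aside (not a substitute for surfacing the failures to the user), a sort on a possibly-missing field can be made to succeed on shards without the mapping by using the unmapped_type sort option, which treats documents on those shards as having no value for the field. A sketch for the two-field case described above might look like:

```json
GET myfa*/_search
{
  "sort": [
    { "event.ingested": { "order": "asc", "unmapped_type": "date" } },
    { "@timestamp": { "order": "asc", "unmapped_type": "date" } }
  ]
}
```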
The resolution for the security solution is to now utilize the getFieldCaps endpoint to determine which indices are missing which sort fields and provide this information to the user. Thanks for investigating @jtibshirani @jimczi 😄
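For anyone landing here, a rough sketch of that approach against the field capabilities API directly (the exact response shape may vary by version, so verify against the docs): with include_unmapped=true, the response should contain an unmapped entry per field whose indices array identifies the indices where that field is absent.

```json
GET myfa*/_field_caps?fields=event.ingested,@timestamp&include_unmapped=true
```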