When working with source filtering (e.g. setFetchSource), the excludes are interpreted first and the includes second. On a conflict I would expect the opposite, since an include states directly what is desired. In practice, this causes problems when using wildcards.
If I have the following structure:
{
  "person_name" : "Charles",
  "person_age" : 67,
  "person_dob" : "1950-10-01",
  "person_ssn" : "xxx-xx-xxxx",
  "person_location" : {
    "name" : "My House",
    "address" : "xxx",
    "city" : "Orlando",
    "state" : "Florida"
  }
}
And I want to get all the immediate person fields plus the location name, but none of the other location fields. I would expect to be able to ask for the following:
builder.setFetchSource(
    /* include */ new String[]{"person_*", "person.location_name"},
    /* exclude */ new String[]{"person.location_*"}
);
This results in all person.location_* fields being excluded, even person.location_name. Although this example is a fairly complicated use of source filtering, I would still expect includes to always win over excludes.
Your example mixes up dots and underscores, but the description of the problem is correct:
PUT t/t/1
{
  "person_name" : "Charles",
  "person_age" : 67,
  "person_dob" : "1950-10-01",
  "person_ssn" : "xxx-xx-xxxx",
  "person_location" : {
    "name" : "My House",
    "address" : "xxx",
    "city" : "Orlando",
    "state" : "Florida"
  }
}
GET t/_search
{
  "_source": {
    "include": ["person_*", "person_location.name"],
    "exclude": "person_location.*"
  }
}
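To make the reported behavior concrete, here is a minimal sketch of a filter in which excludes always win. It is a toy model, not Elasticsearch's implementation: the class ExcludesFirstSketch, its methods, and the flattening of the document into dotted paths are all assumptions made for illustration, and '*' is assumed to match any run of characters, dots included.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Pattern;

/** Toy model of source filtering in which excludes win; not Elasticsearch code. */
final class ExcludesFirstSketch {

    /** The example document from this thread, flattened to dotted leaf paths. */
    static final List<String> PATHS = List.of(
            "person_name", "person_age", "person_dob", "person_ssn",
            "person_location.name", "person_location.address",
            "person_location.city", "person_location.state");

    /** Assumed wildcard rule: '*' matches any run of characters, dots included. */
    static boolean matches(String pattern, String path) {
        String[] literals = pattern.split("\\*", -1);
        StringBuilder regex = new StringBuilder();
        for (int i = 0; i < literals.length; i++) {
            if (i > 0) {
                regex.append(".*");
            }
            regex.append(Pattern.quote(literals[i]));
        }
        return Pattern.matches(regex.toString(), path);
    }

    /** Keeps a path only if it is included (or no includes are given) and not excluded. */
    static List<String> filter(List<String> paths, List<String> includes, List<String> excludes) {
        List<String> kept = new ArrayList<>();
        for (String path : paths) {
            boolean included = includes.isEmpty()
                    || includes.stream().anyMatch(p -> matches(p, path));
            boolean excluded = excludes.stream().anyMatch(p -> matches(p, path));
            if (included && !excluded) {
                kept.add(path);
            }
        }
        return kept;
    }

    public static void main(String[] args) {
        List<String> kept = filter(PATHS,
                List.of("person_*", "person_location.name"),
                List.of("person_location.*"));
        // Prints [person_name, person_age, person_dob, person_ssn]:
        // person_location.name is dropped even though it is included explicitly.
        System.out.println(kept);
    }
}
```

Under this model the explicit include of person_location.name has no effect, because the exclude pattern person_location.* also matches it.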
I think I agree with your reasoning. @bleskes what do you think?
Discussed in FixItFriday: We have to choose a precedence order one way or the other, and there are going to be use cases which are hard either way (include all X except X.Y, or exclude all X except X.Y). Elsewhere (e.g. term agg) the includes take precedence over excludes, so we should keep it this way for consistency. The behaviour should be documented properly though, so the fix here is to add an explanation of the precedence order in the documentation.
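The trade-off between the two precedence orders can be sketched as follows. Both rules below are hypothetical simplifications written for this illustration (the names PrecedenceSketch, keepExcludesWin and keepIncludesWin are not from Elasticsearch), and the "includes win" rule assumes that a path matching neither list is kept.

```java
import java.util.List;
import java.util.regex.Pattern;

/** Toy comparison of two precedence rules; neither is Elasticsearch's actual code. */
final class PrecedenceSketch {

    /** True if any pattern matches; '*' is assumed to match any run of characters. */
    static boolean matchesAny(List<String> patterns, String path) {
        for (String pattern : patterns) {
            // Quote the whole pattern, then reopen the quoting around each '*'.
            String regex = Pattern.quote(pattern).replace("*", "\\E.*\\Q");
            if (Pattern.matches(regex, path)) {
                return true;
            }
        }
        return false;
    }

    /** Excludes win: a path survives only if it is included and not excluded. */
    static boolean keepExcludesWin(String path, List<String> includes, List<String> excludes) {
        return matchesAny(includes, path) && !matchesAny(excludes, path);
    }

    /** Includes win (hypothetical): an explicit include rescues a path from any exclude. */
    static boolean keepIncludesWin(String path, List<String> includes, List<String> excludes) {
        return matchesAny(includes, path) || !matchesAny(excludes, path);
    }

    public static void main(String[] args) {
        // Case A: include all of x except x.y.
        List<String> incA = List.of("x.*");
        List<String> excA = List.of("x.y");
        // Case B: exclude all of x except x.y.
        List<String> incB = List.of("x.y");
        List<String> excB = List.of("x.*");

        System.out.println("A, excludes win, x.y kept: " + keepExcludesWin("x.y", incA, excA));
        System.out.println("A, includes win, x.y kept: " + keepIncludesWin("x.y", incA, excA));
        System.out.println("B, excludes win, x.y kept: " + keepExcludesWin("x.y", incB, excB));
        System.out.println("B, includes win, x.y kept: " + keepIncludesWin("x.y", incB, excB));
    }
}
```

In this model case A only works when excludes win and case B only works when includes win, which is the "hard either way" situation described above.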
@colings86 I think you're saying that the intended behavior is for includes to take precedence over excludes, but this does not appear to be the case (at least in v2.4.3).
The following query, for example, returns an empty _source object in all cases:
{
  "_source": {
    "includes": "data.name",
    "excludes": "data.*"
  }
}
I think there are a few issues here:
- When specifying includes, the default behavior becomes "omit all fields". This doesn't really make sense in the presence of an explicit excludes.
- excludes seems to take priority over includes in all cases.
- Path specs have a natural order of specificity (* < *.name < data.* < data.name). If the path specs were evaluated in order based on specificity it would make all use cases easier to implement.
Pinging @elastic/es-search-aggs
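The specificity idea could look roughly like the sketch below. Everything in it is an assumption made for illustration, not an existing Elasticsearch feature: the class SpecificitySketch, the comparison rule (segments are compared left to right, a literal segment beats one containing '*', and a longer pattern beats its own prefix), the tie-break in favour of excludes, and the default for paths that no pattern matches.

```java
import java.util.List;
import java.util.regex.Pattern;

/** Hypothetical "most specific pattern wins" rule; not an Elasticsearch feature. */
final class SpecificitySketch {

    /** Positive if pattern a is more specific than pattern b, negative if less. */
    static int compareSpecificity(String a, String b) {
        String[] as = a.split("\\.");
        String[] bs = b.split("\\.");
        for (int i = 0; i < Math.min(as.length, bs.length); i++) {
            boolean aLiteral = !as[i].contains("*");
            boolean bLiteral = !bs[i].contains("*");
            if (aLiteral != bLiteral) {
                return aLiteral ? 1 : -1;
            }
        }
        return Integer.compare(as.length, bs.length);
    }

    /** Assumed wildcard rule: '*' matches any run of characters, dots included. */
    static boolean matches(String pattern, String path) {
        String regex = Pattern.quote(pattern).replace("*", "\\E.*\\Q");
        return Pattern.matches(regex, path);
    }

    /** The most specific matching pattern decides whether the path is kept. */
    static boolean keep(String path, List<String> includes, List<String> excludes) {
        String best = null;
        boolean bestIsInclude = false;
        for (String p : excludes) {
            if (matches(p, path) && (best == null || compareSpecificity(p, best) > 0)) {
                best = p;
                bestIsInclude = false;
            }
        }
        for (String p : includes) {
            // Strictly greater, so an exclude wins a tie in specificity.
            if (matches(p, path) && (best == null || compareSpecificity(p, best) > 0)) {
                best = p;
                bestIsInclude = true;
            }
        }
        // Assumed default: unmatched paths are kept only when no includes are given.
        return best == null ? includes.isEmpty() : bestIsInclude;
    }

    public static void main(String[] args) {
        List<String> includes = List.of("data.name");
        List<String> excludes = List.of("data.*");
        System.out.println(keep("data.name", includes, excludes)); // prints true
        System.out.println(keep("data.id", includes, excludes));   // prints false
    }
}
```

With this rule the query above would return data.name, and the earlier person_location example would keep person_location.name while dropping the other location fields.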
[docs issue triage]
leaving open as this is still relevant