>>> import elasticsearch_dsl
>>> elasticsearch_dsl.VERSION
(2, 0, 0)
>>> from elasticsearch_dsl import Search, Q
When filter is called after the query method, 'minimum_should_match' is present in the query:
>>> s = Search()
>>> s.query(q).filter('term', status=1).to_dict()
{'query': {'bool': {'filter': [{'term': {'status': 1}}],
'minimum_should_match': 1,
'should': [{'match': {'name': 'abc'}}]}}}
But if we swap the order of filter and query, then 'minimum_should_match' disappears:
>>> s = Search()
>>> s.filter('term', status=1).query(q).to_dict()
{'query': {'bool': {'filter': [{'term': {'status': 1}}],
'should': [{'match': {'name': 'abc'}}]}}}
In this case it was safe to drop it, since dropping it does not change the behavior of the query, which is the thing we prioritize. We have also changed some things in master to make it even more resilient in the future.
My queries do not work without "minimum_should_match" param.
I ran into the same issue, and the results are definitely not the same for the two cases, so I don't think it is safe to drop the minimum_should_match parameter here.
Without this parameter, the hits returned are all the results that match the filter alone, whereas with minimum_should_match: 1 the result set is narrowed down further by the should clause.
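To make the difference concrete, here is a minimal sketch in plain Python. It is not elasticsearch_dsl or Elasticsearch code: `matches` is a hypothetical helper that models only a small subset of bool query semantics, treats `term` and `match` as exact equality, ignores `must_not`, and assumes the documented default that should clauses become optional (minimum_should_match of 0) once a must or filter clause is present.

```python
def matches(query, doc):
    """Evaluate a tiny, simplified subset of the query DSL against a dict."""
    (kind, body), = query.items()
    if kind in ("term", "match"):
        # Simplification: both are treated as exact equality on one field.
        (field, value), = body.items()
        return doc.get(field) == value
    if kind == "bool":
        required = body.get("must", []) + body.get("filter", [])
        should = body.get("should", [])
        # Assumed default: should is optional when must/filter is present.
        default_msm = 0 if required else 1
        msm = body.get("minimum_should_match", default_msm)
        hits = sum(matches(q, doc) for q in should)
        return all(matches(q, doc) for q in required) and (not should or hits >= msm)
    raise ValueError("unsupported query type: %s" % kind)


# The two dicts produced by the two call orders in the report above.
with_msm = {"bool": {"filter": [{"term": {"status": 1}}],
                     "minimum_should_match": 1,
                     "should": [{"match": {"name": "abc"}}]}}
without_msm = {"bool": {"filter": [{"term": {"status": 1}}],
                        "should": [{"match": {"name": "abc"}}]}}

# A document that passes the filter but matches no should clause.
doc = {"status": 1, "name": "zzz"}
print(matches(with_msm, doc))     # False
print(matches(without_msm, doc))  # True
```

Under these assumptions, the document is excluded only when minimum_should_match is kept, which is consistent with the larger hit count described above.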
Sorry, I forgot to provide a query:
>>> q = Q('bool', should=[Q('match', name='aaa'), Q('match', desc='aaa')], minimum_should_match=2)
>>> s.query(q).filter('term', status=1).to_dict()
{'query': {'bool': {'filter': [{'term': {'status': 1}}],
'minimum_should_match': 2,
'should': [{'match': {'name': 'aaa'}}, {'match': {'desc': 'aaa'}}]}}}
>>> s.filter('term', status=1).query(q).to_dict()
{'query': {'bool': {'filter': [{'term': {'status': 1}}],
'should': [{'match': {'name': 'aaa'}}, {'match': {'desc': 'aaa'}}]}}}
The same thing happens when "minimum_should_match" is greater than one. When filter goes first, minimum_should_match disappears.
I have the same problem. When I provide minimum_should_match twice in a query, once on the filter clause and once on the should clause, and call filter first, the minimum_should_match on the should clause is dropped. There is no problem if I swap the order.
I have just pushed a new version to PyPI (2.1.0) that should have improved quite a bit in this regard. If someone could verify whether these issues still persist, I would be glad.
Also, please note that the library only cares about the _meaning_ of the query, not necessarily the _shape_, so it can change things like minimum_should_match as long as the behavior of the query remains unchanged.
Thank you all for your cooperation!
I can confirm that there's still an issue which seems to be related to how queries are ANDed. Here is an example that reproduces the issue:
# Three queries
q1 = Q('term', category=1)
q2 = Q('bool', should=[Q('term', name='aaa'), Q('term', name='bbb')])
q3 = Q('bool', should=[Q('term', type='ccc'), Q('term', type='ddd')])
# What happens when we combine them:
print(q1 & q2 & q3)
> Bool(minimum_should_match=0,
must=[Term(category=1), Bool(minimum_should_match=1, should=[Term(type='ccc'), Term(type='ddd')])],
should=[Term(name='aaa'), Term(name='bbb')]
)
# What if we change the order:
print(q3 & q2 & q1)
> Bool(minimum_should_match=1,
must=[Bool(minimum_should_match=1, should=[Term(name='aaa'), Term(name='bbb')]), Term(category=1)],
should=[Term(type='ccc'), Term(type='ddd')]
)
Note that the outer minimum_should_match has changed from 0 to 1 depending on the order in which the queries are combined. The _correct_ result is the second one, because in the first case, setting minimum_should_match to 0 means that the term filter is not enforced if one of the other conditions is met.
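The two combined queries can be compared with a small sketch in plain Python. This is not elasticsearch_dsl code: `term` and `bool_` are hypothetical helpers that build predicates mirroring the printed `Term(...)` and `Bool(...)` structures, under the assumption that a bool query matches when every must clause matches and at least minimum_should_match should clauses match.

```python
def term(field, value):
    # Hypothetical stand-in for Term(field=value): exact equality on one field.
    return lambda doc: doc.get(field) == value


def bool_(must=(), should=(), minimum_should_match=0):
    # Hypothetical stand-in for Bool(...): all must clauses have to match,
    # plus at least `minimum_should_match` of the should clauses.
    def check(doc):
        hits = sum(clause(doc) for clause in should)
        return all(clause(doc) for clause in must) and hits >= minimum_should_match
    return check


# Structure printed for q1 & q2 & q3 (outer minimum_should_match=0).
first = bool_(
    minimum_should_match=0,
    must=[term('category', 1),
          bool_(minimum_should_match=1,
                should=[term('type', 'ccc'), term('type', 'ddd')])],
    should=[term('name', 'aaa'), term('name', 'bbb')],
)

# Structure printed for q3 & q2 & q1 (outer minimum_should_match=1).
second = bool_(
    minimum_should_match=1,
    must=[bool_(minimum_should_match=1,
                should=[term('name', 'aaa'), term('name', 'bbb')]),
          term('category', 1)],
    should=[term('type', 'ccc'), term('type', 'ddd')],
)

# Matches the category and type conditions, but neither name.
doc = {'category': 1, 'type': 'ccc', 'name': 'zzz'}
print(first(doc))   # True
print(second(doc))  # False
```

In this model, the first form lets through a document that satisfies none of the name conditions, while the second form rejects it, so the two orderings are not equivalent.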
Thank you @solarissmoke! This is very useful feedback and helped me discover a bug where, when doing Bool & NotBool, the minimum_should_match was ignored. That's why the order mattered.
I pushed a fix in 4b4487d0659ac8f2cf0e6499009ce696fca35684 for this particular case with a test.
I will close the issue now, feel free to reopen if the issue isn't resolved.
Thanks again for all the help!
Hi, we had the same problem and this solved our issue.
Can you make a release of this version so that it can be installed with pip?
Thanks for your work :+1: