What is the proper way to query a nested object? Example:
s = Search(using=es).index("my_index").query("nested", path="features", query=Q("term", features__name="foo"))
The code expects keyword parameters, but specifying a nested field like 'features.name' as keyword parameter is not possible and 'features__name' is not converted into 'features.name'.
If it's not supported, I could add some code that will properly replace double underscores with periods.
Hi Adriaan,
converting a__b to a.b is indeed not supported. Currently you can achieve the code you want by specifying:
s = Search(using=es).index("my_index").query("nested", path="features", query=Q("term", **{'features.name': "foo"}))
...which is obviously far from ideal.
Transforming __ to dots definitely sounds like a good idea - something the Q and F shortcuts could do. I will look into it.
Thanks!
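For illustration, the transformation proposed above can be sketched in plain Python. This is a minimal sketch, not the library's implementation: `expand_double_underscores` and `term_query` are hypothetical helpers that only build a raw query dict.

```python
def expand_double_underscores(**params):
    """Return a copy of params with '__' in each key replaced by '.'."""
    return {name.replace("__", "."): value for name, value in params.items()}


def term_query(**params):
    # Hypothetical stand-in for Q("term", ...): builds only the raw query body.
    return {"term": expand_double_underscores(**params)}


body = term_query(features__name="foo")
print(body)
```

Note that a blanket replacement like this also rewrites field names that really do contain a double underscore.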
Of course, unpacking a dict! Thanks for the gentle reminder :)
That said, I'm glad you like the idea of transforming __, it's also in line with how Django ORM handles relationships.
Hey @HonzaKral, this issue did not demand any patching. The transformation of __, a pattern followed by Django ORM, is out of scope for Elasticsearch DSL, since Elasticsearch nowhere mentions the use of such separators. Moreover, it forces your mappings to not have fields with a double underscore. Can we revert the patch?
Hi @harshit298, I understand that this is a hindrance for you and doesn't help. On the other hand, having __ in your field names is very unusual, and having a way to use a dotted path without having to resort to the **{} pattern is definitely convenient. And convenience is, after all, what this dsl is all about.
If I don't hear from anyone else that they use __ in their field names and this issue is hurting them, I will leave it as is. I am sorry this doesn't work for you, I can only offer a workaround that will work:
q = Q('match')
q.field__with__underscores = 'value'
agreed :+1:
Hi @HonzaKral, I had this exact same problem - we have data indexed with nested fields. We needed both non-analyzed and analyzed values for a field so we copy the non-analyzed "value" to a "value__text" field. This "feature" is not well documented, so I ended up digging through the source and found this uncommented functionality.
Since we were up against a tight deadline, I did not have time to reindex, so I had to monkey patch the DslMeta class. I'd recommend better documentation on this or removal from the source to prevent future headaches.
Thanks for the awesome library, though, it's so much better than my former hackish DSL formatter! :+1:
To add to the dialog, I would weigh in on @adriaant 's side. Using __ to access nested fields is a Django ORM standard, and as such I'd think a common and expected way to access nested fields. I for one tried it reflexively without documented support.
I'm having trouble getting correct results using nested queries :( The reason seems to be that when the ES query is formed, it's missing some parts, e.g., "nested" and "path". See https://www.elastic.co/guide/en/elasticsearch/guide/master/nested-query.html Can someone confirm successful use of nested queries? Thanks.
Here's a sample query I use:
q = Q('bool',
    must=[
        Q("term", source=source_name.lower()),
        Q("nested", path="features", query=Q("term", **{'features.name': feature_name.lower()}))
    ],
)
es = get_es()
s = Search().query(q).using(es).index(ProductDoc._doc_type.index).fields('id').extra(from_=0, size=5000)
Perhaps you can compare to the code you use.
@adriaant Many thanks for your help! Worked just fine! I somehow focused on "." vs. "__" and completely missed the "nested" and "path" parts of the solution -- must have thought it should be able to figure out those two parts by itself... Thanks.
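For anyone comparing against their own output, the sample query above should serialize to a body shaped roughly like the one below. This is a hand-written sketch of the Elasticsearch query JSON, not output captured from the library; `nested_term_query` and the sample values are made up for illustration.

```python
import json


def nested_term_query(source_name, feature_name):
    """Build the raw request body for a bool query with a nested term clause."""
    return {
        "bool": {
            "must": [
                {"term": {"source": source_name.lower()}},
                {
                    # Both "nested" and "path" must be present in the body.
                    "nested": {
                        "path": "features",
                        "query": {"term": {"features.name": feature_name.lower()}},
                    }
                },
            ]
        }
    }


body = nested_term_query("Acme", "Waterproof")
print(json.dumps(body, indent=2))
```

If search.to_dict() lacks the "nested" wrapper or the "path" key, the inner term query is running against the root document instead of the nested objects.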
Thanks for the example @adriaant, just FYI it can also be written as:
q = Q("term", source=source_name.lower())
q &= Q("nested", path="features", query=Q("term", features__name=feature_name.lower()))
Search().query(q)
hope this helps
@HonzaKral Thanks for the hint, and big thanks for the package: it's tremendously useful and very well designed and executed!
@HonzaKral Yes, indeed, that was the whole point of this issue in the first place. I just haven't gotten to update that piece of code ;)
I am having the same issue as @harshit298, i.e. I have __ in my field names and Elasticsearch DSL converts them to dots.
I tried using your workaround, it works fine if you apply one filter, but is overridden afterwards:
# Apply first filter using the workaround
f = F('range')
f.first_field__with_double_underscores = {'low': low, 'high': high}
search = search.filter('and', [f])
# Second filter, same workaround
f = F('range')
f.second_field__with_double_underscores = {'low': low2, 'high': high2}
search = search.filter('and', [f])
After the first filter is applied, search.to_dict() shows a filter on first_field__with_double_underscores, as we wanted.
After the second one, search.to_dict() shows filters applied on second_field__with_double_underscores, as expected, and first_field.with_double_underscores, the __ was replaced with a .
In this simple example I could put both F objects in one filter, but let's assume I can't in my real-life application.
Do you see another workaround that would work with chained filters? The current solution I'm looking at is using the code from https://github.com/harshit298/elasticsearch-dsl-py/pull/1
+1 Would it be possible to have a parameter on the Search class to disable this behavior?
Thanks for the feedback, in abd0075bba01f58dd3294528c264d0beee5e136d I added an option to override this behavior by sending _expand__to_dot=False: Q('match', field__with__underscores='value', _expand__to_dot=False)
Hope this helps
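To show the intent of the flag, here is a minimal sketch of an opt-out parameter in plain Python. `build_query` is a hypothetical stand-in that returns a raw dict; it is not the code from the commit, and only the `_expand__to_dot` name is taken from the comment above.

```python
def build_query(name, _expand__to_dot=True, **params):
    # Hypothetical stand-in for Q(name, ...), not the library's code.
    if _expand__to_dot:
        # Default behaviour: treat '__' as a path separator.
        params = {key.replace("__", "."): value for key, value in params.items()}
    return {name: params}


expanded = build_query("match", field__with__underscores="value")
literal = build_query("match", field__with__underscores="value", _expand__to_dot=False)
print(expanded)
print(literal)
```

With the flag set to False the field name is passed through untouched, which is what mappings containing a literal __ need.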
That's great, thanks!
I'm having an issue with the faceted search and fields with __. I can't figure out how to pass the _expand__to_dot when trying to filter by a facet. My code:
class ProductSearch(FacetedSearch):
    index = 'ehc__wagtailcore_page'
    fields = ('title', 'body')
    doc_types = []
    facets = {
        'product_type': TermsFacet(field='products_product__product_type_filter'),
        'audience': TermsFacet(field='products_product__audience_array_filter'),
    }
ps = ProductSearch('cancer', {'product_type': ['Consumer Summary']})
response = ps.execute()
The facets return fine and mark the selected facets — but there are no results.
@HonzaKral It seems the F shortcut has vanished completely. What's the solution?
I'm having the same problem as @atadams . The way _expand__to_dot is implemented, it's not possible to set with Faceted search. Can this issue be reopened?
Would it be possible to revisit this decision? It sounds like the __ to . conversion was implemented because that's how Django works. But as you can see from this issue, this behavior is surprising and breaks many people's workflows (including mine). I don't know what % of Elasticsearch users are familiar with Django syntax, but it might be small. Can we just make users pass in a.b instead of a__b?
@melissachang you can always pass in a.b but it is inconvenient when passing it as a kwarg.
As for FacetedSearch, we could completely fix that and have it not rely on the Q shortcut. Using __ in the FacetedSearch config wouldn't work anyway, so the change wouldn't even break backwards compatibility. I will create a separate issue for this.
Just curious, what is difficult about passing a.b as a kwarg?
It is not difficult, but it is not convenient.
Typing query('match', a__b='word') is imo much easier and readable than query('match', **{'a.b': 'word'}).
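To make the inconvenience concrete: a dotted name is not a valid Python keyword argument, so dict unpacking is the only way to pass it by keyword. A small check, using a hypothetical `query` stand-in rather than the real Search.query:

```python
def query(name, **params):
    # Hypothetical stand-in for Search.query; it only echoes its arguments.
    return {name: params}


# A dotted keyword is rejected by the parser before any code runs.
try:
    compile("query('match', a.b='word')", "<example>", "eval")
    dotted_kwarg_parses = True
except SyntaxError:
    dotted_kwarg_parses = False

# Unpacking a dict works because **kwargs keys only need to be strings.
unpacked = query("match", **{"a.b": "word"})
print(dotted_kwarg_parses, unpacked)
```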
Thanks. I know I'm several years too late :), but IMHO, this feature is not worth it.
If it were Elasticsearch syntax it would be better, but it's strange to use Django syntax. (And just in one client library -- I assume the other client libraries don't have this.)
If someone happens to have "__" in a field name -- which can happen for a myriad of legitimate reasons -- their search will break and they have to do digging and find this issue.
And the feature complicates the code. It's strange that FacetedSearch has to construct a query a specific way. Some future maintainer of FacetedSearch (could be different from current maintainers) will have to remember to not use Q shortcut.
Thank you for the feedback!
In my experience there are many more people that have . in their fields and never any __ which makes this worth it for many, especially if there is a way around it if not wanted/needed. I know I use it every day and it makes my life much easier.
As for complications in the code - any maintainer of any code will have to be aware of the API and its uses; that is never a reason to omit an option from the API that makes people's lives easier.
Overall I don't see the corner case of people having __ in their code enough to break backwards compatibility and remove this useful shortcut.
@HonzaKral Hi Honza, would you please help me with an ObjectField search?
I have defined a DocType as below, but I don't know how to search it with query_string:
rawdata = ItemCalendarJanuaryDocument.search().query("query_string", query=query, fields=['location.countryName'])
DocType definition:
itemsearch = Index('itemcalendarjanuaryindex')
# See Elasticsearch Indices API reference for available settings
itemsearch.settings(
    number_of_shards=1,
    number_of_replicas=0
)
@itemsearch.doc_type
class ItemCalendarJanuaryDocument(DocType):
    location = fields.ObjectField(properties={
        'countryName': fields.TextField(),
        'countryCode': fields.TextField(),
    })
@leon1s the query string query that you have there should work just fine, in what way doesn't it work for you?