Hi. I have this multi-level aggregation that I would like to convert. The problem is, I don't know how to code the ranges part in Python.
CURL code
POST /_search
{
  "size": 0,
  "aggs": {
    "by_property": {
      "terms": {
        "field": "propertyId",
        "size": 0
      },
      "aggs": {
        "twitter_count": {
          "range": {
            "field": "twitterAccount.followers",
            "ranges": [
              { "to": 5001 },
              { "from": 5001, "to": 10001 },
              { "from": 10001, "to": 50001 },
              { "from": 50001 }
            ]
          },
          "aggs": {
            "email_addy": {
              "terms": {
                "field": "emails.value",
                "size": 0
              }
            }
          }
        }
      }
    }
  }
}
Python DSL code
s.aggs.bucket('by_property', 'terms', field='propertyId', size=0) \
    .bucket('twitter_count', 'range', field='twitterAccount.followers')
How do I continue the aggregation and specify which ranges the range aggregation uses?
.ranges( )? body=range{ } ?
Hi,
The mechanism is always the same: whatever you would put inside the JSON object, just pass in as kwargs. In this case:
s.aggs.bucket('by_property', 'terms', field='propertyId', size=0)\
    .bucket('twitter_count', 'range',
        field='twitterAccount.followers',
        ranges=[
            {'to': 5001},
            {'from': 5001, 'to': 10001},
            {'from': 10001, 'to': 50001},
            {'from': 50001}
        ]
    )
Note that you can always use Search.from_dict to pass in the JSON you would send with curl, and then inspect the resulting object and its repr:
s = Search.from_dict({...})
print(repr(s.aggs['by_property']['twitter_count']))
Hope this helps
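To make the "kwargs become the JSON body" idea concrete, here is a minimal pure-Python sketch. The `agg_body` helper is hypothetical and is not part of elasticsearch-dsl; it only illustrates how the keyword arguments line up with the keys of the JSON object from the original curl request.

```python
import json


def agg_body(agg_type, **params):
    """Hypothetical helper (NOT the library's implementation): wrap the
    keyword arguments in the {agg_type: {...}} shape of the JSON request."""
    return {agg_type: params}


# Same field and ranges as the "twitter_count" aggregation in the question.
twitter_count = agg_body(
    "range",
    field="twitterAccount.followers",
    ranges=[
        {"to": 5001},
        {"from": 5001, "to": 10001},
        {"from": 10001, "to": 50001},
        {"from": 50001},
    ],
)

# Prints the same structure as the "range" object in the curl body.
print(json.dumps(twitter_count, indent=2))
```

Each keyword (`field`, `ranges`) ends up as a key of the same name, which is why anything valid in the JSON can be passed straight through as a keyword argument.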
Thanks so much, Honza. I'm new to both Python and Elasticsearch, which compounds my syntax issues. I will also try Search.from_dict. That will come in handy for some other queries down the line. This definitely helped. Thanks again.
Happy to help
I attempted using from_dict; however, the query now ignores the index I specified.
client = connections.create_connection(hosts=['http://some_location:9200'])
s = Search(using=client, index="g", doc_type="prop")
body = {
    "query": {
        "match_all": {}
    },
    "aggs": {
        "by_property": {
            "terms": {
                "field": "propertyId",
                "size": 3
            },
            "aggs": {
                "twitter_count": {
                    "range": {
                        "field": "twitterAccount.followers",
                        "ranges": [
                            {"from": 1000, "to": 5000},
                            {"from": 5000, "to": 10000},
                            {"from": 10000, "to": 50000},
                            {"from": 50000}
                        ]
                    },
                    "aggs": {
                        "email_addy": {
                            "terms": {
                                "field": "emails.value",
                                "size": 3
                            }
                        }
                    }
                }
            }
        }
    }
}
s = Search.from_dict(body)
#s.index("g")
#s.doc_type("prop")
body = s.to_dict()
r = s.execute()
print(repr(r))
print("")
for i in vars(r):
    print(i)
print("")
for i in r:
    print(vars(i))
print("")
My results should only include stuff from the g index, but it's pulling across multiple indices, with results like:
{'_meta': {u'index': u'logevents1409', u'score': 1.0, u'id': u'ab3...}, '_d_': {u'status': u'success', u'count': 1, u'project': u'proj1', u'info': {u'args': u'', u'command': u'x:cronjob:hourly', u'queuename': u'cronjob'}, u'date': u'2014-09-01T23:59:59+00:00', u'type': u'jobQueue', u'id': u'rAnDoMlEtErS', u'propertyId': None}}
Do the index and doc_type settings get erased or reset when I call s.to_dict()? How and where in the script would I set them back to what I want?
Ah, the index parameter is not passed in the body but in the URL, so it won't be in the to_dict output. Search.from_dict(body) also builds a brand-new Search object, so the index and doc_type you set on the earlier one are not carried over.
Also, you have to write s = s.index('g'), because .index (like all other methods on Search) returns a copy of the Search object; it doesn't mutate it in place.
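The copy-returning behaviour is easy to trip over, so here is a small sketch of the pattern in plain Python. `MiniSearch` is a hypothetical stand-in written for this illustration, not the real Search class; it only mimics the "return a modified copy" convention described above.

```python
import copy


class MiniSearch:
    """Hypothetical stand-in for a Search-like object (illustration only)."""

    def __init__(self, index=None):
        self._index = index

    def index(self, name):
        # Return a modified copy and leave the original object untouched.
        clone = copy.copy(self)
        clone._index = name
        return clone


s = MiniSearch()

s.index("g")            # return value discarded, so s is unchanged
unchanged = s._index    # still None

s = s.index("g")        # rebind the name to the returned copy
changed = s._index      # now "g"

print(unchanged, changed)
```

Applied to the script in the question, this suggests calling `s = Search.from_dict(body)` first and then `s = s.index("g")`, keeping the assignment each time.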
Thanks for the help, it saved a lot of time. Could you also tell me how to specify a keyed aggregation in the DSL, as shown in the Elastic docs?
{
  "aggs": {
    "price_ranges": {
      "range": {
        "field": "price",
        "keyed": true,
        "ranges": [
          { "to": 100 },
          { "from": 100, "to": 200 },
          { "from": 200 }
        ]
      }
    }
  }
}
@kslsantosh
You need to write something like this:
```python
search = Search(using=client, index=index, doc_type=doc_type)
search.aggs.bucket("bucketName", "range", field='fieldName', keyed=True, ranges=[
    {'to': 1},
    {'from': 2, 'to': 3},
    {'from': 4}
])

# Run the search and print the aggregation results from the response
response = search.execute()
print(response.aggregations.to_dict())
# {'bucketName': {'buckets': {'-1.0': {'to': 1.0, 'doc_count': 0}, '2.0-3.0': {'from': 2.0, 'to': 3.0, 'doc_count': 0}, '4.0-': {'from': 4.0, 'doc_count': 777}}}}
```
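With `keyed=True` the `buckets` value is a dict keyed by range label rather than a list, so it is read slightly differently. A minimal sketch, using the result printed above as a plain dict (no Elasticsearch connection involved):

```python
# Aggregation result copied from the reply above, as a plain dict.
aggregations = {
    "bucketName": {
        "buckets": {
            "-1.0": {"to": 1.0, "doc_count": 0},
            "2.0-3.0": {"from": 2.0, "to": 3.0, "doc_count": 0},
            "4.0-": {"from": 4.0, "doc_count": 777},
        }
    }
}

# Keyed buckets: iterate over (label, bucket) pairs instead of a list.
counts = {
    label: bucket["doc_count"]
    for label, bucket in aggregations["bucketName"]["buckets"].items()
}

print(counts)
```

Without `keyed`, the same data would come back as a list of bucket objects, and you would loop over the list directly instead of calling `.items()`.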