I am testing an index with a geo_point field. The mapping looks like this:
"city": {
    "properties": {
        "city": {"type": "string"},
        "state": {"type": "string"},
        "location": {"type": "geo_point"}
    }
}
I cannot find the type to import from elasticsearch_dsl. Is "geo_point" supported? If it is, how can I declare a class with an attribute of type "geo_point" and create a "geo_point" object?
Thanks.
elasticsearch-dsl-py handles this beautifully. You can define your DocType like this:
from elasticsearch_dsl import DocType
from elasticsearch_dsl import GeoShape
from elasticsearch_dsl import Index
from elasticsearch_dsl import MetaField
from elasticsearch_dsl import String

_INDEX_ALIAS = 'city'
_index = Index(_INDEX_ALIAS)

@_index.doc_type
class City(DocType):
    city = String()
    state = String()
    location = GeoShape()

    class Meta:
        doc_type = 'city'
        dynamic = MetaField('strict')
There's some good documentation here. Hope that helps!
Thanks for the quick reply.
I see that GeoShape is a dynamically created Field type, along with GeoPoint. When I tried to index a city with the following code
c = City()
c.city = 'New York'
c.state = 'NY'
c.location = GeoPoint()
c.location.lat = 40
c.location.lon = -74
Elasticsearch returns an error: field must be either [lat], [lon] or [geohash]
It looks like location is serialized into the following instead, with an extra "type" attribute:
"location": {"lat": "40", "type": "geo_point", "lon": "-74"}
Apologies. In my haste, I made an error. My code should've read:
from elasticsearch_dsl import DocType
from elasticsearch_dsl import GeoPoint
from elasticsearch_dsl import Index
from elasticsearch_dsl import MetaField
from elasticsearch_dsl import String

_INDEX_ALIAS = 'city'
_index = Index(_INDEX_ALIAS)

@_index.doc_type
class City(DocType):
    city = String()
    state = String()
    location = GeoPoint(lat_lon=True)

    class Meta:
        doc_type = 'city'
        dynamic = MetaField('strict')
Try that out instead?
I added the lat_lon=True parameter as you have suggested, but it made no difference.
The class definition is fine, and creating an object c of class City is also OK. The issue here is how to create the GeoPoint object and assign it to c.location such that when I call c.save() the object is serialized into:
"location": {"lat": "40", "lon": "-74"}
and not
"location": {"lat": "40", "type": "geo_point", "lon": "-74"}
@Al77056 I've never worked with GeoPoint, so sorry if my suggestions are off...
I notice that you define your field as GeoPoint in your class. I am not sure why you then assign a GeoPoint() object to it. Couldn't you assign your dict directly?
I.e. {"lat": "40", "lon": "-74"}
Or maybe even assigning it directly with a list: [40, -74], as it is defined as a GeoPoint object field, it might be interpreted directly as lat and lon?
Again, just some thoughts from someone who has never used GeoPoint ;)
To @njoannin's point, I believe that you can even assign location with the string value '40,-74'.
Assigning a dictionary definitely works. I only started using Python recently, coming from a statically typed language, which led me to think that I needed to assign a GeoPoint to location. Thank you all for the help.
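For reference, Elasticsearch accepts a geo_point value in several shapes, and the coordinate order is not the same in all of them: the object and string forms put latitude first, while the array form is [lon, lat]. Below is a minimal sketch of the three shapes in plain Python. The helper names are made up for illustration and are not part of elasticsearch_dsl.

```python
# Hypothetical helpers (not part of elasticsearch_dsl) that build the
# geo_point shapes discussed in this thread.

def as_object(lat, lon):
    """Object form: {"lat": ..., "lon": ...}."""
    return {"lat": lat, "lon": lon}


def as_string(lat, lon):
    """String form: "lat,lon" (latitude first)."""
    return "{},{}".format(lat, lon)


def as_array(lat, lon):
    """Array form: [lon, lat] (longitude first, as in GeoJSON)."""
    return [lon, lat]


# The New York example from this thread.
lat, lon = 40, -74
print(as_object(lat, lon))
print(as_string(lat, lon))
print(as_array(lat, lon))
```

So for the New York example, the list form would need to be [-74, 40] rather than [40, -74], whereas the string form stays '40,-74'.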
Arrays and strings, as @brainix mentioned, work as well. However, how do I query for that?
Ah, never mind, it's
MyDocType.search().filter(
    'geo_distance', distance='1000m', location={"lat": "40", "lon": "-74"}
)
@shredding Excuse me, the docs are lacking for the "geo_distance" filter. Could you give me some advice or point me to how to use it? I want to sort my results by distance.
Maybe an example helps.
Here's an excerpt from my dating app. Users have a Profile with a method find_near_by, which searches for all other profiles within a given distance (the ~Q part filters out the current profile, because no one wants to date themselves).
from elasticsearch_dsl import String, Date, Integer, Boolean, GeoPoint, Q

class Profile(DynamicDocType):
    name = String(fields={'raw': String(index='not_analyzed')})
    location = GeoPoint()

    def find_near_by(self, distance_in_meters=5000):
        """
        :param distance_in_meters: int
        :return: elasticsearch_dsl.Search
        """
        return self.search().filter(
            'geo_distance',
            distance='{}m'.format(distance_in_meters),
            location=self.location,
            distance_type='plane'
        ).filter(~Q('term', name=self.name))
And the corresponding test to see it in action (the method is part of a Django unit test). create_profile just commits a default Profile to Elasticsearch, and flush_index_that_contains commits pending changes to the Elasticsearch index so that they are visible to the next query:
def test_finds_profiles_near_by(self):
    home_of_digital_avantgarde = dict(lat=52.51581, lon=13.45149)
    berghain = dict(lat=52.51106, lon=13.44143)
    birthplace_of_author = dict(lat=51.40181, lon=7.19116)

    center = self.create_profile(name='center', location=home_of_digital_avantgarde)
    close_by = self.create_profile(name='close_by', location=berghain)
    far_away = self.create_profile(name='far_away', location=birthplace_of_author)
    center.save()
    close_by.save()
    far_away.save()
    self.flush_index_that_contains(Profile)

    self.assertEqual(1, center.find_near_by().count())
    self.assertEqual('close_by', center.find_near_by().execute()[0].name)
@shredding Wow, that's a wonderful answer! Thank you very much. The docs for elasticsearch-dsl-py lack detail, and your answer gave me many ideas beyond my question. However, it seems that I cannot get the exact distance between the point and the others. I also need to sort my results by the returned distances.
I haven't tested or touched it in a while, but I think the results are sorted by distance, because closer points get a higher score.
I haven't done distance calculation within ES in elasticsearch-dsl, but I know that haystack has implemented it, so you might want to look at how they did it in their Elasticsearch backend, which should be very similar to elasticsearch_dsl (both use the same elasticsearch-py under the hood).
Here's a starting point: http://django-haystack.readthedocs.io/en/v2.4.1/spatial.html
If you have a queryset in haystack and do something like:
from haystack.utils.geo import Point
reference = Point(float(longitude), float(latitude))
queryset = queryset.distance('location', reference)
... then your result set will have the distance within the result from ES.
@shredding Thanks. :) I looked through the haystack source, but I could not find how it calculates the distance for the SearchQuerySet. I decided to use the haversine formula to compute the distance myself.
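For anyone taking the same route, here is a minimal sketch of the haversine formula in plain Python. It assumes a spherical Earth with a mean radius of 6371 km, so the result is an approximation; the function name is chosen for illustration only.

```python
import math

# Mean Earth radius in kilometres; a spherical-Earth approximation.
EARTH_RADIUS_KM = 6371.0


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in km between two points given in degrees."""
    phi1 = math.radians(lat1)
    phi2 = math.radians(lat2)
    dphi = math.radians(lat2 - lat1)
    dlambda = math.radians(lon2 - lon1)

    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlambda / 2) ** 2)
    # Clamp to 1.0 to guard against floating-point rounding before asin.
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(min(1.0, a)))


# The two Berlin points from the test earlier in this thread.
print(haversine_km(52.51581, 13.45149, 52.51106, 13.44143))
```

For the two Berlin points from the test above, this comes out under one kilometre, which is consistent with close_by falling inside the 5000 m default radius.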
I have the same problem, but I assign a dict to the DocType field:
class MyIndex(DocType):
    title = String()
    location = GeoPoint(lat_lon=True)

    class Meta:
        index = 'tender'
        dynamic = MetaField('strict')

i = MyIndex()
i.title = "this is a test"
i.location = dict(lat=42.222, lon=3.333)
i.save()
but my mapping says:
title: {
    type: "string"
},
location: {
    properties: {
        lat: {
            type: "double"
        },
        lon: {
            type: "double"
        }
    }
}
Any ideas? I have the same problem with geoshape.
@levylll I'm not sure if you still need the actual distance, but I had to implement it today, so here's the solution:
This limits the dataset (in my case vendors, which have a GeoPoint field named headquarter_location) to all vendors whose headquarters is at most 50 km away from the given lat/lon:
queryset = Vendor.search().filter(
    'geo_distance',
    distance='50km',
    headquarter_location={'lat': 52.5720661, 'lon': 13.4126604}
)
This will sort all vendors according to the distance of a headquarter location from the given lat/long:
queryset = queryset.sort(
    {"_geo_distance": {
        'headquarter_location': {'lat': 52.5720661, 'lon': 13.4126604},
        "order": "asc",
        "unit": "km"
    }}
)
The filter and the sort can be combined.
The distance is then available in the result:
for vendor in queryset:
    distance_in_km = vendor.meta.sort[0]
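To make the combination concrete, here is a rough sketch of the request body that a geo_distance filter plus a _geo_distance sort corresponds to, built as a plain dict. The helper name geo_distance_body is hypothetical, and the exact body that elasticsearch_dsl emits may differ between versions; calling queryset.to_dict() shows the real one.

```python
# Hypothetical helper: builds an Elasticsearch request body that filters by
# distance and sorts by distance on the same geo_point field.

def geo_distance_body(field, lat, lon, distance, unit="km"):
    point = {"lat": lat, "lon": lon}
    return {
        "query": {
            "bool": {
                # Filter context: limits the hits, does not affect scoring.
                "filter": [
                    {"geo_distance": {"distance": distance, field: point}}
                ]
            }
        },
        # Sorting by _geo_distance makes each hit carry its distance
        # (in the given unit) as the first sort value.
        "sort": [
            {"_geo_distance": {field: point, "order": "asc", "unit": unit}}
        ],
    }


body = geo_distance_body("headquarter_location", 52.5720661, 13.4126604, "50km")
print(body)
```

The filter alone only restricts which documents match; it is the sort clause that puts the computed distance into each hit's sort values, which is what vendor.meta.sort[0] reads.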
@shredding Thank you very much. I used the approach you described, and it works great!