Elasticsearch-dsl-py: How to create a geo_point

Created on 26 Apr 2016 · 17 Comments · Source: elastic/elasticsearch-dsl-py

I am testing an index with a geo_point field. The mapping looks like this:
"city": {
    "properties": {
        "city": {"type": "string"},
        "state": {"type": "string"},
        "location": {"type": "geo_point"}
    }
}

I cannot find the type to import from elasticsearch_dsl. Is "geo_point" supported? If it is, how can I declare a class with an attribute of type "geo_point" and create a "geo_point" object?
Thanks.

All 17 comments

elasticsearch-dsl-py handles this beautifully. You can define your DocType like this:

from elasticsearch_dsl import DocType
from elasticsearch_dsl import GeoShape
from elasticsearch_dsl import Index
from elasticsearch_dsl import MetaField
from elasticsearch_dsl import String

_INDEX_ALIAS = 'city'
_index = Index(_INDEX_ALIAS)

@_index.doc_type
class City(DocType):
    city = String()
    state = String()
    location = GeoShape()

    class Meta:
        doc_type = 'city'
        dynamic = MetaField('strict')

There's some good documentation here. Hope that helps!

Thanks for the quick reply.
I see that GeoShape is a dynamically created Field type, along with GeoPoint. When I tried to index a city with the following code:
c = City()
c.city = 'New York'
c.state = 'NY'
c.location = GeoPoint()
c.location.lat = 40
c.location.lon = -74

Elasticsearch returns an error: field must be either [lat], [lon] or [geohash]
It looks like location is serialized into the following instead, with an extra "type" attribute:
"location": {"lat": "40", "type": "geo_point", "lon": "-74"}

Apologies. In my haste, I made an error. My code should've read:

from elasticsearch_dsl import DocType
from elasticsearch_dsl import GeoPoint
from elasticsearch_dsl import Index
from elasticsearch_dsl import MetaField
from elasticsearch_dsl import String

_INDEX_ALIAS = 'city'
_index = Index(_INDEX_ALIAS)

@_index.doc_type
class City(DocType):
    city = String()
    state = String()
    location = GeoPoint(lat_lon=True)

    class Meta:
        doc_type = 'city'
        dynamic = MetaField('strict')

Try that out instead?

I added the lat_lon=True parameter as you have suggested, but it made no difference.

The class definition is fine, and creating an object c of class City is also OK. The issue here is how to create the GeoPoint object and assign it to c.location such that when I call c.save() the object is serialized into:
"location": {"lat": "40", "lon": "-74"}
and not
"location": {"lat": "40", "type": "geo_point", "lon": "-74"}

@Al77056 I've never worked with GeoPoint, so sorry if my suggestions are off...

I notice that you define your field as GeoPoint in your class. I am not sure why you then assign a GeoPoint() object to it. Couldn't you assign your dict to it directly?

I.e. {"lat": "40", "lon": "-74"}

Or maybe even assigning it directly with a list: [40, -74], as it is defined as a GeoPoint object field, it might be interpreted directly as lat and lon?

Again, just some thoughts from someone who has never used GeoPoint ;)

To @njoannin's point, I believe that you can even assign location with the string value '40,-74'.

Assigning a dictionary definitely works. I only started using Python recently, coming from a statically typed language, which led me to think that I needed to assign a GeoPoint to location. Thank you all for the help.
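
For reference, the representations discussed in this thread can be put side by side. This is only a sketch: the helper name geo_point_forms is made up for illustration and is not part of elasticsearch_dsl. One caveat worth stating: in Elasticsearch the array form of a geo_point is ordered [lon, lat], while the string form is "lat,lon", so a list such as [40, -74] would be read as longitude 40, latitude -74.

```python
def geo_point_forms(lat, lon):
    """Return one point in three shapes Elasticsearch accepts for a geo_point field.

    Hypothetical helper for illustration only.
    """
    return {
        # Object form: explicit keys, so the order cannot be confused.
        "object": {"lat": lat, "lon": lon},
        # String form: latitude first, then longitude.
        "string": "{},{}".format(lat, lon),
        # Array form: longitude first, then latitude (GeoJSON order).
        "array": [lon, lat],
    }


new_york = geo_point_forms(40, -74)
# Any one of these values could be assigned to c.location before c.save().
```

The object form is probably the safest choice in application code, since it does not depend on remembering which coordinate comes first.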

Arrays and strings, as @brainix mentioned, work as well. However, how do I query for that?

Ah, never mind, it's:

MyDocType.search().filter(
    'geo_distance', distance='1000m', location={"lat": "40", "lon": "-74"}
)

@shredding Excuse me, the docs for "geo_distance" are lacking. Could you give me some advice on how to use it, or point me to a reference? I want to sort my results by distance.

Maybe an example helps.

Here's an excerpt from my dating app. Users have a Profile with a method find_near_by, which searches for all other profiles within a given distance (the ~Q part filters out the current profile because no one wants to date themselves).

from elasticsearch_dsl import String, Date, Integer, Boolean, GeoPoint, Q

class Profile(DynamicDocType):

    name = String(fields={'raw': String(index='not_analyzed')})
    location = GeoPoint()

    def find_near_by(self, distance_in_meters=5000):
        """
        :param distance_in_meters: int
        :return: elasticsearch_dsl.Search
        """
        return self.search().filter(
            'geo_distance', distance='{}m'.format(distance_in_meters), location=self.location, distance_type='plane'
        ).filter(~Q('term', name=self.name))

And the corresponding test to see it in action (the method is part of a Django unit test). create_profile just commits a default Profile to Elasticsearch, and flush_index_that_contains flushes pending writes to the Elasticsearch index so that they are visible to the next query:

    def test_finds_profiles_near_by(self):
        home_of_digital_avantgarde = dict(lat=52.51581, lon=13.45149)
        berghain = dict(lat=52.51106, lon=13.44143)
        birthplace_of_author = dict(lat=51.40181, lon=7.19116)

        center = self.create_profile(name='center', location=home_of_digital_avantgarde)
        close_by = self.create_profile(name='close_by', location=berghain)
        far_away = self.create_profile(name='far_away', location=birthplace_of_author)

        center.save()
        close_by.save()
        far_away.save()
        self.flush_index_that_contains(Profile)

        self.assertEqual(1, center.find_near_by().count())
        self.assertEqual('close_by', center.find_near_by().execute()[0].name)

@shredding Wow, that's a wonderful answer! Thank you very much. The elasticsearch-dsl-py docs are lacking in detail, and your answer gives me many ideas beyond my question. However, it seems that I cannot get the exact distance between the reference point and the others. I also need to sort my results by those distances.

I haven't touched or tested this in a while, but I think that the results are sorted by distance, because closer distances get a higher score.

I haven't done distance calculation within ES in elasticsearch-dsl, but I know that haystack has implemented that, so you might wanna look at how they did it in their elasticsearch backend, which should be very similar to elasticsearch_dsl (both use the same elasticsearch-py under the hood).

Here's a starting point: http://django-haystack.readthedocs.io/en/v2.4.1/spatial.html

If you have a queryset in haystack and do something like:

from haystack.utils.geo import Point

reference = Point(float(longitude), float(latitude))
queryset = queryset.distance('location', reference)

... then your result set will have the distance within the result from ES.

@shredding Thanks. : ) I looked through the haystack source, but I could not find how it calculates the distance for the SearchQuerySet. I decided to use the Haversine formula to compute the distance myself.
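
For anyone taking the same route, here is a minimal sketch of the Haversine formula in plain Python. The function name haversine_km and the mean Earth radius of 6371 km are assumptions made for this example; the result is a great-circle approximation, not necessarily identical to what Elasticsearch computes.

```python
import math

EARTH_RADIUS_KM = 6371.0  # assumed mean Earth radius; an approximation


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometres between two points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    d_phi = phi2 - phi1
    d_lambda = math.radians(lon2 - lon1)
    # Haversine of the central angle between the two points.
    a = (math.sin(d_phi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(d_lambda / 2) ** 2)
    # Clamp to 1.0 to guard against floating-point overshoot near antipodes.
    return 2 * EARTH_RADIUS_KM * math.asin(min(1.0, math.sqrt(a)))


# The two Berlin locations from the test earlier in this thread.
home = dict(lat=52.51581, lon=13.45149)
berghain = dict(lat=52.51106, lon=13.44143)
distance = haversine_km(home["lat"], home["lon"], berghain["lat"], berghain["lon"])
```

Computing the distance client-side like this works for annotating results, but it cannot be used to sort on the Elasticsearch side.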

I have the same problem, but I assign a dict to the field on the DocType:

class MyIndex(DocType):
    title = String()
    location = GeoPoint(lat_lon=True)

    class Meta:
        index = 'tender'
        dynamic = MetaField('strict')

i = MyIndex()
i.title = "this is a test"
i.location = dict(lat=42.222, lon=3.333)
i.save()

but my mapping says:

title: {
   type: "string"
},
location: {
  properties: {
    lat: {
      type: "double"
    },
    lon: {
      type: "double"
    }
  }
}

Any ideas? I have the same problem with GeoShape.

@levylll I'm not sure if you still need the actual distance, but I had to implement it today, so here's the solution:

This will limit your dataset (in my case, Vendor documents that have a GeoPoint named headquarter_location) to all vendors whose headquarter location is at most 50 km away from the given lat/lon:

queryset = Vendor.search().filter(
    'geo_distance', 
    distance='50km', 
    headquarter_location={'lat': 52.5720661, 'lon': 13.4126604}
)

This will sort all vendors according to the distance of a headquarter location from the given lat/long:

queryset = queryset.sort(
    {"_geo_distance": {
        'headquarter_location': {'lat': 52.5720661, 'lon': 13.4126604},
        "order": "asc",
        "unit": "km"
    }}
)

The filter and the sort can be combined.

The distance is then available in the result:

for vendor in queryset:
    distance_in_km = vendor.meta.sort[0]
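
The combined filter and sort corresponds roughly to the raw request body below. This is a sketch, assuming the body layout of the Elasticsearch geo_distance query and _geo_distance sort; build_geo_search_body is a hypothetical helper, not part of elasticsearch_dsl.

```python
def build_geo_search_body(field, lat, lon, distance, unit="km"):
    """Build a request body that filters by distance and sorts nearest first.

    Hypothetical helper for illustration only.
    """
    origin = {"lat": lat, "lon": lon}
    return {
        "query": {
            "bool": {
                # Filter context: keep only documents within `distance` of origin.
                "filter": {
                    "geo_distance": {"distance": distance, field: origin}
                }
            }
        },
        "sort": [
            # Sort by distance from origin; the sort value is reported in `unit`.
            {"_geo_distance": {field: origin, "order": "asc", "unit": unit}}
        ],
    }


body = build_geo_search_body("headquarter_location", 52.5720661, 13.4126604, "50km")
```

With this sort in place, each hit's first sort value holds the computed distance in the requested unit, which is what vendor.meta.sort[0] reads in the loop above.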

@shredding Thank you very much. I used the approach you described. It's great!
