Presto: add support for GeoJSON in geospatial functions

Created on 13 Jun 2019  路  13Comments  路  Source: prestodb/presto

Currently, only WKT are supported as serializable/deserializable when converting to Geometry objects. However, the API used by Presto to create Geometries _also_ allows for deserialization from GeoJSON, as well. These should be supported in addition to WKT and binary.

Proposing the addition of:

  • ST_GeometryFromGeoJSON

    • Accepts JSON, returns GEOMETRY_TYPE_NAME

  • ST_AsGeoJSON (deserialization)

    • Accepts GEOMETRY_TYPE_NAME, returns JSON

These are complimentary to the FromText/AsText functions, and handle similarly in errors, NULL, etc.

CC @mbasmanova

All 13 comments

@ampampamp Alex, would you describe the use case that would benefit from having these functions available?

ST_* functions come from SQL/MM Part 3 specification which doesn't seem to have support for GeoJSON. Only WKT, WKB and GML are supported. Hence, let's rename these new functions. Perhaps, to_geo_json and from_geo_json.

4.2.1.2 Functions on ST_Geometry
1) ST_GeomFromText: returns an ST_Geometry value, which is transformed from a CHARACTER
LARGE OBJECT value that represents the well-known text representation of an ST_Geometry.
2) ST_GeomFromWKB: returns an ST_Geometry value, which is transformed from a BINARY LARGE
OBJECT value that represents the well-known binary representation of an ST_Geometry.
3) ST_GeomFromGML: returns an ST_Geometry value, which is transformed from a CHARACTER

CC: @jagill

I've seen that GeoJSON is a similarly-supported format to WKT, and have heard from some users that their current means of working with GeoJSON involve complicated workflows. Presto seems to be a good home for harboring _any_ geospatial functionality, and adding GeoJSON support seems complimentary and natural.

I read through the specification a bit, and I agree that the functions should be renamed. I am fine with to_geo_json and from_geo_json.

@mbasmanova @jagill

CC: @zhenxiao

Something that might be worth consider is that, while it's not part of the SQL3 spec, PostGIS named the functions ST_AsGeoJSON and ST_GeomFromGeoJSON. If we want to make the SQL easy to port to/from PostGIS (which I think has merit), we may want to name it those. Also, the PostGIS implementation is ST_GeomFromGeoJSON only takes the geometry fragment of the GeoJSON object, ie:

ST_GeomFromGeoJSON('{"type":"Point","coordinates":[-48.23456,20.12345]}')

correspondingly, ST_AsGeoJSON creates the geometry fragment:

ST_AsGeoJSON(geom)
---
'{"type":"Point","coordinates":[23.569251,51.541599]}'

(Although I think we should err on the side of permissive deserialization, so that someone could give either a geometry fragment or a whole geojson, where we just search out the geometry fragment.)

@jagill I agree with the latter, supporting deserialization of supported GeoJSON types like Features, FeatureCollections, etc, for the reason that the library used by Presto _supports_ this, and it shouldn't be contractually expected to supply individual geometries from more complex GeoJSON objects like GeometryCollection.

Regarding the naming, I'm fine with either, although I'd like to hear @mbasmanova on her thoughts here.

I don't think the fact that ESRI or JTS library supports a feature is a sufficient reason for exposing that feature to Presto users. Let's find out what the use cases are and figure out how to support them. Are users ingesting data from a source that speaks GeoJSON? If so, do they need to get access to the geometry object only or metadata as well? Would it make sense to support casting JSON to/from Geometry? (see https://prestodb.github.io/docs/current/functions/json.html ) Do users need a way to produce GeoJSON that includes metadata?

I don't think we'd be exposing anything. The simple use case is, given a GeoJSON, extract the geometry object, whether or not you provide metadata. Users _may_ be aware of the fact the GeoJSON can contain such metadata, but the function being implemented here should be able to handle an object _with_ metadata, like FeatureCollection, or a simple Polygon, and I believe this is what @jagill was referring to. Correspondingly, I don't think users would be necessarily interested in producing GeoJSON with metadata; as long as serialization/deserialization works, that's all that matters.

@ampampamp In this case, consider adding cast to/from Geometry and JSON. Let's also document that the conversions are lossy, e.g. metadata is being lost.

I wonder why has this been closed by the author while the discussion doesn't explicitly reject the idea? I find the ST_GeomFromGeoJSON and ST_AsGeoJSON extremely useful. My usecase is:

  • primary backend being MongoDB which stores all geospatial values natively in GeoJSON so that spatial queries using Mongo API directly work
  • I also access this Mongo collection using PrestoDB to join the data with other datasets using the spatial features; unfortunately I cannot use the original geometry values stored in GeoJSON as it makes no sense to Presto which needs WKT
  • my only solution was to add another value to the mongo collection with the same geometry stored also in WKT just so I can manipulate the data from Presto; this is ugly data redundancy which would be unnecessary had Presto allowed me to use the existing geometry stored in GeoJSON using the ST_GeomFromGeoJSON

ok, I guess it is tracked under #12478...

CC: @jagill

This issue was closed because the OP ended up not needing the functionality, so we didn't have a good use case for it. Interoperability with Mongo is a compelling case. We can definitely put this in the queue.

Was this page helpful?
0 / 5 - 0 ratings