Geopandas: QST: Trouble reading a large ESRI geodatabase

Created on 15 Oct 2020 · 5 Comments · Source: geopandas/geopandas

I am trying to read a large ESRI File geodatabase. I know that geopandas can read the particular layer I'm interested in because it successfully loads the other layers in the File Geodatabase with no issues.

However, this layer is a large polygon layer containing all building footprints for a major European country. The read_file method takes an extremely long time to execute.

How can I reduce the amount of time it takes to read this dataset so that I can perform further analysis on it?

question

All 5 comments

If the issue is performance, you can try using pyogrio (https://github.com/brendan-ward/pyogrio/). Otherwise, I am not sure there is much we can do right now. pyogrio might become the default option in geopandas in the future, but it is not yet.
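A minimal sketch of what trying pyogrio could look like, assuming pyogrio is installed and the geopandas version accepts an `engine` argument in `read_file` (that argument was added to geopandas after this thread was written). `pick_engine` and `read_layer` are hypothetical helper names, and `dataset_path` and `layer` are placeholders for your own values.

```python
import importlib.util


def pick_engine():
    """Return "pyogrio" when it is importable, else None (use the library default)."""
    if importlib.util.find_spec("pyogrio") is not None:
        return "pyogrio"
    return None


def read_layer(dataset_path, layer):
    """Read one layer, preferring the pyogrio engine when it is available."""
    # Imported lazily so pick_engine() can be used without geopandas installed.
    import geopandas

    engine = pick_engine()
    if engine is None:
        # Fall back to whatever engine this geopandas version uses by default.
        return geopandas.read_file(dataset_path, layer=layer)
    # Assumption: this geopandas version supports the `engine` keyword.
    return geopandas.read_file(dataset_path, layer=layer, engine=engine)
```

Whether this is actually faster for a given File Geodatabase layer is something to measure rather than assume.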

I ended up using the optional bbox argument of geopandas.read_file, which improved things quite a bit. But thanks, it's good to know ....

@awa5114 using geopandas, you could also try to read only a subset of the data at a time, if that is possible for your use case. See the docs here about the multiple options for this: https://geopandas.readthedocs.io/en/latest/docs/user_guide/io.html#reading-subsets-of-the-data
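As a rough sketch of the subset options mentioned above, `read_file` accepts a `bbox` (for example a `(minx, miny, maxx, maxy)` tuple) and a `rows` argument (an integer or a slice). The helper names `row_slices`, `read_in_chunks`, and `read_window` are hypothetical, and `total_rows` is assumed to be known in advance (for instance from the layer's metadata).

```python
def row_slices(total_rows, chunk_size):
    """Yield slice objects that cover range(total_rows) in chunk_size steps."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    for start in range(0, total_rows, chunk_size):
        yield slice(start, min(start + chunk_size, total_rows))


def read_in_chunks(dataset_path, layer, total_rows, chunk_size=100_000):
    """Yield one GeoDataFrame per chunk of rows instead of loading everything."""
    import geopandas  # lazy import; only needed when actually reading

    for rows in row_slices(total_rows, chunk_size):
        yield geopandas.read_file(dataset_path, layer=layer, rows=rows)


def read_window(dataset_path, layer, minx, miny, maxx, maxy):
    """Read only the features that intersect a bounding box."""
    import geopandas

    return geopandas.read_file(
        dataset_path, layer=layer, bbox=(minx, miny, maxx, maxy)
    )
```

Processing each chunk and discarding it before reading the next keeps memory use bounded, at the cost of reopening the dataset for every chunk.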

Yeah, so you already discovered that option! ;)
(closing this issue then)

@jorisvandenbossche sorry, but reading using the bbox option is not foolproof and causes all sorts of geometry problems. I solved one issue (self-intersecting polygons) by applying a zero buffer, but now I'm getting a new issue:

IllegalArgumentException: Points of LinearRing do not form a closed linestring
Traceback (most recent call last):
  File "clip_datasets.py", line 32, in <module>
    gdf = geopandas.read_file(dataset_path, layer=layer, bbox=box(*buffered_trace.total_bounds))
  File "lib\site-packages\shapely\geometry\geo.py", line 59, in box
    return Polygon(coords)
  File "lib\site-packages\shapely\geometry\polygon.py", line 243, in __init__
    ret = geos_polygon_from_py(shell, holes)
  File "lib\site-packages\shapely\geometry\polygon.py", line 509, in geos_polygon_from_py
    ret = geos_linearring_from_py(shell)
  File "shapely\speedups\_speedups.pyx", line 408, in shapely.speedups._speedups.geos_linearring_from_py
ValueError: GEOSGeom_createLinearRing_r returned a NULL pointer

Why does this occur? How can I fix or circumvent it?
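Going by the traceback, the exception is raised inside `box(...)` while the bounding polygon is being built, before `read_file` itself runs. One possible cause (an assumption, not confirmed in this thread) is that `buffered_trace.total_bounds` contains NaN values, which can happen when the frame is empty or holds only empty geometries. A minimal sketch that validates the bounds first and passes them to `bbox` as a plain tuple, so no polygon has to be constructed; `checked_bounds` is a hypothetical helper:

```python
import math


def checked_bounds(bounds):
    """Return bounds as a (minx, miny, maxx, maxy) tuple of floats, or raise."""
    minx, miny, maxx, maxy = (float(v) for v in bounds)
    if not all(math.isfinite(v) for v in (minx, miny, maxx, maxy)):
        raise ValueError(f"bounds contain non-finite values: {bounds!r}")
    if minx > maxx or miny > maxy:
        raise ValueError(f"bounds are not ordered as min/max: {bounds!r}")
    return (minx, miny, maxx, maxy)


# Usage with the names from the traceback above (not executed here):
#   bbox = checked_bounds(buffered_trace.total_bounds)
#   gdf = geopandas.read_file(dataset_path, layer=layer, bbox=bbox)
```

If this raises, the problem lies in `buffered_trace` rather than in the geodatabase layer being read.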
