Geopandas: QUESTION: Zip Files Documentation

Created on 20 Jan 2019 · 4 Comments · Source: geopandas/geopandas

Where is the documentation on reading zip files directly with geopandas? I ask because I'm willing to open a quick pull request to add it to the docs.

If it does not exist, would the maintainers be open to a simple doc update? I'll do the pull request.

I found the example in a SciPy 2018 tutorial here.

I looked on the geopandas page and GitHub and didn't find an explicit mention. This is a useful feature, as a lot of ESRI shapefiles get shipped around. Hopefully the GeoPackage spec takes over completely, but having the zip example would help. Maybe it's obvious to some, but it wasn't to me for a long time.
Example:

import geopandas as gpd

# download zipped country file from https://gadm.org/download_country_v3.html
states = gpd.read_file('zip:///Users/linwood/Downloads/cb_2017_us_state_500k.zip')  # read the zipped shapefile directly
Labels: documentation, good first issue

All 4 comments

@linwoodc3 Thanks for raising this. A note on reading zip files directly, as above, would be a welcome addition in https://github.com/geopandas/geopandas/blob/master/doc/source/io.rst

@linwoodc3 @jorisvandenbossche Hi, I'd like to take on this issue!

This feature isn't as well documented in https://fiona.readthedocs.io/en/latest/README.html#collections-from-archives-and-virtual-file-systems as I'd like. I'll work on that, too.

Keep in mind that fiona.open("zip://data.zip") (which is what read_file calls) is only useful when data.zip contains a single vector dataset in the root of the zip file. If the dataset is in a folder in the zip file you must extend the identifier like zip://data.zip!folder. If there are multiple datasets in that folder you must also specify the filename like zip://data.zip!folder/file.shp.
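To make the three identifier forms above concrete, here is a minimal sketch that uses only the Python standard library to inspect an archive and assemble the matching zip:// string. The helper names list_shapefiles and zip_uri are hypothetical (they are not part of fiona or geopandas), and the geopandas call is shown only as a comment because it needs geopandas installed and a real archive on disk.

```python
import zipfile
from pathlib import Path


def list_shapefiles(archive):
    """Return the sorted paths of all .shp members inside a zip archive."""
    with zipfile.ZipFile(archive) as zf:
        return sorted(n for n in zf.namelist() if n.lower().endswith(".shp"))


def zip_uri(archive, member=None):
    """Build a zip:// identifier (hypothetical helper, not a library function).

    With member=None the result points at the archive root; otherwise the
    folder or file inside the archive is appended after a '!' separator.
    """
    uri = "zip://" + Path(archive).as_posix()
    if member is None:
        return uri
    return uri + "!" + member.strip("/")


# Usage with geopandas (not run here; assumes geopandas is installed
# and that data.zip really contains folder/file.shp):
#   import geopandas as gpd
#   gdf = gpd.read_file(zip_uri("data.zip", "folder/file.shp"))
```

Listing the members first is a cheap way to find out whether the dataset sits in the root of the archive, in a folder, or next to other datasets, which determines which of the three forms is needed. Note that an absolute path such as /Users/linwood/Downloads/... yields three slashes after zip:, as in the original example.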

Thanks for the info, @sgillies!
