Geopandas: More extensive plotting tools?

Created on 27 Feb 2017 · 6 Comments · Source: geopandas/geopandas

Howdy,

I developed geoplot (repo), a package for high-level geospatial plotting, a few months back. Right now it's a standalone module, but I'm interested in integrating this directly into geopandas as a geospatial visualization module that's akin to the pandas plotting lib functions: gpd.plot.cartogram, gpd.plot.kdeplot, etcetera.

Is there an interest in doing something like this? Would love to discuss.

cc: @jorisvandenbossche, since you're active here and we're already in contact in pandas dev...

All 6 comments

@jwass and @kjordahl also, since this would be a pretty huge change-set.

I developed geoplot

and which is a really cool project!

we're already in contact in pandas dev...

ah, so I again can question the need to add more methods ..? :-)

Just commenting here to let you know that I saw your post (it's a very interesting subject you brought up), and that I will certainly answer in more depth, but that may take a few more days. (But don't let that prevent others from already answering!)

So while @jorisvandenbossche is distracted by the Docathon (:smile:), I thought I'd drop some notes about what the scope of this change would be.

API

In terms of API, I think for compat it's best to continue to use the current GeoDataFrame.plot/GeoSeries.plot behavior, and extend the plot accessor with the additional options enabled by geoplot. So, at a minimum, you'd see the following additional methods:

This is equivalent to what pandas does right now (note, though, that pandas is moving to deprecate the root plot method "eventually").
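To make the accessor idea concrete, here is a minimal sketch of an object that stays callable (so the existing plot call keeps working) while also carrying extra named methods. The class names PlotAccessor and Frame and the tuples they return are placeholders for illustration only; this is not geopandas or geoplot code.

```python
class PlotAccessor:
    """Hypothetical accessor: callable for the default plot, named methods for extras."""

    def __init__(self, data):
        self._data = data

    def __call__(self, **kwargs):
        # Existing behaviour: frame.plot(...) keeps working unchanged.
        return ("default", self._data, kwargs)

    def kdeplot(self, **kwargs):
        # Additional behaviour: frame.plot.kdeplot(...)
        return ("kdeplot", self._data, kwargs)


class Frame:
    """Stand-in for a GeoDataFrame; illustration only."""

    def __init__(self, rows):
        self.rows = rows

    @property
    def plot(self):
        # A fresh accessor bound to this frame's data on every access.
        return PlotAccessor(self.rows)


frame = Frame([1, 2])
default_result = frame.plot(color="red")
kde_result = frame.plot.kdeplot(shade=True)
```

Both spellings go through the same accessor object, which is why new methods can be hung off it without touching the existing call.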

There are two methods that I am leaving out of this list. The first is GeoDataFrame.plot.aggplot, an experimental minimal-assumptions take on a choropleth that makes somewhat less sense when it's attached to an explicitly geospatial data structure. geopandas does support null geometry columns; so if this is something that the devs are comfortable including, I think it can be done. But, it may find a better home somewhere in the pysal.geoplot. Or maybe it's just a terrible idea and should be cut completely. I'm not sure, TBD.

The other method is GeoDataFrame.plot.choropleth. The geoplot one is an improved iteration on the pure-geopandas one that supports projections and a bit of other extra "stuff". I think the APIs already match, and geoplot.choropleth can be transparently inserted into GeoDataFrame.plot with backwards compatibility and a minimum of modifications. But this would be a more substantial change than simply hanging new methods as above.

Tests

geoplot has its own tests. These can be integrated into geopandas pretty smoothly. However, the tests I have are pretty bare-bones: I'm just checking that the plot "works". I will need to do some work creating more rigorous unit tests, again going off the pandas model here.

Also, the property tests I have (implemented with hypothesis) are cool, but ultimately non-essential and take forever and should be cut.

Updates

geoplot could still use some polish in places. I think we can work on fixing up trouble spots as a part of this effort.

Docs

geoplot has its own docs; these would need to be rewritten, obviously. The tutorial docs would become a section in the geopandas docs, the API reference likewise. I'm pretty sure we won't be able to bring the seaborn-style illustrative images over...oh well.

Something I would like to keep is the Example Gallery. Not sure how that would work, but I think the geoplot matplotlib-style Gallery is pretty awesome (hope you agree :smile:) and I'd like to keep it. How is TBD.

Windows compat

geoplot on its own is not Windows-compat AFAIK because of DLL hell issues. However, I'm 80% sure this will clear up if it's made a part of geopandas directly, since it's the interactions between geoplot, geopandas, and cartopy that are causing the issue.

Version compat

The code needs to be retouched to make it Python 3.4 and 2.7 compatible, as the rest of this lib is.

Pull strategy

I think I'd prefer to get these changes into geopandas as a series of PRs, if possible. Probably break things up by method?

Release strategy

Once the whole thing is "in", I think it'd be nice to cut this as a new release. This is a lot after all!

In non-geospatial data viz there is the matplotlib-to-pandas plotting-to-seaborn stack: matplotlib provides your low-level features, pandas provides an intermediate layer optimized for speed, and seaborn provides a high-level layer optimized for analytical capability. I'd like to see this implemented in geospatial Python as a cartopy-to-geopandas plotting-to-pysal plotting kind of thing. This would be a pretty big step in that direction, obviously.

@ResidentMario is this idea still alive? Or is it preferred to keep geoplot as separate package? We may also keep geoplot as is and use GeoDataFrame.plot.kdeplot etc. as a convenient interface only.

At this point geoplot is a fairly mature and independent implementation of geospatial plotting tooling, with an API that's very different from the one that geopandas uses. I foresee that the two will continue to exist as independent packages.

I would like to do the following:

  • Finish up a few remaining chores and polish tasks
  • Act on an idea that @jorisvandenbossche raised a couple of years ago, which was to move the library out of my own GH project domain and into the geopandas one.
  • At that point I think having a convenience interface is a great idea! And I'd be happy to write that up.
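As a rough sketch of what such a convenience interface could look like: a thin wrapper that imports the plotting package lazily and forwards calls to it, passing the data as the first argument. ConveniencePlotAccessor and its backend_name parameter are hypothetical names for illustration, and the assumption that every backend function takes the data first is mine; nothing here is existing geopandas or geoplot API.

```python
import importlib


class ConveniencePlotAccessor:
    """Hypothetical thin wrapper; not part of geopandas or geoplot."""

    def __init__(self, data, backend_name="geoplot"):
        self._data = data
        self._backend_name = backend_name

    def _backend(self):
        # Import lazily so the plotting package stays an optional dependency.
        try:
            return importlib.import_module(self._backend_name)
        except ImportError as exc:
            raise ImportError(
                "The %r package is required for this plot type" % self._backend_name
            ) from exc

    def __getattr__(self, name):
        # Only reached for attributes not found normally, e.g. .kdeplot
        if name.startswith("_"):
            raise AttributeError(name)
        func = getattr(self._backend(), name)

        def method(*args, **kwargs):
            # Assumption: the backend function takes the data as its first argument.
            return func(self._data, *args, **kwargs)

        return method


# The stdlib "math" module stands in for the plotting backend here,
# so the delegation can be exercised without geoplot installed.
demo = ConveniencePlotAccessor(4.0, backend_name="math")
demo_result = demo.sqrt()
```

With this shape the wrapper never needs to know the list of plot types up front, and the plotting package is only required once one of its plot types is actually requested.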

Sadly I have not had much time to work on geoplot as of late, but maybe in the next few months I will be able to set aside some time to sprint on it some more.
