Howdy,
I developed geoplot (repo), a package for high-level geospatial plotting, a few months back. Right now it's a standalone module, but I'm interested in integrating this directly into geopandas as a geospatial visualization module that's akin to the pandas plotting lib functions: gpd.plot.cartogram, gpd.plot.kdeplot, etcetera.
Is there an interest in doing something like this? Would love to discuss.
cc: @jorisvandenbossche, since you're active here and we're already in contact in pandas dev...
@jwass and @kjordahl also, since this would be a pretty huge change-set.
> I developed geoplot

which is a really cool project!

> we're already in contact in pandas dev...

ah, so I can again question the need to add more methods? :-)
Just commenting here to let you know that I saw your post (it's a very interesting subject you brought up), and that I will certainly answer in more depth, but that may take a few more days. (But don't let that prevent others from answering already!)
So while @jorisvandenbossche is distracted by the Docathon (:smile:), I thought I'd drop some notes about what the scope of this change would be.
In terms of API, I think for compat it's best to continue to use the current GeoDataFrame.plot/GeoSeries.plot behavior, and extend the plot accessor with the additional options enabled by geoplot. So, at a minimum, you'd see the following additional methods:
- GeoDataFrame.plot.pointplot
- GeoDataFrame.plot.cartogram
- GeoDataFrame.plot.kdeplot
- GeoDataFrame.plot.sankey

This is equivalent to what pandas does right now (note, though, that pandas is moving to deprecate the root plot method "eventually").
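To make the accessor idea concrete, here is a minimal pure-Python sketch of a callable plot accessor. It is not geopandas or geoplot code: `Frame`, `PlotAccessor`, the `kind` argument, and the dictionaries returned in place of real axes are all hypothetical stand-ins, assumed only for illustration.

```python
class PlotAccessor:
    """Hypothetical stand-in for a GeoDataFrame.plot accessor."""

    # Method names taken from the list above; everything else is assumed.
    _KINDS = ("pointplot", "cartogram", "kdeplot", "sankey")

    def __init__(self, data):
        self._data = data

    def __call__(self, kind="default", **kwargs):
        # Calling the accessor directly keeps the existing `.plot(...)` style,
        # and can optionally dispatch to one of the named methods.
        if kind == "default":
            return self._describe("default", kwargs)
        if kind not in self._KINDS:
            raise ValueError("unknown plot kind: {0!r}".format(kind))
        return getattr(self, kind)(**kwargs)

    def _describe(self, kind, kwargs):
        # A real implementation would draw and return a matplotlib Axes;
        # this sketch only records what would have been drawn.
        return {"kind": kind, "n_rows": len(self._data), "options": kwargs}

    def pointplot(self, **kwargs):
        return self._describe("pointplot", kwargs)

    def cartogram(self, **kwargs):
        return self._describe("cartogram", kwargs)

    def kdeplot(self, **kwargs):
        return self._describe("kdeplot", kwargs)

    def sankey(self, **kwargs):
        return self._describe("sankey", kwargs)


class Frame:
    """Hypothetical stand-in for a GeoDataFrame."""

    def __init__(self, rows):
        self.rows = list(rows)

    def __len__(self):
        return len(self.rows)

    @property
    def plot(self):
        # A fresh accessor bound to this frame on each access.
        return PlotAccessor(self)
```

With this shape, `frame.plot()` keeps working as a plain call, while `frame.plot.kdeplot(...)` and `frame.plot(kind="kdeplot", ...)` reach the same method.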
There are two methods that I am leaving out of this list. The first is GeoDataFrame.plot.aggplot, an experimental minimal-assumptions take on a choropleth that makes somewhat less sense when it's attached to an explicitly geospatial data structure. geopandas does support null geometry columns; so if this is something that the devs are comfortable including, I think it can be done. But, it may find a better home somewhere in the pysal.geoplot. Or maybe it's just a terrible idea and should be cut completely. I'm not sure, TBD.
The other method is GeoDataFrame.plot.choropleth. The geoplot one is an improved iteration on the pure-geopandas one that supports projections and a bit of other extra "stuff". I think the APIs already match, and geoplot.choropleth can be transparently inserted into GeoDataFrame.plot with backwards compatibility and a minimum of modifications. But this would be a more substantial change than simply hanging new methods off the accessor as above.
geoplot has its own tests. These can be integrated into geopandas pretty smoothly. However, the tests I have are pretty bare-bones: I'm just checking that the plot "works". I will need to do some work creating more rigorous unit tests, again going off the pandas model here.
Also, the property tests I have (implemented with hypothesis) are cool, but ultimately non-essential and take forever and should be cut.
geoplot could still use some polish in places. I think we can work on fixing up trouble spots as a part of this effort.
geoplot has its own docs; these would obviously need to be rewritten. The tutorial docs would become a section in the geopandas docs, the API reference likewise. I'm pretty sure we won't be able to bring the seaborn-style illustrative images over...oh well.
Something I would like to keep is the Example Gallery. Not sure how that would work, but I think the geoplot matplotlib-style Gallery is pretty awesome (hope you agree :smile:) and I'd like to keep it. How is TBD.
geoplot on its own is not Windows-compat AFAIK because of DLL hell issues. However, I'm 80% sure this will clear up if it's made a part of geopandas directly, since it's the interactions between geoplot, geopandas, and cartopy that are causing the issue.
The code needs to be retouched to make it compatible with Python 3.4 and 2.7, as the rest of this lib is.
I think I'd prefer to get these changes into geopandas as a series of PRs, if possible. Probably break things up by method?
Once the whole thing is "in", I think it'd be nice to cut this as a new release. This is a lot after all!
In non-geospatial data viz there is the matplotlib-to-pandas plotting-to-seaborn stack: matplotlib provides your low-level features, pandas provides an intermediate layer optimized for speed, and seaborn provides a high-level layer optimized for analytical capability. I'd like to see this implemented in geospatial Python as a cartopy-to-geopandas plotting-to-pysal plotting kind of thing. This would be a pretty big step in that direction, obviously.
@ResidentMario is this idea still alive? Or is it preferred to keep geoplot as a separate package? We may also keep geoplot as is and use GeoDataFrame.plot.kdeplot etc. as a convenient interface only.
At this point geoplot is a fairly mature and independent implementation of geospatial plotting tooling, with an API that's very different from the one that geopandas uses. I foresee that the two will continue to exist as independent packages.
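The "convenient interface only" option could look roughly like the sketch below: a thin accessor that imports the plotting package lazily and forwards the call. This is only an illustration under assumptions. `DelegatingPlotAccessor`, `_import_backend`, and the `backend` parameter are hypothetical names, and the sketch assumes the backend exposes a `kdeplot(df, **kwargs)` function, as geoplot does.

```python
import importlib


def _import_backend(name):
    """Import the plotting backend lazily, with a clearer error if it is missing."""
    try:
        return importlib.import_module(name)
    except ImportError as err:
        raise ImportError(
            "{0} is required for this plot type; install it separately".format(name)
        ) from err


class DelegatingPlotAccessor:
    """Hypothetical thin wrapper: holds the data, forwards plotting elsewhere."""

    def __init__(self, data, backend="geoplot"):
        self._data = data
        # Name of the module to delegate to (an assumed, injectable parameter).
        self._backend = backend

    def kdeplot(self, **kwargs):
        # Nothing is imported until a plot is actually requested, so the
        # backend stays an optional dependency.
        return _import_backend(self._backend).kdeplot(self._data, **kwargs)
```

The design point is that geoplot would remain a separate package with its own release cycle, and the accessor would add only a few lines of forwarding code per method.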
I would like to do the following:
geopandas one.

Sadly I have not had much time to work on geoplot as of late, but maybe in the next few months I will be able to set aside some time to sprint on it some more.