Xarray: HoloViews based plotting API

Created on 30 May 2018  ·  10 Comments  ·  Source: pydata/xarray

As part of a recent project we have been working on HoloPlot, a plotting API for a number of projects including pandas and xarray. You can see some examples using the API with xarray here. As the name suggests, it is built on HoloViews and is meant as an alternative to the native plotting APIs: it closely mirrors them but does not necessarily match them exactly. The main differences are:

  • Certain keywords are likely to differ, e.g. width/height vs fig_inches
  • The API returns HoloViews objects which can be composed and display themselves
  • It supports some additional features such as datashading and exploring a parameter space with widgets

The main question I'd like to put to the xarray community is how we should best expose this API. In pandas there has been some discussion about adding a configurable engine for the plotting API, letting you switch between different plotting implementations (see https://github.com/pandas-dev/pandas/issues/14130). The approach we started with was to clobber the DataArray.plot API entirely, which I now consider too obtrusive and likely to interfere with existing workflows. The alternative approaches we considered:

  • Name the patched method differently, e.g. DataArray.hvplot, DataArray.hplot or DataArray.holoplot
  • Patch DataArray.plot but add an engine keyword to toggle between the original and HoloPlot API.
  • Add a global toggle to switch between the APIs (likely in addition to the engine keyword)

I'd love to hear what xarray maintainers and users think would be the best approach here.

All 10 comments

It looks like a good use case for accessors. The syntax could then be: DataArray.hv.plot() and would give you full flexibility.
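For readers unfamiliar with accessors, here is a minimal sketch of the mechanism using xarray's `register_dataarray_accessor` decorator. The accessor name `demo_plot` and the class `DemoPlotAccessor` are hypothetical and chosen only for illustration; the accessor returns a description of the requested plot rather than drawing anything, so it is not how HoloPlot itself is implemented.

```python
import numpy as np
import xarray as xr


# "demo_plot" is a hypothetical accessor name used only in this sketch.
@xr.register_dataarray_accessor("demo_plot")
class DemoPlotAccessor:
    """Toy accessor that describes a plot instead of drawing it."""

    def __init__(self, xarray_obj):
        # xarray passes in the DataArray the accessor is attached to.
        self._obj = xarray_obj

    def __call__(self, x, y, kind="image"):
        # A real plotting backend would build and return a plot object here.
        return {"kind": kind, "x": x, "y": y, "name": self._obj.name}


da = xr.DataArray(
    np.zeros((2, 3, 4)),
    dims=("time", "lat", "lon"),
    name="air",
)

# The accessor lives in its own namespace; the native da.plot is untouched.
spec = da.demo_plot(x="lon", y="lat")
```

Because the accessor is registered under its own name, it can coexist with `DataArray.plot` instead of replacing it.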

Very cool! I also think this would be a good use case for a new accessor, perhaps DataArray.holoplot() mirroring our preference for accessor names to match projects.

An engine keyword/option could also be viable, but would require more coordination (e.g., figuring out the plotting interface, which seems to have stalled that plotting issue). That said, if pandas figured out a way to do this I'm sure we would be happy to copy it.
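The engine keyword and global toggle discussed above could look roughly like the following. This is a pure-Python sketch and not pandas or xarray code: the registry and the names `register_engine`, `set_default_engine` and `plot` are all hypothetical, and the two "engines" are stand-ins that only report which backend was chosen.

```python
# Hypothetical registry mapping engine names to plotting callables.
_ENGINES = {}
# Hypothetical global option holding the default engine name.
_OPTIONS = {"plot_engine": "native"}


def register_engine(name, func):
    """Register a plotting callable under an engine name."""
    _ENGINES[name] = func


def set_default_engine(name):
    """Global toggle: choose the engine used when none is passed."""
    if name not in _ENGINES:
        raise ValueError(f"unknown plotting engine: {name!r}")
    _OPTIONS["plot_engine"] = name


def plot(data, engine=None, **kwargs):
    """Dispatch to the per-call engine, falling back to the global default."""
    name = engine if engine is not None else _OPTIONS["plot_engine"]
    if name not in _ENGINES:
        raise ValueError(f"unknown plotting engine: {name!r}")
    return _ENGINES[name](data, **kwargs)


# Stand-in backends that only report which engine handled the call.
register_engine("native", lambda data, **kw: ("native", data, kw))
register_engine("holoplot", lambda data, **kw: ("holoplot", data, kw))

default_result = plot([1, 2, 3])
override_result = plot([1, 2, 3], engine="holoplot")
```

The coordination cost mentioned above shows up in the signature of the registered callables: every engine has to accept the same arguments, which is the interface that would need to be agreed on.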

Thanks for the feedback! I'll try to drive the pandas conversation along, but since I doubt that will be resolved in the near term, I think until then we should definitely pursue the accessor approach (which is much better than the property monkey patching we're doing now).

Personally I'd prefer DataArray.hvplot() since I think even the two extra characters make a difference and something like DataArray.hv.plot.contourf() seems too deeply nested. That said if "our preference for accessor names to match projects" is a solidly established convention I'll defer to that and go with DataArray.holoplot().

@rabernat Since you have used HoloViews with xarray in the past I'd very much appreciate your input as well.

I am a big fan of holoviews and have been using it extensively for my own work in recent months. So obviously I am a big 👍 on this integration.

I agree the accessor is the best option for now, but I have no strong opinions about the name of the accessor.

Some features I would like to see are things that go beyond the plotting capabilities associated with the matplotlib engine. For example:

  • Automatic generation of DynamicMaps. Say I have a DataArray with dimensions ('time', 'lat', 'lon'); I should be able to say da.hv.plot(kdims=['lat', 'lon']) and have time become a dynamic selector.
  • To go along with the above, lazy loading of dask-backed arrays
  • Intelligent faceting which automatically links the facet kdims
  • Plotting not just of DataArrays but Datasets. The variable itself could become a dynamic selector in a dropdown menu. Basically, I just want to say ds.hv.plot() and have holoviews provide all the options I need to explore the dataset interactively. Kind of like how ncview works. At that point, we won't need ncview anymore.
  • Options for projections, coastlines, etc. associated with geoviews

Oh and another big 👍 to the datashader integration. This is crucial for my datasets.

I agree the accessor is the best option for now, but I have no strong opinions about the name of the accessor.

Okay thanks, given xarray's preference for accessor names to match projects I'm now leaning toward da.holoplot().

Automatic generation of DynamicMaps. Say I have a DataArray with dimensions ('time', 'lat', 'lon'); I should be able to say da.hv.plot(kdims=['lat', 'lon']) and have time become a dynamic selector.

HoloPlot explicitly does not deal with kdims and vdims, instead more closely following the API of pd.DataFrame.plot and xr.DataArray.plot. That said, coordinates that are not assigned to the x/y axes will automatically result in a DynamicMap, so this will give you an image plot plus a widget to select the time:

da.holoplot(x='lon', y='lat', kind='image')
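The rule described here, where any dimension not assigned to an axis becomes a widget, can be written as a small pure-Python sketch. The helper `split_dims` is hypothetical and only illustrates the bookkeeping; it is not HoloPlot's implementation.

```python
def split_dims(dims, x, y):
    """Split dimension names into plotted axes and leftover widget dimensions."""
    missing = [d for d in (x, y) if d not in dims]
    if missing:
        raise ValueError(f"dimensions not found: {missing}")
    # Every dimension not mapped to an axis would get a selector widget.
    widget_dims = [d for d in dims if d not in (x, y)]
    return [x, y], widget_dims


plot_dims, widget_dims = split_dims(("time", "lat", "lon"), x="lon", y="lat")
```

For the ('time', 'lat', 'lon') array in this example, 'time' is the only leftover dimension, so it is the one that would be exposed as a selector.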

To go along with the above, lazy loading of dask-backed arrays

That should happen automatically.

Intelligent faceting which automatically links the facet kdims

You can facet in a number of ways:

da.isel(time=slice(0, 3)).holoplot(x='lon', y='lat', kind='image', by='time')

will produce three subplots which are linked on the x- and y-axis, i.e. zooming on one will zoom on all unless you set shared_axes=False. You can also generate a grid with:

da.isel(time=slice(0, 3)).holoplot(x='lon', y='lat', kind='image', row='time', col='some_other_coord')

Plotting not just of DataArrays but Datasets.

This is also already supported; the API here is:

ds.holoplot(x='lon', y='lat', z=['air', 'surface'])

This will provide a widget to select between the 'air' and 'surface' data variables.

Options for projections, coastlines, etc. associated with geoviews

I'm currently working on that; it's basically just waiting on new HoloViews/GeoViews releases. The API here is as follows:

air_ds.air.holoplot.quadmesh(
    'lon', 'lat', ['air', 'some_other_variable'], crs=ccrs.PlateCarree(), projection=ccrs.Orthographic(-80, 30),
    global_extent=True, width=600, height=500, cmap='viridis'
) * gv.feature.coastline

something like DataArray.hv.plot.contourf() seems too deeply nested.

Actually, I suppose that's not what it would be; it could be da.hv.plot and da.hv.contourf, with .plot figuring out the kind for you. I quite like that too.

I'm not strongly opposed to something like DataArray.hvplot for the accessor, it's just slightly less obvious than DataArray.holoplot.

hv would probably be too short for a good name (but of course this is totally up to you), especially because I can imagine people using hv as a variable name, and variables can also be accessed via attributes.

Thanks again for the feedback. I've decided to go with .holoplot in the end. I'll work on finishing some of the geo-related features today and get a 0.1 release and announcement out this week.

Thanks for everyone's feedback. Due to trademark concerns, we decided to rename both the library and the API to .hvplot. There should be a release and an announcement in the coming week.
