Altair: Use dataframe index as X axis

Created on 22 Nov 2016  ·  14 Comments  ·  Source: altair-viz/altair

Hi, I am trying to use a pandas dataframe as the Chart data in altair, but I need to use the index of the dataframe as the X axis. My dataframe looks like this

[image: dataframe preview]

with index of type

DatetimeIndex(['2014-04-01 00:00:00', '2014-04-01 00:05:00',
               '2014-04-01 00:10:00', '2014-04-01 00:15:00',
               '2014-04-01 00:20:00', '2014-04-01 00:25:00',
               '2014-04-01 00:30:00', '2014-04-01 00:35:00',
               '2014-04-01 00:40:00', '2014-04-01 00:45:00',
               ...
               '2014-04-14 23:10:00', '2014-04-14 23:15:00',
               '2014-04-14 23:20:00', '2014-04-14 23:25:00',
               '2014-04-14 23:30:00', '2014-04-14 23:35:00',
               '2014-04-14 23:40:00', '2014-04-14 23:45:00',
               '2014-04-14 23:50:00', '2014-04-14 23:55:00'],
              dtype='datetime64[ns]', name='timestamp', length=4032, freq=None)

I tried something like this, but to no avail.

from altair import Chart, X, Y
Chart(data1).mark_line().encode(
    X('index'),
    Y('value'),
)

How should I do it? I know that if I convert my index to a column it can do the trick, but is there another way?

documentation

All 14 comments

Altair only recognizes column data; it ignores index values. You can plot the index data by first resetting the index:

Chart(data.reset_index()).mark_line().encode(
    x='index',
    y='value'
)

Oh, thanks. You mean resetting the index and using 'timestamp' as the x value, right?

I mean resetting the index, and then using the resulting column as the x value. If the index is unnamed, then the column will be called "index". If the index has a name, then the column will have that name, and you should use it instead. Print the re-indexed dataframe if you're not sure.
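
To illustrate the naming rule, here is a small sketch using pandas only. The frame contents and the variable names `unnamed` and `named` are made up for the example; the index name `timestamp` matches the one shown in the question.

```python
import pandas as pd

# Unnamed default index: reset_index() creates a column called "index".
unnamed = pd.DataFrame({"value": [3, 2, 4]})
unnamed_cols = list(unnamed.reset_index().columns)

# Named index: reset_index() reuses the index name as the column name.
named = pd.DataFrame(
    {"value": [3, 2, 4]},
    index=pd.date_range("2014-04-01", periods=3, freq="5min", name="timestamp"),
)
named_cols = list(named.reset_index().columns)

print(unnamed_cols)  # ['index', 'value']
print(named_cols)    # ['timestamp', 'value']
```

So for the dataframe in the question, the x encoding would reference 'timestamp' rather than 'index'.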

Let's add this to the documentation for 1.3.

Okay, I am definitely missing something, but just "index" does not seem to work with pandas 0.22.
Minimal example:

data = pd.DataFrame([3,2,4,5,2,3], columns=['value'])
Chart(data).mark_line().encode(
  x='index',
  y='value'
)

This yields the error "index encoding field is specified without a type". If I specify a type (any type, really), I get either an undefined axis or an empty plot.

@nova77
In your minimal example, you do not have a column named 'index' in your dataframe.
So your x axis is looking for that and not finding it.

@afonit
Yeah, I suspected it might be something like that. However, pandas provides an automatic index. How can I use it as the x axis?

@nova77
Here:

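import pandas as pd
from altair import Chart

data = pd.DataFrame([3,2,4,5,2,3], columns=['value'])
# reset_index() exposes the default index as a column named 'index'
Chart(data.reset_index()).mark_line().encode(
  x='index',
  y='value'
)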

The reset_index() call moves the index out into a column named 'index'.

@afonit
Thanks, that works. I apologize for the noob question, but I found it surprisingly difficult to do something so simple, and .reset_index() doesn't seem very intuitive.

It occurs to me that we could provide a data transformer that does the reset_index() step automatically, so you could do

alt.data_transformers.enable('use_index')

and then reference the index in encodings.

I'm not certain how much more intuitive that would be than using standard pandas manipulations directly, though...

This is now included explicitly in the docs: https://altair-viz.github.io/user_guide/data.html#including-index-data

This seems outdated as of altair-viz 4.0. alt.data_transformers.enable('use_index') does not work because the v4 API has no such transformer: "No 'use_index' entry point found in group 'altair.vegalite.v4.data_transformer'". And with x='index:Q', nothing is displayed.

There was never a use_index data transformer in Altair. If you want to include the index in your data, follow the instructions in the docs here: https://altair-viz.github.io/user_guide/data.html#including-index-data

An alternative to resetting the pandas DataFrame index is to copy the index into a column and then use that column for the axis:

data = pd.DataFrame([3,2,4,5,2,3], columns=['value'])
data['k'] = data.index.copy()
Chart(data).mark_line().encode(
  x='k',
  y='value'
)
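
If modifying the original frame in place is undesirable, a variant of the same idea is `DataFrame.assign`, which returns a new frame with the extra column. A minimal sketch, where the column name `k` follows the snippet above and `with_index` is a name chosen for illustration:

```python
import pandas as pd

data = pd.DataFrame([3, 2, 4, 5, 2, 3], columns=["value"])

# assign() returns a copy with the index added as column "k";
# the original frame is left unchanged.
with_index = data.assign(k=data.index)

print(list(with_index.columns))  # ['value', 'k']
print(list(data.columns))        # ['value']
```

The resulting `with_index` frame can be passed to Chart in the same way, with x='k'.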