Altair: Question, how to divide two aggregate

Created on 1 Jan 2019  路  6Comments  路  Source: altair-viz/altair

sorry, I could not find anything in the documentation how to divide two aggregate, I am using this syntax but I am stuck, all I want is to create a new aggregation new, which is the division of two aggregation sum(Installed_qty) and sum(budget)

bar = alt.Chart(df).transform_calculate(
                               new='sum(Installed_qty)'/'sum(budget)'
                                       ).mark_bar().encode(
                                   x=alt.X("date:O", timeUnit="yearmonthdate",axis=alt.Axis(format="%d %b %y")),
                                   y='new:Q'   

Most helpful comment

It's a bit awkward in the Vega-Lite transform syntax, but you have to create two aggregate transforms and then divide them within a calculate transform. e.g.

bar = alt.Chart(df).transform_aggregate(
    sum_installed='sum(Installed_qty)',
    sum_budget='sum(budget)',
    groupby=['date']
).transform_calculate(
    new='datum.sum_installed / datum.sum_budget'
).mark_bar().encode(
    x=alt.X("date:O", timeUnit="yearmonthdate",axis=alt.Axis(format="%d %b %y")),
    y='new:Q'
)

If you're using a pandas dataframe, it might be cleaner to add columns with the computed values.

All 6 comments

It's a bit awkward in the Vega-Lite transform syntax, but you have to create two aggregate transforms and then divide them within a calculate transform. e.g.

bar = alt.Chart(df).transform_aggregate(
    sum_installed='sum(Installed_qty)',
    sum_budget='sum(budget)',
    groupby=['date']
).transform_calculate(
    new='datum.sum_installed / datum.sum_budget'
).mark_bar().encode(
    x=alt.X("date:O", timeUnit="yearmonthdate",axis=alt.Axis(format="%d %b %y")),
    y='new:Q'
)

If you're using a pandas dataframe, it might be cleaner to add columns with the computed values.

thanks you very much, I can't use columns in pandas dataframe as the result change based on a cross filter from another chart.

thanks for the quick reply, much appreciate it.

the cross filter does not work :(

here is the whole code

[import altair as alt
import pandas as pd
df=pd.read_csv("qty.csv",parse_dates=[0], dayfirst=True)

pts = alt.selection(type='multi', encodings=['y'])

barselect = alt.Chart(df).mark_bar().encode(
    y='category:N',
    x='sum(Installed_qty)',
    color=alt.condition(pts, alt.ColorValue("steelblue"), alt.ColorValue("grey"))
).properties(
    selection=pts,
    width=550,
    height=200
)   

bar = alt.Chart(df).transform_aggregate(
    sum_installed='sum(Installed_qty_cumulative)',
    sum_budget='sum(budget)',
    groupby=['date']
).transform_calculate(
    new='datum.sum_installed / datum.sum_budget'
).mark_bar().encode(
    x=alt.X("date:O", timeUnit="yearmonthdate",axis=alt.Axis(format="%d %b %y")),
    y='new:Q'




).transform_filter(
    pts
)
text = bar.mark_text(
    align='right',
    baseline='bottom',
    dx=1  # Nudges text to right so it doesn't appear on top of the bar
).encode(
    text='new:Q').transform_filter(
    pts
) 

chart= alt.hconcat(
    barselect,
    bar+text
).resolve_legend(
    color="independent",
    size="independent"
)
chart

sorry I don't know how to add a csv file in the code

date | category | Installed_qty | Installed_qty_cumulative | budget
-- | -- | -- | -- | --
1/01/2010 | piling | 9 | 9 | 702
2/01/2010 | piling | 11 | 20 | 702
3/01/2010 | piling | 94 | 114 | 702
4/01/2010 | piling | 81 | 195 | 702
5/01/2010 | piling | 99 | 294 | 702
6/01/2010 | piling | 85 | 379 | 702
7/01/2010 | piling | 92 | 471 | 702
8/01/2010 | piling | 66 | 537 | 702
9/01/2010 | piling | 32 | 569 | 702
10/01/2010 | piling | 3 | 572 | 702
11/01/2010 | piling | 50 | 622 | 702
12/01/2010 | piling | 59 | 681 | 702
13/01/2010 | piling | 21 | 702 | 702
1/01/2010 | excavation | 48 | 48 | 565
2/01/2010 | excavation | 45 | 93 | 565
3/01/2010 | excavation | 3 | 96 | 565
4/01/2010 | excavation | 77 | 173 | 565
5/01/2010 | excavation | 11 | 184 | 565
6/01/2010 | excavation | 66 | 250 | 565
7/01/2010 | excavation | 56 | 306 | 565
8/01/2010 | excavation | 90 | 396 | 565
9/01/2010 | excavation | 70 | 466 | 565
10/01/2010 | excavation | 4 | 470 | 565
11/01/2010 | excavation | 1 | 471 | 565
12/01/2010 | excavation | 46 | 517 | 565
13/01/2010 | excavation | 48 | 565 | 565
1/01/2010 | concrete | 79 | 79 | 456
2/01/2010 | concrete | 17 | 96 | 456
3/01/2010 | concrete | 88 | 184 | 456
4/01/2010 | concrete | 8 | 192 | 456
5/01/2010 | concrete | 26 | 218 | 456
6/01/2010 | concrete | 61 | 279 | 456
7/01/2010 | concrete | 29 | 308 | 456
8/01/2010 | concrete | 27 | 335 | 456
9/01/2010 | concrete | 53 | 388 | 456
10/01/2010 | concrete | 68 | 456 | 456
11/01/2010 | concrete | 68 | 456 | 456
12/01/2010 | concrete | 68 | 456 | 456
13/01/2010 | concrete | 68 | 456 | 456

regards

You are applying the pts filter after the calculate and aggregate transforms, which is probably not what you want. Transforms are applied in the order they are specified.

wow, it works, that's great, thanks

one last question please,what's the syntax for
if planned_cumulative='datum.sum_planned_cumulative / datum.sum_budget' 0 then null else planned_cumulative='datum.sum_planned_cumulative / datum.sum_budget',

basically one of the line stop in the middle and i don't want to draw line if 0

thanks

Was this page helpful?
0 / 5 - 0 ratings

Related issues

galloramiro picture galloramiro  路  3Comments

morberg picture morberg  路  3Comments

nielsmde picture nielsmde  路  4Comments

bmcfee picture bmcfee  路  3Comments

Juan-132 picture Juan-132  路  3Comments