VisiData: Use a Python expression on a whole column's data instead of a single cell

Created on 17 Oct 2018 · 8 Comments · Source: saulpw/visidata

I have a CSV file (see sample file) with a pre-computed histogram. It has two columns, bin and val: each value ("val") has an associated count ("bin").

I would like to plot this histogram using visidata. This can be done by marking the "val" column as a key column and plotting the "bin" column with the . command. However, I don't like the result: it is a scatterplot-like view, and a bar plot would suit my case better.

So my current solution is:

1. Find the max of the "bin" column using z+ and choosing max.
2. Create a derived column with the expr '#' * int(bin/the_max * 20), where the_max is the number found in the previous step.

This creates a column that reads as a bar plot, with one bar per row.
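
The two manual steps above amount to the following plain-Python sketch, run outside visidata. The sample rows and the names `rows`, `bar`, and `BAR_WIDTH` are assumptions made for illustration; they are not the contents of example.zip.

```python
# Hypothetical sample of a pre-computed histogram: (val, bin) pairs,
# where "bin" holds the count for each value, as in the question.
rows = [(1, 5), (2, 20), (3, 10), (4, 0)]

BAR_WIDTH = 20  # same scale factor as in the expr above


def bar(count, the_max, width=BAR_WIDTH):
    """Return a text bar whose length is proportional to count / the_max."""
    return '#' * int(count / the_max * width)


# Step 1: find the maximum of the "bin" column (done by hand with z+).
the_max = max(count for _, count in rows)

# Step 2: derive one bar per row from that maximum.
bars = [bar(count, the_max) for _, count in rows]

for (val, count), b in zip(rows, bars):
    print(f"{val:>3} {count:>3} {b}")
```

In the sketch the maximum flows straight from step 1 into step 2, which is exactly the link that has to be made by hand in the interface.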

The downside of this solution is that the max has to be typed into the expr by hand.
Which leads to my question: is there a way in visidata to compute the max (or any expr) of a column and use this value in another expr?

From what I understand about visidata, it is more oriented toward raw data, so my use case (pre-computed data) may not fit with what visidata was designed for.

Example attached:
example.zip

zakora

👍1

All 8 comments

Hi @zakora, this is an interesting request. I think you're probably doing the best that can be done with the interface alone. However, you could add the following to your .visidatarc:

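# Bind z. to add a histogram column right after the current column.
Sheet.addCommand('z.', 'addcol-freq', 'addColumn(HistogramColumn(cursorCol), cursorColIndex+1)')

def HistogramColumn(sourceCol):
    # The source column needs a numeric type so that max() and the division below work.
    return Column(sourceCol.name+"_histogram",
                    # int() keeps the repeat count an integer even when the values are floats
                    getter=lambda c,r: options.disp_histogram*int(options.disp_histolen*c.source.getTypedValue(r)//c.largest),
                    width=options.disp_histolen+2,
                    source=sourceCol,
                    # the maximum is computed once, when the column is created
                    largest=max(sourceCol.getValues(sourceCol.sheet.rows)))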

and then z. would add a new column histogramming the current column.
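
To see what the getter computes without running VisiData, here is a minimal standalone sketch of the same arithmetic. `HISTOGRAM_CHAR` and `HISTOGRAM_LEN` are stand-ins for options.disp_histogram and options.disp_histolen, and their values here are assumptions; `make_histogram_getter` and the sample counts are hypothetical names and data, not part of VisiData.

```python
# Stand-ins for options.disp_histogram and options.disp_histolen.
# The values are assumptions for this sketch.
HISTOGRAM_CHAR = '*'
HISTOGRAM_LEN = 50


def make_histogram_getter(values, char=HISTOGRAM_CHAR, length=HISTOGRAM_LEN):
    """Scan the column once for its maximum, then return a per-value function."""
    largest = max(values)  # computed once, like largest=... in HistogramColumn

    def getter(value):
        # Floor division scales the value to 0..length; int() guards against
        # float inputs, since a str cannot be repeated a float number of times.
        return char * int(length * value // largest)

    return getter


counts = [5, 20, 10, 0]  # hypothetical "bin" column, already typed as int
getter = make_histogram_getter(counts)
histogram = [getter(c) for c in counts]

for count, cells in zip(counts, histogram):
    print(f"{count:>3} {cells}")
```

The point of the design is that the maximum is captured when the column is built, so each row only pays for one multiplication and one division.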

saulpw on 18 Oct 2018

👍2

In other words, if you can do it in Python, you can do it in VisiData :)

saulpw on 18 Oct 2018

😄1

Thanks for your solution, it is very helpful!

zakora on 18 Oct 2018

👍1

Will there be something like ='#' * int(bin/max(col('bin')) * 20)?

agguser on 14 Dec 2018

@agguser I don't think I understand...what are you looking to do?

saulpw on 14 Dec 2018

I mean: will visidata support an expression for the list of values in a column (e.g. col('bin') would return the list of values in column "bin")?

agguser on 14 Dec 2018

@agguser It's tricky to make that work with decent performance on large datasets. For aggregating column data I've made new subsheets in the past. If you tell me what kind of expressions you're trying to compute, we can see if there's a way to do what you want.
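
One plausible reading of the performance concern, sketched in plain Python: if an aggregate such as max sits inside a per-row expression, the whole column is scanned again for every row. `CountingColumn` and both functions below are hypothetical and only illustrate the cost; they say nothing about how VisiData evaluates expressions internally.

```python
class CountingColumn:
    """Hypothetical column wrapper that counts full scans of its values."""

    def __init__(self, values):
        self.values = list(values)
        self.scans = 0

    def max(self):
        self.scans += 1  # each call walks every value once
        return max(self.values)


def bars_per_row_aggregate(col, width=20):
    # The aggregate sits inside the per-row expression: one scan per row.
    return ['#' * int(v / col.max() * width) for v in col.values]


def bars_cached_aggregate(col, width=20):
    # The aggregate is computed once and reused for every row.
    the_max = col.max()
    return ['#' * int(v / the_max * width) for v in col.values]


naive_col = CountingColumn([5, 20, 10, 0])
cached_col = CountingColumn([5, 20, 10, 0])
naive_bars = bars_per_row_aggregate(naive_col)
cached_bars = bars_cached_aggregate(cached_col)
print(naive_col.scans, cached_col.scans)
```

Both versions produce the same bars, but the first scans the column once per row, so its cost grows with the square of the row count.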

saulpw on 17 Dec 2018

I usually need "percentage columns" (=x*100/sum(col('x'))).
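
For reference, the "percentage column" described here reduces to the following plain-Python sketch, with the sum taken once up front. `percentage_column` and the sample data are hypothetical, and col('x') in the expression above is a proposed feature, not an existing VisiData function.

```python
def percentage_column(values):
    """Return each value as a percentage of the column total."""
    total = sum(values)  # aggregate computed once for the whole column
    if total == 0:
        # Avoid dividing by zero for an all-zero column.
        return [0.0 for _ in values]
    return [v * 100 / total for v in values]


x = [5, 20, 10, 15]  # hypothetical column "x"
percentages = percentage_column(x)

for value, pct in zip(x, percentages):
    print(f"{value:>3} {pct:5.1f}%")
```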

agguser on 17 Dec 2018

👍1
