Visidata: View pandas/numpy/list of list objects in running python session

Created on 21 Jun 2018 · 3 Comments · Source: saulpw/visidata

As mentioned in issue #10 (rightfully closed as Visidata was made importable):
Would it be possible to use Visidata to view in-memory python objects?

This seems appropriate, especially given the tighter integration with pandas since v1.2 HDF5 file support. Again, as someone relatively new to the Python world coming from R (yes, one of those), I don't know how much work this would be and am merely going off the features I would enjoy.

As for how this could work, I guess all editing features could be disabled. Perhaps a write-to-file, modify, reload-to-memory solution would be handy for quick-and-dirty analysis that splits the work between code and Visidata later on, but perhaps that's thinking too far ahead. (Alternatively, the intermediary step could be skipped, but I have no idea how that would work.)

Also, independently of whether this feature gets integrated, thank you for making a brilliant tool! The step between exploration and end-product code has finally been filled for me in a way I never thought possible. What little use I had for GUI spreadsheet programs seems to have finally run out.

All 3 comments

Thank you, @jjzmajic! I always love to hear how people are using it. If you have any specific favorite workflows it would be my great privilege to hear them.

What kind of in-memory Python objects do you want to view? If you're using the Python command-line interface (REPL), try this:

>>> import visidata
>>> vd = lambda obj: visidata.run(visidata.load_pyobj('foo', obj))

>>> data = [{'a':1, 'b':2, 'c':3}, {'a':4, 'b':5, 'c':6}]
>>> vd(data)
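
For the list-of-lists case mentioned in the title, one option is to turn the rows into dicts first, so they have the same shape as `data` above. This is only a minimal sketch: `rows_to_records` is a hypothetical helper, not part of VisiData, and it assumes the first row holds the column names unless a header is passed in.

```python
def rows_to_records(rows, header=None):
    """Convert a list of lists into a list of dicts keyed by column name."""
    if header is None:
        # Assumption: the first row holds the column names.
        header, rows = rows[0], rows[1:]
    return [dict(zip(header, row)) for row in rows]


table = [['a', 'b', 'c'], [1, 2, 3], [4, 5, 6]]
records = rows_to_records(table)
```

The resulting `records` could then be handed to the `vd` lambda from the snippet above.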

Even though that's not what I meant I think I kind of love you just for that piece of code.
Seems like I found my new dir() alias.

I was mostly referring to the data that pandas DataFrames and numpy arrays hold, as opposed to the attributes of the object itself. Sorry for being vague.

I was thinking something like what gtabview (https://github.com/TabViewer/gtabview) offers with gtabview.view(). Thing is, VisiData is far more powerful, and I frankly prefer the terminal.

The reason I ask is that I use VisiData as a preliminary guide for analysis. Jump through categorical vars -> sort continuous vars -> see trend? -> graph relationship -> zoom into interesting bit -> save to csv -> play around in python until I get a concrete idea as to what I could do with the data -> repeat analysis more carefully and reproducibly with the entire dataset in pure code.

It would be incredibly handy to be able to do that once I already have the data as a pd.DataFrame. Just so that workflow is more cohesive and so I can switch back and forth.

But frankly, even without that VisiData is such a huge help and improvement that I just want to thank you. You're doing amazing work! Hopefully, I'll become competent enough in Python to actually contribute at some point.
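
The write-to-file, modify, reload idea from the original post could be approximated with a temporary CSV file. The sketch below uses only the standard library and a list of dicts; `dump_records`, `load_records` and `round_trip` are hypothetical names, and it assumes the `vd` command is on the PATH and that you save over the same file before quitting VisiData.

```python
import csv
import os
import subprocess
import tempfile


def dump_records(records, path):
    """Write a list of dicts to a CSV file, using the first record's keys as columns."""
    with open(path, 'w', newline='') as f:
        writer = csv.DictWriter(f, fieldnames=list(records[0].keys()))
        writer.writeheader()
        writer.writerows(records)


def load_records(path):
    """Read a CSV file back into a list of dicts (all values come back as strings)."""
    with open(path, newline='') as f:
        return list(csv.DictReader(f))


def round_trip(records, edit=True):
    """Dump records to a temporary CSV, optionally open it in VisiData, then reload."""
    fd, path = tempfile.mkstemp(suffix='.csv')
    os.close(fd)
    try:
        dump_records(records, path)
        if edit:
            # Assumption: `vd` is installed and on PATH.
            subprocess.run(['vd', path], check=True)
        return load_records(path)
    finally:
        os.remove(path)


data = [{'a': 1, 'b': 2}, {'a': 3, 'b': 4}]
reloaded = round_trip(data, edit=False)  # edit=False skips the interactive step
```

With a DataFrame, `df.to_csv(path, index=False)` and `pandas.read_csv(path)` could take the place of the two helpers. Note that the plain `csv` module returns every value as a string, so types would need converting after the reload.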

@jjzmajic, I've added a PandasSheet. Now you will be able to do:

>>> import pandas
>>> import visidata
>>> vdf = lambda df: visidata.run(visidata.PandasSheet('pandas', source=df))

>>> df = pandas.read_csv('foo.csv')
>>> vdf(df)

This will be in v1.3. You can try it out sooner by using the dev branch, or by running the ~20 lines of code from pandas.py in your Python environment.
