Visidata: Select starting sheet in html/xls/sqlite from command-line

Created on 24 Nov 2018 · 11 Comments · Source: saulpw/visidata

Hi,
If I run vd -b t.html -o html.csv, the CSV output contains the table below rather than the HTML table inside my file.

tag,id,nrows,ncols,classes
table,,72,4,wikitable sortable

Here, table is the HTML table inside my t.html file. Is there a way to pass the sheet name on the command line? Something like vd -b t.html -sheet table -o html.csv

Thank you

wish granted wishlist

aborruso

🚀1

All 11 comments

Hi @aborruso, there's not an easy way yet. I've wanted something like this myself at times. Let me see if I can come up with something. Thanks for the suggestion!

saulpw on 24 Nov 2018

❤3

Hi @saulpw, is there a way to open the table directly when there is only one?

My final goal is to use VisiData as an HTML-to-CSV converter, with something like the command below, in which I use an XPath query to extract a single table:

curl "http://example.com/page.html" | myScrapeUtilty -xpathRule '//table[count(tr/td)>7]' | vd -b  -f html -o out.csv

But even when there is only one table, VisiData asks me to choose, and what it saves as CSV is the index sheet that lists the tables.

Thank you

aborruso on 17 Jun 2019

Hi @aborruso, try adding -p dive.vd with the attached small .vd script.

sheet  col  row  longname   input  keystrokes  comment
                 open-file  -      o
-           0    dive-row          ^J

The first command opens the input from stdin (-), and the second command dives into the first row (0).

You can generate this .vd yourself:

1. Run the same command you have, but without -b.
2. Press Enter and do any other manual steps.
3. Press Shift+D to go to the commandlog.
4. Finally, press Ctrl+S to save to dive.vd, which you can use with your pipeline.
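For anyone who would rather create the file from a script than record it interactively, here is a minimal sketch that writes the same two-row command log. It assumes the .vd file is plain tab-separated text with the header columns shown above; the function name write_dive_script is made up for this example and is not part of VisiData.

```python
import csv

# Column order copied from the command log shown in this thread.
FIELDS = ["sheet", "col", "row", "longname", "input", "keystrokes", "comment"]

# Two commands: open stdin ("-"), then dive into row 0 of that sheet.
COMMANDS = [
    {"longname": "open-file", "input": "-", "keystrokes": "o"},
    {"sheet": "-", "row": "0", "longname": "dive-row", "keystrokes": "^J"},
]


def write_dive_script(path="dive.vd"):
    """Write the command log as tab-separated text (hypothetical helper).

    Fields that a command does not set are written as empty strings.
    """
    with open(path, "w", newline="", encoding="utf-8") as f:
        writer = csv.DictWriter(
            f,
            fieldnames=FIELDS,
            delimiter="\t",
            restval="",
            lineterminator="\n",
        )
        writer.writeheader()
        writer.writerows(COMMANDS)
    return path


if __name__ == "__main__":
    write_dive_script()
```

The resulting file would then be passed with -p dive.vd as suggested above, for example by adding it to the earlier pipeline: vd -b -f html -p dive.vd -o out.csv.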

dive.vd.txt

saulpw on 18 Jun 2019

❤2

@saulpw you are really brilliant, I'm impressed. VisiData is a kind of magic.

aborruso on 18 Jun 2019

😄2

@saulpw I have added a recipe to my Italian VisiData guide: https://github.com/ondata/guidaVisiData/blob/master/testo/README.md#Salvare-una-tabella-HTML-in-CSV-a-partire-da-una-pagina-web

Thank you again

aborruso on 18 Jun 2019

❤1

Fixed for the html loader in f55de386d48aa5064f0b58cef3428e136cfc78ce; other loaders that have a sheet index still need changes.

saulpw on 21 Aug 2019

To-do to resolve this issue:

Fix loaders with sheet index to have rowdef sheets.
Write above requirement into book/loaders.md.
Improve startup with large files to remove sync(); file should load sync, cursor should jump after load completes (including ^C), or after sheet/row/col is available, if possible.

saulpw on 5 Oct 2019

The IndexSheet has been developed (see visidata/sheets.py). It has the attribute rowtype = 'sheets' by default.

Loaders to be ported:

[X] html
[X] xls
[X] xlsx
[X] xlsb
[X] hdf5
[X] sqlite
[ ] postgres

Misc:

[ ] requirements need to be added to loaders.md

anjakefala on 10 Nov 2019

CLI syntax is +:<sheet>:<row>:<col>.

Use +:subsheet:: to ignore row and col.
If there is more than one top-level source, you can name the top-level index: +toplevel:subsheet::
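To make the shape of that argument concrete, here is a small illustrative parser. It is only a sketch of the syntax as described in this comment, not VisiData's actual implementation: the names StartPos and parse_start_arg are hypothetical, the sketch assumes exactly four colon-separated fields after the +, and it keeps row and col as strings because the comment does not say how they are interpreted.

```python
from typing import NamedTuple, Optional


class StartPos(NamedTuple):
    """Fields of a +toplevel:subsheet:row:col argument (hypothetical type)."""
    toplevel: Optional[str]
    subsheet: Optional[str]
    row: Optional[str]
    col: Optional[str]


def parse_start_arg(arg: str) -> StartPos:
    """Split '+toplevel:subsheet:row:col' into its parts.

    Empty fields become None, so '+:subsheet::' means
    "no toplevel named, this subsheet, ignore row and col".
    """
    if not arg.startswith("+"):
        raise ValueError("start argument must begin with '+'")
    parts = arg[1:].split(":")
    if len(parts) != 4:
        # Assumption of this sketch: all four fields are always present.
        raise ValueError("expected four ':'-separated fields")
    return StartPos(*(p or None for p in parts))


if __name__ == "__main__":
    print(parse_start_arg("+:subsheet::"))
    print(parse_start_arg("+toplevel:subsheet::"))
```

Under these assumptions, +:subsheet:: leaves toplevel, row and col unset, while +toplevel:subsheet:: additionally names the top-level source.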

saulpw on 10 Nov 2019

Hi @saulpw, if I run

curl -L "https://en.wikipedia.org/wiki/Olympic_medal" | vd -f html +:table_2:1:1

vd does not open table_2. What's wrong with my command?

vd 2 is really great!

aborruso on 23 Apr 2020

Hey @aborruso!

Can you please open a bug report, and link to this issue?

Otherwise, there is no good way for me to remember to follow up on this potential bug. 😅

anjakefala on 23 Apr 2020
