Visidata: Crash (stack overflow) with self-referential column expressions

Created on 6 Sep 2019  路  2Comments  路  Source: saulpw/visidata

  • OS: Linux (Arch)
  • VisiData: v1.5.2 and v2.-1dev
  • Python: 3.7.4

Not sure if this is obvious, expected, a bug, or something I've overlooked in the documentation, but it's possible to crash VisiData (and python) if a column expression refers to its own column name, something which it's easy to do inadvertently/indirectly when columns are renamed.

In the attached test case, I have a bunch of TV episodes I want to assign a short season/episode identifier to, e.g. Season 6, Episode 7 -> "06x07". The generated column name is ugly and I have no particular use for its constituent columns after combining them so I name the newly-created column episode (and possibly hide the old season and episode columns).

What happens next varies. Sometimes nothing happens (i.e. no error). Sometimes I get an immediate error (red exclamation marks in the new episode column's cells). Sometimes I trigger the error if I re-order the columns (e.g. moving the new episode column before the old one) and clear it if I move the column back. And sometimes VisiData crashes (with a mangled python stack trace) and python crashes ("core dumped") if I sort an unrelated column (as demonstrated in the attached test case).

In the error case, I get the core dump if I inspect any of the flagged cells with z Ctrl-E.

In case anyone else encounters the same issue, the stack trace of the crashed python process includes the following:

_PyBytes_Resize (in /usr/lib/libpython3.7m.so.1.0)
??? (in /usr/lib/libpython3.7m.so.1.0)
_PyMethodDef_RawFastCallDict (in /usr/lib/libpython3.7m.so.1.0)
_PyCFunction_FastCallDict (in /usr/lib/libpython3.7m.so.1.0)
??? (in /usr/lib/libpython3.7m.so.1.0)
_PyMethodDef_RawFastCallDict (in /usr/lib/libpython3.7m.so.1.0)
_PyCFunction_FastCallDict (in /usr/lib/libpython3.7m.so.1.0)
_PyObject_CallMethodId_SizeT (in /usr/lib/libpython3.7m.so.1.0)
??? (in /usr/lib/libpython3.7m.so.1.0)
_PyMethodDef_RawFastCallKeywords (in /usr/lib/libpython3.7m.so.1.0)
_PyMethodDescr_FastCallKeywords (in /usr/lib/libpython3.7m.so.1.0)
bug

Most helpful comment

This was great, @chocolateboy, thanks. We spent the weekend combing through the expr eval code, took notes for the book, fixed up the api, and VisiData now catches recursive expressions before the interpreter does.

With the above commit:

  • VisiData should not crash like this anymore.
  • For an identifier, the first so-named column in the columns list (shown or hidden) is used (this was also the previous behavior).
  • In your repro, the self-referential column starts out showing an error, but if you slide it to the right past the other episode, then it resolves properly (errors are not cached).
  • but values are cached, so sliding it back keeps the value; to clear the cache and see the error again, use cache-col (z').

The error message for your specific case is not great, but I'm not sure what to do about that. "TypedExceptionWrapper" is not very informative, but in this case it means you're using an error value from another column as a value in the expression.

All 2 comments

Yeah that's a good one, @chocolateboy. I figured that using the first one in the columns list seemed reasonable, and it seemed like a cell-error was appropriate if the column itself was first and thus self-referential. But this should not result in a significant performance penalty, much less an actual crash. I think I'm seeing the crash during replay of your test case, but it doesn't seem to happen interactively. I'll have to investigate.

This was great, @chocolateboy, thanks. We spent the weekend combing through the expr eval code, took notes for the book, fixed up the api, and VisiData now catches recursive expressions before the interpreter does.

With the above commit:

  • VisiData should not crash like this anymore.
  • For an identifier, the first so-named column in the columns list (shown or hidden) is used (this was also the previous behavior).
  • In your repro, the self-referential column starts out showing an error, but if you slide it to the right past the other episode, then it resolves properly (errors are not cached).
  • but values are cached, so sliding it back keeps the value; to clear the cache and see the error again, use cache-col (z').

The error message for your specific case is not great, but I'm not sure what to do about that. "TypedExceptionWrapper" is not very informative, but in this case it means you're using an error value from another column as a value in the expression.

Was this page helpful?
0 / 5 - 0 ratings