Pants: [pantsd] `./pants repl` for python, scala echos input 2x

Created on 6 Dec 2017  路  9Comments  路  Source: pantsbuild/pants

seen in master - import sys is seen 2x, when only typed once:

[omerta pants-release (master)]$ git rev-parse HEAD
069cf3d151be919a0410d69a3db39d3484a3dbca
[omerta pants-release (master)]$ ./pants -q repl 3rdparty/python:psutil

Python 2.7.10 (default, Dec 16 2015, 14:09:45) 
[GCC 4.2.1 Compatible Apple LLVM 7.0.2 (clang-700.1.81)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
(InteractiveConsole)
>>> import sys
import sys
>>>

expected:

[omerta pants-release (master)]$ python2.7
Python 2.7.13 (default, Mar  2 2017, 16:22:01) 
[GCC 4.2.1 Compatible Apple LLVM 8.0.0 (clang-800.0.42.1)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> import sys
>>> 
bug pantsd

Most helpful comment

I think I've found a happy medium between direct exec + nailgun tty control in #5352 by passing tty paths as env vars and directly opening/addressing the clients tty as a file in daemon runs - seems viable fwict and both python and scala repls work identically to non-daemon runs (w/ first class ^D/line buffering/etc - all buttery smooth) with the added bonus of no middle man and no reader/writer threads.

this approach does come with a bit of a caveat, which is that we have to be judicious about subprocess cleanup in the face of a ^C/^\ or orphaned subprocesses still attached to the tty can continue to read/write outside of a pants run. more work on that aspect needed to land, but I'm way less concerned about a bigger risk here now.

All 9 comments

more context via debugging:

pid mappings:

  501 72484  1642   0  6:51PM ttys003    0:01.14 python /Users/kwilson/dev/pants/src/python/pants/bin/pants_loader.py repl 3rdparty/jvm/
org/mockito::
  501 72656     1   0  6:51PM ??         0:01.62 pantsd-runner [./pants repl 3rdparty/jvm/org/mockito::]

debug logging w logging in NailgunProtocol.{read,write}_chunks + stdio_as in the form <pid>@<thread>: <msg>:

72656@Thread-3: $$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$spawned pipe pair: 20<->21
72656@Thread-3: $$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$spawned pipe pair: 22<->23
72656@Thread-3: $$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$spawned pipe pair: 25<->26
72656@Thread-3: stdout->23 stderr->26 stdin->20
...
72484@NailgunStreamWriter: DATA from 0 (of [0]): 'qq\n'
72484@NailgunStreamWriter: --0-->|'qq\n'|
72656@NailgunStreamStdinReader: <--0--|'qq\n'|
72656@NailgunStreamWriter: DATA from 22 (of [22, 25]): 'q'
72656@NailgunStreamWriter: --1-->|'q'|
72656@NailgunStreamWriter: DATA from 22 (of [22, 25]): 'q'
72656@NailgunStreamWriter: --1-->|'q'|
72484@MainThread: <--1--|'q'|
72484@MainThread: <open file '<stdout>', mode 'w' at 0x100264150>.write(*('q',), **{}) -> None
72484@MainThread: <open file '<stdout>', mode 'w' at 0x100264150>.flush(*(), **{}) -> None
72484@MainThread: <--1--|'q'|
72484@MainThread: <open file '<stdout>', mode 'w' at 0x100264150>.write(*('q',), **{}) -> None
72484@MainThread: <open file '<stdout>', mode 'w' at 0x100264150>.flush(*(), **{}) -> None
72656@NailgunStreamWriter: DATA from 22 (of [22, 25]): '\r\r\n'
72656@NailgunStreamWriter: --1-->|'\r\r\n'|
72484@MainThread: <--1--|'\r\r\n'|
72484@MainThread: <open file '<stdout>', mode 'w' at 0x100264150>.write(*('\r\r\n',), **{}) -> None
72484@MainThread: <open file '<stdout>', mode 'w' at 0x100264150>.flush(*(), **{}) -> None
72656@NailgunStreamWriter: DATA from 22 (of [22, 25]): '<console>:12: error: not found: value qq\r\n       qq\r\n       ^\r\n'
72656@NailgunStreamWriter: --1-->|'<console>:12: error: not found: value qq\r\n       qq\r\n       ^\r\n'|
72484@MainThread: <--1--|'<console>:12: error: not found: value qq\r\n       qq\r\n       ^\r\n'|
72484@MainThread: <open file '<stdout>', mode 'w' at 0x100264150>.write(*('<console>:12: error: not found: value qq\r\n       qq\r\n       ^\r\n',), **{}) -> None
72484@MainThread: <open file '<stdout>', mode 'w' at 0x100264150>.flush(*(), **{}) -> None
72656@NailgunStreamWriter: DATA from 22 (of [22, 25]): '\r\nscala> '
72656@NailgunStreamWriter: --1-->|'\r\nscala> '|
72484@MainThread: <--1--|'\r\nscala> '|
72484@MainThread: <open file '<stdout>', mode 'w' at 0x100264150>.write(*('\r\nscala> ',), **{}) -> None
72484@MainThread: <open file '<stdout>', mode 'w' at 0x100264150>.flush(*(), **{}) -> None

we first see a client-side NailgunStreamWriter read 'qq\n' from stdin on the client and send a type 0 chunk payload w/ a corresponding read by the runner-side NailgunStreamStdinReader:

72484@NailgunStreamWriter: DATA from 0 (of [0]): 'qq\n'
72484@NailgunStreamWriter: --0-->|'qq\n'|
72656@NailgunStreamStdinReader: <--0--|'qq\n'|

immediately after, the problem begins: the runner-side NailgunStreamWriter reads the 'qq' (now with an appended \r\r\n) from fd 22 (the read end of sys.stdout) and emits it over the socket as chunktype 1 (stdout), which is written to the console of the thin client:

72656@NailgunStreamWriter: DATA from 22 (of [22, 25]): 'q'
72656@NailgunStreamWriter: --1-->|'q'|
72656@NailgunStreamWriter: DATA from 22 (of [22, 25]): 'q'
72656@NailgunStreamWriter: --1-->|'q'|
72484@MainThread: <--1--|'q'|
72484@MainThread: <open file '<stdout>', mode 'w' at 0x100264150>.write(*('q',), **{}) -> None
72484@MainThread: <open file '<stdout>', mode 'w' at 0x100264150>.flush(*(), **{}) -> None
72484@MainThread: <--1--|'q'|
72484@MainThread: <open file '<stdout>', mode 'w' at 0x100264150>.write(*('q',), **{}) -> None
72484@MainThread: <open file '<stdout>', mode 'w' at 0x100264150>.flush(*(), **{}) -> None
72656@NailgunStreamWriter: DATA from 22 (of [22, 25]): '\r\r\n'
72656@NailgunStreamWriter: --1-->|'\r\r\n'|
72484@MainThread: <--1--|'\r\r\n'|
72484@MainThread: <open file '<stdout>', mode 'w' at 0x100264150>.write(*('\r\r\n',), **{}) -> None
72484@MainThread: <open file '<stdout>', mode 'w' at 0x100264150>.flush(*(), **{}) -> None

however, in the case of DaemonPantsRunner sys.stdout is patched for logging in the same fashion as the emissions from MainThread, but we see no python-side writes to that object. this leads me to believe that the subprocess itself is directly echoing this back to an inherited fd.

this is reproducible in the normal python repl, the ipython repl and the scala repl. one other odd bit here is the newline signature change on the stdout write-back. in the python repls, this is a \r\n (vs \r\r\n for scala repl). puzzling.

a dtrace trace of all write syscalls on the child repl process seems to confirm that the subprocess is echoing back output via write().

this is definitely related to tty settings. it seems the \r\n injections is a feature called terminal 'newline translation'.

putting the os.openpty() pipes into raw mode resolves the problem in the python repl case, but doesn't solve for the general case - which I believe is one of e.g. a subprocess (like a repl) attempting to control the tty for certain behaviors via e.g. stty in the runner's process context and not in the thin clients.

good reads here:

https://blog.nelhage.com/2009/12/a-brief-introduction-to-termios/
https://blog.nelhage.com/2009/12/a-brief-introduction-to-termios-termios3-and-stty/

to be clear: this goes deeper than just echoing - currently in the daemon, things like up/down/left/right/^D in the repl don't work as expected either because those are all functions of the terminal driver. I'd also expect e.g. a ./pants run that launched a full blown curses app to not work either.

I think longer term, the v2-y solution here may be to directly execv a built product (an external binary, test or repl entrypoint) directly from the thin client as console task (effectively removing the middle man - this would be the cleanest approach).

the interim solution needs more consideration. something along the lines of:

  • write/emulate a small terminal driver w/ "good enough" settings and/or something controllable (e.g. via contextmanager to wrap repl/run with config like 'echo off', etc).
  • make the nailgun protocol aware of tty settings and send delta events on setting changes (if those changes are even recognized/discoverable by the pty pair via normal e.g. stty means)?
  • some other UNIX-y way of connecting the pty on the pantsd-runner and tty of the thin client (setpgid may help at least for signaling..). or maybe some way to open/read/write directly from the tty as file (/dev/...).

John mentioned tmux and ssh as good reference implementations to dig into.

also: https://github.com/nelhage/reptyr

Direct execv-ing the built product sounds excellent to me :)

As mentioned in our discussion, I'm less sure that having the client execute any custom code is a good idea, as one of the benefits of the nailgun client/server model is that there is some degree of standardization. If the nailgun project actually manages to resurrect itself, we might regret diverging further from the protocol.

If the tty problem is well enough defined, and we have the appetite for it, proposing changes to the protocol to allow for transparent usage of tty-aware programs would be interesting. Maybe we don't have the appetite for it though.

I think I've found a happy medium between direct exec + nailgun tty control in #5352 by passing tty paths as env vars and directly opening/addressing the clients tty as a file in daemon runs - seems viable fwict and both python and scala repls work identically to non-daemon runs (w/ first class ^D/line buffering/etc - all buttery smooth) with the added bonus of no middle man and no reader/writer threads.

this approach does come with a bit of a caveat, which is that we have to be judicious about subprocess cleanup in the face of a ^C/^\ or orphaned subprocesses still attached to the tty can continue to read/write outside of a pants run. more work on that aspect needed to land, but I'm way less concerned about a bigger risk here now.

Sounds like a nice middleground! :)

Was this page helpful?
0 / 5 - 0 ratings