Nix: remote build of 'silent' packages fails

Created on 5 Jan 2021 · 2 Comments · Source: NixOS/nix

Describe the bug
Building some packages on remote hosts fails. The affected packages produce no terminal output for long periods, so the SSH connection gets closed for inactivity.

To Reproduce
Steps to reproduce the behavior:

  1. Set up remote building: https://nixos.wiki/wiki/Distributed_build
  2. Build qtwebengine with -j0 to force a remote build.
  3. If the build doesn't produce warnings fast enough, you'll get this error on the server:
    Jan 05 13:33:52 SERVER systemd-logind[785]: Session 17 logged out. Waiting for processes to exit.
    Jan 05 13:33:52 SERVER systemd-logind[785]: Removed session 17.
    Jan 05 13:33:55 SERVER nix-daemon[2645]: unexpected Nix daemon error: writing to file: Broken pipe
    And on the client it will fail after its own timeout period.
    ```
    ...
    ../../3rdparty/chromium/services/network/trust_tokens/trust_token_request_redemption_helper.cc:59:31: warning: suggest parentheses around '&&' within '||' [-Wparentheses]
    59 | DCHECK(request->initiator() &&
    | ~~~~~^~
    60 | request->initiator()->scheme() == url::kHttpsScheme ||
    | ~~~~~~~~~~~
    ../../3rdparty/chromium/base/logging.h:808:54: note: in definition of macro 'DCHECK'
    808 | #define DCHECK(condition) EAT_STREAM_PARAMETERS << !(condition)
    | ^~~~~
client_loop: send disconnect: Broken pipe
error: unexpected end-of-file
builder for '/nix/store/9qskm7w05npz9vsh4r65dsjk11yvwi8m-qtwebengine-5.15.2.drv' failed with exit code 1
cannot build derivation '/nix/store/xkr3cs62lf4lbi9bdswl7nsvbjcfwcv6-zoom-us-5.4.53350.1027.drv': 1 dependencies couldn't be built
...
```

Expected behavior
Remote building should not need manual workarounds in SSH configs. nixos-rebuild -j 0 should always work, as long as there is a network connection.
Maybe nix could send some sort of heartbeat packets over the same connection?

# nix-env --version
nix-env (Nix) 2.3.10
# nixos-version
21.03.git.014440d7105 (Okapi)
bug

All 2 comments

SSH can already do this, see the ServerAliveInterval and TCPKeepAlive options in ssh_config.
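
For illustration, a minimal sketch of what that workaround might look like on the client. The host alias `builder` and the interval values are assumptions, not something taken from this thread, and the file location assumes builds are dispatched by the root-owned nix-daemon, so the options would go in root's SSH config (or the system-wide `ssh_config`):

```
# /root/.ssh/config on the client (assumed location, see above)
# "builder" is a hypothetical alias for the remote build machine.
Host builder
    # Send an application-level keepalive through the encrypted channel
    # after 60 seconds without data from the server.
    ServerAliveInterval 60
    # Give up after 3 unanswered keepalives (this is also the OpenSSH default).
    ServerAliveCountMax 3
    # TCP-level keepalives; already "yes" by default, shown for completeness.
    TCPKeepAlive yes
```

With `ServerAliveInterval` set, the client generates traffic even while the builder is silent, which should keep idle-timeout logic between the two machines from dropping the session. Whether it is enough depends on where the timeout is actually enforced.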

I know remote builds are kinda high-level, but it's still bad UX. I love the deterministic approach nix's ecosystem takes, and this doesn't feel right, since the only exhausted resource is an SSH/TCP heartbeat.

If you think everybody should solve this on their own, feel free to close this ticket.

p.s.: since it's my first interaction with @edolstra: Thanks for (starting) nix and the ecosystem around it!
