Nixpkgs: Reduce default value for "binary-caches-parallel-connections" from 150 to 40

Created on 26 Aug 2015 · 12 comments · Source: NixOS/nixpkgs

When nix downloads many files (typically during nixos-rebuild), I often see the following error:

download-from-binary-cache.pl: still waiting for ... after 5 seconds

repeated dozens of times; the downloads do not make any progress. This issue was reported earlier, by multiple users, as https://github.com/NixOS/nixpkgs/issues/5546, but this URL returns 404 now.

The problem is probably related to the network equipment behind my internet connection. It looks like the router does not like download-from-binary-cache.pl suddenly opening 150 concurrent connections to the (same?) server. It could also be related to the fact that I'm running NixOS inside VirtualBox on OS X.

A workaround for me is to add binary-caches-parallel-connections = 40 to my /etc/nix/nix.conf. If you're using nix-env, you can pass --option binary-caches-parallel-connections 40 instead; for nixos-rebuild, nix.conf seems to be the place to change it. (As a side note, it's actually not easy to change this setting for nixos-rebuild, because nix.conf lives in the read-only nix store and can normally only be changed by running nixos-rebuild. I temporarily replace it with a regular file to work around this chicken-and-egg problem.)
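For concreteness, the two workarounds look roughly like this (the option name and the value 40 are taken from this thread; the nix-env package is only a placeholder):

    # in /etc/nix/nix.conf (picked up by nixos-rebuild)
    binary-caches-parallel-connections = 40

    # one-off override when using nix-env ("hello" is just an example package)
    nix-env -iA nixpkgs.hello --option binary-caches-parallel-connections 40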

This problem occurs frequently. My take is that the default of 150 connections is excessive for many home routers. How about reducing the default to 40?

enhancement

All 12 comments

:+1: from me. I've set that option to "50" on all my machines, too, because I've felt that 150 connections is a bit excessive.

Hmm, I wonder. In my firefox, network.http.max-connections is set to 256 (by upstream). Perhaps our connections are behaving differently.

In my firefox, network.http.max-persistent-connections-per-server is set to 6; I think that would be a more appropriate comparison.

Note that as far as I can tell, this only affects checking for the existence of .nar files on the binary caches -- the actual package download is performed using curl, which shouldn't open as many concurrent connections. I could be wrong about this though.

@vcunat, maybe Firefox usually doesn't need that many parallel connections in practice?

@pesterhazy: these connections to download *.narinfo files are _not_ persistent, I believe. I think it's one query per connection so the server can handle the queries in parallel. I'm not knowledgeable in these matters, but I guess it's only HTTP/2 that can handle multiple parallel requests in one connection (well, maybe SPDY can already).

@vcunat, I'm sure you're right, but my point was just that Firefox won't open more than 6 connections to the same server (persistent or not). Ideally the perl script should reuse connections (using HTTP 1.1 Keep-Alive), but that's a further optimization and not what I'm proposing here.

IIRC the script does reuse connections, but that's not the main point here anyway.

:+1:

I have the same problem. Where can I set this parameter? I tried setting nix.binary-caches-parallel-connections in configuration.nix, but that option does not exist. Should we add it?

nix.extraOptions -> /etc/nix/nix.conf. See man nix.conf for a description of the options.
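In other words, something along these lines in configuration.nix should do it (a sketch only; nix.extraOptions appends raw text to the generated nix.conf, and 40 is just the value proposed in this issue):

    { config, pkgs, ... }:
    {
      # Extra lines appended verbatim to the generated /etc/nix/nix.conf
      nix.extraOptions = ''
        binary-caches-parallel-connections = 40
      '';
    }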

With binary-caches-parallel-connections = 1 in my nix.conf (set by remounting /nix/store read-write), this same error still happens, which means that your "fix" is never going to work. I think this program is just flawed in so many ways; the only thing that will really crush this issue, which has been annoying users since at least 2013 [1] (and likely before), is to rewrite the program in a more intelligent fashion.

If the same computer is able to run a web browser to download the _very same_ file, and your program cannot, your program is just broken. It is irresponsible to continue to distribute this program, IMHO.

If the program notices that downloading is extremely slow or makes no progress for some reason, it should try a different method of getting the same bytes and do some root-cause analysis of what the issue might be. As long as the network itself is not broken (it could check by fetching some configurable list of URLs of major popular websites), this is a critical piece of infrastructure that should always work.

Can you please explain why after at least three years you still have failed to implement a correct version of this program? Is it that you intentionally want to write broken software? Is it just that your skills are limited? By "you" I am talking about everyone who is responsible for a NixOS release.

I am just trying to figure out what the problem is, because _clearly_ there is a problem.

[1] http://lists.science.uu.nl/pipermail/nix-dev/2013-August/011518.html
