nix-channel --update downloads are not efficient: every invocation downloads the same file again, even if it is already available locally.
Various mechanisms exist to determine whether a particular file has already been downloaded; one is wget's -N (timestamping) flag. I don't care which mechanism is used, as long as this is fixed.
Priority for fixing this is low, as it is an optimization.
The nixos.org channels aren't sending a Last-Modified header, so that will also need to be added.
This should be fairly trivial using the "If-Modified-Since" header in the Perl code at https://github.com/NixOS/nix/blob/75d2492f20dc513337de3ef2d45e1d5c68c7dff8/scripts/nix-channel.in#L102. But I think that whole script is going to be rewritten in C++ if #341 is ever resolved.
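For illustration, a minimal sketch of the If-Modified-Since flow in Python (the real script is Perl; fetch_if_modified and the cached Last-Modified value are hypothetical names, not anything in nix-channel today):

```python
import urllib.request
import urllib.error

def is_fresh(status_code):
    """304 Not Modified means our cached copy is still current."""
    return status_code == 304

def fetch_if_modified(url, last_modified=None):
    """Return (content, new_last_modified), or (None, last_modified)
    when the server answers 304 Not Modified."""
    req = urllib.request.Request(url)
    if last_modified:
        # Ask the server to skip the body if the file is unchanged
        # since the timestamp we saved from the previous download.
        req.add_header("If-Modified-Since", last_modified)
    try:
        with urllib.request.urlopen(req) as resp:
            return resp.read(), resp.headers.get("Last-Modified")
    except urllib.error.HTTPError as e:
        if is_fresh(e.code):
            return None, last_modified
        raise
```

The saved Last-Modified value would have to be stored next to the channel state between runs for this to do anything.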
The approach used in this script could probably be reused here.
$ curl -D - http://nixos.org/channels/nixos-16.03/nixexprs.tar.xz -o /dev/null
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
  0     0    0     0    0     0      0      0 --:--:-- --:--:-- --:--:--     0
HTTP/1.1 302 Found
Date: Mon, 25 Jul 2016 11:23:02 GMT
Server: Apache/2.4.18 (Unix) OpenSSL/1.0.2h PHP/5.6.23
Location: http://nixos.org/releases/nixos/16.03/nixos-16.03.1143.6d520ce/nixexprs.tar.xz
Content-Length: 262
Content-Type: text/html; charset=iso-8859-1
100   262  100   262    0     0   1881      0 --:--:-- --:--:-- --:--:--  2620
We'd need to set a Last-Modified header (or similar) in the Apache configuration.
Using mod_expires (see https://github.com/h5bp/html5-boilerplate/blob/master/dist/.htaccess#L847 for an example), we should be able to modify https://github.com/NixOS/nixos-org-configurations/blob/master/nixos-org/webserver.nix and get the desired header to be present.
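A fragment along these lines is roughly what mod_expires would need (the MIME type for .tar.xz is an assumption; check what this Apache instance actually reports):

```apache
<IfModule mod_expires.c>
  ExpiresActive on
  # Short lifetime so clients revalidate often; the channel
  # tarballs change with every release.
  ExpiresByType application/x-xz "access plus 1 hour"
</IfModule>
```

Note that Apache already sends Last-Modified for files it serves directly from disk; the 302 redirect above carries no Last-Modified of its own, so the useful header may have to come from the final target of the redirect instead.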
Related issue: If you have two channels pointing to the same URL, nix-channel --update downloads it twice.
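The duplicate-download half of this could be fixed by grouping channels by URL before fetching; a sketch (function name hypothetical):

```python
def group_channels_by_url(channels):
    """channels: iterable of (name, url) pairs.
    Returns {url: [names]}, so each distinct URL is fetched only
    once and the result shared among the channels pointing at it."""
    by_url = {}
    for name, url in channels:
        by_url.setdefault(url, []).append(name)
    return by_url
```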
If we can just shift the channels over to S3, then they'd show an etag, which is (almost always) the MD5 of the content.
@copumpkin opened https://github.com/NixOS/nixos-channel-scripts/issues/7
We're now serving channels from S3, but this is still not fixed. nix-channel --update basically calls:
$ nix-prefetch-url https://d3g5gsiof5omrk.cloudfront.net/nixos/16.09/nixos-16.09beta480.2d463a3/nixexprs.tar.xz
downloading ‘https://d3g5gsiof5omrk.cloudfront.net/nixos/16.09/nixos-16.09beta480.2d463a3/nixexprs.tar.xz’... [7951/8557 KiB, 646.2 KiB/s]
path is ‘/nix/store/fm9glgmvjs6ga3k2h202mqq0nsrlr0ll-nixexprs.tar.xz’
161lz9xg38g34qxagxjpk5dii3s1d9441svx6csbiklh3xss1p2d
And the URL is fetched in full each time.
@domenkozar yeah, it won't magically work, but we now have the tools to make it work.
Basically, you'd call a HEAD on the URL, which should return an ETag that corresponds to the MD5 of the object (assuming it wasn't uploaded as a multi-part upload, which we control). We check that MD5, then only download if it changed.
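A sketch of that check in Python (assuming, as above, a non-multipart upload, so the ETag is the quoted hex MD5 of the object):

```python
import hashlib

def etag_matches(local_bytes, etag):
    """Compare our local copy against the ETag from a HEAD request.
    S3 returns the ETag wrapped in double quotes."""
    return hashlib.md5(local_bytes).hexdigest() == etag.strip('"')
```

If this returns True, the download can be skipped entirely.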
The alternative, which requires one fewer request, is to use one of the more advanced request headers specified in http://docs.aws.amazon.com/AmazonS3/latest/API/RESTObjectGET.html. We'd compute the MD5 locally and pass it in If-None-Match; if we get a 304 in response, we know nothing has changed, otherwise we get the new file. I'm pretty sure all of that behavior is forwarded properly through CloudFront, but I've only ever used it myself against S3 directly.
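A sketch of that single-request variant; conditional_get and if_none_match_value are hypothetical names for illustration:

```python
import hashlib
import urllib.request
import urllib.error

def if_none_match_value(local_bytes):
    """S3 compares If-None-Match against its ETag, which for a
    non-multipart object is the quoted hex MD5 of the content."""
    return '"%s"' % hashlib.md5(local_bytes).hexdigest()

def conditional_get(url, local_bytes=None):
    """Fetch url, skipping the body transfer when the server's copy
    matches ours. Returns new content, or None on 304 Not Modified."""
    req = urllib.request.Request(url)
    if local_bytes is not None:
        req.add_header("If-None-Match", if_none_match_value(local_bytes))
    try:
        with urllib.request.urlopen(req) as resp:
            return resp.read()
    except urllib.error.HTTPError as e:
        if e.code == 304:
            return None  # unchanged; keep the local copy
        raise
```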
Well, just matching the final URL would be enough for practical purposes, as it contains the shortened commit hash.
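For that, comparing the release directory in the resolved URL would suffice; release_id is a hypothetical helper, with the URL shape taken from the curl output above:

```python
import re

def release_id(resolved_url):
    """Extract the release directory (which embeds the shortened
    commit hash) from the final URL after following redirects,
    e.g. 'nixos-16.03.1143.6d520ce'."""
    m = re.search(r'/([^/]+)/nixexprs\.tar\.xz$', resolved_url)
    return m.group(1) if m else None
```

If the extracted id matches what the channel resolved to last time, the tarball is unchanged and the download can be skipped.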
Anyone working on this? Kind of a buzzkill watching it run a no-op 8729 KiB download on every call to update.
I don't think there's anyone... so it's free for taking ;-)