Nix: fetchGit fails with a not very helpful error message when fetching a revision not in the remote's HEAD

Created on 18 Sep 2018 · 22 Comments · Source: NixOS/nix

Rather than trying to hijack #1923 any further, I instead open a new issue here.

I was trying to use the new (as of Nix 2) way of pinning nixpkgs, but it doesn't work for 01f5e794913a18494642b5f237bd76c054339d61, the most recent nixos-18.03 commit at the time of writing:

$ cat << EOF | nix repl
builtins.fetchGit {
  url = https://github.com/nixos/nixpkgs-channels;
  rev = "01f5e794913a18494642b5f237bd76c054339d61";
}
EOF
Welcome to Nix version 2.1.1. Type :? for help.

fatal: not a tree object
error: program 'git' failed with exit code 128

error: Unexpected linenoise keytype: 0

The problem here is mostly the fatal: not a tree object error. I suspect this is due to fetchGit cloning with --depth 1 or taking some other shortcut, which only considers master (in this case nixos-unstable) commits. 01f5e794913a18494642b5f237bd76c054339d61 isn't an ancestor of master, so it will not be fetched, hence the error.

Or maybe it's some other reason altogether.

Anyway, my current workaround is to use fetchTarball instead. Specifying ref = "nixos-18.03"; as in #1923 also works. I think this would be worth adding to the manual entry.
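For reference, the ref-based workaround can be written out roughly as follows. This is only a sketch: fetchgit_expr is a hypothetical helper name (not part of Nix), and all it does is print a builtins.fetchGit expression that names the branch (ref) containing the pinned commit (rev) alongside the rev itself.

```shell
# Hypothetical helper (not part of Nix): print a builtins.fetchGit
# expression that pins REV and also names the branch REF containing it.
fetchgit_expr() {
  url=$1 ref=$2 rev=$3
  cat <<EOF
builtins.fetchGit {
  url = "$url";
  ref = "$ref";
  rev = "$rev";
}
EOF
}

# Values taken from the report above.
fetchgit_expr https://github.com/nixos/nixpkgs-channels nixos-18.03 \
  01f5e794913a18494642b5f237bd76c054339d61
```

The printed expression can be pasted into a shell.nix or piped into nix repl, as in the original reproduction.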

Error Messages

All 22 comments

This is a problem with Git in general with or without --depth 1 depending upon the circumstances of the repository and the remote. See https://nixos.org/nix/manual/#idm140737317570688

Right, I was a bit thrown off by the error message and thought that the cached git repo was corrupted, so I tried to delete it twice, to no avail. A hint in the error message of fetchGit would be great.

Or a warning saying 'you have provided no ref field, making a shallow clone of the default branch'.

People run into this problem. Example.

Why not fetch the whole thing by default? Fetching nixpkgs was super slow on my machine despite this optimization, so my guess is that it's even better to just fetch the whole repo as a default, or otherwise as a fallback.

I remember reading somewhere that GitHub prefers whole checkouts because a dumb bulk operation is cheaper than a "smart" checkout with lots of random I/O. I suppose that's what I was seeing.

Can we make this work by default?

fetchGit does not do shallow fetches (though it probably should to conserve disk space). It fetches the entire history of the specified ref, and the specified rev must be an ancestor of that.

Perhaps we could have a default without surprises and an optimized one based on the ref?
Come to think of it, wouldn't shallow fetching make the ref optimization even more brittle? Because then you have to make sure that rev is within n commits of ref. We'd probably need to put that under a switch in order not to break virtually all cases where people use ref with branches instead of tags.

I don't think there are any surprises here because this is simply how Git works. It doesn't in general support fetching revs, only refs. (I think Git has a configuration option to allow fetching of unadvertised objects, but GitHub doesn't enable it.) E.g. this doesn't work either:

$ git clone https://github.com/nixos/nixpkgs-channels -b 01f5e794913a18494642b5f237bd76c054339d61
Cloning into 'nixpkgs-channels'...
fatal: Remote branch 01f5e794913a18494642b5f237bd76c054339d61 not found in upstream origin

$ git fetch https://github.com/nixos/nixpkgs-channels 01f5e794913a18494642b5f237bd76c054339d61
error: Server does not allow request for unadvertised object 01f5e794913a18494642b5f237bd76c054339d61

Fetching the entire repository is potentially much more expensive. For example, the Chromium repository reportedly has 500,000 refs. (https://opensource.googleblog.com/2018/05/introducing-git-protocol-version-2.html)

Yes, this is not surprising to someone who is intimately familiar with git's inner workings, but that doesn't make it unsurprising to most people.

I see two scenarios:

A novice user learns about fetchGit and doesn't notice the ref option because it wasn't in the example they read, or perhaps they just learned about this function and want to convert a call to one of the nixpkgs variations to fetchGit and they forget about ref because it's not in those calls. The user seems to call fetchGit just fine, their repos are reasonably sized, but now suddenly there's a weird error about a weird situation that they never had to deal with after many years of using plain git on the command line. This is painful.

The other scenario is where someone does indeed fetch from a huge repository (refs or commits) and is not aware of the ref option because they never had to learn it. This user suffers from slow fetches, but it works.

In my view, it's better to solve the first scenario than to solve the second.

  • less intervention by users required - avoiding frustration and human context switches is more valuable than machine time
  • the first situation is very beginner unfriendly
  • the second situation is still mostly avoidable by recommending ref as a good practice: serves as documentation + improves performance
  • the second situation I expect to be created by power users who are more likely to use the command correctly
  • learning to solve the second situation manually is more fun than having to solve the first situation manually
  • solving the first situation also solves cases where refs are renamed or otherwise changed without releasing the commit

The only reason I can think of to solve only the second scenario is

  • user may not notice potential performance improvement

And that can be solved by looking at the number of refs and ratio of output size / .git size

fetchGit could do this:

  • if ref is not present download the entire repo

    • if .git size > some_size_threshold and output size / .git size is disproportional, emit a message about the ref attribute

    • if #refs > some_refs_threshold, emit a message about the ref attribute

  • if ref is present

    • deep fetch of the single ref

Does this address all concerns?
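To make the proposal above concrete, here is a rough sketch of the decision logic as a shell function. It is not how fetchGit is implemented; fetch_plan, the three threshold variables and their values are all hypothetical, since the proposal only speaks of "some_size_threshold" and "some_refs_threshold".

```shell
# Hypothetical thresholds; the proposal does not give concrete values.
SIZE_THRESHOLD_KIB=102400   # .git size above which a hint is considered
RATIO_THRESHOLD=10          # .git size / output size treated as disproportionate
REFS_THRESHOLD=1000         # number of refs above which a hint is emitted

# fetch_plan REF GIT_KIB OUT_KIB NREFS
# Prints the fetch strategy on the first line, followed by any hints.
fetch_plan() {
  ref=$1 git_kib=$2 out_kib=$3 nrefs=$4
  if [ -n "$ref" ]; then
    # ref present: deep fetch of that single ref, no hints needed.
    echo "fetch full history of ref $ref"
    return 0
  fi
  # ref absent: fall back to downloading the entire repository.
  echo "fetch entire repository"
  if [ "$git_kib" -gt "$SIZE_THRESHOLD_KIB" ] &&
     [ "$git_kib" -gt $((out_kib * RATIO_THRESHOLD)) ]; then
    echo "hint: .git is much larger than the checkout; consider setting ref"
  fi
  if [ "$nrefs" -gt "$REFS_THRESHOLD" ]; then
    echo "hint: repository has many refs; consider setting ref"
  fi
}

fetch_plan "" 2000000 100000 500000   # no ref, large repo with many refs
fetch_plan nixos-18.03 0 0 0          # ref given
```

The sizes and ref count are passed in as arguments so the sketch stays independent of how they would actually be measured.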

To me not specifying a ref is just bad practice. It would be like having fetchurl download an entire website to search for a file with the specified hash, just so the user doesn't have to specify a URI stem. Exaggerating obviously :-)

BTW in the case of GitHub, it's probably better to use fetchTarball https://github.com/nixos/nixpkgs-channels/archive/01f5e794913a18494642b5f237bd76c054339d61.tar.gz since it's a lot faster, uses much less disk space and doesn't require specifying a ref.

I think it's fine to expect users to specify ref. However, the error here is truly awful. It's not even remotely indicative of the root cause; I understand that this error originates from Git but perhaps Nix could do better to guide the user to their mistake if it sees that git fails.
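As an illustration of what such guidance could look like, here is a small sketch. explain_git_error and the wording of the hint are made up for this example and are not anything Nix does today; it also only covers the case where no ref was given at all.

```shell
# explain_git_error REF GIT_MESSAGE
# Hypothetical wrapper logic: repeat git's message and, when no ref was
# given and git reports a missing tree object, append a hint.
explain_git_error() {
  ref=$1 msg=$2
  echo "$msg"
  case $msg in
    *"not a tree object"*)
      if [ -z "$ref" ]; then
        echo "hint: the requested rev may not be reachable from the default branch; try setting 'ref' to a branch or tag that contains it"
      fi
      ;;
  esac
}

explain_git_error "" "fatal: not a tree object"
```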

The direnv activated, pinned nix-shell is such a wonderful workflow, I keep advertising it everywhere.

However, I was just trying to set it up on a new macOS again and I hit a bunch of issues.

Can someone who understands this well clean up the documentation, please?


The https://nixos.wiki/wiki/FAQ/Pinning_Nixpkgs page does not say anything about refs, so this whole conversation doesn't make much sense to me.

The fetchGit example is extra confusing, because its url references https://github.com/nixos/nixpkgs, but the git ls-remote in the comment below it talks about
https://github.com/nixos/nixpkgs-channels

I would think the nixos/nixpkgs-channels repo should be the recommended way of pinning
for "average developers", until Nix Flakes is mature enough.

I'm trying to pin in my shell.nix and use it via the use_nix function I put into my ~/.direnvrc as described in https://github.com/direnv/direnv/wiki/Nix#using-a-global-use_nix-with-garbage-collection-prevention

I'm just getting this error on direnv allow:

++ nix-shell --show-trace --pure --run '"/Users/onetom/.nix-profile/bin/direnv" dump bash'
direnv: ([/Users/onetom/.nix-profile/bin/direnv export bash]) is taking a while to execute. Use CTRL-C to give up.
fatal: not a tree object: d3e6486935981288621cca2eaabe23017f16ab57

My .envrc is:

set -x
use nix
layout python

My shell.nix starts as:

with import (builtins.fetchGit {
    name = "nixpkgs-19.03-darwin";
    url = https://github.com/NixOS/nixpkgs/;
    rev = "d3e6486935981288621cca2eaabe23017f16ab57";
}) {};
...

As mentioned above fetchTarball works:

with import (builtins.fetchTarball {
    name = "nixpkgs-19.03-darwin";
    url = https://github.com/nixos/nixpkgs-channels/archive/d3e6486935981288621cca2eaabe23017f16ab57.tar.gz;
    sha256 = "1k3g4fvg48ffzkfapalghxn3r4s6lgbpsdn04cq0cmn4yphgpkf7";
}) {};
...

However the use_nix script also fiddles with the SSL certs, so curling https
sites doesn't work anymore :(

$ curl https://news.ycombinator.com
curl: (60) SSL certificate problem: self signed certificate in certificate chain
More details here: https://curl.haxx.se/docs/sslcerts.html

curl failed to verify the legitimacy of the server and therefore could not
establish a secure connection to it. To learn more about this situation and
how to fix it, please visit the web page mentioned above.

$ env | rg SSL_CERT
NIX_SSL_CERT_FILE=

This is how I imagine a channel pinning guide:

Pinning to a channel

The latest recommendation I could find proposes to use the builtins.fetchTarball function to obtain a specific version of the package tree.

To construct the URL for the fetchTarball call, we have to determine
the latest commit hash of a channel:

$ git ls-remote https://github.com/nixos/nixpkgs-channels
984851a9bfa3a7b5dacb436d7686f2f09b5e2e85        HEAD
...
a7e559a5504572008567383c3dc8e142fa7a8633        refs/heads/nixos-18.09
a7e559a5504572008567383c3dc8e142fa7a8633        refs/heads/nixos-18.09-small
878531fbdbbe78a6746e90694c20eff8cfb70fae        refs/heads/nixos-19.03
878531fbdbbe78a6746e90694c20eff8cfb70fae        refs/heads/nixos-19.03-small
984851a9bfa3a7b5dacb436d7686f2f09b5e2e85        refs/heads/nixos-unstable
8746c77a383f5c76153c7a181f3616d273acfa2a        refs/heads/nixos-unstable-small
24a7883c2349af5076107dbbb615be09d6025a95        refs/heads/nixpkgs-17.09-darwin
4d48e8106f9fac757b9359b8c8eeec3ca1e35908        refs/heads/nixpkgs-18.03-darwin
68b3bff32da3fadb16d5fdcbca2b4b69f6b97eb6        refs/heads/nixpkgs-18.09-darwin
d3e6486935981288621cca2eaabe23017f16ab57        refs/heads/nixpkgs-19.03-darwin
c0e56afddbcf6002e87a5ab0e8e17f381e3aa9bd        refs/heads/nixpkgs-unstable
9b2c3093d2146f050a3b02bad04531a33789367b        refs/pull/1/head
baaa4b9f817a945873895b0a275200233ab5835f        refs/pull/1/merge
...
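If you only want the hash for one channel, the listing can be filtered. A minimal sketch, assuming the usual "hash, whitespace, ref name" layout of git ls-remote output; channel_commit is a hypothetical helper name:

```shell
# channel_commit BRANCH
# Reads `git ls-remote` output on stdin and prints the commit hash that
# refs/heads/BRANCH points to (prints nothing if the branch is not listed).
channel_commit() {
  awk -v ref="refs/heads/$1" '$2 == ref { print $1 }'
}

# Sample lines copied from the listing above.
channel_commit nixpkgs-19.03-darwin <<'EOF'
878531fbdbbe78a6746e90694c20eff8cfb70fae        refs/heads/nixos-19.03
d3e6486935981288621cca2eaabe23017f16ab57        refs/heads/nixpkgs-19.03-darwin
c0e56afddbcf6002e87a5ab0e8e17f381e3aa9bd        refs/heads/nixpkgs-unstable
EOF
# prints d3e6486935981288621cca2eaabe23017f16ab57
```

In real use the input would come from a pipe instead of the here-document: git ls-remote https://github.com/nixos/nixpkgs-channels | channel_commit nixpkgs-19.03-darwin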

GitHub allows getting a .tar.gz or .zip compressed version of the
repo at any commit, via a URL like:
https://github.com/<owner>/<repo>/archive/<commit>[.tar.gz|.zip]
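As a small sketch of filling in that template, the helper below assembles the URL from its parts. github_archive_url is a hypothetical name, and defaulting to .tar.gz is just a choice made for this example.

```shell
# github_archive_url OWNER REPO COMMIT [EXT]
# Builds the GitHub archive URL for a commit; EXT defaults to tar.gz.
github_archive_url() {
  echo "https://github.com/$1/$2/archive/$3.${4:-tar.gz}"
}

github_archive_url nixos nixpkgs-channels d3e6486935981288621cca2eaabe23017f16ab57
# prints https://github.com/nixos/nixpkgs-channels/archive/d3e6486935981288621cca2eaabe23017f16ab57.tar.gz
```

The resulting URL is what gets passed to nix-prefetch-url --unpack and to the url attribute of fetchTarball.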

Substituting the commit hash from the git ls-remote output into the GitHub archive
URL template, we can calculate a base32-encoded SHA-256 hash of the
unpacked tarball.
Nix uses this to check the integrity of the file on subsequent downloads.

$ nix-prefetch-url --unpack https://github.com/NixOS/nixpkgs-channels/archive/d3e6486935981288621cca2eaabe23017f16ab57.zip
unpacking...
[27.5 MiB DL]
path is '/nix/store/440aawadng6l42y0khr9a91s0qd4xkq7-d3e6486935981288621cca2eaabe23017f16ab57.zip'
1k3g4fvg48ffzkfapalghxn3r4s6lgbpsdn04cq0cmn4yphgpkf7

Since neither hash communicates which channel it represents,
it is worth noting that in the name attribute of the fetchTarball
call.

As a result, your shell.nix should look something like this:

with import (builtins.fetchTarball {
    name = "nixpkgs-19.03-darwin";
    url = https://github.com/nixos/nixpkgs-channels/archive/d3e6486935981288621cca2eaabe23017f16ab57.tar.gz;
    sha256 = "1k3g4fvg48ffzkfapalghxn3r4s6lgbpsdn04cq0cmn4yphgpkf7";
}) {};

let
    nodejs = nodejs-10_x;
in

mkShell rec {
    buildInputs = [ nodejs ];

    shellHook = ''
        # Arbitrary shell script to run, which might introduce impurity!
        # You can reference the package store paths or package attributes:
        echo ${nodejs.name} is stored at \"${nodejs}\"
    '';
}

Channel builds fail regularly, so you might want to pin them to a version
which has been built successfully.

The beginning of the Nixpkgs manual / Contributors Guide
links to the Hydra build system logs:
http://hydra.nixos.org/job/nixpkgs/trunk/unstable#tabs-constituents

You can page back manually to find the latest passing build. If you click on
it, under the _Inputs_ tab you can find the git commit hash in the _Revision_
column. The _Value_ field is https://github.com/NixOS/nixpkgs.git, but
the same commit is accessible via both the nixpkgs and the
nixpkgs-channels repos.

Alternatively you can find the commit hash of the latest successful build more
directly on the non-official https://howoldis.herokuapp.com/ site.

The 1st - _Channel_ - column there links you to a directory, which contains
a more compressed version of the package tree in the nixexprs.tar.xz file.
The directory listing also shows the corresponding SHA-256 hash, but in a
hex encoded format. It's also the hash of the compressed file, not the
unpacked data, so we have to calculate it as above:

$ nix-prefetch-url --unpack https://releases.nixos.org/nixpkgs/nixpkgs-19.09pre188239.c0e56afddbc/nixexprs.tar.xz
unpacking...
[12.1 MiB DL]
02sijmad7jybzwf063aig6bsaw4h85as0ax4x2425c867n62xnxz

I also came up with an approach to combine the stable and the unstable channels, to minimize the number of dependencies downloaded and to maximize use of the binary cache.

Assuming a new package is only available in the unstable channel, but
would work fine when built within the stable channel's environment, we can do the following:

with import (builtins.fetchTarball {
    name = "nixpkgs-19.03-darwin";
    url = https://releases.nixos.org/nixpkgs/19.03-darwin/nixpkgs-darwin-19.03pre173264.d3e64869359/nixexprs.tar.xz;
    sha256 = "1a48845psycjwhp1z88ygcrmw6r4hkhb413c7np38gch6c8jz9n6";
}) {};

let
    unstable = import (builtins.fetchTarball {
       name = "nixpkgs-unstable";
       url = https://releases.nixos.org/nixpkgs/nixpkgs-19.09pre188239.c0e56afddbc/nixexprs.tar.xz;
       sha256 = "02sijmad7jybzwf063aig6bsaw4h85as0ax4x2425c867n62xnxz";
    }) {};

    devd = callPackage (unstable.path + /pkgs/development/tools/devd) { };
    modd = callPackage (unstable.path + /pkgs/development/tools/modd) { };
in

mkShell rec {
    buildInputs = [ devd modd ];
}

I have no idea how good this solution is, but it seems to work...

I've updated the https://nixos.wiki/wiki/FAQ/Pinning_Nixpkgs page to include the ref attribute in the fetchGit call.
I think this issue can be closed.

@onetom Although I appreciate your effort to document pinning, it does not solve the unhelpful error message. It is not a solution to this issue, because even if it is documented everywhere, someone will forget to add the ref and the error message will mislead them.

I agree with bgamari:

I think it's fine to expect users to specify ref. However, the error here is truly awful. It's not even remotely indicative of the root cause; I understand that this error originates from Git but perhaps Nix could do better to guide the user to their mistake if it sees that git fails.

A reminder why it's not helpful, quoting sgraf812:

I was a bit thrown off by the error message and thought that the cached git repo was corrupted, so I tried to delete it twice, to no avail. A hint in the error message of fetchGit would be great.

I found this pretty surprising, despite being pretty familiar with both nix and git. It almost seems like the ref field ought to be mandatory, given this behavior.

To add to this, and to ask that it be revisited or prioritized: I just set up a Nix build for my team. Their first Nix experience was this error after hearing about how "works on my machine could become a thing of the past" :smile:

This issue has been mentioned on NixOS Discourse. There might be relevant details there:

https://discourse.nixos.org/t/builtins-fetchgit-error-not-a-tree-object/6954/2

So it seems there are two ways to go forward as a default

  • require ref
  • not setting ref clones the whole repository (fetching refs/remotes/*)

Either way, fetching the whole repository should be an option because sometimes one has only a rev but not ref. For example Stack tooling in Haskell world operates this way.

A good example that even experienced Nix users get stuck: https://twitter.com/ProgrammerDude/status/1275375411631927297

To me not specifying a ref is just bad practice. It would be like having fetchurl download an entire website to search for a file with the specified hash, just so the user doesn't have to specify a URI stem. Exaggerating obviously :-)

@edolstra I disagree. Let's take import-cargo as an example where we can only specify rev and url (since both are exposed in a Cargo.lock).

Recently I wanted to replace a crate in a personal Rust project with a personal fork, and specified a revision that belongs to a different ref, which then broke the build (while cargo didn't have any issues). Thus I think that it's fine not to specify a ref in some cases.

So it seems there are two ways to go forward as a default

require ref
not setting ref clones the whole repository (fetching refs/remotes/*)

I'll chime in again with my slight modification on the latter of:

"not setting ref clones the whole repository but gives a warning"

That would be kind of annoying in the case of archived checkouts in stack.yaml you linked though @domenkozar, so I'd say there should be a flag to disable that warning so that consumers like haskell.nix could disable it.
