Binderhub: Submodules are checked-out before the branch

Created on 4 Nov 2019 · 6 Comments · Source: jupyterhub/binderhub

I have a branch in a GitHub repo that has no submodules, although master does. When I ask Binder to build that branch, I can see it still checks out master's submodules.
I think the submodules should be checked out after the branch is checked out, not before.

All 6 comments

The code first checks out the main branch of the repo, then switches to the specified ref (e.g. a branch), and then runs a command to sync the submodules.

https://github.com/jupyter/repo2docker/blob/master/repo2docker/contentproviders/git.py#L18-L55

The first checkout uses --recursive, so maybe removing that would fix it? However, I don't really use or understand submodules, so I don't know whether that would break things for repos that do have submodules and want to use them. Do you know more?
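To make the sequence described above concrete, here is a rough shell sketch of it. It is an approximation based only on the description in this thread, not code copied from repo2docker; the function name and its three arguments are made up for illustration.

```shell
# Hypothetical helper approximating the three steps described above.
# Arguments (all illustrative): repository URL, ref to build, target directory.
clone_then_switch() {
    repo=$1 ref=$2 dest=$3
    # Step 1: clone the default branch. --recursive also clones the
    # submodules that the default branch declares.
    git clone --recursive "$repo" "$dest" &&
    # Step 2: switch to the requested ref (branch, tag or commit).
    git -C "$dest" checkout "$ref" &&
    # Step 3: sync submodules for the ref that is now checked out.
    git -C "$dest" submodule update --init --recursive
}
```

With this ordering, step 1 fetches the default branch's submodules before the requested ref is known, which matches the behaviour reported in the issue.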

The ideal solution would be to clone the branch and its submodules directly with a single command: git clone --branch BRANCH --recursive REPO

@betatim Is the git checkout done in two steps for a reason?

I don't know off the top of my head why we do it in two steps.

Submodules and shallow clones are the two topics that come to mind as problematic when we try to do clever things with checkouts. I can't think of anything else.

Would the following work?

git clone --branch WITH-A-REF-HERE --recursive REPO

repo2docker is almost always invoked with a resolved ref rather than a branch name.

I've just tested it: with --branch you can clone a branch but not a specific commit.
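For anyone who wants to reproduce that test without touching a real repository, here is a small sketch that builds a throwaway local repo and tries both cases. The paths, the commit message and the demo identity are all invented for the example.

```shell
# Throwaway demo in a temp dir; every name and path here is illustrative.
demo=$(mktemp -d)
git init -q "$demo/src"
git -C "$demo/src" -c user.name=demo -c user.email=demo@example.com \
    commit -q --allow-empty -m "first"
sha=$(git -C "$demo/src" rev-parse HEAD)
branch=$(git -C "$demo/src" symbolic-ref --short HEAD)

# A branch name is accepted by --branch.
git clone -q --branch "$branch" "$demo/src" "$demo/by-branch" && branch_ok=yes

# A commit SHA is rejected: --branch only resolves branch and tag names.
git clone -q --branch "$sha" "$demo/src" "$demo/by-sha" 2>/dev/null || sha_ok=no

echo "branch: ${branch_ok:-no}, sha: ${sha_ok:-yes}"
```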

However, git clone has a -n option that stops it from automatically checking out HEAD, which I think will do the trick:

$ git clone -n https://github.com/ome/omero-py.git xxx

$ cd xxx/

$ git branch
* master

$ ls -a
.    ..   .git

Then I think these commands will check out the required submodules:

git checkout COMMIT
git submodule init
git submodule update
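Putting the -n clone and the follow-up commands together, the proposal could look roughly like the sketch below. The function name and arguments are hypothetical, and it has not been tried inside repo2docker. It also folds git submodule init and git submodule update into the single equivalent git submodule update --init, with --recursive added on the assumption that nested submodules are wanted.

```shell
# Hypothetical helper for the proposed ordering: no checkout at clone time,
# so only the submodules declared by the requested ref are ever fetched.
# Arguments (all illustrative): repository URL, ref to build, target directory.
fetch_ref_with_submodules() {
    repo=$1 ref=$2 dest=$3
    # -n is the short form of --no-checkout: history is fetched, but the
    # working tree stays empty and no submodules are cloned yet.
    git clone --no-checkout "$repo" "$dest" &&
    # Check out the requested ref; a commit SHA works here, unlike --branch.
    git -C "$dest" checkout "$ref" &&
    # Initialise and update only the submodules this ref declares.
    git -C "$dest" submodule update --init --recursive
}
```

For the repo in the example above it would be called along the lines of fetch_ref_with_submodules https://github.com/ome/omero-py.git COMMIT xxx, with COMMIT left as a placeholder.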

Do you want to make a PR trying this out, @davidbrochart?
