Binderhub: Issue with migrating master -> main

Created on 28 Aug 2020 · 9 Comments · Source: jupyterhub/binderhub

We have recently moved the master branch to main on all the Bokeh repos except our bokeh-notebooks repo. This is because our docs headers link to tutorials at a URL that explicitly includes "master":

https://mybinder.org/v2/gh/bokeh/bokeh-notebooks/master?filepath=tutorial%2F00%20-%20Introduction%20and%20Setup.ipynb

If the master branch is deleted, this binder fails to launch. I had heard GH is now redirecting deleted branches to the new default branch, but even if this is true apparently it is not enough to get binder to launch from that URL if only main exists.

Lots of projects are currently considering or implementing such changes. Is there any possible solution to help support this kind of change transparently on the Binder side? We could obviously try to update things on our end, but that would mean changing tens of thousands of files on S3, which would incur both operational and financial risk.

cc @betatim

enhancement

All 9 comments

Thank you for opening your first issue in this project! Engagement like this is essential for open source projects! :hugs:

If you haven't done so already, check out Jupyter's Code of Conduct. Also, please try to follow the issue template, as it helps other community members to contribute more effectively.
You can meet the other Jovyans by joining our Discourse forum. There is also an intro thread there where you can stop by and say Hi! :wave:

Welcome to the Jupyter community! :tada:

The official advice from GitHub is that although they're supporting repos that have already made changes they advise waiting a bit:
https://github.com/github/renaming#later-this-year

And, we're also looking into redirecting users who git fetch or git clone the old branch name to the new branch name (with a warning and instructions to update their local clone), so it's easy for your contributors to move. You'll be able to do a rename from GitHub.com, GitHub Desktop, or the CLI.

Hopefully this means it will "just work" with no changes on repo2docker and BinderHub, but we'll have to see.

It'll be good to make a test repo with the default branch renamed and deleted to see what pieces we might have in place that would catch a rename and which would need updating.

I think this might be the only place we use the branch name, making an API request to get the current commit for a ref:

 GET /repos/:owner/:repo/commits/:ref

For bokeh-notebooks, that would be https://api.github.com/repos/bokeh/bokeh-notebooks/commits/master. If that link is redirected to the default branch, e.g. https://api.github.com/repos/bokeh/bokeh-notebooks/commits/main, I think we're all set. It doesn't appear to be the case yet, though.
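
As a rough illustration of that request, the sketch below builds the commits URL for a ref and reads the SHA from the response. The helper names (`commit_url`, `resolve_ref`) are made up for this example, and it uses the standard library's `urllib` purely for convenience; it is not meant to reflect how BinderHub itself issues the request.

```python
import json
import urllib.request
from urllib.parse import quote

GITHUB_API = "https://api.github.com"


def commit_url(owner, repo, ref):
    """Build the GitHub API URL for the commit that `ref` points to."""
    # quote() leaves "/" alone by default, so refs such as "feature/x" survive
    return f"{GITHUB_API}/repos/{quote(owner)}/{quote(repo)}/commits/{quote(ref)}"


def resolve_ref(owner, repo, ref):
    """Return the commit SHA for `ref`; a missing ref raises urllib.error.HTTPError (404)."""
    with urllib.request.urlopen(commit_url(owner, repo, ref)) as resp:
        return json.load(resp)["sha"]
```

With a deleted `master` branch and no redirect on the API side, `resolve_ref("bokeh", "bokeh-notebooks", "master")` would be expected to raise rather than return a SHA.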

Update: I created https://github.com/minrk/test-default-branch with:

  • default branch: master
  • changed default branch to: main
  • deleted master branch

As far as I can tell, the redirect from deleted branches to the default branch is only applied to blob pages, not even to tree pages. I tested with the following links:

tested 2020-08-28, may change as GitHub rolls things out

  • blob redirects master to main: https://github.com/minrk/test-default-branch/blob/master/README.md -> https://github.com/minrk/test-default-branch/blob/main/README.md
  • tree doesn't redirect: https://github.com/minrk/test-default-branch/tree/master -> 404, no redirect
  • api doesn't redirect: https://api.github.com/repos/minrk/test-default-branch/commits/master -> 404, no redirect
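
For anyone who wants to repeat these checks later, here is a small sketch that requests a URL without following redirects and reports what came back. The names `NoRedirect`, `describe`, and `check` are invented for this example, and the results will depend on whatever GitHub is serving at the time.

```python
import urllib.error
import urllib.request


class NoRedirect(urllib.request.HTTPRedirectHandler):
    """Decline to follow redirects so a 3xx response surfaces as an HTTPError."""

    def redirect_request(self, req, fp, code, msg, headers, newurl):
        return None


def describe(status, location=None):
    """Summarise a response in the same style as the list above."""
    if 300 <= status < 400 and location:
        return f"{status}, redirects to {location}"
    return f"{status}, no redirect"


def check(url):
    """Request `url` without following redirects and describe the outcome."""
    opener = urllib.request.build_opener(NoRedirect)
    try:
        with opener.open(url) as resp:
            return describe(resp.status)
    except urllib.error.HTTPError as err:
        # 3xx (not followed) and 4xx/5xx responses both arrive here
        return describe(err.code, err.headers.get("Location"))
```

Calling `check(...)` on each of the three links above should show which ones currently redirect.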

So if github doesn't provide a redirect in the api, we could implement this redirect ourselves for renamed master only with:

def get_resolved_ref(ref):
    try:
        return resolve(ref)
    except NotFound:  # pseudocode: resolve() got a 404
        if ref != 'master':
            raise
        repo = GET /repos/:owner/:repo  # pseudocode: fetch the repo metadata
        if repo.default_branch != 'master':
            # ref is master, which doesn't exist, and default branch is not master:
            # support folks renaming default branch away from master
            return resolve(repo.default_branch)
        else:
            raise
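
A runnable take on the pseudocode above, under some assumptions: `fetch` is a hypothetical callable that returns decoded JSON for a GitHub API path and raises `NotFound` on a 404, and `fake_fetch` with its `abc123` SHA is placeholder data that only exists so the example runs without network access. None of these names come from BinderHub.

```python
class NotFound(Exception):
    """Raised by the (hypothetical) fetcher when the API answers 404."""


def get_resolved_ref(fetch, owner, repo, ref):
    """Resolve `ref` to a commit SHA, falling back to the default branch
    only when the missing ref is 'master'."""
    try:
        return fetch(f"/repos/{owner}/{repo}/commits/{ref}")["sha"]
    except NotFound:
        if ref != "master":
            raise
        default_branch = fetch(f"/repos/{owner}/{repo}")["default_branch"]
        if default_branch == "master":
            # master is still the default branch and it is missing: give up
            raise
        # master was renamed away: resolve the new default branch instead
        return fetch(f"/repos/{owner}/{repo}/commits/{default_branch}")["sha"]


def fake_fetch(path):
    """Stand-in for the API: a repo whose master was renamed to main."""
    responses = {
        "/repos/minrk/test-default-branch": {"default_branch": "main"},
        "/repos/minrk/test-default-branch/commits/main": {"sha": "abc123"},
    }
    if path not in responses:
        raise NotFound(path)
    return responses[path]


print(get_resolved_ref(fake_fetch, "minrk", "test-default-branch", "master"))
```

Passing the fetcher in as an argument keeps the fallback logic separate from the HTTP client, so it can be exercised with canned responses like the ones here.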

I also think we should probably change our code in various places away from hardcoded master to a special value, e.g. $default (must not be a valid ref) that resolves to the repo default branch instead of assuming that to be master (or main, etc.)
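
One way the special value could look, as a minimal sketch: `DEFAULT_REF`, `effective_ref`, and the `get_default_branch` callable are all hypothetical names, and the sketch does not check whether git would in fact reject `$default` as a ref name.

```python
DEFAULT_REF = "$default"  # sentinel suggested above; the exact spelling is an assumption


def effective_ref(ref, get_default_branch):
    """Return the branch name that should actually be resolved for `ref`.

    `get_default_branch` is a hypothetical zero-argument callable that asks
    the repo provider for its default branch; it is only called when needed.
    """
    if ref == DEFAULT_REF:
        return get_default_branch()
    return ref
```

Any explicit ref passes through untouched, so only callers that previously hardcoded master would need to switch to the sentinel.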

Given that the scope of the workaround is fairly small and well defined (only for renames away from master for the default branch on GitHub; other repo providers perhaps in the future, if they provide similar resolution for the default branch), I think we can implement this now/soon with little cost, and it may become obsolete if GitHub implements its own redirect.

Just a note that github is now defaulting to "main" so we are going to start running into this more and more. (As @psychemedia recently noticed on Twitter)

On 2020-10-22 https://api.github.com/repos/bokeh/bokeh-notebooks/commits/master now points to 7b6da26945e284b19df07daecc6beabdb7adbe81 which is the same SHA1 I see when I navigate to https://github.com/bokeh/bokeh-notebooks/ directly. So I think the redirect for repos that renamed/changed branch names now works.

I also created https://github.com/betatim/new-default-branch-name which uses the new default branch name of main. It never had a different branch name. Maybe useful for others to test things.

I think this should be resolved by #1186

Let's close it! If others run into issues, we can always re-open. Thanks so much for everyone's work on this 👍
