Git: Get the BusyBox-based MinGit production-ready

Created on 20 Jan 2018 · 18 Comments · Source: git-for-windows/git

Cygwin (and hence MSYS2, which is a derivative) tries to emulate POSIX functionality on top of the Win32 API. When spawning child processes, this means that fork() needs to be emulated, which is hard, and requires the MSYS2 runtime's address range to be pinned. This causes problems on many, many setups, sometimes only after unrelated software is upgraded!

Git for Windows uses the MSYS2 runtime essentially for two things: SSH and Unix shell/Perl scripting.

In MinGit, we already do not include any Perl scripts, but we still include plenty of Unix shell scripts. Ideally, those would be converted into "builtins", i.e. pure, portable, performant C, which would make everything quite a bit more robust, not to mention fast. Sadly, this is not a priority for core Git's developers/maintainer, and it is a lot of work.

To side-step this, we put in quite an effort last year to ship a "BusyBox-based" variant of MinGit. BusyBox is an executable that offers minimal versions of many Unix tools, such as a Unix shell, sed, awk, etc, in a single binary (much like git.exe includes many subcommands as "builtins"), and there exists a pure Win32 version of BusyBox that we helped along until it could run Git's Unix shell scripts and test suite.

It is time to get this BusyBox-based MinGit to a point where it is robust enough to be the default MinGit.

All 18 comments

@dscho I'm interested in this, especially to reduce/avoid the dependency on MSYS2/Cygwin. How can others help?

How can others help?

Mostly by testing in as close to production as you dare ;-)

I vaguely remember that there have been some inexplicable hangs when I tried to run the test suite (I used GIT_TEST_ARGS=--quiet GIT_TEST_INSTALLED=/path/to/mingw64/bin busybox xargs -P15 ./test-[0-9]*.sh after copying the test artifacts (which are now bundled as a Pacman package, too) into a clone of git-for-windows/git). That's the main part I want to address in the near future.

Mostly by testing in as close to production as you dare ;-)

Hey everyone, it's been hard to get anything out of this.

I am sorry to hear that. But I also have to say that I have a hard time understanding what exactly the problem is. Care to explain in any sort of detail?

The Anaconda Distribution has been trying to use busybox Git for Windows and ran into a problem with git submodule hanging:

C:\gfw\cmd\git.exe submodule update --init --recursive
10:53:00.229338 git.c:576               trace: exec: git-submodule update --init --recursive
10:53:00.229338 run-command.c:640       trace: run_command: git-submodule update --init --recursive

I added set -x and a few echo's to C:\gfw\mingw64\libexec\git-core\git-submodule to see what I could see and I saw two problems:

  1. It uses basename and that doesn't seem to be backslash happy:
+ basename 'C:\gfw\mingw64/libexec/git-core\git-submodule'
+ echo 'git-core\git-submodule'
git-core\git-submodule
+ basename 'C:\gfw\mingw64/libexec/git-core\git-submodule'
  2. It hangs at the call to sed. It seems that another busybox got spawned ok and is pegging one of my CPU threads but I guess they're not chatting to each other?
+ sed -e 's/-/ /'

I see a commandline of sh --forkshell 00000000000002CC for the last spawned, busy busybox.

Switching the busybox.exe in the latest release to this one allows this to work (or at least to not hang).
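The basename problem reported above can be sidestepped in a script by normalizing the separators first. This is only a sketch, assuming a hypothetical helper named win_basename that is not part of Git or BusyBox; it does nothing about the sed hang, which is a separate issue.

```shell
# Hypothetical helper, not part of Git or BusyBox.
win_basename () {
    # Turn every backslash into a forward slash so that basename sees
    # one consistent directory separator.
    basename "$(printf '%s\n' "$1" | tr '\\' '/')"
}

# The path from the trace above, piped through the same sed call.
win_basename 'C:\gfw\mingw64/libexec/git-core\git-submodule' | sed -e 's/-/ /'
# prints: git submodule
```

The single quotes matter here: they keep the shell from treating the backslashes as escape characters before the helper sees them.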

Sorry for the late reply. I am still struggling to find any time for any serious work on BusyBox. This comment makes me believe that the hang is caused by the patches I introduced on top of BusyBox-w32, and that's what most likely also causes the so far unexplained hangs in the test suite I observed.

@dscho is there a particular build we should be helping with testing if this has accidentally resolved itself, or should we hold off for the moment?

No problem at all @dscho there's only so many hours in a day.

is there a particular build we should be helping with testing if this has accidentally resolved itself, or should we hold off for the moment?

@shiftkey Sadly, there is no chance of this being resolved accidentally...

@mingwandroid thanks for understanding!

I actually found busybox mingit to be more usable for end users than regular mingit, as you can simply run busybox sh to get a working shell, while bash from regular mingit is basically unusable.

while bash from regular mingit is basically unusable.

I don't believe bash.exe ships in the vanilla MinGit environment. sh.exe does, and if you've seen issues with that I'd love to hear more.

it has bash as sh. At least that's what --version tells me.

However, it is not usable as an interactive shell, because there is no readline, and the PATH is not automatically set to include /usr/bin, so it doesn't resolve those Unix commands by default, like busybox does. (Or am I missing something?)

Update: the PATH is set if I call sh.exe --login. However, readline is still pretty broken.

In addition, mingit busybox has broken Unicode support (probably expected given it took a few versions to get unicode support in MSYS).

MinGit is not intended to be used interactively, skipping a ton of parts in the quest to minimize the footprint for applications which want to ship with Git. Any interactive functionality you use in MinGit (BusyBox variant or not) might go away at any stage, without prior warning (apart from this here stern one).

I am increasingly annoyed by the fact that Git-bash behavior is just so special that commands which are intended to work both in cmd.exe _and_ in a real POSIX shell always need special handling in Git-bash. For example, I frequently use winpty and MSYS2_ARG_CONV_EXCL='*' to run simple commands like winpty docker run -it ubuntu ls //bin. And when I manage to do this, some commands crash anyway or produce strange output because somehow the pseudo terminal emulation works differently.

As far as I am concerned, I don't need the full power of bash on Windows, or the path conversion, or the terminal emulation. I only want to run native Windows applications in a native Windows terminal. On the other hand, I just can't get used to cmd.exe or PowerShell because I am so addicted to basic command line editing shortcuts like ctrl-p, ctrl-n, ctrl-a, ctrl-e, alt-b, alt-f, alt-w, alt-d, and maybe also ctrl-z and ctrl-r.

I think it would be huge if we can rid Git-for-Windows of the complexity induced by the MSYS2/Cygwin emulation layer (both runtime and terminal emulation) and replace it with Windows native code. Why do we consider BusyBox only for MinGit, but not as a replacement for MSYS2/Cygwin/Git-bash? Ignoring backwards compatibility for now, what would be the minimum requirements for such a replacement?

Do we need more than the following:

  • posix shell script support
  • posix shell command line
  • readline support (for ctrl-a, ctrl-u commands etc.)
  • command completion
  • pager (I am perfectly fine with less, but most people barely manage exiting from less, let alone navigate in it or make case-insensitive searches)

Note that BusyBox comes with its own less and its own readline lookalike. Having said that, we are currently very reliant on MSYS2. Off the top of my head:

  • Perl. git svn and git send-email still require Perl. (So does git add -i, but I already have a version running locally that addresses that, see PRs gitgitgadget/git#170-gitgitgadget/git#175 for details).

  • OpenSSH. The native OpenSSH is getting there, but it is Windows 10 only, and it still has some kinks in some corner cases (and with a user base of over 3 million, a single maintainer could easily be overwhelmed if even only as many as 0.01% hit those corner cases and report those bugs and demand them to be fixed).

  • We are relying on MinTTY to provide a better terminal window, at least on older Windows versions (traditionally, prior to Windows 10, the CMD window is seriously limited, compared to what one is used to from Linux and even to a certain extent from macOS). BusyBox still needs to learn about MinTTY's pseudo terminals, so that it correctly detects that it is running in a terminal.

  • Quite likely a lot of other things that I can't think of right now.
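The terminal-detection point in the list above comes down to the isatty-style check that programs run on their file descriptors. As a rough sketch (the function name fd_kind is made up for illustration), the POSIX shell equivalent is the -t test:

```shell
# Hypothetical helper: report whether a file descriptor is a terminal.
fd_kind () {
    # [ -t N ] is the POSIX shell counterpart of isatty(N).
    if [ -t "$1" ]; then
        echo "terminal"
    else
        echo "not a terminal"
    fi
}

# What this prints depends on how the script is started: interactively
# it is usually "terminal", with redirected output "not a terminal".
fd_kind 1
```

Whenever output is captured or piped, the check reports "not a terminal". The bullet above says BusyBox still has to learn to recognize MinTTY's pseudo terminals so that this kind of check gives the right answer there.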

Also, please note that BusyBox supports Ctrl+R, but not Ctrl+Z.

Further, BusyBox-w32 has troublesome performance issues, at least currently. In theory it should be a lot faster to execute shell scripts using BusyBox' ash than using MSYS2's Bash. In practice, it seems to be the opposite. My guess is that the way the forkshell emulation is implemented is suboptimal, and leaves a lot of room for improvement.

Finally, let's not forget that Git advertises scripting as the way to make things work. Hooks are strongly expected to be shell scripts. And those shell scripts are definitely outside of the control of Git's own source code, so we would quite likely break power users' scripts by simply switching to BusyBox, as most of its commands/options are noticeably limited compared to the full commands.

So I think that the best we can do is to offer an opt-in to BusyBox. After making it work. Robustly so.
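To illustrate the concern about hooks: a hook that sticks to shell builtins and POSIX parameter expansion depends less on which implementation of the external tools is installed. A minimal sketch, assuming a hypothetical check for a commit-msg hook; the function name subject_ok and the 72-character limit are illustrative and not taken from Git.

```shell
# Hypothetical check for a commit-msg hook; not part of Git.
subject_ok () {
    # Read only the first line of the message file. `read` is a shell
    # builtin; `|| :` tolerates an empty file or a missing final newline.
    IFS= read -r subject <"$1" || :
    # ${#subject} is POSIX parameter expansion, so no external tool runs.
    [ "${#subject}" -le 72 ]
}
```

In a real hook, Git passes the path of the message file as the first argument, so the hook body could be as short as `subject_ok "$1" || exit 1`.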

FWIW, I am now happy with Clink, which effectively adds readline capabilities to cmd.exe.
