In Bazel's own shell integration tests, every Bazel invocation starts a new server.
Culprit summary: MinGW Bash waits for all child processes of a process running in a subshell to terminate before terminating the subshell. Linux Bash's subshell exits as soon as the direct child process exits.
Finding the culprit required a solid day of debugging.
No, it only happens if Bazel runs in a subshell. If the test runs:
pid1=$(bazel info server_pid)
pid2=$(bazel info server_pid)
echo "pid1=$pid1, pid2=$pid2"
the two PIDs are different. It also runs very slowly (about half a minute), for a reason explained further down.
Running them directly is fast and prints the same PID:
bazel info server_pid
bazel info server_pid
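Since direct invocation is reported to be fast, one possible way for a test to still capture the output is to redirect it to a temporary file instead of using command substitution. The sketch below is only an illustration: `run_and_capture` and `CAPTURED` are hypothetical names, `echo 1234` stands in for `bazel info server_pid`, and whether this actually avoids the wait under MSYS Bash is an assumption based on the observation above, not something verified here.

```shell
#!/bin/bash
# Hypothetical helper: run a command directly (not inside $(...)) and
# capture its stdout through a temporary file.
# The result is stored in the global variable CAPTURED.
run_and_capture() {
  tmp=$(mktemp) || return 1
  "$@" > "$tmp"          # the command itself runs outside any subshell
  status=$?
  CAPTURED=$(cat "$tmp") # only `cat` runs in the command substitution
  rm -f "$tmp"
  return "$status"
}

# Stand-in for: run_and_capture bazel info server_pid
run_and_capture echo 1234
pid1=$CAPTURED
echo "pid1=$pid1"
```

The only process that runs inside `$(...)` here is `cat`, which leaves no lingering children behind.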
Because the server.pid.txt file is missing. It took a long time to find out that it was the server itself deleting that file, as part of an orderly shutdown.
Not because anything asks it to. It does so because --max_idle_secs is 15 for tests; that time elapses, and the server neatly shuts down and cleans up after itself, working as intended. This is why two Bazel invocations in a subshell take about half a minute.
Because it turns out MSYS (and Cygwin) Bash waits for all children of a process running in a subshell to terminate, even if those processes are in a different process group (CREATE_NEW_PROCESS_GROUP).
The Bazel client starts the server process with CREATE_NEW_PROCESS_GROUP. The process tree (observed in Sysinternals' Process Explorer) shows that as long as the parent is running, the child is displayed as a child, but after the parent terminates the child becomes a top-level process. MSYS however doesn't know or doesn't care, and waits for the child process to exit.
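For comparison, the Linux behavior described in the summary can be checked with a short script. This is a minimal sketch, assuming the background process redirects its own stdout and stderr; without that redirection it would keep the pipe of the command substitution open, and `$(...)` would wait for it on any platform. `sleep 4` is just a stand-in for a long-lived server process.

```shell
#!/bin/bash
# Time a command substitution whose command leaves a background
# process behind. The inner ( ... &) detaches `sleep` from the
# subshell, and the redirections keep it from holding the pipe open.
start=$(date +%s)
out=$( (sleep 4 >/dev/null 2>&1 &) ; echo "parent done" )
end=$(date +%s)
elapsed=$((end - start))
echo "out=$out elapsed=${elapsed}s"
```

On Linux the substitution should return well before the 4 seconds are up. Per the report above, MSYS Bash would be expected to wait for the background process instead.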
I don't know. I tried:
- a double CreateProcess (similar to the double-fork idiom on Unixes), with the middle process exiting and thus orphaning the grandchild: the process tree looks fine (the orphaned process becomes a top-level one), yet Bash still waits for it
- starting the process with CREATE_BREAKAWAY_FROM_JOB
- loading msys-2.0.dll, looking up its fork and setsid methods, and calling those to emulate a double-fork

None of the above works.
C:\work>type minibazel.cc
#include <windows.h>
#include <stdio.h>

void log(const char *format, ...) {
  FILE* f = fopen("c:\\work\\minibazel.txt", "at");
  va_list ap;
  va_start(ap, format);
  fprintf(f, "(pid=%d) ", GetCurrentProcessId());
  vfprintf(f, format, ap);
  va_end(ap);
  fclose(f);

  va_start(ap, format);
  fprintf(stdout, "(pid=%d) ", GetCurrentProcessId());
  vfprintf(stdout, format, ap);
  va_end(ap);
}

int ExecuteDaemon(const char* argv0) {
  SECURITY_ATTRIBUTES sa;
  sa.nLength = sizeof(SECURITY_ATTRIBUTES);
  sa.bInheritHandle = FALSE;
  sa.lpSecurityDescriptor = NULL;

  PROCESS_INFORMATION processInfo = {0};
  STARTUPINFOA startupInfo = {0};

  char cmdline[1000];
  size_t len = strlen(argv0);
  strncpy(cmdline, argv0, len);
  cmdline[len] = ' ';
  cmdline[len + 1] = 'x';
  cmdline[len + 2] = 0;

  BOOL ok = CreateProcessA(
      /* lpApplicationName */ NULL,
      /* lpCommandLine */ cmdline,
      /* lpProcessAttributes */ NULL,
      /* lpThreadAttributes */ NULL,
      /* bInheritHandles */ TRUE,
      /* dwCreationFlags */ DETACHED_PROCESS | CREATE_NEW_PROCESS_GROUP,
      /* lpEnvironment */ NULL,
      /* lpCurrentDirectory */ NULL,
      /* lpStartupInfo */ &startupInfo,
      /* lpProcessInformation */ &processInfo);

  if (!ok) {
    log("ERROR[child] CreateProcess, err: %d\n", GetLastError());
    return 1;
  }

  CloseHandle(processInfo.hProcess);
  CloseHandle(processInfo.hThread);
  return 0;
}

int main(int argc, char** argv) {
  if (argc > 1) {
    log("INFO[child] Sleep 10 sec\n");
    Sleep(10000);
    log("INFO[child] Done\n");
    return 0;
  } else {
    log("INFO[parent] start -------------------\n");
    int x = ExecuteDaemon(argv[0]);
    log("INFO[parent] Created process, sleep 5 sec\n");
    Sleep(5000);
    log("INFO[parent] Done\n");
    return x;
  }
  return 0;
}
C:\work>cl minibazel.cc
Microsoft (R) C/C++ Optimizing Compiler Version 19.00.24213.1 for x64
Copyright (C) Microsoft Corporation. All rights reserved.
minibazel.cc
Microsoft (R) Incremental Linker Version 14.00.24213.1
Copyright (C) Microsoft Corporation. All rights reserved.
/out:minibazel.exe
minibazel.obj
$ cat ./subshell.sh
#!/bin/bash
echo "$(date +%H:%M:%S) start subshell"
out=$(c:/work/minibazel.exe)
echo "$(date +%H:%M:%S) done subshell"
$ ./subshell.sh
15:24:01 start subshell
15:24:11 done subshell
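To read the timestamps above: the parent in minibazel.cc sleeps 5 seconds and the child 10 seconds, so the elapsed time tells which process the subshell waited for. The helper below is a small sketch of that arithmetic. `to_secs` and `classify_elapsed` are hypothetical names introduced for illustration, and the conversion ignores runs that cross midnight.

```shell
#!/bin/bash
# Convert an HH:MM:SS timestamp to seconds since midnight.
# ${h#0} strips one leading zero so values like "08" are not
# parsed as invalid octal numbers.
to_secs() {
  IFS=: read -r h m s <<EOF
$1
EOF
  echo $(( ${h#0} * 3600 + ${m#0} * 60 + ${s#0} ))
}

# classify_elapsed ELAPSED PARENT_SECS CHILD_SECS
# Reports which process lifetime the elapsed time corresponds to.
classify_elapsed() {
  if [ "$1" -ge "$3" ]; then
    echo "waited for child"
  elif [ "$1" -ge "$2" ]; then
    echo "waited for parent only"
  else
    echo "returned early"
  fi
}

# Timestamps taken from the run above.
elapsed=$(( $(to_secs 15:24:11) - $(to_secs 15:24:01) ))
echo "elapsed=${elapsed}s: $(classify_elapsed "$elapsed" 5 10)"
```

For the run shown, the difference is 10 seconds, which matches the child's lifetime rather than the parent's.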
Operating System: Windows 10
Bazel version (output of bazel info release): all, the problem is not in Bazel AFAICT
Sadly, no.
Such a clear explanation and nice repro! Thanks for looking into this!
This bug is part of https://github.com/bazelbuild/bazel/issues/4292 so I'm leaving it as P2 and adding the Q1-2018 label.
You know what, calling Bazel in a subshell might no longer be a problem after https://github.com/bazelbuild/bazel/commit/9c97bf96f794e2bea6fcb7fe240a64b3e605e292
You're right!