[x]
):After running any operation that uses the new commit-info module on a directory - it leaves a defunct process lying around saying "[git]
...
I fixed it locally in /vendor/code.gitea.io/git/commit-info.go:
I added a line "defer cmd.Wait() after the cmd.exec of git log
@trebonian could you send a PR to github.com/go-gitea/git
?
Not sure how to do it without a fork of gitea on github.
Here is a "git diff" of my change:
diff --git a/vendor/code.gitea.io/git/commit_info.go b/vendor/code.gitea.io/git/commit_info.go
index 106ebc61..c2691fc6 100644
--- a/vendor/code.gitea.io/git/commit_info.go
+++ b/vendor/code.gitea.io/git/commit_info.go
@@ -198,6 +198,7 @@ func getCommitsInfo(state *getCommitsInfoState) error {
}
cmd := exec.CommandContext(ctx, "git", args...)
cmd.Dir = state.headCommit.repo.Path
+ defer cmd.Wait()
readCloser, err := cmd.StdoutPipe()
if err != nil {
EDIT: change the formatting so it's readable // BKC
I don't think calling cmd.Wait()
is the right solution here, since that will block until the git log
command completely finishes (which will take a long time for large repos)
its deferred until this routine returns - which I assumed meant all the log output was processed.
The cmd.Wait() needs to be called before the routine returns. Is there a better place for it?
@trebonian No, the function may return without processing all of the git log
output.
@ethantkoenig then whoever processes the last output needs to do a "Wait" otherwise zombies will be left lying around
It was my understanding that defer cancel()
(here) would clean up the process, but I'll investigate more when I get the chance.
Thanks - the symptom that I am seeing is that there is a new "git" zombie processes after every invocation of gitCommitsInfo.
I am not sure how the Golang runtime handles the "cancel" but the main process needs to do an OS wait on the child process id (at the linux level). I don't know how it can do this if the only handle to that child process is lost after gitCommitsInfo returns.
PS - it should only call the Wait if we know the child is done - otherwise we will block the main process.
I think @trebonian is right. cmd.Wait
is needed if cmd.Start
but not cmd.Run
.
I think @ethantkoenig is correct, and we don't want the main process to block - but I also think we need to call cmd.Wait() in the main process when the cmd'ed process terminates. Note that the current code does not seem to check for any kind of process error while the cmd is running.
I did the following after the cmd.Start() and it seems to work:
go func () { cmd.Wait() }()
This starts a goroutine that will wait until the process exits. This seems to fix up the zombies - but it's not the nicest.
I am not sure, but I think there needs to be something with a "done" channel to detect and report process errors (like getting killed, or signalled while running), and the cmd process done detection is merged with the other thread done check's lower down.
go func () { cmd.Wait() }()
seems work. Or maybe handle the error and write to log.
After running benchmarks, it looks like calling cmd.Wait()
from the main process is okay, so long as we have previously closed the stdout pipe. Closing stdout will end the child process, so that the subsequent call to cmd.Wait()
becomes effectively non-blocking. Benchmarks below:
Current implementation (creates defunct processes):
BenchmarkEntries_GetCommitsInfo/gitea-40 500 27661283 ns/op
BenchmarkEntries_GetCommitsInfo/manyfiles-40 100 195525556 ns/op
BenchmarkEntries_GetCommitsInfo/moby-40 300 50553596 ns/op
BenchmarkEntries_GetCommitsInfo/go-40 30 372394389 ns/op
BenchmarkEntries_GetCommitsInfo/linux-40 20 704852371 ns/op
Proposed changes:
BenchmarkEntries_GetCommitsInfo/gitea-40 500 26902208 ns/op
BenchmarkEntries_GetCommitsInfo/manyfiles-40 100 201779563 ns/op
BenchmarkEntries_GetCommitsInfo/moby-40 300 49267027 ns/op
BenchmarkEntries_GetCommitsInfo/go-40 50 365035008 ns/op
BenchmarkEntries_GetCommitsInfo/linux-40 20 716164050 ns/op
I will send a fix to https://github.com/go-gitea/git.
Most helpful comment
After running benchmarks, it looks like calling
cmd.Wait()
from the main process is okay, so long as we have previously closed the stdout pipe. Closing stdout will end the child process, so that the subsequent call tocmd.Wait()
becomes effectively non-blocking. Benchmarks below:Current implementation (creates defunct processes):
Proposed changes:
I will send a fix to https://github.com/go-gitea/git.