gps.WriteDepTree() is the global entry point that actually creates a vendor/ tree. Right now it operates entirely serially, writing out dependency directories one at a time. I don't know exactly how much we'd gain by parallelizing this, as most of the bottleneck here is likely to be I/O. However, it seems worth at least trying to run these operations concurrently. My naive expectation is that even 2x parallelization would earn some speedup: the kernel ought to be able to schedule the vcs subprocesses sanely, so that read/CPU work segments in one process alternate with writes in the other, helping to ensure we're always saturating disk writes.
If someone could do an experimental implementation of this, then work up some benchmarks to validate the hypothesis, that'd be great.
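(A rough shape for such a benchmark might look like the sketch below; `fixtureLock` and `sm` are hypothetical stand-ins for whatever fixtures the gps test harness actually provides.)

```go
// Hypothetical benchmark skeleton: fixtureLock and sm are assumed test
// fixtures, not real gps identifiers. It times end-to-end vendor tree
// creation, excluding temp-dir setup and teardown.
func BenchmarkWriteDepTree(b *testing.B) {
	for i := 0; i < b.N; i++ {
		b.StopTimer()
		dir, err := ioutil.TempDir("", "vendortree")
		if err != nil {
			b.Fatal(err)
		}
		b.StartTimer()

		if err := WriteDepTree(dir, fixtureLock, sm, true); err != nil {
			b.Fatalf("unexpected error after %d iterations: %s", i, err)
		}

		b.StopTimer()
		os.RemoveAll(dir)
		b.StartTimer()
	}
}
```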
Hey, is someone working on this issue, or does it need to be claimed?
@tonto nobody's working on it, so it's all yours if you want it! 🎉 🎉
@sdboyer cool, I'll have a go then, fingers crossed :)
great! please don't hesitate to ask questions, whether here or in slack.
@sdboyer Rather than opening a PR, I've pasted the code for WriteDepTree below, with some ExportProject error-handling questions in the comments...
```go
func WriteDepTree(basedir string, l Lock, sm SourceManager, sv bool) error {
	if l == nil {
		return fmt.Errorf("must provide non-nil Lock to WriteDepTree")
	}

	err := os.MkdirAll(basedir, 0777)
	if err != nil {
		return err
	}

	wg := sync.WaitGroup{}
	// Related to Q1
	var e error

	for _, p := range l.Projects() {
		go func(p LockedProject) {
			wg.Add(1)
			defer wg.Done()

			to := filepath.FromSlash(filepath.Join(basedir, string(p.Ident().ProjectRoot)))
			err := sm.ExportProject(p.Ident(), p.Version(), to)
			if err != nil {
				// Q1 - Since any goroutine can error out, what do we do with errors?
				// (e here is temporary, just to catch any error.)
				// Since the function previously returned immediately after an error
				// and called removeAll, does this mean that we should stop all the
				// other goroutines, or just record an error like this (or use
				// pkg/errors to wrap them) and wait out the rest of the goroutines?
				e = fmt.Errorf("error while exporting %s: %s", p.Ident().ProjectRoot, err)
				log.Println("project error: ", to, e)
				return
			}
			if sv {
				filepath.Walk(to, stripVendor)
			}
			// TODO(sdboyer) dump version metadata file
		}(p)
	}

	wg.Wait()

	// Related to Q1
	if e != nil {
		removeAll(basedir)
		return e
	}
	return nil
}
```
I know that this was just supposed to be a proof-of-concept benchmark, and maybe the question above is misplaced, so you don't have to answer it right now. The real problem is that I can't actually run the benchmark for WriteDepTree, since I get an error from ExportProject:
--- FAIL: BenchmarkCreateVendorTree-4
result_test.go:149: unexpected error after 29 iterations: /var/folders/5_/6g14f8cd4h5d0k43fsmrltgr0000gn/T/vsolvtest/export/github.com/sdboyer/testrepo/README already exists, no checkout
/var/folders/5_/6g14f8cd4h5d0k43fsmrltgr0000gn/T/vsolvtest/export/github.com/sdboyer/testrepo/tmp2 already exists, no checkout
: exit status 128
Probably complaining because of conflicting work between goroutines...
I see that the benchmark uses the built-in SourceMgr - does this imply that it is not thread-safe?!
Sorry if I got things wrong, your feedback would be helpful.
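(An aside on Q1: one idiomatic way to collect the first failure from a batch of goroutines is `golang.org/x/sync/errgroup`. The sketch below waits out the remaining goroutines and returns the first error, matching one of the options floated in the comments; it's illustrative only, not what dep settled on.)

```go
// Illustrative errgroup variant of WriteDepTree: the first export error
// is returned from g.Wait(), after which the partial tree is removed.
func writeDepTreeErrgroup(basedir string, l Lock, sm SourceManager, sv bool) error {
	if err := os.MkdirAll(basedir, 0777); err != nil {
		return err
	}

	var g errgroup.Group
	for _, p := range l.Projects() {
		p := p // capture the loop variable for the closure
		g.Go(func() error {
			to := filepath.FromSlash(filepath.Join(basedir, string(p.Ident().ProjectRoot)))
			if err := sm.ExportProject(p.Ident(), p.Version(), to); err != nil {
				return fmt.Errorf("error while exporting %s: %s", p.Ident().ProjectRoot, err)
			}
			if sv {
				return filepath.Walk(to, stripVendor)
			}
			return nil
		})
	}

	if err := g.Wait(); err != nil {
		removeAll(basedir)
		return err
	}
	return nil
}
```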
@tonto no worries! like i said, questions are totally fine 😄
that you'd be getting an error like that is surprising - it would seem to suggest that there's a possible scenario in which WriteDepTree() is exiting before all of its subprocesses terminate. that definitely should not be possible.
more generally, though...yep! this is the kind of challenge you have here - a sync.WaitGroup may or may not be the best way of accomplishing this, as you need to communicate progress and potentially errors back from each individual goroutine.
i might instead explore something more like a dispatcher/controller goroutine plus a configurable number of worker goroutines.
does that make sense?
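(A minimal sketch of that dispatcher/worker shape, under stated assumptions: `workerCount` is a hypothetical knob, and errors are merely collected rather than cancelling in-flight exports. The buffered error channel means workers never block on reporting.)

```go
// Sketch of a controller plus a fixed pool of worker goroutines.
// workerCount is illustrative; nothing here is committed gps code.
func writeDepTreeWorkers(basedir string, l Lock, sm SourceManager, sv bool, workerCount int) error {
	projects := make(chan LockedProject)
	// Buffered so workers never block when reporting; at most one error
	// per project can be queued.
	errs := make(chan error, len(l.Projects()))

	var wg sync.WaitGroup
	for i := 0; i < workerCount; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			for p := range projects {
				to := filepath.FromSlash(filepath.Join(basedir, string(p.Ident().ProjectRoot)))
				if err := sm.ExportProject(p.Ident(), p.Version(), to); err != nil {
					errs <- fmt.Errorf("error while exporting %s: %s", p.Ident().ProjectRoot, err)
					continue
				}
				if sv {
					filepath.Walk(to, stripVendor)
				}
			}
		}()
	}

	// Controller: dispatch every project, then wait for the pool to drain.
	// A fuller version would stop dispatching once the first error lands.
	for _, p := range l.Projects() {
		projects <- p
	}
	close(projects)
	wg.Wait()
	close(errs)

	// Receiving from a closed, empty channel yields nil.
	if err := <-errs; err != nil {
		removeAll(basedir)
		return err
	}
	return nil
}
```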
@sdboyer Yes, I was thinking along the same lines concerning controller/worker goroutines, so that's clear, thanks :)
Regarding the error I'm getting: not really, but I will investigate further.
EDIT: I resolved it; silly mistake on my part, thanks for the pointers
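(The thread never names the mistake, but sdboyer's diagnosis above points at the likely culprit in the pasted code: `wg.Add(1)` is called inside each goroutine, so `wg.Wait()` can return before any worker has registered. The conventional fix is to register before spawning:)

```go
// Register with the WaitGroup before launching the goroutine; otherwise
// wg.Wait() races with the workers' own wg.Add(1) calls and can return
// while exports are still running.
for _, p := range l.Projects() {
	wg.Add(1)
	go func(p LockedProject) {
		defer wg.Done()
		// ... export p as in the original function ...
	}(p)
}
```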
great!
also, a key note about the domain model here...
there's an assumption that projects cannot be logical descendants of one another - only siblings, or entirely unrelated. that is, we assume that `somedomain.com/project` and `somedomain.com/project/subproject` cannot both exist as `ProjectRoot`s in a single `Lock`. (they probably shouldn't exist globally either, but that's a different scope of problem). however, this assumption isn't really clearly enforced anywhere. it's probably impossible for it to arise normally, but we'll eventually need to defend against it.
i'm raising it now just to say that, for the purposes of this issue, you _should_ ignore it. we'll work on it in a follow-up 😄
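(For concreteness, a defence of the kind described might eventually look like this sketch; `checkNoNestedRoots` is a hypothetical helper, not anything in gps today.)

```go
// Hypothetical guard: reject a Lock in which any ProjectRoot is a path
// descendant of another, e.g. somedomain.com/project alongside
// somedomain.com/project/subproject.
func checkNoNestedRoots(l Lock) error {
	roots := make([]string, 0, len(l.Projects()))
	for _, p := range l.Projects() {
		roots = append(roots, string(p.Ident().ProjectRoot))
	}
	// Quadratic, but lock files are small; a sorted prefix scan would
	// need care around path-separator ordering.
	for _, outer := range roots {
		for _, inner := range roots {
			if inner != outer && strings.HasPrefix(inner, outer+"/") {
				return fmt.Errorf("%s is nested under %s", inner, outer)
			}
		}
	}
	return nil
}
```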
done now