Drake: Search branch until finding something out of date in parallel

Created on 23 Nov 2017  路  10Comments  路  Source: ropensci/drake

I just ran into a case where a potential parallelisation wasn't realised.

The dependency structure was something like:

A -> B -> D -> E
A -> C -> F

A, B, C, and D are up to date. Drake checked A, BC, DF, built F, then checked and built E.

I think the most efficient behavior would be:
A, B, C, and D are up to date. Drake checks A, BC, DF, E, then builds EF in parallel.

advanced priority

All 10 comments

I see your point, but I do not know how to solve this one. We would need a whole new parallel graph traversal algorithm, and then some of the most sensitive internals would need refactoring. It is a huge undertaking.

But it may be possible. Somehow, Make already knows how to do this, which means "Makefile" parallelism in drake should not have this problem. I submitted a Stack Overflow post to ask how exactly that works. In the meantime, if you can suggest a parallel graph traversal algorithm, that would help things along.

A new parallel graph traversal algorithm will solve, accelerate, clean, and simplify a lot, including this issue. I sketched one out at https://github.com/wlandau-lilly/drake/tree/parallel_algo (see parallel_worker()). There is a fixed number of persistent workers that stay going for the entire make(). Each worker completes its current target and then searches for the next unclaimed target with completed dependencies. Issues:

  1. Race conditions. I try to make use of the "progress" storr namespace to implement locking, but sometimes I get errors or hanging jobs. I do not yet know why.
  2. Memory management. I have not decided on an alternative to prune_envir(). Could be #167.
  3. Having the same persistent workers for the entire make() means long wall times. Users of resource managers such as SLURM may not be able to request wall times long enough, depending on the user privileges granted by their respective systems. On the other hand, reusing existing workers may remove much of the annoying overhead of starting and stopping jobs. The TORQUE example might even work on my botched local installation.
  4. "Makefile" parallelism: should it still spawn one job per target, or should it reuse the persistent workers? For the sake of the wall time issue, I lean towards the former.

I don't think assuming persistent workers is a problem. Users can control wall times by building subsets of myplan$targets, as I have been doing switching memory and wall time configurations for different sets of targets.

However, you might have a future ambition to build in target-level future.batchtools configurations within myplan to avoid the user having to make multiple calls to make. I wouldn't try the latter feature until the dependent packages are a bit more mature.

I just had a closer look a this and I think that building from your original approach is the way to go. I'm imagining running a job with hundreds of workers all fumbling to get going on the same targets. It'll be a mess. File locking surely isn't a perfect strategy.

I can't imagine that this approach can work with many workers :/

Trimming graphs of up to date targets at each parallel_stage seems like it can work and wouldn't be too complicated to implement?

I hope you are right. It is definitely easier to stick with parallel stages and one job per target, and in the short term, it is much safer. Plus, with a fixed set of permanent workers, your idea of target-level hpc configurations would be impossible.

I just feel bad that drake's graph traversal algorithm is less aggressive than that of Make itself. With lapply()-like backends, drake needs to finish all the jobs in a stage before starting any of the next targets. Make, on the other hand, can just fork new jobs on the fly whenever a worker finishes and the graph opens up. It may be that drake's non-"Makefile" backends are just more constrained than Make and there is nothing we can do about it. But it could also be that I just naively dove into the first algorithm that came to mind without actively searching for enough other possibilities. Maybe future without future_lapply() could be more aggressive, I am not sure.

So we can stick to the original intent of this issue. My company's hpc systems will be down this weekend, and I am spending most of my time with family right now anyway, so my own work will be more delayed than usual.

future without future_lapply can likely achieve aggressive parallelism like you describe. I'm still wrapping my head around that package, but it seems like this is exactly the use case they were thinking of. However, many users (at least me) would want to be able to switch plans between sets of targets. All the best to you and your family over the holiday period, Will! Stop working!

Thanks for the encouragement, Kendon! I am taking a lot of time off.

Part of the solution to this issue will be to always process all the targets before all the imports, no matter what parallel backend is used. This will simplify, fortify, and standardize the parallel execution routines, and it solves a similar problem: for local parallelism, sometimes imported functions are lined up with true targets, which detracts from the parallelism of the targets themselves.

Also, this issue requires a deep update to max_useful_jobs() and rate_limiting_times(), which eventually led me to #170.

Will this still allow drake to check targets at each parallel stage? You
would want to check targets right before they get built in case rebuilding
a dependency doesn't actually change that dependency.

Absolutely. In fact, unnecessary checking is removed elsewhere so drake relies more heavily on last-minute checking in each stage.

Fixed, unit tested, and mentioned in the vignettes and README.

Was this page helpful?
0 / 5 - 0 ratings