This issue is to debate changes to `@parallel for`, if any.
Options:

1. Deprecate `@parallel for` - the syntax, while convenient, differs from a typical `for` loop w.r.t. the iterables that can be supported, local variables, error handling, etc. The unstated assumption that a `@parallel for` is just a `for` loop executed in parallel is not always correct. `pmap` with batching can be used in scenarios where `@parallel for` was previously used, with the additional benefits of retrying on errors and elastic nodes (since `pmap` uses worker pools); a sketch follows this list.
2. Keep `@parallel for` with some changes:
   a. Make the non-reducer case blocking by default. Today you need to prefix a `@sync` to detect errors that would otherwise be discarded silently. See https://github.com/abhijithch/RecSys.jl/pull/36#issuecomment-266345229.
   b. Generalize it to support more types of iterators.
   c. Any other improvements?
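For reference, a minimal sketch of what the `pmap` replacement looks like, using its `batch_size` and `retry_delays` keywords (the worker count here is illustrative):

```julia
using Distributed
addprocs(4)  # illustrative: four local worker processes

# Replacement for `@parallel for i in 1:1000 ... end`: blocking,
# batched dispatch to the worker pool, and retries on error.
results = pmap(1:1000; batch_size = 50, retry_delays = fill(1.0, 3)) do i
    i^2  # the former loop body becomes the mapped function
end
```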
cc: @andreasnoack
I'd vote for keeping one paradigm that works well. So +1 to deprecate `@parallel for` and put all marketing behind `pmap`.
`@parallel for` has a proven track record of confusing people and, I think, with good reason. It wants to pretend that it is a normal `for` loop - just parallel - but as @amitmurthy points out, the semantics are different. It is somewhat misleading advertising to pretend that you can get parallelism just by annotating a `for` loop with a macro. It can easily go wrong and it often requires explanation, so I think we'd be better off providing alternatives.
Our `@parallel (op) for` is a parallel `(map)reduce` underneath, and the semantics are similar to `mapreduce` (the map is merged into the reduction operator). Therefore, I think it would be simpler to expose a parallel `mapreduce`. I don't think users would expect to be able to update an array from the outer scope in a parallel `mapreduce`. In contrast, this happens frequently with `@parallel for`, because updating an array in a normal `for` loop is a common pattern.
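A sketch of that equivalence (the partitioning of the range across workers is handled by the macro; note that floating-point reduction order may differ between the two):

```julia
using Distributed

# Reducing form: the loop body is the "map", (+) is the reducer.
# (Spelled `@distributed (+) for ...` since 0.7.)
s1 = @parallel (+) for i in 1:10_000
    sin(i)
end

# The same computation as an ordinary (serial) mapreduce.
s2 = mapreduce(sin, +, 1:10_000)
```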
Considering @ararslan changed his vote, it is currently 4 to 1 in favor of removing `@parallel`. However, given the low number of responses, I'll take the easy way out and push it up to @StefanKarpinski to take a call on this.
I agree with deprecating this. "Parallel for" syntax of some kind I think only makes sense with shared memory, since then there is at least hope of matching serial behavior. At the very least it would have to be renamed, since we have two forms of parallelism now.
Certainly the reducing form can be deprecated to parallel `mapreduce`. We already have most of the machinery there with the `preduce` function that the macro uses.
Now that you mention it, usage in the context of SharedArrays is most appropriate. I suggest we deprecate it for one cycle and plan on moving it to SharedArrays (just the file for now, under a `SharedArrays` module when we have it). The documentation can be updated right away to restrict its mention to SharedArrays only.
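For context, a sketch of the SharedArrays pattern in question, where the loop body mutates shared state (written with the 0.7+ names; pre-rename this was `@sync @parallel`):

```julia
using Distributed, SharedArrays  # both are stdlib modules since 0.7
addprocs(4)

a = SharedArray{Float64}(100)
@sync @distributed for i in eachindex(a)
    a[i] = i^2  # each worker writes its own chunk of the shared array
end
```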
This doesn't seem quite ready or release blocking. Sorry, @amitmurthy!
Please don't deprecate `@parallel` until we have an easy and performant replacement in `pmap`: https://github.com/JuliaLang/julia/issues/21940.
We should sort out which names refer to distributed vs. shared memory for 1.0.
`@parallel for` looks, to me at least, almost identical to OpenMP's `#pragma omp parallel for` in C++, except that it is distributed rather than threaded?
I suggest simply renaming `@parallel` to `@distributed` to get this out of the way for 1.0. Part of the stdlib now, but probably still should be taken care of quickly.
Hi, I am a PhD student in parallel and distributed computing, and just recently found out about the effort of including native parallel and distributed support in the language. I think it is great.
I do agree that the naming of Julia's directives is a little confusing as to what is local parallelism and what is distributed parallelism, as both have their advantages and drawbacks.
I do suggest, though, that you keep the distributed/parallel macros (OpenMP style) and the functional map/reduce, since they belong to two different programming styles. But I think the reduce operation should also be handled by a specialization of the `reduce` function; a separate `map_reduce` function would be inconsistent.
I also think that local parallelism, with coroutines or threads, should be simpler and more straightforward than distributed parallelism, because the need for local parallelism is much more common.
What I think is missing from multi-threading are critical sections and atomic instructions. Do you intend to implement them for 1.0?
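For what it's worth, a minimal sketch of the atomic operations and lock-based critical sections that already exist in `Base.Threads` (run with `JULIA_NUM_THREADS` set for actual parallelism):

```julia
using Base.Threads

# Atomic instruction: lock-free accumulation across threads.
acc = Atomic{Int}(0)
@threads for i in 1:1_000
    atomic_add!(acc, i)
end
@assert acc[] == sum(1:1_000)

# Critical section: a lock serializes access to the guarded region.
lk = ReentrantLock()
@threads for i in 1:8
    lock(lk) do
        println("thread $(threadid()) entered the critical section")
    end
end
```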
And the resolution is? Neither `@parallel` nor `@distributed` seems to be present in 1.0.1. What happened?
It's called `@distributed` and it lives in the Distributed stdlib package since 0.7.
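For anyone landing here, a quick usage sketch (worker count illustrative):

```julia
using Distributed
addprocs(2)

# Reducing form: blocks and returns the reduction.
total = @distributed (+) for i in 1:100
    i
end

# Non-reducing form returns immediately; `@sync` waits and surfaces errors.
@sync @distributed for i in 1:10
    println("worker $(myid()) handled item $i")
end
```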
Got it. Didn't realize I had to use `using Distributed` to get it. In 1.0.1 it looks like I have to add many `using` statements.
1.0 moved a lot of stuff out of `Base`. The reasoning is discussed in the 1.0 release blog post.
Thanks, schnaltterm. Lots of good info in those release notes.