This issue is to debate changes to `@parallel for`, if any.
Options:

1. Deprecate `@parallel for` - the syntax, while convenient, differs from a typical `for` loop w.r.t. the iterables that can be supported, local variables, error handling, etc. The unstated assumption that a `@parallel for` is just a `for` loop executed in parallel is not always correct. `pmap` with batching can be used in scenarios where `@parallel for` was previously used, with the additional benefits of retrying on errors and elastic nodes (since `pmap` uses worker pools); a sketch follows this list.
2. Keep `@parallel for` with some changes:
   a. Make the non-reducer case blocking by default. Today you need to prefix a `@sync` to detect errors that would otherwise be discarded silently. See https://github.com/abhijithch/RecSys.jl/pull/36#issuecomment-266345229.
   b. Generalize it to support more types of iterators.
   c. Any other improvements?
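For reference, a minimal sketch of what the `pmap` replacement looks like, using its `batch_size` and `retry_delays` keywords (the worker count here is illustrative):

```julia
using Distributed
addprocs(4)  # illustrative: four local worker processes

# Replacement for `@parallel for i in 1:1000 ... end`: blocking,
# batched dispatch to the worker pool, and retries on error.
results = pmap(1:1000; batch_size = 50, retry_delays = fill(1.0, 3)) do i
    i^2  # the former loop body becomes the mapped function
end
```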
cc: @andreasnoack
I'd vote for keeping one paradigm that works well. So +1 to deprecate `@parallel for` and put all marketing behind `pmap`.
`@parallel for` has a proven track record of confusing people and, I think, with good reason. It wants to pretend that it is a normal `for` loop - just parallel - but as @amitmurthy points out, the semantics are different. It is somewhat misleading advertising to pretend that you can get parallelism just by annotating a `for` loop with a macro. It can easily go wrong and it often requires explanation, so I think we'd be better off providing alternatives.
Our `@parallel (op) for` is a parallel `(map)reduce` underneath, and the semantics are similar to `mapreduce` (the map is merged into the reduction operator). Therefore, I think it would be simpler to expose a parallel `mapreduce`. I don't think users would expect to be able to update an array from the outer scope in a parallel `mapreduce`. In contrast, this happens frequently with `@parallel for`, because updating an array in a normal `for` loop is a common pattern.
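A sketch of that equivalence (the partitioning of the range across workers is handled by the macro; note that floating-point reduction order may differ between the two):

```julia
using Distributed

# Reducing form: the loop body is the "map", (+) is the reducer.
# (Spelled `@distributed (+) for ...` since 0.7.)
s1 = @parallel (+) for i in 1:10_000
    sin(i)
end

# The same computation as an ordinary (serial) mapreduce.
s2 = mapreduce(sin, +, 1:10_000)
```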
Considering @ararslan changed his vote, it is currently 4 to 1 in favor of removing `@parallel`. However, given the low number of responses, I'll take the easy way out and push it up to @StefanKarpinski to take a call on this.
I agree with deprecating this. "Parallel for" syntax of some kind I think only makes sense with shared memory, since then there is at least hope of matching serial behavior. At the very least it would have to be renamed, since we have two forms of parallelism now.
Certainly the reducing form can be deprecated to parallel `mapreduce`. We already have most of the machinery there with the `preduce` function that the macro uses.
Now that you mention it, usage in the context of SharedArrays is most appropriate. I suggest we deprecate it for one cycle and plan on moving it to SharedArrays (just the file for now, under a `SharedArrays` module when we have it). The documentation can be updated right away to restrict its mention to SharedArrays only.
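For context, a sketch of the SharedArrays pattern in question, where the loop body mutates shared state (written with the 0.7+ names; pre-rename this was `@sync @parallel`):

```julia
using Distributed, SharedArrays  # both are stdlib modules since 0.7
addprocs(4)

a = SharedArray{Float64}(100)
@sync @distributed for i in eachindex(a)
    a[i] = i^2  # each worker writes its own chunk of the shared array
end
```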
This doesn't seem quite ready or release blocking. Sorry, @amitmurthy!
Please don't deprecate `@parallel` until we have an easy and performant replacement in `pmap`: https://github.com/JuliaLang/julia/issues/21940.
We should sort out which names refer to distributed vs. shared memory for 1.0.
`@parallel for` looks, to me at least, almost identical to OpenMP's `#pragma omp parallel for` in C++, except that it is distributed rather than threaded?
I suggest simply renaming `@parallel` to `@distributed` to get this out of the way for 1.0. Part of the stdlib now, but probably still should be taken care of quickly.
Hi, I am a PhD student in parallel and distributed computing, and just recently found out about the effort of including native parallel and distributed support in the language. I think it is great.
I do agree that the naming of Julia's directives is a little confusing as to what is local parallelism and what is distributed parallelism, as both have their advantages and drawbacks.
I do suggest, though, that you keep the distributed/parallel macros (OpenMP style) and the functional map/reduce, since they belong to two different programming styles. But I think the reduce operation should also be handled by a specialization of the `reduce` function; a separate `map_reduce` function would be inconsistent.
I also think that local parallelism, with coroutines or threads, should be simpler and more straightforward than distributed parallelism, because the need for local parallelism is much more common.
What I think is missing from multi-threading are critical sections and atomic instructions. Do you intend to implement them for 1.0?
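For what it's worth, a minimal sketch of the atomic operations and lock-based critical sections that already exist in `Base.Threads` (run with `JULIA_NUM_THREADS` set for actual parallelism):

```julia
using Base.Threads

# Atomic instruction: lock-free accumulation across threads.
acc = Atomic{Int}(0)
@threads for i in 1:1_000
    atomic_add!(acc, i)
end
@assert acc[] == sum(1:1_000)

# Critical section: a lock serializes access to the guarded region.
lk = ReentrantLock()
@threads for i in 1:8
    lock(lk) do
        println("thread $(threadid()) entered the critical section")
    end
end
```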
And the resolution is? Neither `@parallel` nor `@distributed` seems to be present in 1.0.1. What happened?
It's called `@distributed` and it lives in the Distributed stdlib package since 0.7.
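For anyone landing here, a quick usage sketch (worker count illustrative):

```julia
using Distributed
addprocs(2)

# Reducing form: blocks and returns the reduction.
total = @distributed (+) for i in 1:100
    i
end

# Non-reducing form returns immediately; `@sync` waits and surfaces errors.
@sync @distributed for i in 1:10
    println("worker $(myid()) handled item $i")
end
```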
Got it. Didn't realize I had to use `using Distributed` to get it. In 1.0.1 it looks like I have to add many `using` statements.
1.0 moved a lot of stuff out of `Base`. The reasoning is discussed in the 1.0 release blog post.
Thanks, schnaltterm. Lots of good info in those release notes.