In Julia, the statements break and continue affect the nearest enclosing loop, and there is currently no way to point them at outer loops.
I know this seems limiting, but it is quite standard. At that point, you'd almost want to add goto (#101). Do you have an interesting use case for this?
@JeffBezanson No, but Dart has this.
The Dart spec is not too encouraging on this:
Labels should be avoided by programmers at all costs. The motivation for including labels in the language is primarily making Dart a better target for code generation.
Well, Julia is significantly faster than most dynamic languages, it is beneficial to make it a compiler target.
I sometimes wished we had this functionality when writing multi-dimensional loops, e.g.:
for i=1:n, j=1:m
    condition() && break 2
    dosomething()
end
instead of
skip = false
for i=1:n
    for j=1:m
        condition() && (skip=true; break)
        dosomething()
    end
    skip && break
end
I've sometimes wanted multilevel break
and continue
but C-style goto would be even better.
Well, Julia is significantly faster than most dynamic languages, it is beneficial to make it a compiler target.
This is a strange idea. If you want a great compiler target, compile to LLVM.
@StefanKarpinski Translating to Julia would be easier (and the code would be more readable) than compiling to LLVM
@carlobaldassi I wonder if the better solution for that case is to consider nested loops defined in that syntax as one loop for the purpose of break
, as in https://github.com/JuliaLang/julia/issues/5154.
Yes, that is probably a more intuitive solution and I'd welcome that change. I also agree that goto is better than multilevel break/continue, if feasible.
One might get complaints from people who'd want to make a seemingly-small change like
for j = 1:n
offset = (j-1)*size(A, 1)
for i = 1:m
condition() && multi-loop-break
dosomething()
But the skip
thing does work, so it's not like there's no viable way forward.
To clarify, what I meant was that if someone started writing some code with the one-loop syntax, counting on how break works there, then made a change to the two-loop syntax so they could add one small thing, they might be dismayed to discover that the break now works differently and might introduce a hard-to-find bug.
Honestly, I think the way break and continue work in multiple for loop syntax currently is so unintuitive that I doubt anyone is using it. I have to desugar the syntax in my head to remember how it should work. I strongly feel that we should change the multiple iteration syntax to mean a different thing than just nested loops.
I'm on board with that. break
is a bit unfortunate in general, since it really turns a loop into a different kind of thing, no longer parallel.
@StefanKarpinski, it's not that I'm worried about people using break
in multiple loops _now_, I'm worried about the implications if we use the same word to mean two different things, depending on whether it's written in one-loop syntax or two-loop syntax. It's something like 8 editor keystrokes to convert between
for j = 1:m, i = 1:n
# some _huge_ function body that had a break buried somewhere deeply inside it
end
and
for j = 1:m; for i = 1:n
# some _huge_ function body that had a break buried somewhere deeply inside it
end; end
I've made such conversions myself many times, and currently there is no penalty for doing so. _Nothing else_ in julia changes its meaning when you do so, but if we introduce the asymmetry in break
, that won't be true anymore. Since for most functions it won't actually generate an error---just different algorithmic behavior---the resulting bug might be quite hard to track down.
If we just use a different word from break
, there would be no issue.
That is why I originally made the multi-loop syntax just a syntax rewrite. With the change though, we'd be thinking about these not as nested loops but a single loop over a multi-dimensional space, like map(f, A)
where A is >1d. And if we introduce for i in X...
, there will be no simple way to rewrite it that exposes all of the loops. With that syntax I think it's even clearer that break
should break out of the whole thing.
Don't get me wrong, I also think breaking out of the whole thing is the _more_ intuitive behavior, even if it doesn't work that way in C, and for i in X...
was indeed on my mind as well (though I noticed that most of the concrete examples I had thought of using it for were similar to maximum_rgn
, which would be better expressed as a while
so you can preserve iterator state for phase 2 of the algorithm).
I'm just worried about how we deprecate break
in the case of ordinary single loops. Using a different word has its annoyances, but might be the safest.
Using a different word has its annoyances, but might be the safest.
breakall
?
To break out of all loops in the current scope?
for the particular case of nested loops chained together with commas, at least.
Oo, that seems more confusing to me since I would expect breakall
to break out of all loops I'm inside of.
Hmm, more boring than my idea of awesomebreak
but probably better :smile:
breakexactlyasmanyloopsasiwant
I like it! Can we have more such keywords please?
Or, in German: donaudampfschiffahrtsgesellschaftskapitän
.
exit()
will break out of all loops
Do you sing Blue Danube, too?
I do not, but I sing some mean Prince at karaoke. @JeffBezanson can bear witness.
breakallfurrealz()=@error("All the loops need to exit now. Please hold down the power button for at least four seconds.")
easy peasy
The power button is even more emphatic than exit()
as a way of breaking out of loops. That would do it for sure!
More seriously, nbreak
? It conveys the notion that it's equivalent to a countable number of breaks, even if you don't have to explicitly specify n
.
Presumably we'd be planning to make it an error to use break
inside a multiloop, and an error to use whateverwecallit
inside a single loop?
Stefan's suggestion makes me feel so privileged having umlauts on my keyboard.
"Can someone ask the german guy to write this code? We have to break out of a multiloop!" ;-)
More seriously, I would make the behavior change now and use break
to break out of all loops. A break that does not break out of the complete loop feels wrong to me. If someone wants to break only out of one loop, he has to use two loops. That makes all this more orthogonal.
More seriously, I would make the behavior change now and use
break
to break out of all loops.
The ultimate break
ing change, but yes, I would be fine with that. I'm not sure how many people would be affected by the change in the end, since it is currently not possible to write the for i=1:N, j=1:N
type loop and break
out twice.
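For readers on a current release: as far as the Julia 1.x manual documents it, a break inside the comma-separated form exits the entire nest, while the explicitly nested form still exits only the inner loop. A minimal sketch of the difference follows; the function names are hypothetical and chosen only for illustration.

```julia
# Count how many loop bodies run before `break` fires.
# Both function names are hypothetical, for illustration only.
function count_before_break_combined(n, m)
    visited = 0
    for i = 1:n, j = 1:m
        j == 2 && break      # in Julia 1.x this exits the whole i/j nest
        visited += 1
    end
    return visited
end

function count_before_break_nested(n, m)
    visited = 0
    for i = 1:n
        for j = 1:m
            j == 2 && break  # exits only the inner j loop
            visited += 1
        end
    end
    return visited
end
```

With n = m = 3 the combined form visits a single (i, j) pair, whereas the nested form visits one pair per value of i.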
Applied 'breaking' tag.
I'd prefer goto
instead. Once there's a question of which loop you're breaking out of, goto
is clearer.
To make sure everyone is on the same page (many clearly are already),
function find_first_nan_in_a_bunch_of_matrices_without_using_linear_indexing(As...)
    colindex = zeros(Int, length(As))
    rowindex = zeros(Int, length(As))
    for iarray = 1:length(As)
        A = As[iarray]
        for j = 1:size(A, 2), i = 1:size(A, 1)
            if isnan(A[i,j])
                rowindex[iarray] = i
                colindex[iarray] = j
                break # oops, didn't mean to stop processing the arrays, I just meant the row/col loop
            end
        end
    end
    return rowindex, colindex
end
That's why this is tricky. Having break
and multibreak
both mean "break from one syntactic layer depending on how the loop is written," but to have them _not_ be interchangeable (meaning, it's an error to use break
inside a multiloop or multibreak
inside a single loop) would be safer, even if it is a bit ugly.
Yup, goto
solves it.
Unfortunately goto
will be too familiar for some Fortran developers, but none of my current friends is in that category, so I'll be fine.
Java has labeled break and continue statements (see e.g.
docs.oracle.com/javase/tutorial/java/nutsandbolts/branch.html), seems like
a middle ground where you get the clear location of the labels without the
full mess potential of goto. They chose to label the statement that you
break rather than where you will end up, I'm not sure that that's the most
intuitive choice.
The thing is we (or at least I) _want_ full goto – see https://github.com/JuliaLang/julia/issues/101. It's a useful construct in certain situations and we have it in the lowered AST anyway, so it's mainly a matter of syntax to expose it. The whole "Dijkstra said it's bad" business is blown way out of proportion and context. He was writing that paper in a world where goto was _the_ predominant control flow construct. Goto is so fully squashed today that it poses no real danger – no one in 2014 is about to start writing programs using goto to jump all over the place.
no one in 2014 is about to start writing programs using goto to jump all over the place.
How spine-chillingly ominous.
I wonder how many people have read Dijkstra without reading Knuth's commentary.
I have to confess that I'd never read this. Reading it now – thanks for the reference, @pao!
While not being against goto, might I ask for a use case for goto? Knuth's paper seems to be about saving some instructions in an inner loop. I would think that today the compiler is smart enough to optimize these situations.
Simd autovectorization will probably not benefit from gotos.
I think most situations requiring goto can also be implemented using exceptions. And this is not uncommon in other languages.
Sometimes it's by far the clearest way to express the control flow of a program. For example, when you may need to go back arbitrarily far in interactive data acquisition (i.e. asking the user for a series of dependent values), or if you need to do some common cleanup but there are lots of exit points that lead to that common cleanup code. The sequential input pattern can be expressed with do/while constructs, but the depth of loop nesting grows exponentially with the number of inputs, whereas it's trivial and clear to express this with gotos without any nesting. The common cleanup can be achieved with a combination of breaks and control variables, but it's annoying and unclear. Another approach is lots of tail calls to an otherwise gratuitous cleanup helper function, but if there is state (which there usually is, otherwise what's there to clean up?) it either needs to be explicitly passed into the helper function or the helper function needs to be a closure – introducing unnecessary overhead in either case. Again, this is completely trivial and clear to express using goto, and the best possible code a clever compiler could produce is exactly what the programmer wants to write with gotos. Exceptions are not a good solution to any of these situations – they have unacceptably high overhead and completely obscure the control flow.
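The common-cleanup pattern described above might look roughly like this with the @goto and @label macros that Julia provides. This is only a sketch: the function name process and the trace vector (standing in for state that needs cleaning up) are hypothetical.

```julia
# Several exit points funnel into one cleanup section.
# `process` and `trace` are hypothetical; `trace` stands in for real state.
function process(values)
    trace = String[]
    status = :ok
    push!(trace, "open")
    if isempty(values)
        status = :empty
        @goto cleanup
    end
    for v in values
        if v < 0
            status = :negative
            @goto cleanup        # leaves the loop and skips straight to cleanup
        end
        push!(trace, "use $v")
    end
    @label cleanup
    push!(trace, "close")        # runs on every path through the function
    return status, trace
end
```

No control variable or helper closure is needed; each exit path names the cleanup section directly.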
The point I wanted to make is that
But I have to admit that I have not used exceptions in Julia and don't know if it is possible to create one's own exceptions, which would be required here.
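For reference, Julia does allow user-defined exception types: any struct that subtypes Exception can be thrown and caught. A minimal sketch of leaving nested loops that way is below; FoundIt and first_negative are hypothetical names, and this illustrates the idea rather than recommending it.

```julia
# A user-defined exception carrying the position that was found.
# `FoundIt` and `first_negative` are hypothetical names.
struct FoundIt <: Exception
    i::Int
    j::Int
end

function first_negative(A::AbstractMatrix)
    try
        for j in axes(A, 2), i in axes(A, 1)
            A[i, j] < 0 && throw(FoundIt(i, j))
        end
    catch err
        err isa FoundIt || rethrow()   # unrelated errors keep propagating
        return (err.i, err.j)
    end
    return nothing
end
```

The search proceeds column by column, so the first hit is the first negative entry in column-major order.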
@StefanKarpinski: I agree that goto would actually be a nice thing to have.
@tknopp, exceptions introduce a substantial overhead, and are therefore not ideal for ordinary control-flow.
define "ordinary control-flow". From my point of view avoiding exceptions due to "substantial overhead" is in most cases premature optimization. I can see that there are situations where breaking out of a loop is part of the normal flow and in these situations goto is superior. But in "exceptional cases" (think of a "file not found" thing) I don't think the exception overhead is an issue.
We're in complete agreement. My point is that here the topic is about breaking out of a loop as part of the normal flow (see for example my demo find_first_nan_in_a_bunch_of_matrices_without_using_linear_indexing
above).
true, so the question is whether one wants breakall
to break out of all loops or if goto comes in which case breakall
would not be necessary.
I am myself not sure which I like more but just wanted to question if breakall
+exceptions
might cover all use cases.
What about break LABEL
and continue LABEL
? (This syntax is supported by Java and some other languages).
This is, in my opinion, a little bit clearer than goto LABEL
, as it clearly conveys _what to do_ after you jump to the label (break or continue). It is clearly more flexible than breakall
as you can specify which level of the loops you want to break.
I've never really cared much for the break LABEL
and continue LABEL
style. goto LABEL
seems clearer to me.
continue label
is a functionality different from goto
because it will increase the counter of the outer loop, right?
+Inf for a local goto .
We could talk to people all about its badness and people should avoid using it as long as they can. But let's have _one_ goto first!
However, I don't know whether it's easy or not for the compiler to optimize code in the presence of goto.
Goto to a label just after the last statement in a loop is equivalent to a
continue
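That equivalence can be sketched with the existing @goto and @label macros. The two hypothetical functions below are intended to behave identically: one uses continue, the other jumps to a label placed as the last statement of the loop body.

```julia
# Sum the odd numbers in 1:n, skipping even ones with `continue`.
function sum_odds_with_continue(n)
    total = 0
    for k in 1:n
        iseven(k) && continue
        total += k
    end
    return total
end

# Same loop, but the skip is a jump to a label at the end of the body.
function sum_odds_with_goto(n)
    total = 0
    for k in 1:n
        iseven(k) && @goto next_iteration
        total += k
        @label next_iteration
    end
    return total
end
```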
I 100% agree that goto would be simpler and more general than break with labels. It is almost exactly the same feature, except doesn't require a loop to be able to use it.
On a related note, I found myself wanting to do this:
for n in 1:num
if something
n += 1 # skip next iteration
continue
end
end
That doesn't work and so I had to use a boolean variable to track this. Is there any appetite for letting continue and break take an integer for the number of items to continue/break? For example:
break 2 # break out of two levels of loops (default is 1)
...
continue 1 # continue to next iteration (default)
continue 2 # continue to n+2 iteration
This might be an easy way to get some more expressiveness without adding new constructs...but I don't know of other languages that do this so it is probably a really bad idea :). Thoughts?
Glen
I think at that point I'd use a while loop.
You can use start
, done
and next
to use iterators in a while loop.
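In Julia 1.0 and later, start, done and next were replaced by the single function iterate, which returns either an (item, state) tuple or nothing. A sketch of the "skip the next iteration" pattern written as a while loop under that protocol follows; the function name collect_skipping_after is hypothetical.

```julia
# Keep every element, but whenever `pred(item)` holds, skip the element
# that follows it. `collect_skipping_after` is a hypothetical name.
function collect_skipping_after(pred, itr)
    kept = eltype(itr)[]
    next = iterate(itr)
    while next !== nothing
        item, state = next
        push!(kept, item)
        next = iterate(itr, state)
        if pred(item) && next !== nothing
            # Advance once more to skip the element after `item`.
            _, state = next
            next = iterate(itr, state)
        end
    end
    return kept
end
```

Because the iteration state is advanced by hand, the loop can step forward by more than one element without a boolean flag.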
You can wrap it in a closure and return to break the nested loop.
(function()
for i = 1:100
for j = 1:100
return
end
end
end)()
Today I found myself wanting to break out of an inner loop and continue an outer loop, and dreamed of being able to do it like this:
for i in some_collection
for j in another_collection
[do_something]
if some_condition
break, continue
end
[do_something_more]
end
[do_yet_another_something]
end
@GunnarFarneback That use case is I think better covered by #1289
There are several alternatives proposed in #1289, at least some of which would work fine for my example. I'm not sure about better though; there's a lack of consensus about the intuitive meaning of for/else whereas there's not that much "break, continue" could possibly mean other than breaking the first loop and continuing the second.
Another difference is that they generalize very differently. If the example instead was
for i in some_collection
for j in another_collection
[...]
if condition1
break, continue
elseif condition2
break
elseif condition3
break, break
end
[...]
end
[...]
end
a for/else wouldn't be helpful at all. And conversely I'm sure there are use cases where for/else would fit perfectly whereas multiple break/continue would be entirely useless.
I kind of like the comma separating these. Once you get what it means, it's pretty clear and general.
I'm reopening because I think the break, continue
idea is worthwhile. Syntax: zero or more break
keywords separated by commas, followed by an optional continue
; having a continue
before a break
makes no sense and should be a syntax error.
Right. Forgot to actually reopen. Thanks, @pao.
Implementing this in the parser is beyond me but as it happens "break; continue" is currently valid Julia syntax. The semantics are less useful but nothing a little macro magic can't fix. Proof of concept implementation as a macro in this gist for those who want to try it out.
x-ref a julia-users discussion: https://groups.google.com/d/msg/julia-users/byhFGOJz7tQ/xu16ktAVAAAJ
I think explicit or implicit named loops would be safer than the break, continue
proposal above based on depth count. (implicit labels could be generated from the first loop-var or iterator-var name). Also, this could be easily provided outside the core language by macro sugar over @goto
.
I agree with @ihnorton that explicit labels are attractive. We can reuse @label
for this. To take the example from https://discourse.julialang.org/t/named-for-loops/27564 we currently have:
for i=1:3
for j=1:3
for k=1:3
@info "indices" (i, j, k)
if k == 2
@goto end_j
end
end
end
@label end_j
end
To extend @label
, we can treat it as a verb which just creates labels end_j
and continue_j
from the loop counter:
for i=1:3
@label for j=1:3
for k=1:3
@info "indices" (i, j, k)
if k == 2
@goto end_j
end
end
end
end
We'd also need @label lname while ...
for while loops and for for
loops with complex assignments.
This would need changes to lowering to support correct placement of continue_j
and to support breaking out of try ... finally
with a @goto
(ensuring the finally block is executed). Alternatively, allow break j
and continue j
to go with the @label
.
[Edit: or in hindsight this doesn't look so fantastic. Maybe better just to go with @label
and @goto
as they are.]
I'm reopening because I think the break, continue idea is worthwhile. Syntax: zero or more break keywords separated by commas, followed by an optional continue; having a continue before a break makes no sense and should be a syntax error.
would this stop at function boundaries? Or could I escape a loop in the caller function? Not that it is a good idea, it's just something that made me think 😀
would this stop at function boundaries?
Yes.
Proof of concept implementation as a macro in this gist for those who want to try it out.
Now available as the registered package Multibreak. The difference to the proposed break, break
syntax is that the package uses semicolon instead of comma and requires attaching the @multibreak
macro to an enclosing scope. Try it out when you run into situations where you'd want to break
or continue
out of multiple loops.
Thanks for the working functionality, @GunnarFarneback.
Regarding the still-open issue for the core language, just a thought:
I looked at Go, Java, JavaScript, Perl, Dart, and modern Fortran for their multilevel break/continue done via labeling, as I'm sure is background for this thread.
I now understand the reluctance to adopt the same in Julia--because disappointingly (to me) all use goto-style labeling for break/continue. I say "goto-style" because all follow the ubiquitous LABEL:
syntax indenting a code line or occupying its own line, shared by goto when present in the language (Go and Perl) with the exception of Fortran's numbered goto.
This tradition seems in part why people spoke of offering goto "instead of" labeled break/continue in Julia: "goto would be simpler and more general than break with labels"; and "I've never really cared much for the break LABEL and continue LABEL style. goto LABEL seems clearer to me." (Taken from the thread.) Under goto-style labeling, I must agree. But these quips never made full sense to me, because labeled break/continue when offered is supposed to be modern structured programming, thus incomparable to goto and independent of its offering. And certainly not conflated with it, despite that both require a labeling.
(Nothing against gotos.)
Maybe that's unfair ragging, so to the point: non-goto-style labeling for break/continue. Compare in Julia: continue labeled goto-style,
@label PROCESSING_LOOP
for i in I
# …
for j in J
# …
condition1 && continue PROCESSING_LOOP
# …
end
# …
end
...vs. a more structured programming look like I'd expect,
for i in I labeled PROCESSING_LOOP
# …
for j in J
# …
condition1 && continue PROCESSING_LOOP
# …
end
# …
end
Besides the unobtrusive readability, importantly the latter continue PROCESSING_LOOP
tells unmistakably _what_ to continue, not "where".
While of course doing so in slightly simpler reading than the goto workaround with label above the final end.
Note Python's PEP for multilevel break/continue had precisely the same thought, a postfixed labeling, which if adopted would have bucked the goto-style tradition above.
Goto-style labels have never been very satisfying for this... if we interpret "the statement immediately following" such a label as _what_ to break/continue (not where), this antagonizes the label's other meaning as a code location for goto used traditionally. Which makes the resulting break/continue unnatural to read, as remarked in the above quips and this thread's 2014 discussion that led to its closing, dismissing labeled break/continue as little more than a limited goto syntax. Whereas the variant here clearly is not, as a modern-structured-programming structure independent of goto.
(Just as a bare single-level break/continue is not dismissed as a limited goto syntax.)
An alternative observation of this distinction from goto labeling, pre/post-fixing aside, is to simply note that the statement immediately following a goto's label can normally be anticipated to be swapped--throughout code development and maintenance as lines are inserted/removed there. So the label naturally occupies its own line. (What I've called a code location.) Whereas the new keyword attaches specifically to a statement, for reference to _it_. (This gives self-referencing code.)
Might I ask, @JeffBezanson, since you closed the issue back when labeled break/continue was the option on the table, before 2016's unlabeled variant led to reopening, can you confirm that the reasoning behind closing no longer quite applies here the same, no longer indicates closing the door on labeled break/continue?
I do not presume to abandon the 2016 unlabeled variant, before the two compete for adoption. Also I'm a nobody here, so I won't debate beyond this post.
Interestingly, Julia might be better positioned than most languages containing goto to adopt unique-from-goto labeling for break/continue, due to goto and its labels being relegated to macro-hood.
(To give credit where due, Fortran already achieved this separation actually, albeit slightly ironically (to me): break/continue labeling is done goto-style, as I call it, and goto labeling via old-school numbering. The former does employ a nice terminology though, "named constructs", which could be borrowed.)
On that note, distancing the new keyword from @label
and "labeling" terminology might be desirable, I could understand, such as a keyword conjuring "named constructs" or whatever...
while delta > epsilon named GRADIENT_DESCENT
# …
(using while
this example for variety)
Now, the weighing of unlabeled vs. labeled syntax has not been taken up here, although I'll note once as others have that any break/continue by reference is arguably safer read/stated than the clever proposal of break, break, break, continue
etc. Which I admit to admiring, besides the way its verbosity scales and besides the strain to human-parse it.
An interesting idea would be to consider the loop variable an implicit label for a loop. E.g. you could write break i
and break the loop where i
is the loop variable. This is complicated by destructuring like for (i, x) in enumerate(v)
and such, but you could use i
or x
in such a case.
That opens up for interesting surprises if you carelessly switch the loop order.
From my point of view the main attraction of break, break
is that it's entirely structural and doesn't involve searching for labels or other identifiers. Certainly, if the loops are sufficiently deep or convoluted there will be a point where unraveling the structure will be more challenging than following a label but don't let us be fooled by the generality of these constructions. I expect that the great majority of the use cases will be break, break
, a smaller minority break, continue
and only a tiny fraction going deeper. For the truly complex cases I would be content with a standard @goto/@label
. And I would consider refactoring the code.
That opens up for interesting surprises if you carelessly switch the loop order.
I wonder: are those cases any worse than the other kinds of order dependence which can make switching loops invalid? I would guess there's more insidious cases which already exist in normal julia code, such as push!
ing elements into an ordered collection. In that case the loop variables aren't even mentioned.
By which rules do we determine the loop variables of while loops?
One of the things that bothers me about break, break
and break, continue
is that it appears more orthogonal than it actually is. The syntax suggests that continue, break
and continue, continue
should also make sense but they do not: it only makes sense to break
or continue
a specific loop that you're in. It would be more honest to write break@2
or continue@2
since the only degrees of freedom are whether you want to break or continue and which loop to break or continue.
By which rules do we determine the loop variables of while loops?
Yeah, that's an issue—that isn't clear at all.
Yeah, that's an issue—that isn't clear at all.
But for cartesian for loops that might be the perfect solution. Is it necessary to tackle all types of loops with the same solution? Because breaking intermediate cartesian for loops currently can only be done by splitting them up. Besides that in the case of cartesian for loops continuing a layer is equivalent of breaking the next inner layer.
As a baseline, @goto
with suggestive choice of label names is a quite competitive status quo:
for i in some_collection
for j in another_collection
println(i, " ", j)
if condition()
@goto continue_outer
end
if condition()
@goto break_outer
end
end
@label continue_outer
end
@label break_outer
@goto
with suggestive choice of label names is a quite competitive status quo
I completely agree with this. In addition I find that many places where I might want this can be replaced with a little refactoring and an early return. So I feel like the bar should be fairly high for new syntax. In particular, introducing a keyword to label loops feels excessive.
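The "refactor and return early" approach can be sketched as follows: moving the nested loops into their own function turns a multilevel break into a plain return. The function find_pair is a hypothetical example, not code from the thread.

```julia
# Search two collections for the first pair summing to `target`.
# `find_pair` is a hypothetical example.
function find_pair(xs, ys, target)
    for x in xs
        for y in ys
            x + y == target && return (x, y)   # leaves both loops at once
        end
    end
    return nothing   # no pair found
end
```

The caller then distinguishes the two outcomes by checking for nothing, with no labels or flag variables involved.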
Hence the idea of trying to use a macro to allow labeling loops in some way (eg, some variant of reusing the @label
macro as I suggested above): it's a middle ground which doesn't introduce language syntax, but does allow things like while
loops to be unambiguously labelled.
it's a middle ground which doesn't introduce language syntax, but does allow things like
while
loops to be unambiguously labelled
To expand on this: we could declare that
for x in xs introduces a loop label x.
while i < j can be manually labeled with something like @label myloop while i < j
This should allow continue x
to work in simple obvious cases, and continue myloop
to work in the general case without resorting to @goto
. And it doesn't introduce any new keywords.