Julia: `break` and `continue` out of multiple loops

Created on 9 Jan 2014 · 83 Comments · Source: JuliaLang/julia

In Julia, the statements break and continue affect the nearest enclosing loop, and there is currently no way to point them to outer loops.

speculative

All 83 comments

I know this seems limiting, but it is quite standard. At that point, you'd almost want to add goto (#101). Do you have an interesting use case for this?

@JeffBezanson No, but Dart has this.

The Dart spec is not too encouraging on this:

Labels should be avoided by programmers at all costs. The motivation for including labels in the language is primarily making Dart a better target for code generation.

Well, Julia is significantly faster than most dynamic languages, it is beneficial to make it a compiler target.

I sometimes wished we had this functionality when writing multi-dimensional loops, e.g.:

for i=1:n, j=1:m
   condition() && break 2
   dosomething()
end

instead of

skip = false
for i=1:n
    for j=1:m
        condition() && (skip=true; break)
        dosomething()
    end
    skip && break
end

I've sometimes wanted multilevel break and continue but C-style goto would be even better.

Well, Julia is significantly faster than most dynamic languages, it is beneficial to make it a compiler target.

This is a strange idea. If you want a great compiler target, compile to LLVM.

@StefanKarpinski Translating to Julia would be easier (and the code would be more readable) than compiling to LLVM

@carlobaldassi I wonder if the better solution for that case is to consider nested loops defined in that syntax as one loop for the purpose of break, as in https://github.com/JuliaLang/julia/issues/5154.

Yes, that is probably a more intuitive solution and I'd welcome that change. I also agree that goto is better than multilevel break/continue, if feasible.

One might get complaints from people who'd want to make a seemingly-small change like

for j = 1:n
    offset = (j-1)*size(A, 1)
    for i = 1:m
        condition() && multi-loop-break  # placeholder for a hypothetical keyword, not valid Julia
        dosomething()
    end
end

But the skip thing does work, so it's not like there's no viable way forward.

To clarify, what I meant was that if someone started writing some code with the one-loop syntax, counting on how break works there, then made a change to the two-loop syntax so they could add one small thing, they might be dismayed to discover that the break now works differently and might introduce a hard-to-find bug.

Honestly, I think the way break and continue work in multiple for loop syntax currently is so unintuitive that I doubt anyone is using it. I have to desugar the syntax in my head to remember how it should work. I strongly feel that we should change the multiple iteration syntax to mean a different thing than just nested loops.

I'm on board with that. break is a bit unfortunate in general, since it really turns a loop into a different kind of thing, no longer parallel.

@StefanKarpinski, it's not that I'm worried about people using break in multiple loops _now_, I'm worried about the implications if we use the same word to mean two different things, depending on whether it's written in one-loop syntax or two-loop syntax. It's something like 8 editor keystrokes to convert between

for j = 1:m, i = 1:n
    # some _huge_ function body that had a break buried somewhere deeply inside it
end

and

for j = 1:m; for i = 1:n
    # some _huge_ function body that had a break buried somewhere deeply inside it
end; end

I've made such conversions myself many times, and currently there is no penalty for doing so. _Nothing else_ in julia changes its meaning when you do so, but if we introduce the asymmetry in break, that won't be true anymore. Since for most functions it won't actually generate an error---just different algorithmic behavior---the resulting bug might be quite hard to track down.

If we just use a different word from break, there would be no issue.

That is why I originally made the multi-loop syntax just a syntax rewrite. With the change though, we'd be thinking about these not as nested loops but a single loop over a multi-dimensional space, like map(f, A) where A is >1d. And if we introduce for i in X..., there will be no simple way to rewrite it that exposes all of the loops. With that syntax I think it's even clearer that break should break out of the whole thing.
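For readers on a current release: the "single loop over a multi-dimensional space" reading described above is how Julia 1.0 and later behave, where a break inside the comma form exits the entire nest. The following is a minimal sketch of the difference between the two spellings; the function names are made up for illustration, and the recorded behavior assumes Julia 1.0 or newer (in 2014-era Julia the comma form was a plain rewrite into nested loops).

```julia
# Record which (i, j) pairs are visited before `break` takes effect.

# Comma form: on Julia >= 1.0, `break` leaves the whole nest.
function visited_comma_form(n, m)
    visited = Tuple{Int,Int}[]
    for i = 1:n, j = 1:m
        push!(visited, (i, j))
        j == 2 && break
    end
    return visited
end

# Explicitly nested form: `break` leaves only the inner loop,
# so the outer loop moves on to the next `i`.
function visited_nested_form(n, m)
    visited = Tuple{Int,Int}[]
    for i = 1:n
        for j = 1:m
            push!(visited, (i, j))
            j == 2 && break
        end
    end
    return visited
end
```

This is exactly the asymmetry debated in the comments that follow: converting between the two spellings changes what break does.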

Don't get me wrong, I also think breaking out of the whole thing is the _more_ intuitive behavior, even if it doesn't work that way in C, and for i in X... was indeed on my mind as well (though I noticed that most of the concrete examples I had thought of using it for were similar to maximum_rgn, which would be better expressed as a while so you can preserve iterator state for phase 2 of the algorithm).

I'm just worried about how we deprecate break in the case of ordinary single loops. Using a different word has its annoyances, but might be the safest.

Using a different word has its annoyances, but might be the safest.

breakall?

To break out of all loops in the current scope?

for the particular case of nested loops chained together with commas, at least.

Oo, that seems more confusing to me since I would expect breakall to break out of all loops I'm inside of.

Hmm, more boring than my idea of awesomebreak but probably better :smile:

breakexactlyasmanyloopsasiwant

I like it! Can we have more such keywords please?

Or, in German: donaudampfschiffahrtsgesellschaftskapitän.

exit() will break out of all loops

Do you sing Blue Danube, too?

I do not, but I sing some mean Prince at karaoke. @JeffBezanson can bear witness.

breakallfurrealz()=@error("All the loops need to exit now. Please hold down the power button for at least four seconds.")

easy peasy

The power button is even more emphatic than exit() as a way of breaking out of loops. That would do it for sure!

More seriously, nbreak? It conveys the notion that it's equivalent to a countable number of breaks, even if you don't have to explicitly specify n.

Presumably we'd be planning to make it an error to use break inside a multiloop, and an error to use whateverwecallit inside a single loop?

Stefan's suggestion makes me feel so privileged having umlauts on my keyboard.
"Can someone ask the German guy to write this code? We have to break out of a multiloop!" ;-)

More seriously, I would make the behavior change now and use break to break out of all loops. A break that does not break out of the complete loop feels wrong to me. If someone wants to break only out of one loop, he has to use two loops. Makes all this more orthogonal

More seriously, I would make the behavior change now and use break to break out of all loops.

The ultimate breaking change, but yes, I would be fine with that. I'm not sure how many people would be affected by the change in the end, since it is currently not possible to write the for i=1:N, j=1:N type loop and break out twice.

Applied 'breaking' tag.

#5154 is that issue. This one is a feature request to be able to pick which loop to break out of.

I'd prefer goto instead. Once there's a question of which loop you're breaking out of, goto is clearer.

To make sure everyone is on the same page (many clearly are already),

function find_first_nan_in_a_bunch_of_matrices_without_using_linear_indexing(As...)
    rowindex = zeros(Int, length(As))
    colindex = zeros(Int, length(As))
    for iarray = 1:length(As)
        A = As[iarray]
        for j = 1:size(A, 2), i = 1:size(A, 1)
            if isnan(A[i,j])
                rowindex[iarray] = i  # i runs over rows
                colindex[iarray] = j  # j runs over columns
                break  # oops, didn't mean to stop processing the arrays, I just meant the row/col loop
            end
        end
    end
    return rowindex, colindex
end

That's why this is tricky. Having break and multibreak both mean "break from one syntactic layer depending on how the loop is written," but to have them _not_ be interchangeable (meaning, it's an error to use break inside a multiloop or multibreak inside a single loop) would be safer, even if it is a bit ugly.

Yup, goto solves it.

Unfortunately goto will be too familiar for some Fortran developers, but none of my current friends is in that category, so I'll be fine.

Java has labeled break and continue statements (see e.g. docs.oracle.com/javase/tutorial/java/nutsandbolts/branch.html), which seems like a middle ground where you get the clear location of the labels without the full mess potential of goto. They chose to label the statement that you break rather than where you will end up; I'm not sure that that's the most intuitive choice.

The thing is we (or at least I) _want_ full goto – see https://github.com/JuliaLang/julia/issues/101. It's a useful construct in certain situations and we have it in the lowered AST anyway, so it's mainly a matter of syntax to expose it. The whole "Dijkstra said it's bad" business is blown way out of proportion and context. He was writing that paper in a world where goto was _the_ predominant control flow construct. Goto is so fully squashed today that it poses no real danger – no one in 2014 is about to start writing programs using goto to jump all over the place.

no one in 2014 is about to start writing programs using goto to jump all over the place.

How spine-chillingly ominous.

I wonder how many people have read Dijkstra without reading Knuth's commentary.

I have to confess that I'd never read this. Reading it now – thanks for the reference, @pao!

While not being against goto, might I ask for a use case for goto? Knuth's paper seems to be about saving some instructions in an inner loop. I would think that today the compiler is smart enough to optimize these situations.

Simd autovectorization will probably not benefit from gotos.

I think most situations requiring goto can also be implemented using exceptions. And this is not uncommon in other languages.

Sometimes it's by far the clearest way to express the control flow of a program. For example, when you may need to go back arbitrarily far in interactive data acquisition (i.e. asking the user for a series of dependent values), or if you need to do some common cleanup but there are lots of exit points that lead to that common cleanup code. The sequential input pattern can be expressed with do/while constructs, but the depth of loop nesting grows exponentially with the number of inputs, whereas it's trivial and clear to express this with gotos without any nesting. The common cleanup can be achieved with a combination of breaks and control variables, but it's annoying and unclear. Another approach is lots of tail calls to an otherwise gratuitous cleanup helper function, but if there is state (which there usually is, otherwise what's there to clean up?) it either needs to be explicitly passed into the helper function or the helper function needs to be a closure – introducing unnecessary overhead in either case. Again, this is completely trivial and clear to express using goto, and the best possible code a clever compiler could produce is exactly what the programmer wants to write with gotos. Exceptions are not a good solution to any of these situations – they have unacceptably high overhead and completely obscure the control flow.
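The "common cleanup with many exit points" pattern described above can be sketched with the @goto and @label macros that Julia provides today. This is only an illustration: the function name, the log vector, and the fail_at parameter are hypothetical stand-ins for real resource handling.

```julia
# Several early exits all funnel into one cleanup section.
# `fail_at` is a made-up knob that simulates a failure at a given step.
function process_steps(log::Vector{String}, fail_at::Int)
    push!(log, "acquire")        # pretend to acquire a resource
    ok = false
    if fail_at == 1
        @goto cleanup            # first exit point
    end
    push!(log, "step 1")
    if fail_at == 2
        @goto cleanup            # second exit point
    end
    push!(log, "step 2")
    ok = true

    @label cleanup
    push!(log, "release")        # shared cleanup runs on every path
    return ok
end
```

No control variables or helper closures are needed; every path reaches the single cleanup label. When the cleanup must also run if an exception is thrown, try ... finally is the usual tool instead.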

The point I wanted to make is that

  • there is already a means to break out of the control flow
  • in various cases breaking out of a loop is done in an "exceptional" case. It should be fine to use exceptions then.

But I have to admit that I have not used exceptions in Julia and don't know if it is possible to create one's own exceptions, which would be required here.
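On the question of user-defined exceptions: Julia does allow them, by declaring a struct that subtypes Exception. Below is a minimal sketch of breaking out of two loops this way. The FoundNaN type and the function name are invented for this example, and the approach carries the overhead discussed in the replies that follow.

```julia
# Hypothetical exception type carrying the position that was found.
struct FoundNaN <: Exception
    row::Int
    col::Int
end

# Leave both loops at once by throwing, then catch right outside them.
function first_nan_via_exception(A::AbstractMatrix)
    try
        for j in axes(A, 2)
            for i in axes(A, 1)
                isnan(A[i, j]) && throw(FoundNaN(i, j))
            end
        end
    catch err
        err isa FoundNaN || rethrow()   # let unrelated errors propagate
        return (err.row, err.col)
    end
    return nothing                      # no NaN anywhere
end
```

For example, first_nan_via_exception([1.0 2.0; NaN 4.0]) locates the NaN in row 2, column 1.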

@StefanKarpinski: I agree that goto would actually be a nice thing to have.

@tknopp, exceptions introduce a substantial overhead, and are therefore not ideal for ordinary control-flow.

Define "ordinary control-flow". From my point of view avoiding exceptions due to "substantial overhead" is in most cases premature optimization. I can see that there are situations where breaking out of a loop is part of the normal flow and in these situations goto is superior. But in "exceptional cases" (think of a "file not found" thing) I don't think the exception overhead is an issue.

We're in complete agreement. My point is that here the topic is about breaking out of a loop as part of the normal flow (see for example my demo find_first_nan_in_a_bunch_of_matrices_without_using_linear_indexing above).

True, so the question is whether one wants breakall to break out of all loops, or whether goto comes in, in which case breakall would not be necessary.

I am myself not sure which I like more but just wanted to question if breakall+exceptions might cover all use cases.

What about break LABEL and continue LABEL ? (This syntax is supported by Java and some other languages).

This is, in my opinion, a little bit clearer than goto LABEL, as it clearly conveys _what to do_ after you jump to the label (break or continue). It is clearly more flexible than breakall as you can specify which level of the loops you want to break.

I've never really cared much for the break LABEL and continue LABEL style. goto LABEL seems clearer to me.

continue label is a functionality different from goto because it will increase the counter of the outer loop, right?

+Inf for a local goto.
We could talk to people all about its badness and people should avoid using it as long as they can. But let's have _one_ goto first!
However, I don't know whether it's easy or not for the compiler to optimize code in the presence of goto.

Goto to a label just after the last statement in a loop is equivalent to a continue.

I 100% agree that goto would be simpler and more general than break with labels. It is almost exactly the same feature, except doesn't require a loop to be able to use it.

On a related note, I found myself wanting to do this:

for n in 1:num
  if something
    n += 1  # skip next iteration
    continue
  end
end

That doesn't work and so I had to use a boolean variable to track this. Is there any interest in letting continue and break take an integer for the number of items to continue/break? For example:

break 2  # break out of two levels of loops (default is 1)
...
continue 1 # continue to next iteration (default)
continue 2 # continue to n+2 iteration

This might be an easy way to get some more expressiveness without adding new constructs...but I don't know of other languages that do this so it is probably a really bad idea :). Thoughts?

Glen

I think at that point I'd use a while loop.

You can use start, done and next to use iterators in a while loop.
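The start, done and next functions named here were the iteration protocol at the time of this comment; since Julia 1.0 the same idea is written with iterate. As a rough sketch of the "skip the next iteration" use case with a while loop (the function name and the should_skip_next predicate are hypothetical):

```julia
# Process items in order; when `should_skip_next(item)` is true,
# the element that follows `item` is skipped entirely.
function process_skipping(items, should_skip_next)
    processed = eltype(items)[]
    next = iterate(items)
    while next !== nothing
        item, state = next
        push!(processed, item)
        next = iterate(items, state)
        if should_skip_next(item) && next !== nothing
            # Advance once more so the following element is never processed.
            _, state = next
            next = iterate(items, state)
        end
    end
    return processed
end
```

Because the iterator state is held explicitly, advancing it twice gives the effect that the proposed `continue 2` was after, without any new syntax.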

You can wrap it in a closure and return to break the nested loop.

(function()
  for i = 1:100
    for j = 1:100
      return
    end
  end
end)()
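The same trick reads more naturally with a named function, which can also hand back a result from the point where the loops were abandoned. A small sketch, with a made-up function name and search condition:

```julia
# `return` leaves both loops at once and reports where the match was found.
function find_pair(xs, ys, target)
    for (i, x) in enumerate(xs)
        for (j, y) in enumerate(ys)
            x + y == target && return (i, j)
        end
    end
    return nothing   # no pair sums to `target`
end
```

Here find_pair([1, 2, 3], [10, 20], 22) stops at the pair 2 + 20 and never looks at the remaining elements.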

Today I found myself wanting to break out of an inner loop and continue an outer loop, and dreamed of being able to do it like this:

for i in some_collection
    for j in another_collection
        [do_something]
        if some_condition
            break, continue
        end
        [do_something_more]
    end
    [do_yet_another_something]
end

@GunnarFarneback That use case is I think better covered by #1289

There are several alternatives proposed in #1289, at least some of which would work fine for my example. I'm not sure about better though; there's a lack of consensus about the intuitive meaning of for/else whereas there's not that much "break, continue" could possibly mean other than breaking the first loop and continuing the second.

Another difference is that they generalize very differently. If the example instead was

for i in some_collection
    for j in another_collection
        [...]
        if condition1
            break, continue
        elseif condition2
            break
        elseif condition3
            break, break
        end
        [...]
    end
    [...]
end

a for/else wouldn't be helpful at all. And conversely I'm sure there are use cases where for/else would fit perfectly whereas multiple break/continue would be entirely useless.

I kind of like the comma separating these. Once you get what it means, it's pretty clear and general.

I'm reopening because I think the break, continue idea is worthwhile. Syntax: zero or more break keywords separated by commas, followed by an optional continue; having a continue before a break makes no sense and should be a syntax error.

Right. Forgot to actually reopen. Thanks, @pao.

Implementing this in the parser is beyond me but as it happens "break; continue" is currently valid Julia syntax. The semantics are less useful but nothing a little macro magic can't fix. Proof of concept implementation as a macro in this gist for those who want to try it out.

x-ref a julia-users discussion: https://groups.google.com/d/msg/julia-users/byhFGOJz7tQ/xu16ktAVAAAJ

I think explicit or implicit named loops would be safer than the break, continue proposal above based on depth count. (implicit labels could be generated from the first loop-var or iterator-var name). Also, this could be easily provided outside the core language by macro sugar over @goto.

I agree with @ihnorton that explicit labels are attractive. We can reuse @label for this. To take the example from https://discourse.julialang.org/t/named-for-loops/27564 we currently have:

for i=1:3
    for j=1:3
        for k=1:3
            @info "indices" (i, j, k)
            if k == 2
                @goto end_j
            end
        end
    end
    @label end_j
end  

To extend @label, we can treat it as a verb which just creates labels end_j and continue_j from the loop counter:

for i=1:3
    @label for j=1:3
        for k=1:3
            @info "indices" (i, j, k)
            if k == 2
                @goto end_j
            end
        end
    end
end  

We'd also need @label lname while ... for while loops and for for loops with complex assignments.

This would need changes to lowering to support correct placement of continue_j and to support breaking out of try ... finally with a @goto (ensuring the finally block is executed). Alternatively, allow break j and continue j to go with the @label.

[Edit: or in hindsight this doesn't look so fantastic. Maybe better just to go with @label and @goto as they are.]

I'm reopening because I think the break, continue idea is worthwhile. Syntax: zero or more break keywords separated by commas, followed by an optional continue; having a continue before a break makes no sense and should be a syntax error.

Would this stop at function boundaries? Or could I escape a loop in the caller function? Not that it is a good idea, it's just something that got me thinking 😀

Would this stop at function boundaries?

Yes.

Proof of concept implementation as a macro in this gist for those who want to try it out.

Now available as the registered package Multibreak. The difference to the proposed break, break syntax is that the package uses semicolon instead of comma and requires attaching the @multibreak macro to an enclosing scope. Try it out when you run into situations where you'd want to break or continue out of multiple loops.

Thanks for the working functionality, @GunnarFarneback.

Regarding the still-open issue for the core language, just a thought:

I looked at Go, Java, JavaScript, Perl, Dart, and modern Fortran for their multilevel break/continue done via labeling, as I'm sure is background for this thread.

I now understand the reluctance to adopt the same in Julia--because disappointingly (to me) all use goto-style labeling for break/continue. I say "goto-style" because all follow the ubiquitous LABEL: syntax indenting a code line or occupying its own line, shared by goto when present in the language (Go and Perl) with the exception of Fortran's numbered goto.

This tradition seems in part why people spoke of offering goto "instead of" labeled break/continue in Julia: "goto would be simpler and more general than break with labels"; and "I've never really cared much for the break LABEL and continue LABEL style. goto LABEL seems clearer to me." (Taken from the thread.) Under goto-style labeling, I must agree. But these quips never made full sense to me, because labeled break/continue when offered is supposed to be modern structured programming, thus incomparable to goto and independent of its offering. And certainly not conflated with it, despite that both require a labeling.

(Nothing against gotos.)

Maybe that's unfair ragging, so to the point: non-goto-style labeling for break/continue. Compare in Julia: continue labeled goto-style,

@label PROCESSING_LOOP
for i in I
  # …
  for j in J
    # …
    condition1 && continue PROCESSING_LOOP
    # …
  end
  # …
end

...vs. a more structured programming look like I'd expect,

for i in I labeled PROCESSING_LOOP
  # …
  for j in J
    # …
    condition1 && continue PROCESSING_LOOP
    # …
  end
  # …
end

Besides the unobtrusive readability, importantly the latter continue PROCESSING_LOOP tells unmistakably _what_ to continue, not "where".

While of course doing so in slightly simpler reading than the goto workaround with label above the final end.

Note Python's PEP for multilevel break/continue had precisely the same thought, a postfixed labeling, which if adopted would have bucked the goto-style tradition above.

Goto-style labels have never been very satisfying for this... if we interpret "the statement immediately following" such a label as _what_ to break/continue (not where), this antagonizes the label's other meaning as a code location for goto used traditionally. Which makes the resulting break/continue unnatural to read, as remarked in the above quips and this thread's 2014 discussion that led to its closing, dismissing labeled break/continue as little more than a limited goto syntax. Whereas the variant here clearly is not, as a modern-structured-programming structure independent of goto.

(Just as a bare single-level break/continue is not dismissed as a limited goto syntax.)

An alternative observation of this distinction from goto labeling, pre/post-fixing aside, is to simply note that the statement immediately following a goto's label can normally be anticipated to be swapped--throughout code development and maintenance as lines are inserted/removed there. So the label naturally occupies its own line. (What I've called a code location.) Whereas the new keyword attaches specifically to a statement, for reference to _it_. (This gives self-referencing code.)

Might I ask, @JeffBezanson, since you closed the issue back when labeled break/continue was the option on the table, before 2016's unlabeled variant led to reopening, can you confirm that the reasoning behind closing no longer quite applies here the same, no longer indicates closing the door on labeled break/continue?

I do not presume to abandon the 2016 unlabeled variant, before the two compete for adoption. Also I'm a nobody here, so I won't debate beyond this post.

Interestingly, Julia might be better positioned than most languages containing goto to adopt unique-from-goto labeling for break/continue, due to goto and its labels being relegated to macro-hood.

(To give credit where due, Fortran already achieved this separation actually, albeit slightly ironically (to me): break/continue labeling is done goto-style, as I call it, and goto labeling via old-school numbering. The former does employ a nice terminology though, "named constructs", which could be borrowed.)

On that note, distancing the new keyword from @label and "labeling" terminology might be desirable, I could understand, such as a keyword conjuring "named constructs" or whatever...

while delta > epsilon named GRADIENT_DESCENT
  # …

(using while in this example for variety)

Now, the weighing of unlabeled vs. labeled syntax has not been taken up here, although I'll note once as others have that any break/continue by reference is arguably safer read/stated than the clever proposal of break, break, break, continue etc. Which I admit to admiring, besides the way its verbosity scales and besides the strain to human-parse it.

An interesting idea would be to consider the loop variable an implicit label for a loop. E.g. you could write break i and break the loop where i is the loop variable. This is complicated by destructuring like for (i, x) in enumerate(v) and such, but you could use i or x in such a case.

That opens up for interesting surprises if you carelessly switch the loop order.

From my point of view the main attraction of break, break is that it's entirely structural and doesn't involve searching for labels or other identifiers. Certainly, if the loops are sufficiently deep or convoluted there will be a point where unraveling the structure will be more challenging than following a label but don't let us be fooled by the generality of these constructions. I expect that the great majority of the use cases will be break, break, a smaller minority break, continue and only a tiny fraction going deeper. For the truly complex cases I would be content with a standard @goto/@label. And I would consider refactoring the code.

That opens up for interesting surprises if you carelessly switch the loop order.

I wonder: are those cases any worse than the other kinds of order dependence which can make switching loops invalid? I would guess there's more insidious cases which already exist in normal julia code, such as push!ing elements into an ordered collection. In that case the loop variables aren't even mentioned.

By which rules do we determine the loop variables of while loops?

One of the things that bothers me about break, break and break, continue is that it appears more orthogonal than it actually is. The syntax suggests that continue, break and continue, continue should also make sense but they do not: it only makes sense to break or continue a specific loop that you're in. It would be more honest to write break@2 or continue@2 since the only degrees of freedom are whether you want to break or continue and which loop to break or continue.

By which rules do we determine the loop variables of while loops?

Yeah, that's an issue—that isn't clear at all.

Yeah, that's an issue—that isn't clear at all.

But for cartesian for loops that might be the perfect solution. Is it necessary to tackle all types of loops with the same solution? Because breaking intermediate cartesian for loops currently can only be done by splitting them up. Besides that in the case of cartesian for loops continuing a layer is equivalent of breaking the next inner layer.

As a baseline, @goto with suggestive choice of label names is a quite competitive status quo:

    for i in some_collection
        for j in another_collection
            println(i, " ", j)
            if condition()
                @goto continue_outer
            end
            if condition()
                @goto break_outer
            end
        end
        @label continue_outer
    end
    @label break_outer
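To try this baseline out, the same shape can be wrapped in a function with concrete conditions. The sketch below is illustrative only: the function name and the "negative entry" and "zero entry" rules are invented stand-ins for the two condition() calls.

```julia
# Collect indices of rows without negative entries,
# and stop scanning altogether at the first zero.
function scan_rows(rows)
    clean = Int[]
    for (i, row) in enumerate(rows)
        for x in row
            if x < 0
                @goto continue_outer   # acts like `break, continue`
            end
            if x == 0
                @goto break_outer      # acts like `break, break`
            end
        end
        push!(clean, i)                # reached only if the inner loop ran to completion
        @label continue_outer
    end
    @label break_outer
    return clean
end
```

The label names carry the intent, which is the point made above: continue_outer sits just before the outer loop's end, and break_outer just after it.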

@goto with suggestive choice of label names is a quite competitive status quo

I completely agree with this. In addition I find that many places where I might want this can be replaced with a little refactoring and an early return. So I feel like the bar should be fairly high for new syntax. In particular, introducing a keyword to label loops feels excessive.

Hence the idea of trying to use a macro to allow labeling loops in some way (eg, some variant of reusing the @label macro as I suggested above): it's a middle ground which doesn't introduce language syntax, but does allow things like while loops to be unambiguously labelled.

it's a middle ground which doesn't introduce language syntax, but does allow things like while loops to be unambiguously labelled

To expand on this: we could declare that

  • Unambiguous syntactic forms like for x in xs introduce a loop label x.
  • General ambiguous cases like while i < j can be manually labeled with something like @label myloop while i < j

This should allow continue x to work in simple obvious cases, and continue myloop to work in the general case without resorting to @goto. And it doesn't introduce any new keywords.
