Rfcs: Allow for and while loop to return value (without discussion of keyword names)

Created on 6 Oct 2016 · 30 Comments · Source: rust-lang/rfcs

Introduction

I noticed #961 didn't seem to go anywhere because of the discussion about the name of the new block. That's why I just started working on an RFC with the new block name that IMHO seemed to have the fewest issues, namely the !break one. However, I found that a lot of behavioural details were never actually decided on or discussed.

What this discussion is about (spoiler: not names of keywords)

This is why I want to start a new thread to discuss these details and not the names of keywords. The names used in this issue are not meant as final keywords for an RFC. They are simply meant to allow a discussion without confusion. After the issues here are settled, the names for the keywords can be discussed again in #961 or in a new issue.

Problems

The new proposal would allow for and while to return a value with the break statement, just like proposed for loop in #1624. We need extra behaviour because for and while can exit without a break. However, this can happen in two different ways. Either the loop body is never executed (for with an empty iterator or while with a false condition at the start). Or the body is executed at least once, reaches the end of the loop body or a continue statement and then the loop stops (e.g. because it's out of elements or the condition has become false).
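To make the two non-break exits concrete, here is a minimal sketch in current stable Rust that tracks them by hand. The `Exit` enum and the `scan` function are hypothetical names introduced only for illustration; they are not part of any proposal in this thread.

```rust
/// How a `for` loop ended (hypothetical type, for illustration only).
#[derive(Debug, PartialEq)]
enum Exit {
    /// The loop was left through `break`.
    Broke,
    /// The body never ran (empty iterator).
    NeverRan,
    /// The body ran at least once and the iterator was exhausted.
    RanToEnd,
}

/// Searches `items` for `target` and reports how the loop exited.
fn scan(items: &[i32], target: i32) -> Exit {
    let mut ran = false;
    let mut broke = false;
    for &item in items {
        ran = true;
        if item == target {
            broke = true;
            break;
        }
    }
    if broke {
        Exit::Broke
    } else if ran {
        Exit::RanToEnd
    } else {
        Exit::NeverRan
    }
}

fn main() {
    println!("{:?}", scan(&[1, 4, 3, 2], 3)); // Broke
    println!("{:?}", scan(&[], 3)); // NeverRan
    println!("{:?}", scan(&[1, 4], 3)); // RanToEnd
}
```

The two flags are exactly the bookkeeping that a noloop or nobreak block would make unnecessary.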

Solutions

For the case where the body is never executed, another block is required that can return a value; this block is from here on called noloop. For the case where the body is executed at least once, I see three main solutions. They are listed below, with the short names used in the rest of this issue in parentheses:

Use the same noloop block as is used in the other case. ("noloop")
Use a second new block, from here on called nobreak. ("nobreak")
Use the value from the last statement that is executed, which can be continue. ("last-statement")

Then there are also two possible combinations of the options above:

Make the nobreak block optional and "noloop" when it is not added. ("nobreak-noloop")
Make the nobreak block optional and "last-statement" when it is not added. ("nobreak-last-statement")

My opinion on these solutions

The "noloop" option seems like the worst choice, as it does not allow differentiating between the two cases. The "nobreak" option seems better in this regard, as it allows different behaviour for each case, but it is more verbose when the behaviour should be the same. The "nobreak-noloop" solution allows different behaviour for each case, but stays concise when this is not needed.

The "last-statement" option has a big usability advantage over the previous ones, because it allows returning a value used in the last iteration. However, this comes with the disadvantage that, when this is not needed, all possible endpoints need to return the same value, i.e. the end of the loop body and all continue statements. With the "nobreak-last-statement" solution you can work around this disadvantage by using the nobreak block.

This reasoning is why I think the solutions can be ordered from most desirable to least in the following order:

"nobreak-last-statement"
"last-statement"
"nobreak-noloop"
"nobreak"
"noloop"

Please comment if you have other ideas that could be used, or if you have a different opinion on the proposed solutions.

T-lang

Source

JelteF

Most helpful comment

@JelteF I guess my feeling remains that the need to "produce" values from for loops (or while loops) is really an edge case; almost every time I want it, I realize I could phrase what I want with an iterator. The fact that we're being forced to reach for confusing (else) or unfamiliar and highly unconventional (!break) keywords/syntax in service of an edge goal seems like it's not worth it.

At minimum, I would want to wait until loop { break } is stabilized and in widespread use. That would give us more data on whether we commonly want similar things from a for loop. (Even stabilizing loop/break is currently under dispute precisely because it sees so little use, perhaps partly because it is unstable.)
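As an illustration of the iterator phrasing mentioned above, here is a small sketch of a "find something or fall back" loop written with iterator adaptors. It is not taken from the thread; `first_above` is a hypothetical helper, and the fallback value plays the role that a !break or else block would play.

```rust
/// Returns the first element greater than `limit`,
/// or `fallback` if no such element exists (including for an empty slice).
fn first_above(items: &[i32], limit: i32, fallback: i32) -> i32 {
    items
        .iter()
        .copied()
        .find(|&x| x > limit) // the "break with value" part
        .unwrap_or(fallback) // the "loop ended without break" part
}

fn main() {
    println!("{}", first_above(&[1, 4, 3, 2], 3, 0)); // 4
    println!("{}", first_above(&[], 3, 0)); // 0
}
```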

nikomatsakis on 15 Mar 2017

👀2 🚀2 ❤2 🎉2 👍2

All 30 comments

If the loop finishes without its body being entered, then the break is never reached, which means that the !break block makes complete sense. I would not special-case this.

There are no other cases where a loop can exit without breaking -- except for return or break 'location, which are irrelevant because the value of the break is never used.

I'm a little confused about this, other than the name of the !break block, there isn't much confusion (in my mind) about the syntax or structure. It would simply be something like:

let x = while i < 10 {
    if i == 6 {
        break "found";
    }
    if i == 3 {
        return Err("I was odd <= 3");
    }
    i += 2;
} !break {
    "not found"
};

There is no need for a noloop in that case (because !break covers it -- if it didn't enter the loop it _also_ didn't encounter the break so would go to the !break)

vitiral on 6 Oct 2016

You might be right. The main reason why I thought special casing it would be nice, was so you could use the last iterator value as the return value in case nothing was found. Like the following:

let x = for i in [1, 4, 3, 2] {
    if i == 6 {
        break 6;
    }
    i + 10
} noloop {
    0
}

However, I'm not able to quickly think of a real use case for this. And it would also mean that the last statement needs to be executed each loop execution although it is only used the last time.
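For comparison, the behaviour sketched in the example above can be emulated in current stable Rust roughly as follows. This is only an approximation under my own assumptions: `last_plus_ten_or_zero` is a hypothetical function name, and an `Option` stands in for the implicit "last statement" value.

```rust
/// Emulates the example above:
/// - returns 6 as soon as a 6 is seen (the `break 6` case),
/// - otherwise returns `i + 10` for the last element visited,
/// - or 0 if the slice is empty (the `noloop` case).
fn last_plus_ten_or_zero(items: &[i32]) -> i32 {
    let mut last = None;
    for &i in items {
        if i == 6 {
            return 6;
        }
        // Recomputed on every iteration, although only the final
        // value is ever used: the cost mentioned in the comment above.
        last = Some(i + 10);
    }
    last.unwrap_or(0)
}

fn main() {
    println!("{}", last_plus_ten_or_zero(&[1, 4, 3, 2])); // 12
    println!("{}", last_plus_ten_or_zero(&[1, 6, 3])); // 6
    println!("{}", last_plus_ten_or_zero(&[])); // 0
}
```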

JelteF on 6 Oct 2016

That is the precise reason that !break needs to exist though. Your example requires !break, because that is what handles not encountering any break statements. The noloop would be separate and could not have anything to do with "the last iterator value" -- if the loop never executed then there would not be a "last iterator value"!!

You are proposing adding an additional branch in the case that the loop is never entered (noloop). However, this case is _already covered by !break_ -- since if it didn't enter the loop it _also_ didn't encounter break -- so I don't think it would be a good idea.

The "last loop value" is a non-starter anyway IMO -- it would be extremely confusing to work with and unintuitive.

vitiral on 6 Oct 2016

Ick! I strongly prefer doing this sort of thing with return via closures or nested function in languages with those, way more clear and declarative than some strange keyword soup. If I saw one of our grad students using this sort of language feature, then I'd maybe make them use a closure instead.

I suppose #1624 seems okay because loop already denotes an unfamiliar loop, so everyone expects strange flow control, but even there I'd hope tail call optimizations eventually make people lose interest in this loop return value business.

That said, if one wants this sort of feature, there are two reasonable options that avoid needing any keywords :

If you like expressions everywhere, then just make for and while return an enum that describes what happened. I'd suggest just an Option<T> where T is the type of the final expression in the body, and the type supplied by breaks and continues, so None always means the loop never ran, and or_else(), etc. all work. You might consider something more complex, but.. why? You need a value from continue anyways. It'd interact with say #2974 but so does if now. Appears Option<T> gets messy here, but an enum the compiler handled special could avoid breaking existing code.

If you're less dedicated to expressions, and happy to ask for more breaks, then #1624 could be tweaked to allow :

let x = 'a { 
    for i in ... {
        ...
        if foo { break 'a bar; }
        ...
    }
    baz
};

In both these cases, there is much less ambiguity for someone reading the code who comes from another language that may do other things.
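The second option is close to something that can be written in recent stable Rust: labeled block expressions with `break 'label value` (stabilized, to my knowledge, in Rust 1.65, with the spelling `'a: { ... }` rather than `'a { ... }`). A minimal sketch, where `find_label` and the label name `'search` are hypothetical:

```rust
/// Returns "found" if `wanted` occurs in `values`, otherwise "not found".
fn find_label(values: &[i32], wanted: i32) -> &'static str {
    // The labeled block is the expression; the `for` loop inside it
    // still evaluates to `()`.
    'search: {
        for &v in values {
            if v == wanted {
                // Leaves the whole block with a value.
                break 'search "found";
            }
        }
        // Reached when the loop ended without the labeled break,
        // including when it never ran.
        "not found"
    }
}

fn main() {
    println!("{}", find_label(&[1, 4, 3, 2], 3));
    println!("{}", find_label(&[], 3));
}
```

The tail expression of the block covers both the "never entered" and the "ran to the end" cases, which matches the !break semantics discussed earlier in the thread.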

burdges on 7 Oct 2016

👍3

@burdges what do you think of this:

let x = if y > 10 {
    "small"
} else {
    "big"
};

This is "expressions return values" and is a core principle of Rust control flow. These RFCs aim to make it even more clear.

The enum doesn't work, as has been discussed in #961

vitiral on 7 Oct 2016

I have to disagree pretty heavily with anyone proposing last statement.

First of all: it's not backwards compatible.

Second: it can't borrow anything that the conditional also borrows. For example, we should be able to do this:

let k = for i in &mut v {
    if i == x { break i }
} else {
    x
}

This isn't possible to duplicate with the last statement, because the value in the last statement needs to be held in a temporary variable while the condition is evaluated (i borrows the iterator, preventing next() from being called on it until it's gone). So while the version I have would work, it wouldn't work for this:

let k = for i in &mut v {
    if i == x { break }
    i
} noloop {
    x
}

The big problem is that it desugars to this:

let mut vi = v.iter_mut();
let mut k;
let mut ran = false;
while let Some(i) = vi.next() {
    if i == x { break }
    k = i;
    ran = true;
}
if !ran { k = x }

Except the compiler knows k always gets initialized.

notriddle on 7 Oct 2016

👍2 🎉1

It could easily be made backwards compatible. It would only be enabled when
break returns a value or when the noloop block would be added.

Your second statement is a very good reason not to do that though.

JelteF on 7 Oct 2016

One thing to note (wrt keywords) is that since the last RFC we’ve added our first contextual keyword. Although I wouldn’t recommend adding another, it is still an option to consider.

As far as behaviour goes, @notriddle’s point is fair and has no solution (you oughtn’t work around the borrow checker here). Remembering the result of the last expression in every iteration is probably something that will _not_ happen for this reason. This makes result-ful for and some while loops significantly less useful as well.

Probably the only viable thing to consider is running that extra block if the loop wasn’t exited through break EXPR.

nagisa on 7 Oct 2016

👍2 🎉1

Yes, the only viable syntax (with all possible complications) is:

let x = while i < 10 {
    if i == 6 {
        break "found";  // x == "found"
    }
    if i == 3 {
        return Err("i was odd <= 3"); // x will never be set
    }
    if i == -1 {
        panic!("negative numbers!"); // x will never be set
    }
    i += 2;
} !break {
    // only run if `break` never returned a value in above
    // loop (including if the loop was never entered)
    "not found" // x == "not found"
};

The only real discussion is whether there must be a ; after breaks with values -- which I think there should be (just like there is for return).

vitiral on 7 Oct 2016

return .. and panic!(..) don't need a semicolon in your example though.

Ericson2314 on 8 Oct 2016

https://github.com/rust-lang/rfcs/pull/1624#discussion_r82423426 discussed this, and you are right -- ; is not necessary for return, therefore it shouldn't be necessary for break.

I had been of the opinion that it was necessary, but I now think it is just conventional.

vitiral on 8 Oct 2016

I believe with the current compiler architecture it would be possible to discriminate between loops whose values are used (and thus require type unification) and those whose values are not used (e.g. due to use of ;). It seems to me that this would provide the better ergonomics.

taralx on 8 Oct 2016

Nominating this for @rust-lang/lang discussion, since it seems to have gotten a bit lost.

Also: would this conflict with potential ways to use for/while as iterators, rather than just returning a single value?

joshtriplett on 23 Feb 2017

Personally I'm just inclined to leave this as it is. I feel it's "enough" for loop to allow a value to be returned.

nikomatsakis on 24 Feb 2017

👍5 ❤1

I'd prefer for/while/etc. {} else {}, with else only required if there is a break in the loop and the return value is used.
In the case of loop it may also be useful to add variables that are only in scope of the loop; otherwise most return values seem useless, since you have to declare them before the loop anyway.

loop results = Vec::new(), other_var = 1 {
    //fill vector
    if cond { break results }
}

porky11 on 24 Feb 2017

I also feel like we shouldn't do anything here - it seems like there is no nice solution and the use case is not strong enough for something complex.

nrc on 10 Mar 2017

I think this discussion was basically over, because everyone was agreeing on the way that @vitiral proposed. @nrc I'm not sure why you deem that something complex.

JelteF on 10 Mar 2017

I am strongly disinclined to include else for loops or anything beyond if. I think the meaning of this is very unclear. As evidence, I submit this survey from PyCon (Python includes this feature):

[Image: survey results]

Note that fully 75% of respondents either did not know or guessed the wrong semantics (the correct answer being "the else only executes if there was no break in the while loop").

nikomatsakis on 15 Mar 2017

👍6

@nikomatsakis, that's why !break was suggested instead in the other thread, which is very clear in my opinion.

JelteF on 15 Mar 2017

😕1

nikomatsakis on 15 Mar 2017

👀2 🚀2 ❤2 🎉2 👍2

In my opinion, if for and while loops cannot return values, then none should. Complexity always has a cost, but non-uniform/special-case complexity is the worst.

vitiral on 15 Mar 2017

But loop is inherently different from for and while because

loop's body is guaranteed to be entered.
loop can only be terminated with a break, so it's obvious where its value comes from.

The language even treats loop {} foo(); and while true {} foo(); differently, where the former gets an unreachable statement warning and the latter does not.

So there's nothing particularly arbitrary about stabilizing loop { break value; } alone. The non-uniform complexity in this situation is due to the pre-existing differences between these loops. This choice falls out naturally as a result.
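For readers unfamiliar with the feature being contrasted here, this is a minimal sketch of `loop { break value }` as it exists in stable Rust today. The function name `next_power_of_two_at_least` is hypothetical and chosen only for this example.

```rust
/// Returns the smallest power of two that is greater than or equal to `n`.
/// The result is a `u64` so that doubling cannot overflow for any `u32` input.
fn next_power_of_two_at_least(n: u32) -> u64 {
    let target = u64::from(n);
    let mut p: u64 = 1;
    // `loop` can only be left through `break`, so every exit
    // supplies the value of the whole expression.
    loop {
        if p >= target {
            break p;
        }
        p *= 2;
    }
}

fn main() {
    println!("{}", next_power_of_two_at_least(5)); // 8
    println!("{}", next_power_of_two_at_least(8)); // 8
}
```

Because there is no "condition became false" exit, no extra block is needed to supply a fallback value.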

solson on 15 Mar 2017

👍4

I don't agree at all that it's not uniform. for and while are sugar that expands to a loop containing breaks that evaluate to (). This issue basically proposes a syntax for injecting an expression into those sugared-over break statements to allow non-unit breaks to unify with them.
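The desugaring referred to here can be sketched by hand in stable Rust. This is a rough approximation for illustration: the compiler's real expansion differs in details, and the function names below are hypothetical.

```rust
/// An ordinary `while` loop.
fn sum_while(limit: u32) -> u32 {
    let (mut i, mut total) = (0, 0);
    while i < limit {
        total += i;
        i += 1;
    }
    total
}

/// Roughly what the `while` above expands to: the hidden `break`
/// in the `else` branch is the one that evaluates to `()`.
fn sum_loop(limit: u32) -> u32 {
    let (mut i, mut total) = (0, 0);
    loop {
        if i < limit {
            total += i;
            i += 1;
        } else {
            break;
        }
    }
    total
}

/// Roughly what `for &x in items { total += x; }` expands to:
/// the hidden `break` sits in the `None` arm.
fn sum_for_desugared(items: &[u32]) -> u32 {
    let mut total = 0;
    let mut iter = items.iter();
    loop {
        match iter.next() {
            Some(&x) => total += x,
            None => break,
        }
    }
    total
}

fn main() {
    println!("{} {}", sum_while(5), sum_loop(5));
    println!("{}", sum_for_desugared(&[1, 2, 3]));
}
```

The proposals in this thread amount to letting the user say what value those hidden `break`s should carry.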

withoutboats on 15 Mar 2017

@withoutboats That's the compiler writer's (or language designer's) POV, but it seems clear from previous discussion that people disagree about what the syntax should be and what it should do for while/for. I'm worried that not enough people have the same understanding of the desugaring.

solson on 15 Mar 2017

There were multiple other possibilities for syntax discussed in the other thread fwiw, many of which are less "highly unconventional" than !break.

glaebhoerl on 15 Mar 2017

Since loop has to break eventually (or loop forever), it makes sense to enable it to evaluate to a value when breaking. But for and while don't have to break.
If we had "break with value" for for and while, it wouldn't be clear what the value should be if they don't break. One possible solution is with else:

let r = while cond() {
    if cond2() { break 42 }
    foo();
} else {
    52
};

Edit: Thinking about it more, I think it would make sense to add "loop break value" for for and while (for consistency) but only if no new keywords are introduced for this small feature. So I think we should use else like above. I know it looks weird at first when seeing an else without an if, but for newcomers it's not more confusing than if let syntax, or nested ifs like this:

if if cond1() {
    foo();
    cond11()
} else {
    cond2()
} {
    bar();
} else {
    baz();
}

Or:

if {
    foo();
    cond()
} {
    bar();
} else {
    baz();
}

Or:

if match x {
    42 => true,
    _ => false,
} {
    bar();
} else {
    baz();
}

Or:

if let Some(x) = if cond() {
    foo();
    a
} else {
    b
} {
    bar();
} else {
    baz();
}

Or:

while {
    let r = foo();
    bar(&r);
    cond(&r)
} {
    baz();
}

These occur often enough in real-world code that tries to minimize mutable state.

Boscop on 15 Mar 2017

😕1

We decided at the lang team meeting to close this issue. We're not inclined to allow for or while loops to evaluate to anything but () for now. We were all very much in agreement with Niko's earlier comment that evaluating these loops is an edge case and all of the proposed solutions have too great a downside.

withoutboats on 8 Apr 2017

Sad to hear that, but if that's the case I think #961 should be closed as well.

JelteF on 8 Apr 2017

I would really love to have this feature, so I try to make my proposal:

be useful in everyday situations
have a clear and intuitive formal semantics and typing rule
avoid the semantical confusion of else

Long story short, the while loop is followed by a then clause:

while BOOL {BLOCK1} then {BLOCK2}

This should be desugared to, and therefore have the same type and semantics as:

loop {if (BOOL) {BLOCK1} else {break {BLOCK2}}}

just as the usual while loop

while BOOL {BLOCK1} // then {}

has always been desugared to

loop {if (BOOL) {BLOCK1} else {break {}}}

It requires a bit more care for for but the story remains basically the same.

Note that the break in the then clause is harmless but redundant, since it will be desugared to break (break ...).

The choice of then over else or final is explained in #961

I would suggest then instead of final, since in all currently popular languages where it exists, final(ly) means being executed no matter what, which is the exact opposite of being executed only when the loop did not break. then would avoid the kind of naming tragedy that return became in the Haskell community.

then also avoids the semantical confusion brought by else, since it naturally has a sequential meaning (I eat, then I walk) in parallel with its role in the conditional combination (if/then). In places where it joins two blocks ({ ... } then { ... }) instead of a boolean and a block (x<y then { ... }), the sequential semantics prevails intuitively.

This syntax can be used wherever the loop is meant to find something instead of to do something. Without this feature, we usually do the finding and then put the result somewhere, which is a clumsy emulation of simply finding something.

For example:

while l<=r {
  let m = (l+r)/2;
  if a[m] < v {
    l = m+1
  } else if a[m] > v {
    r = m-1
  } else {
    break Some(m)
  }
} then {
  println!("Not found");
  None
}

which means:

loop {
  if (l<=r) {
    let m = (l+r)/2;
    if a[m] < v {
      l = m+1
    } else if a[m] > v {
      r = m-1
    } else {
      break Some(m)
    }
  } else {
    break {
      println!("Not found");
      None
    }
  }
}

Even this desugared version is cleaner than something like

{
  let mut result = None;
  while l<=r {
    let m = (l+r)/2;
    if a[m]<v {
      l = m+1
    } else if a[m]>v {
      r = m-1
    } else {
      result = Some(m);
      break
    }
  }
  if result==None {
    println!("Not found");
  }
  result
}
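The desugared form above already works in stable Rust, since it only relies on `loop` with `break value`. Here is a self-contained sketch of it as a function. The name `binary_search` and the use of signed `isize` bounds are my own assumptions; the signed bounds avoid an underflow in `m - 1` when `m` is 0, which the snippets above leave open because they do not show the types of `l` and `r`.

```rust
/// Binary search over a sorted slice, written in the desugared
/// `loop { if cond { body } else { break fallback } }` shape.
fn binary_search(a: &[i32], v: i32) -> Option<usize> {
    let mut l: isize = 0;
    let mut r: isize = a.len() as isize - 1;
    loop {
        if l <= r {
            let m = l + (r - l) / 2;
            let x = a[m as usize];
            if x < v {
                l = m + 1;
            } else if x > v {
                r = m - 1;
            } else {
                break Some(m as usize);
            }
        } else {
            // This branch plays the role of the proposed `then` block.
            break None;
        }
    }
}

fn main() {
    let data = [1, 3, 5, 7, 9];
    println!("{:?}", binary_search(&data, 7)); // Some(3)
    println!("{:?}", binary_search(&data, 4)); // None
}
```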

exprosic on 8 Feb 2020

👍6

What if the clause is next to the conditional?

let var = while BOOL else EXPR { BLOCK }

You're normally reading a while loop as:

Evaluate the conditional.
If true, execute the block.
Else...

This way, else is in proximity to the conditional and the assignment, which I think helps a reader make that association.

And break breaks out of a while loop, so it never hits the conditional. So if you have the intuition that the conditional either goes into the block or stops and returns the else clause, then it makes more sense that the break must bypass else entirely.