rfcs 🚀 - Allow loops to return values other than ()

I'm just starting with Rust, but why would loops not just return their last statement like everything else seems to do: functions, if, match?

break statements could be treated like early returns and accept an expression parameter. The same could be done for continue statements, its argument would only be used when the loop exits afterwards.

The else clause is not even needed when implemented like this.

JelteF on 24 Jan 2016

@JelteF the problem is that a loop without a return or break statement will never return, for that reason you cannot end your loop on an expression.

ticki on 24 Jan 2016

@Ticki for and while loops do return without a break, and could evaluate to the value of the final expression in their block on the last iteration.

withoutboats on 25 Jan 2016

@withoutboats Yeah, that's right. for and while could definitely return a value! But loop cannot, without being able to specify a return value to the break statement.

ticki on 25 Jan 2016

Do you want loops to return the value of their last expression, implicitly, without any break?
What if I want to break out of the while and return a value? Are you suggesting I use code like

let mut done = false;
while !done {
    let value = get_value();
    done = value.is_valid();
    value
}

I think this is ugly.
We need an else block because the loop may end before it starts, when the exit condition already holds from the start.
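For comparison, here is a minimal sketch of the same search written with a value-carrying break inside a plain loop. It assumes a toolchain that accepts break with a value in loop (the feature under discussion in this thread). The name first_valid and the closure parameters get_value and is_valid are hypothetical stand-ins for the functions in the snippet above.

```rust
/// Hypothetical helper: keeps calling `get_value` until `is_valid` accepts
/// the result, then yields that result as the value of the `loop` expression.
fn first_valid(mut get_value: impl FnMut() -> i32, is_valid: impl Fn(i32) -> bool) -> i32 {
    loop {
        let value = get_value();
        if is_valid(value) {
            // No `done` flag is needed: the break itself carries the result out.
            break value;
        }
    }
}
```

Since a plain loop can only be left through a break, there is no "body never ran" case to cover here, which is why no else block comes up for this form.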

KalitaAlexey on 25 Jan 2016

Update: never mind the note below (which is preserved just so the responses that follow continue to make sense).

The reason that the note is irrelevant is that we currently have rules in the type checker ensuring that if a loop-body ends with an expression, that expression must have type ().
(This, I think, will ensure that no existing code will introduce temporaries that need to outlive the loop body itself, unless I am wrong about how the temporary r-value extents are assigned in such a case...)

@JelteF another reason to not just use the last expression in the loop-form's body as its return value is that it would be a breaking change: loop-form bodies today are allowed to end with an expression that is evaluated and then discarded.

Returning the last expression implicitly would change the dynamic extent of the returned value, which in turn would change the lifetime associated with it and its r-value temporaries. And that change would probably inject borrowck errors into existing stable code.

(The alternative of solely using break and else for the return value would be, I _think_, entirely backwards compatible...)

pnkfelix on 25 Jan 2016

@Ticki You might have misunderstood. I only said we did not need the else block but the break and continue statements are obviously needed for early loop exits.

@KalitaAlexey I did suggest code like that, but the break could still be used. It is a very good point that you are making though. I had not thought about the case that the loop would never be evaluated. It seems you are right that the else block is needed in cases where the loop body is never executed, so there is no way to return from it.

@pnkfelix I'm not sure what the breaking change is, since the check for the type of the return value could simply be skipped in cases where it is not saved in a value.

JelteF on 25 Jan 2016

add an optional else clause that is evaluated if the loop ended without using break;

Please, no. Python has this, and every time I encountered it I had to jump into REPL and see when this else thing would get evaluated: on or in absence of break.

I’m not that opposed to being able to “return” something from the loop with a break, but the necessity of adding an else-ish thing makes this a no-brainer minus one million to me.

nagisa on 25 Jan 2016

👎9 👍7

I think that an else block is strictly more expressive than returning the final expression, and the only thing that makes sense in the presence of value-returning breaks. If the value of the final expression is going to be returned, why should the evaluation of the last run of the loop be any different from any other run? And are the final expressions (given that there are no side effects) evaluated for nothing on all the other runs? Optimizing them away then becomes a burden on the compiler.

If the value-returning break isn't hit, then there needs to be an alternative path that returns a value of the same type. It doesn't have to be named "else", but I think that's a sensible name.

golddranks on 25 Jan 2016

@nagisa

Please, no. Python has this, and every time I encountered it I had to jump into REPL and see when this else thing would get evaluated: on or in absence of break.

This bites me every time too, mainly because, while useful, it's a rarely used feature. Maybe a better/more accurate name would help? (Nothing immediately springs to mind though.)

Or perhaps something slightly different, like having a default expression instead? e.g.:

for x in iterator {
  if foo(x) {
    break "yes!";
  }
} default {
  "awww :("
}

Where the default expression is evaluated either if iterator is empty or foo(x) is false for all x in iterator.

erikjohnston on 25 Jan 2016

👍4

@erikjohnston @nagisa

I agree that the else is always confusing when seeing it. I do think it will be less of a problem when break returns a value, which it doesn't do in Python. But the case still exists when else would be used like in python when the value is not saved in anything and the break might be empty.

I think another name would indeed be good. Something that comes to my mind would simply be nobreak. It's short and describes quite clearly what it is for.

PS. I retract my initial proposal about using the last statement instead of the else block, because of the good arguments against it.

JelteF on 25 Jan 2016

FWIW I think the best way to move forward on this, incrementally, would be to start by only allowing break EXPR inside of loops, and to not touch any of the other looping constructs for now. That sidesteps all the other tricky design questions we've been spinning in circles around.

glaebhoerl on 25 Jan 2016

@glaebhoerl
I doubt that's a good way to go about it. It will only encourage people to "hack" a for or a while loop inside a loop loop. I've not heard any argument against using the else statement except for its name.

JelteF on 25 Jan 2016

This can kind-of be already done as

{
    let _RET;
    for x in iter {
        if pred(x) {
            _RET = x;
            break;
        }
    }
    _RET
}
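A compilable sketch of this workaround, under the assumption that the result has to be initialized before the loop: the compiler cannot prove that the loop body runs at all, so a bare let _RET; would presumably be rejected as possibly uninitialized when it is read afterwards. The helper name first_match is hypothetical, and wrapping the result in Option is my choice for the "nothing matched" case, not part of the snippet above.

```rust
/// Hypothetical helper: returns the first item for which `pred` holds,
/// or `None` when the iterator is empty or nothing matches.
fn first_match<I, P>(iter: I, mut pred: P) -> Option<I::Item>
where
    I: IntoIterator,
    P: FnMut(&I::Item) -> bool,
{
    // Initialized up front so that it is definitely assigned
    // even if the loop body never executes.
    let mut ret = None;
    for x in iter {
        if pred(&x) {
            ret = Some(x);
            break;
        }
    }
    ret
}
```

For example, first_match(1..4, |x| *x == 2) gives Some(2), while a predicate that never holds gives None.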

arielb1 on 25 Jan 2016

Yet again, @glaebhoerl says exactly what I was going to say :). Somebody want to make an RFC for this, I'd be willing to help?

@JelteF heh, that's kinda the point! Once people see how nice this is, there will be more motivation to actually reach a consensus on break-with-value for other types of loops (and maybe even normal blocks!).

Ericson2314 on 25 Jan 2016

@arielb1 Of course it can be done, but the point is that this:

let a = for x in 1..4 {
    if x == 2 {
        break x
    }
} nobreak {
    0
}

looks much cleaner than this:

let a = {
    let mut _ret = 0;
    for x in 1..4 {
        if x == 2 {
            _ret = x;
            break;
        }
    }
    _ret
}

@Ericson2314 It seems that if the only consensus that needs to be reached is the naming, it could be solved rather quickly. It would be weird to hurry an incomplete proposal, if all that needs to be done is pick a name for a statement.

JelteF on 25 Jan 2016

👍3

@JelteF Well I'll grant you that originally there were more ideas, but because https://github.com/rust-lang/rfcs/pull/955 did not happen else { } is the last one that makes sense. On the other hand, there are a few more small details than just the keyword. E.g. should this work?

<some loop> {
    ...
    break; // as opposed to break (); 
    ...
} else {
    my_fun() // returns ()
};

@nagisa Anyone playing around will notice that the type checker will require break .. and else { .. } to have the same type. IMO that will help make clear the behavior, no manuals needed.

Ericson2314 on 25 Jan 2016

@Ericson2314 I don't see a reason why that should not work. In Python it is not an expression and it still has a use, namely handling the edge case when the loop does not break. A simple example can be found here: https://shahriar.svbtle.com/pythons-else-clause-in-loops#but-why
Copy pasted:

for x in data:
    if meets_condition(x):
        break
else:
    # raise error or do additional processing

vs

condition_is_met = False
for x in data:
    if meets_condition(x):
        condition_is_met = True

if not condition_is_met:
    # raise error or do additional processing

As for your comment @nagisa. In this case it might not be directly clear what the else does, which is why I think another name would still be clearer.

JelteF on 25 Jan 2016

I passionately hate the idea itself of having an else keyword associated to a loop in any way, @Ericson2314. It simply makes no sense and I highly doubt one can prove otherwise. Thinking about it, it might make some sense if the else block was executed when 0 iterations of the loop are executed, actually, but that’s overall a useless construct.

I don’t want to see any of that weirdness in Rust just because Python has it. One might argue for a new keyword, but that ain’t happening either, because of backwards compatibility.

EDIT: All looping constructs have trivial desugarings into a plain old loop, @glaebhoerl, so there’s no necessity to do any of the “only allow x in y” dance, I think.

nagisa on 25 Jan 2016

👎3

@nagisa Sure, but the desugarings of while and for into loops _contain_ breaks, ones which don't return a value (said differently: return ()) -- so if you want to break with a value elsewhere in the loop you have a type mismatch. This is precisely what the else clause would be for: in effect it's doing nothing else but providing the value argument to the implicit break embedded in the while/for constructs.
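To make the desugaring argument concrete, here is a rough sketch of what a for loop with a value-carrying break and an else clause could expand to. It assumes break with a value is accepted inside loop; the function name find_or and the parameter fallback (playing the role of the else block) are hypothetical, and this is not the compiler's actual expansion.

```rust
/// Hypothetical expansion of
///     for x in iter { if x == needle { break x } } else { fallback }
/// written as a plain `loop` over the iterator.
fn find_or<I>(iter: I, needle: I::Item, fallback: I::Item) -> I::Item
where
    I: IntoIterator,
    I::Item: PartialEq,
{
    let mut it = iter.into_iter();
    loop {
        match it.next() {
            Some(x) => {
                if x == needle {
                    // The explicit break written by the user.
                    break x;
                }
            }
            // The implicit break hidden inside `for`: the iterator is exhausted.
            // The else clause supplies the value for exactly this break.
            None => break fallback,
        }
    }
}
```

Both breaks now carry a value of the same type, so the loop expression type-checks without any mismatch against ().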

glaebhoerl on 25 Jan 2016

👍1

@nagisa
if new keywords are a problem, maybe something like !break could be used. Which I guess is currently invalid syntax.

JelteF on 25 Jan 2016

@nagisa Personally, it reminds me of the base case of a fold, and thus actually feels quite elegant.

Ericson2314 on 25 Jan 2016

👍2

@nagisa If all looping constructs desugar into loop, we can use value-returning break without problems with loop, but with for, the types don't unify because there is more than one way to return from the loop: either break, or just looping until the end, which currently produces (). That's why we need some kind of "default" return value in the case a break isn't hit. Is it just the keyword else you are detesting, or the concept of having a default return value by itself?

I just came to think of another possibility for for loop: the for loop could return an Option<T>. This way, we could write

let result = for i in haystack_iter {
    if i == needle {
        break "Found!";
    }
}.unwrap_or("Not found :(");

This is nice in the sense that it doesn't need any new keywords or reusing old keywords in surprising way.
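The Option-returning behaviour can already be approximated by moving the loop into a function and using return where the proposal would use break with a value. A minimal sketch, with the function name search and the slice-of-i32 haystack as assumptions for illustration:

```rust
/// Hypothetical stand-in for a `for` loop that evaluates to `Option<T>`.
fn search(haystack: &[i32], needle: i32) -> Option<&'static str> {
    for &i in haystack {
        if i == needle {
            // Plays the role of `break "Found!"` in the proposal.
            return Some("Found!");
        }
    }
    // The loop ran to completion (or never ran) without "breaking".
    None
}
```

A caller would then write search(&[1, 2, 3], 2).unwrap_or("Not found :("), which matches the shape of the proposed expression.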

golddranks on 26 Jan 2016

@golddranks But that's confusing. It is like functional programming but ugly.

KalitaAlexey on 26 Jan 2016

👎2

Another note: sometimes I've written a loop that is expected to set some outer state inside the loop. But because setting the state may be expensive (in that particular case it was), you might want to avoid setting a "default" state before running the loop.

But this results in the fact that the control flow analysis can't be sure if the state is set in the end, since it's possible that the loop runs 0 times. I have to make a boolean flag to check, and even then, if the analysis isn't super smart, it won't be sure. Having a default/nobreak (whatever the name is going to be) code block would help the flow analysis in these kinds of situations. EDIT: of course for that to be of any help, there should be a piece of information available whether the loop terminated without running even once, or if it terminated because it iterated until the end.
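One way to express this today, sketched under the assumption that wrapping the state in an Option is acceptable: the Option acts as the flag, and the expensive fallback is only built when the loop did not set anything. The function name pick_state and the "even number" condition are made up for illustration.

```rust
/// Hypothetical example: the loop may or may not set `state`.
fn pick_state(candidates: &[u32]) -> String {
    // No expensive default is constructed up front; `None` is the "not set" flag.
    let mut state: Option<String> = None;
    for &c in candidates {
        if c % 2 == 0 {
            state = Some(format!("even:{}", c));
            break;
        }
    }
    // The fallback is only computed if the loop ran zero times
    // or finished without breaking.
    state.unwrap_or_else(|| String::from("fallback"))
}
```

Note that this does not distinguish "ran zero times" from "ran to the end", which is the extra piece of information the EDIT above asks for.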

golddranks on 26 Jan 2016

All in all: there are actually two aspects to the control flow of the for loop; the second one isn't directly related to the return value, but it might be relevant to consider alongside the default block or blocks:

Did the loop break with a value or not? If not, there needs to be _some_ return value for the types to unify.
Did the code inside the loop run at all? In the cases where the loop has side effects (like initializing a variable declared in an outer scope), it might be valuable information for the control flow analysis to understand, and thus, it might be beneficial to have a language-level construct to help with this case.

golddranks on 26 Jan 2016

I see a lot of people complaining about for-else because it is allegedly unintuitive. However it _does_ exist in Python, and has a strong parallel with the desugared loop. Adding a keyword seems worse, especially a special keyword for this instance. In my opinion, people will get used to it, and compiler diagnostics will go a long way towards helping people.

For what it's worth, break taking a value will make this substantially more intuitive -- return the break value or _else_ use the else block.

taralx on 26 Jan 2016

👍2

@golddranks because https://github.com/rust-lang/rfcs/pull/955 didn't happen, we cannot do your option idea.

Ericson2314 on 26 Jan 2016

Exhibit A in why the while..else syntax is almost certainly dead in the water. For better or worse.

(And while it's easy to say "oh just come up with a better one then", the fact that across several discussions nobody has yet managed to do so suggests that maybe doing so is not so easy.)

glaebhoerl on 16 Feb 2016

👍1

@taralx wrote:

For what it's worth, break taking a value will make this substantially more intuitive

Except that break already has an optional argument, a loop label. So it would require a bit more thought on syntax.

peterjoel on 16 Feb 2016

@glaebhoerl are you pointing out a technical problem or mere unpopularity? Break-with-value rules out that and (hopefully all other conflicting) desugarings.

Ericson2314 on 16 Feb 2016

Aside from the possible syntax conflict with labels, having loops always return an Option means there doesn't need to be any confusing, new syntax for when there is no break:

let a = for x in 1..4 {
   if x == 2 {
      break 2;
   }
}.unwrap_or( 0 );

I don't think it's much harder to understand than @JelteF's version.

peterjoel on 16 Feb 2016

👍1

@Ericson2314 "Mere" unpopularity.

@peterjoel This is not possible to do backwards compatibly as @Ericson2314 noted just above.

glaebhoerl on 16 Feb 2016

To address the issue that break already has an optional loop-identifier argument, try this on for size (adapting @peterjoel's example).

let a = for x in 1..4 {
    if x == 2 {
        break with "found";
    }
}.unwrap_or("not found");

The equivalent with a loop identifier would be break 'inner with "found". Omitting the with section means returning unit, so the behaviour is the same as before.

If it's decided that we must go with a block-with-leading-keyword rather than returning Option<T> (which I like, but am not _totally_ sold on) I'd like to think we can find something clearer in intent than else—I know from experience that it's not at all clear what that does if you don't already know about it.

ketsuban on 16 Feb 2016

@Ketsuban Other possibilities:

 break 'label "found";
 break "found";
 break 'label;

 break 'label: "found";
 break: "found";
 break 'label;

 break 'label => "found";
 break => "found";
 break 'label;

peterjoel on 16 Feb 2016

I do not see the grammar ambiguity caused by having an optional break value along with an optional label in break 'label value, as long as the label comes first, but I’m strongly opposed to any such feature regardless.

nagisa on 16 Feb 2016

Perhaps we should be working on a way to extend the macro system so people can implement this in macros and see how it fits. Right now, you can't implement the break part of this proposal with macros.

taralx on 16 Feb 2016

@glaebhoerl haha ok.

Ericson2314 on 17 Feb 2016

i don’t get the problems:

let a = for x in 1..4 {
    if x == 2 {
        break "found";
    }
} else { "not found" };

let’s mirror the way if works:

let a = for x in 1..4 {} else { "not found" };
// error: if and else have incompatible types

more if:

let a = for x in 1..4 {
    if x == 2 {
        break "found";
    }
};
// error: for may be missing an else clause

lifetimes are a single token, so anything would work:

break 'a;

break "found";

break 'a "found";

break "found" 'a;

break 'a => "found";

...

flying-sheep on 20 Feb 2016

@flying-sheep wrote:

i don’t get the problems:
let’s mirror the way if works.

There is more than one way to sensibly "mirror the way if works". For example, my first intuition was completely different; I interpreted the else statement to be executed only when the while block was never entered (mirroring an if).

lifetimes are a single token, so anything would work:

The 'foo syntax here refers to a loop label, not a lifetime.

peterjoel on 20 Feb 2016

I interpreted the else statement to be executed only when the while block was never entered (mirroring an if).

well, obviously. what else?

if you don’t break, what will the loop evaluate to?

flying-sheep on 20 Feb 2016

if you don’t break, what will the loop evaluate to?

I entered this from a discussion about adding an else clause to loops, which was not considering making loops into expressions.

peterjoel on 20 Feb 2016

Break with value

let value = for value in values {
    if is_valid(&value) {
        break value;
    }
} else {
    // Some default value of type value
};

or

let value = for value in values {
    if is_valid(&value) {
        break Some(value);
    }
} else {
    None
};

Break without value

for value in values {
    if is_valid(&value) {
        break;
    }
} else {
    // return;
    // panic!();
    // do something when loop ended without break
};

Same with while. I don't see any problems.

KalitaAlexey on 20 Feb 2016

👍1

I was in favour of this initially, but I hadn't really thought it through.

I don't see any problems.

One problem is if you assume that it works the way I described, but it actually works the way you described, then you'll get a surprise when you do something like removing a break from a loop. The "else" code you thought would only execute when the loop block is _never_ run, is now executing even though the loop ran all the way through.

The proposal to turn all loops into expressions, is better abstracted as a more specific operation, which we could call _find_ or _search_. You are iterating until a predicate is matched, then breaking out when you find it. The else here corresponds to a default value if nothing is found. You _could_ add language level syntax for that, but this case is already covered by Rust's rich Iterator API: find, map, fold, filter_map and friends. Your example above would be:

values.iter().find( |value| is_valid(&value) );

With a default value instead of an Option:

values.iter().find( |value| is_valid(&value) ).unwrap_or( default_value );
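A runnable sketch of the iterator approach, assuming a slice of i32 values and a hypothetical is_valid predicate (here "greater than 10", purely for illustration). The sketch uses unwrap_or to fall back to a plain default value.

```rust
/// Hypothetical predicate standing in for `is_valid` from the discussion.
fn is_valid(value: &i32) -> bool {
    *value > 10
}

/// First valid element as an `Option`, borrowing from the slice.
fn first_valid(values: &[i32]) -> Option<&i32> {
    values.iter().find(|value| is_valid(value))
}

/// First valid element by value, or `default_value` when nothing matches.
fn first_valid_or(values: &[i32], default_value: i32) -> i32 {
    values
        .iter()
        .copied()
        .find(|value| is_valid(value))
        .unwrap_or(default_value)
}
```

The Option variant corresponds to the "did I find something" question, and the unwrap_or variant to the else/default block discussed earlier.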

peterjoel on 20 Feb 2016

👍3

@peterjoel Yeah, but for looks clearer. I like clear code. And about surprise. There is no surprise after some time.

KalitaAlexey on 20 Feb 2016

👎1 👍1

i didn’t think i’d ever say this but: python’s version is the surprising one.

this one makes much more sense, and, if taught with the loop used as expression, it will be intuitive why it works that way.

flying-sheep on 20 Feb 2016

@KalitaAlexey I would argue that the iterator code is clearer. It uses fine-grained abstractions to declare exactly what you want to do. Loops are procedural and unnecessarily accentuate the low level details instead.

peterjoel on 20 Feb 2016

Well, in some cases too many iterator methods can be very obscure. However, in the vast majority of cases, iterators help creating better and cleaner code, by giving more fine-grained control to the programmer.

ticki on 20 Feb 2016

Would really like this. Am I right in thinking that the only problem is the break syntax, since for and while can return an Option<_> as pointed out above?

In that case, why not use break (val)? This doesn't seem to be legal syntax at the moment. I can write a formal RFC if necessary, but it seems pretty clear:

let x: i32 = loop {
    break (5);
};
let y: Option<i32> = while true {
    break (6);
};
let z: i32 = for v in some_list {
    break (v);
}.unwrap_or(0);

dhardy on 19 May 2016

👎1 👍1

😕 I dislike that one. It seems quite asymmetric with the rest of the syntax. I'd prefer block return values, like:

let a = 'a: while true {
    return 'a 4
}
assert_eq!(a, 4);

or something along these lines.

ticki on 19 May 2016

Then you have the odd case that a value may or may not follow the label. I think anyway that should be break 'a 4; in your example. Also, it would be tempting to shorten to break 4;. Hang on, labels are syntactically clear from the apostrophe, so why wouldn't just break 4; be possible?

This would make break; synonymous with break (); and break 'a; with break 'a ();
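A sketch of how the label-then-value form reads in practice, assuming break with a value is accepted for a labeled loop (it targets the outer loop here, not the inner for loops). The function name first_pair_with_sum and the label 'outer are hypothetical.

```rust
/// Hypothetical example: finds the first pair (a, b) with a + b == target,
/// scanning a and b over 0..limit.
fn first_pair_with_sum(limit: u32, target: u32) -> Option<(u32, u32)> {
    'outer: loop {
        for a in 0..limit {
            for b in 0..limit {
                if a + b == target {
                    // Label first, value second: leaves both `for` loops
                    // and makes the outer `loop` evaluate to this value.
                    break 'outer Some((a, b));
                }
            }
        }
        // No label: this break targets the enclosing `loop` directly.
        break None;
    }
}
```

The apostrophe is what lets the parser tell the label from the value, so no extra keyword or parentheses are needed in this form.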

Another option would be a new keyword like break_val 4;, but then it really ought to wait for Rust 2.0 and some people don't like introducing new keywords full-stop.

dhardy on 19 May 2016

👍1

@dhardy, I'm afraid you missed the bad news I cited in https://github.com/rust-lang/rfcs/issues/961#issuecomment-175164750 -- we cannot do that because for and while loops are already expressions with type () today, whereas they would have type Option<_> under that plan. So we have to do for-else loops or leave them as is.

As @glaebhoerl wisely pointed out, break with value with plain loops is much less controversial and the most general so we should do that first.

Ericson2314 on 20 May 2016

That enshrines Option in the language, though. Are we sure we want to do that?

taralx on 20 May 2016

@Ericson2314 So the problem is that the loops are expressions that return () at the moment?

What prevents us considering them expressions of type Option<_> in the case the loop includes break with value? We don't have breaks with values right now, so the existing loops wouldn't change their type. I think that shouldn't introduce anything odd in the parsing phase, since the loops would parse to expressions anyway, and type checking is later.

golddranks on 20 May 2016

👍3

Yes that could be done @golddranks, though that would mean break isn't the same as break (). Unfortunate, but not a big issue I suppose.

Ericson2314 on 20 May 2016

That enshrines Option in the language, though. Are we sure we want to do that?

don’t we already use some types with language features? e.g. Result with ?/try.

flying-sheep on 20 May 2016

@Ericson2314 I see, () is not equal to Option<()>, and adding a special case to return () instead of Option<T> where T=() would be weird. Not impossible, but certainly a bit weird, and annoying if later someone wants to use the value just to tell whether the loop was 'broken' out of or just finished normally. Or as in your second post break; and break (); behave differently; also weird.

As you say, only allowing the loop variant to return anything other than () is the easier version. This still needs the break-with-value syntax to be added, and it's probably going to trip up existing parsers any way it's done, but I guess it's manageable?

Never mind, I read #955 and see that this has already been thought through very well. Can we go ahead with the loop only version?

dhardy on 20 May 2016

Option<()> is equivalent to bool. Some(()) and the corresponding None mean something very similar to true and false, respectively. the boolinator crate in fact converts between the two.

therefore yes, Some(()) would mean “has been broken out of” and None “ran to the end”.

flying-sheep on 20 May 2016

@flying-sheep correct, but I imagine this is a minor use-case and could be sacrificed (you could for example return an Option<bool> and let the compiler figure out you don't use the extra bit).

dhardy on 20 May 2016

I imagine this is a minor use-case

it’s very common for “did i find something fitting”-type code in other languages, although i guess you’d idiomatically use Iterator::any() for this.

Option<bool> has three possible values, so what would the third one mean? i dislike creating code that assumes variants are not used, if the type system allows tightly fitting your code to only have possible variants.

flying-sheep on 20 May 2016

I think there are two semi-sensible choices, of which neither is perfect. Having a loop with break; return () and a loop with break (); return Option<()> is weird indeed but making break () a special case is even weirder and as mentioned, there are cases where you like to find out whether the loop looped until the end or broke.

The other choice is to make all loops return Option<_>. This strictly said breaks backwards compatibility, but I find it hard to believe that anyone would purposefully write code that relies on that. The only case producing such code I can think of, is macro expansion or code generation. Maybe a crater run could be done to check if that kind of code exists?

golddranks on 20 May 2016

Wait, why would loop {} return Option<_>? If it returns anything, it's because of break...

taralx on 20 May 2016

loop {} without break should have an evaluation type of !.

therefore we already kinda have different return values based on the existence of break

flying-sheep on 20 May 2016

The other choice is to make all loops return Option<_>. This strictly said breaks backwards compatibility, but I find it hard to believe that anyone would purposefully write code that relies on that.

On the contrary, this would break a _lot_ of code. Basically any function that has a (non-diverging) loop as its last expression. For example, this would break:

fn print_some() {
    for i in 0..10 {
        println!("{}", i);
    }
}

The viable non-backwards-incompatible options are to only change the type from () when the break-with-value syntax is used, and

return Option<T> or
return T and require an else clause (mirroring Python's for-else construct).

birkenfeld on 20 May 2016

👍1

@birkenfeld I stand corrected. Yeah, that seems to be the case indeed.

golddranks on 20 May 2016

good point. else is a bit nonobvious. maybe then, but i’m also happy with changing the return type based on the existence of a break value.

as said: the return type of loop {} should be ! anyway.

flying-sheep on 20 May 2016

I think "default" or "default break" might be more obvious instead of
"else" and this would solve the problem just mentioned. Also, both default
and break are keywords so it shouldn't break backwards compatibility.

JelteF on 20 May 2016

What is !? Is it a valid type? The return type, as I understand it, should be ().

dhardy on 20 May 2016

dhardy: For loop expressions without a break statement, the return type should be !, which means they never return. Edit: It has been used as a return type annotation, but recently there has been discussion about promoting it to a real type even inside the compiler. Here's the relevant RFC. https://github.com/rust-lang/rfcs/pull/1216

golddranks on 20 May 2016

jup, you can use it e.g. for the return type of your program’s main loop function:

fn main_loop() -> ! {
    loop {
        for event in event_queue.fetch() {
            handle_event(event);
        }
        tick();
    }
    // not reached
}

flying-sheep on 20 May 2016

This isn't quite true: loop{} appears to type as ! but loop{ break; } does not. This complicates things. I'm writing an RFC; I'll post soon.

dhardy on 20 May 2016

this is exactly what i said, isn’t it?

FTR: there’s also a RFC to promote ! to a proper type that can be used in any position.

flying-sheep on 20 May 2016

loop { break; }: (), as it should.

Ericson2314 on 20 May 2016

Sorry for my misunderstanding. I'll let you have a read before making a pull request: loop-break-value

dhardy on 20 May 2016

one thing makes no sense:

if EXPR must evaluate to T and the loop has type T (instead of Option<T>), what does it evaluate to in case a conditional break is not hit?

E.g.: what value has thing_found if no thing == thang?

let thing_found = for thing in things {
    if thing == thang { break thing }
}

flying-sheep on 20 May 2016

The RFC does not propose to modify for etc., only loop which cannot return like this (see discussion at end).

dhardy on 20 May 2016

ah, gotcha. and loops have to be broken out of. but we need to think about eventually extending this to for. if loops evaluate to T, the Option<T> route for for loops is blocked since it would be inconsistent.

then we’d _have_ to go the route for thing in things { ... } default { x }

then again, a loop _always_ returning Some(...) and never None is stupid. maybe for the sake of soundness and consistency, we need to go the for ... default/loop → T route.

it’s a bit less elegant than to leverage the many Option methods i guess, but not much.

flying-sheep on 20 May 2016

@dhardy (re: draft)

let a: i32 = loop {}; is currently legal (as it should be)

let c = loop {
    if Q() {
        "answer"
    } else {
        None
    }
};

I think you're missing some breaks here?

The type of loop expressions is no longer fixed and cannot be explicitly typed.

I'm not sure what this means - are you referring to the fact that the type of a loop would depend on what breaks it contains (or doesn't)? It's already the case that a loop containing a break is typed as () and one that doesn't is typed as !.

glaebhoerl on 20 May 2016

@flying-sheep Read birkenfeld's last comment again. for cannot return Option<T> anyway. Also read this comment.

dhardy on 20 May 2016

@glaebhoerl thanks for the comments. That last bit I wrote before realising that loop may be ! or ().

As to your first point, ! may be coerced to any type? Makes sense. I'll change the RFC.

dhardy on 20 May 2016

@dhardy did https://github.com/dhardy/rfcs/pull/1 because can't line edit. Thanks for this!

Ericson2314 on 20 May 2016

dhardy: I don't know if my intuition is type-theoretically sound, but because ! is a type that has no values, it represents a type that can't be reached at runtime. That means that code that handles it can't be reached either (or the program won't type-check), and that allows doing all kinds of crazy stuff with !, like coercing it to anything. The code can't behave wrongly, because it will never be run.
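A small sketch of that coercion in action, assuming break with a value in loop is available. The break expression itself never produces a value at its own position, so it can sit in a match arm next to an i32 without a type error. The function name first_positive is hypothetical.

```rust
/// Hypothetical example: first strictly positive element, or 0 if none.
fn first_positive(values: &[i32]) -> i32 {
    let mut it = values.iter();
    loop {
        let v: i32 = match it.next() {
            Some(&v) => v,
            // `break 0` diverges, so this arm is accepted where an i32 is expected.
            None => break 0,
        };
        if v > 0 {
            break v;
        }
    }
}
```

The code after the match is never reached on the None path, which is the "can't behave wrongly because it never runs" intuition described above.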

golddranks on 20 May 2016

@golddranks I have assumed this in my changes (that ! may be coerced to any type) but didn't check the language spec. It fits with my language theory.

dhardy on 20 May 2016

@flying-sheep Read birkenfeld's last comment again. for cannot return Option<T> anyway.

for { break value } can, as it doesn’t exist yet.

flying-sheep on 20 May 2016

@flying-sheep no it can't, because it would break this code:

fn foo() -> () {
    // error: 'for ...' has type `Option<()>` but expected type `()`
    for { break; }
}

dhardy on 20 May 2016

break value

not break;

just like you do it with loop in your RFC

flying-sheep on 20 May 2016

@golddranks thanks for throwing me off with the break panic!(); example!

By the way, I talk about the "result type" of the loop, since the type of the loop itself is some code...

dhardy on 20 May 2016

@flying-sheep are you proposing the special case that break; and break (); make for return () while break "example"; makes for return Option<&str>? That's a dichotomy I mentioned above: weird.

dhardy on 20 May 2016

@dhardy I assume you meant to tag me there? "result type" sounded too much like function types to me, c.f. "type of 1 is uint" vs "result type of 1 is uint", but that's just my opinion and it's unimportant :).

Ericson2314 on 20 May 2016

By the way, I tried to take account of type coercion and type deduction in my last examples, but am not familiar with the formal rules (despite making frequent use of dereferencing). Anyone want to point me in the right direction or give some examples?

dhardy on 20 May 2016

@Ericson2314 whoops, yes, you. Code needs to be evaluated :) I may be a little imprecise there.

dhardy on 20 May 2016

@flying-sheep are you proposing the special case that break; and break (); make for return () while break "example"; makes for return Option<&str>? That's a dichotomy I mentioned above: weird.

only break;. break (); would be the break EXPR case in your RFC and return Option<T> (in case of the “no-default” variant) or T (in case of the “default” variant).

flying-sheep on 20 May 2016

In my RFC I point out that break; and break (); are equivalent (thanks Ericson). I suppose this is reason not to make this the case, but it's weird. Why I avoided for etc.

dhardy on 20 May 2016

In the case of loop, it must return either ! or () already, and break (); must return (), so the two must be equivalent (having an extra rule to say if flip() { break; } else { break (); } is not legal would be silly IMO).
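
A small sketch of that equivalence as it behaves in Rust since break-with-value was stabilised for loop: both forms type-check in the same loop, and the loop has type (). The function name and the flip parameter are placeholders.

```rust
/// Leaves the loop through either spelling of "break with unit".
fn stop_either_way(flip: bool) {
    let unit: () = loop {
        if flip {
            break; // no value: `()` is assumed
        } else {
            break (); // explicit unit value, same type
        }
    };
    unit
}
```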

In the case of for possibly returning an Option<_> it seems either () must be handled specially or break; and break (); must not be equivalent.

I think for { ... } default { ... } may be the way to go if for is to be allowed to return values...

dhardy on 20 May 2016

thinking about extensibility aka for/while, we have three options:

make loop { break EXPR } and for { break EXPR }/while { break EXPR } return an Option<T> (consistent between loop types, loop weirdly can’t evaluate to None)
make loop { break EXPR } return T and for { break EXPR }/while { break EXPR } return an Option<T> (inconsistent between loop types, evaluation types make sense)
make loop { break EXPR } and for { break EXPR } default { EXPR }/while { break EXPR } default { EXPR } all return T (consistent between loop types, evaluation types make sense)

Resulting in this table:

| | minimal loop types | Always Option | for/while ... default |
| --- | --- | --- | --- |
| loop { break EXPR } | T | Some(T) | T |
| for { break EXPR }/ while { break EXPR } | Option<T> | Option<T> | T |
| ⇒ | Inconsistent between loops | Makes no sense for loop | Needs an extra block |

i think the last one makes the most sense as it has no inconsistencies.

flying-sheep on 20 May 2016

👍1

@flying-sheep I agree that your third mentioned option should be the way to go. The only thing that should be decided upon would be the keyword used for the non break branch.

I think the available options would be (based on the keywords in https://github.com/rust-lang/rust/blob/master/src/libsyntax/parse/token.rs#L408-L472):

else, disliked by lots of people, so unlikely to be the best choice.
default, simple and considered clearer than else.
default break, a bit long, but makes sure the meaning comes definitely across.
else break, I'd say about as clear as default break.

I can't think of other options that would make sense when only considering the already existing keywords.

JelteF on 20 May 2016

If there's any consensus on syntax, I'll update the RFC. Otherwise I'll leave this out. What should we do, vote, shout as loud as possible, out-argue everyone else? I hate bike-shedding.

dhardy on 20 May 2016

I don't think there is consensus on the syntax, except for the fact that else would be too unclear.
I don't have strong feelings regarding any of the three options I just mentioned and in all this discussion nobody seems to have said anything negative about default. So that would be a good choice.

The only thing that might be something to think about is this comment:
@golddranks commented on 26 Jan

Did the code inside the loop run at all? In the cases where the loop has side effects (like initializing a variable declared in an outer scope), it might be valuable information for the control flow analysis to understand, and thus, it might be beneficial to have a language-level construct to help with this case.

I'm not entirely sure if this info is really important though.

If you would update the RFC to use default then I would be fine with it. But the RFC for just the loop seems a clear go as it doesn't have some of the issues for and while have. So it might be better to separate them so discussion about for and while doesn't influence the loop RFC.

JelteF on 20 May 2016

Default is not a proper keyword currently and putting it in proposed use is a breaking change. The contextual keyword thing that is getting abused in the other RFCs doesn’t work here either because:

let default;
for x in 1..2 {} default = 10; // works currently
// following requires infinite lookahead
for x in 1 .. 2 {} default /* a lot of things could go here, comments, docs and especially – attributes etc etc */ {}

and we do want to have no infinite lookahead in our parser very badly.
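
A compilable sketch of the first half of that argument: default is an ordinary identifier in today's Rust, so a for loop followed by an assignment to a variable named default is already valid code. The function name is made up for illustration.

```rust
/// Shows that `default` can be used as a plain variable name.
fn default_as_identifier() -> i32 {
    let default; // declared here, initialised after the loop
    for _x in 1..2 {} default = 10; // a loop statement, then an assignment
    default
}
```

A parser that treated default after a loop body as the start of a new clause would have to look past arbitrary tokens to tell these two cases apart.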

nagisa on 20 May 2016

👍4

Let's just leave it out for now, otherwise the RFC will be bikeshedded to death.

Ericson2314 on 20 May 2016

👍1

make loop { break EXPR } and for { break EXPR }/while { break EXPR } return an Option<T> (consistent between loop types, loop weirdly can’t evaluate to None)

Maybe, just maybe, loop _can_ evaluate to None:

loop {
    if ... {break}  // Evaluates to None.
    if ... {break 123}  // Evaluates to Some (123).
}

ArtemGr on 20 May 2016

@ArtemGr But that still leaves a shitty single-break case. Much better to just manually do the option if that is desired.

Ericson2314 on 20 May 2016

Agreed. My RFC explicitly forbids the type of the break value to disagree (where, without value, () is assumed).

dhardy on 20 May 2016

Okay, just created the pull request. I don't think there's really anything worth adding about for/while/while let at this point.

dhardy on 20 May 2016

(For what it's worth, I think default is the first idea for the while/for syntax that's actually _good_, as in, is not going to make half of people think of the opposite thing from what it actually means. Too bad we can't just add it as a keyword, especially given Default::default() which is in std.)

glaebhoerl on 20 May 2016

@nagisa, @glaebhoerl
An idea that I think would work is using break default instead of default.
It should be fully backward compatible as break cannot take arguments (except for in the now proposed loop-break-value RFC).
It's a bit longer than default, but I would argue that it is even more clear what it does.

JelteF on 20 May 2016

@nagisa
After thinking a bit more about this, I'm guessing it would also need infinite lookahead:

for x in 1..3 {
    for x in 1..2 {
    } break; // works currently
    for x in 1..2 {
    } break  /* a lot of things could go here */ default {}
}

Or am I misunderstanding what it meant?

Solutions I can think of:

!break
!break default
else break
else default

JelteF on 21 May 2016

👍1

It seems that motivation for loop-break-value is the biggest issue, not details of how it functions. Does anyone have motivation to add to the discussion?

dhardy on 21 May 2016

I think a lot of it boils down to style. Those of us with an FP background are used to continuations and whatnot, and see the let x; trick as needlessly imperative or even obfuscated, even if it is safe.

Also, the syntactic overhead is worse when one needs to introduce a block and a variable, like with

foo(loop { ... break bar; ... })
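
To make the comparison concrete, here is a rough sketch of the two styles side by side, written against the loop-break-value feature as it was later stabilised. The function names and the "first square above a limit" task are invented for the example.

```rust
/// The `let x;` trick: declare first, assign inside the loop, then break.
fn first_square_above_deferred(limit: u32) -> u32 {
    let result;
    let mut n = 0;
    loop {
        n += 1;
        if n * n > limit {
            result = n * n;
            break;
        }
    }
    result
}

/// The same search with the loop used directly as an expression.
fn first_square_above(limit: u32) -> u32 {
    let mut n = 0;
    loop {
        n += 1;
        if n * n > limit {
            break n * n;
        }
    }
}
```

The second form can be passed straight to a call such as foo(...) without introducing a named temporary.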

Ericson2314 on 21 May 2016

👍1

I would like to have some kind of break-with-value so I can prototype the for/else change in macros.

I want for/else for cases where I'm doing an "iterate over X and if I don't find what I'm looking for do Y".

taralx on 22 May 2016

The motivation was pretty clear to me:

Rust is an expression-oriented language. Currently loop constructs don't provide any useful value as expressions, they are run only for their side-effects. But there clearly is a "natural-looking", practical case, described in this thread and the RFC, where the loop expressions could have meaningful values. I feel that not allowing that case runs against the expression-oriented conciseness of Rust.

golddranks on 22 May 2016

👍8

Couple thoughts:

Was it ever considered to change the order of things?

let x = for thing in things default "nope" {
    if thing.valid() { break "found it!"; }
}

Another option is to introduce comprehensions with a variation on the for construct.

yigal100 on 24 May 2016

👍5

I'd like to throw in another vote that the only good choice presented so far for for/while loops is some variation of the for ... { ... } some_keyword { ... } syntax. else is too confusing and default isn't a keyword, so how about final? As in, _"if we get to the end of the loop without breaking then the final value is this"_.

I think it looks alright:

let foo = for x in iter {
    ...
}
final {
    23
};

The downside is that final blocks could be confused for finally blocks in other languages.

canndrew on 2 Jun 2016

👍2

One of else break, else default, or default break seems potentially non-horrible I guess... is the last one unambiguous? (We should also be careful not to leave any ambiguity landmines for let-else, maybe....)

One oddity is that this would presumably look like:

for elem in elems {
    if pred(elem) {
        break elem
    }
} else break {
    default_value()
}

The oddity is that the expression after break presumably has to be surrounded by braces in the second case, but not in the first one.

This suggests a potential variant where the break is "split off from the else, and moved inside the block":

for elem in elems {
    if pred(elem) {
        break elem
    }
} else {
    break default_value()
}

So the rule would be that the else block _must_ be exited with break, and is not allowed to "run off the end". With the intuition being something like that the else is still part of the loop, and responsible for breaking out of it when the main body didn't do so. Compared to plain for..else, would this make the purpose of the else block more obvious, or would it just be even weirder?

glaebhoerl on 2 Jun 2016

Instead, make else part of the loop's final statement execute as if it is a break expression.

erkinalp on 22 Aug 2016

Edit: I second @JelteF on using !break

vitiral on 28 Sep 2016

👎2

I just wish we'd pick _something_ and go with it. I find myself wanting this all the time.

canndrew on 28 Sep 2016

👍3

well, i would like to have for .. else semantics, it’s known from python:

for x in xs {} returns ()
for x in xs {} else {} returns ()
for x in xs {} else { z } returns z
for x in xs { if pred(x) { break y; } } else { z } requires y and z to have the same type and returns a value of that type. z will be returned when the loop ran all the way through. this can happen if
1. there are no breaks (as shown above)
2. there are no iterations (e.g. for _ in iter::empty() { … } else { z } returns z)
3. no conditional break is hit

you can e.g. do let empty = for _ in it { break false } else { true }.
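
For comparison, a sketch of how these semantics can be approximated without new syntax, assuming a toolchain with labeled block expressions (stable since Rust 1.65, so not available when this was written). The function name and the string results are placeholders.

```rust
/// Emulates `for x in xs { if pred(x) { break y } } else { z }`.
fn find_label(xs: &[i32], wanted: i32) -> &'static str {
    'search: {
        for &x in xs {
            if x == wanted {
                // Plays the role of `break y` in the proposal.
                break 'search "found";
            }
        }
        // Reached when the loop ran to completion or never iterated:
        // plays the role of the `else { z }` block.
        "not found"
    }
}
```

Both values must have the same type, exactly as in the proposed rule for y and z.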

flying-sheep on 28 Sep 2016

👍1

I would like to note that for ... else _is_ unintuitive in python. I actually love the functionality when I find use for it, but I always have to look it up.

However, for ... else is much more intuitive when it is used for assigning variables. i.e. let v = for x in xs { break y } else { z }, however it is not intuitive when it used how python uses it -- to simply _do something_ (not return something) when break is not called.

I think too many of these examples didn't use this for what it would actually be used for, which made else look worse than it actually is.

When you write:

let out = for x in y {
    if x == value {
        break "found";
    }
} else {
    "not found"
};

It is pretty clear what is going on.

The few times a rust programmer wouldn't be using else to return a value, they should be familiar enough with the syntax because it will be much more common in rust than it ever was in python. I think people are overly concerned with how difficult it is to understand in python -- rust and python are different languages and this is an (almost) entirely different use case.

vitiral on 28 Sep 2016

From the error message perspective, is this clear?

error: 'default break' required when 'break' returns a value from a 'for' loop:
break "some value";
~~~~~~~~~~~~~~~~~~^
suggestion: add 'default break ...;'
for x in X {
    ...
}
  ^

Using _default break_ instead of just _default_ or _else_ emphasises that this has to do with _break_. Using _default_ over _else_ emphasises that this is for a default case, like in match. I don't personally like the style of using an operator as part of a keyword like !break.

Question: do we need the braces around the default case?

let is_empty = for _ in myList { break false; } default break true;
let is_empty = for _ in myList { break false; } default break { true };

The first version is less uniform with if ... {} else {} but more uniform within itself. Are there issues parsing this? (Note: if requires braces around the first block to avoid confusion over association in the if (...) ... if (...) ... else ... case. I don't believe the {} are needed around the else value except for symmetry.)

If there is no significant disagreement, I propose adding the default break ...; extension for for and while to the RFC as an option:

while EXPR { EXPR } default break EXPR;

(requiring the semicolon ';' after default break even if the result type is ()).

dhardy on 28 Sep 2016

I agree that default break is good on two conditions:

default break is legal without potentially breaking anyone's existing code.
the syntax should be in line with if ... else: while EXPR { EXPR } default break { EXPR }

This does cause an interesting case, where default is not itself a reserved keyword, but combinations of words are -- i.e. default break. Since that would not be legal syntax anyway, I'm not sure how bad this is.

I'm wondering if the default partial keyword would have other uses. It's too bad we didn't reserve it early to use it instead of _ in match expressions.

If "two word" statements are valid, then another alternative is not break, which would have the benefit of not being used in the stdlib as far as I know and could have a warning added to it with candidacy to be a reserved keyword in Rust 2.0.

vitiral on 28 Sep 2016

I just created a new issue to discuss other implementation details than the name of keywords here: https://github.com/rust-lang/rfcs/issues/1767

JelteF on 6 Oct 2016

I tried prototyping this in macros and got stuck when implementing break. Anyone successfully managed this?

taralx on 2 Dec 2016

You mean for for and while? it was just done for loop. Use that to make macros for the other two.

Ericson2314 on 2 Dec 2016

It may be worth noting that this trick already works now. Whether this is an argument for or against a separate for…{else|else break|!break|default break|final} syntax, I’ll let you decide:

#![feature(loop_break_value)]
let result = 'l: loop {
    for x in xs {
        if pred(x) {
            break 'l x
        }
    }
    break "not found"
};

andersk on 4 Dec 2016

👍4

i think that’s a case of being overly clever. loops evaluating to values will be rare enough, so i guess this trick will be more confusing for readers than it is useful.

flying-sheep on 4 Dec 2016

👍1

A for loop like that is never more intuitive than, say, xs.find(|x| pred(*x)).unwrap_or("not found"). Iterator::find and Iterator::position take an FnMut, so just worm any side effects into pred. Also unwrap_or_else takes FnOnce and hence FnMut too. Any while loop can be dealt with using, say, (1..).find(|_| ...).unwrap_or_else(...) too.

Adding this short-circuiting find_map method to Iterator would make this rebuke of else clauses for for and while loops complete:

#[inline]
fn find_map<P,R>(&mut self, mut predicate: P) -> Option<R> where
    Self: Sized,
    P: FnMut(&Self::Item) -> Option<R>,
{
    for x in self {
        if let Some(y) = predicate(&x) { return Some(y) }
    }
    None
}
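
For readers on a current toolchain: the standard library has since gained Iterator::find_map (stable since Rust 1.30), although its closure receives the item by value rather than by reference as proposed above. A short sketch of both combinator styles follows; the function names and inputs are invented for the example.

```rust
/// `find` plus `unwrap_or`, standing in for a `for ... else` search.
fn describe(xs: &[i32], wanted: i32) -> &'static str {
    xs.iter()
        .find(|&&x| x == wanted)
        .map(|_| "found")
        .unwrap_or("not found")
}

/// `find_map` stops at the first item for which the closure returns `Some`.
fn first_parsed(words: &[&str]) -> Option<i32> {
    words.iter().find_map(|w| w.parse::<i32>().ok())
}
```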

A while let loop is less trivial to displace because it does pattern matching. If that bothers folks, then maybe writing loop { match ... { ... } } as loop match ... { ... } would address that best, but just placing the } } together achieves that too.

burdges on 4 Dec 2016

What's wrong with the following, again?

loop without a break has type !
loop with a break x where x: T has likewise type T; breaking returns the value as-is
for/while without a break has type Option<!>; falling out of the loop without breaking returns None
for/while with a break x where x: T has type Option<T>; breaking with a value x returns Some(x)
break without an explicit value is equivalent to break ()
Make every type coercible into () (complementing ! being coercible into any type), so that code which treats loops as returning () keeps working.

Hmm, now that I think of it, if x { y } could be made into mere sugar for match x { true => Some({ y }), false => None, }, and then yy else z into a right-associative operator meaning match yy { Some(y) => y, None => z, }, which gives you:

for ... else .../while ... else ... for free
#1686 for free
less need for unwrap_or and unwrap_or_else

fstirlitz on 30 Mar 2017

👍1

What's Option<!>? The code possibly doesn't exit?

But code like this is currently legal:

fn f(x: u32) {
    for i in 0..x {
        do_something(i);
    }
}

If for returned None, that would no longer type check (function return type is ()).

dhardy on 30 Mar 2017

@dhardy: The post also contains this proposal, which would avoid breaking that code:

Make every type coercible into () (complementing ! being coercible into any type), so that code which treats loops as returning () keeps working.

Making () the top type is a very radical proposal & one I can't imagine us ever doing. Every value would be able to vanish into unit at the whim of the typechecker.

withoutboats on 30 Mar 2017

That article says the top type is one that can represent every possible value. This is something different: making every type convertible into (). (The ! type RFC states that ! isn't a true bottom type either, despite being convertible into any type.)

Right now, not even code like this is accepted:

let x: () = 5;
let y = 5 as ();

I don't see a reason why it shouldn't. It's valid (even if not very useful). Casting to () can be always defined as ignoring the casted value and always returning ().

fstirlitz on 30 Mar 2017

👎3

@fstirlitz Making the type of for-without-break anything other than () is a breaking change.

Making everything coerce into () restricts the type system in subtle and difficult-to-fix ways. If you want the nitty-gritty I recommend Stephen Dolan's paper on Algebraic Subtyping.

taralx on 31 Mar 2017

Right now, not even code like this is accepted:

let x: () = 5;
let y = 5 as ();

I don't see a reason why it shouldn't. It's valid (even if not very useful). Casting to () can be always defined as ignoring the casted value and always returning ().

Rust's safety heavily depends on very smart people being able to prove important claims about how the types work and interact with every other language feature. I'd guess that adding mathematically horrid things like this would make that work much harder, and drive away all the people whose input Rust depends on.

le-jzr on 31 Mar 2017

👎1

I don't understand what is so supposedly 'horrid' about making a unit type into a terminal object in the diagram of possible type casts. It's mathematically sound. The RFC defining the ! type makes it coercible into anything (making it an initial object), and the sky is hardly falling. I don't see many people complaining that C is broken for making casts into (void) possible.

fstirlitz on 31 Mar 2017

I agree, it would be perfectly sound theoretically to allow any type to coerce into (): after all, it's just the same thing as manually writing let foo: () = { bar; }. I can't think of anything type-system-restricting or mathematically horrid about it. Not everything that's theoretically sound is prudent, however, and it seems like this would be error-prone in practice. Consider fn do_thing() { ... code ... return false; ... code ... }. We could turn that false into a () implicitly to match the return type of the function, but it's quite possible that this would end up masking a bug.

(It's an interesting question why this kind of error-proneness seems to be the case for coercing into (), but not for coercing out of !, given that the two are, as far as I can tell, duals.)

glaebhoerl on 31 Mar 2017

👍1

! is naturally coercible into anything because it has no values. Every value of ! is also a value of any other type. In order for () to be a supertype of everything (and a dual of !), it would need to contain ALL the values representable in Rust. In other words, traditional intuition is that when something coerces, there is no loss of information about the value. This clearly doesn't hold here.

le-jzr on 31 Mar 2017

👍2

I don't see many people complaining that C is broken for making casts into (void) possible.

C is broken in so many more important ways that casts into (void) are hardly worth mentioning.

le-jzr on 31 Mar 2017

(It's an interesting question why this kind of error-proneness seems to be the case for coercing into (), but not for coercing out of !, given that the two are, as far as I can tell, duals.)

They're not duals; coercing into () loses enormous amounts of information; coercing from ! only loses information about what code is unreachable.

withoutboats on 31 Mar 2017

👍1

They're not duals; coercing into () loses enormous amounts of information; coercing from ! only loses information about what code is unreachable.

That's not it either. To coerce a value at runtime, you need to have the value, and ! type encodes the fact that the value cannot exist. ! can coerce to anything simply because no path of execution can ever reach the coercion at run time -- it's dead code. You can't lose information in a process that doesn't happen.

le-jzr on 31 Mar 2017

(tired so may not be the best explanation)

The reason that they're duals is that () is the top type. From all types, it's possible to get a () (the ; operator). ! is the bottom type; from !, it's possible to get any type (coercions).

In other words, () is the result of affine-style weakening; you take an arbitrary T, and lose all information. ! is weakening run in reverse; you take no information, and create an arbitrary T out of thin air.

fn weaken<T>(t: T) -> () {
  t;
}

fn reverse_weaken<T>(bot: !) -> T {
  bot
}

(thanks to @eternaleye for help with phrasing)
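
Since ! cannot be named as a parameter type on stable Rust, here is a sketch of the same pair of functions using std::convert::Infallible, an empty enum, as a stand-in for !. That substitution and the unwrap_infallible helper are assumptions made for illustration, not part of the original comment.

```rust
use std::convert::Infallible;

/// Any value can be discarded, leaving only `()`.
fn weaken<T>(t: T) -> () {
    drop(t);
}

/// An uninhabited argument can be "turned into" any `T`:
/// the empty match has no arms because there is no value to handle.
fn reverse_weaken<T>(bot: Infallible) -> T {
    match bot {}
}

/// A practical use: the `Err` arm can never be taken at runtime.
fn unwrap_infallible<T>(r: Result<T, Infallible>) -> T {
    match r {
        Ok(v) => v,
        Err(never) => reverse_weaken(never),
    }
}
```

reverse_weaken type-checks for every T, yet no caller can ever supply its argument.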

strega-nil on 1 Apr 2017

😕1 👎1

() is not the top type. It's the type that has a single empty value (), i.e. a zero-sized type. Perhaps you are confused by the Rust semicolon, which (explicitly) throws away the value in front of it and returns ().

On the flipside, coercing from ! (which as you correctly said IS the bottom type) does not create anything out of thin air. It takes the value that was there, and returns it as another type, just like any other coercion. What is specific to ! is simply that it's the empty set (this is very different from zero-sized types like ()!), so the value will never exist. In your example, reverse_weaken is impossible to invoke at runtime.

A top type, however, is a type that can hold any value. Note that a true top type cannot exist in Rust, because of value semantics and lack of automatic boxing -- values in Rust can have arbitrary static length, and a top type would need to represent them all, so no amount of memory would suffice for a single value of the top type.

le-jzr on 1 Apr 2017

👍1

While many people here make arguments out of misunderstandings about the type system, this does not mean the original proposal is without merit. Let's stop talking about breaking the type system and focus on the original problem instead?

le-jzr on 1 Apr 2017

👍1

No, () never was the top type. Perhaps what you're thinking of is Any (Scala, Boost, also in Rust's std library)?

() is the unit type, a bit like C's void. Thus fn f(...) -> () {...} is the same as fn f(...) {...}. But not at all like void* which is (possibly) a value with undefined type. void* is more like Any, except Any actually has some guarantees whereas void* doesn't.

As for @fstirlitz's suggestion of allowing any type to be implicitly convertible to (), there have already been several arguments against that and I agree with them (it would be a major change to the type system, significantly reduce type safety and might have other consequences).

As for solving break-with-value from for and while, I still haven't seen a great solution (maybe a couple of "okay" ones involving extra syntax).

dhardy on 1 Apr 2017

If folks still really want these, then we might consider adding a variant of any with a result:

fn any_map<B, F>(self, f: F) -> Option<B> where F: FnMut(Self::Item) -> Option<B> { .. }

And/or a short circuiting fold method:

fn fold_while<B, F>(self, init: B, f: F) -> B where F: FnMut(B, Self::Item) -> Result<B,B> { .. }

At least these treat the borrowing problem that makes for/while .. else .. unworkable by consolidating the borrows into F.
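
On a current toolchain, something close to fold_while can be sketched with Iterator::try_fold (stable since Rust 1.27), using Err to carry the accumulator out early. The function name and the "stop at the first negative number" rule are invented for the example.

```rust
/// Sums the items until the first negative one, then stops early.
fn sum_until_negative(xs: &[i32]) -> i32 {
    let folded: Result<i32, i32> = xs.iter().try_fold(0, |acc, &x| {
        if x < 0 {
            Err(acc) // short-circuit, keeping the sum so far
        } else {
            Ok(acc + x) // continue with the updated sum
        }
    });
    // Both variants hold the accumulator, mirroring `Result<B, B>` above.
    match folded {
        Ok(total) | Err(total) => total,
    }
}
```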

burdges on 1 Apr 2017

@le-jzr I feel like you misunderstand basic type theory. I am not talking about the Python or Java interpretation of types.

The wikipedia article mentions it wrt propositional calculus:

The notion of top is also found in propositional calculus, corresponding to a formula which is true in every possible interpretation.

Just because my comment uses a phrasing you are not used to, does not make my point invalid. In classical type theory, there is no "top type" as used by the GC languages, because there's no way to construct such a type.

Please don't assume I don't understand. It makes me very cranky.

strega-nil on 1 Apr 2017

😕1 👎1

@ubsan But by that interpretation, any type that corresponds to a formula that is true - that is, any inhabited type - should be a top type.

In other words, you can also implement:

fn weaken<T>(t: T) -> i32 {
  4
}

You could argue that this doesn't merely throw away information but also adds it - because this weaken is not surjective - although that gets away from the types-as-propositions interpretation. But then, what about, say, enums with a single variant, or Option<!>, or other types that are homomorphic to ()? Should they allow coercion from anything? (I guess you haven't specifically advocated for coercion, but my point is that they're just as eligible to be 'top types'.)

comex on 1 Apr 2017

Regardless of how () and ! should behave, and whatever "top type" does or doesn't mean, allowing for loops to return a value seems like a really weak argument for making a change as fundamental and far-reaching as enabling casting any type into ().

Ixrec on 1 Apr 2017

👍6

@comex On the Option<!> front:

Option<!> is equivalent to 1+0, which is isomorphic to 1: it has exactly one value (left ()), equivalent to 1, which has exactly one value (()).

i32 is not isomorphic to 1. Given bit := 1 + 1, byte := bit x bit x bit x bit x bit x bit x bit x bit, and i32 := byte x byte x byte x byte; there are ~4 billion valid values, unlike 1, which has one valid value.
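
The Option<!> claim can be sketched on stable Rust by substituting std::convert::Infallible for ! (an assumption made here because ! cannot yet be used as a type argument on stable). The two conversions below are mutually inverse, since each type has exactly one value; the function names are made up.

```rust
use std::convert::Infallible;

/// The only possible input is `None`; the `Some` arm is unreachable.
fn to_unit(o: Option<Infallible>) -> () {
    match o {
        None => (),
        Some(never) => match never {},
    }
}

/// The only value of `()` maps back to the only value of the option.
fn from_unit(_unit: ()) -> Option<Infallible> {
    None
}
```

No such pair exists between i32 and (), because a round trip through () cannot preserve which of the roughly four billion i32 values went in.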

For more information theory, I recommend Roshan James' "The computational content of isomorphisms".

Otoh, I don't want coercions to (); just pointing out that ! and () are duals.

strega-nil on 2 Apr 2017

@ubsan I said "homomorphic" when I meant "isomorphic", but otherwise, yes, I know.

Actually, I'm not sure I disagree with you at all.

But my point was:

i32, while not isomorphic to (), has an equal claim to being "top" when considered as a proposition. Propositions can only be true or false, corresponding to inhabited or uninhabited; both i32 and () are inhabited base types. The only difference in this sense is that T -> () is built into Rust as the ; operator, while if you want T -> i32 you have to write it yourself.
If you add the requirement that a type must have exactly one possible value to count as top-like, then () is a candidate, but so are types isomorphic to it such as Option<!>. This is where I may have misunderstood your argument. I agree that () could be viewed as dual to !; I just don't think it's necessarily desirable for the language to view it that way, such as by enabling anything-to-() coercion, or in any other respect. In other words, () is just another type that happens to be top-like; it's not the canonical top-like type.

(I say "top-like" rather than "top" because of course it's not actually a supertype of anything; Rust doesn't have much true subtyping.)

comex on 2 Apr 2017

I'm not sure that "top-like" matters very much, but the homomorphisms do (and I mean homomorphism: a bijective map which in some sense preserves operations/meaning).

@ubsan I assume you mean Option<!> and () are duals. ! and () certainly aren't (() has one value, ! has none).

Duals in the sense that they have the same number of values? Yes, but only if Option::<!>::None is actually constructible (currently it isn't).

Duals in the sense that there exists an isomorphism between the type () and Option<!>, or even just a homomorphism (one-way)? That requires an understanding that the _value_ () and Option::<T>::None are somehow equivalent, e.g. that both are "none".

I am not averse to such concepts being explored and potentially even being accepted, but I don't think here is the place.

I also don't think it would solve this problem: if Option<!> is homomorphic to () with an implicit conversion allowed (so that fn f() { for x in vec![] {} } is valid when for x in vec![] {} has type Option<!>), then the natural implication is surely that Option::<T>::None is implicitly convertible to () for all T via the same homomorphism? Actually, no, because a homomorphism from Option<T> to a tuple would presumably map Some(x) to (x), but this is not type-sound ((x) and () being different types).

I am not so happy defining a homomorphism from Option<!> to () as a special case, though I suppose it could work (it is however rather awkward and a surprising thing for new users to learn).

dhardy on 2 Apr 2017

@dhardy no, duals does not mean isomorphic, it means "(of a theorem, expression, etc.) related to another by the interchange of particular pairs of terms". i.e., forall T, T -> U or forall T, U -> T, where the first U is 1, and the second is 0.

@comex the important thing is that, given referential transparency, there is exactly one implementation of forall T, T -> 1, which is why it's the top type. No matter what type T is, it's impossible to output anything but exactly one value, ().

strega-nil on 2 Apr 2017

👍3 😕1

@ubsan Your arguments are entirely invalid, and I don't know how to explain why any more than I already did. My best guess is that you consider "top" to mean something entirely different than the intended meaning in context of type theory. As for "dual", I can't even guess what formalism you are coming from, as the concept of duality means essentially a completely different thing in each corner of mathematics it's used. That does not implicitly make the definition invalid (after all, same or similar names are used in many contexts for different things), but it does make it irrelevant in this context.

Furthermore, fact remains that Rust does not include any way to convert a value from any type to (). All rust has is the ; operator which _discards_ value on the left and produces (). In fact, by your logic, any zero-sized type would have to be considered a "top type", since for any zero-sized type (also called singleton types in some contexts), "there is exactly one implementation of forall T, T -> 1". Does that mean everything should implicitly coerce into any zero-sized type? Certainly not! That would completely break numerous type safety idioms.

So while we can go back and forth arguing what a top type is, that conversation is meaningless. It was suggested that "() is a top type" implies that "it doesn't hurt the language to implicitly coerce to ()". However, that implication derives from the definition of top type as "type that can hold any possible value without loss". It certainly doesn't hold for a singleton type, even if you lawyer it into "top" by using a different definition.

le-jzr on 2 Apr 2017

❤1 😕1

Your arguments are entirely invalid, and I don't know how to explain why any more than I already did. My best guess is that you consider "top" to mean something entirely different than the intended meaning in context of type theory.

Huh? @ubsan is using "top" the way it's always used in type theory. The top type is the type which can hold values of any other type. () can do this, but you can't access the information once you convert into it. Categorically, the top type is the type for which there's exactly one function T -> () for any given T. i32 is inhabited, but that doesn't make it the top type. Option<!> is also the top type because it's isomorphic to (), but it's not Rust's canonical incarnation of the top type. Note, though, that in some sense Rust doesn't really have a top type, because expressions can diverge and cause side effects.

As for "dual", I can't even guess what formalism you are coming from, as the concept of duality means essentially a completely different thing in each corner of mathematics it's used. That does not implicitly make the definition invalid (after all, same or similar names are used in many contexts for different things), but it does make it irrelevant in this context.

Given that we're talking about type theory I'd assume that @ubsan means dual in the category-theoretic sense, in which case ! and () are duals as the initial and terminal objects of the category of Rust types.
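
The two directions can be sketched in Rust as follows. Since `!` is not usable as an ordinary type on stable, this sketch assumes `std::convert::Infallible` as a stand-in for it; the names `absurd` and `unit` are borrowed from common functional-programming usage and are not std functions.

```rust
use std::convert::Infallible;

/// Initial object: from the empty type there is a function to every `T`.
/// No value of `Infallible` exists, so the match needs no arms.
fn absurd<T>(never: Infallible) -> T {
    match never {}
}

/// Terminal object: from every `T` there is a function to `()`.
fn unit<T>(_value: T) -> () {}
```

`absurd` shows up in practice when handling a `Result<T, Infallible>`: the `Err` arm can call `absurd(e)` to produce whatever type the `Ok` arm yields.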

Furthermore, the fact remains that Rust does not include any way to convert a value from any type to (). All Rust has is the ; operator which discards the value on the left and produces ().

What's the difference between "convert" and "discard"?

In fact, by your logic, any zero-sized type would have to be considered a "top type", since for any zero-sized type (also called singleton types in some contexts), "there is exactly one implementation of forall T, T -> 1".

Correct. They would all be considered the top type if we had subtyping and put them at the top of the type hierarchy.

Does that mean everything should implicitly coerce into any zero-sized type? Certainly not! That would completely break numerous type safety idioms.

Like what? I'm not saying it's a good idea from a language-design point of view, but it's type-theoretically sound.

So while we can go back and forth arguing what a top type is, that conversation is meaningless. It was suggested that "() is a top type" implies that "it doesn't hurt the language to implicitly coerce to ()". However, that implication derives from the definition of top type as "type that can hold any possible value without loss".

No it doesn't. It doesn't hurt the language to implicitly coerce to () because it loses all information when you coerce any value - since there's only one way to be in a state of no information there's no ambiguity about what the coercion means. This coercion satisfies congruence in the sense that if you want to calculate 2 + 3 as a () you can either perform the calculation with ints then convert to (), or convert the ints to ()s then perform the computation on them, and you'll get the same () either way. If you're worried about reversing the coercion and getting the information back out, you can't do this with variants either (not in a type-safe way at least).
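
The congruence claim can be checked with a toy sketch. Because std's `()` has no `Add` implementation, this assumes a stand-in type `Unit` with a trivial `Add`; both `Unit` and `forget` are hypothetical names introduced only for this example.

```rust
use std::ops::Add;

/// Hypothetical stand-in for `()`, since `()` itself does not implement `Add`.
#[derive(Debug, PartialEq)]
struct Unit;

impl Add for Unit {
    type Output = Unit;

    /// Adding two values that carry no information yields the same single value.
    fn add(self, _other: Unit) -> Unit {
        Unit
    }
}

/// Forgets everything about the argument.
fn forget<T>(_value: T) -> Unit {
    Unit
}
```

With these definitions, `forget(2 + 3)` and `forget(2) + forget(3)` are equal, so it does not matter whether the conversion happens before or after the addition.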

It certainly doesn't hold for a singleton type, even if you lawyer it into "top" by using a different definition.

From the Wikipedia article on the top type:

In languages with a structural type system, the top type is the empty structure.

i.e. (). If you strictly enforce type-safety and don't have downcasting (which is unsound), then this is equivalent to Variant because a Variant can never be used type-safely and so effectively contains no information.

canndrew on 7 Apr 2017

👍3 👎1

@canndrew are you advocating @fstirlitz's proposal or just defending @ubsan's logic?

It doesn't appear to me that everyone agrees a value-less type (like /dev/null) or Singleton is a top type, but allowing that, the arguments (roughly) make sense.

But I end with the same point as my last post: I am not so happy allowing an implicit conversion from Option<!> to () as a special case, though I suppose it could work (it is however rather awkward and a surprising thing for new users to learn).

dhardy on 7 Apr 2017

I replied in https://internals.rust-lang.org/t/on-type-systems-and-nature-of-the-top-type/5053.
Let's move the conversation there, it doesn't belong here any more.

le-jzr on 7 Apr 2017

👍1

Per #1767 I'm closing this issue. We now support loops to evaluate to non-() values, but we've decided that none of the solutions to making for and while evaluate to other values have small enough downsides to implement. else confuses users, !break is a very surprising syntax, and making them evaluate to Option<T> is a breaking change.
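
For readers who have not used the supported form, here is a minimal sketch of a `loop` evaluating to a non-() value through `break value`. The function name is invented for illustration.

```rust
/// Returns the smallest power of two strictly greater than `limit`.
/// The whole `loop` is an expression whose value is whatever `break` carries.
/// (Illustrative only: very large limits would overflow `u32`.)
fn first_power_of_two_above(limit: u32) -> u32 {
    let mut n = 1;
    loop {
        if n > limit {
            break n;
        }
        n *= 2;
    }
}
```

The same `break n` inside a `for` or `while` body is rejected by the compiler, which is the gap this issue was about.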

We're open to revisiting this some day if conditions change a lot. For example, possibly now that break value is on stable, we'll find out we are frequently transforming for loops into loops to access this feature. Maybe if we are close to finalizing a generator/coroutine proposal, the calculus on this will change.
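
The kind of transformation mentioned here might look like the following sketch, where a `for` over a slice is rewritten as a `loop` driving the iterator by hand so that `break` can carry a value. The function `first_even` is a made-up example, not code from the thread.

```rust
/// Finds the first even number, written as a `loop` instead of a `for`
/// so that every exit path can supply the loop's value.
fn first_even(items: &[i32]) -> Option<i32> {
    let mut iter = items.iter();
    loop {
        match iter.next() {
            // Found a match: exit early with the value.
            Some(&x) if x % 2 == 0 => break Some(x),
            // Not a match: keep going.
            Some(_) => continue,
            // Iterator exhausted: this is the "loop ended normally" case.
            None => break None,
        }
    }
}
```

In real code `items.iter().copied().find(|x| x % 2 == 0)` would do the same job; the manual version is shown only to make the rewrite visible.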

withoutboats on 8 Apr 2017

👍11 👎3 ❤1

In case this ever gets revisited, how about combining @glaebhoerl's idea of moving the break into the block and using the 'final' keyword as proposed by @canndrew:

... } final { break value }

I would find the meaning obvious enough reading this code even if I was not familiar with the feature (something I can't say about python's for/else).
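
The `final` syntax above is only a proposal and does not compile. As a rough approximation in today's Rust, assuming a toolchain with labeled block expressions (stable since Rust 1.65, as far as I know), the same shape can be sketched like this; `first_negative_or` is a hypothetical name.

```rust
/// Returns the first negative item, or `fallback` if the loop runs to completion.
fn first_negative_or(items: &[i32], fallback: i32) -> i32 {
    'search: {
        for &x in items {
            if x < 0 {
                // Early exit: the labeled block evaluates to `x`.
                break 'search x;
            }
        }
        // Reached only when the `for` was not exited by `break`,
        // which is the role the proposed `final` block would play.
        fallback
    }
}
```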

orent on 20 Apr 2017

👍12

In case this ever gets revisited, how about combining @glaebhoerl's idea of moving the break into the block and using the 'final' keyword as proposed by @canndrew:
... } final { break value }
I would find the meaning obvious enough reading this code even if I was not familiar with the feature (something I can't say about python's for/else).

I would suggest then instead of final, since in all currently popular languages where it exists, final(ly) means the exact opposite of being executed only when the loop was not break-ed out of: it means being executed regardless. then would also avoid the sort of naming tragedy that return became in the Haskell community.

then also avoids the semantic confusion brought by else, since it naturally has a sequential meaning (I eat, then I walk) in parallel with its role in the conditional combination (if/then). In places where it joins two blocks ({ ... } then { ... }) instead of a boolean and a block (x<y then { ... }), the sequential semantics prevails intuitively.

exprosic on 8 Feb 2020

👍1

For people not having enough time to read through the whole set of comments here, summary:

"workaround" exists: https://github.com/rust-lang/rfcs/issues/961#issuecomment-264699920
Issue of for/while returning Option<T> instead is code like: https://github.com/rust-lang/rfcs/issues/961#issuecomment-220613169
Issue of else clause being confusing https://github.com/rust-lang/rfcs/issues/961#issuecomment-250100291
Lang team meeting decision of too many downsides in all approaches so far (in forked issue): https://github.com/rust-lang/rfcs/issues/1767#issuecomment-292678002

I wonder if the backward-compatibility issues are no longer such an issue now that we have the "edition" feature and the associated cargo fix --edition to update old code to behave correctly even in the newer edition.

xkr47 on 15 Jun 2020

❤1

I'll suggest coda as a name, maybe?

haltman-at on 30 Sep 2020
