Crystal: [RFC] Exhaustive case

Created on 27 Jul 2019  Â·  43Comments  Â·  Source: crystal-lang/crystal

Related: #4837, #4870

Proposal

It would be great to have a construct like case but that would check that all types in the given expression are considered. It would also work for enums, checking that all their members are considered.

Why not make case work like that?

We _could_ try to make the existing case work like that. The problem with that approach are:

  1. it will cause a massive breaking change
  2. it's not easy to implement because: a case without an else is assumed to have a separate branch with nil
  3. you don't always want to consider all cases

Point 2 means that this:

case exp
when Int32
  a = 1
when String
  a = 2
end

gets rewritten to:

if exp.is_a?(Int32)
  a = 1
elsif exp.is_a?(String)
  a = 2
else
  nil
end

which means that a will end up being nil on the else case, and it's really hard to make the compiler rewrite that to something else that, depending on whether exhaustiveness is covered or not, takes the form of nil or NoReturn... because this check is done as a last thing in the semantic pass (the type of exp might change while we are typing the method).

It also happens that sometimes you want to do something in a few cases but you don't care about actually covering all cases.

So, I propose that we add a new construct for this. It will be similar to case except that you can only put types when checking against types, or .call? calls (or symbols?) when checking against enums.

What keyword do we use for this?

For a name we can use case!, which is backwards compatible but maybe a bit weird, or match, which might read better but is not backwards compatible and it's a bit annoying because it means you can't use the name "match" as a variable anymore, and I think that's pretty common when doing some regex stuff.

So assuming we go with case!, this:

case! exp
when Int32
  a = 1
when String
  a = 2
end

would get rewritten to:

if exp.is_a?(Int32)
 a = 1
elsif exp.is_a?(String)
  a = 2
else
  raise "unreachable!"
end

# here a will be of type Int32

with a final check to see whether all the given types cover all types in exp.

For enums

For an enum:

enum Color
  Red
  Green
  Blue
end

case! exp
when :red
  a = 1
when :green
  a = 2
when :blue
  a = 3
end

would get rewritten to something like this:

if exp.is?(:red)
  a = 1
elsif exp.is?(:green)
  a = 2
elsif exp.is?(:blue)
  a = 3
else
  # and because you can have an enum with a value outside of the range this
  # is something that might actually happen, but it's super rare (in unsafe code)
  raise "Unknown value: #{exp}"
end

I'm using is? and not == or === because we could define is? with a single overload against that same enum and because of automatic cast it will work (it doesn't work with == and === because you can compare enum to symbol).

(In fact I'm thinking we can add is? right now and eventually remove those methods generated by enums: this means you can easily compare enums against values using symbols, and compilation is simpler and faster because we don't need to generate all these methods, plus it declutters the enum docs... this was extracted to #8000)

Then you can't mix types and symbols.

else?

Finally I'm thinking that case! will not allow an else branch so you are forced to deal with all types explicitly and not guessing what remains in else.

Do we need this?

Right now there's no way to achieve this with the existing language. Well, we can do something that's close but isn't quite the same.

Let's say we have:

exp = rand < 0.5 ? 1 : "foo"

then we can do:

if exp.is_a?(Int32)
  # exp is Int32
else
  # compiler know (and you too, maybe?) exp is String
end

The problem with that is that it's not clear for a human that in the else branch exp is a String. With case! and not being able to use else it becomes much clearer:

case! exp
when Int32
  # ...
when String
  # ...
end

plus it's clear that exp can't be of any other type.

And then, if for some reason exp holds another type you will get a clear compile error, while with if/case you might get a compilation error depending on how exp is used, but it's less clear.

The same applies to enum: we can case over all enum members except the last one and in the else assume it's the last one. But it's not clear. And if we do list the last one we have to do an ugly raise "unknown value" or something like that.

feature draft

Most helpful comment

I really don't want to introduce a keyword with symbols in. It would be a departure from all existing keywords.

I think we need to keep in mind that if we introduce an exhaustive case version, that it will naturally end up in style guides as "don't use case, use match". It will become the default, and case will become advised against unless you need an else. People will ask why we have two keywords for the same thing! We will only be able to answer "because it's a breaking change". But we made it before 1.0 so people will be confused!

So in that case, I'd rather retrofit case with a compile-time warning and migrate to case being exhaustive unless you use else. This is also how most other languages do it, and it feels more natural to me. The only downside being that it's a breaking change. But we still have huge breaking changes on the horizon of the language. Most of all parallelism, which will be a huge change. So I think this recent trend of moving slowly and not breaking things is a bit premature.

I really don't like the "feeling" of assymmetricality from not allowing else either.

That being said, I'd welcome an exhaustive case of any form, if we cannot agree on this change.

All 43 comments

If this is e.g. an enum, could the compiler check to make sure all cases
are covered?

On Fri, Jul 26, 2019, 5:10 PM Ary Borenszweig notifications@github.com
wrote:

Related: #4837 https://github.com/crystal-lang/crystal/pull/4837, #4870
https://github.com/crystal-lang/crystal/pull/4870
Proposal

It would be great to have a construct like case but that would check that
all types in the given expression are considered. It would also work for
enums, checking that all their members are considered.
Why not make case work like that?

We could try to make the existing case work like that. The problem with
that approach are:

  1. it will cause a massive breaking change
  2. it's not easy to implement because: a case without an else is
    assumed to have a separate branch with nil
  3. you don't always want to consider all cases

Point 2 means that this:

case expwhen Int32
a = 1when String
a = 2end

gets rewritten to:

if exp.is_a?(Int32)
a = 1elsif exp.is_a?(String)
a = 2else
nilend

which means that a will end up being nil on the else case, and it's
really hard to make the compiler rewrite that to something else that,
depending on whether exhaustiveness is covered or not, takes the form of
nil or NoReturn... because this check is done as a last thing in the
semantic pass (the type of exp might change while we are typing the
method).

It also happens that sometimes you want to do something in a few cases but
you don't care about actually covering all cases.

So, I propose that we add a new construct for this. It will be similar to
case except that you can only put types when checking against types, or
.call? calls (or symbols?) when checking against enums.
What keyword do we use for this?

For a name we can use case!, which is backwards compatible but maybe a
bit weird, or match, which might read better but is not backwards
compatible and it's a bit annoying because it means you can't use the name
"match" as a variable anymore, and I think that's pretty common when doing
some regex stuff.

So assuming we go with case!, this:

case! expwhen Int32
a = 1when String
a = 2end

would get rewritten to:

if exp.is_a?(Int32)
a = 1elsif exp.is_a?(String)
a = 2else
raise "unreachable!"end

here a will be of type Int32

with a final check to see whether all the given types cover all types in
exp.
For enums

For an enum:

enum Color
Red
Green
Blueend

case! expwhen :red
a = 1when :green
a = 2when :blue
a = 3end

would get rewritten to something like this:

if exp.is?(:red)
a = 1elsif exp.is?(:green)
a = 2elsif exp.is?(:blue)
a = 3else
# and because you can have an enum with a value outside of the range this
# is something that might actually happen, but it's super rare (in unsafe code)
raise "Unknown value: #{exp}"end

I'm using is? and not == or === because we could define is? with a single
overload against that same enum and because of automatic cast it will work
(it doesn't work with == and === because you can compare enum to symbol).

(In fact I'm thinking we can add is? right now and eventually remove
those methods generated by enums: this means you can easily compare enums
against values using symbols, and compilation is simpler and faster because
we don't need to generate all these methods, plus it declutters the enum
docs... this was extracted to #8000
https://github.com/crystal-lang/crystal/issues/8000)

Then you can't mix types and symbols.
else?

Finally I'm thinking that case! will not allow an else branch so you are
forced to deal with all types explicitly and not guessing what remains in
else.
Do we need this?

Right now there's no way to achieve this with the existing language. Well,
we can do something that's close but isn't quite the same.

Let's say we have:

exp = rand < 0.5 ? 1 : "foo"

then we can do:

if exp.is_a?(Int32)
# exp is Int32else
# compiler know (and you too, maybe?) exp is Stringend

The problem with that is that it's not clear for a human that in the else
branch exp is a String. With case! and not being able to use else it
becomes much clearer:

case! expwhen Int32
# ...when String
# ...end

plus it's clear that exp can't be of any other type.

And then, if for some reason exp holds another type you will get a clear
compile error, while with if/case you might get a compilation error
depending on how exp is used, but it's less clear.

The same applies to enum: we can case over all enum members except the
last one and in the else assume it's the last one. But it's not clear.
And if we do list the last one we have to do an ugly raise "unknown value"
or something like that.

—
You are receiving this because you are subscribed to this thread.
Reply to this email directly, view it on GitHub
https://github.com/crystal-lang/crystal/issues/8001?email_source=notifications&email_token=AAM4YSLMGWJDCSHLLRVERR3QBNY3ZA5CNFSM4IHHQMH2YY3PNVWWK3TUL52HS4DFUVEXG43VMWVGG33NNVSW45C7NFSM4HBZ3JMQ,
or mute the thread
https://github.com/notifications/unsubscribe-auth/AAM4YSI72UZKUAS67JGVDB3QBNY3ZANCNFSM4IHHQMHQ
.

Yes. Apart from. the rewritten code I shown, in all cases the compiler will check that all types or all enum members are covered.

Will exhaustive case allow the else branch for values not addressed in all when branches?

No, I mentioned it.

Is it because it is something hard/not possible to implement? It may be convenient sometimes. If I understand it correctly Swift works this way.

If you use else and add a new case you might not realize you need to change something. With an else the compiler has your back.

I mean, if you want to use else just use the regular case. What's the difference then?

Yeah, I guess the difference is not that big.

The regular case doesn't check if all cases are covered. It doesn't care if I add else or not, it has implicit else nil branch.

The exhaustive one will complain until I either cover every possible case in when branches or add explicit else to cover the rest of them.

I wonder... do you think you would've made case exhaustive by default if you were designing Crystal today (no code to worry you'll break)?

  1. it will cause a massive breaking change

That won't be an issue.

  1. it's not easy to implement because: a case without an else is assumed to have a separate branch with nil

I assume it's really hard to implement and might be not worth an effort. Is this situation different in other languages?

  1. you don't always want to consider all cases

You just use explicit else in that case. Nothing wrong with some explicitness I guess. It's like doing this. The intent here is very clear versus implicit else in normal case when you have to wonder if that was an original intent or some when branch is missing.

  # some comment about why we do nothing here

I mean it kind of makes sense for Crystal to be more type safe, more strict, more explicit compared to ruby. Having to use explicit else when not all cases are covered is not a big deal if you ask me. It may be even a good thing.

I really don't want to introduce a keyword with symbols in. It would be a departure from all existing keywords.

I think we need to keep in mind that if we introduce an exhaustive case version, that it will naturally end up in style guides as "don't use case, use match". It will become the default, and case will become advised against unless you need an else. People will ask why we have two keywords for the same thing! We will only be able to answer "because it's a breaking change". But we made it before 1.0 so people will be confused!

So in that case, I'd rather retrofit case with a compile-time warning and migrate to case being exhaustive unless you use else. This is also how most other languages do it, and it feels more natural to me. The only downside being that it's a breaking change. But we still have huge breaking changes on the horizon of the language. Most of all parallelism, which will be a huge change. So I think this recent trend of moving slowly and not breaking things is a bit premature.

I really don't like the "feeling" of assymmetricality from not allowing else either.

That being said, I'd welcome an exhaustive case of any form, if we cannot agree on this change.

I also think having a breaking change is acceptable.

Requiring an empty dangling else in order to cover only a few cases is IMO also better than having two different keywords with almost identical behaviour, which won't be easy to tell apart. Using else explains itself.

We already tried making else required, it was really tedious and boring. You don't always want to match every case...

@rmarronnier Please ask this question on the forum: https://forum.crystal-lang.org Thank you. :heart:

@asterite it's not tedious or boring for new code, just adding it to existing code.

If it was always there, I doubt there would be much complaint about it.

I think the case! brought by @asterite idea was good, but for the opposite effect.

  • case is exhaustive, "safe" by default
  • if we want less safety, add an exclamation mark to use case!, which is the current case.

Using ! is on par with the other parts of the stdlib, to warn of possible unwanted effects, for example:

I'd be fine with case and case! being non-exhaustive. They should both support else though.

Looking at https://github.com/crystal-lang/crystal/pull/4837/files there doesn't seem to be many cases where we had to add an else clause. That said, it looks a bit strange seeing those # nothing, # ignore, etc. comments. But maybe it's better to actually enforce else to give you a chance to think about the cases you are missing, I don't know.

My main with changing the semantic of case is:

  • it's unlike Ruby
  • you don't always want to cover all cases so it's tedious to write an empty else clause
  • it will cause a massive breaking change

But... I think it might be a good idea to make the language more strict and safe by default.

@asterite I think just

case foo
when "bar"
  # ...
else nil
end

is the best-looking version. Although you can skip the nil as well.

To play the other side of the story:

  • it's unlike Ruby

Where we diverge from ruby, it's usually for strictness, this fits that.

  • you don't always want to cover all cases so it's tedious to write an empty else clause

It's actually not that common, and empty else clauses are super easy to type.

  • it will cause a massive breaking change

Better now than never. We can introduce it as a warning first.

One thing that's interesting to me is whether case over types which can't be exhaustively matched need else nil or if the exhaustiveness behaviour of case depends on the type of the condition.

I'm actually fine merging an implementation which compromises on this. We need some kind of exhaustiveness checking in the language, even if it has caveats.

interesting
https://kotlinlang.org/docs/reference/control-flow.html

If when is used as an expression, the else branch is mandatory, unless the compiler can prove that all possible cases are covered with branch conditions (as, for example, with enum class entries and sealed class subtypes).

Elixir-lang also has when keyword, which requires an else branch to be present.

@vlazar @straight-shoota @Sija Thank you!

Now I think we should have a mandatory else, otherwise the compiler has to prove that all possibilities are covered.

Many comments were about finding out where this catches bugs. I don't think this is related to bugs, it's related to code clarity and getting better compiler errors (my first comment talks about this in the section "Do we need this?").

Some day I'll talk with @bcardiff and @waj about this and we'll reach consensus.

I still think the same as in https://github.com/crystal-lang/crystal/pull/4837#issuecomment-394105269 (that was already referenced by @straight-shoota here).

I like the idea of exhaustive checks by default, but the semantics of the control flow needs to be the same, always. In that case, the default else nil would need to go.

That will mean that the user will need to write else nil or else raise "should-not-happen". The else nil is not terrible and the else raise "should-not-happen" is probably a design smell.

So far I am in favor, but I think this is something that needs to go one level deeper in the language.

If we were to have an unreachable keyword that would cause a compile-time error. Then, the default else would be else unreachable. And the exhaustive check is reframed in a code transformation to remove when/else nodes from a case. Or since the case is a rewrite of if statements, the analysis could be done as identifying unneeded branches. Then, the user will also be able to use unreachable in if statements and others. The location of the implicit unreachable is the location of the case and the error reporting will be able to say something about it.

I know the API would not change whether it is done one way or another but seems more like a cornerstone if we base it on unreachable.

I am also not worried about doing breaking changes. BUT I want to have a smooth migration process. It's like an exercise. Unless we don't do it now we won't learn how to do it properly when it matters.

@bcardiff So if I understand what you are saying, if I have this:

case exp
when Int32
  # ...
when String
  # ...
end

if I want to make it clear to the compiler that I'm covering all cases I will need to write:

case exp
when Int32
  # ...
when String
  # ...
else unreachable
end

?

What about telling the compiler I don't want to cover all types? Is that the original code (without an else branch)? If the answer is "yes" then there's an implicit "else nil" around which is what I basically is problematic, because an invisible nil will crawl to other parts of the code and cause a compile error that's hard to understand.

Also, right now if we want to make sure, somehow, that we cover all types we do:

case exp
when Int32
  # ...
when String
  # ...
else raise "Bug: unknown type: #{exp}"
end

So with unreachable it's still like writing that raise but replacing it with a new keyword, except that the compiler will check that you cover all cases.

So my thoughts about unreachable are:

  • It's not clear how to say you don't want to make the compiler check for cover (do you omit the else clause? do you have to write else nil?)
  • It adds a new keyword and concept to the language, and we'll have to revise the entire language to see how and where unreachable is used and how to detect and say errors regarding it
  • It feels a bit low-level to me, like you are instructing the compiler to check for something instead of the compiler doing it automatically for you unless you tell it not to (using else nil)
  • The whole point of adding automatic coverage check is to avoid having to write else raise "shouldn't happen" but with this it seems we still need to write code for it: else unreachable which sounds even more bothersome than adding else nil in the other case

The alternative of making case cover all types by default, otherwise you need to add else nil has these advantages:

  • no new keyword or concept
  • scoped only to case
  • similar to many languages (Rust, Kotlin)

As for moving forward without breaking changes immediately, I can imagine the compiler can always check for coverage and emit a warning when that doesn't happen, suggesting to add an explicit else nil because in future versions the code will stop compiling.

I meant for

case exp
when Int32
  # ...
when String
  # ...
end

to expand to

case exp
when Int32
  # ...
when String
  # ...
else
  unreachable
end

If the programmer wants to avoid the exhaustive check an else nil / else raise should be used explicitly.

The case is then translated to if.
The branch removal is needed only if there is reachable unreachable (isn't it nice?).
The type restriction logic is already there for if statements as pointed in the "do we need this" section.

@bcardiff So for case expressions, unreachable would only be an internal implementation detail?

Why do we need this as a keyword then?
You said:

Then, the user will also be able to use unreachable in if statements and others.

But is there really a use case for this? And should this be treated together with this exhaustive case discussion? I mean, we could implement exhaustive case by default without another keyword. The keyword would be just on top for other use cases. But I think these use cases (and the discussion about them) can be delayed, and thus the keyword.

Oh, if unreachable is something that can be used to implement exhaustive case, but it's not necessary to implement it, we can always defer that for later and implement exhaustive case without it (there's already a PR for that, we would need to rebase it against master and tweak it a bit, but it was working).

And should this be treated together with this exhaustive case discussion?

Same thoughts here. I understand that unreachable can be useful in other cases other than case (no pun intended :-P), but if it's not necessary to implement it then it can be dealt separately.

My main concern is about case stop being a syntax sugar and increasing the minimal/core crystal language. Its semantics won't be able to be expressed with other constructs.

The case is not a simple construct with all the syntax variations that are supported.

Its semantics won't be able to be expressed with other constructs.

But I think this is what we are looking for here. We don't want case to be a simple syntax sugar anymore, we want it to provide more guarantees in our code. And I think it's fine if it can't be expressed with other constructs. In Rust match can't be expressed with other constructs. Same thing with kotlin's switch where you have to make sure all cases are covered, otherwise you have to put an else.

But unreachable would lead to the same behavior in a more elemental way. Allowing the case to still be written with other constructs. If, type filter, unreachable plus code elimination.

I worried that it might be an issue in the future if we increase the core aspects of the language. Both for theoretical and practical issues.

But unreachable would lead to the same behavior in a more elemental way

Well, we could also introduce goto and make if and while be rewritten to that... but I think goto is too low-level and I consider unreachable the same way, specially because I can't come up with other use cases for unreachable other than case.

For example unreachable exists as a macro in Rust but it's just a way to say raise "can't happen", the compiler won't try to prove that the code is really unreachable. I can't think how the compiler will try to figure out whether some code is unreachable based on arbitrary conditions and not restricted ones (covering all types or all enum members).

I do respect the opinion that it would be great if more and more constructs can be expressed as lower-level ones (we already do that with string interpolation, until, unless), I just don't think that we should extend the language with a new construct just to be able to implement another construct, where we could do it without new things.

If I were you guys, I'll implement the unreachable version, as it would allow to turn any if into a coverage check. Then you plug in an SMT solver for anything that is not as easy as an enum...

I'm mostly joking, but it occurred to me that unreachable can be nice for writing things like:

if 0 < i < n then a = 0 elsif i <= 0 then a = 1 elsif i >= n then a = 2 else unreachable end

The compiler would then check that indeed that else is unreachable.

This said, for now I would go for the most simple option that gets the job done, and then change it as needed. I guess that means having case be part of the compiler.

Not bad the example of an SMT solver!

@asterite my argument was about not getting the language much bigger, the goto/if is about shrinking it. It's not the same.

But well, let's focus on the nice side, we seem to agree that:

  • the implicit else nil should go away
  • a case without else will trigger and exhaustive check over the value
  • if the compiler is unable to perform a check (case str; when "foo"; ... end) it will request that an explicit else is needed
  • if the exhaustive check reports a missing case it will generate a compile-time error.

Am I summing thing right?

While at it, why not thinking of incorporating Rust's like enums? (that is, Algebraic Data Types)

Am I summing thing right?

Yes!

While at it, why not thinking of incorporating Rust's like enums? (that is, Algebraic Data Types)

Hola Beta!

Well, we can already have something like that:

record Point2, x : Int32, y : Int32
record Point3, x : Int32, y : Int32, z : Int32

alias Point = Point2 | Point3

And with the proposed changed you will be able to do:

case point
when Point2
  # ...
when Point3
  # . ...
end

and have the compiler yell at you if you add another type to the alias.

Then we can think about a macro that makes it easier to define such things, but I don't know if that's necessary.

Hola! :-)

But you can't write recursive records, right? Or how would you specify a binary tree?

Well, you can have recursive aliases but aliases can't yet be generics, and then that kind of programming doesn't suit Crystal well or it's not common. After all Crystal is OOP. For some cases using algebraic data types is simpler than using inheritance. But I wouldn't use it all the time and for everything. That's why I'm also not sure about adding a macro for it. A binary tree can be implemented with a class just fine.

I'll take the discussion elsewhere (but for the record, Rust disagree :-) )

@asterite @beta-ziliani a middle ground is to emulate Scala's "sealed classes" (also in Kotlin), where you have an abstract class and a bunch of subclasses, but once you close the file there can be no other subclass. That would give case on this class a simple way to check/enforce exhaustiveness, so you can drop the else when it's exhaustive.

This can be a clean and convenient alternative to unions such as the one in Json::Any (just use sealed struct).

This can probably be closed now that #8424 is merged?

Although #8424 says it is a Part of: #8001. Does that mean some part of #8001 is not implemented by #8424?

It's because it's only a warning for now. The idea is that it's eventually an error. I'll do that for 0.35.0

Was this page helpful?
0 / 5 - 0 ratings