This is a proposal to add an enum construct to Scala's syntax. The construct is intended to serve at the same time as a native implementation of enumerations as found in other languages and as a more concise notation for ADTs and GADTs. The proposal affects the Scala definition and its compiler in the following ways:
It adds a new reserved word, `enum`, together with desugaring rules that map enum definitions onto existing constructs, a new trait scala.Enum, and a predefined runtime class scala.runtime.EnumValues. This is all that's needed. After desugaring, the resulting programs are expressible as normal Scala code.
Enums are essentially syntactic sugar, so one should ask whether they are necessary at all. Here are some issues that the proposal addresses:
Enumerations as a lightweight type with a finite number of user-defined elements are not very well supported in Scala. Using integers for this task is tedious and loses type safety. Using case objects is less efficient and gets verbose as the number of values grows. The existing library-based approach, Scala's Enumeration object, has been criticized for being hard to use and for lack of interoperability with host-language enumerations. Alternative approaches, such as Enumeratum, fix some of these issues but have their own tradeoffs.
The standard approach to model an ADT uses a sealed base class with final case classes and objects as children. This works well, but is more verbose than specialized syntactic constructs.
The standard approach keeps the children of ADTs as separate types. For instance, Some(x) has type Some[T], not Option[T]. This gives finer type distinctions but can also confuse type inference. Obtaining the standard ADT behavior is possible, but very tricky. Essentially, one has to make the case class abstract and implement the apply method in the companion object by hand.
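The trick can be sketched as follows; the names Opt, Sum, and Non are illustrative stand-ins, not part of the proposal. Making the case class abstract suppresses the synthesized apply (and copy), so a hand-written apply can return the parent type:

```scala
// Illustrative sketch of the "abstract case class" trick described above.
// Opt/Sum/Non are hypothetical names standing in for Option/Some/None.
sealed abstract class Opt[+T]
object Opt {
  // `abstract` suppresses the synthetic apply/copy, so we can write our own.
  abstract case class Sum[+T](x: T) extends Opt[T]
  object Sum {
    // Hand-written apply returns the parent type Opt[T], not Sum[T].
    def apply[T](x: T): Opt[T] = new Sum[T](x) {}
  }
  case object Non extends Opt[Nothing]
}

// Both branches now infer Opt[Int] without widening from Sum[Int]:
def wrap(x: Int): Opt[Int] = if (x > 0) Opt.Sum(x) else Opt.Non
```

Pattern matching still works, because unapply is still synthesized even for abstract case classes.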
Generic programming techniques need to know all the children types of an ADT or a GADT. Furthermore, this information has to be present during type-elaboration, when symbols are first completed. There is currently no robust way to do so. Even if the parent type is sealed, its compilation unit has to be analyzed completely to know its children. Such an analysis can potentially introduce cyclic references or it is not guaranteed to be exhaustive. It seems to be impossible to avoid both problems at the same time.
I think all of these are valid criticisms. In my personal opinion, when taken alone, neither of these criticisms is strong enough to warrant introducing a new language feature. But taking them together could shift the balance.
We define a new kind of enum class. This is essentially a sealed class whose instances are given by _cases_ defined in its companion object. Cases can be simple or parameterized. Simple cases without any parameters map to values. Parameterized cases map to case classes. A shorthand form enum E { Cs } defines both an enum class E and a companion object with cases Cs.
Here's a simple enumeration:
enum Color {
  case Red
  case Green
  case Blue
}
or, even shorter:
enum Color { case Red, Green, Blue }
Here's a simple ADT:
enum Option[T] {
  case Some[T](x: T)
  case None[T]()
}
Here's Option again, but expressed as a covariant GADT, where None is a value that extends Option[Nothing].
enum Option[+T] {
  case Some[T](x: T)
  case None
}
It is also possible to add fields or methods to an enum class or its companion object, but in this case we need to split the `enum' into a class and an object to make clear what goes where:
enum class Option[+T] extends Serializable {
  def isDefined: Boolean
}
object Option {
  def apply[T](x: T) = if (x != null) Some(x) else None
  case Some[+T](x: T) {
    def isDefined = true
  }
  case None {
    def isDefined = false
  }
}
The canonical Java "Planet" example (https://docs.oracle.com/javase/tutorial/java/javaOO/enum.html) can be expressed
as follows:
enum class Planet(mass: Double, radius: Double) {
  private final val G = 6.67300E-11
  def surfaceGravity = G * mass / (radius * radius)
  def surfaceWeight(otherMass: Double) = otherMass * surfaceGravity
}
object Planet {
  case MERCURY extends Planet(3.303e+23, 2.4397e6)
  case VENUS   extends Planet(4.869e+24, 6.0518e6)
  case EARTH   extends Planet(5.976e+24, 6.37814e6)
  case MARS    extends Planet(6.421e+23, 3.3972e6)
  case JUPITER extends Planet(1.9e+27,   7.1492e7)
  case SATURN  extends Planet(5.688e+26, 6.0268e7)
  case URANUS  extends Planet(8.686e+25, 2.5559e7)
  case NEPTUNE extends Planet(1.024e+26, 2.4746e7)

  def main(args: Array[String]) = {
    val earthWeight = args(0).toDouble
    val mass = earthWeight / EARTH.surfaceGravity
    for (p <- enumValues)
      println(s"Your weight on $p is ${p.surfaceWeight(mass)}")
  }
}
Changes to the syntax fall in two categories: enum classes and cases inside enums.
The changes are specified below as deltas with respect to the Scala syntax given here
Enum definitions and enum classes are defined as follows:
TmplDef ::= `enum' `class' ClassDef
          | `enum' EnumDef
EnumDef ::= id ClassConstr [`extends' [ConstrApps]]
            [nl] `{' EnumCaseStat {semi EnumCaseStat} `}'
Cases of enums are defined as follows:
EnumCaseStat ::= {Annotation [nl]} {Modifier} EnumCase
EnumCase ::= `case' (EnumClassDef | ObjectDef | ids)
EnumClassDef ::= id [ClsTpeParamClause | ClsParamClause]
ClsParamClauses TemplateOpt
TemplateStat ::= ... | EnumCaseStat
Enum classes and cases expand via syntactic desugarings to code that can be expressed in existing Scala. First, some terminology and notational conventions:
We use E as the name of an enum class and C as the name of an enum case that appears in the companion object of E. We use <...> for syntactic constructs that in some circumstances might be empty. For instance, <body> represents either the body of a case between {...} or nothing at all.
Enum cases fall into three categories:

- _Simple cases_: cases consisting of just an identifier, without type or value parameters, an extends clause, or a body.
- _Value cases_: cases without type or value parameters, but with an explicit extends clause or a body.
- _Class cases_: cases with type parameters [...] or with one or more (possibly empty) parameter sections (...).

Simple cases and value cases are called collectively _singleton cases_.
The desugaring rules imply that class cases are mapped to case classes, and singleton cases are mapped to val definitions.
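Using the proposal's own examples, the three categories look like this (proposed syntax, shown for orientation before the rules below):

```scala
enum Option[+T] {
  case Some[T](x: T)                 // class case: has a parameter section
  case None extends Option[Nothing]  // value case: explicit extends clause
}

enum Color {
  case Red, Green, Blue              // simple cases: identifiers only
}
```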
There are seven desugaring rules. Rules (1) and (2) desugar enums and enum classes. Rule (3) defines extends clauses for cases that are missing them. Rules (4) to (6) define how such expanded cases map into case classes, case objects, or vals. Finally, rule (7) expands comma-separated simple cases into a sequence of cases.
An enum definition
enum E ... { <cases> }
expands to an enum class and a companion object
enum class E ...
object E { <cases> }
An enum class definition
enum class E ... extends <parents> ...
expands to a sealed abstract class that extends the scala.Enum trait:
sealed abstract class E ... extends <parents> with scala.Enum
If E is an enum class without type parameters, then a case in its companion object without an extends clause
case C <params> <body>
expands to
case C <params> <body> extends E
If E is an enum class with type parameters Ts, then a case in its companion object without an extends clause
case C <params> <body>
expands according to two alternatives, depending on whether C has type parameters or not. If C has type parameters, they must have the same names and appear in the same order as the enum type parameters Ts (variances may be different, however). In this case
case C [Ts] <params> <body>
expands to
case C[Ts] <params> extends E[Ts] <body>
For the case where C does not have type parameters, assume E's type parameters are
V1 T1 >: L1 <: U1 , ... , Vn Tn >: Ln <: Un    (n > 0)
where each of the variances Vi is either '+' or '-'. Then the case expands to
case C <params> extends E[B1, ..., Bn] <body>
where Bi is Li if Vi = '+' and Ui if Vi = '-'. It is an error if Bi refers to some other type parameter Tj (j = 1,..,n). It is also an error if E has type parameters that are non-variant.
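To illustrate the contravariant branch (the '-' case picks the upper bound), here is a hypothetical desugared result; the Show class and its case are made up for this sketch and are not part of the proposal:

```scala
// Hypothetical: roughly what a simple case of a contravariant enum class
// `enum Show[-T]` would desugar to. The case extends the instantiation at
// the upper bound (here Any), so it conforms to every Show[T].
sealed abstract class Show[-T] { def show(x: T): String }
object Show {
  // enumTag and $values.register are omitted for brevity.
  val Plain: Show[Any] = new Show[Any] { def show(x: Any): String = x.toString }
}

val intShow: Show[Int] = Show.Plain // contravariance: Show[Any] <: Show[Int]
```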
A class case
case C <params> ...
expands analogous to a case class:
final case class C <params> ...
However, unlike for a regular case class, the return type of the associated apply method is a fully parameterized type instance of the enum class E itself instead of C. Also the enum case defines an enumTag method of the form
def enumTag = n
where n is the ordinal number of the case in the companion object, starting from 0.
A value case
case C extends <parents> <body>
expands to a value definition
val C = new <parents> { <body>; def enumTag = n; $values.register(this) }
where n is the ordinal number of the case in the companion object, starting from 0.
The statement $values.register(this) registers the value as one of the enumValues of the
enumeration (see below). $values is a compiler-defined private value in
the companion object.
A simple case
case C
of an enum class E that does not take type parameters expands to
val C = $new(n, "C")
Here, $new is a private method that creates an instance of E (see below).
A simple case consisting of a comma-separated list of enum names
case C_1, ..., C_n
expands to
case C_1; ...; case C_n
Any modifiers or annotations on the original case extend to all expanded cases.
Non-generic enum classes E that define one or more singleton cases are called _enumerations_. Companion objects of enumerations define the following additional members.
- enumValue of type scala.collection.immutable.Map[Int, E]. enumValue(n) returns the singleton case value with ordinal number n.
- enumValueNamed of type scala.collection.immutable.Map[String, E]. enumValueNamed(s) returns the singleton case value whose toString representation is s.
- enumValues, which returns an Iterable[E] of all singleton case values in E, in the order of their definitions.

Companion objects that contain at least one simple case define in addition:
A private method $new which defines a new simple case value with given ordinal number and name. This method can be thought of as being defined as follows.
def $new(tag: Int, name: String): E = new E {
  def enumTag = tag
  override def toString = name
  $values.register(this) // register enum value so that `valueOf` and `values` can return it
}
The Color enumeration
enum Color {
  case Red, Green, Blue
}
expands to
sealed abstract class Color extends scala.Enum
object Color {
  private val $values = new scala.runtime.EnumValues[Color]
  def enumValue: Map[Int, Color] = $values.fromInt
  def enumValueNamed: Map[String, Color] = $values.fromName
  def enumValues: Iterable[Color] = $values.values
  private def $new(tag: Int, name: String): Color = new Color {
    def enumTag: Int = tag
    override def toString: String = name
    $values.register(this)
  }
  final val Red: Color = $new(0, "Red")
  final val Green: Color = $new(1, "Green")
  final val Blue: Color = $new(2, "Blue")
}
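To sanity-check this expansion, here is a self-contained, runnable rendering with minimal stand-ins for scala.Enum and scala.runtime.EnumValues (both are assumed here; the proposal references them without giving their definitions):

```scala
import scala.collection.mutable

// Minimal stand-ins for the two library additions the proposal assumes.
trait Enum { def enumTag: Int }

class EnumValues[E <: Enum] {
  private val buf = mutable.ArrayBuffer.empty[E]
  def register(v: E): Unit = buf += v
  def fromInt: Map[Int, E] = buf.map(v => v.enumTag -> v).toMap
  def fromName: Map[String, E] = buf.map(v => v.toString -> v).toMap
  def values: Iterable[E] = buf
}

// The expansion of `enum Color { case Red, Green, Blue }` from above.
sealed abstract class Color extends Enum
object Color {
  private val $values = new EnumValues[Color]
  def enumValue: Map[Int, Color] = $values.fromInt
  def enumValueNamed: Map[String, Color] = $values.fromName
  def enumValues: Iterable[Color] = $values.values
  private def $new(tag: Int, name: String): Color = new Color {
    def enumTag: Int = tag
    override def toString: String = name
    $values.register(this) // values register themselves at construction time
  }
  final val Red: Color = $new(0, "Red")
  final val Green: Color = $new(1, "Green")
  final val Blue: Color = $new(2, "Blue")
}
```

With this, Color.enumValue(1) yields Green and Color.enumValueNamed("Blue") yields Blue.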
The Option GADT
enum Option[+T] {
  case Some[+T](x: T)
  case None
}
expands to
sealed abstract class Option[+T] extends scala.Enum
object Option {
  final case class Some[+T](x: T) extends Option[T] {
    def enumTag = 0
  }
  object Some {
    def apply[T](x: T): Option[T] = new Some(x)
  }
  val None = new Option[Nothing] {
    def enumTag = 1
    override def toString = "None"
    $values.register(this)
  }
}
Note: We have added the apply method of the case class expansion because
its return type differs from the one generated for normal case classes.
An implementation of the proposal is in #1958.
On the Java platform, an enum class may extend java.lang.Enum. In that case, the enum as a whole is implemented as a Java enum. The compiler will enforce the necessary restrictions on the enum to make such an implementation possible. The precise mapping scheme and associated restrictions remain to be defined.
One advantage of the proposal is that it offers a reliable way to enumerate all cases of an enum class before any typechecking is done. This makes enums a good basis for generic programming. One could envisage compiler-generated hooks that map enums to their "shapes", i.e. typelevel sums of products. An example of what could be done is elaborated in a test in the dotty repo.
A very nice explanation of the new feature 👍
There seems to be an inconsistency between the desugaring Rule 5 and the following code example:
enum Option[+T] {
  case Some(x: T)
  case None extends Option[Nothing]
}
If I understand correctly, the desugaring Rule 5 says that for the case None, it is an error for Option to take type parameters.
A case without explicitly given type or value parameters but with an explicit extends clause or body
case C extends <parents> <body>
expands to a value definition
val C = new <parents> { <body>; def enumTag = n }
where n is the ordinal number of the case in the companion object, starting from 0. __It is an error in this case if the enum class E takes type parameters__.
Another minor question is, it seems the following code in the example expansions does not type check:
object Some extends T => Option[T] {
  def apply[T](x: T): Option[T] = new Some(x)
}
Do we need to remove the part extends T => Option[T]?
@liufengyun
If I understand correctly, the desugaring Rule 5 says that for the case None, it is an error for Option to take type parameters.
Well spotted. This clause should go to rule 6. I fixed it.
Another minor question is, it seems the following code in the example expansions does not type check
You are right. We need to drop the extends clause.
In the following introductory example:
~~~ scala
enum class Option[+T] extends Serializable {
  def isDefined: Boolean
}
object Option {
  def apply[T](x: T) = if (x != null) Some(x) else None
  case Some(x: T) {
    def isDefined = true
  }
  case None extends Option[Nothing] {
    def isDefined = false
  }
}
~~~
I find it a little bit confusing that in the case Some(x: T) definition the type parameter T is bound to the one defined in enum class Option[+T]. I think it is the first time that symbol binding crosses lexical scopes.
Also, how would that interact with additional type parameters?
~~~ scala
case Some[A](x: A)
~~~
Also, how would that interact with additional type parameters?
We have to disallow that.
Keeping type parameters undefined looks more like an artifact of desugaring and Dotty's type system than a feature to me. Are there any cases where this would actually be useful?
enum Option[T] {
  case Some(x: T)
  case None()
}
OTOH, covariant type parameters look very useful and are common in immutable data structures. Could this case be simplified?
enum Option[+T] {
  case Some(x: T)
  case None extends Option[Nothing]
}
How about automatically filling in unused type parameters in cases as their lower (covariant) or upper (contravariant) bounds and only leaving invariant type parameters undefined?
- It should be possible to model Java enumerations as Scala enumerations.

Instead of only exposing Java enums to Scala in this way, is there a well-defined subset of Scala enumerations that can be compiled to proper Java enums for the best efficiency and Java interop on the JVM?
I'm proposing a modification to the longer syntax:
enum class Option[+T] extends Serializable {
  def isDefined: Boolean
}
object Option {
  def apply[T](x: T) = if (x != null) Some(x) else None
  case Some[T](x: T) { // <-- changed
    def isDefined = true
  }
  case None extends Option[Nothing] {
    def isDefined = false
  }
}
In this case the T is obviously bound in the scope. It still desugars to the same thing, but I feel it's more regular, and it allows renaming the type argument:
enum class Option[+T] extends Serializable {
  def isDefined: Boolean
}
object Option {
  def apply[T](x: T) = if (x != null) Some(x) else None
  case Some[U](x: U) extends Option[U] { // <-- changed
    def isDefined = true
  }
  case None extends Option[Nothing] {
    def isDefined = false
  }
}
At the meeting, we also proposed an additional rule:
require that all extends clauses in case-s list the enum super-class.
This would make the code below invalid:
enum class Option[+T] extends Serializable {
  def isDefined: Boolean
}
object Option {
  def apply[T](x: T) = if (x != null) Some(x) else None
  case Some(x: Int) extends AnyRef { // <-- Not part of enum
    def isDefined = true
  }
  case None extends Option[Nothing] {
    def isDefined = false
  }
}
@DarkDimius I think this is still insufficient because it is still (a little bit) confusing that the T type parameter of case Some[T] is automatically applied to the T type parameter of the parent Option[+T] class.
Despite these inconveniences, I think that the shorter syntax is a huge benefit, so I find it acceptable to have just case Some(x: T) as a shorthand for case class Some[T](x: T) extends Option[T]. It is always possible to fallback to usual sealed traits and case classes for the cases we need more fine grained control (e.g. case class Flipped[A, B]() extends Parent[B, A] can not be expressed with case enums).
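For concreteness, the Flipped example in the plain sealed-trait encoding (Parent and its cases are hypothetical names): the extends clause permutes the parent's type parameters, something the short enum-case form without an extends clause has no syntax for.

```scala
// Hypothetical example: a case whose extends clause permutes the parent's
// type parameters. The short form without an extends clause would default
// to Parent[A, B]; expressing Parent[B, A] needs the explicit encoding.
sealed trait Parent[A, B]
final case class Flipped[A, B]() extends Parent[B, A]

// A Flipped[Int, String] is a Parent[String, Int]:
val p: Parent[String, Int] = Flipped[Int, String]()
```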
One more point discussed on the dotty meeting:
there should be an additional limitation that no other class can extend an abstract case class. Otherwise the super-class isn't a sum of its children, and serialization/patmat won't be able to enumerate all children.
It is always possible to fallback to usual sealed traits and case classes for the cases we need more fine grained control
Sealed classes give less guarantees. The point of this addition is that you _cannot_ get equivalent guarantees from sealed classes.
e.g.
case class Flipped[A, B]() extends Parent[B, A] can not be expressed with case enums.
Given the currently proposed rules it can be expressed; you simply need to write it explicitly using the longer version.
However, unlike for a regular case class, the return type of the associated apply and copy methods is a fully parameterized type instance of the enum class E itself instead of C
Am I understanding correctly that the following occurs?
enum IntWrapper {
  case W(i: Int)
  case N
}
val i = IntWrapper(1)
i match {
  case w: W =>
    w.copy(i = 2)
      .copy(i = 3) // this line won't compile because the previous copy returned an IntWrapper
  case N => ???
}
If so then it seems like copy should still return C
If so then it seems like copy should still return C
That's a good argument. I dropped copy from the description.
Instead of only exposing Java enums to Scala in this way, Is there a well-defined subset of Scala enumerations that can be compiled to proper Java enums for the best efficiency and Java interop on the JVM?
AFAICT, any enum containing only simple cases (i.e., without ()) can be compiled to Java enums, and exposed as an enum to Java for interop. This even includes enum classes with cases that redefine members.
For enumerations I would love to see a valueOf method of type String => E as well, to look values up by name as well as by ordinal.
i'd probably never use a naked Int=>E valueOf method for fear of exceptions; i'd very much prefer Int=>Option[E]. or maybe (if that's still a thing in dotty) something like paulp's structural pseudo-Option used in pattern matching.
oh, and if there is a String=>E (Option preferred, of course), then why not an E=>String, too?
This looks great!
I don't think the long form is an improvement, though. The case keyword is all you need to disambiguate the members of the ADT from other stuff.
enum Either[+L, +R] {
  def fold[Z](f: L => Z, g: R => Z): Z
  case Left(value: L) {
    def fold[Z](f: L => Z, g: Nothing => Z) = f(value)
  }
  case Right(value: R) {
    def fold[Z](f: Nothing => Z, g: R => Z) = g(value)
  }
}
I don't see any issues here. I agree with Stefan that generics should be handled automatically by default, and have the type parameter missing and filled in as Nothing if the type is not referenced. If you want something else, you can do it explicitly.
case Right[+L, +R](value: R) extends Either[L, R]
Currently this is looking great! I wrote Enumeratum and would be happy to see something like this baked into the language :)
Just a few thoughts/questions based on feedback I've received in the past:
- Make valueOf non-throwing (returns Option) by default. Slightly easier to reason about and might even be faster.
- A withName method might also be nice to have.
- Can users override enumTag for a given enum member, so that they can control the resolution of valueOf? If so, it might be nice to have the compiler check for uniqueness too :)

AFAICT, any enum containing only simple cases (i.e., without ()) can be compiled to Java enums, and exposed as an enum to Java for interop. This even includes enum classes with cases that redefine members.
Compiling to Java enums has some downsides:
Extending java.lang.Enum might not be desired (e.g. case Person(name: Name) would not be allowed because java.lang.Enum.name(): String is final, so the accessor method for name would clash). This suggests to me that we need an opt-in (or maybe an opt-out) annotation for this compilation strategy.
Java enums are exposed to the Scala typechecker as though they were constant value definitions:
scala> symbolOf[java.lang.annotation.RetentionPolicy].companionModule.info.decls.toList.take(3).map(_.initialize.defString).mkString("\n")
res21: String =
final val SOURCE: java.lang.annotation.RetentionPolicy(SOURCE)
final val CLASS: java.lang.annotation.RetentionPolicy(CLASS)
final val RUNTIME: java.lang.annotation.RetentionPolicy(RUNTIME)
scala> showRaw(symbolOf[java.lang.annotation.RetentionPolicy].companionModule.info.decls.toList.head.info.resultType)
res24: String = ConstantType(Constant(TermName("SOURCE")))
This is something of an implementation detail, but it is needed.
The enums from this proposal will need a similar approach, and I think that should be specced.
@ichoran The long form is intended to allow for adding fields and methods to an enum class or its companion object.
I played with various variants but found none that was clearer than what was eventually proposed. If one is worried about scoping of the type parameter, one _could_ specify that the long form is a single syntactic construct
enum <ident> <params> extends <parents> <body>
[object <ident> extends <parents> <body>]
and specify that any type parameters in <params> are visible in the whole construct. That would be an option.
@ritschwumm
I'd probably never use a naked Int=>E valueOf method for fear of exceptions; i'd very much prefer Int=>Option[E]. or maybe (if that's still a thing in dotty) something like paulp's structural pseudo-Option used in pattern matching.
What about making valueOf an immutable map?
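A quick sketch of why a Map covers both camps (the map below merely stands in for an expanded enumeration's enumValue member): Map#apply is the throwing lookup and Map#get the Option-returning one, so no separate variants are needed.

```scala
// Stand-in for the `enumValue: Map[Int, E]` member of an expanded
// enumeration; the string values are placeholders for enum case values.
val enumValue: Map[Int, String] = Map(0 -> "Red", 1 -> "Green", 2 -> "Blue")

val hit  = enumValue(1)     // Map#apply: "Green"; throws on unknown tags
val miss = enumValue.get(7) // Map#get: None; total (non-throwing) lookup
```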
@retronym Thanks for the analysis wrt Java enums. It seems like an opt-in is the best way to do it. How about we take inheritance from java.lang.Enum as our cue? I.e.
enum JavaColor extends java.lang.Enum {
  case Red
  case Green
  case Blue
}
Then there would be no surprise that we cannot redefine name because it is final in java.lang.Enum.
Also, can you suggest spec language for the constantness part?
A withName method might also be nice to have
I agree. This would also be required for most of the useful generic programming stuff we want to do (e.g. automatically generate serializers/deserializers for enumerations based on their name rather than their ordinal).
@szeiger I agree it would be nice if we could fill in extremal types of co/contravariant enum types, i.e. expand
case None
to
case None extends Option[Nothing]
But maybe it's too much magic? Have to think about it some more.
A withName method might also be nice to have
I agree. This would also be required for most of the useful generic programming stuff we want to do (e.g. automatically generate serializers/deserializers for enumerations based on their name rather than their ordinal).
Agreed. But that means we'd have to design that feature together with the generic programming stuff, because it would likely end up on the type level? Not sure about this point.
I pushed a new version where enumerations now define three public members: enumValue, enumValueNamed, and enumValues.
How does the result type of apply affect cases with type parameters that do not coincide with the superclass' type parameters?
Besides, the rationale for using the super class as the apply result type is that it will be more friendly to type inference. However, I fail to come up with a realistic example that would fail to infer correctly before but would infer correctly with this feature. For example, the typical type inference problem:
val x: List[Int] = ???
val reversed = x.foldLeft(Nil)((ys, y) => y :: ys)
In current Scala, the type parameter of foldLeft is inferred as Nil.type, and then y :: ys does not conform to that. With the scheme proposed here, it still fails to compile, because the type parameter is inferred as List[Nothing].
The exact same would happen to None and Option[Nothing].
Is there any (realistic, common) snippet that would fail to compile before, and succeed now?
@sjrd
~~~ scala
colors.foldLeft(Color.Black)((result, color) => someOp(result, color))
~~~
Where someOp computes a color based on two colors.
However, your point with List[Nothing] and Option[Nothing] still holds :-(
Where someOp computes a color based on two colors.
Have you ever actually seen such a case? I.e., an operation that takes an enum value and something, and returns a new value of the same enum set?
There's a reason I insisted on realistic, common. We sure can come up with snippets, but that does not count.
Here is a snippet adapted from a real example:
~~~ scala
sealed trait Step
case class A(b: Boolean) extends Step
case class B(s: String) extends Step
xs.foldLeft[Step]( … )
~~~
@retronym @odersky I like the idea of opting into Java enum compatibility by extending java.lang.Enum. The spec just needs to be compatible with Java enums (https://docs.oracle.com/javase/specs/jls/se8/html/jls-8.html#jls-8.9.3) in principle so that a compiler can emit them in these cases. In particular, defining both the Scala enum values method from the current implementation and the static values method required by Java enums could be problematic.
i'd probably never use a naked Int=>E valueOf method for fear of exceptions; i'd very much prefer Int=>Option[E]. or maybe (if that's still a thing in dotty) something like paulp's structural pseudo-Option used in pattern matching.
Can't agree with this more. If we are going to the trouble of adding enums, we really need a String => Option[T] function which lets you look up an enum by its String representation. Thats a lot of the reason why libraries like https://github.com/lloydmeta/enumeratum are required in order to do basic stuff that is already available by default in other languages
Otherwise I agree with the proposal in general
After thinking about values some more: Enumerations are mostly "normal" bytecode but with some special support by the Java compiler. In this sense they are similar to varargs, so we could treat values: Array[E] vs values: Iterable[E] in the same way:
- When defining an enum that extends java.lang.Enum in Scala, generate the values method with the Java signature instead of the one with the Scala signature.
- When using any Java enum (whether defined in Java or Scala) from Scala, treat its values method as having the Scala signature (with the same downside as for Java varargs: every call needs to allocate a wrapper Seq).
I updated the proposal to add withName, and make both valueOf and withName maps. The implementation #1958 has also been updated. Still to do: Clarify the connection to Java enums and what it means for values.
I'm VERY excited about this proposal. I spent 50+ hours working on a solution 2+ years ago, planning to turn it into a macro. And Odersky is the one who finally got me to see that a key fix for the core problem was to use an abstract case class.
It thrills me to the moon that I can consider abandoning my own implementation.
@sjrd this doesn't take typeclasses into account. Having an Option[Nothing] is still miles more useful than None.type if for example you care about using a Monad[Option] with it.
Is there a precedent for having the compiler output depend on scala.Map?
@odersky thinking about it again, having a Map[Int, E] and an additional Iterable[E] is overkill. A simple (preferably immutable) Seq is just as good; it's not that big of a difference whether I call Map#get or Seq#lift.
Is there a precedent for having the compiler output depend on scala.Map?
AFAIK there is no such precedent currently in Dotty.
If my answer here is accurate, there isn't any precedent in scalac either.
Will case class and sealed be removed in Scala 3?
Will case class and sealed be removed in Scala 3?
Case class: definitely not. We need open as well as closed sums. Sealed: will also stay because there are more complex class hierarchies that cannot be explained by enums but that are still confined to one compilation unit.
@odersky , is it possible to allow extending an enum in the same source file?
I don't feel good about the language having two very similar features: sealed trait and enum.
I don't like the nested case definitions in the enum syntax either.
It brings Java-style static member definitions back. Static member definitions are inconsistent with the type / companion separation conventions in the Scala language, and cause confusion with Scala's path-dependent type syntax.
It looks like the only reason for case val is sharing Java class files between enum values. If that is the case, why not apply this optimization to all objects?
Is it possible to share the class file among all empty-body objects that have the same super types and the same modifiers (no matter whether they are case objects or not)?
I don't feel good about the language having two very similar features: sealed trait and enum.
They are not just similar, but related: one expands into the other. There's arguably lots of precedent in Scala for this. Think about how function values expand into objects, or how for expressions expand into map, flatMap, and withFilter operations.
Is it possible to share the class file for all empty-body objects that have the same super types and same modifiers (no matter whether they arecase object or not) ?
That's a good question. There's a problem with the fact that each object is supposed to define its own class. I am not sure to what degree one can ignore that. Also, there's the issue that objects are lazy and we want non-lazy vals for efficiency.
how for expressions expand into map, flatMap, and withFilter operations.
I bet the for/yield expansion is a nightmare for the scala.meta guys. It also troubled macro authors like me, because we have to deal with two different syntaxes for (almost) the same AST.
One expands into the other
One key difference, though, is the return type of the apply method of the case companions. Do you think we could pull this feature to sealed traits too?
That's a good question. There's a problem with the fact that each object is supposed to define its own class. I am not sure to what degree one can ignore that. Also, there's the issue that objects are lazy and we want non-lazy vals for efficiency.
I know the optimization would break the behavior of getClass. Fortunately, IIRC, there is no mention of getClass in the SLS.
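For reference, the getClass behavior in question today (with illustrative names): each object compiles to its own JVM class, so sharing one class file among empty-body objects would be observable.

```scala
// Today, every object gets a distinct JVM class (Hearts$ and Spades$ here),
// which getClass exposes; class-file sharing would merge them.
sealed trait Suit
case object Hearts extends Suit
case object Spades extends Suit

val distinct = Hearts.getClass != Spades.getClass // true today
```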
One key difference, though, is the return type of the apply method of the case companions. Do you think we could pull this feature to sealed traits too?
Usually adding methods to an existing object would not break source code backward compatibility.
Compiling empty-body objects to the same shared class would expose more fragility at the binary compatibility level: now adding the first member to an object would break binary compatibility. I don't think this is an option.
Why not go full blown GADT style? enum Option[T] = None | Some(t: T) or enum Either[A, B] = Left(a: A) | Right(b: B) ? And then just replace enum with data?
@notxcain
We want to maintain the ability for enumeration values to define custom methods and to inherit from custom classes that weren't necessarily inherited by the common enum definition.
Binary compatibility is a platform-dependent attribute that does not affect the semantics of the language.
If the case val behavior is the same as case object, then we might not need case val. I prefer to keep using case object; the implementation difference can be distinguished by annotations.
Binary compatibility is a platform-dependent attribute that does not affect the semantics of the language.
It does affect the semantics of the ecosystem, though, which is closely related to the language. Dismissing binary compat concerns like that is dangerous for the well-being of the language, because a language cannot survive without its ecosystem.
I agree binary compatibility matters. However, since both @shareJavaClass case object and case val have the same impact on binary compatibility, why not keep the existing syntax?
I prefer @shareJavaClass case object because an author of a macro library like upickle does not have to modify its code as the AST does not change.
I am fully in favor of such a proposal, for all the reasons mentioned above!
A few months ago I made a prototype based on macro annotations that had almost the same syntax as this proposal. This was a good compromise, reusing the existing Scala syntax but allowing for more concision.
However, I am worried that this syntax will confuse newcomers more than anything. The main reason, as pointed by @Atry, is that it blurs the distinction between object and class members.
Concepts such as type members are already hard to grasp coming from other languages; if we make the object/class member distinction murkier, it will get in the way of fundamental intuitions.
Since we have the possibility to change the syntax of the language, I would propose a syntax where the cases are not syntactically scoped inside the class, but rather "appended" to it, a bit like one defines co-recursive entities in ML by starting with let or type and then separating definitions with and. To remain Scala-ish and avoid adding keywords, we could use with instead:
enum Option[T]
with case Some(x: T)
with case None
object Option {
  def apply[T](x: T) = if (x != null) Some(x) else None
}
Or if we really want fewer keystrokes, simply use case alone:
enum Option[T]
case Some(x: T)
case None
enum Color case Red case Green case Blue
Since the cases are syntactically kept together with the enum (as opposed to potentially defined in an object later in the file), it is no more a problem to have them refer to the enum's type parameters.
I would also be in favor of two improvements:
Cases that pass parent parameters without an explicit extends clause. Parent enum parameters that are not parameters in the case can be defined in the body of the case. For example:
enum Param(name: String, typ: Type)
case Nominal(name, typ)
case Positional(ord: Int, typ) { val name = s"_$ord" }
Would be comparable to the current:
sealed abstract class Param(val name: String, val typ: Type) extends Product with Serializable
final case class Nominal(override val name: String, override val typ: Type) extends Param(name, typ)
final case class Positional(ord: Int, override val typ: Type) extends Param(s"_$ord", typ)
It also means we can express the Flipped case alluded to by @julienrf without an explicit extends clause, as in: enum Parent[A,B] case Flipped[B,A].
I agree with letting unspecified type parameters resolve to the lower bound of covariant and upper bound of contravariant parameters. This is consistent with the way they are inferred in expressions.
Also, the same type/bounds elision principle could be applied to overriding methods as well (which is implemented in my boilerless prototype). I think it would make sense to allow that in enums only, because in that situation one can quickly look up the parent method signature (cf. the enum has to be defined just above the cases).
Combined with the previous remark, this means one can write:
enum EitherOrBoth[+A,+B] { def fold[T](f: A => T, g: B => T)(m: (T,T) => T): T }
case First (value: A) { def fold(f,g)(m) = f(value) }
case Second(value: B) { def fold(f,g)(m) = g(value) }
case Both(fst: A, snd: B) { def fold(f,g)(m) = m(f(fst),g(snd)) }
All that being said, and if we assume nominal implicit extension, the magic of automatic type parameter insertion becomes of dubious usefulness. It would be better IMHO to have cases list all their parameters explicitly (types and values). This would not lead to much more code, but would be much clearer.
Say, for example:
enum MyOption[+T <: AnyRef] // bounds are implicitly propagated to homonymous case parameters
case Som[+T](x: T) // means `case Som[+T <: AnyRef](x: T) extends MyOption[T]`
case Non // means `case Non extends MyOption[Nothing]`
I like this proposal. I do hope two things come in:
1) inferring Nothing on covariant enums that have cases that don't use a type parameter.
2) I really don't like preventing adding additional types on the cases: https://github.com/lampepfl/dotty/issues/1970#issuecomment-279356882
If we ban adding additional types, then when you need to add one, you have to refactor to sealed trait/case classes and expand out this boilerplate by hand, which is a pretty bad experience.
I can't see why we can't just take all the types in the enum and consider those the first types on all the cases, with any additional types coming after that. (Or Scala could have multiple type parameter blocks, which would also be nice for many inference situations.)
Or if we really want fewer keystrokes, simply use case alone:
enum Option[T]
case Some(x: T)
case None
That was actually the design I started with. The problem with it is that it effectively disallows toplevel enums. First, simple cases like None could not be mapped to vals because they are not allowed at toplevel. Second, enums with many alternatives would pollute the namespace too much.
Note that one can always "pull out" cases into the enclosing scope using val and type aliases. So cases inside objects are more flexible.
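The "pulling out" of cases mentioned above can be sketched in plain Scala 2; the names `Color` and `palette` are illustrative, not part of the proposal.

```scala
// An enum-like hierarchy whose cases live in the companion object:
sealed trait Color
object Color {
  case object Red extends Color
  case object Blue extends Color
}

// "Pulling out" a case into another scope with a val alias and a type alias:
object palette {
  val Red: Color.Red.type = Color.Red
  type Red = Color.Red.type
}
```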
cases inside objects are more flexible
Actually, I had in mind that enum E case A case B ... would expand into class E; object E { case class/object A; case class/object B; ... }. Realizing now that the expansion of my Param example was inaccurate.
I think it's still less counter-intuitive than something that looks like a class but whose "members" are moved to a companion object. After all, adding case in front of a class creates a companion object with an apply method inside. Is it a big stretch to accept that case following a class creates case classes in the companion object?
A language should provide mechanism, not policy.
I agree that:
values, valueOf and java.lang.Enum interoperability are important for a sealed trait.
I did not recognize "less verbose" as a goal for a language feature itself. "Less verbose" should be achieved by composing tiny, reusable, elegant, atomic features.
I propose not to change the Scala 2.x syntax except for replacing sealed trait with enum.
enum Color
object Color {
  case object Red extends Color
  case object Blue extends Color
  final case class Rgb(r: Byte, g: Byte, b: Byte) extends Color
}
expand to
sealed trait Color extends java.lang.Enum with scala.Enumeration
object Color extends scala.EnumerationCompanion {
  @shareJavaClass case object Red extends Color
  @shareJavaClass case object Blue extends Color
  final case class Rgb(r: Byte, g: Byte, b: Byte) extends Color
}
Since enum already covers all usages of sealed, I guess the sealed keyword might be removed from the language.
My take on the whole compatibility with Java enums: if Java enum compatibility is going to harm the design of enum in general, then compatibility should be opt-in with some annotation. Scala is going to start taking into account other platforms, and I don't think that being hamstrung by Java compatibility is a good idea. Binary compatibility for adding more entries to the enum should be provided, however; that is a killer feature.
Also, if the syntax is overly verbose for enums, then there isn't really going to be that much of a difference between using enum and something like Enumeratum, which means there is little advantage to having this as a language feature.
In my opinion, I would rather prefer a light enum that does less, with minimal syntax, that focuses on performance/memory usage and provides SYB-generated methods for looking up enum cases by String value, rather than something which is sort of, but not really, ADTs.
After reading the entire discussion, it's starting to get muddied what the difference is between what people are discussing and what is possible now with ADTs/case objects. If the difference between the two isn't going to be substantial, it's really just going to be confusing for end users, and we don't want to repeat what happened with Scala's Enumeration again.
Another reason against this PR:
Multiple case objects for one sealed trait is a rare situation in real-world Scala code. For an enumeration with many candidate values, we usually store them in an external format, like a database schema, XML DTD, or JSON schema.
Popular Scala libraries, like scalaz, are usually modeled with a lot of case classes to represent their states, even for those cases that have no arguments, because they might need a type parameter, like case class Tower[A]().
Even enum in the Java language is rarely used. They use the visitor pattern to avoid the need for ADTs.
If you search for "enum" in all Java source code on GitHub, you will find that the main usage of enum is testing the enum keyword itself.
https://github.com/search?l=Java&q=enum&type=Code&utf8=%E2%9C%93
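As an aside, the visitor pattern mentioned above can be sketched in Scala; all names here (`Shape`, `ShapeVisitor`, `area`) are invented for the example. Each case implements `accept`, and operations are added as visitors, so no sealed hierarchy or pattern match is needed.

```scala
trait ShapeVisitor[R] {
  def onDot: R
  def onCircle(radius: Double): R
}
// The hierarchy stays open: new cases can implement accept without a sealed parent.
trait Shape {
  def accept[R](v: ShapeVisitor[R]): R
}
case object Dot extends Shape {
  def accept[R](v: ShapeVisitor[R]): R = v.onDot
}
final case class Circle(radius: Double) extends Shape {
  def accept[R](v: ShapeVisitor[R]): R = v.onCircle(radius)
}

// An operation over shapes, defined without touching the Shape classes:
object area extends ShapeVisitor[Double] {
  def onDot: Double = 0.0
  def onCircle(r: Double): Double = math.Pi * r * r
}
```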
Multiple case objects for one sealed trait is a rare situation in real-world Scala code.
I don't think we live in the same "real world". A search in the Scala.js repo shows that virtually all case objects are part of a collection of several that extend a sealed trait or sealed abstract class. And in most of those cases, it's actually a proper enum, in the sense that there is no case class in the same set.
@sjrd How many case classes are there in the Scala.js repository? What is the ratio between case object and case class?
@atry @sjrd - to sort out usage, @olafurpg has a huge corpus that can be used to check relevant usage statistics. This feels to me to be sliding into conjecture...
I don’t know how relevant it is but in a simple project (~10k LOC) I count about 30 enum-like definitions (sealed traits extended by case objects only).
I noticed that almost all usages of case object in the Scala.js repository are schemas for configuration, while my libraries usually provide APIs for standalone mechanisms, which do not expose configuration to users.
I guess that's why @sjrd and I live in different "real world"s. No surprise that other utility libraries like Scalaz or Shapeless do not use case object very often either.
@Atry
Multiple case objects for one sealed trait is a rare situation in real-world Scala code. For an enumeration with many candidate values, we usually store them in an external format, like a database schema, XML DTD, or JSON schema.
Uh, I wouldn't make assumptions like this. Where I work, we have a huge amount of code consisting of enums that are defined with Enumeratum (i.e. case objects in a sealed abstract class, which could easily be a sealed trait).
@Atry
How many case classes are there in Scala.js repository? What is the ratio between case object and case class?
scalajs$ git grep 'case object' | wc
75 502 8418
scalajs$ git grep 'case class' | wc
269 2197 35161
scalajs$ git grep 'case class' | grep -v '/Trees.scala' | wc
146 1101 19364
75 case objects versus 269 case classes. Of which 123 are for defining the ASTs of the IR and JavaScript code.
In any case, it's definitely not rare.
It's also used all over the place when building web servers (one of the primary types of software built with Scala today) to define enums which get output in JSON/XML.
Or in any type of GUI, which needs a concept of ordering enums.
It's really not rare at all.
I think this code is unreadable:
enum class Option[+T] extends Serializable {
  def isDefined: Boolean
}
object Option {
  def apply[T](x: T) = if (x != null) Some(x) else None
  case Some(x: T) {
    def isDefined = true
  }
  case None extends Option[Nothing] {
    def isDefined = false
  }
}
Why is Some defined in the object ? It is a class ! enum should be used to list all possible case, what is an abstract function declaration doing in an enum ? In my opinion it should look like this :
enum Option[+T] {
  Some(x: T)
  None
} {
  def isDefined: Boolean = this match {
    case Some(_) => true
    case None => false
  }
}
It is a lot like Rust's enum and much more intuitive in my opinion.
Enumerations should be efficient, even if they define many values. In particular, we should avoid defining a new class for every value.
Even if the use of case object is not "rare", is it really "many values"? Considering there are many more non-case objects than case objects, is it possible to apply similar optimizations to those objects?
Even if the use of case object is not "rare", is it really "many values"?
I believe there are a lot of enumeration-like structures that have many (10-256) values and that are not represented as case objects, precisely because case objects are too heavyweight. For instance, there are quite a few of these in the various compilers that I know. The idea is that the enum construct should cover these use cases.
So the number of case objects in existing code bases is not a good indicator for deciding whether we need more lightweight enums.
Enumerations should be efficient, even if they define many values. In particular, we should avoid defining a new class for every value
(I realize the following is a tangent and I am not involved at all writing compilers or optimizers.)
Scenarios where Scala will be deployed the good old-fashioned way as non-optimized JAR files will remain with us for a long time, but won't the upcoming Dotty linker change our notion of what might be heavyweight vs. lightweight, inefficient vs. efficient?
For example with Scala.js (and its linker/optimizer) you care much less about optimizing for certain patterns because you know the optimizer takes care of it (for example erasing certain collection operations down to basic operations on Scala.js arrays, see @sjrd's talks on this).
With an appropriate linker/optimizer, a Scala ADT with hundreds of objects wouldn't necessarily imply the creation of hundreds of Java classes at runtime: the linker/optimizer could create a much more optimal runtime representation of sealed hierarchies as needed.
I don't think this changes much regarding the original proposal, which has other motivations. I just wanted to point out that the performance motivation might mostly go away in the future.
@ebruchez Scala.js optimizes away objects that are not referenced. Here we are talking about large enums which are referenced in large match expressions. E.g. an enum for token classes in a compiler.
Those won't be optimized away.
Generally, there's a danger to assume a "sufficiently smart compiler". If you have a concrete plan how to optimize a certain construct, that should surely be factored in. But if it's only a vague feeling, don't count on it.
That's understood. I realize that the Scala.js linker doesn't do this kind of thing right now (AFAIK). I had something more concrete in mind, in fact, although I realize that this would need implementation work in the linker.
Stephen Compall wrote here about the equivalence between type-safe pattern matches and "matchless folds". If I understand correctly, this implies that, assuming type safety, one could write pattern matches for ADTs which do not rely on individual isInstanceOf checks (or maybe just one such test on the base class/trait).
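The "matchless fold" idea can be sketched in plain Scala: one virtual call to a fold method replaces per-case isInstanceOf tests. The names `Opt`, `Som` and `Non` are invented here to avoid clashing with the standard Option.

```scala
sealed trait Opt[+A] {
  def fold[B](ifEmpty: => B)(f: A => B): B
}
final case class Som[+A](value: A) extends Opt[A] {
  def fold[B](ifEmpty: => B)(f: A => B): B = f(value)
}
case object Non extends Opt[Nothing] {
  def fold[B](ifEmpty: => B)(f: Nothing => B): B = ifEmpty
}

object OptDemo {
  // A "pattern match" expressed as a fold, with no instance tests:
  def describe(o: Opt[Int]): String = o.fold("empty")(n => "value " + n)
}
```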
This means that many common sealed hierarchies could be implemented at runtime with much more efficient representations. Obviously the runtime pattern matcher would have to be aware of that.
This tells me that it might not entirely be a crazy assumption that common enums patterns could be massively optimized by the linker.
@ebruchez It comes down to this: Can we assume that an optimizing compiler/linker will replace object definitions with value definitions? For this we need to show:
- The object's initializer does not have side effects
- Nobody does interesting things with the object's class (because it will go away)
Both conditions are uncheckable with our current technologies. The first condition might be checkable with a global code analysis or an effect system, but neither exists yet. The second condition is probably too vaguely defined to be effectively checkable.
The enum proposal needs to give an expansion anyway for simple enum cases. It currently maps them to value definitions, which is straightforward. Mapping them to object definitions instead would couple the whole issue to the two hairy problems mentioned above, which looks too risky. Not to mention that interop with Java enumerations would become a game of roulette with an unpredictable optimizer.
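A hand-rolled sketch of what a val-based expansion for simple cases could look like. The actual proposal relies on scala.Enum and scala.runtime.EnumValues; the names `ColorVal`, `enumTag`, `values` and `valueOf` below are only illustrative.

```scala
sealed trait Color { def enumTag: Int }
object Color {
  // One shared private class for all simple cases, instead of a class per case:
  private final class ColorVal(tag: Int, name: String) extends Color {
    def enumTag: Int = tag
    override def toString: String = name
  }
  // Simple cases become eagerly-initialized vals:
  val Red: Color = new ColorVal(0, "Red")
  val Green: Color = new ColorVal(1, "Green")
  val Blue: Color = new ColorVal(2, "Blue")
  val values: Array[Color] = Array(Red, Green, Blue)
  def valueOf(tag: Int): Color = values(tag)
}
```

Note that with this encoding all simple cases share one runtime class, which is exactly why getClass-based distinctions (and per-case subclassing) are given up.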
@ebruchez
Comparing to Scala.js (or Dotty's experimental deep linker) is not an apples-to-apples comparison.
In regards to Dotty's deep linker, this is not designed to work with libraries and only works with final "executable" jars. At least with Scala on the JVM, any built artifact which is a JAR is treated as library so it needs to be able to be dynamically linked at runtime. Artifacts built with Dotty's deep linker wouldn't be able to be linked as libraries afaik.
Scala.js is slightly different in this regard. Its target is still building highly optimized .js outputs (which is similar to Dotty's deep linker building highly optimized JARs); however, for Scala.js libraries it uses its own internal bytecode, which differs from JVM bytecode and still allows you to make Scala.js libraries.
The reason this is important is that enums aren't "just case objects". Case objects carry around class information, because they can be referenced/looked up and extended in ways that enums can't be. Having an actual enum keyword signals to the compiler that this construct is a much more limited version of case object, and because of this the compiler can do optimizations (which are not possible if we are dealing with dynamically linked libraries) that it otherwise can't do. This is the same reasoning behind constructs like AnyVal: it's not possible to automatically have unboxed value classes if we also have to provide dynamically linked libraries. The deal is the same with abstract class versus trait, or the new @static proposal being made.
This is why I am also in favour of making the proposed enum construct as limited as possible, in the same way that AnyVal is very limited compared to a standard one-field case class constructor. The more things that enum is asked to support, the closer it is to becoming a slightly different variant of the standard sealed abstract class/sealed trait + case object pattern we have now.
At least personally, the things that are important for enum are the same things that are added with https://github.com/lloydmeta/enumeratum, and nothing more, specifically
And more importantly, what it doesn't support
This is, of course, so we can support enums not instantiating one (or two) class instances per enum value.
@notxcain
Why not go full blown GADT style? enum Option[T] = None | Some(t: T) or enum Either[A, B] = Left(a: A) | Right(b: B) ? And then just replace enum with data?
@DarkDimius
We want to maintain the ability for enumeration values to define custom methods and inherit from custom classes that weren't necessarily inherited by the common enum definition.
Objective 5:
It should support all idioms that can be expressed with case classes. In particular, ...elided.., and arbitrary statements in a case class and its companion object.
I'd argue that if we follow @notxcain's idea to go pure ML-style GADT/ADT and disallow templates for enum classes or cases (extends is fine), we lose nothing. As Scala has a wonderful feature, implicit helpers, I can foresee a programming paradigm emerging:
data Option[+T] = Some(x: T) | None
implicit class OptionHelper[T](val opt: Option[T]) extends AnyVal {
  def isDefined: Boolean = opt match {
    case Some(_) => true
    case None => false
  }
}
This way, Scala can have the best of FP and OO. Also, the expression problem reminds us that there are two orthogonal ways to define operations on data. As case classes are mainly the OO way of operations on data, I think it's probably good for this new language feature to go the FP way of operations on data. This also alleviates the problem of confused users: which construct to use?
@liufengyun Just adding standard ADTs would not be very Scala-like. Scala has always avoided having FP and OOP features side by side. Instead it tries very hard to unify them. So if we define enums, we want to map them to classes, and we want to not artificially restrict the things you can do with these classes because that would lose orthogonality.
Just adding standard ADTs would not be very Scala-like. Scala has always avoided having FP and OOP features side by side.
@odersky I guess there is some language design philosophy here I cannot argue against. But as a programmer, the appeal of syntactic simplicity is irresistible -- I believe that's also what most Scala programmers think Scala is about. It's just about the syntax; behind the scenes, the standard ADTs also map to classes. For programmers, syntactic simplicity makes them love the language, not just use the language, IMHO.
Or even something like
data Option[A] = Some(a: A) | None
ops Option[A] {
  def isDefined: Boolean = this match {
    case Some(_) => true
    case None => false
  }
}
Just improvising
It's just about the syntax
@liufengyun What about the syntax I proposed earlier?
enum Option[T] case Some(x: T) case None
This lends itself to both short, elegant definitions and to more typical OOP: you can add braces to introduce members for cases (and it still looks like Scala), but if you want to keep the ADT clean and add your methods via implicits, no one is preventing you from doing that. Why impose arbitrary restrictions on programmers?
@LPTK
enum Option[T] case Some(x: T) case None: This lends itself to both short, elegant definitions and to more typical OOP: you can add braces to introduce members for cases (and it still looks like Scala)
First, I'd argue that this is not the simplest possible syntax. By the philosophy of making simple things easy and difficult things possible, the ML-like ADT approach also supports OOP via extends; it just disallows templates.
Second, syntactical simplicity also relates to redundancy of language features. By disallowing templates, we are minimising the overlap between the new ADT and sealed class/trait and differentiating their use cases.
Third, it's related to @odersky 's concern about top-level definitions:
That was actually the design I started with. The problem with it is that it effectively disallows toplevel enums. First, simple cases like None could not be mapped to vals because they are not allowed at toplevel. Second, enums with many alternatives would pollute the namespace too much.
If we have a completely new ML-style ADT syntax that has no prima facie connection with case object and case class, the language designer has much more flexibility in the implementation. For example, the implementation can put the case defs in object Color or object Weekday. This implementation behaviour is justified only if we make the syntax more different from case object/class (don't use the keyword case); otherwise it contradicts existing intuitions.
This also means programmers can still define the companion object Color or object Weekday; the compiler just merges the custom-provided companion object with the synthesised one, consistent with existing Scala behavior.
@liufengyun Have you seen the answer I provided to the message you quoted? (Genuine question; not saying you should necessarily agree with it.)
I had in mind that enum E case A case B ... would expand into class E; object E { case class/object A; case class/object B; ... }. Realizing now that the expansion of my Param example was inaccurate.
I think it's still less counter-intuitive than something that looks like a class but whose "members" are moved to a companion object. After all, adding case in front of a class creates a companion object with an apply method inside. Is it a big stretch to accept that case following a class creates case classes in the companion object?
There is nothing in this approach (AFAIK) that would prevent what you then describe:
This also means programmers can still define the companion object Color and object Weekday [...] it just merges the custom provided companion object with the synthesised one
@LPTK Yes, I see it's just syntactical difference from yours -- I'm just reserved about using the case keyword and allowing templates for cases and enum.
BTW, I see a potential usability problem with putting case definitions inside an object. If a programmer defines multiple ADTs, say 10, in a single file, then to use the ADTs in another file the programmer has to import 10 times. It will become very annoying in a project where all data definitions are centralized in a single file.
I'm not sure if it's technically possible to alleviate this problem by putting top-level statements (non class/object defs) inside the package object -- some tricky merging is required in the Namer. Technically, it seems feasible.
It will become very annoying in a project where all data definitions are centralized in a single file.
Good point. But a simple @expose or @forward macro annotation could automatically create type and value forwarders alongside any non-top-level annotated class, exposing selected members of its companion object (essentially its public classes, objects, and those values that result from enum cases, I would suggest).
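Such forwarders can be written out by hand to show the shape of what the annotation would generate; the @expose/@forward annotation itself is hypothetical, as are the names `AllData`, `Shape` and `Forwarders` below.

```scala
object AllData {
  sealed trait Shape
  object Shape {
    case object Dot extends Shape
    final case class Circle(r: Double) extends Shape
  }
}

// Forwarders a macro annotation could synthesize next to AllData:
// a value forwarder per case object/companion, plus a type alias per case class.
object Forwarders {
  val Dot: AllData.Shape.Dot.type = AllData.Shape.Dot
  type Circle = AllData.Shape.Circle
  val Circle: AllData.Shape.Circle.type = AllData.Shape.Circle
}
```

A single `import Forwarders._` then brings the cases of every ADT into scope at once.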
@mdedetrich
In regards to Dotty's deep linker, this is not designed to work with libraries and only works with final "executable" jars. At least with Scala on the JVM, any built artifact which is a JAR is treated as library so it needs to be able to be dynamically linked at runtime. Artifacts built with Dotty's deep linker wouldn't be able to be linked as libraries afaik.
Scala.js is slightly different in this regard. Its target is still building highly optimized .js outputs (which is similar to Dotty's deep linker building highly optimized JARs); however, for Scala.js libraries it uses its own internal bytecode, which differs from JVM bytecode and still allows you to make Scala.js libraries.
I am not sure I follow or understand the distinction above. In both cases, you have whole-program optimization under a closed-world assumption, except for explicit entry points. My understanding is that all Scala libraries, at some point in the future, would include TASTY trees, which the linker will use for its analysis.
Artifacts built with Dotty's deep linker wouldn't be able to be linked as libraries afaik.
If you mean runtime linkage (as in dropping a jar into a J2EE container classpath), then you're right.
If you mean compile-time dependencies, then, as correctly pointed out by @ebruchez, no. The reason is that you can recover the entire library from TASTY and recompile it under different assumptions if you need to.
Note that it's too early to tell how it will work out, but design-wise there's nothing prohibiting compile-time dependencies on pre-optimized jars.
This tells me that it might not entirely be a crazy assumption that common enums patterns could be massively optimized by the linker.
Yes, they might be. But I think there are two points which are conflated in this discussion: enum semantics (and whether we need them), and the performance of the implementation.
My understanding is that enum started as a way to provide stronger guarantees than sealed, enabling reliable discovery of subclasses in metaprogramming. In this regard, I consider enum a substantial improvement over the current situation and I think we should include it.
As far as how to compile them: the current scheme is very inefficient if you compare it with C enums, but C enums are a lot less expressive. There may be a place for an analysis that optimizes existing classes/objects into enums and then finds out whether those could be represented in a more compact/efficient way. I think that this is a separate project which may be a nice feature for the Dotty linker/Scala.js linker. For the Dotty linker, specialization and devirtualization are already huge, and we don't want to spread ourselves thin.
@odersky
- The object's initializer does not have side effects
- Nobody does interesting things with the object's class (because it will go away)
Both conditions are uncheckable with our current technologies. The first condition might be checkable with a global code analysis or an effect system, but neither exist yet. The second condition is probably too vaguely defined to be effectively checkable.
Once you have that linker/optimizer and an "enumeration optimizer", it seems to me that the main (only?) risk would be to get slower enumerations if you do, as you say, "interesting things" with the objects. Like for @tailrec, you could have an annotation to ensure that the enumeration is in fact compiled/linked efficiently. If it is not possible to check that, then you would get a warning or error. But I can also imagine that too many uses of the enumeration might fail the check for the annotation to be useful, although that remains a bit unclear to me.
Not to mention that interop with Java enumerations would become a game of roulette with an unpredictable optimizer.
It wouldn't be if Scala enumerations that must be Java-compatible were marked as such, e.g. by extending a particular trait or having an annotation, as suggested in some comments above. Only pure-Scala enumerations would benefit from an optimized representation.
All in all, I don't have anything against the simpler and more straightforward solution of using values rather than objects.
I still would love to see somebody experiment with optimizing the runtime representation of ADTs in the context of a linker/optimizer. As @DarkDimius just wrote above, there is other fish to fry, so I will leave things at that for now ;)
I have updated the proposal to reflect all suggestions in the discussion that were adopted so far. I believe this proposal as a whole will not change much anymore. I would still welcome suggestions on details. More fundamental change requests, such as a fundamental change in the syntax or scope of the proposal, could be worked out fully as alternative proposals, ideally including an implementation, so that they can be discussed in depth. In that case, it would be best to make alternative proposals in separate issues.
We'd like to reach a decision whether we want to go ahead with this or not by the end of next week.
I like the basic design of these enum GADTs:
enum Option[+T] {
  case Some(x: T)
  case None extends Option[Nothing]
}
But to add instance members, would it be possible to put them directly in the cases? I.e.,
enum Option[+T] {
  case Some(x: T) { override def isDefined: Boolean = true }
  case None extends Option[Nothing] {
    override def isDefined: Boolean = false
  }
  def isDefined: Boolean
}
object Option {
  def apply[T](t: T): Option[T] = if (t != null) Some(t) else None
}
The logic is that if something inside an enum definition doesn't start with the case keyword, it's an instance member.
> The logic is that if something inside an enum definition doesn't start with the `case` keyword, it's an instance member.
I am philosophically opposed to that. It makes enum look like a way to introduce another kind of class, but then case makes no sense in a class - it should go in the object! This matters when you consider what other members can be accessed from a case. The existing syntax treats an enum alone as neither a class nor an object (or, if you want, as both a class and an object).
OK. Not sure I understand this part:
> This matters when you consider what other members can be accessed from a case.
Re: enums being just another kind of class [with instance members], I think that is familiar to Java programmers.
> OK. Not sure I understand this part:
Cases can directly access all members of the object, but only non-private members of the class. So putting them in the class causes confusion. Example:
```scala
enum Option[+T] {
  case Some(x: T) { override def isDefined: Boolean = default }
  case None extends Option[Nothing] {
    override def isDefined: Boolean = !default
  }

  private val default: Boolean = true
  def isDefined: Boolean
}

object Option {
  private val default: Boolean = false
  def apply[T](t: T): Option[T] = if (t != null) Some(t) else None
}
```
According to the intuitive nesting rules, the default value accessed by Some and None is the one in the class. But after translation it's the default value in the object. First rule of language design: Don't mess with the scoping rules :smile:.
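To make the confusion concrete, here is a hand-written sketch of roughly what the translation of the example above produces (an assumption about the desugaring, not actual compiler output): the cases become members of the companion object, so an unqualified `default` inside them resolves to the object's private value, not the class's.

```scala
// Hypothetical desugaring sketch of the Option example above.
sealed abstract class Option[+T] {
  private val default: Boolean = true // NOT visible to the cases below
  def isDefined: Boolean
}

object Option {
  private val default: Boolean = false // this is what the cases actually see

  // Some and None end up nested inside the object, so `default` here is false.
  final case class Some[+T](x: T) extends Option[T] {
    override def isDefined: Boolean = default
  }
  case object None extends Option[Nothing] {
    override def isDefined: Boolean = !default
  }
}
```

Under this encoding `Option.Some(1).isDefined` evaluates to `false`, the opposite of what the nesting in the source suggests - which is exactly the scoping trap being pointed out.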
Thanks for the explanation, I see the problem now.
> First rule of language design: Don't mess with the scoping rules 😄.
Pithy and succinct. I'll remember that one :-)
> First rule of language design: Don't mess with the scoping rules
Couldn't the following be viewed as a contradiction to this principle?
```scala
enum class E[T]
object E {
  type T = String
  def foo(x: T) = x // here T = String
  case A(x: T)      // here T is an invisible class parameter
}
```
As remarked by @julienrf and addressed by @DarkDimius in his proposal for the long form, the type parameter T "crosses scopes" in a rather counter-intuitive way.
@LPTK I agree it's a borderline case. Technically, the type parameter of E is duplicated as a type parameter of foo rather than extending to foo. But it certainly looks confusing. One refinement would be to disallow the local definition of T in the E object. There's precedent for that. We also disallow
```scala
class C[T] { type T }
```
> the type parameter of E is duplicated as a type parameter of foo
I'm not sure I'm following you. With the current version of #1958 and from what I see after -Xprint:front, method foo does not seem to have type parameters:
```scala
final module class E$() extends Object() { this: E.type =>
  type T = String
  def foo(x: E.T): E.T = x
  final case class A[T](x: T) ...
```
> disallow the local definition of T in the E object
What about:
```scala
object Main {
  enum class E[T]
  case class F[T <: String](x: T)
  type T = String
  object E {
    def foo(x: T) = x
    case A(x: T)
  }
  E.A(42)      // ok
  // F(42)     // rejected
  // E.foo(42) // rejected
}
```
It seems definitions intervening between the enum and its object could equally make things more confusing, so it would make sense to force the object to be defined syntactically right after the enum _and_ to prevent the object from defining types with the same name as type parameters of the enum... At this point, we're not too far from the original syntax `enum A case B case ...` :smile:
> as a type parameter of foo
Sorry, that should have been A.
For those interested, I fleshed out some implementation of my above proposal, as an experiment. See issue https://github.com/lampepfl/dotty/issues/2055.
I would love this to be implemented, but one thing that has to be supported is the ability to implement Visitor-pattern dynamics and to have more in the enum than simply the object itself. For an explanation, look at my post on Stack Overflow: http://stackoverflow.com/questions/43152963
This is one of the best ways to leverage enums in a code base to avoid having complex switch logic.
I integrated @szeiger's proposal
> How about automatically filling in unused type parameters in cases as their lower (covariant) or upper (contravariant) bounds and only leaving invariant type parameters undefined?
It's now number 3 of the new desugaring rules. The rule is complicated, but it's one stumbling block fewer when defining simple ADTs.
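The classic hand-written encoding shows what this rule buys (a sketch of my reading of it, not the normative rule text): an unused covariant type parameter can be pinned to its lower bound, so a single case value works at every instantiation of the enum class.

```scala
// Hand-written equivalent of the fill-in rule for an unused covariant
// type parameter: None does not mention T, so it extends Option[Nothing],
// the lower bound of +T.
sealed trait Option[+T]
final case class Some[+T](x: T) extends Option[T]
case object None extends Option[Nothing]

// Covariance lets the single None value inhabit every Option[T]:
val a: Option[Int]    = None
val b: Option[String] = None
```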
Is there a mapping from this Java enum to this proposal? This is a purposely convoluted example of what you can do in Java...
```java
interface Mergeable<A> {
    A mergeWith(A other);
}

enum Color implements Mergeable<Color> {
    BLACK("As dark as can be") {
        @Override
        public Color mergeWith(Color other) {
            switch (other) {
                case BLACK:
                    return BLACK;
                default:
                    return GREY;
            }
        }
    },
    WHITE("Blinding white") {
        @Override
        public Color mergeWith(Color other) {
            switch (other) {
                case WHITE:
                    return WHITE;
                default:
                    return GREY;
            }
        }
    },
    GREY("Somewhere in between") {
        @Override
        public Color mergeWith(Color other) {
            return GREY;
        }

        public String invisibleButFindMeViaReflectionSeriously() {
            return "you found me out!";
        }
    };

    private final String description;

    Color(String description) {
        this.description = description;
    }

    public String getDescription() {
        return description;
    }
}
```
Syntactically, the translation would be the following, but it does not appear to be valid:
```scala
trait Mergeable[A] { def mergeWith(a: A): A }

enum Color(description: String) extends Mergeable[Color] {
  case Black("As dark as can be") {
    override def mergeWith(color: Color) = color match {
      case Black => Black
      case _     => Grey
    }
  }
  case White("Blinding White") {
    override def mergeWith(color: Color) = color match {
      case White => White
      case _     => Grey
    }
  }
  case Grey("Somewhere in between") {
    override def mergeWith(color: Color) = Grey
    def structuralOrNotIDontKnow() = "method not on super"
  }
}
```
What I am highlighting here are a few features of java enums:
Scanning the examples in this issue, I did not see any where the enum had constructor parameters. I could be blind though. There is a lot of talk about encoding GADTs more simply, which I will certainly use, but not much talk about encoding multiple instances of simple stuff, like the canonical Java "Planet" example (https://docs.oracle.com/javase/tutorial/java/javaOO/enum.html), where the only difference is in the values embedded in the enum instances -- no behavior difference.
A description and an implementation of a translation to Java enums still remain to be done. Since I am not very current on the details of Java enums, I would appreciate help from others here.
I don't think it should be a requirement that we can support _all_ features of Java enums, though. Specifically, the example above passes the string associated with a case as a constructor parameter (at least that's how I understand it; I might be wrong here). That's not supported in the proposal; you'd have to override toString instead.
> Scanning the examples in this issue, I did not see any examples where the enum had constructor parameters
enum classes can have parameters, but then enum cases need to use the usual extends syntax to pass them. There's no shorthand for this like there is in Java enums.
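For example, in the proposed syntax a parameterized enum class would look like this (a sketch; `Color` and its cases are made up for illustration):

```scala
enum Color(val rgb: Int) {
  case Red   extends Color(0xFF0000)
  case Green extends Color(0x00FF00)
  case Blue  extends Color(0x0000FF)
}
```

Each case spells out the extends clause to supply the constructor argument; there is no Java-style shorthand such as `Red(0xFF0000)`.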
@scottcarey I have added a Scala version of the planet example to the description above. Thanks for pointing me to the Java version!
I made two changes to the proposal.
Rename the methods defined for an enumeration as follows: values -> enumValues, valueOf -> enumValue, withName -> enumValueNamed.
The reason is that a common use case of enumerations is a wildcard import of the enum object to get all the cases, e.g. import Color._. But by the same import we also get the implicitly defined methods. So it's better that these methods have uncommon names that do not conflict by accident with something the user defined.
Generate the utility methods not just for simple cases but for all singleton cases. The Java planet example could not have been written conveniently without this generalization.
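Usage of the renamed methods would then look roughly like this (a sketch based on the names above; the exact result types and signatures are my assumption, not spec text):

```scala
enum Color {
  case Red
  case Green
  case Blue
}

import Color._ // brings in the cases, but no commonly-named helper methods

// The enum-prefixed names avoid accidental conflicts after the wildcard import:
val all   = Color.enumValues             // all singleton cases (assumed collection type)
val first = Color.enumValue(0)           // lookup by ordinal (assumed signature)
val red   = Color.enumValueNamed("Red")  // lookup by name (assumed signature)
```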
Supporting every possible quirk of Java enums is not necessary, but there is some significant overlap to take advantage of.
The best way to think about Java enums is to consider that the singleton pattern in java is:
```java
enum ThereCanBe {
    ONLY_ONE;
}
```
(as recommended by Effective Java for a decade, but ignored by bloggers talking about advantages Scala has over Java)
Decorate the enum with whatever compiler-known data (member variables) and behavior (methods) you want.
This extends to more than one instance of the type, all known in advance by the compiler, leading to efficient dispatch over them (via switch) and tools like EnumMap.
With that in mind, this proposal is essentially providing two things that I see:
An extension of object to allow for more than one instance of a singleton type -- the same thing that happens when you go past one instance in a Java enum. This patches the gap that leads people to write Java enums within Scala projects, because they are so much easier to work with for this case.

Both cases can lead to improved performance by dispatching pattern matches with a switch over the ordinal, rather than cascaded instanceof checks.
In the first case, where everything is a singleton, this maps neatly to java enums as far as I can see.
Note that in the case where a plain object -- not just an enum -- is a singleton, it could also be translated to a Java enum in bytecode. This is interesting because enums are particularly useful on the JVM for this case:
I have thought that top-level Scala objects (plain singletons) could under the covers all be encoded on the JVM as enums, avoiding all sorts of messiness in the process. It won't work for non-singleton cases, like objects nested inside instances.
So as this approaches the time when the bytecode encoding to enums is considered, consider it for simple top-level objects too.
```scala
enum ThereCanBe {
  case OnlyOne
}
```
desugars to roughly the following, if I am reading things right:
```scala
sealed abstract class ThereCanBe
object ThereCanBe {
  val OnlyOne = new ThereCanBe {}
}
```
Which can be encoded as a Java enum:
```java
enum ThereCanBe {
    OnlyOne;
}
```
which isn't that different from an ordinary object singleton
```scala
object ThereCanBe {
  val stuff = "stuff"
}
```
Which encodes as:
```java
enum ThereCanBe {
    $Instance;

    private final String stuff = "stuff";

    public String stuff() {
        return stuff;
    }
}
```
After all, an enum with only one valid value is a singleton, and so is a top level object.
I have made one more tweak: enum utility methods are emitted only if there are some singleton cases and the enum class is not generic. This avoids generation of utility methods for types such as List and Option. The general rationale is that, if the enum class is generic, the utility methods would lose type precision. E.g. List.enumValues would return a Seq[List[_]], which is not that useful.
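A hand-written sketch of the problem (the names `Opt`, `Sm`, `Nn` are illustrative, not from the proposal): for a generic enum class, any single values collection can only be typed with a wildcard, so the element type is lost at the use site.

```scala
sealed trait Opt[+T]
final case class Sm[+T](x: T) extends Opt[T]
case object Nn extends Opt[Nothing]

// A hypothetical enumValues for the generic class Opt can at best be
// a Seq[Opt[_]]: the type parameter is existentially quantified away.
val enumValues: Seq[Opt[_]] = Seq(Sm(1), Sm("a"), Nn)

// Callers only see Opt[_]; recovering the parameter needs matches or
// casts, which is why utility methods are omitted for generic enums.
```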
@scottcarey Interesting idea, to encode singletons as enumerations. Maybe we can use this for the Scala translation, but we'd need a lot of experimentation to find out whether it's beneficial. Note that top-level objects are already heavily optimized.
Java (language) enumerations may not have a custom superclass, so `class C; object O extends C` is not expressible.
Otherwise, the encoding is pretty similar to scala objects.
```java
public enum Test {
    T1
}
```

compiles (per javap) to:

```
public final class p1.Test extends java.lang.Enum<p1.Test> {
  public static final p1.Test T1;

  private static final p1.Test[] $VALUES;

  public static p1.Test[] values();
    Code:
       0: getstatic     #1  // Field $VALUES:[Lp1/Test;
       3: invokevirtual #2  // Method "[Lp1/Test;".clone:()Ljava/lang/Object;
       6: checkcast     #3  // class "[Lp1/Test;"
       9: areturn

  public static p1.Test valueOf(java.lang.String);
    Code:
       0: ldc           #4  // class p1/Test
       2: aload_0
       3: invokestatic  #5  // Method java/lang/Enum.valueOf:(Ljava/lang/Class;Ljava/lang/String;)Ljava/lang/Enum;
       6: checkcast     #4  // class p1/Test
       9: areturn

  private p1.Test();
    Code:
       0: aload_0
       1: aload_1
       2: iload_2
       3: invokespecial #6  // Method java/lang/Enum."<init>":(Ljava/lang/String;I)V
       6: return

  static {};
    Code:
       0: new           #4  // class p1/Test
       3: dup
       4: ldc           #7  // String T1
       6: iconst_0
       7: invokespecial #8  // Method "<init>":(Ljava/lang/String;I)V
      10: putstatic     #9  // Field T1:Lp1/Test;
      13: iconst_1
      14: anewarray     #4  // class p1/Test
      17: dup
      18: iconst_0
      19: getstatic     #9  // Field T1:Lp1/Test;
      22: aastore
      23: putstatic     #1  // Field $VALUES:[Lp1/Test;
      26: return
}
```
@odersky One really useful feature in Java is EnumSet, which allows efficient grouping of enum values; I hope it can be considered.
https://docs.oracle.com/javase/7/docs/api/java/util/EnumSet.html
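In Scala, a comparable facility could be sketched on top of case ordinals with an immutable BitSet (purely illustrative; nothing like this is part of the proposal, and the `ordinal` field is hand-rolled here):

```scala
import scala.collection.immutable.BitSet

// Hand-written enum-like hierarchy with explicit ordinals.
sealed abstract class Color(val ordinal: Int)
case object Red   extends Color(0)
case object Green extends Color(1)
case object Blue  extends Color(2)

// An EnumSet-like wrapper: membership is a single bit test on the ordinal.
final case class ColorSet(bits: BitSet) {
  def contains(c: Color): Boolean = bits(c.ordinal)
  def +(c: Color): ColorSet = ColorSet(bits + c.ordinal)
}

object ColorSet {
  def apply(cs: Color*): ColorSet = ColorSet(BitSet(cs.map(_.ordinal): _*))
}

val warm = ColorSet(Red)
```

Like Java's EnumSet, this gives constant-time membership and compact storage, at the cost of maintaining ordinals by hand.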
There was a late change in the proposal. It now demands that all type parameters of cases are given explicitly, following @LPTK's proposal of nominal correspondence in that respect. It's more verbose but also makes it clearer what happens. Example: Previously, you wrote:
```scala
enum Option[+T] {
  case Some(x: T)
  case None
}
```
Now you have to write:
```scala
enum Option[+T] {
  case Some[+T](x: T)
  case None
}
```
@odersky what are the benefits of this?
@notxcain, fixing oddities in scoping rules. Like those: https://github.com/lampepfl/dotty/issues/1970#issuecomment-284025913
Were the significant-whitespace proposal to be accepted, then it would appear that case could be omitted, since case lines would share the same level of indentation:
```scala
enum Option[+T]
  Some[+T](x: T)
  None
```
IOW, the first level of indentation in an enum block would imply case Type. Would love to see the same for match (and pattern matching in general). Basically omit the case requirement from the language entirely.
@retronym Also note that the Java encoding of enums sets a special flag on the generated class, ACC_ENUM. This triggers a lot of special handling in the JVM. Constructors can't be called, even with reflection or Unsafe. Serialization is automatic and cannot be overridden.
From JLS 8.9
> An enum type has no instances other than those defined by its enum constants. It is a compile-time error to attempt to explicitly instantiate an enum type (§15.9.1).
>
> The final clone method in Enum ensures that enum constants can never be cloned, and the special treatment by the serialization mechanism ensures that duplicate instances are never created as a result of deserialization. Reflective instantiation of enum types is prohibited. Together, these four things ensure that no instances of an enum type exist beyond those defined by the enum constants.
However, although there are no standard ways to break the guarantee above, and the well known Unsafe tricks won't work, there are some holes in Oracle's JVM via internal com.sun reflection packages:
http://jqno.nl/post/2015/02/28/hacking-java-enums/
tl;dr -- the bytecode emitted can potentially leverage ACC_ENUM in certain cases to gain a tighter guarantee that singleton enums are actually singletons (per classloader, of course).
This is very exciting! I wonder if we could allow the following syntax (proposed here as implicit case) to make writing Shapeless-style typelevel functions easier:
```scala
/** Typelevel function to compute the index of the first occurrence of type [[X]] in [[L]]. */
enum IndexOf[L <: HList, X] extends DepFn0 {
  type Out <: Nat
  type Aux[L <: HList, X, N <: Nat] = IndexOf[L, X] { type Out = N }

  implicit case IndexOf0[T <: HList, X] extends Aux[X :: T, X, _0] {
    type Out = _0
    def apply() = Nat._0
  }

  implicit case IndexOfN[H, T <: HList, X, I <: Nat](implicit i: Aux[T, X, I])
      extends Aux[H :: T, X, Succ[I]] {
    type Out = Succ[I]
    def apply() = Succ[I]
  }
}
```
I've scanned around this topic looking for a clear rationale on why the suggestion is to have three separate constructs (enum, enum class, and [enum] object). What are the benefits of this compared to something like described here?

edit: This would also avoid the "scope crossing" type parameters mentioned here
enum was merged a while ago so closing this issue, http://contributors.scala-lang.org/ is a better place to have follow-up discussions.
This looks great!
I don't think the long form is an improvement, though. The `case` keyword is all you need to disambiguate the members of the ADT from other stuff.

I don't see any issues here. I agree with Stefan that generics should be handled automatically by default, with a missing type parameter filled in as Nothing if the type is not referenced. If you want something else, you can do it explicitly.