Here's an unsoundness hole I wasn't aware about, and I can't find in the issue tracker — it involves initialization, but not null. Instead of using null, we can just select an ill-bounded type from a not-yet-initialized path that's going to crash/loop when we get there — so we get a ClassCastException instead of an abort (exception) or loop.
scala> new {
| val a: String = (((1: Any): b.A): Nothing): String
| val b: { type A >: Any <: Nothing } = ???
| }
java.lang.ClassCastException: java.lang.Integer cannot be cast to scala.runtime.Nothing$
at rs$line$1$$anon$1.<init>(rs$line$1:2)
scala> new {
| val a: String = (((1: Any): b.A): Nothing): String
| def loop(): Nothing = loop()
| val b: { type A >: Any <: Nothing } = loop()
| }
|
java.lang.ClassCastException: java.lang.Integer cannot be cast to scala.runtime.Nothing$
at rs$line$4$$anon$1.<init>(rs$line$4:2)
at rs$line$4$.<init>(rs$line$4:5)
One can verify that commenting out val a gives code that involves no nulls.
We've known for a while that we can inhabit types with bad bounds with expressions that don't return a value — val v { type A >: Any <: Nothing } = failing/looping. But we always reasoned that v.A would only be available after the crash. The above examples show a loophole we missed.
This loophole is outside the taxonomy of "scaling-up" loopholes in https://www.scala-lang.org/blog/2016/02/17/scaling-dot-soundness.html.
And it doesn't appear in any known DOT variant that I'm familiar with, because it requires both paths and potentially diverging initialization expressions. If this issue is indeed novel to us, I'll ask other DOT researchers.
Above, the type of b has obvious bad bounds, but I assume we can give it non-obvious bad bounds (as usual) to get the same error.
I can see two likely sound potential fixes, but they are not yet realistic.
@total) when we can't check totality ourselves — the main benefit would be that @total documents the danger.this.f before f is initialized, following @liufengyun et al.'s safe initialization work. I'd expect this to be pretty restrictive, but I don't know.Regarding 2: In fact, I realized this problem by comparing DOT's and @liufengyun's typing rules for object creation: one declares all fields to be instantly initialized and accessible in the typing context when typing other fields (which is valid if initializers are syntactic values), the other only has in context the actually initialized fields.
/cc @alexknvl @olhotak @amaurremi @namin @tiarkrompf @odersky
Assume we have a working initialization checker. Is the following true:
That's what I assume, but I want to be sure.
@odersky That makes sense if we change the initialization checker for it and accept the false negatives? That's option 2 above.
The checker would have to reject this.b.A even though this doesn't actually use this.b at runtime. I guess the checker didn't, but haven't tried. @liufengyun might know better?
Also not sure what should happen with more useful applications like (hierarchical) cakes, and implications for existing code. https://gist.github.com/Blaisorblade/23b9eae6df46bad3f551e51407c9286d
@odersky For instance, from the above gist, should the initialization checker reject this code? It uses t.Exp before t is initialized, and t.Exp could have bad bounds...
trait Eval { outer =>
val t: Trees
val n: Normalize { val t: outer.t.type }
type Val
val eval: t.Exp => Val
val normalizingEval: t.Exp => Val =
exp => eval(n.normalize(exp))
}
The problem is not one of initialization. The problem is that the type b.A is in scope.
Marianna and I have a similar example with a lazy val that is never evaluated:
object nullless {
trait LowerBound[T] {
type M >: T;
}
trait UpperBound[U] {
type M <: U;
}
lazy val nothing : Nothing = nothing
class Box[V](val v: V)
lazy val box : Box[UpperBound[String] & LowerBound[Int]] = new Box(nothing)
def upcast(t : box.v.M) : box.v.M = t
def main(args : Array[String]) : Unit = {
val zero : String = upcast(0)
println("...")
}
}
Notice that at runtime, nothing and box are never needed, so they are not evaluated. Yet box.v.M is allowed to be used as a type in upcast, which causes the unsoundness.
Paolo's examples get a compile-time error in scalac:
<console>:15: error: lower bound Any does not conform to upper bound Nothing
val b: { type A >: Any <: Nothing } = loop()
Dotty doesn't have the same check for the bounds of a type parameter. Even if Dotty had such a check, it could be subverted with intersection types, as in our example above.
Scalac does not have intersection types, so the check cannot be subverted that way. However, Scalac's check can be subverted with enough indirection in the chain of type bounds:
object bar {
trait Outer {
trait T {
type A <: M
}
trait S {
type B >: M
}
val t: T
val s: S
type M >: t.A <: s.B
}
trait Sub[L, U] extends Outer {
override val t: T { type A = L }
override val s: S { type B = U }
}
lazy val nothing: Nothing = nothing
lazy val sub: Sub[Int, String] = nothing
def upcast(t: sub.M): sub.M = t
def main(args : Array[String]) : Unit = {
val zero : String = upcast(0)
println("...")
}
}
Compiles with Scalac, and crashes with
java.lang.ClassCastException: java.lang.Integer cannot be cast to java.lang.String
at bar$.main(bar.scala:21)
at bar.main(bar.scala)
Dotty checks type bounds more thoroughly, recursively, so it detects the problem in the bar example:
-- Error: /tmp/bar.scala:11:9 --------------------------------------------------
11 | type M >: t.A <: s.B
| ^
|illegal cyclic reference: upper bound Outer.this.s.B of type M refers back to the type itself
-- [E007] Type Mismatch Error: /tmp/bar.scala:21:31 ----------------------------
21 | val zero : String = upcast(0)
| ^
| Found: Int(0)
| Required: bar.sub.M
two errors found
But here is an example that exposes unsoundness in both Scalac and Dotty. It uses indirection to fool Scalac and intersection to fool Dotty:
object bar {
trait Sub {
type M
type L <: M
type U >: M
type M2 >: L <: U
}
class Box[V](val v: V)
class Caster[LL, UU] {
trait S2 {
type L = LL
type U = UU
}
final lazy val nothing: Nothing = nothing
final lazy val sub: S2 with Sub = nothing
final lazy val box : Box[S2 with Sub] = new Box(nothing)
def upcast(t: box.v.M2): box.v.M2 = t
}
def main(args : Array[String]) : Unit = {
val zero : String = (new Caster[Int,String]()).upcast(0)
println("...")
}
}
This compiles in both Scalac and Dotty. In both cases, running it gives
java.lang.ClassCastException: java.lang.Integer cannot be cast to java.lang.String
at bar$.main(bar.scala:21)
at bar.main(bar.scala)
Note that the with is interpreted as a true intersection by Dotty, but not by Scalac, but Scalac is tricked instead by the chain of type bounds.
@olhotak is the Box example the same as this https://github.com/lampepfl/dotty/issues/50#issuecomment-177736753 or is there something else going on? I thought this got fixed but haven't really followed.
Dotty rejects all of my examples with -strict.
In Paolo's examples, if you change the val b to lazy val b, Dotty gives a compile error:
2 | val a: String = (((1: Any): b.A): Nothing): String
| ^
|Object{A >: Any <: Nothing}(b) is not a legal path
|since its underlying type Object{A >: Any <: Nothing}(b) has a member A with possibly conflicting bounds Any <: ... <: Nothing
The realizability check is being skipped because the val is not lazy. Running the realizability check for non-lazy vals would fix the original problem (make Dotty reject the examples in the first post).
The problem is not one of initialization. The problem is that the type
b.Ais in scope.
What I mean: it seems okay to have b.A in scope after b is initialized, for the usual reasons (b then has evidence that b.A has good bounds).
Running the realizability check for non-lazy vals would fix the original problem (make Dotty reject the examples in the first post).
Yes, but doing that for all non-lazy vals seems too restrictive... But given an initialization check, we could allow paths that are either initialized or pass realizability.
@olhotak You have to compile this one with -strict to get the errors:
~/workspace/dotty/tests/neg> scalac nullless.scala -strict
-- Error: nullless.scala:11:21 -------------------------------------------------
11 | def upcast(t : box.v.M) : box.v.M = t
| ^^^^^
|(nullless.UpperBound[String] & nullless.LowerBound[Int])(nullless.box.v) is not a legal path
|since its underlying type nullless.Box[nullless.UpperBound[String] & nullless.LowerBound[Int]](nullless.
| box
|) has a member value v which is not a legal path
|since v: nullless.UpperBound[String] & nullless.LowerBound[Int] has a member M with possibly conflicting bounds Int <: ... <: String
-- Error: nullless.scala:11:32 -------------------------------------------------
11 | def upcast(t : box.v.M) : box.v.M = t
| ^^^^^
|(nullless.UpperBound[String] & nullless.LowerBound[Int])(nullless.box.v) is not a legal path
|since its underlying type nullless.Box[nullless.UpperBound[String] & nullless.LowerBound[Int]](nullless.
| box
|) has a member value v which is not a legal path
|since v: nullless.UpperBound[String] & nullless.LowerBound[Int] has a member M with possibly conflicting bounds Int <: ... <: String
two errors found
Big note to myself: Always compile with -strict when checking for soundness issues. I hunted quite a bit for why this passed until I realized I need to do that.
The reason for -strict is backwards compatibility, of course.
Most helpful comment
The problem is not one of initialization. The problem is that the type
b.Ais in scope.Marianna and I have a similar example with a lazy val that is never evaluated:
Notice that at runtime,
nothingandboxare never needed, so they are not evaluated. Yetbox.v.Mis allowed to be used as a type inupcast, which causes the unsoundness.