Dotty: Unsoundness from not-yet-initialized paths — no null-involved

Created on 5 Feb 2019 · 12Comments · Source: lampepfl/dotty

Here's an unsoundness hole I wasn't aware about, and I can't find in the issue tracker — it involves initialization, but not null. Instead of using null, we can just select an ill-bounded type from a not-yet-initialized path that's going to crash/loop when we get there — so we get a ClassCastException instead of an abort (exception) or loop.

scala> new {
     |   val a: String = (((1: Any): b.A): Nothing): String
     |   val b: { type A >: Any <: Nothing } = ???
     | }
java.lang.ClassCastException: java.lang.Integer cannot be cast to scala.runtime.Nothing$
    at rs$line$1$$anon$1.<init>(rs$line$1:2)

scala> new {
     |   val a: String = (((1: Any): b.A): Nothing): String
     |   def loop(): Nothing = loop()
     |   val b: { type A >: Any <: Nothing } = loop()
     | }
     |
java.lang.ClassCastException: java.lang.Integer cannot be cast to scala.runtime.Nothing$
    at rs$line$4$$anon$1.<init>(rs$line$4:2)
    at rs$line$4$.<init>(rs$line$4:5)

One can verify that commenting out val a gives code that involves no nulls.

Discussion

We've known for a while that we can inhabit types with bad bounds with expressions that don't return a value — val v { type A >: Any <: Nothing } = failing/looping. But we always reasoned that v.A would only be available after the crash. The above examples show a loophole we missed.

This loophole is outside the taxonomy of "scaling-up" loopholes in https://www.scala-lang.org/blog/2016/02/17/scaling-dot-soundness.html.

And it doesn't appear in any known DOT variant that I'm familiar with, because it requires both paths and potentially diverging initialization expressions. If this issue is indeed novel to us, I'll ask other DOT researchers.

Above, the type of b has obvious bad bounds, but I assume we can give it non-obvious bad bounds (as usual) to get the same error.

I can see two likely sound potential fixes, but they are not yet realistic.

checking strong normalization for value members. Likely sound, pretty restrictive. This could "work" if we required programmers to use some escape hatch (@total) when we can't check totality ourselves — the main benefit would be that @total documents the danger.
forbidding paths starting in this.f before f is initialized, following @liufengyun et al.'s safe initialization work. I'd expect this to be pretty restrictive, but I don't know.
We could also declare these errors out-of-scope.

Regarding 2: In fact, I realized this problem by comparing DOT's and @liufengyun's typing rules for object creation: one declares all fields to be instantly initialized and accessible in the typing context when typing other fields (which is valid if initializers are syntactic values), the other only has in context the actually initialized fields.

initialization bug

Source

Blaisorblade

👍2

Most helpful comment

The problem is not one of initialization. The problem is that the type b.A is in scope.

Marianna and I have a similar example with a lazy val that is never evaluated:

object nullless {
  trait LowerBound[T] {
    type M >: T;
  }
  trait UpperBound[U] {
    type M <: U;
  }

  lazy val nothing : Nothing = nothing

  class Box[V](val v: V)
  lazy val box : Box[UpperBound[String] & LowerBound[Int]] = new Box(nothing)
  def upcast(t : box.v.M) : box.v.M = t

  def main(args : Array[String]) : Unit = {
    val zero : String = upcast(0)
    println("...")
  }
}

Notice that at runtime, nothing and box are never needed, so they are not evaluated. Yet box.v.M is allowed to be used as a type in upcast, which causes the unsoundness.

olhotak on 5 Feb 2019

👍2

All 12 comments

/cc @alexknvl @olhotak @amaurremi @namin @tiarkrompf @odersky

Blaisorblade on 5 Feb 2019

Assume we have a working initialization checker. Is the following true:

Every program falling into this soundness hole will be rejected by the initialization checker.

That's what I assume, but I want to be sure.

odersky on 5 Feb 2019

@odersky That makes sense if we change the initialization checker for it and accept the false negatives? That's option 2 above.

The checker would have to reject this.b.A even though this doesn't actually use this.b at runtime. I guess the checker didn't, but haven't tried. @liufengyun might know better?

Also not sure what should happen with more useful applications like (hierarchical) cakes, and implications for existing code. https://gist.github.com/Blaisorblade/23b9eae6df46bad3f551e51407c9286d

Blaisorblade on 5 Feb 2019

@odersky For instance, from the above gist, should the initialization checker reject this code? It uses t.Exp before t is initialized, and t.Exp could have bad bounds...

trait Eval { outer =>
  val t: Trees
  val n: Normalize { val t: outer.t.type }
  type Val
  val eval: t.Exp => Val
  val normalizingEval: t.Exp => Val =
    exp => eval(n.normalize(exp))
}

Blaisorblade on 5 Feb 2019

The problem is not one of initialization. The problem is that the type b.A is in scope.

Marianna and I have a similar example with a lazy val that is never evaluated:

object nullless {
  trait LowerBound[T] {
    type M >: T;
  }
  trait UpperBound[U] {
    type M <: U;
  }

  lazy val nothing : Nothing = nothing

  class Box[V](val v: V)
  lazy val box : Box[UpperBound[String] & LowerBound[Int]] = new Box(nothing)
  def upcast(t : box.v.M) : box.v.M = t

  def main(args : Array[String]) : Unit = {
    val zero : String = upcast(0)
    println("...")
  }
}

Notice that at runtime, nothing and box are never needed, so they are not evaluated. Yet box.v.M is allowed to be used as a type in upcast, which causes the unsoundness.

olhotak on 5 Feb 2019

👍2

Paolo's examples get a compile-time error in scalac:

<console>:15: error: lower bound Any does not conform to upper bound Nothing
          val b: { type A >: Any <: Nothing } = loop()

Dotty doesn't have the same check for the bounds of a type parameter. Even if Dotty had such a check, it could be subverted with intersection types, as in our example above.

Scalac does not have intersection types, so the check cannot be subverted that way. However, Scalac's check can be subverted with enough indirection in the chain of type bounds:

object bar {
  trait Outer {
    trait T {
      type A <: M
    }
    trait S {
      type B >: M
    }
    val t: T
    val s: S
    type M >: t.A <: s.B
  }
  trait Sub[L, U] extends Outer {
    override val t: T { type A = L }
    override val s: S { type B = U }
  }
  lazy val nothing: Nothing = nothing
  lazy val sub: Sub[Int, String] = nothing
  def upcast(t: sub.M): sub.M = t
  def main(args : Array[String]) : Unit = {
    val zero : String = upcast(0)
    println("...")
  }
}

Compiles with Scalac, and crashes with

java.lang.ClassCastException: java.lang.Integer cannot be cast to java.lang.String
    at bar$.main(bar.scala:21)
    at bar.main(bar.scala)

Dotty checks type bounds more thoroughly, recursively, so it detects the problem in the bar example:

-- Error: /tmp/bar.scala:11:9 --------------------------------------------------
11 |    type M >: t.A <: s.B
   |         ^
   |illegal cyclic reference: upper bound Outer.this.s.B of type M refers back to the type itself
-- [E007] Type Mismatch Error: /tmp/bar.scala:21:31 ----------------------------
21 |    val zero : String = upcast(0)
   |                               ^
   |                               Found:    Int(0)
   |                               Required: bar.sub.M
two errors found

But here is an example that exposes unsoundness in both Scalac and Dotty. It uses indirection to fool Scalac and intersection to fool Dotty:

object bar {
  trait Sub {
    type M
    type L <: M
    type U >: M
    type M2 >: L <: U
  }
  class Box[V](val v: V)

  class Caster[LL, UU] {
    trait S2 {
      type L = LL
      type U = UU
    }
    final lazy val nothing: Nothing = nothing
    final lazy val sub: S2 with Sub = nothing
    final lazy val box : Box[S2 with Sub] = new Box(nothing)
    def upcast(t: box.v.M2): box.v.M2 = t
  }
  def main(args : Array[String]) : Unit = {
    val zero : String = (new Caster[Int,String]()).upcast(0)
    println("...")
  }
}

This compiles in both Scalac and Dotty. In both cases, running it gives

java.lang.ClassCastException: java.lang.Integer cannot be cast to java.lang.String
    at bar$.main(bar.scala:21)
    at bar.main(bar.scala)

Note that the with is interpreted as a true intersection by Dotty, but not by Scalac, but Scalac is tricked instead by the chain of type bounds.

olhotak on 5 Feb 2019

@olhotak is the Box example the same as this https://github.com/lampepfl/dotty/issues/50#issuecomment-177736753 or is there something else going on? I thought this got fixed but haven't really followed.

TiarkRompf on 5 Feb 2019

Dotty rejects all of my examples with -strict.

olhotak on 5 Feb 2019

In Paolo's examples, if you change the val b to lazy val b, Dotty gives a compile error:

2 |   val a: String = (((1: Any): b.A): Nothing): String
  |                               ^
  |Object{A >: Any <: Nothing}(b) is not a legal path
  |since its underlying type Object{A >: Any <: Nothing}(b) has a member A with possibly conflicting bounds Any <: ... <: Nothing

The realizability check is being skipped because the val is not lazy. Running the realizability check for non-lazy vals would fix the original problem (make Dotty reject the examples in the first post).

olhotak on 5 Feb 2019

👍1

The problem is not one of initialization. The problem is that the type b.A is in scope.

What I mean: it seems okay to have b.A in scope after b is initialized, for the usual reasons (b then has evidence that b.A has good bounds).

Running the realizability check for non-lazy vals would fix the original problem (make Dotty reject the examples in the first post).

Yes, but doing that for all non-lazy vals seems too restrictive... But given an initialization check, we could allow paths that are either initialized or pass realizability.

Blaisorblade on 5 Feb 2019

@olhotak You have to compile this one with -strict to get the errors:

~/workspace/dotty/tests/neg> scalac nullless.scala -strict
-- Error: nullless.scala:11:21 -------------------------------------------------
11 |  def upcast(t : box.v.M) : box.v.M = t
   |                 ^^^^^
   |(nullless.UpperBound[String] & nullless.LowerBound[Int])(nullless.box.v) is not a legal path
   |since its underlying type nullless.Box[nullless.UpperBound[String] & nullless.LowerBound[Int]](nullless.
   |  box
   |) has a member value v which is not a legal path
   |since v: nullless.UpperBound[String] & nullless.LowerBound[Int] has a member M with possibly conflicting bounds Int <: ... <: String
-- Error: nullless.scala:11:32 -------------------------------------------------
11 |  def upcast(t : box.v.M) : box.v.M = t
   |                            ^^^^^
   |(nullless.UpperBound[String] & nullless.LowerBound[Int])(nullless.box.v) is not a legal path
   |since its underlying type nullless.Box[nullless.UpperBound[String] & nullless.LowerBound[Int]](nullless.
   |  box
   |) has a member value v which is not a legal path
   |since v: nullless.UpperBound[String] & nullless.LowerBound[Int] has a member M with possibly conflicting bounds Int <: ... <: String
two errors found

odersky on 6 Feb 2019

Big note to myself: Always compile with -strict when checking for soundness issues. I hunted quite a bit for why this passed until I realized I need to do that.

The reason for -strict is backwards compatibility, of course.

odersky on 6 Feb 2019

Was this page helpful?

0 / 5 - 0 ratings