Dotty: Verify HKT GADT code

Created on 25 May 2020  Â·  12Comments  Â·  Source: lampepfl/dotty

Minimized code

sealed trait Test[+F[_], +A] extends Product with Serializable

object Test {

  implicit class Syntax[F[_], A](val self: Test[F, A]) extends AnyVal {

    def fold[B](completed: F[A] => B): B = self match {
      case Completed(fa) => completed(fa)
    }
  }


  final case class Completed[F[_], A](fa: F[A]) extends Test[F, A]
}

Output

Dotty 0.24.0-RC1:

[error] -- [E007] Type Mismatch Error: /Users/daniel/Development/Scala/cats-effect/core/src/main/scala/cats/effect/Test.scala:26:38 
[error] 26 |      case Completed(fa) => completed(fa)
[error]    |                                      ^^
[error]    |                 Found:    (fa : F$1[A$1])
[error]    |                 Required: F[A]
[error]    |
[error]    |                 where:    A$1 is a type in method fold with bounds <: A
[error]    |                           F$1 is a type in method fold with bounds <: F

Dotty master:

-- [E029] Pattern Match Exhaustivity Warning: try/dj.scala:7:43 ----------------
7 |    def fold[B](completed: F[A] => B): B = self match {
  |                                           ^^^^
  |                          match may not be exhaustive.
  |
  |                          It would fail on pattern case: Test.Completed(_)

longer explanation available when compiling with `-explain`
-- Warning: try/dj.scala:3:7 ---------------------------------------------------
3 |object Test {
  |       ^
  |       the type test for Test.Completed[?, ?] cannot be checked at runtime
2 warnings found

Expectation

I think 0.24.0-RC1 got it right, current master is missing the error and producing some weird warnings.

gadt typer bug

Most helpful comment

OK, I tried to break down the example into more elementary terms — something closer to a formal calculus — to make it more obvious (to me at least) where things should or shouldn't go wrong.

I'm starting from the following simplified example:

object Test {
  sealed trait Test[+F[_], +A] extends Product with Serializable
  final case class Completed[F[_], A](fa: F[A]) extends Test[F, A]
  def foo[F[_], A, B](x: Test[F, A], completed: F[A] => B): B = x match {
    case Completed(fa) => completed(fa)
  }
}

When I run this in scastie I get the 0.24.0-RC1 error. (Aside: what's the easiest way to get an up-to-date dotty shell on MacOS these days? Is there a brew package?)

Now let's try to translate this into something DOT like:

/* Define the interface first... */
trait TestIntf {
  trait Test[+F[_], +A] {            // should be type `type Test[+F[_], +A] <: ...` 
    type CompletedF[X] <: F[X]       // but dotty doesn't like structural records.
    type CompletedA    <: A
    def elim[B](f: CompletedF[CompletedA] => B): B
  }
  type Completed[F[_], A] <: Test[F, A];
}

/* ... then, separately, the implementation. */
object TestImpl extends TestIntf {
  type Completed[F[_], A] = Test[F, A] {      // should be `... = Test[F, A] & ...`
    type CompletedF[X] = F[X]
    type CompletedA    = A
    def elim[B](f: CompletedF[CompletedA] => B): B
  }

  def Completed[F[_], A](fa: F[A]): Completed[F, A] = new Test[F, A] {
    type CompletedF[X] = F[X]
    type CompletedA    = A
    def elim[B](f: CompletedF[CompletedA] => B): B = f(fa)
  }

  def foo[F[_], A, B](x: Test[F, A], completed: F[A] => B): B = x.elim {
      fa => completed(fa)
    }
}

Note that I separated interface from implementation so that we can see what the constraints are that we set up and where the implementation breaks those.

I tried to get rid of pattern matching, traits and refinement, but unfortunately, dotty won't quite accept my bare-bones (HK-)DOT code because it doesn't like raw structural records (at least not with polymorphic methods in it). This is why Test is still a trait (rather than just an abstract type with an upper bound). But Completed is completely abstract.

Note that there is no pattern matching in this code. Instead, there is an abstract elim method in Test that has to be implemented by the concrete "subclasses" to implement pattern matching for the various constructors. Remember that Test is a sealed trait, so we know all it's constructors. When that's the case, we can always define an eliminator, i.e. a method that takes a bunch of functions arguments that correspond to the various clauses in a pattern match. In a more complex setting, elim would take one argument for each constructor of Test, each of which would be a function that takes as its inputs the argument of that constructor. Here there's only one such constructor (namely Completed) so elim only takes one function (for scrutinizing the fa argument of Completed). Because the type parameters F and A don't necessarily correspond to the actual parameters in subclasses (remember this is a GADT), we need to also include two abstract types that represent the eventual concrete type parameters in Completed. Because F and A are covariant type arguments, they are upper bounds of the eventual types. Alternatively, these abstract types could have been type parameters of elim itself, but I think the current representation is more intuitive.

The implementation is then quite straightforward: we fix a concrete representation for Completed and implement its constructor. Again, I need to cheat a bit and use a refinement instead of new and a bare structural type.

The crucial bit is the implementation of foo, which now uses elim instead of match. And here we can see the culprit: we're trying to pass an instance of F[A] to f, which expects an instance of Completed$F[Completed$A]. We know that Completed$F <: F and Completed$A <: A but that's not helpful here. Firstly, we'd need the constraint to go the other way; secondly, we'd need F and Complete$F to be covariant for things to work as expected.

I don't know whether this clarifies much.

I guess the more low-level version exposes some tacit assumptions that are not plainly visible in the original. For example that there are constraints between the type parameters of Test and Completed, or that pattern matching introduces some contravariance. In any case, I'm now pretty convinced that the original error is correct.

Let me know if this helped.

All 12 comments

Hmm sorry, this is about GADT, not about exhaustivity checking, there might be some exhaustivity checking issue but they're not what's preventing the initial error.

@smarter seems like letting GADT refinements be used for HKTs didn't go so well. Do you suppose there's anything I could read on how subtyping is supposed to work for HKTs in presence of variance before I fiddle with that code again?

Do you suppose there's anything I could read on how subtyping is supposed to work for HKTs in presence of variance before I fiddle with that code again?

Like a write-up or paper? I don't think that exists.

I haven't read it yet, but maybe the closest thing is @sstucki's thesis.

I don't think variance was treated in Sandro's thesis (turns out things are already really complicated without it!)

No, unfortunately I did not manage to include variances in my thesis. Adding them is WIP but turned out to be much harder than I thought. The issue is that having variances, currying and dependent types in the same calculus destroys some nice properties that one normally uses to model such theories. For the time being, it's an open problem to combine these, to the best of my knowledge.

For many instances, however, we don't need to worry about dependent kinds (in particular if no type bounds are involved). And for that case, we have a theory. See for example the following paper by Andreas Abel (you can safely ignore the bit about sized types if you're only interested in variances and HK types):

Andreas Abel, _Polarized Subtyping for Sized Types_. MSCS, vol. 18, pp. 797-822, special issue on subtyping. Cambridge University Press, 2008. PDF

Hum, the interplay between variances and refinement here hurts my head. I'm honestly not sure how this would be desugared into any formal system I know of.

But I think @smarter is right that the 0.24.0-RC1 behavior is correct, because Test is covariant in it's arguments, we cannot assume that the F and A from the Complete instance are the same as the parameters passed to the Test type constructor.

If you give me a few days, I might be able to get back to you with a more definitive answer.

In the meantime, read Abel's paper y'all. It's really nice!

OK, I tried to break down the example into more elementary terms — something closer to a formal calculus — to make it more obvious (to me at least) where things should or shouldn't go wrong.

I'm starting from the following simplified example:

object Test {
  sealed trait Test[+F[_], +A] extends Product with Serializable
  final case class Completed[F[_], A](fa: F[A]) extends Test[F, A]
  def foo[F[_], A, B](x: Test[F, A], completed: F[A] => B): B = x match {
    case Completed(fa) => completed(fa)
  }
}

When I run this in scastie I get the 0.24.0-RC1 error. (Aside: what's the easiest way to get an up-to-date dotty shell on MacOS these days? Is there a brew package?)

Now let's try to translate this into something DOT like:

/* Define the interface first... */
trait TestIntf {
  trait Test[+F[_], +A] {            // should be type `type Test[+F[_], +A] <: ...` 
    type CompletedF[X] <: F[X]       // but dotty doesn't like structural records.
    type CompletedA    <: A
    def elim[B](f: CompletedF[CompletedA] => B): B
  }
  type Completed[F[_], A] <: Test[F, A];
}

/* ... then, separately, the implementation. */
object TestImpl extends TestIntf {
  type Completed[F[_], A] = Test[F, A] {      // should be `... = Test[F, A] & ...`
    type CompletedF[X] = F[X]
    type CompletedA    = A
    def elim[B](f: CompletedF[CompletedA] => B): B
  }

  def Completed[F[_], A](fa: F[A]): Completed[F, A] = new Test[F, A] {
    type CompletedF[X] = F[X]
    type CompletedA    = A
    def elim[B](f: CompletedF[CompletedA] => B): B = f(fa)
  }

  def foo[F[_], A, B](x: Test[F, A], completed: F[A] => B): B = x.elim {
      fa => completed(fa)
    }
}

Note that I separated interface from implementation so that we can see what the constraints are that we set up and where the implementation breaks those.

I tried to get rid of pattern matching, traits and refinement, but unfortunately, dotty won't quite accept my bare-bones (HK-)DOT code because it doesn't like raw structural records (at least not with polymorphic methods in it). This is why Test is still a trait (rather than just an abstract type with an upper bound). But Completed is completely abstract.

Note that there is no pattern matching in this code. Instead, there is an abstract elim method in Test that has to be implemented by the concrete "subclasses" to implement pattern matching for the various constructors. Remember that Test is a sealed trait, so we know all it's constructors. When that's the case, we can always define an eliminator, i.e. a method that takes a bunch of functions arguments that correspond to the various clauses in a pattern match. In a more complex setting, elim would take one argument for each constructor of Test, each of which would be a function that takes as its inputs the argument of that constructor. Here there's only one such constructor (namely Completed) so elim only takes one function (for scrutinizing the fa argument of Completed). Because the type parameters F and A don't necessarily correspond to the actual parameters in subclasses (remember this is a GADT), we need to also include two abstract types that represent the eventual concrete type parameters in Completed. Because F and A are covariant type arguments, they are upper bounds of the eventual types. Alternatively, these abstract types could have been type parameters of elim itself, but I think the current representation is more intuitive.

The implementation is then quite straightforward: we fix a concrete representation for Completed and implement its constructor. Again, I need to cheat a bit and use a refinement instead of new and a bare structural type.

The crucial bit is the implementation of foo, which now uses elim instead of match. And here we can see the culprit: we're trying to pass an instance of F[A] to f, which expects an instance of Completed$F[Completed$A]. We know that Completed$F <: F and Completed$A <: A but that's not helpful here. Firstly, we'd need the constraint to go the other way; secondly, we'd need F and Complete$F to be covariant for things to work as expected.

I don't know whether this clarifies much.

I guess the more low-level version exposes some tacit assumptions that are not plainly visible in the original. For example that there are constraints between the type parameters of Test and Completed, or that pattern matching introduces some contravariance. In any case, I'm now pretty convinced that the original error is correct.

Let me know if this helped.

@sstucki thanks for the detailed rundown. AFAIK your understanding of GADTs in terms of type members refined during pattern matching is precisely how the compiler is _supposed to_ reason about them — see also our short paper at the Scala symposium (Towards Improved GADT Reasoning in Scala). So this reasoning was apparently made buggy by recent changes in constraints representation/handling.

Excellent, thanks for the pointer!

A patchwork fix was applied in https://github.com/lampepfl/dotty/pull/9140. It turned out that the culprit was GADT member lookup healing + Dotty idiosyncracies. Still, I'm leaving this issue open, since I do want to take a look into the original code _at some point_ and figure out whether it's sound or not.

@sstucki thanks for your detailed answer! As Lionel said, this looks close to how I tried to think about GADTs and it definitely helps to have confirmation from another angle that this kind of thinking makes sense.

Was this page helpful?
0 / 5 - 0 ratings

Related issues

julienrf picture julienrf  Â·  3Comments

dwijnand picture dwijnand  Â·  3Comments

noti0na1 picture noti0na1  Â·  3Comments

ohze picture ohze  Â·  3Comments

m-sp picture m-sp  Â·  3Comments