Dotty: Scala Wart: For-comprehensions syntax restrictions

Created on 29 May 2017  路  8Comments  路  Source: lampepfl/dotty

Opening this issue, as suggested by Martin, to provide a place to discuss the individual warts brought up in the blog post Warts of the Scala Programming Language and the possibility of mitigating/fixing them in Dotty (and perhaps later in Scala 2.x). These are based on Scala 2.x behavior, which I understand Dotty follows closely, apologies in advance if it has already been fixed


As mentioned above, you cannot have defs, classes, or imperative
statements in the generators of a for-comprehension:

scala> for{
     |   i <- Seq(1)
     |   println(i)
     |   j <- Seq(2)
<console>:4: error: '<-' expected but ';' found.
  j <- Seq(2)
^

This is a rather arbitrary restriction, and as far as I can tell doesn't serve
any purpose, and forces you to put random _ = prefixes on your statements to
make things compile:

scala> for{
     |   i <- Seq(1)
     |   _ = println(i)
     |   j <- Seq(2)
     | } yield j
1
res0: Seq[Int] = List(2)

There really isn't any reason that this shouldn't work out-of-the-box, and
convert say:

for{
  i <- Seq(1)
  def debug(s: Any) = println("Debug " + s)
  debug(i)
  j <- Seq(2)
  debug(j)
  k <- Seq(3)
} yield i + j + k

Into

Seq(1).flatMap{ i => 
  def debug(s: Any) = println("Debug " + s)
  debug(i)
  Seq(2).flatMap{ j =>
    debug(j)
    Seq(3).map{ k => 
      i + j + k
    }
  }
}
revisit

Most helpful comment

see #1982, too

All 8 comments

see #1982, too

One question is whether it would be more useful to make this behave more like do notation in Haskell-land. Rather than being equivalent to _ = ${expr}, bare expressions would be equivalent to _ <- ${expr}.

That is, a more useful variant of bare words for our use cases would be to change:

for {
  _ <- uploadFile(filename, ...)
  _ <- validateFile(filename, ...)
  dbRow <- registerFile(filename, ...)
} yield dbRow

where uploadFile, validateFile and registerFile return Future[_] - so that it would instead look like:

for {
  uploadFile(filename, ...)
  validateFile(filename, ...)
  dbRow <- registerFile(filename, ...)
} yield dbRow

The only reason I mention this here is because it gives an alternate semantics to 'bare expressions' at the root level of a for expression.

I don't dispute that it's occasionally useful to be able to run simple imperative expressions in the middle of a for comprehension (indeed debugging is the most common reason I've done this), but far more often than that I would like to be able to recommend to new Scala developers on my team that when it comes to Future (or Option, or Try, or ...) composition they simply write:

for {
  callFirstFuture(...)
  callSecondFuture(...)
  result <- callThirdFuture(...)
} yield result

Which desugars to:

callFirstFuture(...).flatMap{ _ => 
  callSecondFuture(...).flatMap{ _ =>
    callThirdFuture(...).map{ result => 
      result
    }
  }
}

@tekacs yeah I wouldn't mind that semantic either. Both are possibly useful, and we could possibly allow both if we differentiate them via some syntax. Allowing bare expressions to be equivalent to _ <- would clean up a lot of nasty monadic code:

https://scala.fluentcode.com/?squery=_%20%3C-#L22

def apply(byUserId: User.ID, study: Study, invitedUsername: String, socket: ActorRef): Funit = for {
    _ <- !study.isOwner(byUserId) ?? fufail[Unit]("Only study owner can invite")
    _ <- (study.nbMembers >= maxMembers) ?? fufail[Unit](s"Max study members reached: $maxMembers")
    inviter <- UserRepo.named(byUserId) flatten "No such inviter"
    invited <- UserRepo.named(invitedUsername) flatten "No such invited"
    _ <- study.members.contains(invited) ?? fufail[Unit]("Already a member")
    relation <- getRelation(invited.id, byUserId)
    _ <- relation.has(Block) ?? fufail[Unit]("This user does not want to join")
    isPresent <- socket ? HasUserId(invited.id) mapTo manifest[Boolean]
    _ <- if (isPresent) funit else getPref(invited).map(_.studyInvite).flatMap {
      case Pref.StudyInvite.ALWAYS => funit
      case Pref.StudyInvite.NEVER => fufail("This user doesn't accept study invitations")
      case Pref.StudyInvite.FRIEND =>
        if (relation.has(Follow)) funit
        else fufail("This user only accept study invitations from friends")
    }
    _ <- studyRepo.addMember(study, StudyMember make invited)
    shouldNotify = !isPresent && (!inviter.troll || relation.has(Follow))
    _ <- shouldNotify ?? {
      val notificationContent = InvitedToStudy(
        InvitedToStudy.InvitedBy(inviter.id),
        InvitedToStudy.StudyName(study.name.value),
        InvitedToStudy.StudyId(study.id.value)
      )
      val notification = Notification.make(Notification.Notifies(invited.id), notificationContent)
      notifyApi.addNotification(notification)
    }
  } yield ()

```scala
def inout(): Unit = {
def calc(in: MVar[Int], out: MVar[Int]): IO[Unit] =
for {
a <- in.take
b <- in.take
_ <- out.put(a * b)
} yield ()

val io =
  for {
    in  <- newMVar(6)
    out <- newEmptyMVar[Int]
    _   <- forkIO(calc(in, out))
    _   <- in.put(7)
    a   <- out.take
  } yield a
assert(io.unsafePerformIO === 42)

}

```scala
        for {
          _ <- createTable(journalStatements, tableExists(journalCfg.schemaName, journalCfg.tableName))
          _ <- createTable(snapshotStatements, tableExists(snapshotCfg.schemaName, snapshotCfg.tableName))
        } yield Done.getInstance()

@lihaoyi Yup definitely! I originally added some extra examples like the above to my post and then dropped them (I wanted to keep to the 'confusion for newcomers' theme and Future/Task/... composition has been a real pain point for newcomers I've brought on board).

That said it would be much easier for me to write monadic APIs like the above in good faith if I knew downstream users could consume them with this less baroque syntax.

:+1: for allowing both _ <- and _ = using some syntax, so long as the result is still easy to read and write.

The main reason for not allowing imperative statements in for-comprehensions is that it would render the grammar ambiguous when it comes to LL parsing (which is the technique of choice of scalac and dotty). If you see

for {
   x <- xs
   E

then E could either be an expression or a pattern, and you find that out only once you have parsed E. If E is followed by <-, it must be a pattern, otherwise it must be an expression. This ambiguity would complicate the recursive descent parser quite a bit. Overall, I don't think the in my mind very modest gain in convenience is worth messing up the parser to that degree.

So, I would not classify this as a wart according to the original definition because there's a good reason after all why things are done the way they are.

then E could either be an expression or a pattern, and you find that out only once you have parsed E. If E is followed by <-, it must be a pattern, otherwise it must be an expression. This ambiguity would complicate the recursive descent parser quite a bit.

Don't we already have such ambiguity? If the parser sees

foo{ (x: Int, y: String)

We don't know if x and y are a tuple expression, or the LHS of a lambda. Perhaps we don't want more ambiguity, but I don't think the ambiguity causes any problem for users. IIRC I had to deal with precisely this bit of ambiguity in ScalaParse to avoid worst-case-exponential parsing behaviour, but I never see any users confused about this.

Given how much I see _ <-s in Scala code "in the wild" (some copied above), I'd argue it's a non-trivial gain in convenience that perhaps might be worth the internal complexity, but that's of course a subjective opinion

Don't we already have such ambiguity? If the parser sees
foo{ (x: Int, y: String)
We don't know if x and y are a tuple expression, or the LHS of a lambda.

Correct. It's handled in a very ad-hoc and somewhat messy way. We parse it as an expression, and then when we see the => we try to convert the tree resulting from an expression to a parameter list. We could do the same for patterns, but I believe that would be much harder, since patterns have much more complex syntax than parameter lists and are at the same time subtly different from expressions.

EDIT: It's actually even worse than that. For parameter lists, we know that the syntax of lambda parameter lists is a sublanguage of the syntax of expressions (modulo first implicit modifier, which we handle with a one-token lookahead). So we can parse expressions and then try to convert to parameter lists. But the syntax of patterns is not a sublanguage of the syntax of expressions. So we'd have to invent a new syntactic category and parser for the union of expressions and patterns. That's a significant complication!

I don't see a lot of appetite to tackle this, since it would require major changes to the principle and implementation of parsing. So it would be a high effort project that also comes with high risk of failure.

Was this page helpful?
0 / 5 - 0 ratings