Cats: `cats.Traverse` should implement `cats.Foldable`

Created on 6 Feb 2015  ·  23Comments  ·  Source: typelevel/cats

enhancement ready

Most helpful comment

Hey @LukaJCB, a fellow redditor helped me out : my foldLeft was actually folding right 😅 . Here's what I ended up with, laws are all green : https://gist.github.com/Baccata/0b972953cd65421d6a475b37aadad65b#file-foldablefromtraverse-scala

All 23 comments

@adelbertc could you please give some more detail? Traverse extends Foldable source, so it _does_ implement Foldable, doesn't it?

Hm looks like the current definition of Foldable requires foldLeft and foldRight, a bit different from Scalaz's. Anyways, can't Traverse implement Foldable in terms of the traverse operation by picking the Applicative to be Const? That gets you foldMap which I think gets you foldRight and by extension foldLeft. Though that probably isn't very performant?

The main issue is that Const is in data

The data module has been merged into the core module, so I think this should be fairly straightforward to address now.

I will do this as soon as PR #328 (Tests for traverse and foldable) is merged.

An issue I had when I tried tackling this was Foldable requires foldLeft and foldRight. The "usual" means of defining Foldable in terms of Traverse is to fix the Applicative to be Const[X : Monoid, ?] which gives you foldMap, from which you can derive fold{Left, Right}. However for efficiency and laziness reasons it's not so simple as just defining them as stated above.

I don't think it's possible to define foldRight [or partialFold which it requires] properly in terms of traverse as-is because we can't do a partial traversal.

Certainly we could make the default implementation eager (and override in the instances that support partial traversals).

Look for example at stream.scala - traverse is implemented by forcing a foldRight! There's no way this will not inspect the whole stream.

Does this indicate that Traverse should not imply Foldable? or do we split LazyFoldable out into a separate typeclass?

We actually have a partialFold method precisely to allow composition of lazy folds, check it out!

I'm not 100% sure if it will solve your problem, but I think it might.

No, I need to _implement_ partialFold from traverse to successfully resolve this - the point is to have a default Foldable instance given Traverse.

Aha, OK, I see what you mean. Sorry, ignore me!

I suppose you could have a Foldable[F[_]] that has strict semantics which Traverse[F[_]] can implement fully, and then have a LazyFoldable[F[_]] extends Foldable[F] for the nice lazy fold stuff? Seems a bit strange though.

I'm beginning to wonder if this is actually pretty closely related to #1473.

In that issue, @johnynek said:

I'm trying to think of some other ways around this, but, especially for StateT it seems hard to make a sufficiently lazy map2Eval for this to work.

Let's consider the case of Stream. Here is its Traverse implementation:

      override def traverse[G[_], A, B](fa: Stream[A])(f: A => G[B])(implicit G: Applicative[G]): G[Stream[B]] = {
        // We use foldRight to avoid possible stack overflows. Since
        // we don't want to return a Eval[_] instance, we call .value
        // at the end.
        foldRight(fa, Always(G.pure(Stream.empty[B]))){ (a, lgsb) =>
          G.map2Eval(f(a), lgsb)(_ #:: _)
        }.value
      }

As long as that G.map2Eval call is sufficiently lazy, the traverse is lazy.

I tried implementing my own lazy foldRight in terms of traverse, and it came out like this:

  def foldRightTemp[A, B](fa: F[A], lb: Eval[B])(f: (A, Eval[B]) => Eval[B]): Eval[B] = {
    val x: State[Eval[B], F[Unit]] = traverse[State[Eval[B], ?], A, Unit](fa)(a =>
      StateT.modify(lb => f(a, lb))
    )
    x.runS(lb).flatMap(identity)
  }

This compiles and is properly lazy on the lb input, but it traverses the entire structure. I'm thinking that if we could get a sufficiently lazy map2Eval for State then this would short-circuit the traversal. Or, if this isn't possible with map2Eval for StateT, perhaps something like what's being discussed in #1476 is the answer to this problem.

It's also possible that I just don't know what I'm talking about and that this can't be done.

Here's my opinion.

Let Traverse extend Foldable without using traverse to implement the fold operators. They'd be inefficient anyway, even if we could implement them. I think what we can see in your example is that we can't make State lazily traverse because we would miss state updates, because if the state update has been evaluated then so has the value (though if that lazy map2Eval is evaluated I'm proven wrong).

Hello everyone,

I'm current using recursion schemes patterns at work. I'm re-implementing the few functions the project uses (rather than depending on Matryoshka), using cats typeclasses in place of the Scalaz ones.

Recursion schemes rely on hand-made "pattern functors", which usually means the developer has to implement Functor/Traverse instances. The fact that the Traverse interface pushes me to provide implementations of the Foldable contract (when I don't necessarily need a Foldable) is somewhat annoying. In the sake of faster prototyping, I decided to provide my own default TraverseFoldable abstract class which my Traverse instances can depend on, and which implements foldLeft and foldRight in terms of traverse.

Here's what I have atm:

abstract class TraverseFoldable[F[_]] extends Traverse[F] {

  override def foldMap[A, B: Monoid](fa: F[A])(f: A => B): B =
    traverse[Const[B, ?], A, B](fa)(a => Const(f(a))).getConst

  private def endomorphismMonoid[A]: Monoid[A => A] = new Monoid[A => A] {
    def combine(f: A => A, g: A => A) = f compose g
    def empty: A => A                 = (a: A) => a
  }

  private def defer[B](f: Eval[B] => Eval[B]): Eval[B] => Eval[B] = evalB => Eval.defer(f(evalB))

  override def foldRight[A, B](fa: F[A], lb: Eval[B])(f: (A, Eval[B]) => Eval[B]): Eval[B] = {
    foldMap(fa)(f.curried andThen defer)(endomorphismMonoid).apply(lb)
  }

  override def foldLeft[A, B](fa: F[A], b: B)(f: (B, A) => B): B =
    foldRight(fa, Eval.now(b)) {
      case (a, evalB) => Eval.now(f(evalB.value, a))
    }.value

}

A few naive tests show it sort of works. However, I'm trying to check the TraverseLaws on the following :

  implicit val ListTraverse = new TraverseFoldable[List] {
    override def traverse[G[_], A, B](
      fa: List[A]
    )(f: A => G[B])(implicit G: Applicative[G]): G[List[B]] = fa match {
      case head :: tail => G.map2(f(head), traverse(tail)(f)) { case (a, b) => a :: b }
      case Nil               => G.pure(Nil)
    }
  }

The foldM identity and collectFirstSome reference tests fail consistently (if anybody can point me to their own implementation of foldRight, foldLeft so that I can understand what I'm doing wrong, I'd really appreciate it).

Anyhow, this leads me to the following question : if Traverse extends Foldable without providing default foldLeft and foldRight implementations because of a totally understandable performance concern , how about providing a default implementation on the side (in the companion object of Traverse for instance) that people can use when implementing their own Traverse if they don't care about the Foldable aspect of their Traverse (or want faster prototyping) ?

Hi @Baccata, I'm not sure about the law failure, but I agree with you that we should offer a way to derive Foldable by just implementing traverse. 👍

Hey @LukaJCB, a fellow redditor helped me out : my foldLeft was actually folding right 😅 . Here's what I ended up with, laws are all green : https://gist.github.com/Baccata/0b972953cd65421d6a475b37aadad65b#file-foldablefromtraverse-scala

An interesting perspective on the relationship between Traverse and Foldable here and the linked Haskell mailing list thread.

@Baccata defer did not save the save the stack safety

val F = new FoldableFromTraverse[List]{
    override def traverse[G[_]:Applicative, A, B](fa: List[A])(f: A => G[B]): G[List[B]] = 
        Traverse[List].traverse(fa)(f)
}
val llarge = List.range(0,100000)
F.foldRight(llarge, Now(0))((buff, eval) => eval.map(_ + buff)).value

fails with

java.lang.StackOverflowError
  scala.Function1$$anonfun$compose$1.apply(Function1.scala:44)
  scala.Function1$$anonfun$compose$1.apply(Function1.scala:44)
  scala.Function1$$anonfun$compose$1.apply(Function1.scala:44)
  scala.Function1$$anonfun$compose$1.apply(Function1.scala:44)
  scala.Function1$$anonfun$compose$1.apply(Function1.scala:44)
  scala.Function1$$anonfun$compose$1.apply(Function1.scala:44)

@letalvoj, looks like you managed to implement it. From what I gather, Function1.andThen is not stack safe and cats has a (although private) implementation that uses constant stack space which you leveraged to solve the stack safety problem. Am I correct ?

Do you have a standalone snippet by any chance ? I'd be really interested to see the stack-safe solution.

Well there is the AndThen impl which is supposed to solve that. Yet it is not commutative ;-) Unfortunatelly it happened to me that the underlying foldMap[F[_],A,B:Monioid], where B was AndThen interleaved the aggregated sequence with Monoid.empty - which does not affect the outcome, yet it invalidates the internal counter of AndThen.

I solved that by ignoring the zeros with - by implementing a monoid which ignores zeros. Plus bunch of other hacks, i.e. combine(x,y) = y andThen x due to the order in which the foldMap provides the value and the current accumulator.

@implicitNotFound("Could not derive an instance of Traverse[${F}]")
trait MkTraverse[F[_]] extends Traverse[F] {

  override def traverse[G[_] : Applicative, A, B](fa: F[A])(f: A => G[B]): G[F[B]] =
    safeTraverse(fa)(a => Eval.now(f(a))).value

  def safeTraverse[G[_] : Applicative, A, B](fa: F[A])(f: A => Eval[G[B]]): Eval[G[F[B]]]

  override def foldMap[A, B: Monoid](fa: F[A])(f: A => B): B =
    traverse[cats.data.Const[B, ?], A, B](fa) { a => cats.data.Const(f(a)) } getConst

  def foldRightSafe[A, B](fa: F[A], lb: B)(f: (A, B) => B): B = foldRight(fa, Later(lb)) {
    case (a, eb) => eb.map(b => f(a, b))
  }.value

  type AndThenEndo[T] = AndThen[T, T]

  override def foldRight[A, B](fa: F[A], lb: Eval[B])(f: (A, Eval[B]) => Eval[B]): Eval[B] =
    foldMap[A, IgnoreZeros[AndThenEndo[Eval[B]]]](fa) { (a: A) =>
      NonZero(AndThen { (b: Eval[B]) => Eval.defer(f(a, b)) })
    }(IgnoreZeros.monoid(andThenEndoMonoid)).value(lb)

  override def foldLeft[A, B](fa: F[A], b: B)(f: (B, A) => B): B =
    foldMap[A, Endo[B]](fa) { a => b => f(b, a) }(dual(MonoidK[Endo].algebra))(b)

  private def andThenEndoMonoid[B]: Monoid[AndThenEndo[Eval[B]]] = new Monoid[AndThenEndo[Eval[B]]] {
    override def empty: AndThenEndo[Eval[B]] = AndThen(identity[Eval[B]])
    override def combine(x: AndThenEndo[Eval[B]], y: AndThenEndo[Eval[B]]): AndThenEndo[Eval[B]] = y andThen x
  }

  private def dual[A](m: Monoid[A]): Monoid[A] = new Monoid[A] {
    def combine(x: A, y: A): A = m.combine(y, x)
    def empty: A = m.empty
  }

}

trait IgnoreZeros[T] {
  def value: T
}
case class Zero[T](value: T) extends IgnoreZeros[T]
case class NonZero[T](value: T) extends IgnoreZeros[T]

object IgnoreZeros {
  def monoid[B](mb: Monoid[B]): Monoid[IgnoreZeros[B]] = new Monoid[IgnoreZeros[B]] {
    override def empty: IgnoreZeros[B] = Zero(mb.empty)

    override def combine(x: IgnoreZeros[B], y: IgnoreZeros[B]): IgnoreZeros[B] = (x, y) match {
      case (NonZero(xx), NonZero(yy)) => NonZero(mb.combine(xx, yy))
      case (_: Zero[B], _) => y
      case (_, _: Zero[B]) => x
    }
  }
}

It could be fixed by making the AndThen commutative (w.r.t to the stack safety property). Yet it might be trickier than it seems. Currently there is no reasoning regarding _are we joining two AndThens together_ - it works only in a case of well behaved fold.

Small note, I think it would be a bad idea to make the above the default on the typeclass.

The performance of the above appears to be (though we should benchmark) seems likely to be poor.

If we add this, we could add something like trait DefaultTraverse[F[_]] extends Traverse[F] (as you have done here with MkTraverse, but we should comment it is likely to give much worse performance than implementing all the methods yourself.

Yeah, well it could be optimized a bit yet the amount of unnecessary calls
to empty which I observed was in my particular usecase was huge. Plus the
double stack safety handing in both folds and reverse...
On Thu, 6 Sep 2018 at 19:53, P. Oscar Boykin notifications@github.com
wrote:

Small note, I think it would be a bad idea to make the above the default
on the typeclass.

The performance of the above appears to be (though we should benchmark)
seems likely to be poor.

If we add this, we could add something like trait DefaultTraverse[F[_]]
extends Traverse[F] (as you have done here with MkTraverse, but we should
comment it is likely to give much worse performance than implementing all
the methods yourself.


You are receiving this because you were mentioned.
Reply to this email directly, view it on GitHub
https://github.com/typelevel/cats/issues/107#issuecomment-419184383, or mute
the thread
https://github.com/notifications/unsubscribe-auth/AFBL8bBwrgboncjpn7SUN9Rs6THBCPokks5uYWEWgaJpZM4DdBAV
.

Was this page helpful?
0 / 5 - 0 ratings

Related issues

peterneyens picture peterneyens  ·  5Comments

tg44 picture tg44  ·  4Comments

LukaJCB picture LukaJCB  ·  4Comments

SimY4 picture SimY4  ·  4Comments

kailuowang picture kailuowang  ·  3Comments