Cats: Spark 3 / Cats 2.2.0 classpath issue

Created on 6 Oct 2020 · 7 Comments · Source: typelevel/cats

A small project with minimal dependencies as well as instructions on how to reproduce the issue is available at:

https://github.com/samidalouche/spark3-cats220

Executing this code works fine with cats 2.1.1 but fails with cats 2.2.0, which is quite surprising since the spark and cats dependencies are pretty much distinct from each other.

java.lang.NoSuchMethodError: 'void cats.kernel.CommutativeSemigroup.$init$(cats.kernel.CommutativeSemigroup)'
  at cats.UnorderedFoldable$$anon$1.<init>(UnorderedFoldable.scala:78)
  at cats.UnorderedFoldable$.<init>(UnorderedFoldable.scala:78)
  at cats.UnorderedFoldable$.<clinit>(UnorderedFoldable.scala)
  at cats.data.NonEmptyListInstances$$anon$2.<init>(NonEmptyList.scala:539)
  at cats.data.NonEmptyListInstances.<init>(NonEmptyList.scala:539)
  at cats.data.NonEmptyList$.<init>(NonEmptyList.scala:458)
  at cats.data.NonEmptyList$.<clinit>(NonEmptyList.scala)
  at catsspark.Boom$.assumeValid_$bang(boom.scala:19)
  at catsspark.Boom$.boom(boom.scala:14)
  ... 47 elided

Thanks in advance for looking into this.

I submitted the same issue to spark's bug tracker: https://issues.apache.org/jira/browse/SPARK-33077

All 7 comments

Spark seems to have a (possibly transitive) dependency on cats-kernel 2.0.0-M4 (!) which seems to be confirmed by that stacktrace and this list of dependencies.

@Jasper-M Thanks, you are right indeed. Spark 3 has a dependency on cats-kernel_2.12-2.0.0-M4.jar that spark 2.4 did not have. Thanks for looking into this.
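
One way to confirm which jar a class is actually loaded from is to ask the JVM for the class's code source. The following is a minimal sketch, assuming it is run in the same JVM as the failing code (for example, pasted into spark-shell); `WhichJar` and `locationOf` are hypothetical names used only for illustration.

```scala
import scala.util.Try

object WhichJar {
  // Returns the location (jar or directory) the named class was loaded from.
  // Yields None if the class is not on the classpath or has no code source.
  def locationOf(className: String): Option[String] =
    for {
      // initialize = false: we only want to locate the class, not run its initializers
      cls <- Try(Class.forName(className, false, getClass.getClassLoader)).toOption
      pd  <- Option(cls.getProtectionDomain)
      src <- Option(pd.getCodeSource)
      loc <- Option(src.getLocation)
    } yield loc.toString

  def main(args: Array[String]): Unit =
    println(locationOf("cats.kernel.CommutativeSemigroup").getOrElse("not found"))
}
```

If the printed location points at a cats-kernel jar inside the Spark distribution rather than the version declared in the project, the milestone jar is the one winning on the classpath.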

Looks like spark-mllib depends on breeze 1.0 which depends on a milestone version of Spire and so on.

This definitely needs to be reopened on Spark's issue tracker. Cats doesn't guarantee binary compatibility with milestones. If they can update their dependency to at least 2.0.0, the problem will go away. In the short term, users can probably work around this by excluding the transitive dependency coming in via Spark.
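
A rough sketch of such an exclusion in an sbt build might look like the following. The Scala and Spark versions and the choice of spark-mllib as the direct dependency are assumptions, not taken from the reproduction project. Note that an exclusion only changes the classpath that sbt itself resolves; it does not alter the jars shipped inside a Spark installation.

```scala
// build.sbt (sketch; versions and the spark-mllib module are assumptions)
scalaVersion := "2.12.12"

libraryDependencies ++= Seq(
  // Drop the milestone cats-kernel that arrives transitively via Spark
  ("org.apache.spark" %% "spark-mllib" % "3.0.1")
    .exclude("org.typelevel", "cats-kernel_2.12"),
  // cats-core 2.2.0 depends on the matching cats-kernel 2.2.0
  "org.typelevel" %% "cats-core" % "2.2.0"
)
```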

You're definitely right that it's a Spark issue, but I'm afraid it's more complicated than backwards compatibility. When invoking spark-submit or spark-shell, all classes on Spark's fixed classpath will be preferred over those in your own jar. I think the only workaround is to shade your own version of cats, or to somehow avoid going through those spark-submit scripts, but I don't know if that's doable.
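
For illustration, shading cats in a fat jar could be sketched with sbt-assembly roughly as follows. This assumes the sbt-assembly plugin is already enabled for the project; the `shadedcats` package prefix is a hypothetical name chosen for the example.

```scala
// build.sbt (sketch; assumes sbt-assembly is enabled in project/plugins.sbt)
assembly / assemblyShadeRules := Seq(
  // Rename every cats class bundled into the assembly so that it cannot
  // collide with the cats-kernel jar on Spark's own classpath.
  // "shadedcats" is a hypothetical prefix.
  ShadeRule.rename("cats.**" -> "shadedcats.@1").inAll
)
```

Because the rule uses `inAll`, references to cats in the project's own classes and in its bundled dependencies are rewritten to the renamed packages as well, so the application keeps working against its own copy.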

Holy crap…

Better double check that Spark is working on it since the filed issue (Spark side) is currently closed.
