Cats: NonEmptySet data structure?

Created on 8 Feb 2015  Â·  24Comments  Â·  Source: typelevel/cats

I don't have a good sense for whether something like this belong's in cats, but I have a quick and dirty scalaz-based NonEmptySet[A] = OneAnd[A,Set] implementation that I'd be happy to contribute:
https://gist.github.com/refried/9b68a08ff24640b86fb2

Obvious issues that I can do something about:

  • Uses .equals rather than Equals/Eq
  • Uses scala Set (although I noticed that cats.data.NonEmptyList uses std List so maybe this is ok?)

    • scala Set is not parametric. I have a NonEmptyOrderedSet too, backed by scalaz ISet[A:Ordered], but we don't have that yet

  • Because it's OneAnd, syntax is separate; I could use guidance on how to catsify this.
  • It should implement .equals or have an Eq instance or both
  • I need to write some laws.

Let me know what you think.

I've been using it with that argmaxes method, and to back a categorical probability distribution (generalized Bernoulli distribution) (must have at least one element) which I'm also happy to contribute if it makes sense here.

enhancement

All 24 comments

an issue with NonEmptySet[A] = OneAnd[A,Set] is that you can get the same value twice e.g.

NonEmptySet(1, Set(1))

@julien-truffaut if you take a look at the gist that @refried linked, the NonEmptySet apply methods eliminate the head element from the tail.

I think something like this could be useful. I think I would prefer that it is its own class that could potentially use OneAnd[A, Set] internally. That way, we can guarantee that these are always correct by construction. Otherwise people could use OneAnd.apply instead of NonEmptySet.apply and get something typed as NonEmptySet[A] (since it's just a type alias) that in fact had duplicate elements.

I could see two siblings of this existing: one for std Set based on universal equality/hash and one for a (currently nonexistent) cats set based on Order.

I'm curious what others think about this. Have you run into the need for a non-empty set?

@ceedubs ah good point, I missed the constructor logic. I agree with you, I think we should not expose that NonEmptySet is implemented in terms of OneAnd (if we go for this impl).

@ceedubs whoops, yeah; it didn't occur to me that the constructor logic could be side/stepped. I had just written it this way in my application to use all the OneAnd instances for free.

Having a real class addresses the syntax quirk too.

If we aren't going to expose the underlying structure, is an implementation using OneAnd(a, set) better than just using Set(...) with invariants?

Sorry, I didn't understand the second part (just using Set with invariants); would you elaborate?

Do you mean just duplicate the functionality of OneAnd(a,set) (and its instances) rather than delegating? I guess it doesn't matter to me. I was using NonEmptyList as a model since I'm new to cats.

But, it think that getting the internally-OneAnd version and some laws would be a reasonable first start, and then if there's an efficiency argument to make, at least the tests will already be in place?

No wait, maybe you mean, just use Set, and use the new constructor to require an element. Maybe the only difference in that case would be reusing OneAnd instances vs not.

Right I meant something like:

class NonEmptySet[A] private[cats] (val toSet: Set[A]) extends Function1[A, Boolean] {
  def size: Int = toSet.size
  def apply(a: A): Boolean = toSet(a)
  def map[B](f: A => B): NonEmptySet[B] = new NonEmptySet(toSet.map(f))
  def filter(f: A => Boolean): Option[NonEmptySet[A]] = NonEmptySet(toSet.filter(f))
  def +(a: A): NonEmptySet[A] = new NonEmptySet(toSet + a)
  def -(a: A): Option[NonEmptySet[A]] = NonEmptySet(toSet - a)
  def |(that: NonEmptySet[A]): NonEmptySet[A] = new NonEmptySet(toSet | that.toSet)
  def &(that: NonEmptySet[A]): Option[NonEmptySet[A]] = NonEmptySet(toSet & that.toSet)
  // and so on...
}

object NonEmptySet {
  def apply[A](as: A*): Option[NonEmptySet[A]] =
    if (as.isEmpty) None else Some(new NonEmptySet(as))
}

Since Set[A] doesn't have many instances anyway it seemed easy to do something like this, and it's arguably a bit easier to reason about.

On the other hand, using OneAnd is fine -- I've just never really loved basing types off of that pattern myself. So this could just be a matter of taste.

what the benefit of having NonEmptySet extend Function1? isn't it enough to have a contains method?

@non I'll take a crack at it. Maybe have both? remove vs removeOption
@julien-truffaut I agree

145 is a great idea. Good suggestion, @refried!

@julien-truffaut The only real advantage I saw was that you can use sets directly in place of a A => Boolean predicate, rather than having to instantiate a function. I guess the concern is inheriting members from Function1? It's not a big deal either way, although i do like using apply instead of just contains (since set membership tests are the main thing you do with these).

I have nothing against having an applymethod as an alias for contains but in general, I would avoid inheritance except if there a real performance win. Maybe it is only my aversion for inheritance talking.

What is the status of this? I see that the issue is closed but I don't quite follow why. I encounter NonEmptySets all the time.

BTW, I think it's possible to just use extends AnyVal. Why not? Like:

  class NonEmptySet[T] private(private val underying: immutable.Set[T]) extends AnyVal {
    // def methods here...
  }

  object NonEmptySet {
    def apply[T](set: Set[T]): Option[NonEmptySet[T]] = {
      if (set.isEmpty) {
        None
      } else {
        Some(new NonEmptySet(set))
      }
    }
    def apply[T](elem: T) = new NonEmptySet[T](Set(elem))
  }

I didn't care about any convariant/contravariant stuff in this code (T+ T-), just a quick sample.

@boggle the issue is open, check out the upper-left corner.

Oh indeed, I confused it with the closed status of the quoted other issue

I'd like to give this another shot, what does everyone think about using Eq vs. Universal equality?

Hi,
@lancewalton wrote this version some time ago (with a bunch of tests).
Is that useful?

Hi @channingwalton, I think it looks really good, maybe we could wrap a SortedSet with an Order[A] instead of the Set based on universal equality :)

Hi.

That’s a good idea. I’ll do that.

Regards,

Lance

On 5 Nov 2017, at 15:26, LukaJCB notifications@github.com wrote:

Hi @channingwalton https://github.com/channingwalton, I think it looks really good, maybe we could wrap a SortedSet with an Order[A] instead of the Set based on universal equality :)

—
You are receiving this because you were mentioned.
Reply to this email directly, view it on GitHub https://github.com/typelevel/cats/issues/123#issuecomment-341981299, or mute the thread https://github.com/notifications/unsubscribe-auth/AAcMr-ARWySVVsZAAwCIWZcgQ4hPHN-4ks5szdOqgaJpZM4DdUFQ.

Hi.

I've looked at the code for NonEmptyList and we probably want to make NonEmptySet consistent with that. I'll do some work on this over the next week or so.

@lancewalton This looks awesome! Thanks so much for working on it.

This was resolved (at least for sorted sets) by #2143.

Was this page helpful?
0 / 5 - 0 ratings