Guava: use PECS principle in TreeTraverser

Created on 12 Apr 2017  路  11Comments  路  Source: google/guava

Currently, TreeTraverser<T> has a method Iterable<T> children(T root). Since for traversal, the returned Iterable is only used as a producer, the return type can be Iterable<? extends T> (and the using() static function can take a Function<T, ? extends Iterable<? extends T>>).

The use case is a tree model where at some level only leaf nodes exist, so there is for example a List<LeafNode> (where LeafNode extends T). With the current implementation, this has to be copied into a new list in order to compile (or using some ugly unsafe double-casting).

package=collect package=graph

Most helpful comment

@kevinb9n,
Personally I feel that users who test with such fine granularity have to accept the extra brittleness, certainly for an API that is in beta. After all the fix is trivial (just change the variable type). Of course, they could also just pass a Java8 method reference to TreeTraverser.using(), and then test the method directly.

All 11 comments

I think you're right, and that the change would be compatible for conventional consumers of this API. Unfortunately, it appears that a fair proportion of users unit-test their children() method directly, with some fraction of them storing its result in a variable, so we would break those callers.

We're allowed to do that, because TreeTraverser is @Beta, but it is something we still try to avoid when we can. In general, we have learned to tolerate the ugly double-cast-and-suppress construct as long as it's something that is needed only infrequently.

We might be able to fix this for common.graph.{Suc,Prede}cessorsFunction, though, which has no compatibility issues yet.

@kevinb9n,
Personally I feel that users who test with such fine granularity have to accept the extra brittleness, certainly for an API that is in beta. After all the fix is trivial (just change the variable type). Of course, they could also just pass a Java8 method reference to TreeTraverser.using(), and then test the method directly.

@HermanBovens we are discussing this internally now, thanks for starting the conversation.

For context: SuccessorsFunction<N> is a new common.graph API that, in conjunction with some other enhancements to the common.graph utilities, we expect to replace TreeTraverser, as part of unifying the graph-related Guava code under common.graph. We're working out the details now, and it will be a while before TreeTraverser goes away entirely, but that's the plan, in broad strokes.

In any event: I'm not sure I understand your use case.
IIUC, you have a heterogeneous structure composed of InternalNodes and LeafNodes, and you want to traverse it, as type-safely as possible. Let's say that InternalNode and LeafNode each extend TreeNode.

given
someGraphFunction(SuccessorsFunction<N> sf) { ... }
and
class TreeNode { List<TreeNode> getChildren() }
you want to call it like this:

someGraphFunction(TreeNode::getChildren)

This will already fail because SuccessorsFunction.successors() returns a Set<N>, not an Iterable<N>, so instead you'll need to do this:

someGraphFunction(n -> ImmutableSet.copyOf(n.getChildren());

which (copying) is one of the specific issues you were concerned about.
Note that getChildren() in this scenario returns a List<TreeNode> (not a List<? extends TreeNode>), so it dodges your original question; my purpose in starting here is to emphasize that (given the specifics of your scenario as I understand them) you already may have a reason that you need to copy.

(Why Set? There was a lot of internal discussion about this, but what it came down to is that it has the right semantics, i.e., unique elements, as well as (typically) fast contains() implementations.)

But suppose getChildren() returned a List<? extends TreeNode> instead: how does that help anything?
getChildren() has to be able to return both leaves and internal nodes because it can't know a priori whether the nodes it's returning will be leaves, internal nodes, or both. So it has to return a collection of TreeNode, and the wildcard seems useless. You can't even cast it, because you don't know what's in it.

Am I missing something blindingly obvious?

@jrtom I can imagine one would like to model some kind of classification in a type safe way. Let's take taxonomic hierarchy as an example (https://en.wikipedia.org/wiki/Taxonomic_rank).

public abstract class Rank {
    public Set<? extends Rank> getChildren() {
        return ImmutableSet.of();
    }
}

public class Life extends Rank {
    private Set<Domain> children;

    @Override
    public Set<Domain> getChildren() {
        return children;
    }
}

public class Domain extends Rank {
    // private Set<Kingdom> children;
    // and so on...
}

// ... and on...

If PECS was applied, you could easily traverse entire Life tree (tree of life) by using TreeTraverser.<Rank>using(Rank::getChildren).preOrderTraversal(tree). Now it's impossible. BTW, no casting/copying involved.

Another example, this time from real... life: Country / State / ... / City / Street (this hierarchy is locale-dependent, so I can't translate 1:1).

The point is, sometimes you really know what kind/class of children you can expect/return and where.

I hope that this post explains at least a little.

Indeed, my uses cases were along the lines of the example that @perceptron8 has shown. Another possibility (besides having the wildcard in the superclass's return type for getChildren()) is that the function to map parent to children uses instanceof and a (clean, safe) cast to distinguish between different types, some of which may return a specific type of children.

Regarding the use of Set vs Iterable: the fact that one needs to copy the children in case the model doesn't return a Set is one thing that worries me a bit. The fact that one will need to copy them specifically into a LinkedHashSet (or similar Set implementation) in order to preserve the iteration order of the model, is another thing. Uses cases where the order is important: DOM traversal, UI component hierarchy, ...

I would like to argue against using Set as return type in a TreeTraverser replacement. We use it for abstract syntax trees, and if given something like a + a, we certainly expect it to visit a twice.

Btw., our abstract syntax trees are another example where <? extends T> would be useful. Ours looks like this:

class ASTNode {
  List<? extends ASTNode> children();
}
class Expression extends ASTNode {
  List<? extends Expression> children();
}
// many more subclasses of ASTNode and Expression...

Currently we use several of the ugly unchecked casts to solve this.

FYI, {Predecessors,Successors}Function will be released in Guava 23 (are currently in the snapshot) and the return type of the method has been changed to <? extends Iterable>: https://github.com/google/guava/blob/master/android/guava/src/com/google/common/graph/SuccessorsFunction.java

The fate of TreeTraverser is still being worked out, but I believe that we're fairly solid on the overall (incorporating that functionality into common.graph).

Calling this resolved now that we've settled on a design for Traverser and for SuccessorsFunction that will resolve this issue going forward. (Existing uses of TreeTraverser should be relatively easy to port to using the new Traverser class.)

@HermanBovens I believe Traverser is already included in the latest release, 23.2, under com.google.common.graph.Traverser.

@jbduncan I just noticed this so had deleted my comment, but thanks anyway.

@HermanBovens Okay, thanks for letting me know. For the future, it's often best not to delete comments outright, as it leaves conversations disjointed. When I personally do it or have seen others do it, I find it's more helpful to just leave an edit explaining why the comment is no longer applicable. :)

Was this page helpful?
0 / 5 - 0 ratings