The Splitter class does two things really well: it declaratively defines a way to process a sequence of characters, and it processes that sequence lazily. The shortcoming I've encountered is that the sequence of characters submitted to a Splitter instance is most easily (but not strictly necessarily) rendered eagerly.
I'm proposing an API enhancement to allow the Splitter class to more easily be used with lazy inputs.
Options in order of preference:
1. Iterable<String> split(CharSource) throws IOException method. The Iterable yields Iterators that close the Reader yielded by the CharSource when the Reader (and hence the Iterator) is exhausted.
2. Iterable<String> split(Reader) throws IOException method. The user is responsible for closing the reader, perhaps through a try-with-resources. Alternatively (or in addition) the Iterable can yield self-closing Iterators as in Option 1.

I would be happy to work on this enhancement.
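To make the self-closing behaviour concrete, here is a rough sketch using only the JDK. LazyReaderSplit and its split(Reader, char) method are hypothetical names for illustration, not part of Guava, and the sketch handles only a single-character separator. It reads from the Reader only as tokens are requested and closes the Reader once the input is exhausted.

```java
import java.io.IOException;
import java.io.Reader;
import java.io.UncheckedIOException;
import java.util.Iterator;
import java.util.NoSuchElementException;

// Hypothetical helper, not a Guava API: lazily splits a Reader on one character.
final class LazyReaderSplit {
  private LazyReaderSplit() {}

  static Iterator<String> split(Reader reader, char separator) {
    return new Iterator<String>() {
      private String next;   // token read ahead by hasNext(), if any
      private boolean done;  // true once the reader has hit end of input

      @Override
      public boolean hasNext() {
        if (next == null && !done) {
          next = readToken();
        }
        return next != null;
      }

      @Override
      public String next() {
        if (!hasNext()) {
          throw new NoSuchElementException();
        }
        String result = next;
        next = null;
        return result;
      }

      // Reads characters up to the next separator or end of input.
      private String readToken() {
        StringBuilder token = new StringBuilder();
        try {
          int c;
          while ((c = reader.read()) != -1) {
            if (c == separator) {
              return token.toString();
            }
            token.append((char) c);
          }
          done = true;
          reader.close(); // self-closing, but only when the input is exhausted
          return token.toString();
        } catch (IOException e) {
          throw new UncheckedIOException(e);
        }
      }
    };
  }
}
```

Note that if the caller stops iterating early, nothing in this sketch ever closes the Reader.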
I'd like to propose a fourth option:
Stream<String> split(CharSource) throws IOException method. The user is responsible for closing the stream using try-with-resources, Closeable or manually calling .close() within a finally block.
The advantage of this option over option (1) is that it doesn't suffer from a big problem that (1) has, where if a yielded iterator is _not_ exhausted, then the corresponding Reader can never be closed.
The main disadvantage of this approach, however, is that it's not compatible with the forthcoming Android port of Guava, since it depends on Java 8 to work.
(It's also a bit more verbose than option (1) because of the need to wrap the stream in a try-with-resources or equivalent. So this could be combined with option (1) to give the user a choice depending on their needs.)
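A minimal sketch of what the Stream-based option could look like, using only the JDK. LazyStreamSplit is a hypothetical name, a Supplier<Reader> stands in for CharSource, and java.util.Scanner stands in for Splitter's tokenizing (its handling of edge cases differs from Splitter's, and it swallows IOExceptions rather than propagating them). The point is only the onClose hook: closing the stream closes the Reader even when the stream is not fully consumed.

```java
import java.io.Reader;
import java.io.StringReader;
import java.util.Scanner;
import java.util.Spliterator;
import java.util.Spliterators;
import java.util.function.Supplier;
import java.util.regex.Pattern;
import java.util.stream.Stream;
import java.util.stream.StreamSupport;

// Hypothetical helper, not a Guava API.
final class LazyStreamSplit {
  private LazyStreamSplit() {}

  static Stream<String> split(Supplier<? extends Reader> source, String separator) {
    // Scanner pulls from the Reader in chunks as tokens are requested.
    Scanner scanner = new Scanner(source.get()).useDelimiter(Pattern.quote(separator));
    Spliterator<String> tokens =
        Spliterators.spliteratorUnknownSize(scanner, Spliterator.ORDERED | Spliterator.NONNULL);
    // Closing the stream closes the Scanner, which closes the underlying Reader.
    return StreamSupport.stream(tokens, false).onClose(scanner::close);
  }

  // Usage: the Reader is released even though only one token is consumed.
  static String firstToken(String input) {
    try (Stream<String> tokens = split(() -> new StringReader(input), ",")) {
      return tokens.findFirst().orElse("");
    }
  }
}
```

The try-with-resources in firstToken is the extra verbosity mentioned above. In exchange, the Reader is closed deterministically.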
I find @jbduncan's solution much nicer than the others. There is also precedent for it in the JDK (Pattern.splitAsStream).
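For reference, a small example of the JDK method mentioned here. The class name SplitAsStreamDemo is just for illustration. Pattern.splitAsStream takes a CharSequence, so the input itself is still fully in memory. Only the splitting is done through a Stream.

```java
import java.util.List;
import java.util.regex.Pattern;
import java.util.stream.Collectors;

final class SplitAsStreamDemo {
  private SplitAsStreamDemo() {}

  // Splits the input on commas via the JDK's Stream-returning API.
  static List<String> tokens(String input) {
    return Pattern.compile(",").splitAsStream(input).collect(Collectors.toList());
  }
}
```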
@jbduncan's solution makes a lot of sense. It's somewhat unfortunate that the need to close the stream is not baked into the type system, but it's certainly better than Iterable, which is not Closeable at all. I also find it more pleasant to work with Streams than Iterables -- given the big push to sprinkle methods returning Stream all over the JDK, I'd guess that's where Java is trying to go anyhow.
How much of a problem does this present given the coming Android port? Guava 21 is already full of dependencies on JDK8. Are there rules limiting this to the functional and collections classes?
split(Reader) is certainly possible, even if it blurs the line with I/O a bit.
But we wouldn't be able to add split(CharSource) to Splitter because io depends on base and we don't allow package cycles. It _might_ be possible to add CharSource.split(Splitter), but without thinking deeply about it I'm somewhat dubious we could do it efficiently without exposing some APIs we generally wouldn't want users to actually use on Splitter.