Guava: Splitter should accept a CharSource or Reader

Created on 12 Feb 2017 · 4 Comments · Source: google/guava

The Splitter class does two things really well: it declaratively defines how to process a sequence of characters, and it processes that sequence lazily. The shortcoming I've encountered is that the input sequence itself is most easily (though not strictly necessarily) materialized eagerly before it is handed to the Splitter instance.

I'm proposing an API enhancement to allow the Splitter class to more easily be used with lazy inputs.
Options in order of preference:

  1. Add an Iterable<String> split(CharSource) throws IOException method. The Iterable yields Iterators that close the Reader obtained from the CharSource once the Reader (and hence the Iterator) is exhausted.
  2. Add an Iterable<String> split(Reader) throws IOException method. The user is responsible for closing the Reader, perhaps through a try-with-resources. Alternatively (or in addition), the Iterable can yield self-closing Iterators as in Option 1.
  3. Have CharSource extend CharSequence so it can be used with the existing Splitter API. This is not recommended because CharSource does not easily satisfy the uniform-access requirement of the CharSequence API.
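
To make Option 2 concrete, here is a minimal sketch of lazy splitting over a Reader using only the JDK. It is not Guava code: the class name ReaderSplitSketch and its split(Reader, char) method are hypothetical, it handles only a single-character separator, and it wraps IOException in UncheckedIOException because Iterator methods cannot throw checked exceptions. Closing the Reader is left to the caller.

```java
import java.io.IOException;
import java.io.Reader;
import java.io.UncheckedIOException;
import java.util.Iterator;
import java.util.NoSuchElementException;

/** Hypothetical helper for illustration only; not part of Guava. */
final class ReaderSplitSketch {
  private ReaderSplitSketch() {}

  /**
   * Lazily splits the characters of {@code reader} on {@code separator}.
   * Characters are read only as tokens are requested. The caller owns the reader.
   */
  static Iterator<String> split(Reader reader, char separator) {
    return new Iterator<String>() {
      private String next;       // buffered token, or null if none is buffered
      private boolean exhausted; // true once the reader has reached end of input

      @Override
      public boolean hasNext() {
        if (next == null && !exhausted) {
          next = readToken();
        }
        return next != null;
      }

      @Override
      public String next() {
        if (!hasNext()) {
          throw new NoSuchElementException();
        }
        String token = next;
        next = null;
        return token;
      }

      // Reads up to the next separator or end of input, whichever comes first.
      private String readToken() {
        StringBuilder token = new StringBuilder();
        try {
          int c;
          while ((c = reader.read()) != -1) {
            if (c == separator) {
              return token.toString();
            }
            token.append((char) c);
          }
        } catch (IOException e) {
          throw new UncheckedIOException(e);
        }
        exhausted = true;
        return token.toString(); // final token, possibly empty
      }
    };
  }
}
```

A caller would typically open the Reader in a try-with-resources block and consume the iterator inside it. If the caller stops iterating early, the Reader is still closed by the enclosing block, which is the case Option 1's self-closing iterators cannot cover.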

I would be happy to work on this enhancement.

P3 package=base package=io status=triaged type=addition

All 4 comments

I'd like to propose a fourth option:

  4. Add a Stream<String> split(CharSource) throws IOException method. The user is responsible for closing the stream using try-with-resources, Closeable, or a manual call to .close() within a finally block.

The advantage of this option over option (1) is that it avoids a big problem with (1): if a yielded iterator is _not_ exhausted, the corresponding Reader is never closed.

The main disadvantage of this approach, however, is that it's not compatible with the forthcoming Android port of Guava, since it depends on Java 8.

(It's also a bit more verbose than option (1) because of the need to wrap the stream in a try-with-resources or equivalent. So this could be combined with option (1) to give the user a choice depending on their needs.)
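
A rough, JDK-only sketch of the Stream-returning idea follows. It is an illustration under several assumptions, not a proposed Guava implementation: the class StreamSplitSketch and its split(Reader, String) method are hypothetical, it takes a Reader rather than a CharSource (a CharSource variant would presumably obtain the Reader from CharSource.openStream()), and it uses java.util.Scanner as a stand-in for Splitter. Scanner's tokenizing rules differ from Splitter's, and Scanner treats an IOException from the Reader as end of input rather than propagating it.

```java
import java.io.Reader;
import java.util.Scanner;
import java.util.Spliterator;
import java.util.Spliterators;
import java.util.regex.Pattern;
import java.util.stream.Stream;
import java.util.stream.StreamSupport;

/** Hypothetical helper for illustration only; not part of Guava. */
final class StreamSplitSketch {
  private StreamSplitSketch() {}

  /**
   * Returns a lazy stream of the tokens in {@code reader}, separated by the
   * literal string {@code separator}. Closing the stream closes the reader.
   */
  static Stream<String> split(Reader reader, String separator) {
    // Scanner is an Iterator<String>; quote the separator so it is matched literally.
    Scanner scanner = new Scanner(reader).useDelimiter(Pattern.quote(separator));
    Spliterator<String> tokens =
        Spliterators.spliteratorUnknownSize(scanner, Spliterator.ORDERED | Spliterator.NONNULL);
    // Scanner.close() also closes the underlying Reader.
    return StreamSupport.stream(tokens, false).onClose(scanner::close);
  }
}
```

Used inside try (Stream<String> parts = StreamSplitSketch.split(reader, ",")) { ... }, the Reader is closed when the block exits, even if only the first few tokens are consumed.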

I find @jbduncan's solution much nicer than the others. The JDK already has a precedent for this pattern in Pattern.splitAsStream.
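
For reference, a small example of the JDK method mentioned here. The wrapper class SplitAsStreamExample is only for illustration. Note that Pattern.splitAsStream takes a CharSequence, so the input must already be in memory; only the splitting is lazy, and there is no resource to close.

```java
import java.util.List;
import java.util.regex.Pattern;
import java.util.stream.Collectors;

/** Illustration of the existing JDK API; the wrapper class name is arbitrary. */
final class SplitAsStreamExample {
  private SplitAsStreamExample() {}

  /** Splits an in-memory character sequence on commas via a lazy Stream. */
  static List<String> splitCommaSeparated(CharSequence input) {
    return Pattern.compile(",").splitAsStream(input).collect(Collectors.toList());
  }
}
```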

@jbduncan's solution makes a lot of sense. It's somewhat unfortunate that the need to close the stream is not baked into the type system, but it's certainly better than Iterable, which is not Closeable at all. I also find it more pleasant to work with Streams than with Iterables. Given the big push to sprinkle methods returning Stream all over the JDK, I'd guess that's where Java is trying to go anyhow.

How much of a problem does this present given the coming Android port? Guava 21 is already full of dependencies on JDK8. Are there rules limiting this to the functional and collections classes?

split(Reader) is certainly possible, even if it blurs the line with I/O a bit.

But we wouldn't be able to add split(CharSource) to Splitter because io depends on base and we don't allow package cycles. It _might_ be possible to add CharSource.split(Splitter), but without thinking deeply about it I'm somewhat dubious we could do it efficiently without exposing some APIs we generally wouldn't want users to actually use on Splitter.
