Throwable #1: ElasticsearchParseException[Error parsing document in field [source_field]]; nested: TikaException[Unexpected RuntimeException from org.apache.tika.parser.txt.TXTParser@5bdd2e62]; nested: IllegalStateException[Can't overwrite cause with org.apache.tika.sax.WriteOutContentHandler$WriteLimitReachedException: Your document contained more than 19 characters, and so your requested limit has been reached. To receive the full text of the document, increase your limit. (Text up to the limit is however available).]; nested: TaggedSAXException[Your document contained more than 19 characters, and so your requested limit has been reached. To receive the full text of the document, increase your limit. (Text up to the limit is however available).]; nested: WriteLimitReachedException[Your document contained more than 19 characters, and so your requested limit has been reached. To receive the full text of the document, increase your limit. (Text up to the limit is however available).];
> at __randomizedtesting.SeedInfo.seed([F36010291A3A195A:F8D959C10745D812]:0)
> at org.elasticsearch.ingest.attachment.AttachmentProcessor.execute(AttachmentProcessor.java:106)
> at org.elasticsearch.ingest.attachment.AttachmentProcessorTests.parseDocument(AttachmentProcessorTests.java:285)
> at org.elasticsearch.ingest.attachment.AttachmentProcessorTests.parseDocument(AttachmentProcessorTests.java:275)
> at org.elasticsearch.ingest.attachment.AttachmentProcessorTests.testIndexedChars(AttachmentProcessorTests.java:296)
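For context, the `IllegalStateException[Can't overwrite cause with ...]` in the trace above is the kind of error `Throwable.initCause` raises when an exception already has a cause set. Below is a minimal JDK-only sketch of that general behavior. The class and method names are hypothetical, and it does not reproduce the actual Tika code path or claim to identify what changed in Java 11.

```java
public class InitCauseDemo {

    /**
     * A throwable built without a cause accepts exactly one initCause call.
     * Returns true when the cause was stored.
     */
    public static boolean firstInitCauseIsAccepted() {
        RuntimeException wrapper = new RuntimeException("wrapper"); // no cause yet
        Exception cause = new Exception("first cause");
        wrapper.initCause(cause);
        return wrapper.getCause() == cause;
    }

    /**
     * A throwable whose cause was supplied through the constructor rejects
     * any later initCause call. Returns true when the call is rejected.
     */
    public static boolean overwritingCauseIsRejected() {
        // Hypothetical stand-in for the write-limit exception in the trace.
        Exception limitReached = new Exception("write limit reached");
        RuntimeException wrapper =
                new RuntimeException("wrapper", new Exception("first cause"));
        try {
            wrapper.initCause(limitReached);
            return false;
        } catch (IllegalStateException e) {
            // The cause was already fixed by the constructor.
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println("first initCause accepted: " + firstInitCauseIsAccepted());
        System.out.println("overwrite rejected: " + overwritingCauseIsRejected());
    }
}
```

Any code that calls `initCause` on an exception it did not construct itself depends on whether that exception already carries a cause, which is one way a failure like the one above could surface.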
Output of java -version:
openjdk version "11-ea" 2018-09-25
OpenJDK Runtime Environment 18.9 (build 11-ea+17)
OpenJDK 64-Bit Server VM 18.9 (build 11-ea+17, mixed mode)
Reproduce with:
./gradlew :plugins:ingest-attachment:test -Dtests.class=org.elasticsearch.ingest.attachment.AttachmentProcessorTests -Dtests.method="testIndexedChars"
Note that building and running the tests with JDK 11 requires some additional changes before this failure can be reached. Those changes are on this branch:
https://github.com/atorok/elasticsearch/tree/upgrade/jdk_11
Pinging @elastic/es-core-infra
Thank you for raising this. @dadoonet, thank you for the ping. I can reproduce it in tika-core with Java 11-ea. I've opened: TIKA-2668.
This is now fixed and will be available in Tika 1.19.
Awesome! Thanks @tballison!
As mentioned on other tickets, Tika 1.19 is now available. Let us know what you find.
Resolved by #33896