While stress-testing rubix cache we noticed intermittent failures because data read from S3 stream backed by cache could not be deserialized as valid ORC content.
Query failed with following exception:
io.prestosql.spi.PrestoException: Malformed ORC file. Could not decompress all input (output buffer too small?) [s3://<redacted>/tpch-sf1000-ORC/lineitem/20180106_235637_00251_ecxdi_9defbd28-e6c7-44fc-9ec8-a0c401c7133a]
at io.prestosql.plugin.hive.orc.OrcPageSource.handleException(OrcPageSource.java:127)
at io.prestosql.plugin.hive.orc.OrcPageSourceFactory.lambda$createOrcPageSource$4(OrcPageSourceFactory.java:282)
at io.prestosql.orc.OrcBlockFactory$OrcBlockLoader.load(OrcBlockFactory.java:83)
at io.prestosql.spi.block.LazyBlock$LazyData.load(LazyBlock.java:381)
at io.prestosql.spi.block.LazyBlock$LazyData.getFullyLoadedBlock(LazyBlock.java:360)
at io.prestosql.spi.block.LazyBlock.getLoadedBlock(LazyBlock.java:276)
at io.prestosql.plugin.hive.orc.OrcPageSource$SourceColumn$MaskingBlockLoader.load(OrcPageSource.java:266)
at io.prestosql.spi.block.LazyBlock$LazyData.load(LazyBlock.java:381)
at io.prestosql.spi.block.LazyBlock$LazyData.getFullyLoadedBlock(LazyBlock.java:360)
at io.prestosql.spi.block.LazyBlock.getLoadedBlock(LazyBlock.java:276)
at io.prestosql.spi.Page.getLoadedPage(Page.java:273)
at io.prestosql.operator.TableScanOperator.getOutput(TableScanOperator.java:305)
at io.prestosql.operator.Driver.processInternal(Driver.java:379)
at io.prestosql.operator.Driver.lambda$processFor$8(Driver.java:283)
at io.prestosql.operator.Driver.tryWithLock(Driver.java:675)
at io.prestosql.operator.Driver.processFor(Driver.java:276)
at io.prestosql.execution.SqlTaskExecution$DriverSplitRunner.processFor(SqlTaskExecution.java:1075)
at io.prestosql.execution.executor.PrioritizedSplitRunner.process(PrioritizedSplitRunner.java:163)
at io.prestosql.execution.executor.TaskExecutor$TaskRunner.run(TaskExecutor.java:484)
at io.prestosql.$gen.Presto_332____20200420_210231_2.run(Unknown Source)
at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)
at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
at java.base/java.lang.Thread.run(Thread.java:834)
Caused by: io.prestosql.orc.OrcCorruptionException: Malformed ORC file. Could not decompress all input (output buffer too small?) [s3://<redacted>/tpch-sf1000-ORC/lineitem/20180106_235637_00251_ecxdi_9defbd28-e6c7-44fc-9ec8-a0c401c7133a]
at io.prestosql.orc.OrcZlibDecompressor.decompress(OrcZlibDecompressor.java:59)
at io.prestosql.orc.stream.CompressedOrcChunkLoader.nextChunk(CompressedOrcChunkLoader.java:128)
at io.prestosql.orc.stream.OrcInputStream.advance(OrcInputStream.java:204)
at io.prestosql.orc.stream.OrcInputStream.read(OrcInputStream.java:96)
at io.prestosql.orc.stream.OrcInputStream.readFully(OrcInputStream.java:121)
at io.prestosql.orc.stream.ByteArrayInputStream.next(ByteArrayInputStream.java:43)
at io.prestosql.orc.reader.SliceDirectColumnReader.readBlock(SliceDirectColumnReader.java:218)
at io.prestosql.orc.reader.SliceColumnReader.readBlock(SliceColumnReader.java:77)
at io.prestosql.orc.OrcBlockFactory$OrcBlockLoader.load(OrcBlockFactory.java:76)
... 20 more
Test query in use:
Select
count(l.orderkey),
count(l.partkey),
count(l.suppkey),
count(l.linenumber),
count(l.quantity),
count(l.extendedprice),
count(l.discount),
count(l.tax),
count(l.returnflag),
count(l.linestatus),
count(l.shipdate),
count(l.commitdate),
count(l.receiptdate),
count(l.shipinstruct),
count(l.shipmode),
count(l.comment)
From "hive"."tpch_sf1000_orc".lineitem l
Environment:
for i in `seq 1 4`; do
mkfs.ext4 /dev/nvme$${i}n1
mkdir /media/nv$${i}
mount /dev/nvme$${i}n1 /media/nv$${i}
mkdir /media/nv$${i}/presto-cache
chmod 777 /media/nv$${i}/presto-cache
done
/media/nv1/presto-cache was mounted as hostPath volume for coordinator Pod. /media/nv2/presto-cache was mounted for worker Pod."hive"."tpch_sf1000_orc".lineitem table is residing in S3. The directory contains 5 files. Each roughly 33GB. ORC files are zlib compressed.More snippets from worker node logs (this is the only worker node in the cluster):
Before failure. Those were appearing in the logs for 13 minutes before query failed. 4 other queries were successfully run in that period.
2020-04-20T21:20:53.273Z ERROR RemoteFetchProcessor RUNNING com.qubole.rubix.bookkeeper.FileDownloader java.util.concurrent.ExecutionException: java.lang.NegativeArraySizeException: -286261248
2020-04-20T21:22:40.252Z ERROR RemoteFetchProcessor RUNNING com.qubole.rubix.bookkeeper.FileDownloader java.util.concurrent.ExecutionException: java.lang.NegativeArraySizeException: -1084227584
2020-04-20T21:22:40.252Z ERROR RemoteFetchProcessor RUNNING com.qubole.rubix.bookkeeper.FileDownloader java.util.concurrent.ExecutionException: java.lang.NegativeArraySizeException: -1163919360
2020-04-20T21:22:40.252Z ERROR RemoteFetchProcessor RUNNING com.qubole.rubix.bookkeeper.FileDownloader java.util.concurrent.ExecutionException: java.lang.NegativeArraySizeException: -1883242496
2020-04-20T21:22:40.252Z ERROR RemoteFetchProcessor RUNNING com.qubole.rubix.bookkeeper.FileDownloader java.util.concurrent.ExecutionException: java.lang.NegativeArraySizeException: -1426063360
2020-04-20T21:22:40.252Z ERROR RemoteFetchProcessor RUNNING com.qubole.rubix.bookkeeper.FileDownloader java.util.concurrent.ExecutionException: java.lang.NegativeArraySizeException: -1955594240
2020-04-20T21:23:27.850Z ERROR RemoteFetchProcessor RUNNING com.qubole.rubix.bookkeeper.FileDownloader ReadData didn't match with requested value. RequestedData: 2366636032 ReadData: -1928331264
2020-04-20T21:23:39.748Z ERROR RemoteFetchProcessor RUNNING com.qubole.rubix.bookkeeper.FileDownloader ReadData didn't match with requested value. RequestedData: 2559574016 ReadData: -1735393280
2020-04-20T21:23:39.755Z ERROR RemoteFetchProcessor RUNNING com.qubole.rubix.bookkeeper.FileDownloader java.util.concurrent.ExecutionException: java.lang.NegativeArraySizeException: -329901701
Exception which failed query:
This is repeated for multiple SplitRunners.
2020-04-20T21:23:39.777Z ERROR SplitRunner-91-151 io.prestosql.execution.executor.TaskExecutor Error processing Split 20200420_212233_00006_n3ygz.1.0-1686 {path=s3://<redacted>/tpch-sf1000-ORC/lineitem/20180106_235637_00251_ecxdi_9defbd28-e6c7-44fc-9ec8-a0c401c7133a, start=19730006016, length=67108864, fileSize=36236363905, hosts=[172.40.247.208], database=tpch_sf1000_orc, table=lineitem, forceLocalScheduling=false, partitionName=<UNPARTITIONED>, s3SelectPushdownEnabled=false} (start = 4.032926451763E7, wall = 1703 ms, cpu = 269 ms, wait = 84 ms, calls = 2): HIVE_BAD_DATA: Malformed ORC file. Could not decompress all input (output buffer too small?) [s3://<redacted>/tpch-sf1000-ORC/lineitem/20180106_235637_00251_ecxdi_9defbd28-e6c7-44fc-9ec8-a0c401c7133a]
io.prestosql.spi.PrestoException: Malformed ORC file. Could not decompress all input (output buffer too small?) [s3://<redacted>/tpch-sf1000-ORC/lineitem/20180106_235637_00251_ecxdi_9defbd28-e6c7-44fc-9ec8-a0c401c7133a]
at io.prestosql.plugin.hive.orc.OrcPageSource.handleException(OrcPageSource.java:127)
at io.prestosql.plugin.hive.orc.OrcPageSourceFactory.lambda$createOrcPageSource$4(OrcPageSourceFactory.java:282)
at io.prestosql.orc.OrcBlockFactory$OrcBlockLoader.load(OrcBlockFactory.java:83)
at io.prestosql.spi.block.LazyBlock$LazyData.load(LazyBlock.java:381)
at io.prestosql.spi.block.LazyBlock$LazyData.getFullyLoadedBlock(LazyBlock.java:360)
at io.prestosql.spi.block.LazyBlock.getLoadedBlock(LazyBlock.java:276)
at io.prestosql.plugin.hive.orc.OrcPageSource$SourceColumn$MaskingBlockLoader.load(OrcPageSource.java:266)
at io.prestosql.spi.block.LazyBlock$LazyData.load(LazyBlock.java:381)
at io.prestosql.spi.block.LazyBlock$LazyData.getFullyLoadedBlock(LazyBlock.java:360)
at io.prestosql.spi.block.LazyBlock.getLoadedBlock(LazyBlock.java:276)
at io.prestosql.spi.Page.getLoadedPage(Page.java:273)
at io.prestosql.operator.TableScanOperator.getOutput(TableScanOperator.java:305)
at io.prestosql.operator.Driver.processInternal(Driver.java:379)
at io.prestosql.operator.Driver.lambda$processFor$8(Driver.java:283)
at io.prestosql.operator.Driver.tryWithLock(Driver.java:675)
at io.prestosql.operator.Driver.processFor(Driver.java:276)
at io.prestosql.execution.SqlTaskExecution$DriverSplitRunner.processFor(SqlTaskExecution.java:1075)
at io.prestosql.execution.executor.PrioritizedSplitRunner.process(PrioritizedSplitRunner.java:163)
at io.prestosql.execution.executor.TaskExecutor$TaskRunner.run(TaskExecutor.java:484)
at io.prestosql.$gen.Presto_332____20200420_210231_2.run(Unknown Source)
at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)
at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
at java.base/java.lang.Thread.run(Thread.java:834)
Caused by: io.prestosql.orc.OrcCorruptionException: Malformed ORC file. Could not decompress all input (output buffer too small?) [s3://<redacted>/tpch-sf1000-ORC/lineitem/20180106_235637_00251_ecxdi_9defbd28-e6c7-44fc-9ec8-a0c401c7133a]
at io.prestosql.orc.OrcZlibDecompressor.decompress(OrcZlibDecompressor.java:59)
at io.prestosql.orc.stream.CompressedOrcChunkLoader.nextChunk(CompressedOrcChunkLoader.java:128)
at io.prestosql.orc.stream.OrcInputStream.advance(OrcInputStream.java:204)
at io.prestosql.orc.stream.OrcInputStream.read(OrcInputStream.java:96)
at io.prestosql.orc.stream.OrcInputStream.readFully(OrcInputStream.java:121)
at io.prestosql.orc.stream.ByteArrayInputStream.next(ByteArrayInputStream.java:43)
at io.prestosql.orc.reader.SliceDirectColumnReader.readBlock(SliceDirectColumnReader.java:218)
at io.prestosql.orc.reader.SliceColumnReader.readBlock(SliceColumnReader.java:77)
at io.prestosql.orc.OrcBlockFactory$OrcBlockLoader.load(OrcBlockFactory.java:76)
... 20 more
More snippets AFTER query failure:
2020-04-20T21:23:39.783Z ERROR SplitRunner-8-68 io.prestosql.execution.executor.TaskExecutor Error processing Split 20200420_212233_00006_n3ygz.1.0-1715 {path=s3://<redacted>/tpch-sf1000-ORC/lineitem/20180106_235637_00251_ecxdi_19e8c24a-7fc9-4512-a296-555ab4efc0c2, start=22548578304, length=67108864, fileSize=37290938397, hosts=[172.40.247.208], database=tpch_sf1000_orc, table=lineitem, forceLocalScheduling=false, partitionName=<UNPARTITIONED>, s3SelectPushdownEnabled=false} (start = 4.0330905983724E7, wall = 67 ms, cpu = 0 ms, wait = 0 ms, calls = 1): HIVE_CURSOR_ERROR: Failed to read ORC file: s3://<redacted>/tpch-sf1000-ORC/lineitem/20180106_235637_00251_ecxdi_19e8c24a-7fc9-4512-a296-555ab4efc0c2
io.prestosql.spi.PrestoException: Failed to read ORC file: s3://<redacted>/tpch-sf1000-ORC/lineitem/20180106_235637_00251_ecxdi_19e8c24a-7fc9-4512-a296-555ab4efc0c2
at io.prestosql.plugin.hive.orc.OrcPageSource.handleException(OrcPageSource.java:129)
at io.prestosql.plugin.hive.orc.OrcPageSource.getNextPage(OrcPageSource.java:103)
at io.prestosql.plugin.hive.HivePageSource.getNextPage(HivePageSource.java:244)
at io.prestosql.operator.TableScanOperator.getOutput(TableScanOperator.java:302)
at io.prestosql.operator.Driver.processInternal(Driver.java:379)
at io.prestosql.operator.Driver.lambda$processFor$8(Driver.java:283)
at io.prestosql.operator.Driver.tryWithLock(Driver.java:675)
at io.prestosql.operator.Driver.processFor(Driver.java:276)
at io.prestosql.execution.SqlTaskExecution$DriverSplitRunner.processFor(SqlTaskExecution.java:1075)
at io.prestosql.execution.executor.PrioritizedSplitRunner.process(PrioritizedSplitRunner.java:163)
at io.prestosql.execution.executor.TaskExecutor$TaskRunner.run(TaskExecutor.java:484)
at io.prestosql.$gen.Presto_332____20200420_210231_2.run(Unknown Source)
at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)
at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
at java.base/java.lang.Thread.run(Thread.java:834)
Caused by: java.lang.UnsupportedOperationException: Slice is not backed by a byte array
at io.airlift.slice.Slice.checkHasByteArray(Slice.java:333)
at io.airlift.slice.Slice.byteArray(Slice.java:352)
at io.prestosql.orc.stream.CompressedOrcChunkLoader.nextChunk(CompressedOrcChunkLoader.java:129)
at io.prestosql.orc.stream.OrcInputStream.advance(OrcInputStream.java:204)
at io.prestosql.orc.stream.OrcInputStream.read(OrcInputStream.java:96)
at io.prestosql.orc.protobuf.CodedInputStream$StreamDecoder.tryRefillBuffer(CodedInputStream.java:2715)
at io.prestosql.orc.protobuf.CodedInputStream$StreamDecoder.isAtEnd(CodedInputStream.java:2654)
at io.prestosql.orc.protobuf.CodedInputStream$StreamDecoder.readTag(CodedInputStream.java:2004)
at io.prestosql.orc.proto.OrcProto$StripeFooter.<init>(OrcProto.java:15115)
at io.prestosql.orc.proto.OrcProto$StripeFooter.<init>(OrcProto.java:15084)
at io.prestosql.orc.proto.OrcProto$StripeFooter$1.parsePartialFrom(OrcProto.java:16709)
at io.prestosql.orc.proto.OrcProto$StripeFooter$1.parsePartialFrom(OrcProto.java:16704)
at io.prestosql.orc.protobuf.AbstractParser.parseFrom(AbstractParser.java:91)
at io.prestosql.orc.protobuf.AbstractParser.parseFrom(AbstractParser.java:96)
at io.prestosql.orc.protobuf.AbstractParser.parseFrom(AbstractParser.java:49)
at io.prestosql.orc.protobuf.GeneratedMessageV3.parseWithIOException(GeneratedMessageV3.java:311)
at io.prestosql.orc.proto.OrcProto$StripeFooter.parseFrom(OrcProto.java:15517)
at io.prestosql.orc.metadata.OrcMetadataReader.readStripeFooter(OrcMetadataReader.java:166)
at io.prestosql.orc.metadata.ExceptionWrappingMetadataReader.readStripeFooter(ExceptionWrappingMetadataReader.java:84)
at io.prestosql.orc.StripeReader.readStripeFooter(StripeReader.java:383)
at io.prestosql.orc.StripeReader.readStripe(StripeReader.java:119)
at io.prestosql.orc.OrcRecordReader.advanceToNextStripe(OrcRecordReader.java:504)
at io.prestosql.orc.OrcRecordReader.advanceToNextRowGroup(OrcRecordReader.java:446)
at io.prestosql.orc.OrcRecordReader.nextPage(OrcRecordReader.java:367)
at io.prestosql.plugin.hive.orc.OrcPageSource.getNextPage(OrcPageSource.java:99)
... 13 more
then repeated multiple times for different files:
2020-04-20T21:23:39.827Z ERROR 20200420_212233_00006_n3ygz.1.0-1690-130 com.qubole.rubix.core.CachedReadRequestChain Fall back to read from object store for /var/lib/presto/cache/fcache/<redacted>/tpch-sf1000-ORC/lineitem/20180106_235637_00251_ecxdi_19e8c24a-7fc9-4512-a296-555ab4efc0c2 .Could not read data from cached file :
com.qubole.rubix.core.ReadRequestChain$CancelledException: com.qubole.rubix.core.CachedReadRequestChain Cancelled
at com.qubole.rubix.core.ReadRequestChain.propagateCancel(ReadRequestChain.java:107)
at com.qubole.rubix.core.CachedReadRequestChain.call(CachedReadRequestChain.java:102)
at com.qubole.rubix.core.CachedReadRequestChain.call(CachedReadRequestChain.java:37)
at com.google.shaded.shaded.common.util.concurrent.TrustedListenableFutureTask$TrustedFutureInterruptibleTask.runInterruptibly(TrustedListenableFutureTask.java:108)
at com.google.shaded.shaded.common.util.concurrent.InterruptibleTask.run(InterruptibleTask.java:41)
at com.google.shaded.shaded.common.util.concurrent.TrustedListenableFutureTask.run(TrustedListenableFutureTask.java:77)
at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)
at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
at java.base/java.lang.Thread.run(Thread.java:834)
And also multiple times:
2020-04-20T21:23:40.793Z WARN 20200420_212233_00006_n3ygz.1.0-1709-116 com.qubole.rubix.bookkeeper.FileMetadata Evicting s3://<redacted>/tpch-sf1000-ORC/lineitem/20180106_235637_00251_ecxdi_0dbfd7b1-920d-43b6-8179-8c
53dab105d0 due to EXPLICIT
2020-04-20T21:23:41.044Z ERROR 20200420_212233_00006_n3ygz.1.0-1596-126 com.qubole.rubix.core.CachingInputStream Could not get FileInfo for s3://<redacted>/tpch-sf1000-ORC/lineitem/20180106_235637_00251_ecxdi_df899620
-c77b-4012-a32c-1eb8cd1119ef. Fetching FileStatus from remote file system :
java.util.MissingFormatArgumentException: Format specifier '%s'
at java.base/java.util.Formatter.format(Formatter.java:2672)
at java.base/java.util.Formatter.format(Formatter.java:2609)
at java.base/java.lang.String.format(String.java:2897)
at com.qubole.rubix.bookkeeper.BookKeeper.getFileInfo(BookKeeper.java:415)
at com.qubole.rubix.spi.LocalBookKeeperClient.getFileInfo(LocalBookKeeperClient.java:76)
at com.qubole.rubix.core.CachingInputStream.<init>(CachingInputStream.java:106)
at com.qubole.rubix.core.CachingFileSystem.open(CachingFileSystem.java:168)
at org.apache.hadoop.fs.PrestoFileSystemCache$FileSystemWrapper.open(PrestoFileSystemCache.java:327)
at org.apache.hadoop.fs.FileSystem.open(FileSystem.java:899)
at io.prestosql.plugin.hive.orc.OrcPageSourceFactory.lambda$createOrcPageSource$0(OrcPageSourceFactory.java:198)
at io.prestosql.plugin.hive.authentication.NoHdfsAuthentication.doAs(NoHdfsAuthentication.java:23)
at io.prestosql.plugin.hive.HdfsEnvironment.doAs(HdfsEnvironment.java:89)
at io.prestosql.plugin.hive.orc.OrcPageSourceFactory.createOrcPageSource(OrcPageSourceFactory.java:198)
at io.prestosql.plugin.hive.orc.OrcPageSourceFactory.createPageSource(OrcPageSourceFactory.java:147)
at io.prestosql.plugin.hive.HivePageSourceProvider.createHivePageSource(HivePageSourceProvider.java:184)
at io.prestosql.plugin.hive.HivePageSourceProvider.createPageSource(HivePageSourceProvider.java:102)
at io.prestosql.plugin.base.classloader.ClassLoaderSafeConnectorPageSourceProvider.createPageSource(ClassLoaderSafeConnectorPageSourceProvider.java:57)
at io.prestosql.split.PageSourceManager.createPageSource(PageSourceManager.java:64)
at io.prestosql.operator.TableScanOperator.getOutput(TableScanOperator.java:299)
at io.prestosql.operator.Driver.processInternal(Driver.java:379)
at io.prestosql.operator.Driver.lambda$processFor$8(Driver.java:283)
at io.prestosql.operator.Driver.tryWithLock(Driver.java:675)
at io.prestosql.operator.Driver.processFor(Driver.java:276)
at io.prestosql.execution.SqlTaskExecution$DriverSplitRunner.processFor(SqlTaskExecution.java:1075)
at io.prestosql.execution.executor.PrioritizedSplitRunner.process(PrioritizedSplitRunner.java:163)
at io.prestosql.execution.executor.TaskExecutor$TaskRunner.run(TaskExecutor.java:484)
at io.prestosql.$gen.Presto_332____20200420_210231_2.run(Unknown Source)
at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)
at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
at java.base/java.lang.Thread.run(Thread.java:834)
Some (not all) followup queries failed with:
io.prestosql.spi.PrestoException: Failed to read ORC file: s3://<redacted>/tpch-sf1000-ORC/lineitem/20180106_235637_00251_ecxdi_df899620-c77b-4012-a32c-1eb8cd1119ef
at io.prestosql.plugin.hive.orc.OrcPageSource.handleException(OrcPageSource.java:129)
at io.prestosql.plugin.hive.orc.OrcPageSource.getNextPage(OrcPageSource.java:103)
at io.prestosql.plugin.hive.HivePageSource.getNextPage(HivePageSource.java:244)
at io.prestosql.operator.TableScanOperator.getOutput(TableScanOperator.java:302)
at io.prestosql.operator.Driver.processInternal(Driver.java:379)
at io.prestosql.operator.Driver.lambda$processFor$8(Driver.java:283)
at io.prestosql.operator.Driver.tryWithLock(Driver.java:675)
at io.prestosql.operator.Driver.processFor(Driver.java:276)
at io.prestosql.execution.SqlTaskExecution$DriverSplitRunner.processFor(SqlTaskExecution.java:1075)
at io.prestosql.execution.executor.PrioritizedSplitRunner.process(PrioritizedSplitRunner.java:163)
at io.prestosql.execution.executor.TaskExecutor$TaskRunner.run(TaskExecutor.java:484)
at io.prestosql.$gen.Presto_332____20200420_210231_2.run(Unknown Source)
at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)
at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
at java.base/java.lang.Thread.run(Thread.java:834)
Caused by: java.lang.UnsupportedOperationException: Slice is not backed by a byte array
at io.airlift.slice.Slice.checkHasByteArray(Slice.java:333)
at io.airlift.slice.Slice.byteArray(Slice.java:352)
at io.prestosql.orc.stream.CompressedOrcChunkLoader.nextChunk(CompressedOrcChunkLoader.java:129)
at io.prestosql.orc.stream.OrcInputStream.advance(OrcInputStream.java:204)
at io.prestosql.orc.stream.OrcInputStream.read(OrcInputStream.java:96)
at io.prestosql.orc.protobuf.CodedInputStream$StreamDecoder.tryRefillBuffer(CodedInputStream.java:2715)
at io.prestosql.orc.protobuf.CodedInputStream$StreamDecoder.isAtEnd(CodedInputStream.java:2654)
at io.prestosql.orc.protobuf.CodedInputStream$StreamDecoder.readTag(CodedInputStream.java:2004)
at io.prestosql.orc.proto.OrcProto$StripeFooter.<init>(OrcProto.java:15115)
at io.prestosql.orc.proto.OrcProto$StripeFooter.<init>(OrcProto.java:15084)
at io.prestosql.orc.proto.OrcProto$StripeFooter$1.parsePartialFrom(OrcProto.java:16709)
at io.prestosql.orc.proto.OrcProto$StripeFooter$1.parsePartialFrom(OrcProto.java:16704)
at io.prestosql.orc.protobuf.AbstractParser.parseFrom(AbstractParser.java:91)
at io.prestosql.orc.protobuf.AbstractParser.parseFrom(AbstractParser.java:96)
at io.prestosql.orc.protobuf.AbstractParser.parseFrom(AbstractParser.java:49)
at io.prestosql.orc.protobuf.GeneratedMessageV3.parseWithIOException(GeneratedMessageV3.java:311)
at io.prestosql.orc.proto.OrcProto$StripeFooter.parseFrom(OrcProto.java:15517)
at io.prestosql.orc.metadata.OrcMetadataReader.readStripeFooter(OrcMetadataReader.java:166)
at io.prestosql.orc.metadata.ExceptionWrappingMetadataReader.readStripeFooter(ExceptionWrappingMetadataReader.java:84)
at io.prestosql.orc.StripeReader.readStripeFooter(StripeReader.java:383)
at io.prestosql.orc.StripeReader.readStripe(StripeReader.java:119)
at io.prestosql.orc.OrcRecordReader.advanceToNextStripe(OrcRecordReader.java:504)
at io.prestosql.orc.OrcRecordReader.advanceToNextRowGroup(OrcRecordReader.java:446)
at io.prestosql.orc.OrcRecordReader.nextPage(OrcRecordReader.java:367)
at io.prestosql.plugin.hive.orc.OrcPageSource.getNextPage(OrcPageSource.java:99)
... 13 more
Question: Could it be because coordinator and worker are sharing same cache directory? It seems like coordinator should not touch the content of the directory but maybe it does and it is a reason of read failures on worker?
Reconfigured Presto so coordinator and worker do not share cache directory.
I did not get failed query yet but I see NegativeArraySizeException errors in worker log:
2020-04-20T23:01:40.136Z ERROR RemoteFetchProcessor RUNNING com.qubole.rubix.bookkeeper.FileDownloader ReadData didn't match with requested value. RequestedData: 6786706824 ReadData: -1803227768
2020-04-20T23:01:40.136Z ERROR RemoteFetchProcessor RUNNING com.qubole.rubix.bookkeeper.FileDownloader java.util.concurrent.ExecutionException: java.lang.NegativeArraySizeException: -1247805440
2020-04-20T23:01:40.136Z ERROR RemoteFetchProcessor RUNNING com.qubole.rubix.bookkeeper.FileDownloader java.util.concurrent.ExecutionException: java.lang.NegativeArraySizeException: -1515192320
2020-04-20T23:01:40.136Z ERROR RemoteFetchProcessor RUNNING com.qubole.rubix.bookkeeper.FileDownloader java.util.concurrent.ExecutionException: java.lang.NegativeArraySizeException: -931135488
[.... 5 minutes later...]
Already got a failed query:
io.prestosql.spi.PrestoException: Failed to read ORC file: s3://<redacted>/tpch-sf1000-ORC/lineitem/20180106_235637_00251_ecxdi_9defbd28-e6c7-44fc-9ec8-a0c401c7133a
at io.prestosql.plugin.hive.orc.OrcPageSource.handleException(OrcPageSource.java:129)
at io.prestosql.plugin.hive.orc.OrcPageSourceFactory.lambda$createOrcPageSource$4(OrcPageSourceFactory.java:282)
at io.prestosql.orc.OrcBlockFactory$OrcBlockLoader.load(OrcBlockFactory.java:83)
at io.prestosql.spi.block.LazyBlock$LazyData.load(LazyBlock.java:381)
at io.prestosql.spi.block.LazyBlock$LazyData.getFullyLoadedBlock(LazyBlock.java:360)
at io.prestosql.spi.block.LazyBlock.getLoadedBlock(LazyBlock.java:276)
at io.prestosql.plugin.hive.orc.OrcPageSource$SourceColumn$MaskingBlockLoader.load(OrcPageSource.java:266)
at io.prestosql.spi.block.LazyBlock$LazyData.load(LazyBlock.java:381)
at io.prestosql.spi.block.LazyBlock$LazyData.getFullyLoadedBlock(LazyBlock.java:360)
at io.prestosql.spi.block.LazyBlock.getLoadedBlock(LazyBlock.java:276)
at io.prestosql.spi.Page.getLoadedPage(Page.java:273)
at io.prestosql.operator.TableScanOperator.getOutput(TableScanOperator.java:305)
at io.prestosql.operator.Driver.processInternal(Driver.java:379)
at io.prestosql.operator.Driver.lambda$processFor$8(Driver.java:283)
at io.prestosql.operator.Driver.tryWithLock(Driver.java:675)
at io.prestosql.operator.Driver.processFor(Driver.java:276)
at io.prestosql.execution.SqlTaskExecution$DriverSplitRunner.processFor(SqlTaskExecution.java:1075)
at io.prestosql.execution.executor.PrioritizedSplitRunner.process(PrioritizedSplitRunner.java:163)
at io.prestosql.execution.executor.TaskExecutor$TaskRunner.run(TaskExecutor.java:484)
at io.prestosql.$gen.Presto_332____20200420_225629_2.run(Unknown Source)
at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)
at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
at java.base/java.lang.Thread.run(Thread.java:834)
Caused by: java.lang.UnsupportedOperationException: Slice is not backed by a byte array
at io.airlift.slice.Slice.checkHasByteArray(Slice.java:333)
at io.airlift.slice.Slice.byteArray(Slice.java:352)
at io.prestosql.orc.stream.CompressedOrcChunkLoader.nextChunk(CompressedOrcChunkLoader.java:129)
at io.prestosql.orc.stream.OrcInputStream.advance(OrcInputStream.java:204)
at io.prestosql.orc.stream.OrcInputStream.read(OrcInputStream.java:83)
at io.prestosql.orc.stream.LongInputStreamV2.readValues(LongInputStreamV2.java:64)
at io.prestosql.orc.stream.LongInputStreamV2.next(LongInputStreamV2.java:366)
at io.prestosql.orc.reader.SliceDirectColumnReader.readBlock(SliceDirectColumnReader.java:146)
at io.prestosql.orc.reader.SliceColumnReader.readBlock(SliceColumnReader.java:77)
at io.prestosql.orc.OrcBlockFactory$OrcBlockLoader.load(OrcBlockFactory.java:76)
... 20 more
I will try to repro this to debug. Can you help with the repro steps @losipiuk . Just running the above mentioned query leads to this? Or need to run concurrent instances of it? Or something else?
Thanks @stagraqubole. I edited the issue description adding Environment section which describes my test setup. To reproduce that I needed to run query couple of times, but not too many. Typically for 10 runs I had 3+ failed.
I am suspecting a integer overflow issue here. With the single file sized 33GB, we might be trying to store the position in int. Looking if I have files of larger sizes.
@stagraqubole if this is file offset overflowing, you need files only over 2GB
@stagraqubole FYI that with hive.cache.parallel-warmup-enabled=false I was not able to reproduce this. I run 20+ queries in the loop and all completed successfully and no errors were logged. The execution speed variance was huge though (1.5-3 minutes in wall time and 50-70 minutes CPU time)
This is an issue in the parallel warmup code, related to integer overflow. It can happen with big files (>2GB) but not necessarily always. It happens when we accumulate requests of more than 2GB of contiguous data to be downloaded in a period of 10secs.
Thanks @stagraqubole for looking into that. Can you post a tracking issue here so we have a visibility?
Also what changes in Rubix operation with and without parallel warmup? Can you maybe write a short summary. Or maybe there is an article to read which explains that?
Created https://github.com/qubole/rubix/issues/361 and mentioned a short summary of issue there.
The main difference in case of parallel warmup is that the read issued by client in this case is fulfilled directly from cloud store while requests to download the data are registered with the server. The server keeps checking these requests every second and handles all the requests that are more than 10sec old and start the download of the data. All this happen asynchronously wrt to client (Presto readers in our case).
In case parallel warmup is disabled, during the same read that client issued we download the data and warmup the cache. Hence the higher latency but overall data read is half of what we read with parallel warmup.
So "parallel warmup is disabled" => cache is read-through.
and with parallel warmup the cache is populated asynchronously.
Is my understanding correct?
That is correct @findepi
Fixed by: https://github.com/prestosql/presto/pull/3772 (as it upgrades Rubix so that it has https://github.com/qubole/rubix/pull/368).
Most helpful comment
I will try to repro this to debug. Can you help with the repro steps @losipiuk . Just running the above mentioned query leads to this? Or need to run concurrent instances of it? Or something else?