Presto: Malformed table data read if Rubix Cache is in use

Created on 20 Apr 2020  路  14Comments  路  Source: prestosql/presto

While stress-testing rubix cache we noticed intermittent failures because data read from S3 stream backed by cache could not be deserialized as valid ORC content.
Query failed with following exception:

io.prestosql.spi.PrestoException: Malformed ORC file. Could not decompress all input (output buffer too small?) [s3://<redacted>/tpch-sf1000-ORC/lineitem/20180106_235637_00251_ecxdi_9defbd28-e6c7-44fc-9ec8-a0c401c7133a]
    at io.prestosql.plugin.hive.orc.OrcPageSource.handleException(OrcPageSource.java:127)
    at io.prestosql.plugin.hive.orc.OrcPageSourceFactory.lambda$createOrcPageSource$4(OrcPageSourceFactory.java:282)
    at io.prestosql.orc.OrcBlockFactory$OrcBlockLoader.load(OrcBlockFactory.java:83)
    at io.prestosql.spi.block.LazyBlock$LazyData.load(LazyBlock.java:381)
    at io.prestosql.spi.block.LazyBlock$LazyData.getFullyLoadedBlock(LazyBlock.java:360)
    at io.prestosql.spi.block.LazyBlock.getLoadedBlock(LazyBlock.java:276)
    at io.prestosql.plugin.hive.orc.OrcPageSource$SourceColumn$MaskingBlockLoader.load(OrcPageSource.java:266)
    at io.prestosql.spi.block.LazyBlock$LazyData.load(LazyBlock.java:381)
    at io.prestosql.spi.block.LazyBlock$LazyData.getFullyLoadedBlock(LazyBlock.java:360)
    at io.prestosql.spi.block.LazyBlock.getLoadedBlock(LazyBlock.java:276)
    at io.prestosql.spi.Page.getLoadedPage(Page.java:273)
    at io.prestosql.operator.TableScanOperator.getOutput(TableScanOperator.java:305)
    at io.prestosql.operator.Driver.processInternal(Driver.java:379)
    at io.prestosql.operator.Driver.lambda$processFor$8(Driver.java:283)
    at io.prestosql.operator.Driver.tryWithLock(Driver.java:675)
    at io.prestosql.operator.Driver.processFor(Driver.java:276)
    at io.prestosql.execution.SqlTaskExecution$DriverSplitRunner.processFor(SqlTaskExecution.java:1075)
    at io.prestosql.execution.executor.PrioritizedSplitRunner.process(PrioritizedSplitRunner.java:163)
    at io.prestosql.execution.executor.TaskExecutor$TaskRunner.run(TaskExecutor.java:484)
    at io.prestosql.$gen.Presto_332____20200420_210231_2.run(Unknown Source)
    at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)
    at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
    at java.base/java.lang.Thread.run(Thread.java:834)
Caused by: io.prestosql.orc.OrcCorruptionException: Malformed ORC file. Could not decompress all input (output buffer too small?) [s3://<redacted>/tpch-sf1000-ORC/lineitem/20180106_235637_00251_ecxdi_9defbd28-e6c7-44fc-9ec8-a0c401c7133a]
    at io.prestosql.orc.OrcZlibDecompressor.decompress(OrcZlibDecompressor.java:59)
    at io.prestosql.orc.stream.CompressedOrcChunkLoader.nextChunk(CompressedOrcChunkLoader.java:128)
    at io.prestosql.orc.stream.OrcInputStream.advance(OrcInputStream.java:204)
    at io.prestosql.orc.stream.OrcInputStream.read(OrcInputStream.java:96)
    at io.prestosql.orc.stream.OrcInputStream.readFully(OrcInputStream.java:121)
    at io.prestosql.orc.stream.ByteArrayInputStream.next(ByteArrayInputStream.java:43)
    at io.prestosql.orc.reader.SliceDirectColumnReader.readBlock(SliceDirectColumnReader.java:218)
    at io.prestosql.orc.reader.SliceColumnReader.readBlock(SliceColumnReader.java:77)
    at io.prestosql.orc.OrcBlockFactory$OrcBlockLoader.load(OrcBlockFactory.java:76)
    ... 20 more

Test query in use:

Select
             count(l.orderkey),
             count(l.partkey),
             count(l.suppkey),
             count(l.linenumber),
             count(l.quantity),
             count(l.extendedprice),
             count(l.discount),
             count(l.tax),
             count(l.returnflag),
             count(l.linestatus),
             count(l.shipdate),
             count(l.commitdate),
             count(l.receiptdate),
             count(l.shipinstruct),
             count(l.shipmode),
             count(l.comment)
             From "hive"."tpch_sf1000_orc".lineitem l

Environment:

  • I was testing using k8s on EKS. I do not know if it is a factor here and running Presto directly on VM would improve situation
  • EKS cluster was had 1 node backed by r5d16.xlarge
  • Presto cluster had coordinator and single worker node

    • Coordinator was assigned 3 CPUs (limit 4) and 40Gi of memory

    • Worker was assigned 59 CPUs (limit 60) and 440Gi of memory

  • Caching directory was pointing to local SSD partition (separate partition used for coordinator and worker).
  • SSD partitions on machine were formatted as ext4 and mounted using following bootstrap script:
for i in `seq 1 4`; do
  mkfs.ext4 /dev/nvme$${i}n1
  mkdir /media/nv$${i}
  mount /dev/nvme$${i}n1 /media/nv$${i}
  mkdir /media/nv$${i}/presto-cache
  chmod 777 /media/nv$${i}/presto-cache
done
  • Then /media/nv1/presto-cache was mounted as hostPath volume for coordinator Pod. /media/nv2/presto-cache was mounted for worker Pod.
  • "hive"."tpch_sf1000_orc".lineitem table is residing in S3. The directory contains 5 files. Each roughly 33GB. ORC files are zlib compressed.
bug

Most helpful comment

I will try to repro this to debug. Can you help with the repro steps @losipiuk . Just running the above mentioned query leads to this? Or need to run concurrent instances of it? Or something else?

All 14 comments

More snippets from worker node logs (this is the only worker node in the cluster):

Before failure. Those were appearing in the logs for 13 minutes before query failed. 4 other queries were successfully run in that period.

2020-04-20T21:20:53.273Z        ERROR   RemoteFetchProcessor RUNNING    com.qubole.rubix.bookkeeper.FileDownloader      java.util.concurrent.ExecutionException: java.lang.NegativeArraySizeException: -286261248
2020-04-20T21:22:40.252Z        ERROR   RemoteFetchProcessor RUNNING    com.qubole.rubix.bookkeeper.FileDownloader      java.util.concurrent.ExecutionException: java.lang.NegativeArraySizeException: -1084227584
2020-04-20T21:22:40.252Z        ERROR   RemoteFetchProcessor RUNNING    com.qubole.rubix.bookkeeper.FileDownloader      java.util.concurrent.ExecutionException: java.lang.NegativeArraySizeException: -1163919360
2020-04-20T21:22:40.252Z        ERROR   RemoteFetchProcessor RUNNING    com.qubole.rubix.bookkeeper.FileDownloader      java.util.concurrent.ExecutionException: java.lang.NegativeArraySizeException: -1883242496
2020-04-20T21:22:40.252Z        ERROR   RemoteFetchProcessor RUNNING    com.qubole.rubix.bookkeeper.FileDownloader      java.util.concurrent.ExecutionException: java.lang.NegativeArraySizeException: -1426063360
2020-04-20T21:22:40.252Z        ERROR   RemoteFetchProcessor RUNNING    com.qubole.rubix.bookkeeper.FileDownloader      java.util.concurrent.ExecutionException: java.lang.NegativeArraySizeException: -1955594240
2020-04-20T21:23:27.850Z        ERROR   RemoteFetchProcessor RUNNING    com.qubole.rubix.bookkeeper.FileDownloader      ReadData didn't match with requested value. RequestedData: 2366636032 ReadData: -1928331264
2020-04-20T21:23:39.748Z        ERROR   RemoteFetchProcessor RUNNING    com.qubole.rubix.bookkeeper.FileDownloader      ReadData didn't match with requested value. RequestedData: 2559574016 ReadData: -1735393280
2020-04-20T21:23:39.755Z        ERROR   RemoteFetchProcessor RUNNING    com.qubole.rubix.bookkeeper.FileDownloader      java.util.concurrent.ExecutionException: java.lang.NegativeArraySizeException: -329901701

Exception which failed query:
This is repeated for multiple SplitRunners.

2020-04-20T21:23:39.777Z        ERROR   SplitRunner-91-151      io.prestosql.execution.executor.TaskExecutor    Error processing Split 20200420_212233_00006_n3ygz.1.0-1686 {path=s3://<redacted>/tpch-sf1000-ORC/lineitem/20180106_235637_00251_ecxdi_9defbd28-e6c7-44fc-9ec8-a0c401c7133a, start=19730006016, length=67108864, fileSize=36236363905, hosts=[172.40.247.208], database=tpch_sf1000_orc, table=lineitem, forceLocalScheduling=false, partitionName=<UNPARTITIONED>, s3SelectPushdownEnabled=false} (start = 4.032926451763E7, wall = 1703 ms, cpu = 269 ms, wait = 84 ms, calls = 2): HIVE_BAD_DATA: Malformed ORC file. Could not decompress all input (output buffer too small?) [s3://<redacted>/tpch-sf1000-ORC/lineitem/20180106_235637_00251_ecxdi_9defbd28-e6c7-44fc-9ec8-a0c401c7133a]
io.prestosql.spi.PrestoException: Malformed ORC file. Could not decompress all input (output buffer too small?) [s3://<redacted>/tpch-sf1000-ORC/lineitem/20180106_235637_00251_ecxdi_9defbd28-e6c7-44fc-9ec8-a0c401c7133a]
        at io.prestosql.plugin.hive.orc.OrcPageSource.handleException(OrcPageSource.java:127)
        at io.prestosql.plugin.hive.orc.OrcPageSourceFactory.lambda$createOrcPageSource$4(OrcPageSourceFactory.java:282)
        at io.prestosql.orc.OrcBlockFactory$OrcBlockLoader.load(OrcBlockFactory.java:83)
        at io.prestosql.spi.block.LazyBlock$LazyData.load(LazyBlock.java:381)
        at io.prestosql.spi.block.LazyBlock$LazyData.getFullyLoadedBlock(LazyBlock.java:360)
        at io.prestosql.spi.block.LazyBlock.getLoadedBlock(LazyBlock.java:276)
        at io.prestosql.plugin.hive.orc.OrcPageSource$SourceColumn$MaskingBlockLoader.load(OrcPageSource.java:266)
        at io.prestosql.spi.block.LazyBlock$LazyData.load(LazyBlock.java:381)
        at io.prestosql.spi.block.LazyBlock$LazyData.getFullyLoadedBlock(LazyBlock.java:360)
        at io.prestosql.spi.block.LazyBlock.getLoadedBlock(LazyBlock.java:276)
        at io.prestosql.spi.Page.getLoadedPage(Page.java:273)
        at io.prestosql.operator.TableScanOperator.getOutput(TableScanOperator.java:305)
        at io.prestosql.operator.Driver.processInternal(Driver.java:379)
        at io.prestosql.operator.Driver.lambda$processFor$8(Driver.java:283)
        at io.prestosql.operator.Driver.tryWithLock(Driver.java:675)
        at io.prestosql.operator.Driver.processFor(Driver.java:276)
        at io.prestosql.execution.SqlTaskExecution$DriverSplitRunner.processFor(SqlTaskExecution.java:1075)
        at io.prestosql.execution.executor.PrioritizedSplitRunner.process(PrioritizedSplitRunner.java:163)
        at io.prestosql.execution.executor.TaskExecutor$TaskRunner.run(TaskExecutor.java:484)
        at io.prestosql.$gen.Presto_332____20200420_210231_2.run(Unknown Source)
        at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)
        at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
        at java.base/java.lang.Thread.run(Thread.java:834)
Caused by: io.prestosql.orc.OrcCorruptionException: Malformed ORC file. Could not decompress all input (output buffer too small?) [s3://<redacted>/tpch-sf1000-ORC/lineitem/20180106_235637_00251_ecxdi_9defbd28-e6c7-44fc-9ec8-a0c401c7133a]
        at io.prestosql.orc.OrcZlibDecompressor.decompress(OrcZlibDecompressor.java:59)
        at io.prestosql.orc.stream.CompressedOrcChunkLoader.nextChunk(CompressedOrcChunkLoader.java:128)
        at io.prestosql.orc.stream.OrcInputStream.advance(OrcInputStream.java:204)
        at io.prestosql.orc.stream.OrcInputStream.read(OrcInputStream.java:96)
        at io.prestosql.orc.stream.OrcInputStream.readFully(OrcInputStream.java:121)
        at io.prestosql.orc.stream.ByteArrayInputStream.next(ByteArrayInputStream.java:43)
        at io.prestosql.orc.reader.SliceDirectColumnReader.readBlock(SliceDirectColumnReader.java:218)
        at io.prestosql.orc.reader.SliceColumnReader.readBlock(SliceColumnReader.java:77)
        at io.prestosql.orc.OrcBlockFactory$OrcBlockLoader.load(OrcBlockFactory.java:76)
        ... 20 more

More snippets AFTER query failure:

2020-04-20T21:23:39.783Z        ERROR   SplitRunner-8-68        io.prestosql.execution.executor.TaskExecutor    Error processing Split 20200420_212233_00006_n3ygz.1.0-1715 {path=s3://<redacted>/tpch-sf1000-ORC/lineitem/20180106_235637_00251_ecxdi_19e8c24a-7fc9-4512-a296-555ab4efc0c2, start=22548578304, length=67108864, fileSize=37290938397, hosts=[172.40.247.208], database=tpch_sf1000_orc, table=lineitem, forceLocalScheduling=false, partitionName=<UNPARTITIONED>, s3SelectPushdownEnabled=false} (start = 4.0330905983724E7, wall = 67 ms, cpu = 0 ms, wait = 0 ms, calls = 1): HIVE_CURSOR_ERROR: Failed to read ORC file: s3://<redacted>/tpch-sf1000-ORC/lineitem/20180106_235637_00251_ecxdi_19e8c24a-7fc9-4512-a296-555ab4efc0c2
io.prestosql.spi.PrestoException: Failed to read ORC file: s3://<redacted>/tpch-sf1000-ORC/lineitem/20180106_235637_00251_ecxdi_19e8c24a-7fc9-4512-a296-555ab4efc0c2
        at io.prestosql.plugin.hive.orc.OrcPageSource.handleException(OrcPageSource.java:129)
        at io.prestosql.plugin.hive.orc.OrcPageSource.getNextPage(OrcPageSource.java:103)
        at io.prestosql.plugin.hive.HivePageSource.getNextPage(HivePageSource.java:244)
        at io.prestosql.operator.TableScanOperator.getOutput(TableScanOperator.java:302)
        at io.prestosql.operator.Driver.processInternal(Driver.java:379)
        at io.prestosql.operator.Driver.lambda$processFor$8(Driver.java:283)
        at io.prestosql.operator.Driver.tryWithLock(Driver.java:675)
        at io.prestosql.operator.Driver.processFor(Driver.java:276)
        at io.prestosql.execution.SqlTaskExecution$DriverSplitRunner.processFor(SqlTaskExecution.java:1075)
        at io.prestosql.execution.executor.PrioritizedSplitRunner.process(PrioritizedSplitRunner.java:163)
        at io.prestosql.execution.executor.TaskExecutor$TaskRunner.run(TaskExecutor.java:484)
        at io.prestosql.$gen.Presto_332____20200420_210231_2.run(Unknown Source)
        at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)
        at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
        at java.base/java.lang.Thread.run(Thread.java:834)
Caused by: java.lang.UnsupportedOperationException: Slice is not backed by a byte array
        at io.airlift.slice.Slice.checkHasByteArray(Slice.java:333)
        at io.airlift.slice.Slice.byteArray(Slice.java:352)
        at io.prestosql.orc.stream.CompressedOrcChunkLoader.nextChunk(CompressedOrcChunkLoader.java:129)
        at io.prestosql.orc.stream.OrcInputStream.advance(OrcInputStream.java:204)
        at io.prestosql.orc.stream.OrcInputStream.read(OrcInputStream.java:96)
        at io.prestosql.orc.protobuf.CodedInputStream$StreamDecoder.tryRefillBuffer(CodedInputStream.java:2715)
        at io.prestosql.orc.protobuf.CodedInputStream$StreamDecoder.isAtEnd(CodedInputStream.java:2654)
        at io.prestosql.orc.protobuf.CodedInputStream$StreamDecoder.readTag(CodedInputStream.java:2004)
        at io.prestosql.orc.proto.OrcProto$StripeFooter.<init>(OrcProto.java:15115)
        at io.prestosql.orc.proto.OrcProto$StripeFooter.<init>(OrcProto.java:15084)
        at io.prestosql.orc.proto.OrcProto$StripeFooter$1.parsePartialFrom(OrcProto.java:16709)
        at io.prestosql.orc.proto.OrcProto$StripeFooter$1.parsePartialFrom(OrcProto.java:16704)
        at io.prestosql.orc.protobuf.AbstractParser.parseFrom(AbstractParser.java:91)
        at io.prestosql.orc.protobuf.AbstractParser.parseFrom(AbstractParser.java:96)
        at io.prestosql.orc.protobuf.AbstractParser.parseFrom(AbstractParser.java:49)
        at io.prestosql.orc.protobuf.GeneratedMessageV3.parseWithIOException(GeneratedMessageV3.java:311)
        at io.prestosql.orc.proto.OrcProto$StripeFooter.parseFrom(OrcProto.java:15517)
        at io.prestosql.orc.metadata.OrcMetadataReader.readStripeFooter(OrcMetadataReader.java:166)
        at io.prestosql.orc.metadata.ExceptionWrappingMetadataReader.readStripeFooter(ExceptionWrappingMetadataReader.java:84)
        at io.prestosql.orc.StripeReader.readStripeFooter(StripeReader.java:383)
        at io.prestosql.orc.StripeReader.readStripe(StripeReader.java:119)
        at io.prestosql.orc.OrcRecordReader.advanceToNextStripe(OrcRecordReader.java:504)
        at io.prestosql.orc.OrcRecordReader.advanceToNextRowGroup(OrcRecordReader.java:446)
        at io.prestosql.orc.OrcRecordReader.nextPage(OrcRecordReader.java:367)
        at io.prestosql.plugin.hive.orc.OrcPageSource.getNextPage(OrcPageSource.java:99)
        ... 13 more

then repeated multiple times for different files:

2020-04-20T21:23:39.827Z        ERROR   20200420_212233_00006_n3ygz.1.0-1690-130        com.qubole.rubix.core.CachedReadRequestChain    Fall back to read from object store for /var/lib/presto/cache/fcache/<redacted>/tpch-sf1000-ORC/lineitem/20180106_235637_00251_ecxdi_19e8c24a-7fc9-4512-a296-555ab4efc0c2 .Could not read data from cached file :
com.qubole.rubix.core.ReadRequestChain$CancelledException: com.qubole.rubix.core.CachedReadRequestChain Cancelled
        at com.qubole.rubix.core.ReadRequestChain.propagateCancel(ReadRequestChain.java:107)
        at com.qubole.rubix.core.CachedReadRequestChain.call(CachedReadRequestChain.java:102)
        at com.qubole.rubix.core.CachedReadRequestChain.call(CachedReadRequestChain.java:37)
        at com.google.shaded.shaded.common.util.concurrent.TrustedListenableFutureTask$TrustedFutureInterruptibleTask.runInterruptibly(TrustedListenableFutureTask.java:108)
        at com.google.shaded.shaded.common.util.concurrent.InterruptibleTask.run(InterruptibleTask.java:41)
        at com.google.shaded.shaded.common.util.concurrent.TrustedListenableFutureTask.run(TrustedListenableFutureTask.java:77)
        at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)
        at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
        at java.base/java.lang.Thread.run(Thread.java:834)

And also multiple times:

2020-04-20T21:23:40.793Z        WARN    20200420_212233_00006_n3ygz.1.0-1709-116        com.qubole.rubix.bookkeeper.FileMetadata        Evicting s3://<redacted>/tpch-sf1000-ORC/lineitem/20180106_235637_00251_ecxdi_0dbfd7b1-920d-43b6-8179-8c
53dab105d0 due to EXPLICIT
2020-04-20T21:23:41.044Z        ERROR   20200420_212233_00006_n3ygz.1.0-1596-126        com.qubole.rubix.core.CachingInputStream        Could not get FileInfo for s3://<redacted>/tpch-sf1000-ORC/lineitem/20180106_235637_00251_ecxdi_df899620
-c77b-4012-a32c-1eb8cd1119ef. Fetching FileStatus from remote file system :
java.util.MissingFormatArgumentException: Format specifier '%s'
        at java.base/java.util.Formatter.format(Formatter.java:2672)
        at java.base/java.util.Formatter.format(Formatter.java:2609)
        at java.base/java.lang.String.format(String.java:2897)
        at com.qubole.rubix.bookkeeper.BookKeeper.getFileInfo(BookKeeper.java:415)
        at com.qubole.rubix.spi.LocalBookKeeperClient.getFileInfo(LocalBookKeeperClient.java:76)
        at com.qubole.rubix.core.CachingInputStream.<init>(CachingInputStream.java:106)
        at com.qubole.rubix.core.CachingFileSystem.open(CachingFileSystem.java:168)
        at org.apache.hadoop.fs.PrestoFileSystemCache$FileSystemWrapper.open(PrestoFileSystemCache.java:327)
        at org.apache.hadoop.fs.FileSystem.open(FileSystem.java:899)
        at io.prestosql.plugin.hive.orc.OrcPageSourceFactory.lambda$createOrcPageSource$0(OrcPageSourceFactory.java:198)
        at io.prestosql.plugin.hive.authentication.NoHdfsAuthentication.doAs(NoHdfsAuthentication.java:23)
        at io.prestosql.plugin.hive.HdfsEnvironment.doAs(HdfsEnvironment.java:89)
        at io.prestosql.plugin.hive.orc.OrcPageSourceFactory.createOrcPageSource(OrcPageSourceFactory.java:198)
        at io.prestosql.plugin.hive.orc.OrcPageSourceFactory.createPageSource(OrcPageSourceFactory.java:147)
        at io.prestosql.plugin.hive.HivePageSourceProvider.createHivePageSource(HivePageSourceProvider.java:184)
        at io.prestosql.plugin.hive.HivePageSourceProvider.createPageSource(HivePageSourceProvider.java:102)
        at io.prestosql.plugin.base.classloader.ClassLoaderSafeConnectorPageSourceProvider.createPageSource(ClassLoaderSafeConnectorPageSourceProvider.java:57)
        at io.prestosql.split.PageSourceManager.createPageSource(PageSourceManager.java:64)
        at io.prestosql.operator.TableScanOperator.getOutput(TableScanOperator.java:299)
        at io.prestosql.operator.Driver.processInternal(Driver.java:379)
        at io.prestosql.operator.Driver.lambda$processFor$8(Driver.java:283)
        at io.prestosql.operator.Driver.tryWithLock(Driver.java:675)
        at io.prestosql.operator.Driver.processFor(Driver.java:276)
        at io.prestosql.execution.SqlTaskExecution$DriverSplitRunner.processFor(SqlTaskExecution.java:1075)
        at io.prestosql.execution.executor.PrioritizedSplitRunner.process(PrioritizedSplitRunner.java:163)
        at io.prestosql.execution.executor.TaskExecutor$TaskRunner.run(TaskExecutor.java:484)
        at io.prestosql.$gen.Presto_332____20200420_210231_2.run(Unknown Source)
        at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)
        at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
        at java.base/java.lang.Thread.run(Thread.java:834)

Some (not all) followup queries failed with:

io.prestosql.spi.PrestoException: Failed to read ORC file: s3://<redacted>/tpch-sf1000-ORC/lineitem/20180106_235637_00251_ecxdi_df899620-c77b-4012-a32c-1eb8cd1119ef
    at io.prestosql.plugin.hive.orc.OrcPageSource.handleException(OrcPageSource.java:129)
    at io.prestosql.plugin.hive.orc.OrcPageSource.getNextPage(OrcPageSource.java:103)
    at io.prestosql.plugin.hive.HivePageSource.getNextPage(HivePageSource.java:244)
    at io.prestosql.operator.TableScanOperator.getOutput(TableScanOperator.java:302)
    at io.prestosql.operator.Driver.processInternal(Driver.java:379)
    at io.prestosql.operator.Driver.lambda$processFor$8(Driver.java:283)
    at io.prestosql.operator.Driver.tryWithLock(Driver.java:675)
    at io.prestosql.operator.Driver.processFor(Driver.java:276)
    at io.prestosql.execution.SqlTaskExecution$DriverSplitRunner.processFor(SqlTaskExecution.java:1075)
    at io.prestosql.execution.executor.PrioritizedSplitRunner.process(PrioritizedSplitRunner.java:163)
    at io.prestosql.execution.executor.TaskExecutor$TaskRunner.run(TaskExecutor.java:484)
    at io.prestosql.$gen.Presto_332____20200420_210231_2.run(Unknown Source)
    at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)
    at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
    at java.base/java.lang.Thread.run(Thread.java:834)
Caused by: java.lang.UnsupportedOperationException: Slice is not backed by a byte array
    at io.airlift.slice.Slice.checkHasByteArray(Slice.java:333)
    at io.airlift.slice.Slice.byteArray(Slice.java:352)
    at io.prestosql.orc.stream.CompressedOrcChunkLoader.nextChunk(CompressedOrcChunkLoader.java:129)
    at io.prestosql.orc.stream.OrcInputStream.advance(OrcInputStream.java:204)
    at io.prestosql.orc.stream.OrcInputStream.read(OrcInputStream.java:96)
    at io.prestosql.orc.protobuf.CodedInputStream$StreamDecoder.tryRefillBuffer(CodedInputStream.java:2715)
    at io.prestosql.orc.protobuf.CodedInputStream$StreamDecoder.isAtEnd(CodedInputStream.java:2654)
    at io.prestosql.orc.protobuf.CodedInputStream$StreamDecoder.readTag(CodedInputStream.java:2004)
    at io.prestosql.orc.proto.OrcProto$StripeFooter.<init>(OrcProto.java:15115)
    at io.prestosql.orc.proto.OrcProto$StripeFooter.<init>(OrcProto.java:15084)
    at io.prestosql.orc.proto.OrcProto$StripeFooter$1.parsePartialFrom(OrcProto.java:16709)
    at io.prestosql.orc.proto.OrcProto$StripeFooter$1.parsePartialFrom(OrcProto.java:16704)
    at io.prestosql.orc.protobuf.AbstractParser.parseFrom(AbstractParser.java:91)
    at io.prestosql.orc.protobuf.AbstractParser.parseFrom(AbstractParser.java:96)
    at io.prestosql.orc.protobuf.AbstractParser.parseFrom(AbstractParser.java:49)
    at io.prestosql.orc.protobuf.GeneratedMessageV3.parseWithIOException(GeneratedMessageV3.java:311)
    at io.prestosql.orc.proto.OrcProto$StripeFooter.parseFrom(OrcProto.java:15517)
    at io.prestosql.orc.metadata.OrcMetadataReader.readStripeFooter(OrcMetadataReader.java:166)
    at io.prestosql.orc.metadata.ExceptionWrappingMetadataReader.readStripeFooter(ExceptionWrappingMetadataReader.java:84)
    at io.prestosql.orc.StripeReader.readStripeFooter(StripeReader.java:383)
    at io.prestosql.orc.StripeReader.readStripe(StripeReader.java:119)
    at io.prestosql.orc.OrcRecordReader.advanceToNextStripe(OrcRecordReader.java:504)
    at io.prestosql.orc.OrcRecordReader.advanceToNextRowGroup(OrcRecordReader.java:446)
    at io.prestosql.orc.OrcRecordReader.nextPage(OrcRecordReader.java:367)
    at io.prestosql.plugin.hive.orc.OrcPageSource.getNextPage(OrcPageSource.java:99)
    ... 13 more

Question: Could it be because coordinator and worker are sharing same cache directory? It seems like coordinator should not touch the content of the directory but maybe it does and it is a reason of read failures on worker?

Reconfigured Presto so coordinator and worker do not share cache directory.
I did not get failed query yet but I see NegativeArraySizeException errors in worker log:

2020-04-20T23:01:40.136Z        ERROR   RemoteFetchProcessor RUNNING    com.qubole.rubix.bookkeeper.FileDownloader      ReadData didn't match with requested value. RequestedData: 6786706824 ReadData: -1803227768
2020-04-20T23:01:40.136Z        ERROR   RemoteFetchProcessor RUNNING    com.qubole.rubix.bookkeeper.FileDownloader      java.util.concurrent.ExecutionException: java.lang.NegativeArraySizeException: -1247805440
2020-04-20T23:01:40.136Z        ERROR   RemoteFetchProcessor RUNNING    com.qubole.rubix.bookkeeper.FileDownloader      java.util.concurrent.ExecutionException: java.lang.NegativeArraySizeException: -1515192320
2020-04-20T23:01:40.136Z        ERROR   RemoteFetchProcessor RUNNING    com.qubole.rubix.bookkeeper.FileDownloader      java.util.concurrent.ExecutionException: java.lang.NegativeArraySizeException: -931135488

[.... 5 minutes later...]

Already got a failed query:

io.prestosql.spi.PrestoException: Failed to read ORC file: s3://<redacted>/tpch-sf1000-ORC/lineitem/20180106_235637_00251_ecxdi_9defbd28-e6c7-44fc-9ec8-a0c401c7133a
    at io.prestosql.plugin.hive.orc.OrcPageSource.handleException(OrcPageSource.java:129)
    at io.prestosql.plugin.hive.orc.OrcPageSourceFactory.lambda$createOrcPageSource$4(OrcPageSourceFactory.java:282)
    at io.prestosql.orc.OrcBlockFactory$OrcBlockLoader.load(OrcBlockFactory.java:83)
    at io.prestosql.spi.block.LazyBlock$LazyData.load(LazyBlock.java:381)
    at io.prestosql.spi.block.LazyBlock$LazyData.getFullyLoadedBlock(LazyBlock.java:360)
    at io.prestosql.spi.block.LazyBlock.getLoadedBlock(LazyBlock.java:276)
    at io.prestosql.plugin.hive.orc.OrcPageSource$SourceColumn$MaskingBlockLoader.load(OrcPageSource.java:266)
    at io.prestosql.spi.block.LazyBlock$LazyData.load(LazyBlock.java:381)
    at io.prestosql.spi.block.LazyBlock$LazyData.getFullyLoadedBlock(LazyBlock.java:360)
    at io.prestosql.spi.block.LazyBlock.getLoadedBlock(LazyBlock.java:276)
    at io.prestosql.spi.Page.getLoadedPage(Page.java:273)
    at io.prestosql.operator.TableScanOperator.getOutput(TableScanOperator.java:305)
    at io.prestosql.operator.Driver.processInternal(Driver.java:379)
    at io.prestosql.operator.Driver.lambda$processFor$8(Driver.java:283)
    at io.prestosql.operator.Driver.tryWithLock(Driver.java:675)
    at io.prestosql.operator.Driver.processFor(Driver.java:276)
    at io.prestosql.execution.SqlTaskExecution$DriverSplitRunner.processFor(SqlTaskExecution.java:1075)
    at io.prestosql.execution.executor.PrioritizedSplitRunner.process(PrioritizedSplitRunner.java:163)
    at io.prestosql.execution.executor.TaskExecutor$TaskRunner.run(TaskExecutor.java:484)
    at io.prestosql.$gen.Presto_332____20200420_225629_2.run(Unknown Source)
    at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)
    at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
    at java.base/java.lang.Thread.run(Thread.java:834)
Caused by: java.lang.UnsupportedOperationException: Slice is not backed by a byte array
    at io.airlift.slice.Slice.checkHasByteArray(Slice.java:333)
    at io.airlift.slice.Slice.byteArray(Slice.java:352)
    at io.prestosql.orc.stream.CompressedOrcChunkLoader.nextChunk(CompressedOrcChunkLoader.java:129)
    at io.prestosql.orc.stream.OrcInputStream.advance(OrcInputStream.java:204)
    at io.prestosql.orc.stream.OrcInputStream.read(OrcInputStream.java:83)
    at io.prestosql.orc.stream.LongInputStreamV2.readValues(LongInputStreamV2.java:64)
    at io.prestosql.orc.stream.LongInputStreamV2.next(LongInputStreamV2.java:366)
    at io.prestosql.orc.reader.SliceDirectColumnReader.readBlock(SliceDirectColumnReader.java:146)
    at io.prestosql.orc.reader.SliceColumnReader.readBlock(SliceColumnReader.java:77)
    at io.prestosql.orc.OrcBlockFactory$OrcBlockLoader.load(OrcBlockFactory.java:76)
    ... 20 more

I will try to repro this to debug. Can you help with the repro steps @losipiuk . Just running the above mentioned query leads to this? Or need to run concurrent instances of it? Or something else?

Thanks @stagraqubole. I edited the issue description adding Environment section which describes my test setup. To reproduce that I needed to run query couple of times, but not too many. Typically for 10 runs I had 3+ failed.

I am suspecting a integer overflow issue here. With the single file sized 33GB, we might be trying to store the position in int. Looking if I have files of larger sizes.

@stagraqubole if this is file offset overflowing, you need files only over 2GB

@stagraqubole FYI that with hive.cache.parallel-warmup-enabled=false I was not able to reproduce this. I run 20+ queries in the loop and all completed successfully and no errors were logged. The execution speed variance was huge though (1.5-3 minutes in wall time and 50-70 minutes CPU time)

This is an issue in the parallel warmup code, related to integer overflow. It can happen with big files (>2GB) but not necessarily always. It happens when we accumulate requests of more than 2GB of contiguous data to be downloaded in a period of 10secs.

Thanks @stagraqubole for looking into that. Can you post a tracking issue here so we have a visibility?

Also what changes in Rubix operation with and without parallel warmup? Can you maybe write a short summary. Or maybe there is an article to read which explains that?

Created https://github.com/qubole/rubix/issues/361 and mentioned a short summary of issue there.

The main difference in case of parallel warmup is that the read issued by client in this case is fulfilled directly from cloud store while requests to download the data are registered with the server. The server keeps checking these requests every second and handles all the requests that are more than 10sec old and start the download of the data. All this happen asynchronously wrt to client (Presto readers in our case).

In case parallel warmup is disabled, during the same read that client issued we download the data and warmup the cache. Hence the higher latency but overall data read is half of what we read with parallel warmup.

So "parallel warmup is disabled" => cache is read-through.
and with parallel warmup the cache is populated asynchronously.
Is my understanding correct?

That is correct @findepi

Fixed by: https://github.com/prestosql/presto/pull/3772 (as it upgrades Rubix so that it has https://github.com/qubole/rubix/pull/368).

Was this page helpful?
0 / 5 - 0 ratings