Presto: Release notes for 342

Created on 9 Sep 2020 · 29 Comments · Source: prestosql/presto

Dain Sundstrom

  • [x] all checked
  • 09-11 20:58:17 33d3cdcb16 Fully buffer small RC files
  • 09-11 20:58:17 474492526e Add getRetainedSize to OrcDataSource
  • 09-11 20:58:17 551c599daf Introduce FSDataInputStreamTail reads
  • 09-11 20:58:17 563dcca8b8 Cleanup warnings in TestOrcPageSourceMemoryTracking
  • 09-11 20:58:17 66d0a2e087 Cleanup ParquetPageSourceFactory
  • 09-11 20:58:17 75f426e9c5 Support CSE-KMS S3 object file size detection via tail read
  • 09-11 20:58:17 794b5e76e9 Add MemoryOrcDataSource
  • 09-11 20:58:17 7ffe98d7b7 Change Parquet MetadataReader to use ParquetDataSource
  • 09-11 20:58:17 95dedea726 Add readTail to OrcDataSource
  • 09-11 20:58:17 b1cc663071 Rename Hive fileSize to estimatedFileSize
  • 09-11 20:58:17 b22d7528fb Change ParquetDataSource readFully to return Slice
  • 09-11 20:58:17 cd6faef077 Remove unused method from ParquetDataSource
  • 09-11 20:58:17 eb80079f93 Add readTail to ParquetDataSource
  • 09-11 20:58:17 ffe79e2818 Add memory tests for both cached and uncached ORC
  • 09-17 10:18:57 2df6ccad79 Remove empty section
  • 09-17 10:18:57 42f7ac59eb Document support for insecure LDAP connection
  • 09-17 10:18:57 a477317483 Update https over http forwarding property name

    David Phillips

  • [x] all checked

  • 09-11 12:21:35 10f8a3645c Fix performance regression when hive SerDe doesn't prefer Writables
  • 09-16 14:03:43 1ec879ec3a Fix JAVA_HOME in container image
  • 09-16 14:03:43 2d97150678 Add new test to verify if JAVA_HOME works
  • 09-16 14:17:42 bd3ad39457 Add Docker requirement to README
  • 09-21 22:29:26 c6bd981e9a Update documentation for time and timestamp types
  • 09-22 18:39:29 a35e756f27 Update to Avro 1.9.2
  • 09-22 18:44:38 643bfd1b31 Support Domain Expressions for GlueHiveMetastore
  • 09-22 18:46:02 b070c0bae2 Cleanup warnings in AbstractTestIcebergSmoke
  • 09-23 11:58:30 fa0f7dbf61 Match Iceberg transforms for negative epoch values

    Grzegorz Kokosiński

  • [ ] all checked

  • 09-22 02:20:56 5f9d7aebac Close Postgres test resources in order
  • 09-22 02:20:56 a223d021a9 Remove extra this

    Karol Sobczak

  • [x] all checked

  • 09-09 03:55:29 1feaa0f928 Support lazy dynamic filtering in hive connector
  • 09-09 03:55:29 9188bbf340 Move searchScanFilterAndProjectOperatorStats
  • 09-09 04:33:33 4e97cb091c Extract local variable
  • 09-09 04:33:33 519086aceb Make 128-bit addition use 2*64 bit values
  • 09-09 04:33:33 954611db7b Add unscaled values to decimal operators addition benchmark
  • 09-09 04:33:33 ae85047c6e Change way of checking sign in 128-bit arithmetic
  • 09-10 04:24:30 6803205ad7 Add benchmark for projected column reads
  • 09-10 04:24:30 92cfdd1a3f Support benchmark reads through HivePageSource
  • 09-10 04:24:30 d784071c33 Extract TestData as a top-level class
  • 09-10 04:27:15 ef5d7ae099 Make DynamicFilter future resilient to cancel
  • 09-10 04:39:58 61c72c4b17 Extract common function in TestMemorySmoke
  • 09-10 04:39:58 af4e8568f3 Implement dynamic filtering for semi-joins
  • 09-11 02:53:45 cb300dcec0 Add DynamicFilter#isAwaitable method
  • 09-11 03:02:30 99647e9580 Wait for final QueryInfo before using in test
  • 09-15 04:10:10 ad5a676ddf Remove duplicate tpc-ds queries
  • 09-15 04:39:48 0fb16ab9d9 Add tests for semi-join dynamic filtering in hive
  • 09-15 05:43:25 1e37382a08 Simplify TestCoordinatorDynamicFiltering tests
  • 09-15 05:43:25 22e5d39641 Support for lazy dynamic filters for replicated joins
  • 09-15 12:08:58 ebee077963 Change dynamic partition pruning tests to avoid failures
  • 09-16 01:41:24 d8cbadf637 Fix misuse of HashSet::new
  • 09-16 01:43:39 360bfcaa24 Use forEach on collection directly
  • 09-16 01:44:41 83bcdffc4c Use task status version for task status long polling notifications
  • 09-16 02:09:03 fc80e630d8 Fetch dynamic filters continuously
  • 09-16 02:57:38 db5be59bcc Revert "Fetch dynamic filters continuously"
  • 09-16 02:58:54 39a165fc6c Fetch dynamic filters continuously
  • 09-16 08:33:47 9faddbe9ad Remove OperatorStats related flaky assertions
  • 09-16 11:08:56 73fef2de98 Remove flaky assertion
  • 09-16 11:08:56 f06b7fe6ca Fix flaky TestSqlTask#testDynamicFilters
  • 09-17 03:39:05 25abaf0307 Choose join side with small tables as build side
  • 09-17 03:45:35 8dcd866655 Add SQL annotation to test method
  • 09-17 03:45:35 e0d1846bce Add createColorSequenceBlock in BlockAssertions
  • 09-17 03:45:35 f548a1e54f Implement collection of min/max values in DynamicFilterSourceOperator
  • 09-18 05:28:52 b05560192a Reuse Marker object
  • 09-19 11:46:37 cccbee4548 Fix using of session parameter in test
  • 09-21 04:43:16 d1f57a98ef Reduce size of serialized Range (and TupleDomain)
  • 09-21 07:27:39 3a8bf6438e Refactor more operators to use Page#getColumns where appropriate
  • 09-21 07:27:39 a9e9f07545 Add single int overload to Page#getColumns

    Martin Traverso

  • [x] all checked

  • 09-17 10:35:25 00c6d56ad5 Add rule for updating ApplyNode correlation list
  • 09-17 10:35:25 177c5a2eb3 Remove pruning of correlation list from project-off rule
  • 09-17 10:35:25 53c9d38344 Remove pruning of correlation list from project-off rule
  • 09-17 10:35:25 b6358261e8 Add rule for updating CorrelatedJoinNode correlation list
  • 09-22 13:31:15 353dba2003 Add from_iso8601_timestamp_nanos function

    Piotr Findeisen

  • [ ] all checked

  • 09-09 00:30:38 303436203c Fix class name to match code style
  • 09-09 00:30:38 cd33b044ec Remove some unnecessary usage of Number
  • 09-09 00:30:38 da50533cef Fix SQL type of property
  • 09-09 02:06:03 8732f675e6 Fix compiler error for lambda parameter with non-letter
  • 09-09 02:06:03 b848b843ed Remove bogus braces from regex
  • 09-09 02:06:03 ba1dc08167 Remove redundant constructor
  • 09-09 02:06:03 cc289c9c80 Fix indentation
  • 09-09 02:56:23 8046ff56f5 Update hive catalog configuration for development
  • 09-09 03:44:38 b1e46b9041 Update reference link
  • 09-09 07:34:52 7bb39ce7c1 Enable Cassandra insert test
  • 09-09 23:52:47 7442472701 Unimplement deprecated ConnectorSplitManager#getSplits variants
  • 09-09 23:52:47 a52dbbfa18 Use non-deprecated ConnectorSplitManager#getSplits overload in tests
  • 09-10 00:14:03 3013dcc2fe Remove duplicate assertion
  • 09-10 00:14:03 a82169e3fd Fix Atop predicate pushdown
  • 09-10 01:26:33 8e73b0eaea Remove incorrect default conversion
  • 09-10 05:25:42 2993a4ab9e Remove not applicable entry from release notes
  • 09-10 23:28:11 e7f8dd9629 Update some usages of deprecated TIMESTAMP_WITH_TIME_ZONE
  • 09-11 13:02:36 9e648a8ad8 Remove redundant supression
  • 09-11 13:05:05 fcea7c679f Add timestamp timezone configuration properties for HDP3 environment
  • 09-15 04:00:02 dba39ac894 Fix environment startup retries
  • 09-16 12:39:01 99bc355ed9 Hide setting HDFS user/group behind a toggle
  • 09-17 01:07:12 d39c3e3a79 Update docker images to version 33
  • 09-17 02:00:19 00f77552e4 Fix raw class usage
  • 09-17 02:00:19 7a6e3c0e90 Fix unused parameter in PostgreSQL test
  • 09-17 02:00:19 a8b4e79e43 Fix typo
  • 09-17 02:00:19 b250c2740f Add unit test for Session
  • 09-17 02:00:19 e04cdde8b3 Fix adding catalog property to session
  • 09-18 00:26:11 f71c2234fe Fix TestHiveAzureConfig
  • 09-18 01:28:48 b378f1fe9d Use AssertJ for better exception message
  • 09-18 01:28:48 d54e663db8 Rename TestngUtils to DataProviders
  • 09-18 01:28:48 f7159edfa9 Simplify collector definition
  • 09-18 01:29:37 60611517ff Require project version to be known
  • 09-18 01:34:16 2fe7727873 Deprecate isCharType, isVarcharType
  • 09-18 02:28:54 486db08df6 Deprecate isVarbinaryType
  • 09-19 01:31:19 27c4839f5f Validate char/varchar values read in JDBC connectors
  • 09-19 12:50:07 aeb6571ada Update deprecation notices
  • 09-21 05:40:45 8b17712066 Provide versionless link to server in product tests
  • 09-21 06:33:35 5bdd1c14ec Implement aggregation pushdown for SQL Server
  • 09-21 13:42:42 5a73b4a169 Remove commented out code
  • 09-22 00:23:00 093709c774 Use correct type in column declaration
  • 09-22 00:23:00 b23d2614d2 Deprecate JdbcTypeHandle constructor overload
  • 09-22 00:23:00 ee3cdc66b9 Add timestamp test case before epoch with fraction
  • 09-22 00:23:00 f6bf01bb3b Make JdbcTypeHandle.decimalDigits optional
  • 09-22 05:27:30 0701225c78 Document sqlserver aggregate function pushdown

    Praveen Krishna

  • [ ] all checked

  • 09-14 03:27:47 7ee94c9629 Use static imports for Preconditions

    Yuya Ebihara

  • [x] all checked

  • 09-08 19:18:30 2a2970c55b Remove unused WILDCARD_EXPRESSION
  • 09-10 00:37:03 b22fc1e0ad Resolve inconsistent error message of SHOW COLUMNS
  • 09-12 20:58:39 05b1226891 Allow INSERT null for SQL Server varbinary type
  • 09-16 01:50:12 24d59316a0 Do not copy constraint for temporary table on non-GTID MySQL

    Łukasz Osipiuk

  • [x] all checked

  • 09-09 03:20:58 8b825dc49e Update caching limitations
  • 09-09 11:23:37 55ad4cb0ce Extract method
  • 09-09 11:23:37 5671a6ef79 Add support for precision for TIMESTAMP W/TZ in Postgresql type mapping
  • 09-09 13:23:18 4ad9ab9ccb Inline unneeded methods
  • 09-09 13:23:18 e15ff10a4a Rename TestOracleTypes to TestOracleTypeMapping
  • 09-10 05:17:02 0b806e0577 Simplify flow in timestamp with timezone mapping tests
  • 09-10 05:17:02 267ada1787 Do not use raw parametrized type
  • 09-10 05:17:02 7a17dcecd6 Rename method paremeter to express its meaning
  • 09-10 05:17:02 be2b97a848 Add support for precision for TIMESTAMP in PostgreSQL type mapping
  • 09-10 05:17:02 cc78151214 Use generic trueFalse data provider for timestamp tests
  • 09-10 05:17:02 db779fc369 Inline addArrayTimestampTestIfSupported
  • 09-10 05:17:02 ee7d359751 Remove not needed annotations
  • 09-10 05:17:02 effe2a0ff7 Remove redundant cast
  • 09-10 11:40:54 8b3f1fe204 Make Type column wider in Kafka columns documentation
  • 09-10 11:40:54 a8345e2de9 Refactor KafkaInternalFieldDescription to KafkaInternalFieldManager
  • 09-10 11:40:54 d4469590a5 Add header column to Kafka Connector
  • 09-11 05:55:55 2a32d3f2b9 Do not use oracleServer explicitly
  • 09-11 05:55:55 65b57afd58 Extract AbstractTestOracleTypeMapping
  • 09-11 05:55:55 c9c4305af7 Rename method
  • 09-11 11:05:29 7d4a7a4020 Fix JDBC driver compatibility regarding TIME WITH TIME ZONE
  • 09-13 23:38:06 08936026a4 Support temporal types in Kafka JSON encoder
  • 09-13 23:38:06 1fec6627fa Add tests for Kafka JSON date time types
  • 09-13 23:38:06 3ef8cfdef1 Remove support for illogical types in rfc2822 record decoder
  • 09-13 23:38:06 456e1fa054 Account for zone offset in record decoder TIME WITH TIME ZONE decoding
  • 09-13 23:38:06 5634658be3 Add more functions to DateTimeTestingUtils
  • 09-13 23:38:06 6f9f12d249 Expand record decoder time w/tz tests
  • 09-13 23:38:06 feecdc9897 Add docs for Kafka JSON temporal support
  • 09-14 03:34:01 25cff689fe Improve product tests environment startup
  • 09-14 03:34:01 31bd3503ce Configure launcher bin location
  • 09-14 03:34:01 54e9e28693 Remove redundant pruneEnvironment call
  • 09-14 03:34:01 5ee948f08d Refactor logCopyingListener initialization
  • 09-14 03:34:01 7d3971b268 Add DockerContainer logical name
  • 09-14 03:34:01 9291cd6fdb Pass startup retries to environment builder
  • 09-14 03:34:01 94c8a5d658 Removeme: copy logs from containers
  • 09-14 03:34:01 a0ce3084d5 Fix suite describe command printing
  • 09-14 03:34:01 a260eab925 Add container output handling modes
  • 09-14 03:34:01 a482e37d2d Print container stats on container shutdown
  • 09-14 03:34:01 a643e38813 Make EnvironmentDefaults class final
  • 09-14 03:34:01 ac86653963 Drop environment configuration comments
  • 09-14 03:34:01 b14016624f Make launcher commands callable
  • 09-14 03:34:01 f9e83fa48f List and copy log files from running container
  • 09-14 05:25:42 31f570b691 Test current JDBC driver against old Presto releases
  • 09-14 05:25:42 7896b5274d Allow specifying Presto server version to be tested via env
  • 09-14 05:25:42 85fc1ec0f0 Extract BaseTestJdbcResultSet
  • 09-14 05:25:42 bb460c0b3a Guard JDBC tests against Presto server version
  • 09-14 05:25:42 d17975cfb1 Rename presto-test-jdbc-compatibility to presto-test-jdbc-compatibility-old-driver
  • 09-14 05:25:42 df348ba227 Make TestJdbcResultSet multi threaded
  • 09-14 09:48:57 0ec519da1e Set host configuration of product tests containers
  • 09-14 09:48:57 463f0e142e Fix displaying test run duration
  • 09-14 09:48:57 5664993f37 Fix suite describe
  • 09-14 09:48:57 c0edf8e82c Allow case insensitive enum values in launcher
  • 09-14 09:48:57 cdbce1d655 Improve Suite toString method
  • 09-14 09:48:57 f8f0089a9b Add timeout for suite and test execution
  • 09-17 01:43:59 6149fa2cb3 Remove unnecessary variable propagation
  • 09-17 01:43:59 f9ef95e803 Allow to configure insert into Hive partition via configuration property
  • 09-17 04:21:24 432bf869c3 Clean up PrestoAzureConfigurationInitializer
  • 09-17 04:21:24 44e1bee64c Rename ABFS tests as ABFS access key tests
  • 09-17 04:21:24 655a19d3dd Share more setup code between hive test scripts
  • 09-17 04:21:24 839476ea45 Add superclass for tests using different ABFS authentication methods
  • 09-17 04:21:24 a381973ad2 Fix indentation and line continuations in hive test scripts
  • 09-17 04:21:24 be89b2bab9 Move TestHiveAzureConfig to azure package
  • 09-17 04:21:24 cbabd27280 Add tests for PrestoAzureConfigurationInitializer
  • 09-17 04:28:23 2b3a1c32e3 Intercept and log listener exceptions
  • 09-17 04:28:23 327d6edbda Create new network on environment startup
  • 09-17 04:28:23 6f417ae71c Fix starting multinode environment without presto
  • 09-17 04:28:23 70e23dbf3d Improve hadoop-master-2 container configuration
  • 09-17 04:28:23 8757e7f788 Make EnvironmentDown callable
  • 09-17 04:28:23 9ccc78fbb2 Improve environment shutdown
  • 09-17 04:28:23 cbd91311da Remove unused constant
  • 09-17 04:28:23 d548911f97 Fix displaying stats
  • 09-17 11:36:10 cc30d16a7f Add support for ABFS OAuth authentication
  • 09-18 02:05:40 0893543aef Refactor and improve coverage for Oracle integration tests
release-notes

All 29 comments

* Fix query failure when lambda expression references a table column containing a dot. ({issue}`5087`)

https://github.com/prestosql/presto/pull/5087

* Add property (``hive.dynamic-filtering-probe-blocking-timeout``) for delaying table scans
  until dynamic partition pruning can be performed more efficiently. ({issue}`4991`)

https://github.com/prestosql/presto/pull/4991

* Improve performance of queries that use decimal type. ({issue}`4886`)

https://github.com/prestosql/presto/pull/4886
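
For readers wondering what the commit "Make 128-bit addition use 2*64 bit values" in the list above refers to in general terms, here is a minimal sketch of adding two 128-bit integers held as two 64-bit halves. The class name `Int128Sketch`, the two's-complement layout and the `{high, low}` return shape are assumptions made for illustration; Presto's actual decimal representation and code may well differ.

```java
// Illustrative sketch only; not taken from the Presto code base.
final class Int128Sketch {
    private Int128Sketch() {}

    /**
     * Adds two 128-bit two's-complement integers, each passed as (high, low) 64-bit halves.
     * Returns a two-element array {high, low}. Overflow past 128 bits wraps silently.
     */
    static long[] add(long aHigh, long aLow, long bHigh, long bLow) {
        long low = aLow + bLow;
        // The low halves overflowed (as unsigned values) exactly when the
        // wrapped sum is smaller than one of the operands.
        long carry = Long.compareUnsigned(low, aLow) < 0 ? 1 : 0;
        long high = aHigh + bHigh + carry;
        return new long[] {high, low};
    }
}
```

With this layout, `Int128Sketch.add(0, -1, 0, 1)` carries out of the low half and yields high = 1, low = 0. The whole addition is two 64-bit adds plus one unsigned comparison.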

Atop Connector Changes
* Fix incorrect query results when query contains predicates on `start_time` or `end_time` column. ({issue}`5125`)

https://github.com/prestosql/presto/pull/5125

* Make dynamic filter futures resilient to cancellation. ({issue}`5099`)

https://github.com/prestosql/presto/pull/5099

* Improve query performance by adding support for dynamic filtering and dynamic
  partition pruning to semi-join relational operator. ({issue}`5017`)

https://github.com/prestosql/presto/pull/5017
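
As a rough mental model of dynamic partition pruning for a semi-join: the values seen on the filtering side are collected first, and the probe-side scan then skips partitions that cannot match. The sketch below is a simplification under that assumption; `DynamicPruningSketch`, `collectFilter` and `prunePartitions` are hypothetical names, not Presto classes or methods.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Conceptual sketch only; the real engine works on pages, domains and splits.
final class DynamicPruningSketch {
    private DynamicPruningSketch() {}

    /** Collects the distinct join-key values observed on the filtering side. */
    static Set<Integer> collectFilter(List<Integer> filteringSideKeys) {
        return new HashSet<>(filteringSideKeys);
    }

    /** Keeps only the probe-side partitions whose key appears in the filter. */
    static List<String> prunePartitions(Map<Integer, String> partitionsByKey, Set<Integer> filter) {
        List<String> toScan = new ArrayList<>();
        for (Map.Entry<Integer, String> entry : partitionsByKey.entrySet()) {
            if (filter.contains(entry.getKey())) {
                toScan.add(entry.getValue());
            }
        }
        return toScan;
    }
}
```

For a query shaped like `WHERE key IN (SELECT ...)`, partitions whose key never appears in the subquery result are not read at all in this model.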

* Expose message headers as a ``_headers`` column of ``map(VARCHAR, array(VARBINARY))`` type. ({issue}`4462`)

https://github.com/prestosql/presto/pull/4462
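
A note on the column type: a Kafka message may carry several headers with the same key, which is presumably why each key maps to an array of values. The sketch below shows that grouping using plain JDK types; `HeadersColumnSketch` and `groupHeaders` are hypothetical names and this is not the connector's code.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Illustrative sketch only; header pairs are modelled as Map.Entry<String, byte[]>.
final class HeadersColumnSketch {
    private HeadersColumnSketch() {}

    /** Groups (key, value) header pairs by key, preserving the order of values per key. */
    static Map<String, List<byte[]>> groupHeaders(List<Map.Entry<String, byte[]>> headers) {
        Map<String, List<byte[]>> grouped = new LinkedHashMap<>();
        for (Map.Entry<String, byte[]> header : headers) {
            grouped.computeIfAbsent(header.getKey(), key -> new ArrayList<>())
                    .add(header.getValue());
        }
        return grouped;
    }
}
```

Two headers named `trace` would end up as one map entry whose array holds both values, in the order they were supplied.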

* Add ``DynamicFilter#isAwaitable`` method that returns whether the dynamic filter is still incomplete and can be
  awaited via a future. ({issue}`5043`)

https://github.com/prestosql/presto/pull/5043

* Extend type mapping to support variable-precision ``TIMESTAMP`` and ``TIMESTAMP WITH TIME ZONE`` types. ({issue}`5124`, {issue}`5105`)

https://github.com/prestosql/presto/pull/5124
https://github.com/prestosql/presto/pull/5105

SQL Server
* Fix failure when inserting `NULL` into a `VARBINARY` column. ({issue}`4846`)

https://github.com/prestosql/presto/pull/4846

* Add write support for ``TIME``, ``TIME WITH TIME ZONE``, ``TIMESTAMP`` and ``TIMESTAMP WITH TIME ZONE`` 
  for Kafka connector when JSON encoder is in use. ({issue}`4743`)

https://github.com/prestosql/presto/pull/4743

General/SPI
* Enable connectors to wait for dynamic filters derived from replicated join before generating splits. ({issue}`4685`)

https://github.com/prestosql/presto/pull/4685

* Improve performance of `INSERT` statement when MySQL instance isn't running with GTID mode. ({issue}`4995`)

https://github.com/prestosql/presto/pull/4995

* Improve dynamic partition pruning and query performance by reducing latency of dynamic filters collection. ({issue}`4988`)

https://github.com/prestosql/presto/pull/4988

* Disable matching the existing user and group of the table or partition when creating new files on HDFS.
  The functionality was added in 341 and is now disabled by default. You can enable it with `hive.fs.new-file-inherit-ownership`
  configuration property. ({issue}`5187`)

https://github.com/prestosql/presto/pull/5187

* Allow specifying what happens if data is inserted into an existing Hive partition.
  This can be done using the ``hive.insert-existing-partitions-behavior`` config property. ({issue}`4999`)

https://github.com/prestosql/presto/pull/4999

* Improve join performance when cost-based optimizer has missing or inaccurate stats. ({issue}`5141`)

https://github.com/prestosql/presto/pull/5141

## SQL Server Connector Changes

* Improve performance of aggregation queries by computing aggregations within SQL Server database.
  Currently, the following aggregate functions are eligible for pushdown:
  ``count``,  ``min``, ``max``, ``sum`` and ``avg``. ({issue}`4139`)

https://github.com/prestosql/presto/issues/4139 https://github.com/prestosql/presto/pull/5196

The SQL Server connector changes from @findepi above should be changed to just link to the docs, with a short sentence like:

* Add :ref:`aggregate function pushdown <sqlserver-pushdown>` as performance improvement ({issue}`4139`)

see https://github.com/prestosql/presto/pull/5245

* Add support for ABFS OAuth authentication ({issue}`5052`)

https://github.com/prestosql/presto/pull/5052

* Drop decoding support in the JSON decoder for nonsensical combinations of input-format-type and data-type for temporal types.
  The following combinations are no longer supported:
  - ``rfc2822``: ``DATE``, ``TIME``, ``TIME WITH TIME ZONE``
  - ``milliseconds-since-epoch``: ``TIME WITH TIME ZONE``, ``TIMESTAMP WITH TIME ZONE``
  - ``seconds-since-epoch``: ``TIME WITH TIME ZONE``, ``TIMESTAMP WITH TIME ZONE``
  ({issue}`4743`)

https://github.com/prestosql/presto/pull/4743

## Hive
* Add support for S3 encrypted files. ({issue}`2536`)
* Improve performance of reading small files in RCFile format. ({issue}`2536`)

2536

* Reduce latency for queries where broadcast join is used and broadcasted table is large. ({issue}`5237`)

https://github.com/prestosql/presto/pull/5237

* Support reading timestamp with microsecond or nanosecond precision. This can be enabled with `hive.timestamp-precision`
  connector configuration property. ({issue}`4953`)

https://github.com/prestosql/presto/pull/4953
part of https://github.com/prestosql/presto/issues/3977

# Hive Connector Changes

* Improve performance when reading `JSON` and `CSV` file formats. ({issue}`5142`)

5142

# Hive Connector Changes

* Improve planning time for queries with non-equality filters on
  partition columns when using the Glue metastore. ({issue}`5060`)

5060

# Iceberg Connector Changes

* Fix partition transforms for temporal columns for dates before 1970. ({issue}`5273`)

5273
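
For context on why pre-1970 values are a common trouble spot in epoch-based transforms: Java's `/` rounds toward zero, while `Math.floorDiv` rounds toward negative infinity, so the two disagree for negative inputs. The sketch below only shows that difference. The thread does not say which rounding Presto or Iceberg used before or after the fix, and `EpochTransformSketch` with its method names is hypothetical.

```java
// Illustrative sketch only; not the Iceberg or Presto transform code.
final class EpochTransformSketch {
    private static final long MICROS_PER_DAY = 86_400_000_000L;

    private EpochTransformSketch() {}

    /** Truncating division: rounds toward zero. */
    static long dayTruncating(long epochMicros) {
        return epochMicros / MICROS_PER_DAY;
    }

    /** Floor division: rounds toward negative infinity. */
    static long dayFloor(long epochMicros) {
        return Math.floorDiv(epochMicros, MICROS_PER_DAY);
    }
}
```

One microsecond before the epoch, `dayTruncating(-1)` gives 0 while `dayFloor(-1)` gives -1, so the two approaches place the same instant in different day partitions. For non-negative inputs they agree.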

* Allow collection of dynamic filters for joins with large build side using the
  `enable-large-dynamic-filters` configuration property or the `enable_large_dynamic_filters`
  session property.
  The existing configuration properties `dynamic-filtering-max-per-driver-row-count`,
  `dynamic-filtering-max-per-driver-size`, `dynamic-filtering-range-row-limit-per-driver`
  and their corresponding session properties are now defunct.
  When large dynamic filters are enabled, limits on size of dynamic filters can be configured
  for each join distribution type using the configuration properties
  `dynamic-filtering.large-broadcast.max-distinct-values-per-driver`,
  `dynamic-filtering.large-broadcast.max-size-per-driver` and
  `dynamic-filtering.large-broadcast.range-row-limit-per-driver` and their equivalent for partitioned
  join distribution type.
  Similarly, limits for dynamic filters when `enable-large-dynamic-filters` is not enabled
  can be configured using configuration properties like
  `dynamic-filtering.large-partitioned.max-distinct-values-per-driver`. ({issue}`5262`)

https://github.com/prestosql/presto/pull/5262

This is way too long @sopel39 .. please move this into the docs and then link to it
