Yugabyte-db: [YSQL] Do read restarts automatically on the server side for COUNT(*) type of queries

Created on 4 Sep 2019  路  8Comments  路  Source: yugabyte/yugabyte-db

This is a subset of #1237. Aggregation-style queries only need to send a small amount of information to the client, and they have to scan a lot of data before they do it, potentially blocking for a significant amount of time. If a read restart error happens during that scanning, it should be handled on the server side and not propagated to the client.

areysql priorithigh

All 8 comments

Here is where we are currently preventing an automatic server-side read restart:

[ts-1] I0905 14:12:34.816833 281662912 pg_txn_manager.cc:186] DEBUG mbautin: PreventRestart:
[ts-1]     @        0x10cdede56  yb::pggate::PgDocOp::GetResult()
[ts-1]     @        0x10cddf93b  yb::pggate::PgDml::Fetch()
[ts-1]     @        0x10cdcb628  yb::pggate::PgApiImpl::DmlFetch()
[ts-1]     @        0x10cdc4b98  YBCPgDmlFetch
[ts-1]     @        0x10c70da55  ybcIterateForeignScan
[ts-1]     @        0x10c703add  ForeignNext
[ts-1]     @        0x10c6e8be3  fetch_input_tuple
[ts-1]     @        0x10c6e7b37  ExecAgg
[ts-1]     @        0x10c6d5ffe  ExecutePlan
[ts-1]     @        0x10c6d5ee6  standard_ExecutorRun
[ts-1]     @        0x10d36d50f  ybpgm_ExecutorRun
[ts-1]     @        0x10c848004  PortalRunSelect
[ts-1]     @        0x10c847b7f  PortalRun
[ts-1]     @        0x10c846a49  exec_execute_message
[ts-1]     @        0x10c843b7f  PostgresMain
[ts-1]     @        0x10c7bbd4f  BackendRun
[ts-1]     @        0x10c7bb309  ServerLoop
[ts-1]     @        0x10c7b8051  PostmasterMain
[ts-1]     @        0x10c7243bf  PostgresServerProcessMain
[ts-1]     @        0x10c724602  main
[ts-1]     @     0x7fff7e12e3d5  start
[ts-1]     @               0x13  (unknown)

Current implementation logic is that once any data has been fetched, restarts are no longer possible.
Under RF=1, simple SELECT COUNT(*) FROM t form a 1000-records tables actually results in 8 fetch operations under the hood, and only the failure in the first one would result in restart.
Here are ShortDebugString() dump of those operations:

tablet_id: "078eea4253004735850eaad9990abd3c" include_trace: false consistency_level: STRONG propagated_hybrid_time: 6432936976583098368 read_time { read_ht: 6432936976581885952 local_limit_ht: 6432936976786685952 global_limit_ht: 6432936976786685952 in_txn_limit_ht: 6432936976582737920 } pgsql_batch { client: YQL_CLIENT_PGSQL schema_version: 0 targets { tscall { opcode: 4 operands { } } } column_refs { } is_aggregate: true limit: 1024 return_paging_state: true ysql_catalog_version: 144 table_id: "00004200000030008000000000004227" } proxy_uuid: "" memory_limit_score: 0
tablet_id: "7d5aedff2af444e1aee581aa4192ce67" include_trace: false consistency_level: STRONG propagated_hybrid_time: 6432936985809231872 read_time { read_ht: 6432936976581885952 local_limit_ht: 6432936976786685952 global_limit_ht: 6432936976786685952 in_txn_limit_ht: 6432936976582737920 } pgsql_batch { client: YQL_CLIENT_PGSQL schema_version: 0 hash_code: 8191 targets { tscall { opcode: 4 operands { } } } column_refs { } is_aggregate: true limit: 1024 paging_state { next_partition_key: "\037\377" total_num_rows_read: 1 } return_paging_state: true table_id: "00004200000030008000000000004227" } proxy_uuid: "" memory_limit_score: 0
tablet_id: "7e2924f3c84a438488d5a78ba021b915" include_trace: false consistency_level: STRONG propagated_hybrid_time: 6432937015808643072 read_time { read_ht: 6432936976581885952 local_limit_ht: 6432936976786685952 global_limit_ht: 6432936976786685952 in_txn_limit_ht: 6432936976582737920 } pgsql_batch { client: YQL_CLIENT_PGSQL schema_version: 0 hash_code: 16382 targets { tscall { opcode: 4 operands { } } } column_refs { } is_aggregate: true limit: 1024 paging_state { next_partition_key: "?\376" total_num_rows_read: 2 } return_paging_state: true table_id: "00004200000030008000000000004227" } proxy_uuid: "" memory_limit_score: 0
tablet_id: "baf2dbb0f69a49ee9f6562861455c1a9" include_trace: false consistency_level: STRONG propagated_hybrid_time: 6432937044472725504 read_time { read_ht: 6432936976581885952 local_limit_ht: 6432936976786685952 global_limit_ht: 6432936976786685952 in_txn_limit_ht: 6432936976582737920 } pgsql_batch { client: YQL_CLIENT_PGSQL schema_version: 0 hash_code: 24573 targets { tscall { opcode: 4 operands { } } } column_refs { } is_aggregate: true limit: 1024 paging_state { next_partition_key: "_\375" total_num_rows_read: 3 } return_paging_state: true table_id: "00004200000030008000000000004227" } proxy_uuid: "" memory_limit_score: 0
tablet_id: "61d1f3f0ae4f412989d10868b14a4b6d" include_trace: false consistency_level: STRONG propagated_hybrid_time: 6432937046632075264 read_time { read_ht: 6432936976581885952 local_limit_ht: 6432936976786685952 global_limit_ht: 6432936976786685952 in_txn_limit_ht: 6432936976582737920 } pgsql_batch { client: YQL_CLIENT_PGSQL schema_version: 0 hash_code: 32764 targets { tscall { opcode: 4 operands { } } } column_refs { } is_aggregate: true limit: 1024 paging_state { next_partition_key: "\177\374" total_num_rows_read: 4 } return_paging_state: true table_id: "00004200000030008000000000004227" } proxy_uuid: "" memory_limit_score: 0
tablet_id: "f26752e7e8dc43b8bcb741b5a519ff68" include_trace: false consistency_level: STRONG propagated_hybrid_time: 6432937052964438016 read_time { read_ht: 6432936976581885952 local_limit_ht: 6432936976786685952 global_limit_ht: 6432936976786685952 in_txn_limit_ht: 6432936976582737920 } pgsql_batch { client: YQL_CLIENT_PGSQL schema_version: 0 hash_code: 40955 targets { tscall { opcode: 4 operands { } } } column_refs { } is_aggregate: true limit: 1024 paging_state { next_partition_key: "\237\373" total_num_rows_read: 5 } return_paging_state: true table_id: "00004200000030008000000000004227" } proxy_uuid: "" memory_limit_score: 0
tablet_id: "481d93a4d38148529d2eb29eec8dda74" include_trace: false consistency_level: STRONG propagated_hybrid_time: 6432937088147111936 read_time { read_ht: 6432936976581885952 local_limit_ht: 6432936976786685952 global_limit_ht: 6432936976786685952 in_txn_limit_ht: 6432936976582737920 } pgsql_batch { client: YQL_CLIENT_PGSQL schema_version: 0 hash_code: 49146 targets { tscall { opcode: 4 operands { } } } column_refs { } is_aggregate: true limit: 1024 paging_state { next_partition_key: "\277\372" total_num_rows_read: 6 } return_paging_state: true table_id: "00004200000030008000000000004227" } proxy_uuid: "" memory_limit_score: 0
tablet_id: "f1193d8027294b39974f9c5da4eff2a5" include_trace: false consistency_level: STRONG propagated_hybrid_time: 6432937125778800640 read_time { read_ht: 6432936976581885952 local_limit_ht: 6432936976786685952 global_limit_ht: 6432936976786685952 in_txn_limit_ht: 6432936976582737920 } pgsql_batch { client: YQL_CLIENT_PGSQL schema_version: 0 hash_code: 57337 targets { tscall { opcode: 4 operands { } } } column_refs { } is_aggregate: true limit: 1024 paging_state { next_partition_key: "\337\371" total_num_rows_read: 7 } return_paging_state: true table_id: "00004200000030008000000000004227" } proxy_uuid: "" memory_limit_score: 0

Changing this requires further discussion.

Hi @kmuthukk , quoting your comment in #2500:

yes, this is an important issue we are working on right now to unblock soon (to automatically handle read restarts if there are concurrent writes happening to same rows).

See #2216

Closing this one as a duplicate of #2216.

Since #2500 was marked a duplicate of #2216, I'm asking my question here.
Has the "automatically handling of read restarts if there are concurrent writes happening to same rows" already been implemented? What's the current plan for fixing that issue? Is it part of #2634 ?

Hey @lhotari , I've been actively working on this for some time now and while it's hard to give exact estimates, it should be ready soon.
As for #2634 , it is a follow-up issue that will be implemented separately.

@lhotari Until this issue is fixed, we have a workaround that you can use: You can avoid this read restart error by enclosing your select statement in a read only deferrable serializable transaction.
For example:

BEGIN TRANSACTION ISOLATION LEVEL SERIALIZABLE, READ ONLY, DEFERRABLE;
SELECT * FROM foo;
COMMIT;

This will do the select without hitting the read restart error. Let me know if this works for you. Note that this is available in the 2.0.3 release.

Hi @ndeodhar,
you mentioned this workaround:

Until this issue is fixed, we have a workaround that you can use: You can avoid this read restart error by enclosing your select statement in a read only deferrable serializable transaction.

Unfortunately, this type of workaround isn't applicable in our use case. It would also cause some consistency issues for the application we are building.

btw. We are using Spring Boot 2.2 & "fully reactive" Spring stack together with Spring Data R2DBC 1.0.0.RC1 + R2DBC 0.8.0.RC2 + R2DBC Postgres driver for reactive data access. YugaByte version is 2.0.3 (2.0.3.0-b7). In Spring we use the Transactional annotations to define the transaction attributes and transaction boundaries. This is supported in Spring Data R2DBC for reactive data access recently (details).

Do you need help in reproducing the issues?

Thanks for the information @lhotari ! Some details on how to reproduce the issue will be good - that way, we can also test our fix to ensure that it works for your use case.

Fixed in 3212616
Note that the following limitations apply though:

  • Only restarts aggregate and single-row queries
  • Can't restart parameterized prepared statements (e.g. SELECT COUNT(*) FROM t WHERE id > ? with ? being the bind variable)

However, both of these will very soon be addressed in #1237

Was this page helpful?
0 / 5 - 0 ratings

Related issues

hudclark picture hudclark  路  4Comments

fabiocmazzo picture fabiocmazzo  路  5Comments

nocaway picture nocaway  路  3Comments

bmatican picture bmatican  路  3Comments

frafra picture frafra  路  3Comments