TiDB: Support Concurrent DDL Execution

Created on 7 Jul 2020 · 6 Comments · Source: pingcap/tidb

Feature Request

Currently, TiDB has initial support for concurrent DDL: an "add index" DDL and other types of DDL can be executed concurrently when they are on different tables (related to #6955). However, when more services are connected to the same cluster, DDL statements of the same type issued by different services usually block one another.

In this case, the limited DDL capabilities prevent more services from using the TiDB database. TiDB can do more to extend its DDL capabilities, so this issue proposes supporting parallel DDL execution.

For example:
Teams A and B share the same TiDB cluster. Team A cannot run any add index operation while team B is executing the DDL alter table t1 add index idx_ab(a,b). This blocks team A's service upgrade, especially when table t1 is huge (e.g., more than 1B rows).

However, supporting parallel add index carries some risks. For example, running add index jobs in parallel increases the cluster load, which will impact cluster performance.
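To make the scenario above concrete, here is a minimal sketch of one possible scheduling rule: jobs on the same table run one after another in submission order, while jobs on different tables run concurrently. The `DDLJob` type and the `runPerTable` function are hypothetical names used only for illustration; they are not TiDB's actual job model or DDL worker code, and the issue does not specify how the feature would be implemented.

```go
package main

import (
	"fmt"
	"sync"
)

// DDLJob is a hypothetical, simplified job record (not TiDB's real job type).
type DDLJob struct {
	ID    int
	Table string
	Query string
}

// runPerTable groups jobs by table, then starts one goroutine per table.
// Jobs on the same table run serially in submission order; jobs on
// different tables run concurrently. It returns once every job has run.
func runPerTable(jobs []DDLJob, run func(DDLJob)) {
	queues := make(map[string][]DDLJob)
	for _, j := range jobs {
		queues[j.Table] = append(queues[j.Table], j)
	}

	var wg sync.WaitGroup
	for _, q := range queues {
		wg.Add(1)
		go func(q []DDLJob) {
			defer wg.Done()
			for _, j := range q {
				run(j)
			}
		}(q)
	}
	wg.Wait()
}

func main() {
	jobs := []DDLJob{
		{ID: 1, Table: "t1", Query: "alter table t1 add index idx_ab(a,b)"},
		{ID: 2, Table: "t2", Query: "alter table t2 add index idx_c(c)"},
		{ID: 3, Table: "t1", Query: "alter table t1 add index idx_d(d)"},
	}

	var mu sync.Mutex // serializes printing only
	runPerTable(jobs, func(j DDLJob) {
		mu.Lock()
		defer mu.Unlock()
		fmt.Printf("job %d on %s finished: %s\n", j.ID, j.Table, j.Query)
	})
}
```

In this sketch team A's job on t2 no longer waits for team B's job on t1, but nothing bounds the total number of concurrent jobs, which is exactly the load risk mentioned above.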

Category

Feature/Improvement

Value

3: Make TiDB support more scenarios.

TODO list

  • [ ] Add a proposal.
  • [ ] #18425 Modify the storage format of the current DDL job.
  • [ ] Remove the old storage format and update related executions (such as admin show ddl jobs/queries).
Priority/P1 feature/accepted sig/infra type/feature-request

All 6 comments

@lonng Would you mind describing in more detail why parallel DDL execution is necessary? For example, when an add index DDL job on a huge table is already running and I want to add an index to a small table, I must wait for the previous add index job to finish, which is not acceptable.

Would it be more correct to call this "concurrent ddl"? This distinguishes it from https://github.com/pingcap/tidb/issues/19386 which is to implement parallelism across the cluster.

done

Hi @nullnotnil, this feature is planned to be finished in 2020 Q4.

I think we also need a parallelism-level control and a rate-limiting policy to prevent DDL jobs from consuming all disk I/O. There should also be a way to reduce the parallelism level online.

For online reconfiguration, may I suggest using SET GLOBAL sys_var = newvalue?

In https://github.com/pingcap/tidb/pull/21424 I am working on a way for changes to sessionvars to execute arbitrary code. It is a little more complicated for globalvars because the semantics need to be worked out for the local instance plus the other TiDB instances.
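As a rough illustration of the parallelism-level control discussed in the comments above, the sketch below shows a concurrency limiter whose cap can be changed while jobs are running. The `Limiter` type and its methods are hypothetical and not part of TiDB; the idea that a handler for `SET GLOBAL` on some system variable would call `SetLimit` is an assumption, and the sketch covers only the concurrency level, not disk I/O rate limiting or propagation to other TiDB instances.

```go
package main

import (
	"fmt"
	"sync"
)

// Limiter caps how many DDL jobs run at once. The cap can be changed online.
// Hypothetical helper for illustration; not a TiDB API.
type Limiter struct {
	mu      sync.Mutex
	cond    *sync.Cond
	limit   int
	running int
}

func NewLimiter(limit int) *Limiter {
	l := &Limiter{limit: limit}
	l.cond = sync.NewCond(&l.mu)
	return l
}

// Acquire blocks until a slot is free under the current limit.
func (l *Limiter) Acquire() {
	l.mu.Lock()
	defer l.mu.Unlock()
	for l.running >= l.limit {
		l.cond.Wait()
	}
	l.running++
}

// TryAcquire takes a slot if one is free and reports whether it did.
func (l *Limiter) TryAcquire() bool {
	l.mu.Lock()
	defer l.mu.Unlock()
	if l.running >= l.limit {
		return false
	}
	l.running++
	return true
}

// Release frees a slot and wakes any waiting jobs.
func (l *Limiter) Release() {
	l.mu.Lock()
	l.running--
	l.mu.Unlock()
	l.cond.Broadcast()
}

// SetLimit changes the cap. Lowering it does not interrupt running jobs;
// new jobs simply wait until the running count drops below the new cap.
func (l *Limiter) SetLimit(n int) {
	l.mu.Lock()
	l.limit = n
	l.mu.Unlock()
	l.cond.Broadcast()
}

func main() {
	l := NewLimiter(2)
	var wg sync.WaitGroup
	for i := 1; i <= 4; i++ {
		wg.Add(1)
		go func(id int) {
			defer wg.Done()
			l.Acquire() // at most 2 jobs hold a slot at any time
			defer l.Release()
			fmt.Printf("add index job %d running\n", id)
		}(i)
	}
	wg.Wait()

	// Assumption: a SET GLOBAL handler could call this to lower the cap.
	l.SetLimit(1)
	fmt.Println(l.TryAcquire(), l.TryAcquire())
}
```

Lowering the cap here is deliberately non-preemptive: already running jobs finish, and the new level takes effect as slots are released. Whether running add index jobs should instead be paused is a design question the thread leaves open.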