Proxysql: Read/write split with dependency

Created on 23 Jul 2018 · 10Comments · Source: sysown/proxysql

The documentation describes at https://github.com/sysown/proxysql/wiki/Multiplexing that mulitplexing is disabled as soon as ProxySQL hits a query that might introduce a data or other dependency to any succeeding query. However, how about read/write split? If you have a query that might introduce a dependency, it should clearly be executed on a write node, and all queries possibly dependent on it should be executed on that exact node to ensure the data dependency will be satisfied (just as with multiplexing).

In particular, consider the default read/write split config at https://github.com/sysown/proxysql/wiki/ProxySQL-Read-Write-Split-(HOWTO)#basic-readwrite-split-using-regex

Then:

mysql> SET @sessionvariable='value';
Query OK, 0 rows affected (0.01 sec)

mysql> SELECT @sessionvariable;
+------------------+
| @sessionvariable |
+------------------+
| NULL             |
+------------------+
1 row in set (0.00 sec)

This is because the SET will be executed on the write node, while the SELECT will be executed on one of the read nodes.

In this case, the SET should pin to a write node and the succeeding SELECT should be executed on that exact write node.

Source

rattacresh

Most helpful comment

@renecannao
get it.
You are really great and always have patience to answer questions.
Thanks a lot.

malongshuai on 26 Jul 2018

❤3

All 10 comments

To answer your question "If you have a query that might introduce a dependency, it should clearly be executed on a write node", the answer is no.
If you configured ProxySQL to send a dependent query to a reader node, ProxySQL is configured incorrectly.

@rattacresh : please refer to http://www.proxysql.com/blog/configure-read-write-split for example on how to properly configure routing.
I am updating the wiki right now to highlight that the example listed is a POC example, not something to use in production. Sending all SELECT to reader is not correct.

renecannao on 25 Jul 2018

the answer is no

Well, shouldn't such behaviour be available at least as an option?

If you configured ProxySQL to send a dependent query to a reader node, ProxySQL is configured incorrectly.

I wonder how to ensure that a query is not dependent and has not become dependent? The exact same query might be executed within a transaction and outside of a transaction. In the first case, the query is dependent on the transaction, in the second case it is not. The common case is where queries are used first without transaction, later on the code is wrapped with transaction begin/end. This means the query is now a dependent one, but might be executed outside of the transaction if it was configured to be routed to a different node. This is not at all fool-proof. Shouldn't this situation at least be detected and an error be thrown if a ProxySQL has recognized a dependent query that would be routed to a different node?

I am updating the wiki right now to highlight that the example listed is a POC example, not something to use in production. Sending all SELECT to reader is not correct.

IMO, then the wiki shouldn't even have such an example, not even as a proof of concept, and should warn very very loudly about dependency issues and the implications.

Also the difference between query routing and multiplexing should be stressed because they are easily confused. People might think they are safe after reading the multiplexing behaviour, only to figure out later on that everything is broken because of query routing.

rattacresh on 25 Jul 2018

👍1

Well, shouldn't such behaviour be available at least as an option?

Something like: if multiplexing is disabled, disable also routing?
This has several flaws. Let me make an example:

the application run a query that disables multiplexing on a connection on the slave
from now on, all queries go to the slave
Doesn't work...

About transaction, please refer here: https://github.com/sysown/proxysql/wiki/Users-configuration#disabling-routing-across-hostgroups-once-a-transaction-has-started-for-a-specific-user

Re-opening this issue until the documentation is more clear about the difference between multiplexing and routing

renecannao on 25 Jul 2018

Something like: if multiplexing is disabled, disable also routing?

No -- use the same logic as with multiplexing, but before even doing the query routing. In case of a dependency, use a write node and pin to it.

About transaction, please refer here

Shouldn't the default be to have this enabled? As per principle of least astonishment.

Do you mind if I update the wiki a bit to make things less confusing? (And please add a license to the wiki pages -- "Wikis also give you the option of including a custom footer where you can list things like contact details or license information for your project." https://guides.github.com/features/wikis/)

rattacresh on 25 Jul 2018

a) this is what eventually stick_conn will do
b) it is enabled by default since version 1.4.0
c) absolutely, feel free to edit the wiki, thanks!
d) I will create an issue to follow up later and add a footer. Thank you for the hint.

renecannao on 25 Jul 2018

b) it is enabled by default since version 1.4.0

ok. So a workaround should be to ensure transaction_persistent=1 and wrap dependent queries in a transaction? (assumed one does not want to carefully split queries on an individual basis...)

rattacresh on 25 Jul 2018

I don't think "workaround" is the right word here, as a general term.
I think it is a matter of having rules configured the right way. Generic rules (all selects) are not ok, but rather you should configure what queries are ok to be sent to readers.
Depending from the application, it is often common that only few query type can be routed to a slave to drastically reduce the load on the master.

ensure transaction_persistent=1 and wrap dependent queries in a transaction?

This works for cases like the one you mentioned before: The common case is where queries are used first without transaction, later on the code is wrapped with transaction begin/end.
In this type of scenarios, I would assume the query is probably a simple one, and maybe should always be sent to the master.
Of course, every application is different, and my assumptions can be wrong.

renecannao on 25 Jul 2018

@renecannao
Hi rene
I have a confusion about transaction_persistent and multiplexing.
transaction_persistent ensures statements in a transaction will route to the same hostgroup A. If the hostgroup A is a multi-primary Galera or Group replication, how to ensure these statements send to the same node in hostgroup A. multiplexing guarantees it?
Thanks...

malongshuai on 26 Jul 2018

@malongshuai : a transaction automatically disables multiplexing until the transaction completes.
Therefore if you have transaction_persistent=1 and a transaction starts, it is guaranteed that all the statements belonging to that transaction will go into the same connection.

renecannao on 26 Jul 2018

@renecannao
get it.
You are really great and always have patience to answer questions.
Thanks a lot.

malongshuai on 26 Jul 2018

❤3

Was this page helpful?

0 / 5 - 0 ratings

Related issues

Large queries smaller than max_allowed_packet don't seem to work

geotro · 3Comments

Icinga2 -> ProxySQL -> MariaDB/Galera: MySQL_Session.cpp:3689 [ERROR] Unable to parse query

xenuser · 4Comments

AWS Aurora read_only status is only available via innodb_read_only

ottenhoff · 4Comments

Add support for SET CHARACTER SET

renecannao · 3Comments

PROXYSQL PAUSE and PROXYSQL RESUME

renecannao · 4Comments