Proxysql: ProxySQL reset connection if it starts with COM_QUIT

Created on 13 Mar 2021  Β·  7Comments  Β·  Source: sysown/proxysql

  • [x] ProxySQL version: 2.0.15
  • [x] OS version: Official Docker image: proxysql/proxysql:2.0.15

Description

I'm trying make zero-downtime migration from MASTER (MariaDB) to BALANCER (ProxySQL)
Now i have installed haproxy on my machines and all local applications use that address.
Now HAProxy balance connection between two MASTER IPs (main and reserved).

I have read-only replica (MariaDB) and want split traffic. For that i was add balancer (ProxySQL) address to HAProxy config.
ProxySQL address tagged as unhealthy and traffic not routed to him.

I'm investigate that issue and here is results.

Issue

HAProxy check mysql health status by sending COM_QUIT to the server and wait OK and close connection (see here)
This behaviour described in MySQL protocol official documentation

MariaDB flow (got OK, all fine)

ΠΈΠ·ΠΎΠ±Ρ€Π°ΠΆΠ΅Π½ΠΈΠ΅

ProxySQL flow (no OK response, TCP RST only):

ΠΈΠ·ΠΎΠ±Ρ€Π°ΠΆΠ΅Π½ΠΈΠ΅

Error log

ProxySQL print this errors for each HAProxy connection attempt:

2021-03-13 17:42:58 mysql_data_stream.cpp:411:check_data_flow(): [ERROR] Session=0x7f53a545b800, DataStream=0x7f53a5448c00 -- Data at both ends of a MySQL data stream: IN <1 bytes 0 packets> , OUT <1 bytes 0 packets>
2021-03-13 17:42:58 MySQL_Thread.cpp:4439:process_all_sessions(): [WARNING] Closing unhealthy client connection 94.19x.xx.xx:51320

Most helpful comment

Hi, after further inspection we have diagnosed which one was the underlying issue to this.

Looking at the connections that are being terminated by ProxySQL, we can see the exact conditions in which each of those connections are terminated.

(gdb) bt
#0  MySQL_Data_Stream::check_data_flow (this=0x7ffff2645780) at mysql_data_stream.cpp:443
#1  0x0000555555a4dfdb in MySQL_Thread::process_data_on_data_stream (this=0x7ffff2600000, myds=0x7ffff2645780, n=2) at MySQL_Thread.cpp:4690
#2  0x0000555555a4cf46 in MySQL_Thread::run (this=0x7ffff2600000) at MySQL_Thread.cpp:4403
#3  0x000055555598885b in mysql_worker_thread_func (arg=0x7ffff70d1eb0) at main.cpp:776
#4  0x00007ffff7d89299 in start_thread () from /usr/lib/libpthread.so.0
#5  0x00007ffff7948053 in clone () from /usr/lib/libc.so.6
(gdb) x/50hb PSarrayIN->pdata->ptr
0x7ffff260b400: 0x2e    0x00    0x00    0x01    0x00    0x82    0x00    0x00
0x7ffff260b408: 0x00    0x80    0x00    0x01    0x21    0x00    0x00    0x00
0x7ffff260b410: 0x00    0x00    0x00    0x00    0x00    0x00    0x00    0x00
0x7ffff260b418: 0x00    0x00    0x00    0x00    0x00    0x00    0x00    0x00
0x7ffff260b420: 0x00    0x00    0x00    0x00    0x68    0x61    0x70    0x72
0x7ffff260b428: 0x6f    0x78    0x79    0x5f    0x75    0x73    0x65    0x72
0x7ffff260b430: 0x00    0x00
(gdb) x/50hb PSarrayOUT->pdata->ptr
0x7ffff261f0f0: 0x4a    0x00    0x00    0x00    0x0a    0x35    0x2e    0x35
0x7ffff261f0f8: 0x2e    0x33    0x30    0x00    0x27    0x00    0x00    0x00
0x7ffff261f100: 0x4b    0x7e    0x66    0x71    0x56    0x68    0x3f    0x7f
0x7ffff261f108: 0x00    0x2f    0xf2    0x21    0x02    0x00    0x8f    0x81
0x7ffff261f110: 0x15    0x00    0x00    0x00    0x00    0x00    0x00    0x00
0x7ffff261f118: 0x00    0x00    0x00    0x1f    0x2d    0x12    0x47    0x67
0x7ffff261f120: 0x0c    0x40

ProxySQL was going to reply with a 'greeting response' packet when the connection is terminated, at this point ProxySQL is handling the connection just fine. But the state of the dataflow from the datastream is unexpected, since there is data in both ends of it, situation that should never happens, so, why it's taking place here? Theory of the issue was the following:

  1. New client connects, ProxySQL sends the Greeting Message.
  2. As soon as the connection is stablished, HAProxy doesn't wait at all for receiving the greeting message, but it sends the 'AuthRequest + COM_QUIT'.
  3. Now HAProxy starts listening for ProxySQL response (as it expects the OK packet for the COM_QUIT), this is the step that could lead to ProxySQL closing the connection, because if HAProxy hasn't started listening and reading, thus ProxySQL could find itself in the situation of Data at both ends of a MySQL data stream: 'IN <1 bytes 0 packets> , OUT <1 bytes 0 packets>' . This explains why this situation only takes places sometimes, and some other connections succeed just fine.
  4. If the connection hasn't been killed, and ProxySQL didn't fount himself in the 'data at both ends' situation, connection ends just fine, and HAProxy is happy with it.

This explanation is also backed up by the following experimental data. Fist this traffic capture from the unpatched version of ProxySQL:

133 1.811653340 127.0.0.1   127.0.0.1   TCP 121 50950 β†’ 6033 [PSH, ACK] Seq=1 Ack=1 Win=65536 Len=55 TSval=1476503529 TSecr=1476503529
361 3.812948008 127.0.0.1   127.0.0.1   TCP 121 50972 β†’ 6033 [PSH, ACK] Seq=1 Ack=1 Win=65536 Len=55 TSval=1476505531 TSecr=1476505531
363 3.813059052 127.0.0.1   127.0.0.1   TCP 144 6033 β†’ 50972 [PSH, ACK] Seq=1 Ack=56 Win=65536 Len=78 TSval=1476505531 TSecr=1476505531
365 3.813116077 127.0.0.1   127.0.0.1   TCP 77  6033 β†’ 50972 [PSH, ACK] Seq=79 Ack=56 Win=65536 Len=11 TSval=1476505531 TSecr=1476505531
482 5.813856360 127.0.0.1   127.0.0.1   TCP 121 50984 β†’ 6033 [PSH, ACK] Seq=1 Ack=1 Win=65536 Len=55 TSval=1476507531 TSecr=1476507531
604 7.815441376 127.0.0.1   127.0.0.1   TCP 121 50996 β†’ 6033 [PSH, ACK] Seq=1 Ack=1 Win=65536 Len=55 TSval=1476509533 TSecr=1476509533
606 7.815527353 127.0.0.1   127.0.0.1   TCP 144 6033 β†’ 50996 [PSH, ACK] Seq=1 Ack=56 Win=65536 Len=78 TSval=1476509533 TSecr=1476509533
608 7.815582674 127.0.0.1   127.0.0.1   TCP 77  6033 β†’ 50996 [PSH, ACK] Seq=79 Ack=56 Win=65536 Len=11 TSval=1476509533 TSecr=1476509533
791 9.816322349 127.0.0.1   127.0.0.1   TCP 121 51014 β†’ 6033 [PSH, ACK] Seq=1 Ack=1 Win=65536 Len=55 TSval=1476511534 TSecr=1476511534
951 11.817193725    127.0.0.1   127.0.0.1   TCP 121 51030 β†’ 6033 [PSH, ACK] Seq=1 Ack=1 Win=65536 Len=55 TSval=1476513535 TSecr=1476513535
1067    13.817351113    127.0.0.1   127.0.0.1   TCP 121 51044 β†’ 6033 [PSH, ACK] Seq=1 Ack=1 Win=65536 Len=55 TSval=1476515535 TSecr=1476515535
1069    13.817453891    127.0.0.1   127.0.0.1   TCP 144 6033 β†’ 51044 [PSH, ACK] Seq=1 Ack=56 Win=65536 Len=78 TSval=1476515535 TSecr=1476515535

There is repeated pattern, we can see there are messages of size 121 sent from the client to ProxySQL these are the auth requests by HAProxy:

Frame 133: 121 bytes on wire (968 bits), 121 bytes captured (968 bits) on interface lo, id 0
Ethernet II, Src: 00:00:00_00:00:00 (00:00:00:00:00:00), Dst: 00:00:00_00:00:00 (00:00:00:00:00:00)
Internet Protocol Version 4, Src: 127.0.0.1, Dst: 127.0.0.1
Transmission Control Protocol, Src Port: 50950, Dst Port: 6033, Seq: 1, Ack: 1, Len: 55
Data (55 bytes)
    Data: 2e0000010082000000800001210000000000000000000000000000000000000000000000…
    [Length: 55]
0000   00 00 00 00 00 00 00 00 00 00 00 00 08 00 45 00   ..............E.
0010   00 6b 99 60 40 00 40 06 a3 2a 7f 00 00 01 7f 00   .k.`@.@..*......
0020   00 01 c7 06 17 91 3d ec d2 a6 8a b3 ff a2 80 18   ......=.........
0030   02 00 fe 5f 00 00 01 01 08 0a 58 01 a7 e9 58 01   ..._......X...X.
0040   a7 e9 2e 00 00 01 00 82 00 00 00 80 00 01 21 00   ..............!.
0050   00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00   ................
0060   00 00 00 00 00 00 68 61 70 72 6f 78 79 5f 75 73   ......haproxy_us
0070   65 72 00 00 01 00 00 00 01                        er.......

So, the only reason there are more than 1 consecutive of these packets, is because of the connections that ProxySQL is closing, because they hit the timing described previously, and HAProxy is retrying, in fact, if more of 4 of these packets are founds, HAProxy will consider the server to be down. If this is correct, to prevent this situation, the simpler solution would be to enforce ProxySQL to always sent the 'initial handshake packet' as soon as it's generated, this way we will be preventing ProxySQL finding data at both ends of the data stream. After a minimal patch:

diff --git a/lib/MySQL_Thread.cpp b/lib/MySQL_Thread.cpp
index b37ec3ab..4bb97ef2 100644
--- a/lib/MySQL_Thread.cpp
+++ b/lib/MySQL_Thread.cpp
@@ -5245,6 +5245,12 @@ void MySQL_Thread::listener_handle_new_connection(MySQL_Data_Stream *myds, unsig
                ioctl_FIONBIO(sess->client_myds->fd, 1);
                mypolls.add(POLLIN|POLLOUT, sess->client_myds->fd, sess->client_myds, curtime);
                proxy_debug(PROXY_DEBUG_NET,1,"Session=%p -- Adding client FD %d\n", sess, sess->client_myds->fd);
+
+               sess->writeout();
        } else {
                free(addr);
                // if we arrive here, accept() failed

We can observe how the traffic being capture has changed:

1300    17.642014001    127.0.0.1   127.0.0.1   TCP 121 46996 β†’ 6033 [PSH, ACK] Seq=1 Ack=1 Win=65536 Len=55 TSval=1508672998 TSecr=1508672997
1302    17.642130120    127.0.0.1   127.0.0.1   TCP 144 6033 β†’ 46996 [PSH, ACK] Seq=1 Ack=56 Win=65536 Len=78 TSval=1508672998 TSecr=1508672998
1304    17.642179639    127.0.0.1   127.0.0.1   TCP 77  6033 β†’ 46996 [PSH, ACK] Seq=79 Ack=56 Win=65536 Len=11 TSval=1508672998 TSecr=1508672998
1532    19.643891974    127.0.0.1   127.0.0.1   TCP 121 47018 β†’ 6033 [PSH, ACK] Seq=1 Ack=1 Win=65536 Len=55 TSval=1508674999 TSecr=1508674999
1534    19.643987351    127.0.0.1   127.0.0.1   TCP 144 6033 β†’ 47018 [PSH, ACK] Seq=1 Ack=56 Win=65536 Len=78 TSval=1508674999 TSecr=1508674999
1536    19.644050236    127.0.0.1   127.0.0.1   TCP 77  6033 β†’ 47018 [PSH, ACK] Seq=79 Ack=56 Win=65536 Len=11 TSval=1508675000 TSecr=1508675000
1653    21.644972886    127.0.0.1   127.0.0.1   TCP 121 47030 β†’ 6033 [PSH, ACK] Seq=1 Ack=1 Win=65536 Len=55 TSval=1508677000 TSecr=1508677000
1655    21.645084782    127.0.0.1   127.0.0.1   TCP 144 6033 β†’ 47030 [PSH, ACK] Seq=1 Ack=56 Win=65536 Len=78 TSval=1508677001 TSecr=1508677000
1657    21.645138118    127.0.0.1   127.0.0.1   TCP 77  6033 β†’ 47030 [PSH, ACK] Seq=79 Ack=56 Win=65536 Len=11 TSval=1508677001 TSecr=1508677001
1774    23.645653165    127.0.0.1   127.0.0.1   TCP 121 47044 β†’ 6033 [PSH, ACK] Seq=1 Ack=1 Win=65536 Len=55 TSval=1508679001 TSecr=1508679001
1776    23.645736253    127.0.0.1   127.0.0.1   TCP 144 6033 β†’ 47044 [PSH, ACK] Seq=1 Ack=56 Win=65536 Len=78 TSval=1508679001 TSecr=1508679001
1778    23.645789782    127.0.0.1   127.0.0.1   TCP 77  6033 β†’ 47044 [PSH, ACK] Seq=79 Ack=56 Win=65536 Len=11 TSval=1508679001 TSecr=1508679001
2006    25.646402390    127.0.0.1   127.0.0.1   TCP 121 47066 β†’ 6033 [PSH, ACK] Seq=1 Ack=1 Win=65536 Len=55 TSval=1508681002 TSecr=1508681002
2008    25.646481857    127.0.0.1   127.0.0.1   TCP 144 6033 β†’ 47066 [PSH, ACK] Seq=1 Ack=56 Win=65536 Len=78 TSval=1508681002 TSecr=1508681002
2010    25.646533666    127.0.0.1   127.0.0.1   TCP 77  6033 β†’ 47066 [PSH, ACK] Seq=79 Ack=56 Win=65536 Len=11 TSval=1508681002 TSecr=1508681002

Now the pattern is exactly the same as connecting directly to MySQL, no further messages are displayed in HAProxy, and no messages are displayed in ProxySQL informing about the termination of a connection.

All 7 comments

Hi @spumer .
Thank you for the report.
What HAProxy does is, imho, a bit bizarre.
It seems an abuse of the protocol: sending a COM_QUIT when the connection handshake isn't even started:
https://dev.mysql.com/doc/internals/en/connection-phase.html

We can evaluate how easy or complicated is to implement this.

Hi, after further inspection we have diagnosed which one was the underlying issue to this.

Looking at the connections that are being terminated by ProxySQL, we can see the exact conditions in which each of those connections are terminated.

(gdb) bt
#0  MySQL_Data_Stream::check_data_flow (this=0x7ffff2645780) at mysql_data_stream.cpp:443
#1  0x0000555555a4dfdb in MySQL_Thread::process_data_on_data_stream (this=0x7ffff2600000, myds=0x7ffff2645780, n=2) at MySQL_Thread.cpp:4690
#2  0x0000555555a4cf46 in MySQL_Thread::run (this=0x7ffff2600000) at MySQL_Thread.cpp:4403
#3  0x000055555598885b in mysql_worker_thread_func (arg=0x7ffff70d1eb0) at main.cpp:776
#4  0x00007ffff7d89299 in start_thread () from /usr/lib/libpthread.so.0
#5  0x00007ffff7948053 in clone () from /usr/lib/libc.so.6
(gdb) x/50hb PSarrayIN->pdata->ptr
0x7ffff260b400: 0x2e    0x00    0x00    0x01    0x00    0x82    0x00    0x00
0x7ffff260b408: 0x00    0x80    0x00    0x01    0x21    0x00    0x00    0x00
0x7ffff260b410: 0x00    0x00    0x00    0x00    0x00    0x00    0x00    0x00
0x7ffff260b418: 0x00    0x00    0x00    0x00    0x00    0x00    0x00    0x00
0x7ffff260b420: 0x00    0x00    0x00    0x00    0x68    0x61    0x70    0x72
0x7ffff260b428: 0x6f    0x78    0x79    0x5f    0x75    0x73    0x65    0x72
0x7ffff260b430: 0x00    0x00
(gdb) x/50hb PSarrayOUT->pdata->ptr
0x7ffff261f0f0: 0x4a    0x00    0x00    0x00    0x0a    0x35    0x2e    0x35
0x7ffff261f0f8: 0x2e    0x33    0x30    0x00    0x27    0x00    0x00    0x00
0x7ffff261f100: 0x4b    0x7e    0x66    0x71    0x56    0x68    0x3f    0x7f
0x7ffff261f108: 0x00    0x2f    0xf2    0x21    0x02    0x00    0x8f    0x81
0x7ffff261f110: 0x15    0x00    0x00    0x00    0x00    0x00    0x00    0x00
0x7ffff261f118: 0x00    0x00    0x00    0x1f    0x2d    0x12    0x47    0x67
0x7ffff261f120: 0x0c    0x40

ProxySQL was going to reply with a 'greeting response' packet when the connection is terminated, at this point ProxySQL is handling the connection just fine. But the state of the dataflow from the datastream is unexpected, since there is data in both ends of it, situation that should never happens, so, why it's taking place here? Theory of the issue was the following:

  1. New client connects, ProxySQL sends the Greeting Message.
  2. As soon as the connection is stablished, HAProxy doesn't wait at all for receiving the greeting message, but it sends the 'AuthRequest + COM_QUIT'.
  3. Now HAProxy starts listening for ProxySQL response (as it expects the OK packet for the COM_QUIT), this is the step that could lead to ProxySQL closing the connection, because if HAProxy hasn't started listening and reading, thus ProxySQL could find itself in the situation of Data at both ends of a MySQL data stream: 'IN <1 bytes 0 packets> , OUT <1 bytes 0 packets>' . This explains why this situation only takes places sometimes, and some other connections succeed just fine.
  4. If the connection hasn't been killed, and ProxySQL didn't fount himself in the 'data at both ends' situation, connection ends just fine, and HAProxy is happy with it.

This explanation is also backed up by the following experimental data. Fist this traffic capture from the unpatched version of ProxySQL:

133 1.811653340 127.0.0.1   127.0.0.1   TCP 121 50950 β†’ 6033 [PSH, ACK] Seq=1 Ack=1 Win=65536 Len=55 TSval=1476503529 TSecr=1476503529
361 3.812948008 127.0.0.1   127.0.0.1   TCP 121 50972 β†’ 6033 [PSH, ACK] Seq=1 Ack=1 Win=65536 Len=55 TSval=1476505531 TSecr=1476505531
363 3.813059052 127.0.0.1   127.0.0.1   TCP 144 6033 β†’ 50972 [PSH, ACK] Seq=1 Ack=56 Win=65536 Len=78 TSval=1476505531 TSecr=1476505531
365 3.813116077 127.0.0.1   127.0.0.1   TCP 77  6033 β†’ 50972 [PSH, ACK] Seq=79 Ack=56 Win=65536 Len=11 TSval=1476505531 TSecr=1476505531
482 5.813856360 127.0.0.1   127.0.0.1   TCP 121 50984 β†’ 6033 [PSH, ACK] Seq=1 Ack=1 Win=65536 Len=55 TSval=1476507531 TSecr=1476507531
604 7.815441376 127.0.0.1   127.0.0.1   TCP 121 50996 β†’ 6033 [PSH, ACK] Seq=1 Ack=1 Win=65536 Len=55 TSval=1476509533 TSecr=1476509533
606 7.815527353 127.0.0.1   127.0.0.1   TCP 144 6033 β†’ 50996 [PSH, ACK] Seq=1 Ack=56 Win=65536 Len=78 TSval=1476509533 TSecr=1476509533
608 7.815582674 127.0.0.1   127.0.0.1   TCP 77  6033 β†’ 50996 [PSH, ACK] Seq=79 Ack=56 Win=65536 Len=11 TSval=1476509533 TSecr=1476509533
791 9.816322349 127.0.0.1   127.0.0.1   TCP 121 51014 β†’ 6033 [PSH, ACK] Seq=1 Ack=1 Win=65536 Len=55 TSval=1476511534 TSecr=1476511534
951 11.817193725    127.0.0.1   127.0.0.1   TCP 121 51030 β†’ 6033 [PSH, ACK] Seq=1 Ack=1 Win=65536 Len=55 TSval=1476513535 TSecr=1476513535
1067    13.817351113    127.0.0.1   127.0.0.1   TCP 121 51044 β†’ 6033 [PSH, ACK] Seq=1 Ack=1 Win=65536 Len=55 TSval=1476515535 TSecr=1476515535
1069    13.817453891    127.0.0.1   127.0.0.1   TCP 144 6033 β†’ 51044 [PSH, ACK] Seq=1 Ack=56 Win=65536 Len=78 TSval=1476515535 TSecr=1476515535

There is repeated pattern, we can see there are messages of size 121 sent from the client to ProxySQL these are the auth requests by HAProxy:

Frame 133: 121 bytes on wire (968 bits), 121 bytes captured (968 bits) on interface lo, id 0
Ethernet II, Src: 00:00:00_00:00:00 (00:00:00:00:00:00), Dst: 00:00:00_00:00:00 (00:00:00:00:00:00)
Internet Protocol Version 4, Src: 127.0.0.1, Dst: 127.0.0.1
Transmission Control Protocol, Src Port: 50950, Dst Port: 6033, Seq: 1, Ack: 1, Len: 55
Data (55 bytes)
    Data: 2e0000010082000000800001210000000000000000000000000000000000000000000000…
    [Length: 55]
0000   00 00 00 00 00 00 00 00 00 00 00 00 08 00 45 00   ..............E.
0010   00 6b 99 60 40 00 40 06 a3 2a 7f 00 00 01 7f 00   .k.`@.@..*......
0020   00 01 c7 06 17 91 3d ec d2 a6 8a b3 ff a2 80 18   ......=.........
0030   02 00 fe 5f 00 00 01 01 08 0a 58 01 a7 e9 58 01   ..._......X...X.
0040   a7 e9 2e 00 00 01 00 82 00 00 00 80 00 01 21 00   ..............!.
0050   00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00   ................
0060   00 00 00 00 00 00 68 61 70 72 6f 78 79 5f 75 73   ......haproxy_us
0070   65 72 00 00 01 00 00 00 01                        er.......

So, the only reason there are more than 1 consecutive of these packets, is because of the connections that ProxySQL is closing, because they hit the timing described previously, and HAProxy is retrying, in fact, if more of 4 of these packets are founds, HAProxy will consider the server to be down. If this is correct, to prevent this situation, the simpler solution would be to enforce ProxySQL to always sent the 'initial handshake packet' as soon as it's generated, this way we will be preventing ProxySQL finding data at both ends of the data stream. After a minimal patch:

diff --git a/lib/MySQL_Thread.cpp b/lib/MySQL_Thread.cpp
index b37ec3ab..4bb97ef2 100644
--- a/lib/MySQL_Thread.cpp
+++ b/lib/MySQL_Thread.cpp
@@ -5245,6 +5245,12 @@ void MySQL_Thread::listener_handle_new_connection(MySQL_Data_Stream *myds, unsig
                ioctl_FIONBIO(sess->client_myds->fd, 1);
                mypolls.add(POLLIN|POLLOUT, sess->client_myds->fd, sess->client_myds, curtime);
                proxy_debug(PROXY_DEBUG_NET,1,"Session=%p -- Adding client FD %d\n", sess, sess->client_myds->fd);
+
+               sess->writeout();
        } else {
                free(addr);
                // if we arrive here, accept() failed

We can observe how the traffic being capture has changed:

1300    17.642014001    127.0.0.1   127.0.0.1   TCP 121 46996 β†’ 6033 [PSH, ACK] Seq=1 Ack=1 Win=65536 Len=55 TSval=1508672998 TSecr=1508672997
1302    17.642130120    127.0.0.1   127.0.0.1   TCP 144 6033 β†’ 46996 [PSH, ACK] Seq=1 Ack=56 Win=65536 Len=78 TSval=1508672998 TSecr=1508672998
1304    17.642179639    127.0.0.1   127.0.0.1   TCP 77  6033 β†’ 46996 [PSH, ACK] Seq=79 Ack=56 Win=65536 Len=11 TSval=1508672998 TSecr=1508672998
1532    19.643891974    127.0.0.1   127.0.0.1   TCP 121 47018 β†’ 6033 [PSH, ACK] Seq=1 Ack=1 Win=65536 Len=55 TSval=1508674999 TSecr=1508674999
1534    19.643987351    127.0.0.1   127.0.0.1   TCP 144 6033 β†’ 47018 [PSH, ACK] Seq=1 Ack=56 Win=65536 Len=78 TSval=1508674999 TSecr=1508674999
1536    19.644050236    127.0.0.1   127.0.0.1   TCP 77  6033 β†’ 47018 [PSH, ACK] Seq=79 Ack=56 Win=65536 Len=11 TSval=1508675000 TSecr=1508675000
1653    21.644972886    127.0.0.1   127.0.0.1   TCP 121 47030 β†’ 6033 [PSH, ACK] Seq=1 Ack=1 Win=65536 Len=55 TSval=1508677000 TSecr=1508677000
1655    21.645084782    127.0.0.1   127.0.0.1   TCP 144 6033 β†’ 47030 [PSH, ACK] Seq=1 Ack=56 Win=65536 Len=78 TSval=1508677001 TSecr=1508677000
1657    21.645138118    127.0.0.1   127.0.0.1   TCP 77  6033 β†’ 47030 [PSH, ACK] Seq=79 Ack=56 Win=65536 Len=11 TSval=1508677001 TSecr=1508677001
1774    23.645653165    127.0.0.1   127.0.0.1   TCP 121 47044 β†’ 6033 [PSH, ACK] Seq=1 Ack=1 Win=65536 Len=55 TSval=1508679001 TSecr=1508679001
1776    23.645736253    127.0.0.1   127.0.0.1   TCP 144 6033 β†’ 47044 [PSH, ACK] Seq=1 Ack=56 Win=65536 Len=78 TSval=1508679001 TSecr=1508679001
1778    23.645789782    127.0.0.1   127.0.0.1   TCP 77  6033 β†’ 47044 [PSH, ACK] Seq=79 Ack=56 Win=65536 Len=11 TSval=1508679001 TSecr=1508679001
2006    25.646402390    127.0.0.1   127.0.0.1   TCP 121 47066 β†’ 6033 [PSH, ACK] Seq=1 Ack=1 Win=65536 Len=55 TSval=1508681002 TSecr=1508681002
2008    25.646481857    127.0.0.1   127.0.0.1   TCP 144 6033 β†’ 47066 [PSH, ACK] Seq=1 Ack=56 Win=65536 Len=78 TSval=1508681002 TSecr=1508681002
2010    25.646533666    127.0.0.1   127.0.0.1   TCP 77  6033 β†’ 47066 [PSH, ACK] Seq=79 Ack=56 Win=65536 Len=11 TSval=1508681002 TSecr=1508681002

Now the pattern is exactly the same as connecting directly to MySQL, no further messages are displayed in HAProxy, and no messages are displayed in ProxySQL informing about the termination of a connection.

Test on 2.0.18

Same problem on HAProxy side, but now no errors on ProxySQL side

ΠΈΠ·ΠΎΠ±Ρ€Π°ΠΆΠ΅Π½ΠΈΠ΅

Hi @spumer ,

I tested again in v2.0.18, and HAProxy version:

HA-Proxy version 2.3.8-e572195 2021/03/25 - https://haproxy.org/
Status: stable branch - will stop receiving fixes around Q1 2022.
Known bugs: http://www.haproxy.org/bugs/bugs-2.3.8.html
Running on: Linux 5.9.14-arch1-1 #1 SMP PREEMPT Sat, 12 Dec 2020 14:37:12 +0000 x86_64

And I don't see any issues either in ProxySQL side, or in HAProxy side, HAProxy log stays clear, with no signs of healthchecks failing. Besides, my output captures are also clean as the ones I previously shared (I'm adding a wireshark output too for clarity):

screenshot-2021-04-13-17:44:51

As you can see, the packets with length 77 correspond to the OK packets being set by ProxySQL, here you have a detailed of one of those packets:

screenshot-2021-04-13-17:46:28

Since I can't reproduce the issue you are experiencing, could you please either supply further information about:

  1. HAProxy logs (I assume there should be failures in HAProxy side regarding the failing mysql-check).
  2. Version of HAProxy being used.
  3. Configuration of HAProxy being used.

Or alternatively a reproducible test case showing your issue.

Thank you.

Yea, sorry for that. I had to provide it from the beggining :)

# lsb_release -r
Release:    18.04

# haproxy
HA-Proxy version 1.8.8-1ubuntu0.11 2020/06/22

# uname -r
4.15.0-140-generic
# cat /etc/haproxy/haproxy.cfg
global
    user haproxy
    group haproxy
    daemon

defaults
    log global
    retries 2
        timeout connect 5000
        timeout client  50000
        timeout server  50000

listen mysql-cluster
    bind 127.0.0.1:4040
    mode tcp
    option mysql-check user haproxy_check
    option log-health-checks
    balance roundrobin
    server mysql-1 xx.xx.xx.xx:3306 check
    server mysql-2 xx.xx.xx.xx:3306 check
    server mysql-3 188.127.xx.xx:6033 check

No error logs, this messages occured only on haproxy restart

# cat /var/log/haproxy.log.1
Apr 11 07:35:56 localhost haproxy[103512]: [WARNING] 099/202615 (103512) : Reexecuting Master process
Apr 11 07:37:38 localhost haproxy[103512]: [WARNING] 100/073556 (103512) : Exiting Master process...
Apr 11 07:37:38 localhost haproxy[103512]: [ALERT] 100/073556 (103512) : Current worker 49380 exited with code 143
Apr 11 07:37:38 localhost haproxy[103512]: [WARNING] 100/073556 (103512) : Former worker 103514 exited with code 143
Apr 11 07:37:38 localhost haproxy[103512]: [WARNING] 100/073556 (103512) : All workers exited. Exiting... (143)

Hi @spumer,

testing your setup I have found that the version of HAProxy that you use, uses the old Protocol::HandshakeResponse320, this response is extremely old, and unsupported by ProxySQL, looks like HAProxy defaulted to post-41 as the default mode for MySQL checks in version 2.2-dev8 (https://www.haproxy.org/download/2.3/src/CHANGELOG):

MEDIUM: checks: Make post-41 the default mode for mysql checks

Reason why I didn't experienced your issue. If you replace this line of you config file:

    option mysql-check user haproxy_check

in favor of this one:

    option mysql-check user haproxy_check post-41

The problem should go away. Please confirm if this solves your issue. Thank you.

The problem should go away. Please confirm if this solves your issue.

Problem is gone! Thank you very much!

Was this page helpful?
0 / 5 - 0 ratings

Related issues

lazzyfu picture lazzyfu  Β·  19Comments

lengerad picture lengerad  Β·  23Comments

leeparayno picture leeparayno  Β·  16Comments

williamsantanamoip picture williamsantanamoip  Β·  17Comments

izzyquestion picture izzyquestion  Β·  15Comments