ProxySQL suddenly crashes, always at night, on most days of the week
2021-02-16 04:07:25 [INFO] Dumping current MySQL Servers structures for hostgroup ALL
HID: 11 , address: 10.5.0.146 , port: 3306 , gtid_port: 0 , weight: 1 , status: ONLINE , max_connections: 1000 , max_replication_lag: 0 , use_ssl: 0 , max_latency_ms: 0 , comment: multidb5 - test
HID: 11 , address: 10.5.0.128 , port: 3306 , gtid_port: 0 , weight: 1 , status: ONLINE , max_connections: 1000 , max_replication_lag: 0 , use_ssl: 0 , max_latency_ms: 0 , comment: multidb2 - test
/usr/bin/proxysql(_Z13crash_handleri+0x1a)[0x4d5fda]
/lib64/libc.so.6(+0x363b0)[0x7fe48b6c03b0]
/usr/bin/proxysql[0x8614ca]
/usr/bin/proxysql[0x93164e]
/usr/bin/proxysql(mysql_close+0xe1)[0x931bfc]
/usr/bin/proxysql(_Z19monitor_ping_threadPv+0x4a0)[0x5982a0]
/usr/bin/proxysql(_ZN14ConsumerThread3runEv+0xf7)[0x5aa957]
/lib64/libpthread.so.0(+0x7e65)[0x7fe48c8a0e65]
/lib64/libc.so.6(clone+0x6d)[0x7fe48b78888d]
2021-02-18 03:26:07 main.cpp:1573:ProxySQL_daemonize_phase3(): [ERROR] ProxySQL crashed. Restarting!
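To compare crashes across nights, the raw backtrace lines in the error log can be reduced to a list of symbol names. The following is a minimal sketch, assuming the frames follow the glibc `backtrace_symbols()` layout seen above (`module(symbol+offset)[address]`); the names `FRAME_RE`, `parse_frames`, and `named_symbols` are hypothetical helpers, not part of ProxySQL.

```python
import re

# Matches glibc backtrace_symbols()-style lines such as:
#   /usr/bin/proxysql(mysql_close+0xe1)[0x931bfc]
#   /lib64/libc.so.6(+0x363b0)[0x7fe48b6c03b0]
#   /usr/bin/proxysql[0x8614ca]
FRAME_RE = re.compile(
    r"^(?P<module>\S+?)"
    r"(?:\((?P<symbol>[^+)]*)(?:\+(?P<offset>0x[0-9a-fA-F]+))?\))?"
    r"\[(?P<address>0x[0-9a-fA-F]+)\]$"
)

SAMPLE = """\
2021-02-16 04:07:25 [INFO] Dumping current MySQL Servers structures for hostgroup ALL
/usr/bin/proxysql(_Z13crash_handleri+0x1a)[0x4d5fda]
/lib64/libc.so.6(+0x363b0)[0x7fe48b6c03b0]
/usr/bin/proxysql[0x8614ca]
/usr/bin/proxysql[0x93164e]
/usr/bin/proxysql(mysql_close+0xe1)[0x931bfc]
/usr/bin/proxysql(_Z19monitor_ping_threadPv+0x4a0)[0x5982a0]
/usr/bin/proxysql(_ZN14ConsumerThread3runEv+0xf7)[0x5aa957]
/lib64/libpthread.so.0(+0x7e65)[0x7fe48c8a0e65]
/lib64/libc.so.6(clone+0x6d)[0x7fe48b78888d]
2021-02-18 03:26:07 main.cpp:1573:ProxySQL_daemonize_phase3(): [ERROR] ProxySQL crashed. Restarting!
"""


def parse_frames(log_text):
    """Return one dict per backtrace frame found in the log text."""
    frames = []
    for line in log_text.splitlines():
        match = FRAME_RE.match(line.strip())
        if match:
            frames.append({
                "module": match.group("module"),
                # Frames without a resolved symbol get None.
                "symbol": match.group("symbol") or None,
                "address": match.group("address"),
            })
    return frames


def named_symbols(frames):
    """Keep only the frames that carry a (possibly mangled) symbol name."""
    return [f["symbol"] for f in frames if f["symbol"]]


if __name__ == "__main__":
    for name in named_symbols(parse_frames(SAMPLE)):
        print(name)
```

The symbol names stay mangled (for example `_Z19monitor_ping_threadPv`); piping them through a demangler such as `c++filt` would make them easier to read.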
A longer version of the error log is attached to this issue.
I have no information on why it is crashing, so I can't tell you how to reproduce the problem.
Please let me know if any information is missing from this issue.
ProxySQL version 2.0.15-20-g32bb92c
OS: CentOS Linux release 7.7.1908 (Core)
proxysql-log-before-crash.txt
core.23293.zip
core.30100.zip
core.31511.zip
core.2338.zip
core.17071.zip
core.17127.zip
proxysql_binary.zip
Is there a specific bug that was fixed in 2.0.17 that you think I am experiencing?
On Thu, Mar 4, 2021, 17:24 crydx notifications@github.com wrote:
upgrade to 2.0.17 may solve the problem
@JavierJF : can you please look into this?
Thanks
I'll add some more information that might be useful: we are using ProxySQL to access a 3-node InnoDB Cluster (single master).
Each of these nodes has several IP addresses, and I added all of them to mysql_servers.
I also configured mysql_group_replication_hostgroups and set max_writers to 7 so that all IP addresses of the source node can be used for writes.
It is also interesting that we have 2 ProxySQL instances running on different servers (same version) which we use to access this InnoDB Cluster. Both of them crash multiple times a week, but not at the same time.
Please let me know if any more information would be helpful in solving this issue.
We are also using a 3-node InnoDB Cluster (single master) and two ProxySQL instances.
We are hitting the same problem. Do you run any scripts that query ProxySQL information through the admin port, like "select * from runtime_mysql_servers"?
Yes, we have a local script which checks the number of open client connections.
I think these scripts that query ProxySQL information through the admin port may be causing the crash. We changed our scripts; you can give that a try.
What change did you perform to the scripts?
We simply do not use the admin port to get information any more.
So you've given up on the function that the scripts were used for?
Or did you get the data in another way?
Our scripts query the number of client connections to ProxySQL.
Our scripts queried the number of online MySQL servers.
Now we get the number of online servers from the MySQL servers directly.
Hi,
After inspecting all the provided core dumps, it looks like most of the memory issues can be traced back to 'monitor_group_replication_thread', as can be seen in this backtrace:
#0 atomic_load_p (mo=atomic_memory_order_relaxed, a=0x199eb8) at include/jemalloc/internal/atomic.h:62
#1 rtree_leaf_elm_bits_read (tsdn=<optimized out>, rtree=<optimized out>, dependent=true, elm=0x199eb8) at include/jemalloc/internal/rtree.h:175
#2 rtree_szind_slab_read (r_slab=<synthetic pointer>, r_szind=<synthetic pointer>, dependent=true, key=15252645203506290, rtree_ctx=0x7fe482ffda10, rtree=<optimized out>, tsdn=0x7fe482ffd9e0) at include/jemalloc/internal/rtree.h:500
#3 ifree (slow_path=false, tcache=0x7fe482ffdbd0, ptr=0x363033333d7472, tsd=0x7fe482ffd9e0) at src/jemalloc.c:2490
#4 je_free_default (ptr=0x363033333d7472) at src/jemalloc.c:2710
#5 0x000000000093164e in mysql_close_options (mysql=0x7fe47e201900) at /opt/proxysql/deps/mariadb-client-library/mariadb_client/libmariadb/mariadb_lib.c:1866
#6 0x0000000000931bfc in mysql_close (mysql=0x7fe47e201900) at /opt/proxysql/deps/mariadb-client-library/mariadb_client/libmariadb/mariadb_lib.c:1970
#7 0x00000000005a6516 in monitor_group_replication_thread (arg=<optimized out>) at MySQL_Monitor.cpp:1508
#8 0x00000000005aa957 in ConsumerThread::run (this=0x7fe48300e000) at MySQL_Monitor.cpp:82
#9 0x00007fe48c8a0e65 in start_thread () from /lib64/libpthread.so.0
#10 0x00007fe48b78888d in clone () from /lib64/libc.so.6
In particular, they all point to the following line from mariadb_lib:
free(mysql->options.extension->plugin_dir);
Since that value is never updated outside of the MariaDB client library, this by itself points to a memory corruption error. Furthermore, there is only one crash with a different backtrace:
(gdb) bt
#0 0x00000000008f8e9e in re2::DFA::InlinedSearchLoop (this=<optimized out>, params=0x7fe476bfc1a0, have_first_byte=false, want_earliest_match=false, run_forward=true, this=<optimized out>) at re2/dfa.cc:1409
#1 0x00000000008fbc5c in FastSearchLoop (params=0x7fe476bfc1a0, this=0x7fe47fe22300) at re2/dfa.cc:1607
#2 Search (matches=0x0, epp=<synthetic pointer>, failed=0x7fe476bfc290, run_forward=<optimized out>, want_earliest_match=false, anchored=true, context=<synthetic pointer>, text=..., this=0x7fe47fe22300) at re2/dfa.cc:1800
#3 re2::Prog::SearchDFA (this=0x7fe48a43b200, text=..., const_context=..., anchor=anchor@entry=re2::Prog::kAnchored, kind=<optimized out>, kind@entry=re2::Prog::kFirstMatch, match0=match0@entry=0x7fe476bfc2d0, failed=failed@entry=0x7fe476bfc290, matches=matches@entry=0x0)
at re2/dfa.cc:1900
#4 0x00000000008d65c2 in re2::RE2::Match (this=this@entry=0x7fe48a42a240, text=..., startpos=startpos@entry=0, endpos=<optimized out>, re_anchor=<optimized out>, submatch=submatch@entry=0x7fe476bfc4f0, nsubmatch=nsubmatch@entry=0) at re2/re2.cc:708
#5 0x00000000008d8271 in re2::RE2::DoMatch (this=0x7fe48a42a240, text=..., re_anchor=<optimized out>, consumed=0x0, args=0x0, n=0) at re2/re2.cc:805
#6 0x0000000000588710 in Apply<bool (*)(re2::StringPiece const&, re2::RE2 const&, re2::RE2::Arg const* const*, int), re2::StringPiece> (re=..., sp=..., f=<optimized out>) at ../deps/re2/re2/re2/re2.h:347
#7 re2::RE2::PartialMatch<>(re2::StringPiece const&, re2::RE2 const&) (text=..., re=...) at ../deps/re2/re2/re2/re2.h:370
#8 0x00000000005847eb in admin_session_handler (sess=0x7fe48723a900, _pa=0x7fe48a444c00, pkt=<optimized out>) at ProxySQL_Admin.cpp:3824
#9 0x0000000000536cfc in MySQL_Session::handler (this=this@entry=0x7fe48723a900) at MySQL_Session.cpp:3123
#10 0x000000000055088d in child_mysql (arg=<optimized out>) at ProxySQL_Admin.cpp:4522
#11 0x00007fe48c8a0e65 in start_thread () from /lib64/libpthread.so.0
#12 0x00007fe48b78888d in clone () from /lib64/libc.so.6
This crash occurs inside a RE2::PartialMatch on a completely valid query whose memory was properly initialized, and the pa->match_regexes struct from admin is also properly initialized:
(gdb) p *(RE2*)(pa->match_regexes.re[0])
$4 = {pattern_ = "^SELECT\\s+@@max_allowed_packet\\s*", options_ = {static kDefaultMaxMem = 8388608, encoding_ = re2::RE2::Options::EncodingUTF8, posix_syntax_ = false, longest_match_ = false, log_errors_ = false, max_mem_ = 8388608, literal_ = false, never_nl_ = false,
dot_nl_ = false, never_capture_ = false, case_sensitive_ = false, perl_classes_ = false, word_boundary_ = false, one_line_ = false}, prefix_ = "", prefix_foldcase_ = false, entire_regexp_ = 0x7fe48a53f140, suffix_regexp_ = 0x7fe48a53f140, prog_ = 0x7fe48a43b200,
is_one_pass_ = true, rprog_ = 0x0, error_ = 0x7fe48a408318, error_code_ = re2::RE2::NoError, error_arg_ = "", num_captures_ = 0, named_groups_ = 0x0, group_names_ = 0x0, rprog_once_ = {_M_once = 0}, num_captures_once_ = {_M_once = 2}, named_groups_once_ = {_M_once = 0},
group_names_once_ = {_M_once = 0}}
Everything points to a memory corruption error. Since your configuration involves mysql_group_replication_hostgroups, these crashes could be caused by the memory corruption described in this closed issue: https://github.com/sysown/proxysql/issues/3261.
The fix for that issue is available since 'v2.0.16', so my recommendation would be to update to at least 'v2.0.16' and check whether the issue is still present or can no longer be reproduced.
Hope the issue is solved with that, thanks.
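For readers following the analysis above, the pointer passed to `je_free_default` in frame #4 (`ptr=0x363033333d7472`) can be inspected on its own. The sketch below reinterprets that value as the bytes a little-endian machine would hold in memory, assuming an x86-64 host (the thread does not state the architecture). The helper names `pointer_bytes` and `looks_like_text` are hypothetical.

```python
def pointer_bytes(ptr, width=8):
    """Return the in-memory byte layout of a pointer on a little-endian machine."""
    return ptr.to_bytes(width, "little")


def looks_like_text(raw):
    """True if the bytes, ignoring trailing NULs, are all printable ASCII."""
    payload = raw.rstrip(b"\x00")
    return bool(payload) and all(32 <= b < 127 for b in payload)


# Pointer value taken from frames #3 and #4 of the jemalloc backtrace.
bad_ptr = 0x363033333d7472
# Address of the MYSQL struct from frame #6, used here as a "normal" pointer.
sane_ptr = 0x7fe47e201900

print(pointer_bytes(bad_ptr))                     # b'rt=3306\x00'
print(looks_like_text(pointer_bytes(bad_ptr)))    # True
print(looks_like_text(pointer_bytes(sane_ptr)))   # False
```

The bytes decode to the NUL-terminated string `rt=3306`, which does not look like a heap address at all. One plausible reading, which is an interpretation and not something stated in the thread, is that the `plugin_dir` slot was overwritten with part of a string containing a port number. That would be consistent with the memory corruption diagnosis.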
Thank you very much!!
I will update the version and see if this solves the problem
@JavierJF I believe this is also affecting proxysql_2.1.0. Will it be fixed in 2.1.1?
@izzyquestion, can you say how to reproduce the problem?
@egezonberisha Yes, the fix was introduced for v2.1.1 via this PR: https://github.com/sysown/proxysql/pull/3286, so it will be present in the v2.1.1 release.
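The version guidance in this thread (fix available since v2.0.16 in the 2.0.x series and in v2.1.1 for the 2.1.x series) can be turned into a small check for a fleet of instances. This is a rough sketch: `FIRST_FIXED`, `parse_version`, and `has_fix` are hypothetical names, and the handling of any series other than 2.0.x and 2.1.x is an assumption, since the thread does not cover them.

```python
import re

# First patch release carrying the fix, per series, as stated in this thread.
FIRST_FIXED = {(2, 0): 16, (2, 1): 1}


def parse_version(version):
    """Extract (major, minor, patch) from strings like '2.0.15-20-g32bb92c'."""
    match = re.match(r"v?(\d+)\.(\d+)\.(\d+)", version.strip())
    if not match:
        raise ValueError(f"unrecognized version string: {version!r}")
    return tuple(int(part) for part in match.groups())


def has_fix(version):
    """Return True if the version is expected to include the fix."""
    major, minor, patch = parse_version(version)
    series = (major, minor)
    if series in FIRST_FIXED:
        return patch >= FIRST_FIXED[series]
    # Assumption: series newer than 2.1 include the fix and older series
    # do not. Neither case is discussed in the thread.
    return series > (2, 1)


if __name__ == "__main__":
    for v in ["2.0.15-20-g32bb92c", "2.0.16", "2.0.17", "2.1.0", "2.1.1"]:
        print(v, has_fix(v))
```

The git-describe suffix (`-20-g32bb92c`) is ignored, so a build made from commits after a tagged release is treated like the tag itself.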