Kong: Kong migrations 0.14.1 to 1.0.2 callback timeouts due to occasional remote C* node selection.

Created on 22 Jan 2019 · 8 Comments · Source: Kong/kong

Summary

Just to bring attention to other core perf improvements: we just ran migrations in a dev environment with 1,500 resources and noticed that schema consensus fails. I am not sure how long Kong waits to confirm schema consensus on the "kong migrations finish" step (step 2 of upgrading nodes from 0.14.1 to 1.0.2); it does not seem to be configurable, nor does the "kong migrations finish" command accept an argument to specify a timeout wait (actually, --db-timeout would indeed help here, so disregard that part; the following points still matter though).

Luckily a second call does confirm that the schema reaches consensus, but if the wait is just hardcoded and throws a failure, maybe a polling technique that keeps checking until consensus is complete would be better?
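
As a rough sketch of what I mean by polling (the retry loop and the timeout value below are just an example, not anything Kong ships), the command could simply be re-run until consensus is confirmed:

until kong migrations finish --db-timeout 120; do
  sleep 10   # keep retrying until schema consensus is confirmed
done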

/ $ kong migrations finish
2019/01/22 21:24:53 [warn] 61#0: *2 [lua] cluster.lua:496: refresh(): [lua-cassandra] found contact point with '0.0.0.0' as rpc_address, using 'server00466' to contact it instead. If this is incorrect you should avoid using '0.0.0.0' in rpc_address, context: ngx.timer
migrating core on keyspace 'kong_dev'...
core migrated up to: 001_14_to_15 (executed)
core migrated up to: 002_15_to_1 (executed)
migrating acl on keyspace 'kong_dev'...
acl migrated up to: 001_14_to_15 (executed)
migrating rate-limiting on keyspace 'kong_dev'...
rate-limiting migrated up to: 002_15_to_10 (executed)
migrating oauth2 on keyspace 'kong_dev'...
oauth2 migrated up to: 001_14_to_15 (executed)
oauth2 migrated up to: 002_15_to_10 (executed)
6 migrations processed
6 executed
Error: [Cassandra error] cluster_mutex callback threw an error: /usr/local/share/lua/5.1/kong/cmd/utils/migrations.lua:200: [Cassandra error] failed to wait for schema consensus: [Read failure] Operation failed - received 0 responses and 1 failures
stack traceback:
        [C]: in function 'assert'
        /usr/local/share/lua/5.1/kong/cmd/utils/migrations.lua:200: in function </usr/local/share/lua/5.1/kong/cmd/utils/migrations.lua:189>
        [C]: in function 'xpcall'
        /usr/local/share/lua/5.1/kong/db/init.lua:390: in function 'cluster_mutex'
        /usr/local/share/lua/5.1/kong/cmd/utils/migrations.lua:189: in function 'finish'
        /usr/local/share/lua/5.1/kong/cmd/migrations.lua:146: in function 'cmd_exec'
        /usr/local/share/lua/5.1/kong/cmd/init.lua:87: in function </usr/local/share/lua/5.1/kong/cmd/init.lua:87>
        [C]: in function 'xpcall'
        /usr/local/share/lua/5.1/kong/cmd/init.lua:87: in function </usr/local/share/lua/5.1/kong/cmd/init.lua:44>
        /usr/local/bin/kong:7: in function 'file_gen'
        init_worker_by_lua:47: in function <init_worker_by_lua:45>
        [C]: in function 'xpcall'
        init_worker_by_lua:54: in function <init_worker_by_lua:52>

  Run with --v (verbose) or --vv (debug) for more details
/ $ kong migrations finish
2019/01/22 21:26:57 [warn] 66#0: *2 [lua] cluster.lua:496: refresh(): [lua-cassandra] found contact point with '0.0.0.0' as rpc_address, using 'server00466' to contact it instead. If this is incorrect you should avoid using '0.0.0.0' in rpc_address, context: ngx.timer
no pending migrations to finish
/ $

A successful migrations finish on a cluster that has fewer resources just looks like this:

/ $ kong migrations finish
2019/01/22 22:28:57 [warn] 145#0: *2 [lua] cluster.lua:496: refresh(): [lua-cassandra] found contact point with '0.0.0.0' as rpc_address, using 'server7676' to contact it instead. If this is incorrect you should avoid using '0.0.0.0' in rpc_address, context: ngx.timer
migrating core on keyspace 'kong_stage'...
core migrated up to: 001_14_to_15 (executed)
core migrated up to: 002_15_to_1 (executed)
migrating rate-limiting on keyspace 'kong_stage'...
rate-limiting migrated up to: 002_15_to_10 (executed)
migrating acl on keyspace 'kong_stage'...
acl migrated up to: 001_14_to_15 (executed)
migrating oauth2 on keyspace 'kong_stage'...
oauth2 migrated up to: 001_14_to_15 (executed)
oauth2 migrated up to: 002_15_to_10 (executed)
6 migrations processed
6 executed
/ $

I am wondering, @bungle, whether this may be another hardening opportunity around DB init/keepalive in the migrations process as well, to help at scale, the kind of work you are familiar with implementing now.

Edit - Also, the subsequent "kong migrations finish" does not honor picking a local-DC C* node to execute against. I run C* settings with RequestDCAwareRoundRobin and LOCAL_QUORUM, yet the command seems to pick a node on its own; in the example here, server9096 is in the other DC compared to where this Kong node runs. Because of this, I think I am getting some timeout-related errors when trying to finish my migrations:

 $ kong migrations finish
2019/01/22 23:37:54 [warn] 445#0: *2 [lua] cluster.lua:496: refresh(): [lua-cassandra] found contact point with '0.0.0.0' as rpc_address, using 'server9096' to contact it instead. If this is incorrect you should avoid using '0.0.0.0' in rpc_address, context: ngx.timer
migrating core on keyspace 'kong_stage'...
Error: [Cassandra error] cluster_mutex callback threw an error: /usr/local/share/lua/5.1/kong/cmd/utils/migrations.lua:200: [Cassandra error] failed to run migration '001_14_to_15' teardown: ...l/share/lua/5.1/kong/db/migrations/core/001_14_to_15.lua:371: [Write timeout] Operation timed out - received only 3 responses.
stack traceback:
        [C]: in function 'assert'
        ...l/share/lua/5.1/kong/db/migrations/core/001_14_to_15.lua:371: in function <...l/share/lua/5.1/kong/db/migrations/core/001_14_to_15.lua:264>
        [C]: in function 'xpcall'
        /usr/local/share/lua/5.1/kong/db/init.lua:564: in function 'run_migrations'
        /usr/local/share/lua/5.1/kong/cmd/utils/migrations.lua:200: in function </usr/local/share/lua/5.1/kong/cmd/utils/migrations.lua:189>
        [C]: in function 'xpcall'
        /usr/local/share/lua/5.1/kong/db/init.lua:390: in function 'cluster_mutex'
        /usr/local/share/lua/5.1/kong/cmd/utils/migrations.lua:189: in function 'finish'
        /usr/local/share/lua/5.1/kong/cmd/migrations.lua:146: in function 'cmd_exec'
        /usr/local/share/lua/5.1/kong/cmd/init.lua:87: in function </usr/local/share/lua/5.1/kong/cmd/init.lua:87>
        [C]: in function 'xpcall'
        /usr/local/share/lua/5.1/kong/cmd/init.lua:87: in function </usr/local/share/lua/5.1/kong/cmd/init.lua:44>
        /usr/local/bin/kong:7: in function 'file_gen'
        init_worker_by_lua:47: in function <init_worker_by_lua:45>
        [C]: in function 'xpcall'
        init_worker_by_lua:54: in function <init_worker_by_lua:52>
stack traceback:
        [C]: in function 'assert'
        /usr/local/share/lua/5.1/kong/cmd/utils/migrations.lua:200: in function </usr/local/share/lua/5.1/kong/cmd/utils/migrations.lua:189>
        [C]: in function 'xpcall'
        /usr/local/share/lua/5.1/kong/db/init.lua:390: in function 'cluster_mutex'
        /usr/local/share/lua/5.1/kong/cmd/utils/migrations.lua:189: in function 'finish'
        /usr/local/share/lua/5.1/kong/cmd/migrations.lua:146: in function 'cmd_exec'
        /usr/local/share/lua/5.1/kong/cmd/init.lua:87: in function </usr/local/share/lua/5.1/kong/cmd/init.lua:87>
        [C]: in function 'xpcall'
        /usr/local/share/lua/5.1/kong/cmd/init.lua:87: in function </usr/local/share/lua/5.1/kong/cmd/init.lua:44>
        /usr/local/bin/kong:7: in function 'file_gen'
        init_worker_by_lua:47: in function <init_worker_by_lua:45>
        [C]: in function 'xpcall'
        init_worker_by_lua:54: in function <init_worker_by_lua:52>

  Run with --v (verbose) or --vv (debug) for more details

Also, it does not seem this command honors the KONG_CASSANDRA_TIMEOUT setting on reads/writes. The error we get comes back faster than the value I set it to (currently 15 seconds):

usr/local/share/lua/5.1/kong/cmd/utils/migrations.lua:205: [Cassandra error] cluster_mutex callback threw an error: /usr/local/share/lua/5.1/kong/cmd/utils/migrations.lua:200: [Cassandra error] failed to run migration '001_14_to_15' teardown: ...l/share/lua/5.1/kong/db/migrations/core/001_14_to_15.lua:371: [Write timeout] Operation timed out - received only 5 responses.
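
For reference, that timeout was set through the usual environment override before running the command (assuming the standard KONG_CASSANDRA_TIMEOUT variable, expressed in milliseconds; the value shown is simply what we currently use):

export KONG_CASSANDRA_TIMEOUT=15000   # cassandra_timeout in ms (15 seconds)
kong migrations finish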

Edit - Apparently we can pass an argument, and somehow adding this did fix our write timeout error:

kong migrations finish --vv --db-timeout 60

This somehow fixed the very last problem we had in getting the migration to succeed; my point about the migrations command not picking a local node still stands, though.

Additional Details & Logs

  • Kong version 1.0.2

All 8 comments

Hi @jeremyjpj0916,

Glad you found the new timeout argument of the migrations command; I was replying to this thread when I saw your edit. This value sets both the schema consensus timeout and the general read/write Cassandra timeout values.

Regarding the DC-awareness not being respected: that is because the migrations rely on the contact_points list, but do not perform a cluster topology scan (like we do when Kong starts). Since the contact_points list is iterated from first to last element (and we use the first one we can connect to), it is typical that for a given node, the first few contact points are those belonging to the same datacenter. This way, when Kong starts, it first attempts to connect to contact points within the same DC. In the same way, the migrations attempt to retrieve one of those CPs to run against, and will naturally connect to CPs in the same DC (again, assuming that they are properly ordered).
If, as of today, the first CPs for a given node are those of another DC, then no matter what, any cluster topology scan event (starting Kong or, if we were to change the migrations command, running said migrations) _would_ trigger a cross-DC query to refresh the cluster topology.
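
To make that ordering concrete (the host and datacenter names below are purely illustrative, not taken from this deployment), a node in DC1 would simply list its local-DC peers first in its configuration:

# kong.conf (or the KONG_CASSANDRA_CONTACT_POINTS / KONG_CASSANDRA_* env overrides):
# local-DC nodes first, remote-DC nodes last, so startup and migrations pick a local node
cassandra_contact_points = cass-dc1-a,cass-dc1-b,cass-dc1-c,cass-dc2-a,cass-dc2-b,cass-dc2-c
cassandra_lb_policy = RequestDCAwareRoundRobin
cassandra_local_datacenter = DC1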

@thibaultcha Thanks for the deeper insight. I did not study much of the underlying migration cmd source beyond the timeout options and how they set the underlying DB connectivity properties, but I think it would be nice to have the topology scan baked into migrations, so that Kong migrations behave like Kong start (with respect to the initial DB interaction being a scan to figure out the optimal nodes to work with prior to heavy migration calls). I am surprised the scan logic was not reused since Kong start does it; was it intentionally left off, harder to integrate, or just an oversight?

Regardless, before I migrate prod you better believe I am going to look at the order of my contact points list and organize it from left to right, local nodes in the first 3 and remote nodes in the last 3, lol 😄. It gave me a heart attack in DMZ because our CORE nodes happened to be in the "right" order by luck, while the DMZ contact points had a few environments not in the right order, and that made all the difference. Figure we can close this unless you want to mark it as a feature request or whatever you consider the end result to be. It would just reduce the risk if Kong were "smarter" when it migrates (as migrations and major version upgrades are probably the riskiest actions a Kong user can take). I try to provide as much detail and feedback as I can. I figure another year of me pointing out anything I see strange or off and getting fixes or enhancements, and Kong will be bulletproof 💪.

I believe it was originally an oversight, but at the same time, as stated, doing a cluster topology scan will _still_ require a cross-DC query if the contact points are not in the right order. That is certainly more acceptable than running all of the migrations against a foreign-DC node. I'll push an update, probably tomorrow, along with the patch for what seems to be the underlying issue in #4226 and #4229.

Glad you think the topology check on migrations is a worthwhile safety feature though; I will tread lightly for now and do it as the current design allows 😄. Hopefully it is an easy win and a simple change.

As for the other folks' issues, maybe I don't experience them because I use full replication and LOCAL_QUORUM? Those days of ONE and low replication factors have hurt me enough with features/functionality to ever go back to that life 😂. Coding around the differing natures of C* seems like a real joy, man.

Also, as an update: it seems that even with the C* contact points in the right order for sure before migrations finish was run (local-DC nodes on the left-hand side of the ENV variable and remote ones on the right-hand side), I could still get the timeout error. The only way to prevent it was to always pass the

--db-timeout 60

arg on migrations finish. I find this strange because the code comments say "(default 60)"; I believe something underlying in the DB connection behavior is only actually set when this argument is explicitly passed versus leaving it off (potentially something that should default to "60" but does not?). I have not investigated it too far.

We always pass along the --db-timeout argument, even if not overridden by a CLI argument, so the same code path is taken whether or not you specify this argument (see https://github.com/Kong/kong/blob/master/kong/cmd/migrations.lua#L73-L85).

Now, reading the error again, I realize this is a C* server-side timeout error, and not a client timeout (that is, a timeout while replicating a value across your peers). Have you tried increasing the write_request_timeout_in_ms setting of your Cassandra configuration?
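
For reference, these are server-side settings in cassandra.yaml on each C* node; the values below are only an example of bumping them (the stock Cassandra defaults are 2000 ms for writes and 5000 ms for reads):

# cassandra.yaml (per Cassandra node), server-side request timeouts
write_request_timeout_in_ms: 10000   # default 2000
read_request_timeout_in_ms: 10000    # default 5000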

@jeremyjpj0916 I just pushed this branch: https://github.com/Kong/kong/tree/fix/run-cass-lb-policy-in-migrations which makes the C* connector respect the LB policy when starting. Not sure if it solves the issue at all since it seems to be a C* mis-configuration after all, but it may still be something we want to ship as it still prevents running migrations against a cross-DC node :)
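
For context, here is a minimal sketch of what passing a DC-aware LB policy to the lua-cassandra cluster client looks like; this is only an illustration based on the lua-cassandra documentation, not the branch's actual diff, the host/datacenter names are made up, and it has to run inside OpenResty with a lua_shared_dict:

-- minimal sketch, assuming lua-cassandra's cluster module and its
-- DC-aware round-robin policy; not taken from the Kong branch above
local Cluster = require "resty.cassandra.cluster"
local dc_rr   = require "resty.cassandra.policies.lb.dc_rr"

local cluster, err = Cluster.new {
  shm            = "cassandra",                     -- needs: lua_shared_dict cassandra 1m;
  contact_points = { "cass-dc1-a", "cass-dc1-b" },  -- hypothetical hosts
  keyspace       = "kong_dev",
  lb_policy      = dc_rr.new("DC1"),                -- prefer peers in the local DC
}
if not cluster then
  ngx.log(ngx.ERR, "could not create cluster: ", err)
end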

@thibaultcha good catch! I had not realized that. I noticed our config sets that write timeout to 2000 ms and the read timeout to 5000 ms. This is not a setting I have ever adjusted myself; I may try bumping both up to 10000 to be safe, thanks for the details. And agreed on the feature, certainly a nice-to-have, but I think I now see the light: the default write timeout with this cluster's settings must be too low to accomplish all the glorious DB modifications Kong wants to do 😄!

Feel free to close this issue whenever; I will be adjusting my C* settings, and next time a migration rolls around I will see if that fixes the situation and come back with details if I find any strange behavior.
