(OAuth2) Kong expired bearer tokens: is tombstone/compaction C* cleanup occurring?

Created on 24 Jul 2019 · 9 comments · Source: Kong/kong

Summary

As discussed at one point here:

https://github.com/Kong/kong/pull/4239#issuecomment-481372370

I began to investigate further whether expired Kong OAuth2 bearer tokens were being tombstoned and subsequently compacted away over time.

To test, I generated 33,000+ OAuth2 tokens in a dev environment about 14 days ago. We usually run a script that deletes OAuth2 tokens nightly, but I disabled it in dev for this test to see whether Kong and C* naturally prune the table and keep it clean of expired, TTL'd OAuth2 token data.

I currently see they are all still present in the db:

****@cqlsh:kong_dev> select count(*) from oauth2_tokens;
 count
-------
 33657

(1 rows)

Warnings :
Aggregation query used without partition key

A sample of some older, expired tokens (each is valid for 3600 seconds, i.e. 1 hour), seen from 2019-07-10:

****a@cqlsh> use kong_dev;
****@cqlsh:kong_dev> select * from oauth2_tokens;

 id                                   | access_token                     | authenticated_userid | created_at                      | credential_id                        | expires_in | refresh_token | scope | service_id | token_type
--------------------------------------+----------------------------------+----------------------+---------------------------------+--------------------------------------+------------+---------------+-------+------------+------------
 2165c4fd-82e9-4be5-bc47-6e09eb2deee5 | 33KCvfJM23VlQ0U5c53YqJOYfH05Cdpz |                 null | 2019-07-10 22:28:38.000000+0000 | c28d35ba-cb6f-4cb3-98b0-ae0566ff49e0 |       3600 |          null |  null |       null |     bearer
 51ea2b33-b5a8-4ca1-bfce-4cb6c5284b45 | ddQOvWwy2mDwbadVybDGewVeJhIl7eHP |                 null | 2019-07-10 04:17:21.000000+0000 | c28d35ba-cb6f-4cb3-98b0-ae0566ff49e0 |       3600 |          null |  null |       null |     bearer

I also turned tracing on for this query to see whether I could spot any tombstones in place; here is a sample of that output:

Executing seq scan across 2 sstables for (d76350c3-a43a-48ed-a3b1-82ceedcfa9e0, min(-9223372036854775808)] [ReadStage-3] | 2019-07-24 00:35:30.981000 |  10.86.168.39 |        1451919 | 127.0.0.1
                                                                        Computing ranges to query [Native-Transport-Requests-1] | 2019-07-24 00:35:30.984000 |  10.86.168.39 |        1454375 | 127.0.0.1
                                                                         Read 100 live rows and 1 tombstone cells [ReadStage-4] | 2019-07-24 00:35:30.986000 |  10.86.168.39 |        1456683 | 127.0.0.1
                                                                        Computing ranges to query [Native-Transport-Requests-1] | 2019-07-24 00:35:30.989000 |  10.86.168.39 |        1459650 | 127.0.0.1
                                                                        Computing ranges to query [Native-Transport-Requests-1] | 2019-07-24 00:35:30.992000 |  10.86.168.39 |        1462315 | 127.0.0.1
                                                                        Computing ranges to query [Native-Transport-Requests-1] | 2019-07-24 00:35:30.994000 |  10.86.168.39 |        1464918 | 127.0.0.1
                                                                         Read 100 live rows and 0 tombstone cells [ReadStage-4] | 2019-07-24 00:35:30.997000 |  10.86.168.39 |        1467115 | 127.0.0.1

Note that the reads were reporting lots of 0 tombstone cells, so does that mean tombstoning is not occurring for many of the OAuth2 bearer tokens generated by Kong?

I also reviewed the table definition in the C* keyspace:

CREATE TABLE kong_dev.oauth2_tokens (
    id uuid PRIMARY KEY,
    access_token text,
    authenticated_userid text,
    created_at timestamp,
    credential_id uuid,
    expires_in int,
    refresh_token text,
    scope text,
    service_id uuid,
    token_type text
) WITH bloom_filter_fp_chance = 0.01
    AND caching = {'keys': 'ALL', 'rows_per_partition': 'NONE'}
    AND comment = ''
    AND compaction = {'class': 'org.apache.cassandra.db.compaction.SizeTieredCompactionStrategy', 'max_threshold': '32', 'min_threshold': '4'}
    AND compression = {'chunk_length_in_kb': '64', 'class': 'org.apache.cassandra.io.compress.LZ4Compressor'}
    AND crc_check_chance = 1.0
    AND dclocal_read_repair_chance = 0.1
    AND default_time_to_live = 0
    AND gc_grace_seconds = 864000
    AND max_index_interval = 2048
    AND memtable_flush_period_in_ms = 0
    AND min_index_interval = 128
    AND read_repair_chance = 0.0
    AND speculative_retry = '99PERCENTILE';
CREATE INDEX oauth2_tokens_refresh_token_idx ON kong_dev.oauth2_tokens (refresh_token);
CREATE INDEX oauth2_tokens_authenticated_userid_idx ON kong_dev.oauth2_tokens (authenticated_userid);
CREATE INDEX oauth2_credential_id_idx ON kong_dev.oauth2_tokens (credential_id);
CREATE INDEX oauth2_tokens_service_id_idx ON kong_dev.oauth2_tokens (service_id);
CREATE INDEX oauth2_tokens_access_token_idx ON kong_dev.oauth2_tokens (access_token);

gc_grace_seconds = 864000 is 10 days, so at 14 days we are well past that.

Am I missing an important piece of knowledge or is something not behaving as expected?

I suppose we also have to account for the Compaction strategy:

compaction = {'class': 'org.apache.cassandra.db.compaction.SizeTieredCompactionStrategy', 'max_threshold': '32', 'min_threshold': '4'}

This is the default; maybe in Kong's case this strategy is not the best approach?

Steps To Reproduce

  1. Generate a ton of OAuth2 Bearer tokens in your C* cluster through the Kong OAuth2 plugin

  2. Check whether, after gc_grace_seconds = 864000 (10 days), all cells are tombstoned and, some time after that, whether any compaction occurs (one way to check is sketched below).
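
One way to check this is from cqlsh. A minimal sketch (assuming the kong_dev keyspace shown above; adjust keyspace and limits to your environment):

    -- Remaining TTL per column (null means no TTL was set on that cell):
    SELECT id, TTL(access_token), TTL(refresh_token)
    FROM kong_dev.oauth2_tokens LIMIT 10;

    -- With tracing enabled, the read path reports live rows vs. tombstone cells:
    TRACING ON;
    SELECT * FROM kong_dev.oauth2_tokens LIMIT 100;
    TRACING OFF;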

Additional Details & Logs

  • Kong version 1.2

All 9 comments

Okay, the tokens seemingly did clear out (well, at least got tombstoned) sometime between the original post and now (the 26th):

*****@cqlsh:kong_dev> select count(*) from oauth2_tokens;

 count
-------
     0

(1 rows)

Warnings :
Aggregation query used without partition key

Read 0 live rows and 33665 tombstone cells for query SELECT * FROM kong_dev.oauth2_tokens WHERE  LIMIT 100 (see tombstone_warn_threshold)

I wonder why the tombstoning of the records took over 14 days; I am only noticing it now, around the 16th day.

@jeremyjpj0916 a few things to clean up.

First, gc_grace_seconds is the amount of time a record will spend under a tombstone _after_ it has been deleted or aged out via TTL. If you haven't deleted the records, and their columns haven't expired, you wouldn't expect to see them deleted and thus under a tombstone. This is a good read on the subject: https://thelastpickle.com/blog/2018/03/21/hinted-handoff-gc-grace-demystified.html
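
To illustrate against the table above (a sketch only; the id is taken from your sample output):

    -- An explicit delete (or a TTL'd cell expiring) writes a tombstone:
    DELETE FROM kong_dev.oauth2_tokens
    WHERE id = 2165c4fd-82e9-4be5-bc47-6e09eb2deee5;
    -- That tombstone then sticks around for up to gc_grace_seconds
    -- (864000 s = 10 days on this table) before compaction may drop it.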

The expires_in value on the record is only leveraged by the Lua logic in the plugin (https://github.com/Kong/kong/blob/master/kong/plugins/oauth2/access.lua#L781-L783); it doesn't have any bearing on when the token is actually deleted by Kong.

The compaction strategy has to do with how Cassandra manages the sstables on disk, not with TTLs or tombstone cleanup. It would be relevant if you saw tombstones sticking around longer than gc_grace_seconds, but that doesn't seem to be the question here. Another good read: https://academy.datastax.com/support-blog/cleaning-tombstones-datastax-dse-and-apache-cassandra.

Given all this, I'd say the behavior is expected if you removed the tooling that prunes the tokens from the DB, and the token_expiration plugin config is 0 or very high (see https://github.com/Kong/kong/blob/master/kong/plugins/oauth2/access.lua#L54-L85 for that behavior).

@p0pr0ck5 Appreciate the insight, so when you say

The expires_in value on the record is only leveraged by the Lua logic in the plugin (https://github.com/Kong/kong/blob/master/kong/plugins/oauth2/access.lua#L781-L783); it doesn't have any bearing on when the token is actually deleted by Kong.

If that is the case, what does have a bearing on when the token is actually deleted by Kong (if it is at all)? The most logical design to me would be that, once an OAuth2 token has expired per its conf.expires_in time, Kong + Cassandra rid themselves of the record: either via a column of the OAuth2 token table reaching expiration (which would handle it from the C* side), or via a forced delete of old tokens Kong recognizes as expired via its cache (which could be done on the Kong side but feels wrong to me).

And to be clear (I realize I did not include this in the original post above), our token_expiration is not set to 0 or very high:

{
  "created_at": 1548305715,
  "config": {
    "refresh_token_ttl": 1209600,
    "scopes": null,
    "mandatory_scope": false,
    "enable_client_credentials": true,
    "accept_http_if_already_terminated": false,
    "hide_credentials": false,
    "enable_implicit_grant": false,
    "global_credentials": true,
    "enable_authorization_code": false,
    "enable_password_grant": false,
    "token_expiration": 3600,
    "anonymous": null,
    "provision_key": "0ZLfr6AdTWk0gaUluVOyPnFYn0IcXyir",
    "auth_header_name": "authorization"
  },
  "id": "2eea9341-b6bc-45ba-80af-a2a38acdd39f",
  "service": null,
  "name": "oauth2",
  "enabled": true,
  "route": {
    "id": "423161dc-1d38-47eb-9093-0c9c20eb946b"
  },
  "run_on": "first",
  "consumer": null,
  "protocols": [
    "http",
    "https"
  ],
  "tags": null
}

Just set to an hour.

So the desired behavior from a gateway admin's perspective, imo, would be that after an hour the cleanup happens automatically: either Kong inserts the OAuth2 token record with a column TTL corresponding to "token_expiration": 3600 so that C* wipes it, or Kong purges expired tokens with delete commands from its side (which would be ugly and not a good path forward).

Regardless, we can just keep our daily cron that deletes tokens 👍. It was just something I thought to bring up, since it had sort of sounded like the 1.x tables holding temporary tokens get cleaned up by Kong automatically with respect to the token's functional TTL validity. Feel free to close this if there is nothing further around it.

After reading up on C*, I suppose Kong would not want to do this, but maybe it makes sense for teams that know what they want their OAuth2 TTL to be (minutes/hours/days). They could then change the table like so:

    AND default_time_to_live = 86400
    AND gc_grace_seconds = 86400

So 1 day
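
Applied to the token table, that would look something like the following (a sketch only; I have not actually run this against our clusters):

    ALTER TABLE kong_dev.oauth2_tokens
        WITH default_time_to_live = 86400
        AND gc_grace_seconds = 86400;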

As for expected behavior, I guess Kong is in line. But since people generally use OAuth with short-lived TTLs, I think a purge that occurs naturally on the DB side daily is probably good for 99% of users (and really the only secure way to do OAuth; honestly, I consider long-lived OAuth2 tokens as bad as key-auth). Closing this, as I got the info I needed. Thanks again @p0pr0ck5.

It is worth pointing out that Kong in fact does insert OAuth2 token columns with a TTL. This call will eventually end up here, and use the insert_ttl prepared query, which specifies USING TTL <ttl>.

Note that you can retrieve the TTL for a column like so:

SELECT TTL(access_token) FROM oauth2_tokens WHERE id = ?;

Thus, Kong inserts tokens with a column TTL of conf.refresh_token_ttl when it is specified and greater than 0 (its default is 1209600, or 14 days).
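
Illustratively, the shape of such an insert looks like this. This is not Kong's literal prepared statement, just a sketch with placeholder values (the credential_id is taken from the sample output above):

    INSERT INTO kong_dev.oauth2_tokens
        (id, credential_id, access_token, expires_in, token_type, created_at)
    VALUES
        (uuid(), c28d35ba-cb6f-4cb3-98b0-ae0566ff49e0, 'some-access-token', 3600, 'bearer', toTimestamp(now()))
    USING TTL 1209600;

After 1209600 seconds, the cells written by this statement expire together.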

Additionally, note the CQL documentation on TTL:

Expired data is marked with a tombstone on the next read on the read path, but it remains for a maximum of gc_grace_seconds.

The key phrase here being "on the next read".

So, in your case, you inserted the tokens with conf.refresh_token_ttl = 1209600 (the default value) and conf.token_expiration = 3600. Thus, the column contains expires_in = 3600, and its TTL would be 1209600. You then waited 14 days and performed one or more reads across the entire table (select count(*) from oauth2_tokens;). One of these reads triggered the tombstone process at some point. Now your columns are marked with a tombstone and will remain as such for gc_grace_seconds, which is set to 10 days on this table.

That would be my guess as to what happened here.

That is actually a really good catch @thibaultcha, I wasn't even thinking about refresh_token_ttl (no one really uses that pattern internally), and now that I read the code comment you pointed to, it explicitly mentions that's what dictates the token's TTL in the plugin (derp on my part for missing that). So that field is why the TTL is undesirable: it is so long by default (and of course a read needs to occur on that table after the TTL expires for the gears to start turning). And for oauth2_tokens the gc_grace_seconds is set to 10 days by default (it seems to be a schema-level thing so far; any reason for 10 days vs. 1 day as a default for ridding tables of tombstones, besides the slight risk of zombie data if a node goes down long enough without a repair being done?). So far I haven't touched gc_grace_seconds on that table (in another git issue we adjusted cluster_events in non-prod to a gc_grace_seconds of 1 day, because the number of tombstones and records generated in a decently active environment of OAuth plugins + proxies creates a lot of cluster-events DB traffic and a whole lot of tombstones rather quickly).

Since no one uses the refresh token pattern for us anyway, I am going to change that massive default to 3600, just like our access_token flow. Really appreciate the feedback; that field was the mystery here.

The gc_grace_seconds is a C* setting, not a Kong setting. The ten-day window is the default for Cassandra.

@p0pr0ck5 right, and I read it's not configurable in cassandra.yaml, but you can change it per table by doing something like: alter table <some table> with gc_grace_seconds = XXXXXX; (the example I found used a generic table name; dunno if that's out of date for modern 3.X.X). It applies to all of that table's data though (but luckily our C* clusters just run Kong, so maybe not a big deal). It would be good for Kong to put together some recommendations on C* settings geared specifically for Kong somewhere in the docs (which arguably change depending on how you run Kong for best practice). In our case, using a read/write-intensive plugin like oauth2 I think forces us to apply something a bit more aggressive; 10 days is a long time for idle tombstones given some types of gateway behavior, especially considering the failed-tombstone threshold is a small 100,000 by default. I think a once-a-night cleanup is just about right for us at our current levels.
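
For reference, the current per-table values can be inspected from the schema tables, and the change is applied table by table. A hedged sketch (the table choice and one-day value just mirror what we discussed here, not a recommendation):

    -- Current gc_grace_seconds for every table in the Kong keyspace:
    SELECT table_name, gc_grace_seconds
    FROM system_schema.tables
    WHERE keyspace_name = 'kong_dev';

    -- Lower it per table as needed, e.g. for the chatty cluster_events table:
    ALTER TABLE kong_dev.cluster_events WITH gc_grace_seconds = 86400;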
