Is there a way to support invalidating multiple API keys via their IDs with a single request? For example, perhaps we could support comma-separated IDs?
```
DELETE /_security/api_key
{
  "id" : "key1,key2,key3"
}
```
CC @bizybot
Is it expected that this API would be called in bulk (i.e., with more than a few keys)? If not, what's the overhead of sending multiple requests, or of sending multiple requests in parallel? We aim to keep our APIs and their surface areas small; otherwise we bloat how much we have to maintain.
If we were to enhance this API to allow deleting multiple keys at once, I wouldn't be in favor of using a comma to separate values. We should be explicit that multiple keys are being deleted, something like:

```
{
  "ids" : [ "key1", "key2", "key3" ]
}
```
This makes it so much clearer and easier to work with, for clients and us internally.
Pinging @elastic/es-security (:Security/Security)
Pinging @elastic/es-ui (:ES-UI)
Hi Security team, the Ingest Management team has a need for bulk invalidate API keys. We expose a UI for taking bulk actions on agents, including bulk unenroll and bulk force unenroll. During force unenrollment, we invalidate two API keys per agent. The user can potentially unenroll up to 10,000 agents at a time using this UI, resulting in the need to invalidate 20,000 keys. As you can probably imagine, sending 20,000 invalidate requests in parallel doesn't work too well today :) Can the team consider re-prioritizing this feature request with this use case in mind? In the meantime I will experiment with batching the requests as our workaround.
cc @ruflin @ph
Thanks for raising your use case for this feature.
I think we could reconsider whether we want to do that, and what the implications would be if we decide to go with it.
Or we could see if there is a different solution to the problem.
Pinging @elastic/es-security to get more feedback on the technical side of the proposal.
We discussed this with the team, and we have a few questions about the end-to-end flow that you expect to see in Fleet.
Given the designs in https://github.com/elastic/kibana/issues/72712, I guess that you expect users to request any possible bulk combination, not just ones related to policy or creator. So the existing selectors for bulk invalidation in the invalidate API don't solve the problem.
Is that correct?
Another question is how you get the list of IDs for the selected agents. Is it something that is attached to the agent information? Do you need to fetch them from Elasticsearch? It could be useful to check if there is some optimization for the end-to-end flow rather than just the invalidate action.
@bytebilly I think it could be possible in some cases to optimize the unenrollment. Say we want to revoke all Agents linked to a specific configuration: maybe we could attach that information as metadata and have the possibility to _delete by query_, or we could use the "realm_name" for that.
We have to assume that, in theory, we could do a partial unenroll of Agents from a configuration.
I will let @jen-huang comment on how we get the list of IDs.
Hi, you can look at this code branch for the flow: forceUnenrollAgents()
A few key points:
- API keys are created by the `fleet_enroll` user (a system user we create) using the `default_native` realm. Thus, it's not possible to invalidate by user or realm, because that would invalidate other keys that the user didn't choose.
Since I last commented, I was able to improve the performance here by batching invalidate requests by ID to 500 keys at a time. This worked well in my local testing and was sufficient for our upper limit of 10,000 agents/20,000 keys.
Could you please share what your improvements are? How did you implement batching for invalidate by ID? Are you still doing individual invalidate requests for each key, or do you access the security index directly? Thanks!
The improvements still do individual requests for each key through @elastic/kibana-security's authc.invalidateAPIKey interface. The batching simply sends up to 500 invalidate requests to ES in parallel, instead of 20,000.
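The batching described above can be sketched roughly as follows. This is an illustrative sketch only: `invalidateAPIKey` is a stand-in for the real Kibana security plugin call, and the batch size of 500 matches the number mentioned in the comments above.

```typescript
// Stand-in for Kibana's authc.invalidateAPIKey interface (hypothetical stub);
// each call corresponds to one DELETE /_security/api_key request.
async function invalidateAPIKey(id: string): Promise<void> {
  // ...issue the invalidation request for a single key ID
}

// Split a list of key IDs into batches of at most `size` items.
function chunk<T>(items: T[], size: number): T[][] {
  const batches: T[][] = [];
  for (let i = 0; i < items.length; i += size) {
    batches.push(items.slice(i, i + size));
  }
  return batches;
}

// Invalidate keys in parallel batches of up to 500 requests at a time,
// instead of firing all 20,000 requests at once.
async function invalidateInBatches(keyIds: string[], batchSize = 500): Promise<void> {
  for (const batch of chunk(keyIds, batchSize)) {
    await Promise.all(batch.map((id) => invalidateAPIKey(id)));
  }
}
```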
I'd like to get agreement on a small design question for this issue. The payload of the current API has a field `id` that expects a single API key ID to be provided. To support multiple IDs, I can see the following three options:

1. A new `ids` field that takes a list of IDs, e.g. `{ "ids": ["abcd", "efgh"] }`.
2. The `id` field accepts comma-separated IDs so it can work with either a single ID or multiple ones, e.g. `{ "id": "abcd,efgh" }`.
3. The `id` field accepts either a string or a list, e.g. both of these are acceptable: `{ "id": "abcd" }` and `{ "id": ["abcd","efgh"] }`.

I prefer option 1 because:

- `id` in both options 2 & 3 is slightly confusing when it can take multiple IDs.

With the above said, option 1 also has its own overhead:

- What do we do with `id` and `ids` coexisting? It's not urgent, but I don't think we want to keep both of them forever. If we want to deprecate `id`, it will incur deprecation and BWC cost.

Some common overhead for changing request/response entities:
@ywangd I also prefer option 1, and deprecating `id`.
Should we do the same for the name parameter, i.e. deprecate it and have it replaced by a names one that is an array? I think so.
After team discussion, we agreed to overload the existing `id` field to take either a string or an array of strings (option 3 in the above comment). Introducing a new `ids` field has not only BWC implications, but also raises consistency questions about other fields, e.g. `name`, or the same field of `GetApiKeyRequest`. Addressing each of the consistency issues would also need to take care of BWC. In short, it could incur a lot more work while the sheer value is low. Overall, reusing the existing `id` is a more practical solution. Also, I was mistaken that we don't have existing code for a field that takes either a string or an array of strings: `AbstractObjectParser#declareStringArray` by default accepts both a single string and an array of strings, so it is an existing pattern.
I'm concerned with the suggestion here, because what we're ending up with is a field that can take on two different forms. That's additional mental overhead, and it complicates the specification and high-level clients. It's also confusing that id can be multi-valued (because it's not plural). I think we should take an iterative approach towards what we want our APIs to look like. One suggestion that I can offer here would be to keep the existing id field as-is, add an ids field that is an array of IDs of API keys to remove, and also enforce that only one of id or ids is in the request. This gives us the option to later remove the id field if we feel that is the right thing to do.
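The mutual-exclusion rule suggested above could look roughly like this. This is an illustrative sketch in TypeScript, not the actual Elasticsearch (Java) implementation; the type and function names are hypothetical, while the field names match the proposal.

```typescript
// Hypothetical request shape for the invalidate API key API,
// with the existing `id` field kept as-is and a new `ids` field added.
interface InvalidateApiKeyRequest {
  id?: string;
  ids?: string[];
  name?: string;
}

// Enforce that at most one of `id` or `ids` is present in a request,
// per the suggestion above; returns a list of validation errors.
function validateIdFields(req: InvalidateApiKeyRequest): string[] {
  const errors: string[] = [];
  if (req.id !== undefined && req.ids !== undefined) {
    errors.push("only one of [id] or [ids] can be specified");
  }
  if (req.ids !== undefined && req.ids.length === 0) {
    errors.push("[ids] must not be empty");
  }
  return errors;
}
```

This keeps existing single-ID clients working unchanged while giving bulk callers an unambiguous, plural field, and leaves the option open to deprecate `id` later.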
Thanks for chiming in @jasontedor. I shared the same concerns about reusing the existing `id` field in my previous comment, but the BWC complexities of adding a new `ids` field got me thinking otherwise. Honestly, if we do not consider effort, adding a new `ids` field and removing the existing `id` field is the way to go, because it is clean and unambiguous. So if we are aiming for the long term, we should go with adding a new `ids` field.
I like the idea of an iterative approach. To me it means we can give the new `ids` field to Fleet to use without having to pay the BWC cost immediately. We will have to pay the price, but it is amortized (before version 8).
@jen-huang A new `ids` field has been added to the invalidate API key API (#63224). Could you please update your side to leverage this new field for bulk invalidation? For each API key invalidation call, there is now an internal call to clear the key out of the cache as well, so it would be great if these operations could be batched, i.e. using the new `ids` field for bulk invalidation, which incurs only a single cache-clearing call. Thanks!
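For reference, a bulk invalidation request using the new `ids` field would look something like this (the key IDs are illustrative):

```
DELETE /_security/api_key
{
  "ids" : [ "key1", "key2", "key3" ]
}
```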