There has been a case where a user set xpack.security.transport.filter.deny: _all and could then no longer perform any operations on their cluster. This is problematic because they had no access to the host either. Is this something we should protect users against?
Pinging @elastic/es-security (:Security/Network)
It would be nice to have an elasticsearch setting to disable this API completely, for cases where it does not make any sense.
For example, in our Elasticsearch Service everything goes through the proxy, but the proxy IP is unknown to the user. Setting a deny-all rule and allowing only their own public IP would lock out the proxy and thus all further requests.
> It would be nice to have an elasticsearch setting to disable this API completely, for cases where it does not make any sense
Node based settings to disable cluster settings have tricky semantics. If you restore a cluster from a snapshot you end up with cluster settings that can't be changed, etc.
In some other cases where we've looked at doing this, we've decided that the edge cases just make it impossible.
I suspect it will be fine here. I think we want something as simple as xpack.security.ip.filter.enabled: false, but we'll need to think through various scenarios to work out where the edge cases are and whether they're a problem.
> xpack.security.transport.filter.deny: _all
It sounds like we want to make it impossible to dynamically set a transport filter that would exclude _all_ the current nodes in the cluster. (Or maybe _any_ of the current nodes, or just the master, or just the node on which the request is received; there are options.)
Once you get that into cluster state, it's hard to recover from. That said, a node setting to turn off IP filtering (see comment above) would provide a workaround if you did break your cluster.
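To make the "would this filter exclude the current nodes" idea concrete, here is a minimal sketch of such a check. It is not how Elasticsearch validates settings: the function names are hypothetical, only `_all`, literal IPs, and CIDR ranges are handled (hostname and wildcard rules are ignored), and it assumes the documented behaviour that allow rules take precedence over deny rules.

```python
import ipaddress


def _matches(rule, ip):
    """Return True if one filter rule covers the address.

    Simplification: only `_all`, literal IPs, and CIDR ranges are supported.
    """
    if rule == "_all":
        return True
    # strict=False lets a bare address such as "10.0.0.5" act as a /32 network
    return ip in ipaddress.ip_network(rule, strict=False)


def is_allowed(ip_text, allow, deny):
    """Apply allow/deny lists to one address; allow wins over deny."""
    ip = ipaddress.ip_address(ip_text)
    if any(_matches(rule, ip) for rule in allow):
        return True
    return not any(_matches(rule, ip) for rule in deny)


def locked_out_nodes(node_ips, allow, deny):
    """Hypothetical helper: list the node addresses the new rules would reject."""
    return [ip for ip in node_ips if not is_allowed(ip, allow, deny)]


nodes = ["10.0.0.1", "10.0.0.2"]
print(locked_out_nodes(nodes, [], ["_all"]))               # ['10.0.0.1', '10.0.0.2']
print(locked_out_nodes(nodes, ["10.0.0.0/24"], ["_all"]))  # []
print(locked_out_nodes(nodes, ["10.0.0.1"], ["_all"]))     # ['10.0.0.2']
```

A validator built on this could reject the update when the returned list contains every node, any node, or just the master, depending on which of the options above is chosen. It still would not know about addresses that are invisible to the cluster, such as a proxy in front of it.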
If we try and solve this with "rules about what can be excluded", then we probably also want some rule about the HTTP filter too. If you can lock out all HTTP requests, then your cluster might work, but you can't use it. For that reason a "disable all configured IP filters" node level setting is probably the most useful option, because I'm not sure that there's a 100% reliable test we can apply when validating a dynamic cluster setting.
> then we probably also want some rule about the HTTP filter too.
Good catch, didn't think about that as I only encountered it with transport, but in the Elasticsearch Service example the proxy connects via HTTP most of the time and could also be easily locked out.
> For that reason a "disable all configured IP filters" node level setting is probably the most useful option, because I'm not sure that there's a 100% reliable test we can apply when validating a dynamic cluster setting.
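Which, of course, already exists (I think we all forgot).
https://www.elastic.co/guide/en/elasticsearch/reference/7.6/ip-filtering.html#_disabling_ip_filtering
So, I would propose that the best fix right now is to explicitly turn off IP filtering in cases (like cloud) where it is not useful.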
The last example at https://www.elastic.co/guide/en/elasticsearch/reference/7.6/ip-filtering.html#dynamic-ip-filtering suggests that the xpack.security.*.filter.enabled settings are dynamic, so even if we disable filtering via elasticsearch.yml, users could still re-enable it and lock themselves out on Elasticsearch Service.
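As a rough illustration of why the yml-level switch is not enough, the sketch below builds the kind of request body a user could send to PUT /_cluster/settings to turn filtering back on. The helper name is hypothetical, and the sketch only constructs the JSON; it does not talk to a cluster.

```python
import json


def filter_toggle_body(enabled):
    """Hypothetical helper: JSON body for PUT /_cluster/settings that flips
    the transport and HTTP IP filter switches."""
    body = {
        "persistent": {
            "xpack.security.transport.filter.enabled": enabled,
            "xpack.security.http.filter.enabled": enabled,
        }
    }
    return json.dumps(body, indent=2)


# A user re-enabling filtering dynamically, regardless of elasticsearch.yml
print(filter_toggle_body(True))
```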
Thanks @jakommo, I should have looked more closely.
It looks like we'll also need a single, non-dynamic xpack.security.ip.filter.enabled setting that controls all filtering.
Assigned to @ywangd for (future) consideration in operator privileges project.