Index names have restrictions due to inconveniences to dealing with things like really long index names, or commas, or "_" prefixed names.
ILM Policies should adhere to the same naming conventions
Pinging @elastic/es-core-infra
I am curious what the motivation for this is? We are not making the same restriction in CCR with the names of auto-follow patterns (I'm not saying ILM is wrong, maybe CCR should be following the lead (pun!) here).
@jasontedor I think it stems from trying to cut off any potential issues before they happen, really long names (causing URL issues for browsers), commas and _ prefixed names are some of the ones. I think it makes sense to try and enforce the rules for naming "things" (index names, ids, etc) with the same consistent rules so that they are different in different parts of the software.
A few comments:
The goal of having consistent names is noble, but this isn't giving us that. It's tying us to a naming scheme with quite some legacy behind it. It doesn't make the system any more predictable.
My feeling is unless something horrible is going to happen without a specific restriction, then the restrictions we enforce here will appear arbitrary. If we feel strongly that we should apply a restriction, then it should be one that is easy to understand. Index naming is borderline incomprehensible, because of its legacy in filesystem limitations.
we have received negative feedback on the limitations that we apply on index names today
My anecdotal experience is that most of the negative feedback is about _changing_ the limitations.
If we launched features with clear, documented and consistent naming limitations from the start we would get less push back. e.g. It would be better to reject unreasonably long names _now_ than have to introducing a breaking change in the future when we realise they cause problems somewhere.
If we launched features with clear, documented and consistent naming limitations from the start we would get less push back.
I almost completely agree with this. What is missing for me is simple, and index naming limitations are not. As I said:
If we feel strongly that we should apply a restriction, then it should be one that is easy to understand.
To be clear, I am okay with a limitation, but index naming limitations are not the right one for me.
To be clear, I am okay with a limitation, but index naming limitations are not the right one for me.
That's fine with me too. I am in favor of a limitation for the reasons Tim mentioned as well, and since we had an existing one it seemed right to use it. We can discuss alternative limitations here if we want to go with that.
The original motivation for this change was based on limiting the length of the policy name and rejecting policy names starting with _ (since we tend to reserve _X in API endpoints to prevent conflicts with other APIs. Maybe we should just enforce these rules only?
Another point that was raised was around users creating policies with potential problematic characters (e.g. a policy named 馃榾. We could further restrict the names to ascii characters but I'm not sure if that is too far at this stage?
I'm for a narrower set of restrictions - the index name restrictions do include rather a lot that we don't need.
As a first cut, how does this sound:
_, (as we allow comma-separated names in e.g. GET _ilm/one,two - currently you can PUT, but not GET a policy named contains,comma as far as I can tell, although other special characters work fine)Anything else we should restrict? Other comments?
I'm happy to go with the above restrictions
Are we okay with spaces in the policy name? (we currently allow it, just want to make sure we're fine with that)
I haven't seen any issues in some (fairly minimal) manual testing with spaces in policy names. The only reason I can see for disallowing spaces is consistency, which could very well be a valid argument.
Since there have been no objections to the limitations I posted above, I've updated https://github.com/elastic/elasticsearch/pull/35104 to apply those restrictions, and also edited the title and description to reflect that it no longer enforces index-name restrictions.
Does anyone have any more input on whether we should add "no spaces" to the list of limitations? I don't believe there are any (current) technical reasons, but we typically do not allow spaces in identifiers, and consistency may well be a good enough reason to disallow them. I could go either way.
Let us remove spaces too, they are not worth the trouble. We do not need to worry about, for example, policy names coming from an external system that might be allowing spaces (like, say, document IDs).
Most helpful comment
I almost completely agree with this. What is missing for me is simple, and index naming limitations are not. As I said:
To be clear, I am okay with a limitation, but index naming limitations are not the right one for me.