Elasticsearch: Enforce index-name rules for ILM policy names

Created on 26 Oct 2018 · 14 Comments · Source: elastic/elasticsearch

Index names have restrictions because of the inconvenience of dealing with things like really long index names, commas, or "_"-prefixed names.

ILM policies should adhere to the same naming conventions.

:CorFeatureILM+SLM blocker

All 14 comments

Pinging @elastic/es-core-infra

I am curious what the motivation for this is? We are not making the same restriction in CCR with the names of auto-follow patterns (I'm not saying ILM is wrong, maybe CCR should be following the lead (pun!) here).

@jasontedor I think it stems from trying to cut off potential issues before they happen: really long names (causing URL issues for browsers), commas, and _-prefixed names are some of them. I think it makes sense to enforce the same consistent rules for naming "things" (index names, IDs, etc.) so that they do not differ between different parts of the software.

A few comments:

  • we have inconsistent rules for IDs all over the system

    • e.g., alias names can contain uppercase letters, index names can not

    • e.g., datafeed IDs are not subject to the same rules as index names

    • e.g., document IDs can contain spaces, index names can not

    • e.g., repository names

    • ...

  • rules for index names are to a certain extent a legacy of the fact that the name of the index used to be the name of the directory on disk
  • we have received negative feedback on the limitations that we apply on index names today

The goal of having consistent names is noble, but this isn't giving us that. It's tying us to a naming scheme with quite some legacy behind it. It doesn't make the system any more predictable.

My feeling is unless something horrible is going to happen without a specific restriction, then the restrictions we enforce here will appear arbitrary. If we feel strongly that we should apply a restriction, then it should be one that is easy to understand. Index naming is borderline incomprehensible, because of its legacy in filesystem limitations.

> we have received negative feedback on the limitations that we apply on index names today

My anecdotal experience is that most of the negative feedback is about _changing_ the limitations.
If we launched features with clear, documented and consistent naming limitations from the start we would get less push back. For example, it would be better to reject unreasonably long names _now_ than to have to introduce a breaking change in the future when we realise they cause problems somewhere.

> If we launched features with clear, documented and consistent naming limitations from the start we would get less push back.

I almost completely agree with this. What is missing for me is the word "simple", and index naming limitations are not simple. As I said:

> If we feel strongly that we should apply a restriction, then it should be one that is easy to understand.

To be clear, I am okay with a limitation, but index naming limitations are not the right one for me.

> To be clear, I am okay with a limitation, but index naming limitations are not the right one for me.

That's fine with me too. I am in favor of a limitation for the reasons Tim mentioned as well, and since we had an existing one it seemed right to use it. We can discuss alternative limitations here if we want to go with that.

The original motivation for this change was based on limiting the length of the policy name and rejecting policy names starting with _ (since we tend to reserve _X in API endpoints to prevent conflicts with other APIs). Maybe we should enforce only these rules?

Another point that was raised was around users creating policies with potentially problematic characters (e.g. a policy named 😀). We could further restrict the names to ASCII characters, but I'm not sure if that is going too far at this stage?

I'm for a narrower set of restrictions - the index name restrictions do include rather a lot that we don't need.

As a first cut, how does this sound:

  1. Length < 255 bytes (or another reasonable value)
  2. Cannot start with _
  3. Cannot contain , (as we allow comma-separated names in e.g. GET _ilm/one,two - currently you can PUT, but not GET a policy named contains,comma as far as I can tell, although other special characters work fine)

Anything else we should restrict? Other comments?
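
The comma problem in item 3 can be illustrated with a small sketch. This is not how Elasticsearch routes requests; it only assumes that a path segment such as `one,two` is split on commas before lookup. The function names and the in-memory `stored_policies` map are hypothetical.

```python
# Hypothetical in-memory store; the key "contains,comma" is a single policy name.
stored_policies = {"one": {}, "two": {}, "contains,comma": {}}


def split_policy_names(path_segment: str) -> list:
    """Split a comma-separated path segment into individual names."""
    return [name for name in path_segment.split(",") if name]


def lookup(path_segment: str) -> dict:
    """Report, for each requested name, whether a stored policy matches it."""
    return {
        name: name in stored_policies
        for name in split_policy_names(path_segment)
    }


print(lookup("one,two"))          # both names resolve
print(lookup("contains,comma"))   # split into "contains" and "comma"; neither exists
```

Under that assumption, a name containing a comma can be stored but never addressed as a whole, which matches the PUT-but-not-GET behaviour described above.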

I'm happy to go with the above restrictions

Are we okay with spaces in the policy name? (we currently allow it, just want to make sure we're fine with that)

I haven't seen any issues in some (fairly minimal) manual testing with spaces in policy names. The only reason I can see for disallowing spaces is consistency, which could very well be a valid argument.

Since there have been no objections to the limitations I posted above, I've updated https://github.com/elastic/elasticsearch/pull/35104 to apply those restrictions, and also edited the title and description to reflect that it no longer enforces index-name restrictions.

Does anyone have any more input on whether we should add "no spaces" to the list of limitations? I don't believe there are any (current) technical reasons, but we typically do not allow spaces in identifiers, and consistency may well be a good enough reason to disallow them. I could go either way.

Let us disallow spaces too; they are not worth the trouble. We do not need to worry about, for example, policy names coming from an external system that might allow spaces (as is the case for, say, document IDs).
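
Taken together, the restrictions discussed in this thread (length, leading underscore, comma, and spaces) could be checked roughly as follows. This is a minimal sketch, not the code from the pull request: the function name, the error messages, and the choice to measure length in UTF-8 bytes are assumptions, and the 255-byte limit is the tentative value proposed above.

```python
# Assumed limit; the thread proposes "< 255 bytes (or another reasonable value)".
MAX_POLICY_NAME_BYTES = 255


def validate_policy_name(name: str) -> list:
    """Return a list of problems with the name; an empty list means it passes."""
    problems = []
    # Assumption: length is measured in UTF-8 bytes and must be strictly below the limit.
    if len(name.encode("utf-8")) >= MAX_POLICY_NAME_BYTES:
        problems.append("name must be shorter than 255 bytes")
    if name.startswith("_"):
        problems.append("name must not start with '_'")
    if "," in name:
        problems.append("name must not contain ','")
    if " " in name:
        problems.append("name must not contain spaces")
    return problems


for candidate in ["my-policy", "_internal", "one,two", "has space", "x" * 255]:
    print(repr(candidate[:20]), validate_policy_name(candidate))
```

Note that this sketch deliberately does not restrict names to ASCII, since that question was raised in the thread but not settled.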
