Elasticsearch: Implementation tracking for 7.0 types deprecation.

Created on 2 Nov 2018 · 12 Comments · Source: elastic/elasticsearch

Tracks the details of the 'In 7.0' part of this comment: https://github.com/elastic/elasticsearch/issues/15613#issuecomment-239435920

Plan for 7.0

  • For requests with a type in the URL or as a leaf field, we will accept both typed + typeless versions of the API. We'll emit a deprecation warning to tell users they need to move to the typeless endpoints before 8.0. Responses will still contain a _type field, but we will return the dummy name _doc regardless of the underlying type name.
  • For APIs whose request/response structure changes with the deprecation (create index, get mapping, etc.), we'll have a request parameter include_type_name that should be set to false to omit types in requests + responses. It will default to true in 6.7 with a warning that it needs to be explicitly specified (to either true or false), default to false in 7.0 with a warning to stop specifying it, and finally be removed in 8.0.
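
As a rough illustration of the include_type_name lifecycle described above, here is a minimal Python sketch. It is not Elasticsearch code: the function name, the (major, minor) version tuple, and the warning strings are all hypothetical, and rejecting the parameter in 8.0 (and on versions before 6.7) is an assumption about what "removed" would look like.

```python
from typing import List, Optional, Tuple


def resolve_include_type_name(
    version: Tuple[int, int], param: Optional[bool]
) -> Tuple[bool, List[str]]:
    """Return (effective value, deprecation warnings) for one request.

    `version` is a hypothetical (major, minor) tuple for the node.
    `param` is the include_type_name value from the request, or None
    when the caller did not specify it.
    """
    major, _minor = version
    notes: List[str] = []

    if major >= 8:
        # Plan: the parameter is removed in 8.0. Raising here is an assumption.
        if param is not None:
            raise ValueError("include_type_name is not supported in 8.0+")
        return False, notes

    if major == 7:
        # Plan: defaults to false, with a warning to stop specifying it.
        if param is not None:
            notes.append("stop specifying include_type_name")
            return param, notes
        return False, notes

    if version >= (6, 7):
        # Plan: defaults to true, with a warning to specify it explicitly.
        if param is None:
            notes.append("specify include_type_name explicitly")
            return True, notes
        return param, notes

    # This sketch only models 6.7 and later (assumption).
    raise ValueError("this sketch does not model versions before 6.7")
```

For example, an unspecified parameter resolves to true with a warning on 6.7, and to false with no warning on 7.0.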

More information can be found here: https://www.elastic.co/guide/en/elasticsearch/reference/current/removal-of-types.html

To-do List

Phase One: Add Typeless APIs. These items are critical for 6.7/7.0, and should be in before feature freeze.

  • [x] Remove the include_type_name parameter from the bulk, delete, get, index, update, and search APIs. @jtibshirani #35192
  • [x] Make sure include_type_name is added to APIs that we missed, and has the right default values on both 6.7 and 7.0. @jtibshirani #37210 #37285
  • [x] Add the ability to ignore type-related warnings in REST tests. @jtibshirani #35395
  • [x] Emit deprecation warnings when typed APIs are used. As part of this work, we should deprecate the appropriate methods in the Java HLRC. For some APIs, we may also need to create a duplicate REST test without types. The following APIs should be updated:

    • [x] search @jpountz #29468

    • [x] count, msearch @jtibshirani #35421

    • [x] explain @jtibshirani #35611

    • [x] search_template, msearch_template @jtibshirani #35669

    • [x] termvectors, mtermvectors @jtibshirani #36182

    • [x] get, exists, mget @jtibshirani #35930

    • [x] delete @jtibshirani #36087

    • [x] bulk @markharwood #36549

    • [x] index (note that the Java HLRC deprecations may be tricky) @mayya-sharipova #36575

    • [x] update @jtibshirani #36181

    • [x] get_source, exists_source @cbuescher #36426

    • [x] document _create @jtibshirani #36863

    • [x] reindex @cbuescher #36823

    • [x] delete_by_query, update_by_query @mayya-sharipova #36365

    • [x] indices.validate_query @jtibshirani #35575

    • [x] indices.create @jtibshirani #37134

    • [x] indices.put_mapping @jtibshirani #37280

    • [x] indices.get @cbuescher #37149

    • [x] indices.get_mapping @jtibshirani #37796

    • [x] indices.get_field_mapping @mayya-sharipova #37667

    • [x] indices.get_template, indices.put_template @markharwood #37484

    • [x] indices.rollover @mayya-sharipova #38039

  • [x] Ensure that if an index's type is named something other than _doc, then typeless API calls still work. Currently in this situation, using certain typeless APIs will produce an error due to a mismatched type name. @jpountz #35790
  • [x] Allow for typeless 'lookup' queries (MoreLikeThis, ids query, terms query, GeoShapeQuery) @mayya-sharipova @jtibshirani #37016
  • [x] Follow-up fixes from the above work.

    • [x] Backport the relevant parts of #35790 (allow typeless APIs when an index has a custom type) to 6.7. @jpountz #37147

    • [x] Typeless index call can fail against an index with a custom type (#36811). @jtibshirani #37451

  • [x] Deprecate types in watches (search templates and index actions). @jakelandis #37594

Phase Two: Important Clean-up. These tasks should be in by 6.7/7.0, but can go in after feature freeze.

  • [x] Update removal_of_types.asciidoc to reflect the new plan. @jtibshirani #38003
  • [x] Switch to typeless index creation in REST tests. @colings86 #37611 #38058
  • [x] Switch to typeless index creation (remove include_type_name=true) in documentation. @cbuescher #37568 #37601 #37646
  • [x] Solidify the approach to allowing typeless requests on an index with a custom type (#37450).

Phase Three: Additional Deprecations. These are good to have by 6.7/7.0, but could be pushed into 7.1 if strictly necessary.

  • [x] Deprecate references to _type in the search request body.

    • [x] uses of _type as a field name (in match queries, etc.) @mayya-sharipova #36503 #36802

    • [x] aggregations @jtibshirani #37131

    • [x] retrieving fields. Loading _type as a field doesn't actually work, and doesn't make much sense because each search hit already contains a _type component.

  • [x] Emit a deprecation warning when a search template containing a type is triggered. Note this is different from emitting a warning when a typed template is originally added. Already covered by other search-related PRs.
  • [x] Emit a deprecation warning when an index template containing a type is triggered. Note this is different from emitting a warning when a typed template is originally added. We support making typeless calls against an index with a custom type, so it is not really harmful to have a template with a custom type (and in fact several internal templates around monitoring use typed templates).
  • [x] Deprecate references to _type in scripts. @jdconrad #37491 #37554
  • [x] Deprecate types in simulate pipeline requests (#37731). @gwbrown #37949
  • [x] Deprecate types in graph explore requests. (#40466)

Items we're still following-up on

  • [x] For responses that contain the type as a leaf field, should we always return _doc regardless of the underlying type even when the old typed APIs are used?
  • [x] Types may be present in saved search requests, including search templates and watches. We should think through the upgrade plan here. https://github.com/elastic/elasticsearch/issues/35190#issuecomment-439121696
  • [x] When an index template is stored, the mappings are nested under the type name. We also need to consider how these will be accessed and upgraded.
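
To make the last item concrete: in a typed template, the mapping body sits one level down, under the type name. The following Python sketch shows what removing that level could look like. The helper name is hypothetical, it is not how Elasticsearch itself upgrades templates, and it assumes the input is already known to be typed with at most one type.

```python
from typing import Any, Dict


def unwrap_typed_mappings(mappings: Dict[str, Any]) -> Dict[str, Any]:
    """Strip the single type-name level from a typed 'mappings' object.

    Assumes the caller already knows the mappings are typed. An empty
    object is returned unchanged because it has no type level to remove.
    """
    if not mappings:
        return {}
    if len(mappings) != 1:
        # More than one type cannot be collapsed into one typeless mapping.
        raise ValueError("expected exactly one type name at the top level")
    (body,) = mappings.values()
    if not isinstance(body, dict):
        raise ValueError("the value under the type name must be an object")
    return body
```

With a stored mapping of {"doc": {"properties": {...}}}, the helper returns just the inner {"properties": {...}} object.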

All 12 comments

Pinging @elastic/es-search-aggs

CC @elastic/es-clients

One last thing I want to raise which might be super contentious:

Right now the default for include_type_name when not specified is true in 7.0. There are two downsides with this:

  • New installations will get deprecation warnings OOTB unless they specify include_type_name=false, which they then need to remove in 8.0.
  • Someone who wants to go through all the type-related changes in one major version cannot do so.

Can we remove some of the confusion of include_type_name defaulting to true by simply having it default to false in 7.0 already?

  • New installations will return the desired responses by default.
  • Someone who wants to go through all the changes in one major version can do so.

Upgrades will have to account for the type removal whether the include_type_name defaults to true or false.

Are we actually doing our users a disservice by holding on to a deprecation period?

Types may be present in saved search requests, including search templates and watches. We should think through the upgrade plan here.

The proposal is that in 7.0 we will introduce deprecation warnings when saved queries with types are executed, but not do anything to proactively detect and upgrade them automatically. Some of the easier type removals would be in the structured metadata held around a saved search, whereas other changes may be buried in clauses embedded in the saved search's choice of query DSL.

@Mpdreamz I'm thinking that this could be an issue for mixed version clusters that have 6.latest and 7.0 nodes as the response format would depend on the version of the node that you are querying?

New installations will get deprecation warnings OOTB unless
@Mpdreamz I'm thinking that this could be an issue for mixed version clusters that have 6.latest and 7.0 nodes as the response format would depend on the version of the node that you are querying?

This is typically dealt with in the following fashion:
1) Add a flag to enable new functionality in the old version, while defaulting to disabled. Requests without that flag will emit deprecation warnings.
2) New version only supports the flag in "enabled mode" and will omit deprecation logs when the flag is used.

The suggestion above follows this pattern, which is good. The only difference with what we do normally is the duration of deprecation. Normally it will be scoped down to something like OLD_MAJOR.latest. With the above suggestion it's NEW_MAJOR.x, which is a long time. I wonder if we should look at the costs of backporting the changes made to 6.x so we can follow our standard path.

In the beginning this flag was supposed to exist on most APIs including some of the most used like the document APIs. This made me view the long deprecation period as a feature. Now that it's only about the mappings APIs, I could change my mind. One subtlety with backporting to 6.x is that it might still have 5.x indices that have multiple types, so include_type_name=false would not make sense on such indices and we'd need to respond appropriately (error?).

I looked into auto-detecting PUTs of typeless mappings/templates to see if we can avoid the need for passing an include_type_name flag in the URL. It looks like this will be possible as we can check if the root object has a value called "properties". Originally I thought that real estate agents with a doc type called "properties" might be an edge case which would cause ambiguity. Fortunately 6.5 has a parsing bug that prevents you from creating doc types called "properties" (try it and see).

Fortunately 6.5 has a parsing bug that prevents you from creating doc types called "properties"

Turns out there's more to auto-detecting no-types than checking for the presence of a top-level properties field. Legal mappings can have no properties but include other top-level attributes e.g. _source:{enabled:false} or dynamic_templates:{...}.
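
A rough Python sketch of the kind of heuristic discussed here. It is not what Elasticsearch does: the function name is hypothetical, and the key set only contains the three top-level attributes named in this thread, so it is deliberately incomplete.

```python
from typing import Any, Dict

# Top-level attributes mentioned in this thread. A real mapping can carry
# other root-level attributes, so this set is illustrative only.
TYPELESS_TOP_LEVEL_KEYS = {"properties", "_source", "dynamic_templates"}


def looks_typeless(mapping_body: Dict[str, Any]) -> bool:
    """Guess whether a mapping body omits the type-name level.

    Returns True when any top-level key is a known mapping attribute.
    A doc type that shares its name with one of those attributes would
    be misclassified, which is the ambiguity raised above.
    """
    return any(key in TYPELESS_TOP_LEVEL_KEYS for key in mapping_body)
```

Checking only for "properties" would miss a body such as {"_source": {"enabled": false}}, which is why the sketch looks at a set of keys. It still cannot tell a typed mapping for a type called "properties" apart from a typeless one.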

While it would have been nice to auto-detect typeless mappings and provide the 7.0 examples in reference docs without any types, this would not be consistent with the results of GET _mapping calls, which would return types by default (unless the include_type_name=false param is passed).

Does auto-detection still seem useful or shall we insist on documenting APIs with include_type_name=false params?

Are we actually doing our users a disservice by holding on to a deprecation period?

We had a few discussions internally, and decided to introduce include_type_name in 6.7, so we can default it to false in 7.0. I've updated the plan in the issue description accordingly.

When an index template is stored, the mappings are nested under the type name. We also need to consider how these will be accessed and upgraded.

I logged https://github.com/elastic/elasticsearch/issues/38637 to further address this point.

All items have been completed.
