Elasticsearch: Aggregating on _field_names meta field not supported in ES 5.x

Created on 12 Jan 2017 · 8 Comments · Source: elastic/elasticsearch

Elasticsearch version:
5.x

Description of the problem including expected versus actual behavior:
In Elasticsearch 2.4.x you could aggregate on the _field_names field, as documented under "Aggregating on the _field_names field" here:
https://www.elastic.co/guide/en/elasticsearch/reference/2.4/mapping-field-names-field.html#CO176-2

I notice that this has been removed in the 5.x documentation. Is it no longer supported?

An example query:

{
  size: 0,
  aggregations: {
    schema: {
      terms: {
        field: '_field_names',
        size: 0
      }
    }
  }
}

Is there another way you can achieve this kind of query in ES 5?

Here is the error message from ES 5.0.0:

{
  "error" : {
    "root_cause" : [
      {
        "type" : "illegal_argument_exception",
        "reason" : "Fielddata is not supported on field [_field_names] of type [_field_names]"
      }
    ],
    "type" : "search_phase_execution_exception",
    "reason" : "all shards failed",
    "phase" : "query",
    "grouped" : true,
    "failed_shards" : [
      {
        "shard" : 0,
        "index" : "trogdor",
        "node" : "omdSogIxQreKVpWOVFb38Q",
        "reason" : {
          "type" : "illegal_argument_exception",
          "reason" : "Fielddata is not supported on field [_field_names] of type [_field_names]"
        }
      }
    ],
    "caused_by" : {
      "type" : "illegal_argument_exception",
      "reason" : "Fielddata is not supported on field [_field_names] of type [_field_names]"
    }
  },
  "status" : 400
}

stevewillard

😕3

Most helpful comment

This would actually be a nice feature. The use case is that I'd like to know all of the possible field names for the results of a query/filter, so that I can provide the user with a list of possible fields to narrow down their search by.

robert-blankenship on 4 Jun 2018

👍5

All 8 comments

Hi @stevewillard

No. The _field_names field has been locked down and is now only indexed. It supports neither fielddata (which is memory intensive) nor doc values (which would require writing more data to disk for a feature almost nobody would use).

To get counts of docs which have a particular field, you can run an exists query on the fields you're interested in.

clintongormley on 12 Jan 2017

😕6
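
For readers with many fields to check, the exists-query suggestion above can be batched into one request by wrapping one exists query per field in a filters aggregation. The following is only a sketch of the request body, not something posted in the thread: the function name, the aggregation name `field_presence`, and the example field names are assumptions, and the list of fields still has to be known up front.

```python
import json


def build_field_presence_request(field_names):
    """Build a search body that counts, per field, the documents containing it."""
    if not field_names:
        raise ValueError("field_names must not be empty")
    return {
        "size": 0,  # only the aggregation buckets are needed, not the hits
        "aggs": {
            "field_presence": {  # hypothetical aggregation name
                "filters": {
                    # one named bucket per field, each backed by an exists query
                    "filters": {
                        name: {"exists": {"field": name}} for name in field_names
                    }
                }
            }
        },
    }


if __name__ == "__main__":
    # "title" and "user.name" are made-up field names for illustration
    print(json.dumps(build_field_presence_request(["title", "user.name"]), indent=2))
```

Sent to the `_search` endpoint, each bucket under `aggregations.field_presence.buckets` should carry a `doc_count` for its field. Adding a `query` to the body would restrict the counts to a subset of documents.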

Hi @clintongormley,

I was looking for a similar query to get all the mapping fields an index holds. As we index documents whose structure we don't control (customers' data), we don't know in advance which fields we are going to need.

Is there any way to get all the fields of an index in a single query?

What I currently use is this:
http://{ES_IP_ADDRESS}:9200/{INDEX_NAME}/{DOCUMENT_TYPE}/_mapping/field/*?ignore_unavailable=false&allow_no_indices=false&include_defaults=true

But it seems to return only some of the results.

redlus on 19 Sep 2017

👍4
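
One possible way to list every mapped field is to fetch the whole mapping (`GET /{INDEX_NAME}/_mapping`) and flatten it on the client. The sketch below assumes you have already extracted the `properties` object from that response (where it sits depends on the Elasticsearch version and document type); the function name and the sample mapping are made up for illustration. Note that this lists fields known to the mapping, not which documents actually contain them.

```python
def flatten_mapping(properties, prefix=""):
    """Return the dotted paths of all fields in a mapping `properties` dict."""
    paths = []
    for name, definition in sorted(properties.items()):
        path = prefix + name
        children = definition.get("properties")
        if children:
            # object or nested field: descend into its own properties
            paths.extend(flatten_mapping(children, path + "."))
        else:
            paths.append(path)
        # multi-fields (for example a keyword sub-field) live under "fields"
        for sub_field in sorted(definition.get("fields", {})):
            paths.append(path + "." + sub_field)
    return paths


if __name__ == "__main__":
    # hypothetical mapping fragment, not taken from the thread
    sample = {
        "title": {"type": "text", "fields": {"raw": {"type": "keyword"}}},
        "user": {
            "properties": {
                "name": {"type": "keyword"},
                "age": {"type": "integer"},
            }
        },
    }
    print(flatten_mapping(sample))
```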

Is there a way to tell ES that we would actually like fielddata enabled for some of these special fields? For example, we were using _version in a function_score to help boost by item popularity (much like this discussion question), but can no longer do that in ES 5.x.

dan-blanchard on 8 Feb 2018

The "locking down" is a serious and depressing regression.

In v2, we were able to aggregate on _field_names to produce histograms of incidence of use in our corpus for various fields. We have too many fields to practically submit queries for each field _individually_.

The fact that something is "seldom" is not an argument for removing an existing capability which is harmless when unexercised. :(

ximm on 12 Jun 2018

> The fact that something is "seldom" is not an argument for removing an existing capability which is harmless when unexercised. :(

It's not harmless when it adds 10% overhead to indexing rate.

clintongormley on 12 Jun 2018

Touché. That is unfortunate and not a price worth paying.

Meta-data like this is useful in our situation because we have indices based on an 'open' (user-extensible) schema, which need to be in a common index, and we have many thousands of fields. It is very useful to be able to survey those fields to tune successive versions of the index, e.g. to decide which fields merit their own mappings and which are consigned to a catch-all field with a fixed mapping.

But we will script it. :/

ximm on 12 Jun 2018

It would still be nice to have this as an optional setting that can be enabled or disabled, or as a plugin. I am trying to find all applicable field names for a subset of documents. For documents we insert, that's no problem, but certain documents are upserted. For those we would have to either do the upsert entirely by script, or do a document upsert followed by a script that updates our calculated field names, which isn't ideal for speed.
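
For the plain insert path described above, the "calculated field names" idea might look roughly like the sketch below. It stores the document's own field paths in an ordinary field so that a terms aggregation can run on it. Everything named here is an assumption: the field name `doc_field_names`, the helper names, and the premise that the field is mapped as `keyword`. It does not address the upsert case.

```python
def field_paths(value, prefix=""):
    """Yield the dotted path of every non-null leaf value in a document."""
    if isinstance(value, dict):
        for key, child in value.items():
            yield from field_paths(child, f"{prefix}.{key}" if prefix else key)
    elif isinstance(value, list):
        for item in value:
            yield from field_paths(item, prefix)
    elif value is not None and prefix:
        yield prefix


def with_field_names(doc, target="doc_field_names"):
    """Return a copy of doc with its field paths stored under `target`."""
    enriched = dict(doc)
    # `target` is a hypothetical field name; it should be mapped as keyword
    enriched[target] = sorted(set(field_paths(doc)))
    return enriched


def field_names_terms_request(target="doc_field_names", size=1000):
    """Build a terms aggregation over the stored field names."""
    return {"size": 0, "aggs": {"schema": {"terms": {"field": target, "size": size}}}}


if __name__ == "__main__":
    # made-up document for illustration
    doc = {"title": "x", "user": {"name": "a", "tags": ["p", "q"]}, "missing": None}
    print(with_field_names(doc))
    print(field_names_terms_request())
```

Because the stored field is a regular one, the aggregation can be combined with any query to get the field names for just the matching documents.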