Elasticsearch: Add `response_transform_script` to allow xpath style selection of response elements

Created on 22 Aug 2014 · 8 Comments · Source: elastic/elasticsearch

Our JSON responses are currently hard-coded. If you need only one piece of information, you still have to retrieve and parse a fairly lengthy JSON document to get it. Although the browser is generally good at this, large responses create problems: the data has to travel over a potentially slow wire, only to be thrown away immediately. It would be really nice to get from the backend (Elasticsearch) only the data that is required.

This issue has cropped up in a few places, e.g. #2149, #7350 and #7330, each of which has slightly different requirements. Instead of supporting multiple options to turn certain parameters on or off, it makes sense to provide a single generic solution that is flexible enough to solve all of these problems.

The best solution that we have found is to use GPath:

GPath is a path expression language integrated into Groovy which allows parts of nested structured data to be identified. In this sense, it has similar aims and scope as XPath does for XML.
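
To make the idea of a path expression over nested JSON concrete, here is a rough sketch in Python rather than Groovy. The name `select_path` is hypothetical, and the function only imitates the way a GPath property access spreads across the elements of a list; it is not GPath and not anything Elasticsearch provides.

```python
def _step(node, key):
    """Follow one path segment; lists are traversed element by element."""
    if isinstance(node, list):
        return [_step(item, key) for item in node]
    if isinstance(node, dict):
        return node.get(key)
    return None  # scalars have no children


def select_path(data, path):
    """Resolve a dot-separated path such as 'hits.hits._source.name'."""
    current = data
    for key in path.split("."):
        current = _step(current, key)
    return current


# Made-up, trimmed-down search response used only for illustration.
example = {
    "hits": {
        "hits": [
            {"_id": "1", "_source": {"name": "Bruce"}},
            {"_id": "2", "_source": {"name": "Obie one"}},
        ]
    }
}

names = select_path(example, "hits.hits._source.name")
```

A missing key simply yields `None` here; how a real implementation should treat missing paths is left open.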

We will add a response_transform_script parameter on all APIs (in the body or in the query string) which provides the full JSON response as a GPath object (e.g. _response) which can be manipulated at will. For instance, to return just the _source fields from the hits array of a search request, you could do:

GET /_search
{
    "response_transform_script": {
        "script": "_json.hits.hits.collect { it._source }"
    }
}
>feature help wanted
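
Until something like this exists on the server, the effect of the script in the request above can be approximated on the client. The sketch below is plain Python over an already-parsed response; `collect_sources` is a hypothetical helper name and the sample documents are made up.

```python
def collect_sources(response):
    """Return the _source of every hit, mirroring
    `_json.hits.hits.collect { it._source }` from the proposal."""
    hits = response.get("hits", {}).get("hits", [])
    return [hit.get("_source") for hit in hits]


# Made-up search response, reduced to the fields that matter here.
search_response = {
    "took": 3,
    "hits": {
        "total": 2,
        "hits": [
            {"_index": "people", "_type": "person", "_id": "1",
             "_score": 1.0, "_source": {"name": "Bruce", "age": "70"}},
            {"_index": "people", "_type": "person", "_id": "2",
             "_score": 0.8, "_source": {"name": "Obie one", "age": "79"}},
        ],
    },
}

sources = collect_sources(search_response)
```

Note that this still pays the cost the issue complains about: the full response crosses the wire before it is trimmed.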

All 8 comments

In fact, it would be nice to support "path selection" in the same way as we do with source filtering.

Perhaps we should add a _response param which works like _source, and allows you to include/exclude paths (with wildcards). The script option could then be run on the response body as it is after response filtering (if any).
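
A sketch of how such include/exclude filtering could behave, written in Python purely for illustration. Everything here is an assumption: the function name `filter_response`, the dot-separated path syntax, the choice to treat arrays as transparent (as source filtering does), and the use of `fnmatch`, whose `*` also matches across dots.

```python
from fnmatch import fnmatchcase

_DROP = object()  # sentinel: this node was filtered out entirely


def _matches(path, patterns):
    """True if the path, or any ancestor of it, matches a pattern."""
    parts = path.split(".")
    prefixes = [".".join(parts[:i]) for i in range(1, len(parts) + 1)]
    return any(fnmatchcase(p, pat) for p in prefixes for pat in patterns)


def _filter(node, includes, excludes, path):
    if path and excludes and _matches(path, excludes):
        return _DROP
    if isinstance(node, dict):
        kept = {}
        for key, value in node.items():
            child_path = f"{path}.{key}" if path else key
            child = _filter(value, includes, excludes, child_path)
            if child is not _DROP:
                kept[key] = child
        return kept if kept else _DROP  # empty objects are dropped
    if isinstance(node, list):
        # Arrays do not add a path level; each element keeps the same path.
        items = [_filter(v, includes, excludes, path) for v in node]
        kept = [c for c in items if c is not _DROP]
        return kept if kept else _DROP
    # Leaf value: keep it unless an include list exists and does not match.
    if includes and not _matches(path, includes):
        return _DROP
    return node


def filter_response(response, includes=(), excludes=()):
    result = _filter(response, list(includes), list(excludes), "")
    return {} if result is _DROP else result


# Made-up response used only to exercise the sketch.
sample_response = {
    "took": 3,
    "hits": {
        "total": 1,
        "hits": [
            {"_index": "people", "_type": "person", "_id": "1",
             "_score": 1.0, "_source": {"name": "Bruce", "age": "70"}},
        ],
    },
}

only_sources = filter_response(sample_response, includes=["hits.hits._source"])
```

Excludes are applied before includes in this sketch, so a path that matches both is dropped; the comment above does not say which should win.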

@clintongormley

Has this feature for retrieving data been added?

With the removal of Groovy by default, I'm fine with something as simple as JSONPath.

@rashidkpc Hey, any progress? Filtering out metadata after retrieval hurts performance.

+1 on adding a feature that would allow me to remove the "_index", "_type", "_id" and "_score" fields from the output. The output is needlessly bloated by this sort of chaff.

Is this still under consideration?

If not, can someone share a way to remove the _index, _type, _id and _score attributes from the hits and get only the source, like this:

{
    "hits": [
        {
            "name": "Bruce",
            "age": "70"
        },
        {
            "name": "Obie one",
            "age": "79"
        }
    ]
}

@hirenhcm See the linked PR #10980
