Elasticsearch: Causes of XContentParseException should be included in search for root_cause

Created on 30 Apr 2018 · 7 Comments · Source: elastic/elasticsearch

Following #29373, parse exceptions are now XContentParseException rather than ParsingException. Because of the way root_cause is defined, this has a major effect on which exception is reported as the root_cause when an error is found during parsing.

The root_cause is determined by starting at the outermost exception and recursing through the causes until an exception is found that is _not_ an ElasticsearchException. Then:

  • If the outermost exception was not an ElasticsearchException then it is the root_cause
  • Otherwise the most deeply nested ElasticsearchException is the root_cause

ParsingException extends ElasticsearchException whereas XContentParseException does not.

This means that if a validation error occurs during parsing then following #29373 the most general parsing exception is reported as the root cause. Prior to #29373 the most specific parsing exception, i.e. most likely the one detailing the validation error, was considered the root_cause.
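The rule described above can be sketched in a few lines of Java. This is only an illustration of the rule as stated in this issue, not the actual Elasticsearch code: the nested classes `EsException`, `ParsingEx`, `StatusEx` and `XContentParseEx` are hypothetical, simplified stand-ins for the real exception types, and the sketch assumes the status exception is an ElasticsearchException (which the "before" behaviour implies).

```java
// Illustrative sketch only: the nested classes are simplified, hypothetical
// stand-ins for the real Elasticsearch exception types.
final class RootCauseSketch {

    /** Stand-in for ElasticsearchException. */
    static class EsException extends RuntimeException {
        EsException(String message, Throwable cause) {
            super(message, cause);
        }
    }

    /** Stand-in for ParsingException, which extends ElasticsearchException. */
    static class ParsingEx extends EsException {
        ParsingEx(String message, Throwable cause) {
            super(message, cause);
        }
    }

    /** Stand-in for a status exception (assumed to be an ElasticsearchException). */
    static class StatusEx extends EsException {
        StatusEx(String message) {
            super(message, null);
        }
    }

    /** Stand-in for XContentParseException, which is NOT an ElasticsearchException. */
    static class XContentParseEx extends RuntimeException {
        XContentParseEx(String message, Throwable cause) {
            super(message, cause);
        }
    }

    /** Applies the root_cause rule exactly as described in the text above. */
    static Throwable rootCause(Throwable outermost) {
        // An outermost exception that is not an EsException is its own root cause.
        if (!(outermost instanceof EsException)) {
            return outermost;
        }
        // Otherwise descend for as long as the next cause is still an EsException.
        Throwable deepest = outermost;
        while (deepest.getCause() instanceof EsException) {
            deepest = deepest.getCause();
        }
        return deepest;
    }

    public static void main(String[] args) {
        Throwable validation =
            new StatusEx("categorization_examples_limit cannot be less than 0. Value = -1");

        // Chain shape before #29373: every link is an EsException.
        Throwable before = new ParsingEx("[job_details] failed to parse field [analysis_limits]",
            new ParsingEx("Failed to build [analysis_limits] after last required field arrived",
                validation));

        // Chain shape after #29373: the outer links are no longer EsExceptions.
        Throwable after = new XContentParseEx("[job_details] failed to parse field [analysis_limits]",
            new XContentParseEx("Failed to build [analysis_limits] after last required field arrived",
                validation));

        System.out.println(rootCause(before).getMessage()); // the validation message
        System.out.println(rootCause(after).getMessage());  // the outermost parse message
    }
}
```

With the second chain the walk stops immediately, because the outermost exception is not an `EsException`, so the least specific message is the one reported.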

As a concrete example, this is a validation error after #29373:

{
    "error": {
        "root_cause": [{
            "type": "x_content_parse_exception",
            "reason": "[1:144] [job_details] failed to parse field [analysis_limits]"
        }],
        "type": "x_content_parse_exception",
        "reason": "[1:144] [job_details] failed to parse field [analysis_limits]",
        "caused_by": {
            "type": "x_content_parse_exception",
            "reason": "Failed to build [analysis_limits] after last required field arrived",
            "caused_by": {
                "type": "status_exception",
                "reason": "categorization_examples_limit cannot be less than 0. Value = -1"
            }
        }
    },
    "status": 400
}

And this is the equivalent error before #29373:

{
  "error": {
    "root_cause": [
      {
        "type": "status_exception",
        "reason": "categorization_examples_limit cannot be less than 0. Value = -1"
      }
    ],
    "type": "parsing_exception",
    "reason": "[job_details] failed to parse field [analysis_limits]",
    "line": 11,
    "col": 3,
    "caused_by": {
      "type": "parsing_exception",
      "reason": "Failed to build [analysis_limits] after last required field arrived",
      "caused_by": {
        "type": "status_exception",
        "reason": "categorization_examples_limit cannot be less than 0. Value = -1"
      }
    }
  },
  "status": 400
}

If all you have to go on is the root_cause, then "[1:144] [job_details] failed to parse field [analysis_limits]" is nowhere near as useful as "categorization_examples_limit cannot be less than 0. Value = -1". The red error bars that Kibana shows when an error occurs display only the root_cause, so Kibana users are directly affected by this.

I think the solution that will keep the root_cause functionality as it was previously would be to consider both ElasticsearchException and XContentParseException when recursing through causes.
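A minimal sketch of that suggestion, under the same caveats as any illustration here: the nested exception classes and the `isTraversable` helper are hypothetical names invented for this example, and this is not necessarily how the eventual fix is implemented in Elasticsearch.

```java
// Illustrative sketch only: simplified, hypothetical stand-ins for the real types.
final class ProposedRootCauseSketch {

    /** Stand-in for ElasticsearchException. */
    static class EsException extends RuntimeException {
        EsException(String message, Throwable cause) {
            super(message, cause);
        }
    }

    /** Stand-in for XContentParseException, which is NOT an ElasticsearchException. */
    static class XContentParseEx extends RuntimeException {
        XContentParseEx(String message, Throwable cause) {
            super(message, cause);
        }
    }

    /** Hypothetical helper: the exception types the search is allowed to walk through. */
    static boolean isTraversable(Throwable t) {
        return t instanceof EsException || t instanceof XContentParseEx;
    }

    /** Same walk as the current rule, but XContentParseEx no longer stops it. */
    static Throwable rootCause(Throwable outermost) {
        if (!isTraversable(outermost)) {
            return outermost;
        }
        Throwable deepest = outermost;
        while (isTraversable(deepest.getCause())) {
            deepest = deepest.getCause();
        }
        return deepest;
    }

    public static void main(String[] args) {
        Throwable validation =
            new EsException("categorization_examples_limit cannot be less than 0. Value = -1", null);
        Throwable chain = new XContentParseEx("[job_details] failed to parse field [analysis_limits]",
            new XContentParseEx("Failed to build [analysis_limits] after last required field arrived",
                validation));

        // The most specific exception is reported again.
        System.out.println(rootCause(chain).getMessage());
    }
}
```

Because `instanceof` is false for `null`, the loop also ends cleanly at the bottom of a chain that has no further cause.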

:Core/Infra/Core >bug v6.3.0

All 7 comments

Pinging @elastic/es-core-infra

This seems like a legit thing to me. @dakrone do you want this or should I have a look?

This is a regression with respect to user experience. It's the sort of issue that results in more support calls which, once raised, will be difficult to diagnose. What are the chances of a fix for 6.3?

@nik9000 Would you pick this one up please?

> This is a regression with respect to user experience. It's the sort of issue that results in more support calls which, once raised, will be difficult to diagnose. What are the chances of a fix for 6.3?

We will assess and let you know.

> @nik9000 Would you pick this one up please?

Sure. I'll start right now.

> This is a regression with respect to user experience. It's the sort of issue that results in more support calls which, once raised, will be difficult to diagnose. What are the chances of a fix for 6.3?
>
> We will assess and let you know.

We will get this fixed in 6.3.0.
