Elasticsearch: Double backslash in Painless script strings causes a syntax error

Created on 29 Dec 2016 · 6 Comments · Source: elastic/elasticsearch

Elasticsearch version: 5.0.2

Plugins installed: X-Pack

JVM version: 1.8.0_111-b15

OS version: CentOS 7.2.1511

Description of the problem including expected versus actual behavior:
It seems that using a double backslash in Painless script string values causes a syntax error. I need to compare against values that contain double backslashes. Is there a workaround for this, am I doing something wrong, or is it a bug?

NOTE: the original error message below contains double backslashes in '...['\A]...' and '...{ '\Archived' }...', but it gets parsed by GitHub into a single backslash.

Steps to reproduce:

GET myindex/_search
{
  "size": 50,
  "script_fields": {
    "new_field": {
      "script": {
        "lang": "painless",
        "inline": "if (doc['finalstatus.keyword'].value == '\\Archived') { 'Hard Archived' } else { doc['otherfield.keyword'] }"
      }
    }
  }
}

Provide logs (if relevant):

"failed_shards": [
      {
        "shard": 0,
        "index": "myindex",
        "node": "1LwGYO3RQB6tOYKB9Mgo_A",
        "reason": {
          "type": "script_exception",
          "reason": "compile error",
          "caused_by": {
            "type": "illegal_argument_exception",
            "reason": "unexpected character ['\\A].",
            "caused_by": {
              "type": "lexer_no_viable_alt_exception",
              "reason": "lexer_no_viable_alt_exception: null"
            }
          },
          "script_stack": [
            "... ectDownloadCompleted') { '\\Archived' } else { doc[ ...",
            "                             ^---- HERE"
          ]
...

All 6 comments

I believe this is a JSON encoding issue. Because the script and the error response are both encoded in JSON, the \\ becomes \ on the way in and the \ in the error message becomes \\ on the way out. Can you try something like \\\\ to get a literal \ in a string? JSON parsing would transform that to \\, and the Painless lexer should then interpret that as a \.

Still, I think it would be worth adding an example of this to the Painless docs, since we expect lots of folks to use inline scripts. It would probably also help to fix up the error message for illegal \ sequences so that it mentions JSON encoding.
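
To make the JSON layer described above concrete, here is a minimal Python sketch. It only demonstrates what a standard JSON decoder hands on to the script compiler; it does not run Painless, and the shortened script text (`x == ...`) and the variable names are made up for illustration. Raw strings are used so that Python does not add a third layer of escaping.

```python
import json

# Request fragments exactly as they would be typed into the REST body.
# The raw-string prefix (r) keeps Python from interpreting the backslashes.
two_backslashes = r'''{"inline": "x == '\\Archived'"}'''
four_backslashes = r'''{"inline": "x == '\\\\Archived'"}'''

# JSON decoding halves each run of backslashes before the script is compiled.
script_from_two = json.loads(two_backslashes)["inline"]
script_from_four = json.loads(four_backslashes)["inline"]

print(script_from_two)   # x == '\Archived'   (\A is what the lexer rejects)
print(script_from_four)  # x == '\\Archived'  (a valid escaped backslash)

# The same layer applies in reverse to the error response: the JSON text
# shows \\A, but the decoded message refers to \A.
error_json = r'''{"reason": "unexpected character ['\\A]."}'''
decoded_reason = json.loads(error_json)["reason"]
print(decoded_reason)    # unexpected character ['\A].
```

This matches the error reported above: after JSON decoding, the script source contains `'\Archived'`, and `\A` is not a recognized escape sequence.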

I just tried this locally and \\\\ seems to work. This two-layers-of-escaping thing is giving me nasty flashbacks to writing shell scripts....

Hi, thanks for picking this up!

I've tested \\\\ and it works (runs without errors), but the comparison doc['Status.keyword'].value == '\\\\Archived' doesn't work as planned, i.e. it does not match the values '\\Archived'.

I had a very similar problem in Logstash and actually had to modify a filter plugin's code to make it accept and parse similar values: https://github.com/logstash-plugins/logstash-filter-elasticsearch/issues/55

Let me play with it some more....

Looks like a bug, yeah. I reproduced the problem you are having over REST and now in a unit test. I suspect you can work around it for now with '\\\\Archived'.substring(1). I'll see about opening up a PR to fix it.
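
One way to avoid hand-counting backslashes when sending the suggested workaround is to write the Painless source with a single layer of escaping and let a JSON serializer add the second layer. The sketch below assumes the request body is built in Python; the field names are taken from the original report, and whether the workaround actually matches the stored values on 5.0.2 is not verified here.

```python
import json

# Painless source as the script compiler should see it (one escaping layer).
# Raw strings keep Python from interpreting the backslashes.
painless_source = (
    r"if (doc['finalstatus.keyword'].value == '\\Archived'.substring(1)) "
    r"{ 'Hard Archived' } else { doc['otherfield.keyword'] }"
)

body = {
    "size": 50,
    "script_fields": {
        "new_field": {"script": {"lang": "painless", "inline": painless_source}}
    },
}

# json.dumps doubles every backslash, producing the \\\\ form for the REST body.
request_json = json.dumps(body, indent=2)
print(request_json)
```

The printed body contains `'\\\\Archived'.substring(1)`, which is the form given in the comment above, and decoding it again returns the original Painless source unchanged.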

Thanks for confirming this and taking time to check!
