Elasticsearch: _reindex from remote and its bugs

Created on 7 Dec 2016 · 8 Comments · Source: elastic/elasticsearch

I'm trying to mass-reindex from remote. As elasticdump is "unsupported", I was trying to use the so-called "reindex from remote" feature. It failed in the following aspects:

It looks like it's impossible to not specify a destination index. I need to transfer multiple indexes from remote. I can't specify them one by one. I need multi-index notation, like index: blah-*-2016-*. It looks like this isn't possible without some dirty scripting. It would be wonderful if all of this happened under the hood.

Next, I was trying to use the documented API.

GET _tasks/Af0W-dC3QQSlJ28uRru0fQ:8488
=>
{
  "completed": false,
  "task": {
    "node": "Af0W-dC3QQSlJ28uRru0fQ",
    "id": 8488,
    "type": "transport",
    "action": "indices:data/write/reindex",
    "status": {
      "total": 0,
      "updated": 0,
      "created": 0,
      "deleted": 0,
      "batches": 0,
      "version_conflicts": 0,
      "noops": 0,
      "retries": {
        "bulk": 0,
        "search": 0
      },
      "throttled_millis": 0,
      "requests_per_second": -1,
      "throttled_until_millis": 0
    },
    "description": "",
    "start_time_in_millis": 1481112063773,
    "running_time_in_nanos": 650222891445,
    "cancellable": true
  }
}

GET .tasks/task/8488
=>
{
  "_index": ".tasks",
  "_type": "task",
  "_id": "8488",
  "found": false
}

It should've been found as per docs.

GET /_tasks/taskId:8488
=>
{
  "error": {
    "root_cause": [
      {
        "type": "resource_not_found_exception",
        "reason": "task [taskId:8488] isn't running or stored its results"
      }
    ],
    "type": "resource_not_found_exception",
    "reason": "task [taskId:8488] isn't running or stored its results"
  },
  "status": 404
}

It's also unclear what's going on: was the task stalled, hung, connecting? Why is it taking so long without any movement? I'm on v5.0.2. Thanks.

:DistributeCRUD discuss

celesteking

All 8 comments

Also, the docs are far from usable:

  "source": {
    "index": "metricbeat-*"
  },
  "dest": {
    "index": "metricbeat"
  },
  "script": {
    "lang": "painless",
    "inline": "ctx._index = 'metricbeat-' + (ctx._index.substring('metricbeat-'.length(), ctx._index.length())) + '-1'"
  }

You're assigning the index to itself, basically. This won't work. Maybe you meant ctx._source._index? This is really hard for a newbie, guys.

update: ctx._index = ctx._source._index didn't work. Nothing works. This is bullshit.

celesteking on 7 Dec 2016

👍1

> You're assigning the index to itself, basically. This won't work. Maybe you meant ctx._source._index? This is really hard for a newbie, guys.

Did you try it? I just did and it works like a charm. It basically does what you're asking below:

> It looks like it's impossible to not specify a destination index

Yes, it's possible with what you call "dirty scripting". Sorry if you don't like scripts but that's the way it works.

> update: ctx._index = ctx._source._index didn't work. Nothing works. This is bullshit.

Yes it doesn't work if you invent new syntax, what works is what's documented. I'll ignore the last part of your comment since I understand your frustration but please note that being aggressive doesn't solve anything ;)

Regarding the hang, I can reproduce if the source node is not responding. I tested with the source node down and in that case the reindex blocks and there is no way to access the hanging task in the destination node.
For this reason I'll leave this issue open but I am sure that @nik9000 has a solution for this.

jimczi on 7 Dec 2016
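
For readers following the script discussion: the Painless expression quoted from the docs reads the current value of ctx._index (the source index name) and assigns a new name built from it. A minimal Python sketch of the same string logic, for illustration only (`renamed_index` is a hypothetical helper, not part of Elasticsearch):

```python
PREFIX = "metricbeat-"


def renamed_index(source_index: str) -> str:
    """Mirror the quoted Painless expression in plain Python.

    Painless: 'metricbeat-' + ctx._index.substring('metricbeat-'.length(), ctx._index.length()) + '-1'
    """
    # Everything after the 'metricbeat-' prefix, e.g. the date part.
    remainder = source_index[len(PREFIX):]
    return PREFIX + remainder + "-1"


if __name__ == "__main__":
    for name in ["metricbeat-2016.12.06", "metricbeat-2016.12.07"]:
        print(name, "->", renamed_index(name))
```

Under this reading, each source index gets its own destination name, and the "metricbeat" value in dest.index is only the default that the script overrides.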

.tasks / _tasks (why 2 of them?) API is unusable, reread my comment -- it just doesn't work, even for tasks that completed fine.

Regarding the main problem: yes, I tried specifying the script exactly as per the docs. It doesn't work; it's trying to log to "metricbeat", not to "metricbeat-2016-12-07". I can provide access to our cluster so that you can try it.

celesteking on 7 Dec 2016

👍1

@celesteking to echo @jimczi's comment - this isn't your first issue where you get really aggressive. Seriously, instead of telling us how shit it all is, just point out the problems you're having and we'll try to help you. The aggression just makes us want to look at the other 1,000 open issues instead of yours.

> .tasks / _tasks (why 2 of them?)

.tasks is the index where the info is stored, and the docs are trying to point out that you'll need to delete data from this index at some stage in the future so that it doesn't use too much space. Nowhere does it tell you to do GET .tasks/task/ID

_tasks is the API and it should be GET _tasks/Af0W-dC3QQSlJ28uRru0fQ:8488 as you used in the first example, not GET /_tasks/taskId:8488. I can see how these docs could be confusing and will improve that.

clintongormley on 7 Dec 2016

👎1
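
To illustrate the task ID format described above: the ID passed to the _tasks API combines the node ID and the per-node task number, both of which appear in the task status output at the top of this issue. A small sketch of building the lookup path from those two fields (the helper name is hypothetical):

```python
def task_lookup_path(task: dict) -> str:
    """Build the `_tasks/<node>:<id>` path from a task object's `node` and `id` fields."""
    return f"_tasks/{task['node']}:{task['id']}"


if __name__ == "__main__":
    # Fields copied from the status output at the top of this issue.
    task = {"node": "Af0W-dC3QQSlJ28uRru0fQ", "id": 8488}
    print("GET " + task_lookup_path(task))
```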

According to REST guidelines, you should've used DELETE _tasks/task/$id and problem solved. This is not the first time I've seen inconsistencies in the API.

I will stop logging issues from now on (or helping in any way) and will probably switch over to another tool for log storage. This thing is not production ready. Period.

celesteking on 7 Dec 2016

🎉5

> Regarding the hang, I can reproduce if the source node is not responding. I tested with the source node down and in that case the reindex blocks and there is no way to access the hanging task in the destination node.
> For this reason I'll leave this issue open but I am sure that @nik9000 has a solution for this.

@jimczi, do you know if the reindex was running on the source node? I can imagine a situation where you start a reindex, and then shoot the node that the reindex was running on before it finishes. The task get action notices that the node is no longer running and looks in the tasks index. If it doesn't find the task then it reports that error message. And it won't find it because the reindex didn't complete before the node left. I think I need to add a better error message to that.

As to the question of multi-source, multi-destination: this comes up a fair bit, but I think the script solution is fine. The reason you might not want to do this at all is that you probably want to manage the process of creating each of the sub-indexes so you have progress and an easy way to pick up where you left off and things like that. If you do want to do it you can use the script. It is tested on every build, so it is going to work.

nik9000 on 8 Dec 2016
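
One way to read the suggestion above about managing each sub-index separately is to submit one reindex request per index. The sketch below only builds the request bodies and sends nothing. The remote host URL and index names are placeholders, and the `source.remote.host` layout is assumed from the reindex-from-remote documentation for 5.x, so verify it against your version.

```python
import json


def per_index_reindex_bodies(index_names, remote_host):
    """Return one reindex request body per index, keeping the same index name at the destination."""
    return [
        {
            "source": {"remote": {"host": remote_host}, "index": name},
            "dest": {"index": name},
        }
        for name in index_names
    ]


if __name__ == "__main__":
    # Placeholder host and index names, for illustration only.
    bodies = per_index_reindex_bodies(
        ["blah-a-2016-12", "blah-b-2016-12"], "http://oldcluster.example:9200"
    )
    for body in bodies:
        # Each body would be sent as its own POST _reindex request.
        print(json.dumps(body))
```

Submitting the bodies one at a time gives a separate task, and therefore separate progress, for each index.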

@jimczi and I talked - trying to reindex from remote from a node that refuses the connection indeed hangs. The reindex process is still in the tasks API and can be found with curl 'localhost:9200/_tasks?pretty&detailed&actions=*reindex'. I'll have a look at that this afternoon.

nik9000 on 8 Dec 2016
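
The curl command above can also be assembled programmatically. A minimal sketch that only builds the URL (the function name and default base URL are assumptions; it spells out `pretty=true` and `detailed=true` explicitly where the curl example uses the bare flags):

```python
from urllib.parse import urlencode


def reindex_tasks_url(base_url: str = "http://localhost:9200") -> str:
    """Build the URL for listing running reindex tasks with detailed status."""
    params = {"pretty": "true", "detailed": "true", "actions": "*reindex"}
    # safe="*" keeps the wildcard readable instead of percent-encoding it.
    return f"{base_url}/_tasks?{urlencode(params, safe='*')}"


if __name__ == "__main__":
    print(reindex_tasks_url())
```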

> @jimczi, do you know if the reindex was running on the source node? I can imagine a situation where you start a reindex, and then shoot the node that the reindex was running on before it finishes. The task get action notices that the node is no longer running and looks in the tasks index. If it doesn't find the task then it reports that error message. And it won't find it because the reindex didn't complete before the node left. I think I need to add a better error message to that.

I merged #22062 just now to improve the error message when the node isn't part of the cluster any more.

> @jimczi and I talked - trying to reindex from remote from a node that refuses the connection indeed hangs. The reindex process is still in the tasks API and can be found with curl 'localhost:9200/_tasks?pretty&detailed&actions=*reindex'. I'll have a look at that this afternoon.

I merged #22061 this morning to fix the hang.

nik9000 on 9 Dec 2016
