Right now, when a reindex finishes, its status is gone if the user has set wait_for_completion=false or the HTTP request has timed out. We should store the status on successful completion. We should also store it periodically in case the request dies or the node is kill -9ed or something.
In all probability this will actually be a general feature of task management rather than specific to reindex. Reindex is the first task that isn't always ephemeral (like search) and doesn't create an artifact from which you could read the status (like snapshot and restore). It'd be lovely to have something unified rather than reindex specific. We just have reindex and, partially, snapshot/restore to work from when designing it.
One way to implement this is to write the status to an index both during task execution (to survive kill -9) and after it is completed (so it is available after the task is complete). A couple of (mostly stream of consciousness) points about that potential implementation:
running or done or canceled or something.
Could an alternative be storing the last X reindex jobs in the cluster state, and have this be a rotating buffer? Keeping full history seems like something a user could do themselves, and maintaining an entire index for this seems very heavyweight.
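To make the rotating-buffer idea concrete, here is a minimal sketch in Python. It is not Elasticsearch code: `CompletedTask`, `RecentTasks`, the task ids, and the field names are all hypothetical, and a plain in-process `deque` stands in for whatever section of the cluster state would hold the last X completed tasks.

```python
from collections import deque
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class CompletedTask:
    # Hypothetical record shape; the thread does not define one.
    task_id: str
    action: str
    status: str  # e.g. "done" or "canceled"


class RecentTasks:
    """Fixed-size rotating buffer of the most recently completed tasks."""

    def __init__(self, capacity: int) -> None:
        # A deque with maxlen silently drops the oldest entry
        # once a new one is appended to a full buffer.
        self._buffer = deque(maxlen=capacity)

    def record(self, task: CompletedTask) -> None:
        self._buffer.append(task)

    def find(self, task_id: str) -> Optional[CompletedTask]:
        for task in self._buffer:
            if task.task_id == task_id:
                return task
        return None

    def __len__(self) -> int:
        return len(self._buffer)


# Keep only the last 2 completed tasks; the third record evicts the first.
recent = RecentTasks(capacity=2)
recent.record(CompletedTask("node1:1", "reindex", "done"))
recent.record(CompletedTask("node1:2", "reindex", "canceled"))
recent.record(CompletedTask("node1:3", "reindex", "done"))
```

The trade-off raised in the thread shows up directly here: the buffer stays small and bounded, but anything older than the last X tasks is simply gone.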
That's certainly an alternative. My understanding is that keeping this in the cluster state is more heavyweight because it has to be pushed to all the nodes. The index only needs to be on a couple of nodes.
The index only needs to be on a couple of nodes.
Then you get into all kinds of craziness with how many replicas do you have, and other index settings like allocation awareness. This should be small in the cluster state, and with cluster state diffs it's a minor change.
And tasks are stored in the cluster state right? So really there will be an update anyways when the task completes, so moving it to a parallel section for recently completed tasks would happen in the same cluster state update.
I don’t think we should start storing everything in the cluster state. Last I heard (correct me if I’m wrong) - the plan is to have progress status as a general feature in the task API. We don’t know how many that will be.
The cluster state should store anything that looks like configuration that needs to be available on all nodes and quickly retrievable (like scripts). Doing that with an index is complicated. Everything else should go into an index, which is much simpler and has other advantages (like free search and aggregations). It can be a single-shard, single-replica index. Not a big one.
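For illustration only, the index-based approach from the original description (write status periodically while the task runs, then once more on completion) might look roughly like the Python sketch below. Everything in it is an assumption: `TaskStatusStore` is an in-memory dict standing in for a small status index, and `run_with_checkpoints`, `checkpoint_every`, and the document fields are made-up names, not an Elasticsearch API.

```python
import time
from typing import Callable, Dict, Iterable


class TaskStatusStore:
    """In-memory stand-in for a small status index, keyed by task id."""

    def __init__(self) -> None:
        self.docs: Dict[str, dict] = {}

    def put(self, task_id: str, state: str, processed: int) -> None:
        # Overwrite the single status document for this task.
        self.docs[task_id] = {
            "state": state,
            "processed": processed,
            "updated_at": time.time(),
        }


def run_with_checkpoints(
    task_id: str,
    items: Iterable,
    process: Callable,
    store: TaskStatusStore,
    checkpoint_every: int = 100,
) -> int:
    """Process items, persisting status periodically and on completion."""
    processed = 0
    store.put(task_id, "running", processed)
    for item in items:
        process(item)
        processed += 1
        if processed % checkpoint_every == 0:
            # Periodic write: this is what survives if the task dies abruptly.
            store.put(task_id, "running", processed)
    # Final write: makes the status available after the task is complete.
    store.put(task_id, "done", processed)
    return processed
```

If the process dies partway through, the last periodic write is what a later status lookup would see, so the checkpoint interval bounds how stale that status can be.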
On 09 Mar 2016, at 22:46, Ryan Ernst wrote:
The index only needs to be on a couple of nodes.
Then you get into all kinds of craziness with how many replicas do you have, and other index settings like allocation awareness. This should be small in the cluster state, and with cluster state diffs it's a minor change.
@nik9000 can you comment on where we stand on this one? Thanks.
We should store the status on successful completion.
Done.
We should also store it periodically in case the request dies or the node is kill -9ed or something.
Not done and I don't think we have any plans to do it.