Readthedocs.org: Upgrade elastic search to 7.x

Created on 22 Apr 2019  ·  20 Comments  ·  Source: readthedocs/readthedocs.org


We are using django-elasticsearch-dsl and, unfortunately, it is not actively maintained (the last commit was on 8 Nov 2018).

What should we do in this case? I can see three options.

  • Wait for the update of django-elasticsearch-dsl.
  • Find another library.
  • Switch to using only the official low-level library (elasticsearch-py), which is kept up to date, but this involves a lot of work.

Edit: django-elasticsearch-dsl is updated. :tada:

Just a note.
My elasticsearch version is:

$ curl localhost:9200
{
  "name" : "j9iyXmN",
  "cluster_name" : "elasticsearch",
  "cluster_uuid" : "b4kGzEhFSoiZufVXVlERfg",
  "version" : {
    "number" : "6.7.1",
    "build_flavor" : "default",
    "build_type" : "deb",
    "build_hash" : "2f32220",
    "build_date" : "2019-04-02T15:59:27.961366Z",
    "build_snapshot" : false,
    "lucene_version" : "7.7.0",
    "minimum_wire_compatibility_version" : "5.6.0",
    "minimum_index_compatibility_version" : "5.0.0"
  },
  "tagline" : "You Know, for Search"
}

And all the tests pass.

To execute the Elasticsearch tests, you need to pass an extra option to tox:
tox -r -e py36 --including-search

@stsewd
Yes... I am aware of that.
All tests are passing including the search tests. :smile:

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.

@dojutsu-user is there anything actionable on this issue now? I'm not sure, but I think it's not possible to upgrade yet. In that case, we should document why it's not possible and what the blocking problems are so we can track them, and propose a plan, or close it instead of keeping it open without adding value.

@humitos
I don't think the upgrade should pose any problems.
During the whole GSoC period, I have been using Elasticsearch 6.7:

$ curl localhost:9200
{
  "name" : "j9iyXmN",
  "cluster_name" : "elasticsearch",
  "cluster_uuid" : "b4kGzEhFSoiZufVXVlERfg",
  "version" : {
    "number" : "6.7.2",
    "build_flavor" : "default",
    "build_type" : "deb",
    "build_hash" : "56c6e48",
    "build_date" : "2019-04-29T09:05:50.290371Z",
    "build_snapshot" : false,
    "lucene_version" : "7.7.0",
    "minimum_wire_compatibility_version" : "5.6.0",
    "minimum_index_compatibility_version" : "5.0.0"
  },
  "tagline" : "You Know, for Search"
}

I just read the django-elasticsearch-dsl is going to have a new release pretty soon -- https://github.com/sabricot/django-elasticsearch-dsl/issues/177#issuecomment-509733539 (But not for ES version 7)
I think we have to wait a few days until django-elasticsearch-dsl starts supporting ES v7.

I am unblocking this as https://github.com/sabricot/django-elasticsearch-dsl/issues/170 is closed and django-elasticsearch-dsl now supports Elasticsearch 7 (https://pypi.org/project/django-elasticsearch-dsl/).

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.

We received an email from ES saying that we need to migrate to a recent version; since 6.x is EOL, this isn't low priority anymore.

Just checked: we are running v6.5.4 in production, so we need to update to 6.8.12 before updating to a major version.

Changelog for 6.6, 6.7 and 6.8

We are good to upgrade from 6.5 to 6.8. And we don't need a re-index or downtime.

Migration between minor versions — e.g. 6.x to 6.y — can be performed by upgrading one node at a time.

https://www.elastic.co/guide/en/elasticsearch/reference/6.8/breaking-changes.html

A rolling upgrade allows an Elasticsearch cluster to be upgraded one node at a time so upgrading does not interrupt service.

https://www.elastic.co/guide/en/elasticsearch/reference/6.8/rolling-upgrades.html
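The constraint behind the two-step upgrade path can be sketched as a small check. This is a hypothetical helper (not part of the codebase) that compares the cluster's current version against the target's `minimum_wire_compatibility_version`, the field shown in the `curl localhost:9200` output above:

```python
def parse_version(v):
    """Turn a version string like '6.8.12' into a comparable tuple (6, 8, 12)."""
    return tuple(int(part) for part in v.split("."))


def can_rolling_upgrade(current_version, target_min_wire_compat):
    """A rolling upgrade to the target is only possible when every node
    already runs at least the target's minimum wire-compatibility version."""
    return parse_version(current_version) >= parse_version(target_min_wire_compat)


# ES 7.x reports "minimum_wire_compatibility_version": "6.8.0", so a 6.5.4
# cluster must first do a rolling upgrade to 6.8.x, and only then to 7.x.
print(can_rolling_upgrade("6.5.4", "6.8.0"))   # False -> must go through 6.8 first
print(can_rolling_upgrade("6.8.12", "6.8.0"))  # True
```

The same check explains why 6.5 to 6.8 needs no re-index or downtime: both sides are wire-compatible, so nodes can be swapped one at a time.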

How to deploy avoiding downtime

Before the deploy

This can be done a day or two before the deploy

  • Create new deploy in ES cloud with ES 7.x
  • Change the ops repo to point to the new ES host
  • Deploy web-extra or a new instance with the code from ES 7.x
  • Trigger a re-index to the new deploy

During/after the deploy

  • Deploy the new instances using 7.x. Here we will have two instances running, one for 6.x and the other one with 7.x (but each one will be pointing to a different deploy in ES cloud)
  • When only the 7.x instances are running, trigger a re-index.
    Here we only need to re-index the projects with new builds from the last 24/48 hours,
    we can use the script from https://github.com/readthedocs/readthedocs.org/issues/5620#issuecomment-716760493
  • Make sure everything is working
  • Delete the old deploy

This won't cause downtime, but it will serve outdated results for a period of time
(while we deploy the new instances and re-index).
We could communicate this to users beforehand if we want.

@stsewd on the re-index during "deploy", we should only need to reindex the past 1 day of data, right? So that should be pretty quick. I think this plan sounds good to me. The full reindex might take somewhere around 8-10 hours though, so we should plan ahead for that.

on the re-index during "deploy", we should only need to reindex the past 1 day of data, right?

Yes, I'll see if I can change the management command to accept that argument or just write a script

Pretty sure it already supports this, or we have some kind of code that can handle it already.

Yea, I have this in my notes:

from datetime import datetime, timedelta

from readthedocs.projects.models import HTMLFile, Project
from readthedocs.search.utils import index_new_files

# Only re-index projects with builds in the last 48 hours.
since = datetime.now() - timedelta(hours=48)

ps = Project.objects.filter(versions__builds__date__gte=since).distinct()
print("Indexing %s projects" % ps.count())
for project_obj in ps:
  for version_obj in project_obj.versions.filter(active=True, built=True):
    index_new_files(HTMLFile, version_obj, build=version_obj.builds.latest().pk)

Something similar should work.

Great, I have updated my comment with that.

Great -- the only other thing we should consider is what QA will look like on the new vs old cluster. We've had issues in the past with reindexing, so it would be good to have 5-10 queries that we want to test to make sure the results look similar. In particular, the number of results for broad searches, and also the range of versions.

Some of this is because we don't do a great job of cleaning up our indexes. So the current index certainly has some invalid/old/deleted data, but we also need to make sure we aren't missing important things.
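For the QA pass above, one way to compare the old and new clusters is to run the same 5-10 queries against both and flag any whose result counts diverge beyond a tolerance. A minimal sketch (the query names and counts below are made-up placeholders; in practice they would come from the two search endpoints):

```python
def diverging_queries(old_counts, new_counts, tolerance=0.2):
    """Return the queries whose new-index hit count differs from the
    old-index count by more than `tolerance` (relative difference)."""
    flagged = []
    for query, old in old_counts.items():
        new = new_counts.get(query, 0)
        baseline = max(old, 1)  # avoid division by zero for empty queries
        if abs(new - old) / baseline > tolerance:
            flagged.append(query)
    return flagged


# Placeholder counts for a few broad test queries on each cluster.
old = {"django": 1200, "install": 950, "api token": 40}
new = {"django": 1180, "install": 400, "api token": 41}
print(diverging_queries(old, new))  # ['install'] -> worth investigating
```

Queries flagged this way would be the ones to inspect manually, along with the range of versions they return.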
