Jetpack: Elastic Search Documentation and Usage Questions

Created on 2 Mar 2018 · 4 Comments · Source: Automattic/jetpack

Hi there,

I've reached out several times over the last week via live chat and email but have not received answers to my questions about Jetpack's Elastic Search feature (Professional Plan).

I'll list them here.

  1. Is there a way to test this in development mode or on a staging site? I understand that some of the ES happens on WordPress.com's end, but I cannot develop this on my client's production site, which receives a great deal of traffic.
  2. When ES is turned on, I notice a massive drop in site speed. Is there a way to debug (other than "turn off all plugins") this issue specifically with ES to figure out what's causing the site to suddenly slow down as soon as I turn on ES? When I turn it off, the site is fine. When I turn it on, the site gets extremely slow.
  3. On your documentation page, the filters for the JP search widget show checkboxes. This is not the case when I use the widget on a live site. The filters are wrapped in an unordered list and there is no visual indicator that a user has drilled down into the search like it's shown in your docs: https://jetpack.com/support/search/
  4. I'd like to test several search terms on the site and get back a log of the returned results to gauge how well ES is performing versus WordPress' normal search. How would I go about doing this?
  5. Filter quantities in the sidebar do not show up correctly. If I have added multiple filters to the sidebar and click one of them, the quantities in the filter parentheses are not accurate, often leading to "Oops, this page is not found" errors on the site.
  6. How is relevance calculated in ES and is there a way to modify the variables that influence relevance?
  7. There are currently three search options: Newest, Oldest, and Relevance. Does only Relevance use ES or do all of these options use ES?
  8. How do I know if site-wide content indexing is complete? One of the features of ES is "real-time indexing" and there's no indicator at all on the site that shows me content indexing has taken place. Also, does ES need to be turned on in order for content to be indexed via ES? I'd like to pre-index all content on the site before turning ES on.

I'm not able to share the site link on this issue due to confidentiality. How can I get the more specific issues resolved that do not involve live chat?

Thanks.

Search [Status] Needs Author Reply

All 4 comments

Hi Philip,

Excited to see you giving JP Search a try. Great questions, sorry you ran into some blockers with live chat. It is all pretty new. If my answers below help I'll try and turn some of this into FAQs for the docs.

Did you manage to find https://developer.wordpress.com/docs/elasticsearch/ also? If not, I should make it more obvious somehow.

On to your questions:

  1. There is no good solution other than a publicly accessible staging site that has a Pro Plan on it. Some JP devs use https://ngrok.com/ for local development, which can then sync to WP.com. I know this isn't the best setup. I've opened an issue to try to find a better solution; if you have any more ideas on what you would want, can you put them there: https://github.com/Automattic/jetpack/issues/8968

  2. Is the site slow only for searches, or everywhere? JP Search has support for https://wordpress.org/plugins/query-monitor/; can you install that and see if there is anything surprising? Depending on where your server is, there certainly can be a non-trivial delay in getting the results. In (very early) analysis we are seeing 95th percentile queries being faster, but the median query being slower (all due to the latency of going off-site). A lot depends, though, on how fast your MySQL server is. We have anecdotal reports of folks lowering their MySQL load. It is very hard to make a blanket statement here, but we should make sure you aren't running into a bug.

  3. Checkboxes: this was a change in Jetpack 5.8. Maybe you are running a previous version? Or it is some incompatibility with the theme. Checkbox output is in https://github.com/Automattic/jetpack/blob/master/modules/search/class.jetpack-search-template-tags.php#L123

  4. I did a fair bit of this for the .org plugins search by just querying the public API for the site. My suggestion would be to use the Query Monitor plugin to get the current query, and then run a script that calls the API repeatedly and outputs info about the posts returned. Here is the core part of the script I used: https://gist.github.com/gibrown/30e5d7dd43760f466ce792ae975e573a. I had a list of a few thousand queries that I would run, and then I dumped the results to CSV in order to examine them more easily in a spreadsheet. FYI, if you are looking at query relevancy, there is a bug we just found that will be fixed in 5.9: https://github.com/Automattic/jetpack/pull/8964

  5. Do you have an example? Ideally on a live site we can look at? This isn't a known problem.

  6. The query we use can be entirely replaced with filters, and you can see how we build the query here: https://github.com/Automattic/jetpack/issues/7975. We also want to add more filters for controlling parts of the queries (https://github.com/Automattic/jetpack/issues/7975), but want to make sure we are building the correct thing.

  7. All sort options use ES.

  8. Unfortunately, we don't currently have any indication of whether indexing is in progress. For individual posts it should happen in seconds. It may feel slower if you repeat the same query, because we cache all query responses on the API for 2 minutes. For bulk changes the indexing takes longer; how much longer depends a lot on how much content there is. See https://data.blog/2017/07/11/real-time-elasticsearch-indexing-on-wordpress-com/

All sites connected to Jetpack are indexed in Elasticsearch. We use the same index that is used for Related Posts, so as soon as Jetpack is connected we start syncing and indexing everything. Currently there is nothing in the index that is different for a Pro plan site, though I do expect that will change at some point (there are some features we want to build that we know we can't scale for all sites). When a site upgrades to Pro, a bulk reindex is also started.

There is a little more background on this in https://github.com/Automattic/jetpack/issues/8024
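
Since there is no indexing indicator, one way to check whether a particular post has reached the index is to search for it repeatedly. Because identical queries are cached for 2 minutes, each retry should wait longer than that. Here is a minimal Python sketch of that idea. It is not part of Jetpack: `wait_until_indexed`, `search_fn`, and the retry count are hypothetical names and values, and `search_fn` stands in for whatever client you use to run one search and return the matching post IDs.

```python
import time
from typing import Callable, Iterable

# From the thread: query responses are cached on the API for 2 minutes.
CACHE_TTL_SECONDS = 120


def wait_until_indexed(
    post_id: int,
    search_fn: Callable[[str], Iterable[int]],
    query: str,
    max_attempts: int = 5,
    sleep_fn: Callable[[float], None] = time.sleep,
) -> bool:
    """Return True once post_id appears in the results for query.

    search_fn is a hypothetical callable (an assumption, not a Jetpack API):
    it runs one search and returns the IDs of the posts that matched.
    """
    for attempt in range(max_attempts):
        if post_id in set(search_fn(query)):
            return True
        if attempt < max_attempts - 1:
            # Repeating the same query sooner would likely hit the cached
            # response, so wait slightly longer than the cache lifetime.
            sleep_fn(CACHE_TTL_SECONDS + 1)
    return False
```

Passing `sleep_fn` as a parameter keeps the function easy to try out with a stub instead of real two-minute waits.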

Cheers, hope this helps.
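
The suggestion above to run a list of queries through the API and dump the results to CSV could look roughly like the following. This is a Python sketch, not the code from the linked gist. The function names, the CSV columns, and the shape of a hit (a dict with `post_id` and `score` keys) are all assumptions for illustration; `search_fn` is a hypothetical callable that you would back with your own API client.

```python
import csv
import io
from typing import Callable, Iterable, List

FIELDS = ["query", "rank", "post_id", "score"]


def collect_results(
    queries: Iterable[str],
    search_fn: Callable[[str], Iterable[dict]],
    top_n: int = 10,
) -> List[dict]:
    """Run each query and return one row per returned post.

    Assumption: search_fn returns hits as dicts with "post_id" and "score"
    keys, best match first. Adapt this to what your client really returns.
    """
    rows = []
    for query in queries:
        hits = list(search_fn(query))[:top_n]
        if not hits:
            # Keep queries with no results visible in the spreadsheet.
            rows.append({"query": query, "rank": 0, "post_id": "", "score": ""})
            continue
        for rank, hit in enumerate(hits, start=1):
            rows.append(
                {
                    "query": query,
                    "rank": rank,
                    "post_id": hit["post_id"],
                    "score": hit["score"],
                }
            )
    return rows


def rows_to_csv(rows: Iterable[dict]) -> str:
    """Serialize the rows to CSV text that a spreadsheet can open."""
    buf = io.StringIO()
    writer = csv.DictWriter(buf, fieldnames=FIELDS, lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return buf.getvalue()
```

Running the same query list once against ES and once against the regular WordPress search, then comparing the two CSV files side by side, is one way to judge which returns better results for a given site.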

FWIW sync (not indexing) status can be seen here: https://wordpress.com/settings/manage-connection

@gibrown and @Viper007Bond Thank you both! I'm back to work on Monday and will mull this over and also follow up with you on your answers. I really appreciate the detailed responses.

@philiparthurmoore going to close this out. Let me know if there is anything I can help with.
