Redash: Can't search query correctly with non-ASCII chars

Created on 23 Jun 2018 · 8 Comments · Source: getredash/redash

Issue Summary

Can't search query correctly with non-ASCII chars.

Steps to Reproduce

  1. Create a query whose name or description contains non-ASCII chars
  2. Search with non-ASCII chars

e.g.

There is a query containing the non-ASCII chars ユーザ.

[Screenshot: all_queries]

When I search with ユーザ, there are no queries in the result.

[Screenshot: search_query1]

When I search with ユー, the query is found correctly.

[Screenshot: search_query2]

I guess that Query.search (introduced in #2041) changed this behavior, but I have no idea how we should fix it while keeping the full text search feature.

Technical details:

  • Redash Version: master
  • Browser/OS: Version 67.0.3396.87 (Official Build) (64-bit)
  • How did you install Redash: Docker
Backend Bug

All 8 comments

Possibly related to #2618.

Yeah, the recently updated query full text search is based on Postgres' built-in text search, which uses the "simple" configuration (parsers, templates, dictionaries). That configuration only lowercases the content body and removes stop words while searching.

Unfortunately by default it only comes with support for a few Indo-European languages and misses others such as Korean, Japanese and Chinese (and more).

To fix this, we'd need to add support for those languages, for example with PGroonga, which supports all languages but requires a 3rd party extension. The tutorial gives an idea of how this would look, including for example the ability to just keep using ILIKE queries.
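As a rough illustration of why plain substring matching sidesteps the language problem, here is a minimal sketch of a LIKE-style search. It is not Redash code: the `queries` table, `like_pattern` and `search_names` are hypothetical, and SQLite stands in for Postgres so the snippet runs on its own (SQLite's LIKE is case-insensitive for ASCII only, which is only roughly comparable to ILIKE).

```python
import sqlite3


def like_pattern(term: str) -> str:
    """Wrap a term in % wildcards, escaping LIKE metacharacters first."""
    escaped = (
        term.replace("\\", "\\\\")  # escape the escape character itself
        .replace("%", "\\%")
        .replace("_", "\\_")
    )
    return f"%{escaped}%"


def search_names(conn: sqlite3.Connection, term: str) -> list:
    """Return names containing `term` as a substring (no tokenizer involved)."""
    rows = conn.execute(
        "SELECT name FROM queries WHERE name LIKE ? ESCAPE '\\' ORDER BY id",
        (like_pattern(term),),
    )
    return [name for (name,) in rows]


# Hypothetical sample data in an in-memory database.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE queries (id INTEGER PRIMARY KEY, name TEXT)")
conn.executemany(
    "INSERT INTO queries (name) VALUES (?)",
    [("ユーザ一覧",), ("Daily revenue",), ("100%_done",)],
)
conn.commit()

print(search_names(conn, "ユーザ"))
print(search_names(conn, "ユー"))
```

Because the match is done on raw characters, both the full term and its prefix find the same row. The trade-off is that there is no ranking and no index-friendly lookup.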

Alternatively we could move the FTS away from Postgres altogether and switch to one of the many alternative search engines such as Elasticsearch, but that comes with a non-trivial amount of architectural changes.

I wouldn't want to have ES as a mandatory dependency in Redash as it will make deployments harder. But maybe we can make this functionality pluggable:

  1. Have a hook for "index new content" (dashboard / query / other in the future) and "index updated content".
  2. Have an interface for performing a search.

By default the two will use Postgres, but there can be additional implementations using ES, Algolia, or others.
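The two hooks and the search interface described above could be sketched roughly as follows. This is only an assumption about the shape of such an interface: `SearchBackend`, its method names and `InMemorySubstringBackend` are hypothetical, and the in-memory class is a stand-in for a real Postgres, ES or Algolia implementation.

```python
from abc import ABC, abstractmethod
from typing import Dict, List, Tuple


class SearchBackend(ABC):
    """Hypothetical plug-in interface: two indexing hooks plus a search call."""

    @abstractmethod
    def index_new(self, kind: str, item_id: int, text: str) -> None:
        """Hook called when a dashboard, query, etc. is created."""

    @abstractmethod
    def index_updated(self, kind: str, item_id: int, text: str) -> None:
        """Hook called when existing content changes."""

    @abstractmethod
    def search(self, kind: str, term: str) -> List[int]:
        """Return IDs of matching items of the given kind."""


class InMemorySubstringBackend(SearchBackend):
    """Toy backend using case-insensitive substring matching."""

    def __init__(self) -> None:
        self._docs: Dict[Tuple[str, int], str] = {}

    def index_new(self, kind: str, item_id: int, text: str) -> None:
        self._docs[(kind, item_id)] = text

    def index_updated(self, kind: str, item_id: int, text: str) -> None:
        # For this toy store, updating is the same as overwriting.
        self._docs[(kind, item_id)] = text

    def search(self, kind: str, term: str) -> List[int]:
        needle = term.casefold()
        return sorted(
            item_id
            for (doc_kind, item_id), text in self._docs.items()
            if doc_kind == kind and needle in text.casefold()
        )


backend = InMemorySubstringBackend()
backend.index_new("query", 1, "ユーザ一覧")
backend.index_new("query", 2, "Daily revenue")
backend.index_new("dashboard", 3, "ユーザ")
print(backend.search("query", "ユーザ"))
```

Callers only depend on `SearchBackend`, so swapping the implementation would not touch the code that creates or updates content.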

It would complicate the list views a bit, since they are currently written to not distinguish between searching and just fetching the list of all items. I guess the API handlers can provide the interface to cater to that: ask the search backend for a list of item model IDs in the order of the search ranking, and then fetch the appropriate data model items from the database.

While I don't think it will be a huge deal, there is some overhead involved that we should probably be testing. E.g. support for pagination in the search backend would seem like a good idea.
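A minimal sketch of that flow, assuming the search backend returns ranked IDs and the database is queried separately. `paginate`, `fetch_in_rank_order` and the fake database are hypothetical names for illustration, not existing Redash functions.

```python
from typing import Callable, Dict, List, Sequence, TypeVar

T = TypeVar("T")


def paginate(ranked_ids: Sequence[int], page: int, page_size: int) -> List[int]:
    """Slice the ranked ID list for a 1-based page number."""
    if page < 1 or page_size < 1:
        raise ValueError("page and page_size must be >= 1")
    start = (page - 1) * page_size
    return list(ranked_ids[start:start + page_size])


def fetch_in_rank_order(
    ranked_ids: Sequence[int],
    fetch_by_ids: Callable[[Sequence[int]], Dict[int, T]],
) -> List[T]:
    """Load items in one call, then restore the search ranking order.

    IDs the database does not return (e.g. deleted items) are skipped.
    """
    found = fetch_by_ids(ranked_ids)
    return [found[item_id] for item_id in ranked_ids if item_id in found]


# Hypothetical stand-in for the database lookup.
FAKE_DB = {1: "Users", 2: "Revenue", 3: "Signups", 4: "Churn"}


def fake_fetch(ids: Sequence[int]) -> Dict[int, str]:
    return {i: FAKE_DB[i] for i in ids if i in FAKE_DB}


ranked = [3, 1, 99, 4, 2]  # as if returned by a search backend; 99 is stale
page_ids = paginate(ranked, page=1, page_size=3)
print(fetch_in_rank_order(page_ids, fake_fetch))
```

Paginating before the database fetch keeps the lookup small, but a page can come back short when the index contains stale IDs.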

BTW, would you consider making this something to be distributed in the Redash core, or as extensions?

The list view is the least of the complications this will create :) I'm more worried about permissions and similar concerns, data sync (between search engine and database), and others.

And yes, this can be an extension.

Hi @jezdez,

Could we bring back the naive and slow LIKE search as an option? Maybe an ENV variable LEGACY_FULL_TEXT_SEARCH or something to switch between the two ways of searching?

For me, being able to search multi-byte text is far more critical than having the faster and more modern tsvector text search.
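Such a switch could look roughly like the sketch below. Only the variable name LEGACY_FULL_TEXT_SEARCH comes from the suggestion above. The accepted truthy values, the function names and the mode labels "like" and "tsvector" are assumptions made for illustration.

```python
import os
from typing import Mapping

# Assumed set of values that turn the option on.
TRUTHY = {"1", "true", "yes", "on"}


def legacy_search_enabled(environ: Mapping[str, str] = os.environ) -> bool:
    """Return True when LEGACY_FULL_TEXT_SEARCH is set to a truthy value."""
    value = environ.get("LEGACY_FULL_TEXT_SEARCH", "")
    return value.strip().lower() in TRUTHY


def choose_search_mode(environ: Mapping[str, str] = os.environ) -> str:
    """Pick the search implementation: "like" (legacy) or "tsvector"."""
    return "like" if legacy_search_enabled(environ) else "tsvector"


# Passing a plain dict makes the behavior easy to check without touching
# the real process environment.
print(choose_search_mode({"LEGACY_FULL_TEXT_SEARCH": "true"}))
print(choose_search_mode({}))
```

Leaving the variable unset keeps the current tsvector behavior, so existing deployments would not change unless they opt in.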

@deecay adding an option to enable simpler search sounds good to me. Considering the global usage of Redash, I expect this to be popular enough to put in Organization Settings UI.
