Efcore: SQLite full-text search support

Created on 17 Mar 2016  路  9Comments  路  Source: dotnet/efcore

SQLite's FTS3 and FTS4 (full-text search) extension modules allow users to create special tables with a built-in full-text index. This extension is usually enabled by default. (I checked: most platforms we support have a version of SQLite compiled with ENABLE_FTS3).

To leverage this, the create table syntax and query syntax are slightly different.

CREATE VIRTUAL TABLE enrondata1 USING fts3(content TEXT);     /* FTS3 table */
CREATE TABLE enrondata2(content TEXT);                        /* Ordinary table */
SELECT count(*) FROM enrondata1 WHERE content MATCH 'linux';  /* FTS search -- fast */
SELECT count(*) FROM enrondata2 WHERE content LIKE '%linux%'; /* regular search -- slow */

To enable this in EF, we would need to add some kind of configuration option to specific which tables are FTS and make query respond accordingly.

In the meantime, users can work around this with MigrationBuilder.Sql and FromSql.

More docs: https://www.sqlite.org/fts3.html

area-query area-sqlite needs-design type-enhancement

Most helpful comment

FYI, I finally wrote a blog post about SQLite Full-Text Search and EF Core.

All 9 comments

Un-combining.

You are talking about fts 3 and 4, will you also support 5?

Now that SQLitePCL.raw supports FTS5, we'd probably just start with support for that.

Some notes:

  • You create an FTS-enabled table instead of an FTS index
  • You can use the match() function to search
    sql SELECT * FROM email WHERE match($query, body)
  • To search all columns, pass the table identifier as the second argument (super weird)
    sql SELECT * FROM email WHERE match($query, email)
  • The bm25() (rank), highlight() and snippet() functions also take the table identifier as an argument

Daydreaming about this some more:

LINQ like the this...

from e in db.Emails
where EF.Functions.Match(e, query)
orderby EF.Functions.Bm25(e)
select new
{
    e.Id,
    Subject = EF.Functions.Highlight(e.Subject, "<b>", "</b>"),
    Body = EF.Functions.Snippet(e.Body, "<b>", "</b>", "...", 64)
};

...would produce SQL like this.

select
    id,
    highlight(email, subject, '<b>', '</b>') as subject,
    snippet(email, body, '<b>', '</b>', '...', 64) as body
from email
where match(email, $query)
order by bm25(email);

Notice how expressions like Highlight(e.Subject, ...) are translated into highlight(email, subject, ...) and you could write either Match(e, ...) (translates to match(email, ...)) or Match(e.Body, ...) (translates to match(body, ...)).

Obviously, Match, Bm25, Highlight, and Snippet cannot be evaluated on the client and should throw.

Note, some functionality is already possible in EF Core today.

  • You can abuse the == operator instead of using match()
  • You can map the hidden rank column instead of using bm25()
  • Of course, you can always drop down to FromSql and write the queries yourself
from e in db.Emails
where e.Body == query
orderby e.Rank
select e;

@bricelam fts5 has a boatload of alternate/equivalent syntaxes you can use; one of them is to use the name of the table as a scalar function accepting a string. Shouldn鈥檛 it be possible to just map it as a scalar db function with that support recently in EF Core oob?

Eg

Select * from Table t 
Join Search
On t.rowid = search.rowid
Where search(@keywords)
Order by rank ASC

In ADO.NET parlance. If you add the virtual Rank column (already doable) and the scalar function is as I suspect supported, then that makes things easier.

Support for NEAR sucks because it uses the ugly quoted string within a quoted string format, but perhaps there鈥檚 an alternative syntax there?

FYI, I finally wrote a blog post about SQLite Full-Text Search and EF Core.

Was this page helpful?
0 / 5 - 0 ratings