Efcore: Enhancements to full-text seach

Created on 4 Oct 2018  路  5Comments  路  Source: dotnet/efcore

SQL Server provides an option for full-text CONTAINS that includes all indexes defined in a full-text index (rather than specifying each index separately): CONTAIN(*, , [LANGUAGE ]). EF Core does not currently support this use case.

Although the ideal solution would be to provide a provider-agnostic extension to define a "contains from any indexed field" extension to Linq, that would require a significant refactor of the full-text search logic. I have implemented an extension method EF.Functions.ContainsAny(...) that will replace the string field represented by the member accessor in the first parameter with *.

The expression
``` c#
var result = await context.Employees
.Where(c => EF.Functions.Contains(c.Title, "Representative"))

will generate (as currently implemented in 2.2)
``` sql
SELECT [c].[EmployeeID], [c].[City], [c].[Country], [c].[FirstName], [c].[ReportsTo], [c].[Title]
FROM [Employees] AS [c]
WHERE CONTAINS(c.Title, N'Representative')

While the (proposed new) expression
``` c#
var result = await context.Employees
.Where(c => EF.Functions.ContainsAny(c.Title, "Representative"))

will generate (as currently implemented in 2.2)
``` sql
SELECT [c].[EmployeeID], [c].[City], [c].[Country], [c].[FirstName], [c].[ReportsTo], [c].[Title]
FROM [Employees] AS [c]
WHERE CONTAINS(*, N'Representative')

This only addresses the CONTAINS case, although the FREETEXT case would be trivial to implement as well.

area-query customer-reported type-enhancement

Most helpful comment

Hello @milomconsulting. Thanks you very much for submitting this proposal. Indeed, I believe it would be very good to add support for passing '*' to the free text methods. Here is some feedback:

  1. According to the documentation (in this case for FREETEXT, but it should work the same for CONTAINS):

    If more than one table is in the FROM clause, * must be qualified by the table name.

    I would recommend you try simply always qualifying the * with the table alias, the same way we do now for the columns.

  2. During our triage meting, @bricelam suggested that the name ContainsAny isn't very clear and that maybe we should just extend the existing Contains and FreeText methods to support another pattern instead . We could simply support translating
    ``` c#
    var result = await context.Employees
    .Where(c => EF.Functions.Contains(c, "Representative"))

    ``` sql
    SELECT [c].[EmployeeID], [c].[City], [c].[Country], [c].[FirstName], [c].[ReportsTo], 
       [c].[Title]
    FROM [Employees] AS [c]
    WHERE CONTAINS([c].*, N'Representative')
    

    The only issue with that is that the existing extension method only accepts a property of type string in its signature (making it type-safe seemed a good idea, because we currently only allow a reference to single property anyway). We should be able to overcome that by adding a new overload that takes object or an unconstrained T generic argument (we can later consider removing the string overload in the next major release, 3.0, since it would provide very little value).

  3. Besides a single column and *, both FREETEXT and CONTAINS support passing a column list. There are bonus points for translating anonymous types (which is a columns idiom in LINQ for this kind of use cases):
    ``` c#
    var result = await context.Employees
    .Where(c => EF.Functions.Contains(new {c.FirstName, c.Title}, "Representative"))

    ``` sql
    SELECT [c].[EmployeeID], [c].[City], [c].[Country], [c].[FirstName], [c].[ReportsTo], 
       [c].[Title]
    FROM [Employees] AS [c]
    WHERE CONTAINS(([c].[FirstName], [c].[Title]), N'Representative')
    

    This can be done with the same new signature we would add for *.

  4. Besides all of that, CONTAINS also supports searching for document properties using the PROPERTY(property_name, column_name) syntax. If we ever want to add support for that, I would suggest in that case we do add a separa method, e.g. ContainsProperty.

I would treat * as the highest priority.

All 5 comments

@divega to advise.

Hello @milomconsulting. Thanks you very much for submitting this proposal. Indeed, I believe it would be very good to add support for passing '*' to the free text methods. Here is some feedback:

  1. According to the documentation (in this case for FREETEXT, but it should work the same for CONTAINS):

    If more than one table is in the FROM clause, * must be qualified by the table name.

    I would recommend you try simply always qualifying the * with the table alias, the same way we do now for the columns.

  2. During our triage meting, @bricelam suggested that the name ContainsAny isn't very clear and that maybe we should just extend the existing Contains and FreeText methods to support another pattern instead . We could simply support translating
    ``` c#
    var result = await context.Employees
    .Where(c => EF.Functions.Contains(c, "Representative"))

    ``` sql
    SELECT [c].[EmployeeID], [c].[City], [c].[Country], [c].[FirstName], [c].[ReportsTo], 
       [c].[Title]
    FROM [Employees] AS [c]
    WHERE CONTAINS([c].*, N'Representative')
    

    The only issue with that is that the existing extension method only accepts a property of type string in its signature (making it type-safe seemed a good idea, because we currently only allow a reference to single property anyway). We should be able to overcome that by adding a new overload that takes object or an unconstrained T generic argument (we can later consider removing the string overload in the next major release, 3.0, since it would provide very little value).

  3. Besides a single column and *, both FREETEXT and CONTAINS support passing a column list. There are bonus points for translating anonymous types (which is a columns idiom in LINQ for this kind of use cases):
    ``` c#
    var result = await context.Employees
    .Where(c => EF.Functions.Contains(new {c.FirstName, c.Title}, "Representative"))

    ``` sql
    SELECT [c].[EmployeeID], [c].[City], [c].[Country], [c].[FirstName], [c].[ReportsTo], 
       [c].[Title]
    FROM [Employees] AS [c]
    WHERE CONTAINS(([c].[FirstName], [c].[Title]), N'Representative')
    

    This can be done with the same new signature we would add for *.

  4. Besides all of that, CONTAINS also supports searching for document properties using the PROPERTY(property_name, column_name) syntax. If we ever want to add support for that, I would suggest in that case we do add a separa method, e.g. ContainsProperty.

I would treat * as the highest priority.

Any idea if this will make it in 5.0? I'd like to see these improvements implemented.

@Shane32 This issue is in the Backlog milestone. This means that it is not planned for the next release (EF Core 5.0). We will re-assess the backlog following the this release and consider this item at that time. However, keep in mind that there are many other high priority features with which it will be competing for resources.

See also #10462.

Was this page helpful?
0 / 5 - 0 ratings