Efcore: Feature Request: LINQ to table parameter

Created on 6 Sep 2018  路  7Comments  路  Source: dotnet/efcore

Use Case: Ability to use query against a variable number of inputs, for a batching scenario. Ideally I'd like to have the same query plan used whether there are 2 input values or 100 input values, in conjunction with a linq query.

Hypothetical linq:
```c#
public class MyDbContext : DbContext
{
// ... or however it makes sense to define a udt
public DbParameter IdTable { get; set; }
}

public class IdUdt : DbTableParameter
{
// not sure if this would be necessary, but could enable the With function below
protected override void Populate(long value)
{
this.Id = value;
}

public long Id { get; set; }

}

public static class Execution
{
public static async Task> GetWithTableParameterAsync(IEnumerable ids, CancellationToken cancellationToken)
{
using (var db = new MyDbContext()) {
var query = from person in db.People
join id in db.IdTable.With(ids) on person.PersonId equals id.Id
select person;
return await query.ToListAsync(cancellationToken);
}
}
}

Ideally this generates t-sql that looks something like:
```t-sql
declare @p0 dbo.udt_Id
insert into @p0 values(1)
insert into @p0 values(2)
insert into @p0 values(3)
insert into @p0 values(4)

exec sp_executesql N'SELECT 
    [Extent1].[PersonID] AS [Id], 
    [Extent1].[Name] AS [Name]
    FROM [dbo].[People] AS [Extent1]
    INNER JOIN @p0 i ON t.[PersonID] = i.[ID]',N'@ids [dbo].[udt_TableBigintId] READONLY',@ids=@p0

Note that I realize I can already do something similar with a Contains -- something like:
c# public static async Task<List<Person>> GetWithInClauseAsync(IEnumerable<long> ids, CancellationToken cancellationToken) { using (var db = new MyDbContext()) { var query = from person in db.People where ids.Contains(person.PersonId) select person; return await query.ToListAsync(cancellationToken); } }

The difference is in the sql that gets generated, since it embeds the id values into the query (or if using an expression tree walker, I believe it's possible to change this to have n equality checks with sql parameters). I'd prefer a single query plan to either of these solutions.

I also realize that it's possible to use table parameters with raw sql, but the challenge there is that I can't easily inject it into the middle of a complex query if I want to use LINQ.

area-query customer-reported type-enhancement

Most helpful comment

Hi @awr your issue was assigned to me as investigation in 2.2 with the idea that I would do some thinking to try to decide if this was a duplicate of an existing issue or if there was something in your proposal that we wanted to pursue.

I did a search and I actually couldn't find an active issue in our backlog that covers using TVPs in this way.

We have certainly done some thinking about using TVPs as an implementation detail for things like Enumerable.Contains(). For example, a query like this (based on your examples) would use TVPs automatically:

C# public static async Task<List<Person>> GetWithTableParameterAsync( IEnumerable<long> ids, CancellationToken cancellationToken) { using (var db = new MyDbContext()) { var query = from person in db.People where ids.Contains(person.PersonId) select person; return await query.ToListAsync(cancellationToken); } }
However, there some challenges with doing this automatically. One of them is the fact that SQL Server requires table types to be declared in the database before you can use them in a TVP. Hence we have come up with other alternative implementations that presumably would have lower impact, like #12777.

But I believe what you are proposing is really interesting. On one hand it could be used to solve the same scenario above, but instead of Enumerable.Contains(), you would explicitly join the two query roots like in your sample code. On the other hand, it seems to be a more general feature that could have other applications.

I think I have been able to come up with two orthogonal new capabilities that we could add which I think would generalize the idea even more:

  1. New SQL Server specific extension method for EntityTypeBuilder and QueryTypeBuilder to indicate you want the corresponding table type to be created. The name could be something like ForSqlServerDeclareTableType(). I like the option of adding this for entity types and query types, which we already support, more than adding yet a third way of declaring a "shape" in the model.

  2. A new FromData() core query operator: This could be very similar to FromSql in the sense that it would be used to override the data source of the query root in that particular query, but instead of specifying that you need to execute some SQL, you would provide the data in-line or though a variable. This method would normally evaluate in memory, but in the particular case of SQL Server , it could cause the creation of a TVP and transference of the data to the server to be processed there. We would need to decide if that behavior would be triggered implicitly, whether it would depend on other parts of the query being evaluated on the server, or if it needs to be required explicitly.
    FromData() could support inputs similar to the existing HasData() method used for data seeding in model definition, and it could support streaming up TVPs if the input is an IEnumerable<T> or IAsynncEnumerable<T>. In case you pass actual entity instances to FromData(), we would need consider if the right behavior of the method would be to clone or actually use the same instances.
    Note that ToQuery() (the API for defining query) can already be used in the model to supply in-memory data.

Assuming we had these capabilities, your example could look something like this:

``` C#
public class MyDbContext : DbContext
{
// ... or however it makes sense to define a udt
public DbQuery LongRows { get; set; }

override void OnModelCreating(ModelBuilder modelBuilder)
{
    modelBuilder.Query<LongRow>().ForSqlSserverDeclareTableType();
}

}

public class LongRow
{
public long Id { get; set; }
}

public static class Execution
{
public static async Task> GetWithTableParameterAsync(
IEnumerable ids,
CancellationToken cancellationToken)
{
using (var db = new MyDbContext()) {
var query = from person in db.People
join id in db.LongRows.FromData(ids => ids.Select(id => new LongRow {Id = id})))
on person.PersonId equals id.Id
select person;
return await query.ToListAsync(cancellationToken);
}
}
}
```

Of course we could also have sugar to make this more terse, like an attribute on the type or DbSet or DbQuery property to automatically create the table type, and a version of FromData() that takes a collection of a single scalar and converts.

All 7 comments

Anything I can do to help this along? I'd be happy to put together a PR if you point me in the general direction of where you'd want to see this (assuming it fits with your direction for the project).

Hi @awr your issue was assigned to me as investigation in 2.2 with the idea that I would do some thinking to try to decide if this was a duplicate of an existing issue or if there was something in your proposal that we wanted to pursue.

I did a search and I actually couldn't find an active issue in our backlog that covers using TVPs in this way.

We have certainly done some thinking about using TVPs as an implementation detail for things like Enumerable.Contains(). For example, a query like this (based on your examples) would use TVPs automatically:

C# public static async Task<List<Person>> GetWithTableParameterAsync( IEnumerable<long> ids, CancellationToken cancellationToken) { using (var db = new MyDbContext()) { var query = from person in db.People where ids.Contains(person.PersonId) select person; return await query.ToListAsync(cancellationToken); } }
However, there some challenges with doing this automatically. One of them is the fact that SQL Server requires table types to be declared in the database before you can use them in a TVP. Hence we have come up with other alternative implementations that presumably would have lower impact, like #12777.

But I believe what you are proposing is really interesting. On one hand it could be used to solve the same scenario above, but instead of Enumerable.Contains(), you would explicitly join the two query roots like in your sample code. On the other hand, it seems to be a more general feature that could have other applications.

I think I have been able to come up with two orthogonal new capabilities that we could add which I think would generalize the idea even more:

  1. New SQL Server specific extension method for EntityTypeBuilder and QueryTypeBuilder to indicate you want the corresponding table type to be created. The name could be something like ForSqlServerDeclareTableType(). I like the option of adding this for entity types and query types, which we already support, more than adding yet a third way of declaring a "shape" in the model.

  2. A new FromData() core query operator: This could be very similar to FromSql in the sense that it would be used to override the data source of the query root in that particular query, but instead of specifying that you need to execute some SQL, you would provide the data in-line or though a variable. This method would normally evaluate in memory, but in the particular case of SQL Server , it could cause the creation of a TVP and transference of the data to the server to be processed there. We would need to decide if that behavior would be triggered implicitly, whether it would depend on other parts of the query being evaluated on the server, or if it needs to be required explicitly.
    FromData() could support inputs similar to the existing HasData() method used for data seeding in model definition, and it could support streaming up TVPs if the input is an IEnumerable<T> or IAsynncEnumerable<T>. In case you pass actual entity instances to FromData(), we would need consider if the right behavior of the method would be to clone or actually use the same instances.
    Note that ToQuery() (the API for defining query) can already be used in the model to supply in-memory data.

Assuming we had these capabilities, your example could look something like this:

``` C#
public class MyDbContext : DbContext
{
// ... or however it makes sense to define a udt
public DbQuery LongRows { get; set; }

override void OnModelCreating(ModelBuilder modelBuilder)
{
    modelBuilder.Query<LongRow>().ForSqlSserverDeclareTableType();
}

}

public class LongRow
{
public long Id { get; set; }
}

public static class Execution
{
public static async Task> GetWithTableParameterAsync(
IEnumerable ids,
CancellationToken cancellationToken)
{
using (var db = new MyDbContext()) {
var query = from person in db.People
join id in db.LongRows.FromData(ids => ids.Select(id => new LongRow {Id = id})))
on person.PersonId equals id.Id
select person;
return await query.ToListAsync(cancellationToken);
}
}
}
```

Of course we could also have sugar to make this more terse, like an attribute on the type or DbSet or DbQuery property to automatically create the table type, and a version of FromData() that takes a collection of a single scalar and converts.

Note for triage: I think this belongs in the backlog for now, but since @awr is interested in contributing we can discuss the design in general terms.

Updated my previous comment to reflect that client evaluation isn't an option anymore after 3.0.

I think the syntax should be more close to that of Set()

something along the lines of

IQueryable<TEntity> TableValueParameterSet<TEntityId>(IEnumerable<TEntityId> ids) where TEntityId : struct

By conventon EF could use a table value paramater type named $"ef_core_{nameof(TEntityId)}_tvp" or similar

If EF core eventually support insert / delete / update directly on IQueryables you could even do

ctx.TableValueParameterSet(ids)
  .Join(ctx.Set<MyEntity>(), id => id, e => e.Id, (id, e) => e)
  .Delete();

If we extend the table value parameter mechanics to support custom entity types not just simple value types we could even bulk insert with a crazy high level of performance. Something like.

var batch = Enumerable.Range(0, 2000).Select(i => new MyEntity{ Foo = $"Hello World_{i}" }).ToList();
ctx.TableValueParameterSet(batch)
.Insert();

Possibilities are endless :D

This issue need more love from the ef core team.

Was this page helpful?
0 / 5 - 0 ratings