Efcore: Query: SQL for queries which are projecting out single result from collection in projection

Created on 6 Oct 2017  路  15Comments  路  Source: dotnet/efcore

```C#
var query = db.Users.Select(
u => new
{
u.Text,
T = u.Notifications.FirstOrDefault()
})
.ToList();

Generates following SQL
```SQL
info: Microsoft.EntityFrameworkCore.Database.Command[20101]
      Executed DbCommand (0ms) [Parameters=[], CommandType='Text', CommandTimeout='30']
      SELECT [u].[Text], [u].[Id]
      FROM [Users] AS [u]
info: Microsoft.EntityFrameworkCore.Database.Command[20101]
      Executed DbCommand (3ms) [Parameters=[@_outer_Id='1'], CommandType='Text', CommandTimeout='30']
      SELECT TOP(1) [n].[Id], [n].[Message], [n].[UserId]
      FROM [Notifications] AS [n]
      WHERE @_outer_Id = [n].[UserId]

It is N+1 queries. Since there is FirstOrDefault operator (hence only 1 related row max), instead of going through 2 queries form like include, we can just do cross apply (which include tries to avoid with 2 queries) since there is no duplicated data.

Expected SQL:

SELECT [u].[Text], [u].[Id], [t].[Id], [t].[Message], [t].[UserId]
FROM [Users] AS [u]
CROSS APPLY (
    SELECT TOP(1) [n].[Id], [n].[Message], [n].[UserId]
    FROM [Notifications] AS [n]
    WHERE [u].[Id] = [n].[UserId]
) AS [t]
closed-fixed punted-for-2.1 type-enhancement

Most helpful comment

Can we get an ETA on this fix? Right now, we have several lateral joins (cross/outer apply) and it's resulting in N + 1 queries. If there are a 100 rows, it results in 101 queries instead of 1 query. Sometimes, we have several lateral joins....this can results in 100s of queries just for 1 query. It increases the CPU usage and increases the latency.

In Entity Framework 6, this would be 1 query (for 100 rows) and it would correctly translate the LINQ to Cross/Outer Apply.....in Entity Frame Core 2.1 Final, it's 101 queries (for 100 rows).

Example:

SELECT Courses.CourseId, Courses.Category, LastDetail.*, DetailOptions.OptionYear, DetailOptions.OptionId
FROM Courses
OUTER APPLY
(
   SELECT TOP 1 Details.Title, Details.Status, Details.Created
   FROM Details
   WHERE Details.CourseId = Course.CourseId
   ORDER BY Details.Created DESC
) LastDetail
LEFT JOIN DetailOptions ON DetailOptions.DetailId = DetailOptions.DetailId

LINQ:

var query = (from c in db.Courses
                    let ld = (from ld in db.Details
                                 where ld.CourseId == c.CourseId
                                orderby ld.Created descending
                                select new  { ld.Title, ld.Status, ld.Created }).FirstOrDefault()
                    join do in db.DetailOptions on ld.DetailId equals do.DetailId into doGrp
                   from do in doGrp.DefaultIfEmpty()
                    select new { c.CourseId, c.Category, LastDetail = (ld == null) ? null : ld}, do.OptionId, do.OptionYear).ToList()

Rewriting the "LINQ" for all these queries "just to make it work" is costly and not a good solution. To make it use CROSS APPLY, we have to let ld = from db.Details.Where().OrderBy().FirstOrDefault() and then left join again to db.Details.

All 15 comments

Can we get an ETA on this fix? Right now, we have several lateral joins (cross/outer apply) and it's resulting in N + 1 queries. If there are a 100 rows, it results in 101 queries instead of 1 query. Sometimes, we have several lateral joins....this can results in 100s of queries just for 1 query. It increases the CPU usage and increases the latency.

In Entity Framework 6, this would be 1 query (for 100 rows) and it would correctly translate the LINQ to Cross/Outer Apply.....in Entity Frame Core 2.1 Final, it's 101 queries (for 100 rows).

Example:

SELECT Courses.CourseId, Courses.Category, LastDetail.*, DetailOptions.OptionYear, DetailOptions.OptionId
FROM Courses
OUTER APPLY
(
   SELECT TOP 1 Details.Title, Details.Status, Details.Created
   FROM Details
   WHERE Details.CourseId = Course.CourseId
   ORDER BY Details.Created DESC
) LastDetail
LEFT JOIN DetailOptions ON DetailOptions.DetailId = DetailOptions.DetailId

LINQ:

var query = (from c in db.Courses
                    let ld = (from ld in db.Details
                                 where ld.CourseId == c.CourseId
                                orderby ld.Created descending
                                select new  { ld.Title, ld.Status, ld.Created }).FirstOrDefault()
                    join do in db.DetailOptions on ld.DetailId equals do.DetailId into doGrp
                   from do in doGrp.DefaultIfEmpty()
                    select new { c.CourseId, c.Category, LastDetail = (ld == null) ? null : ld}, do.OptionId, do.OptionYear).ToList()

Rewriting the "LINQ" for all these queries "just to make it work" is costly and not a good solution. To make it use CROSS APPLY, we have to let ld = from db.Details.Where().OrderBy().FirstOrDefault() and then left join again to db.Details.

It's too bad this got postponed. But thanks for the update.

Hi,

Any possibility to have this issue fixed as soon as possible, in the next minor/major release ?

Thank you

@zulander1 It's currently in the 3.0 milestone, which realistically is the best we can hope to do at this point.

While people on the OLD EF can write queries...I have to use the OLD EF to generate the best query...then copy the resulting working SQL query and use FromSQL...and hack away to make this usable.
Needless to say this is the single most painful slowdown of programming since switching to .NET Core.
I have to say that for any non-basic queries this platform so far is pretty much useless. The main reason being this unacceptable performance loss due to generated multiple queries which SQL server can otherwise process in a second given the proper SQL syntax.

On the other hand I've refreshed my TSQL skills due to this bug. So I guess that's a "plus"

It also seems to break .Distinct(). In EF6, since my query was done all in SQL Server, the DISTINCT was done there too and I would get one result back. But now because of this N+1 queries problem, the Distinct() is probably called in C# and doesn't remove duplicates because while the objects all have the same property values, they are different objects.

Hi,

I'm having the same issue. As many to many is not already supported, creating a request to retrieve information from this kind of relationship like Role and User dbset, it will make N+1 queries.

Steps to reproduce

```c#
Context = new MainContext();

var request = Context.Users.Select(data => new UserDTO
{
Id = data.Id,
Roles = data.UserRoles.Select(x => x.Role.Name),
})
.ToList(); // We also tried without it

// Foreach to demonstrate request executing in console
foreach (var item in request)
{
var roles = item.Roles;
}

Console.ReadLine();
```

Further technical details

EF Core version: 2.1.4
Database Provider: Microsoft.EntityFrameworkCore.SqlServer
Operating system: W10
IDE: Visual Studio 2017 15.8.9

@Rubis61 for your scenario, you can take advantage of correlated collection optimization we added recently (see: https://github.com/aspnet/EntityFrameworkCore/commit/b95f23ffa4fbba305b3f0ea4c864fe53f456ad58 for some detailed information). Just need to append ToList() or ToArray() call at the end of the inner collection and we will produce 2 queries instead of N+1:

var request = Context.Users.Select(data => new UserDTO
{
    Id = data.Id,
    Roles = data.UserRoles.Select(x => x.Role.Name).ToList(),
})
.ToList();

Correlated collection optimization has several limitations, e.g. doesn't work if you want to return just one element of the inner collection, that's what the issue is tracking.

@zulander1 It's currently in the 3.0 milestone, which realistically is the best we can hope to do at this point.

@ajcvickers are will still on target for this ?

@zulander1 The 3.0 milestone is overbooked. We're re-evaluating over the next couple of weeks, after which there will be some more confidence.

I wanted to add an additional issue I have seen with this. After calling FirstOrDefault in a projection, only the first table after that point is joined.

C# var query = Context.Invoice.Select(invoice => new InvoiceSummary { OriginCity = invoice.Transactions.OrderByDescending(t => t.TransactionNumber).FirstOrDefault().Location.PostalAddress.City });

Something like this will result in a null reference exception. Location is joined but PostalAddress is not and is null. Even if I add Include() statements to the start of the query it is ignored.

Same issue of @muchman...

Triage: we need to make this work for Sqlite also--so try window function or SQlite specific re-write.

@muchman and @Dona278 issues are fixed in the latest bits - previously member pushdown (pushing member access that happens after FirstOrDefault, before it) only worked on one level of navigation access. In the new pipeline logic is recursive so arbitrary depth of navigation chains are supported

Does this work with includes and sub where now?

Was this page helpful?
0 / 5 - 0 ratings