Efcore: Translate GroupBy followed by FirstOrDefault over group

Created on 21 May 2018  路  20Comments  路  Source: dotnet/efcore

area-query ef6-parity internal-tag-sailors type-enhancement

Most helpful comment

Very important feature IMHO.
My use case is this:

await _db.RetailerRates
                .GroupBy(r => r.Price)
                .Select(g => g.OrderByDescending(r => r.CreatedAt).First())
                .ToListAsync();

All 20 comments

??

Is there any update? Any alternative or planned fix? @ajcvickers @shanselman

Is there a workaround for this that would allow for database-evaluated DistinctBy() logic? Seems like this Issue blocks database evaluation of this.

Any workaround while it's being fixed? I am in the process of migrating .NET Core from 2.1 to 3.1 and EF Core from 2.1 to 3.1.

Another breaking change for migrating to ef core 3.1 and nobody knows when it will be supported :( @smitpatel @ajcvickers

@suadev - It was getting client eval'ed before. It is part of same breaking change disabling client eval. The perf was bad for this case. If your group had 100 element each and you are selecting only first, we still got data for all 100 elements to client side to compute the first.

Very important feature IMHO.
My use case is this:

await _db.RetailerRates
                .GroupBy(r => r.Price)
                .Select(g => g.OrderByDescending(r => r.CreatedAt).First())
                .ToListAsync();

This query pattern shows up quite a bit in my under-way 1000 table LINQ to SQL migration project.

Example use case: You have a list of blogs. For each blog you want to display a single highlighted post (the most upvoted or most recent one). So you might group by blog and take the first item from each group.

This can be translated with ROW_NUMBER ... WHERE r = 1 or with CROSS APPLY (SELECT TOP 1 ...). Both variants have different query plan shapes and performance characteristics. It would be nice if the shape could be selected by writing the LINQ query in a particular way.

This query could result in the ROW_NUMBER shape:

ctx.BlogPosts
.GroupBy(bp => bp.Blog)
.Select(g => g.OrderByDescending(bp => bp.CreatedAt).FirstOrDefault())
.ToList();

This query could result in the CROSS APPLY shape:

ctx.BlogPosts
.GroupBy(bp => bp.Blog)
.SelectMany(g => g.OrderByDescending(bp => bp.CreatedAt).Take(1).DefaultIfEmpty())
.ToList();

But any way to support this would be better than no way so that our migration can proceed.


The ROW_NUMBER plan leads SQL Server to sort all rows which could be billions. The CROSS APPLY lends itself to a loop join plan that very selectively queries an index. ROW_NUMBER is great if the fraction of rows selected is near 100%. CROSS APPLY is great when picking very few items.

Depending on the scenario the performance difference can be significant. SQL Server is unable to automatically switch between theses forms.

@smitpatel @ajcvickers Is there any update on this issue or work around you could suggest for my case #21619 please?

If I distil my use-case it comes down to a simple multi-column group-by statement and a projection:

            var groupedOrganisationMentions = await collection
                .GroupBy(m => new {m.LocationId, m.PublicationId, m.PracticeAreaId})
                .Select(am => new
                {
                    Key = am.Key,
                    Items = am.ToList()
                }).ToListAsync(cancellationToken: token).ConfigureAwait(false);

The Items projection just doesn't want to work.

I've got a similar use case that seems to be related to this issue. I use code like this in several places:

// This works fine
query.Select(price => new Customer {
   Name = price.Payer.Name,
   Code = price.Payer.Code,
   City = price.Payer.City,
   ParentCode = price.Payer.ParentCode,
   ParentLevel = CustomerLevel.Corporate,
   CustomerLevel = CustomerLevel.Payer
}).Distinct().ToListAsync();

That works fine. The query variable could potentially have a variety of expressions and joins with no issue. As soon as I add a call to OrderBy, it will not evaluate. I've tried several workarounds that I've found around the interwebz, and nothing seems to resolve it.

// This throws error
// query is of type IQueryable<Price>
query.Select(price => new Customer {
   Name = price.Payer.Name,
   Code = price.Payer.Code,
   City = price.Payer.City,
   ParentCode = price.Payer.ParentCode,
   ParentLevel = CustomerLevel.Corporate,
   CustomerLevel = CustomerLevel.Payer
}).Distinct().OrderBy(cust => cust.Name).ToListAsync();

Also, placement of the OrderBy does not seem to matter.

// This also throws error
// query is of type IQueryable<Price>
query
   .OrderBy(price => price.payer.Name)
   .Select(price => new Customer {
      Name = price.Payer.Name,
      Code = price.Payer.Code,
      City = price.Payer.City,
      ParentCode = price.Payer.ParentCode,
      ParentLevel = CustomerLevel.Corporate,
      CustomerLevel = CustomerLevel.Payer
   }).Distinct().ToListAsync();

@lenniebriscoe - The query you posted above is different from what you posted on #21619 Your query here does not contain First/FirstOrDefault over grouping element. Please refer to #19929 and #17068 to understand different queries and their translation to relational database.

@brandonsmith86 - Your query does neither contains a GroupBy operator nor FirstOrDefault over the grouping element and is irrelevant to this discussion. Please file a new issue with full repro code if you believe there is a bug in EF Core.

@brandonsmith86 - Your query does neither contains a GroupBy operator nor FirstOrDefault over the grouping element and is irrelevant to this discussion. Please file a new issue with full repro code if you believe there is a bug in EF Core.

Sorry about that. Many of the examples in this issue do not have reference to FirstOrDefault, and I assumed a call to Distinct uses grouping logic, same as GroupBy method. I will post elsewhere.

@smitpatel
Any news about this feature? (GroupBy followed by FirstOrDefault over group)
Thanks for your reply.

Okey you can achive this with using this approach:
https://dev.azure.com/pawelgerr/Thinktecture.EntityFrameworkCore/_wiki/wikis/Thinktecture.EntityFrameworkCore.wiki/35/RowNumber-Support-(EF-Core-2)

DbContext.OrderItems.Select(o => new 
                     { 
                        RowNumber = EF.Functions.RowNumber(o.ProductId)
                     })
                     .AsSubQuery()
                     .Where(i => i.RowNumber == 1)

Would be nice if this gets built in.

Hi guys looking for resolution or workaround for why this can't be translated in EF Core

.GroupBy()
.Select(group=>group.First())

Really, some basic which NHibernate was able to handle in 2006 (and also EF6.x was OK with it) is still not in EF core? Not after 2 years?

This really has to be supported in EF core.

This is a very common pattern in my EF queries, I'm shocked it's not in EF Core. Effectively preventing me from migrating to .Net (core). Must have

@stevozilik EF6 runs on .NET Core, so even if you can't use EF Core due to this, it should not stop you migrating to .NET Core.

Was this page helpful?
0 / 5 - 0 ratings