Efcore: LINQ Average conversion to SQL is correct in ver2-preview1, but incorrect in version 2

Created on 22 Aug 2017  路  13Comments  路  Source: dotnet/efcore

In EF Core 2.0.0-preview1-final release the Linq Average command was correctly converted to the SQL AVG command. In release 2.0.0 this seems to be broken.

If you are seeing an exception, include the full exceptions details (message and stack trace).

System.InvalidOperationException : Sequence contains no elements.
   at Microsoft.EntityFrameworkCore.Query.QueryMethodProvider.GetResult[TResult](IEnumerable`1 valueBuffers, Boolean throwOnNullResult)
   at lambda_method(Closure , QueryContext , ValueBuffer )
   at Microsoft.EntityFrameworkCore.Query.ExpressionVisitors.Internal.ProjectionShaper.TypedProjectionShaper`3.Shape(QueryContext queryContext, ValueBuffer valueBuffer)
   at Microsoft.EntityFrameworkCore.Query.Internal.QueryingEnumerable`1.Enumerator.BufferlessMoveNext(Boolean buffer)
   at Microsoft.EntityFrameworkCore.Query.Internal.QueryingEnumerable`1.Enumerator.MoveNext()
   at Microsoft.EntityFrameworkCore.Query.Internal.LinqOperatorProvider.ExceptionInterceptor`1.EnumeratorExceptionInterceptor.MoveNext()
   at System.Collections.Generic.List`1.AddEnumerable(IEnumerable`1 enumerable)
   at System.Linq.Enumerable.ToList[TSource](IEnumerable`1 source)
   at Ver2.UnitTests.TestAverageVer2.TakeAverageOfTwoBooks() in c:\users\jonsm\Source\Repos\EfTestDifferentVersions\Ver2\UnitTests\TestAverageVer2.cs:line 73

Steps to reproduce

I have a Book entity class with a collection of Review entity classes that contain NumVotes. I want to calculate the average number of votes. A Book can have zero to many Reviews.

My entity classes are:
```c#
public class Book
{
public int BookId { get; set; }
public string Title { get; set; }
public string Description { get; set; }
public DateTime PublishedOn { get; set; }
public string Publisher { get; set; }
public decimal Price { get; set; }
public string ImageUrl { get; set; }

//----------------------------------------------
//relationships

public PriceOffer Promotion { get; set; }       
public ICollection<Review> Reviews { get; set; }
public ICollection<BookAuthor> AuthorsLink { get; set; }

}

```c#
public class Review                      
{
    public int ReviewId { get; set; }
    public string VoterName { get; set; }
    public int NumStars { get; set; }
    public string Comment { get; set; }

    //-----------------------------------
    //Relationships

    public int BookId { get; set; }      
}

My test code is below, and it fails on the database query when running EF Core 2.0.0, but works correctly if running EF Core 2.0.0-preview1-final.
Note: the method TwoBooksOneWithReviews creates two Book entities; one with no reviews and the second with two reviews.
```c#
private class Dto
{
public string Title { get; set; }
public double? AveVotes { get; set; }
}

[Fact]
public void TakeAverageOfTwoBooks()
{
//SETUP
var options = SqliteInMemory.CreateOptions();

using (var context = new EfCoreContext(options))
{
    context.Database.EnsureCreated();

    context.AddRange(SeedData.TwoBooksOneWithReviews());
    context.SaveChanges();
    var logs = context.SetupLogging();

    //VERIFY
    var result = context.Books.Select(x => new Dto
    {
        Title = x.Title,
        AveVotes = x.Reviews.Select(y => y.NumStars).Average()
    }).ToList();

    //VERIFY
    result.First().AveVotes.ShouldBeNull();
    result.Last().AveVotes.ShouldEqual(4);
    foreach (var log in logs)
    {
        _output.WriteLine(log);
    }
}

}

If I run exactly the same test on release 2.0.0-preview1-final then it works correctly and outputs the following SQL 
```SQL
Executed DbCommand (0ms) [Parameters=[], CommandType='Text', CommandTimeout='30']
SELECT "x"."Title", (
    SELECT AVG(CAST("y"."NumStars" AS REAL))
    FROM "Review" AS "y"
    WHERE "x"."BookId" = "y"."BookId"
) AS "AveVotes"
FROM "Books" AS "x"

Further technical details

EF Core version: 2.0.0 incorrect (2.0.0-preview1-final correct)
Database Provider: Microsoft.EntityFrameworkCore.Sqlite (but fails on SqlServer too)
Operating system: Windows 10
IDE: VS2017 15.3

closed-by-design

Most helpful comment

Could I ask if this bug is fixed in the planned patch update, 2.0.1? My reasons for asking this are:

  1. It seems that you know what the problem is - DetermineAggregateThrowingBehavior. I therefore assume its not hard to fix.
  2. The SQL code produced is very poor, and you get a lot of warnings in the log.
  3. I am trying to release the book Entity Framework in Action soon and I include Average as an example in the book - see demo web site http://efcoreinaction.com/ (currently running EF Core 1.1.0).

All 13 comments

Enumerable.Average throws InvalidOperationException when sequence contains no elements.
Ref: https://msdn.microsoft.com/en-us/library/bb548874(v=vs.110).aspx
We aligned with C# behavior due to bug https://github.com/aspnet/EntityFrameworkCore/issues/7901

I would politely ask you to reconsider this design decision and go back to returning null on a 'on sequence' situation. My reasons are:

A. Returning NULL for 'no sequence' is a recognised approach for databases

As EF Core is accessing a database then following the database convention is the most applicable approach. I believe EF6.x Average etc. returns null on 'no sequences'. Even the person who reported the error in #7901 suggesting returning NULL as a solution, null with documentation.

B. Adding code to handle the Average with 'no sequence' situation produces very poor SQL

I have obviously looked at a work around, and here is a new query that replaces the query in my unit test
```c#
var result = context.Books.Select(x => new Dto
{
Title = x.Title,
AveVotes = x.Reviews.Count == 0 ? null : (double?)x.Reviews.Select(y => y.NumStars).Average()
}).ToList();

This works, but has problems which the logs show. The log output contains 
1. `QueryPossibleExceptionWithAggregateOperator` followed by a
2 `QueryClientEvaluationWarning`

It then proceeds to use Client vs. Server for calculating the average.  
This is a very poor performing query because of this, thus making the use of `Average` etc. not useable in cases where the collection is possibly empty.

Here is the full log - note: in the unit test there is one entry with `Review`s and one entry with two `Review`s.

Possible unintended use of a potentially throwing aggregate method (Min, Max, Average) in a subquery. Client evaluation will be used and operator will throw if no data exists. Changing the subquery result type to a nullable type will allow full translation.
The LINQ expression 'Average()' could not be translated and will be evaluated locally.
Possible unintended use of a potentially throwing aggregate method (Min, Max, Average) in a subquery. Client evaluation will be used and operator will throw if no data exists. Changing the subquery result type to a nullable type will allow full translation.
The LINQ expression 'Average()' could not be translated and will be evaluated locally.
Possible unintended use of a potentially throwing aggregate method (Min, Max, Average) in a subquery. Client evaluation will be used and operator will throw if no data exists. Changing the subquery result type to a nullable type will allow full translation.
The LINQ expression 'Average()' could not be translated and will be evaluated locally.
Possible unintended use of a potentially throwing aggregate method (Min, Max, Average) in a subquery. Client evaluation will be used and operator will throw if no data exists. Changing the subquery result type to a nullable type will allow full translation.
The LINQ expression 'Average()' could not be translated and will be evaluated locally.
Executed DbCommand (0ms) [Parameters=[], CommandType='Text', CommandTimeout='30']
SELECT "x"."Title", "x"."BookId", CASE
WHEN (
SELECT COUNT(*)
FROM "Review" AS "r1"
WHERE "x"."BookId" = "r1"."BookId"
) = 0
THEN CAST(1 AS BIT) ELSE CAST(0 AS BIT)
END
FROM "Books" AS "x"
Executed DbCommand (0ms) [Parameters=[@_outer_BookId='?'], CommandType='Text', CommandTimeout='30']
SELECT AVG(CAST("y2"."NumStars" AS REAL))
FROM "Review" AS "y2"
WHERE @_outer_BookId = "y2"."BookId"
```

We cannot return null because we cannot assign it to non-nullable object. If average is being computed in nullable type then EF Core doesn't throw.

@smitpatel what needs to be nullable? The type of the member the result of Average() is assigned to or the return type of the selector passed to Average()?

FWIW, in @JonPSmith's sample code the former is already nullable:

C# private class Dto { public string Title { get; set; } public double? AveVotes { get; set; } }

Also he is right that this used to work in previous versions of EF.

I know we decided on this when we discussed #7901, but even on that issue, the customer was asking for basically the same behavior that @JonPSmith is asking for:

Interestingly, the standard behaviour for the LINQ Min() and Max() functions is to throw InvalidOperationException if there are no matching rows. This is a bit of a pain鈥攔eturning null makes more sense really鈥攂ut it would be helpful if this difference was clearly documented.

This throws for all cases even for nullable types. DetermineAggregateThrowingBehavior is not working correctly. @anpete

@anpete - Bump! this is causing issue with groupby since all Min/Max/Average raises flag that it needs client eval hence cannot be translated with groupby.

Could I ask if this bug is fixed in the planned patch update, 2.0.1? My reasons for asking this are:

  1. It seems that you know what the problem is - DetermineAggregateThrowingBehavior. I therefore assume its not hard to fix.
  2. The SQL code produced is very poor, and you get a lot of warnings in the log.
  3. I am trying to release the book Entity Framework in Action soon and I include Average as an example in the book - see demo web site http://efcoreinaction.com/ (currently running EF Core 1.1.0).

@anpete to follow up with a fix and we can discuss whether to patch or not.

I investigated and determined that the current behavior is by-design:

  • We correctly perform client-eval because the operand of _Average_ is not nullable. E.g. in .Select(y => y.NumStars).Average() _Review.NumStars_ is not nullable
  • The type of _Dto.AveVotes_ is irrelevant, it is merely the destination for the result of the full expression: x.Reviews.Count == 0 ? 0 : x.Reviews.Select(y => y.NumStars).Average()
  • Given an analysis of the full expression above, it could be possible to deduce that the code will never throw (and hence avoid the client-eval) - because of the conditional check that _Reviews_ isn't empty.
  • It is trivial to opt-out of the throwing operator by simply casting the operand to a nullable type: E.g. x.Reviews.Count == 0 ? 0 : x.Reviews.Select(y => (double?)y.NumStars).Average()

Hi @anpete,

Thanks for looking into this. I did try x.Reviews.Count == 0 ? null : (double?)x.Reviews.Select(y => y.NumStars).Average(), but I didn't realise that I needed to do x.Reviews.Select(y => (double?)y.NumStars).Average(). I do think that is non-obvious, but now I know.

Sorry for the trouble.

@anpete Since https://github.com/aspnet/EntityFrameworkCore/issues/9726 was closed as a duplicate of this one, hoe do you recommend that a call to AsyncMax be written? (Obviously there are lots of alternatives but none seem especially friendly. For now I'm just using OrderBy and LastOrDefaultAsync instead of max.)

@smitpatel It's not clear to me why #9722 and #9726 are duplicates of this, which means it's probably not clear to at least some other people too. 馃槃 Can you write a brief explanation?

@JonPSmith No worries, agree it is a little tricky.
@slubowsky I re-opened #9726 and #9722 is a dupe of that.

Was this page helpful?
0 / 5 - 0 ratings