Efcore: NET Core 5 release 3 - Include OrderBy throws exception

Created on 29 Apr 2020  路  14Comments  路  Source: dotnet/efcore

I am trying out NET Core 5 release 3 Include filter/order and I get an exception. I expect I have done something wrong but I can't find it.

Steps to reproduce

The unit test
``` C#
[Fact]
public void TestIncludeSortReviews()
{
//SETUP
var options = SqliteInMemory.CreateOptions();
using (var context = new EfCoreContext(options))
{
context.Database.EnsureCreated();
var newBook = new Book { Reviews = new List
{
new Review { NumStars = 2 } , new Review { NumStars = 1 }
} };
context.Add(newBook);
context.SaveChanges();

    //ATTEMPT
    var query = context.Books
        .Include(x => x.Reviews.OrderBy(y => y.NumStars));
    var books = query.ToList();

    //VERIFY
    _output.WriteLine(query.ToQueryString());
    books.Single().Reviews.Select(x => x.NumStars).ShouldEqual(new []{1,2});

}

}

The entity classes are

public class Book
{
public int BookId { get; set; }
public string Title { get; set; }
public string Description { get; set; }
public DateTime PublishedOn { get; set; }
public string Publisher { get; set; }
public decimal Price { get; set; }
public string ImageUrl { get; set; }

public bool SoftDeleted { get; set; }

//-----------------------------------------------
//relationships

public PriceOffer Promotion { get; set; }
public ICollection<Review> Reviews { get; set; } 
public ICollection<BookAuthor> AuthorsLink { get; set; }

}

```c#
public class Review 
{
    public int ReviewId { get; set; }
    public string VoterName { get; set; }
    public int NumStars { get; set; }
    public string Comment { get; set; }

    //-----------------------------------
    //Relationships

    public int BookId { get; set; } //#M
}

The DbContext is
```c#
public class EfCoreContext : DbContext
{
public EfCoreContext(DbContextOptions options)
: base(options) { }

public DbSet<Book> Books { get; set; }
public DbSet<Author> Authors { get; set; }
public DbSet<PriceOffer> PriceOffers { get; set; }

protected override void OnModelCreating(ModelBuilder modelBuilder)
{
    modelBuilder.Entity<BookAuthor>() 
        .HasKey(x => new {x.BookId, x.AuthorId});

    modelBuilder.Entity<Book>()
        .HasQueryFilter(p => !p.SoftDeleted);
}

}

Exception is - note it fails on the line `var books = query.ToList();`

System.ArgumentException : Expression of type 'System.Linq.IOrderedQueryable1[DataLayer.EfClasses.Review]' cannot be used for return type 'System.Linq.IOrderedEnumerable1[DataLayer.EfClasses.Review]'
at System.Linq.Expressions.Expression.ValidateLambdaArgs(Type delegateType, Expression& body, ReadOnlyCollection1 parameters, String paramName) at System.Linq.Expressions.Expression.Lambda[TDelegate](Expression body, String name, Boolean tailCall, IEnumerable1 parameters)
at System.Linq.Expressions.Expression.Lambda[TDelegate](Expression body, Boolean tailCall, IEnumerable1 parameters) at System.Linq.Expressions.Expression.Lambda[TDelegate](Expression body, ParameterExpression[] parameters) at System.Linq.Expressions.Expression11.Rewrite(Expression body, ParameterExpression[] parameters)
at System.Linq.Expressions.ExpressionVisitor.VisitLambdaT
at Microsoft.EntityFrameworkCore.Query.Internal.CompiledQueryCache.GetOrAddQuery[TResult](Object cacheKey, Func1 compiler) at Microsoft.EntityFrameworkCore.Query.Internal.QueryCompiler.Execute[TResult](Expression query) at Microsoft.EntityFrameworkCore.Query.Internal.EntityQueryProvider.Execute[TResult](Expression expression) at Microsoft.EntityFrameworkCore.Query.Internal.EntityQueryable1.GetEnumerator()
at Microsoft.EntityFrameworkCore.EntityFrameworkQueryableExtensions.IncludableQueryable2.GetEnumerator() at System.Collections.Generic.List1..ctor(IEnumerable1 collection) at System.Linq.Enumerable.ToList[TSource](IEnumerable1 source)
at Test.UnitTests.TestDataLayer.Ch06_HowIncludeWorks.TestIncludeSortSingle() in C:\Users\JonPSmith\source\repos\EfCoreinAction-SecondEdition\Test\UnitTests\TestDataLayer\Ch06_HowIncludeWorks.cs:line 65

### Further technical details

My Test.csproj file has the following packages


netcoreapp3.1

<IsPackable>false</IsPackable>










all
runtime; build; native; contentfiles; analyzers; buildtransitive







```

EF Core version:
Database provider: Microsoft.EntityFrameworkCore.Sqlite
Target framework: 5.0.0-preview.3.20181.2
Operating system:
IDE: Visual Studio 2019 16.5.4

closed-duplicate customer-reported

All 14 comments

Duplicate of #20578
It is fixed in nightly build already and will be released in new preview.
As a work-around, you can put Skip(0) after OrderBy and it should work.

I have just updated to release 5.0.0-preview.4.20220.10 of EF Core and the code in the initial issue now runs, but fails to order the data.

However, if I change the code to a disconnected state it works.

Below is the code that works
```c#
[Fact]
public void TestIncludeSortReviewsDisconnected()
{
//SETUP
var options = SqliteInMemory.CreateOptions();
using (var context = new EfCoreContext(options))
{
context.Database.EnsureCreated();
var newBook = new Book
{
Reviews = new List
{
new Review {NumStars = 2}, new Review {NumStars = 1}
}
};
context.Add(newBook);
context.SaveChanges();
}
using (var context = new EfCoreContext(options))
{
//ATTEMPT
var query = context.Books
.Include(x => x.Reviews.OrderBy(y => y.NumStars));
var books = query.ToList();

    //VERIFY
    _output.WriteLine(query.ToQueryString());
    books.Single().Reviews.Select(x => x.NumStars).ShouldEqual(new[] { 1, 2 });
}

}
```

UPDATED: removed incorrect SQL commands.

EF Core version:
Database provider: Microsoft.EntityFrameworkCore.Sqlite
Target framework: 5.0.0-preview.4.20220.10
Operating system: Windows 10
IDE: Visual Studio 2019 16.5.5

We need to investigate why the SQL generated is different in both cases.

@JonPSmith - The behavior you are seeing is somewhat expected. 2 reasons behind it

  • Any filtered/ordered include done using context which is already tracking those instances may not give expected result. If you have whole graph and Book->Reviews loaded in context and you perform a filtered include, we will not remove the additional Reviews from collection which does not match the filter. Same way if you have just a Review which does not match the filter tracked by context and did filtered include then it will be fixed up with the rest, even though it was not fetched from database. So for Ordered Include with same context, we fetch the data from database and try to add to collection, if the collection already has the Review, we don't change ordering.
  • In your code, you are not initializing the collections. By default EF creates HashSet. The behavior of HashSet for ordered collection is non-deterministic. Even if EF core fetch data from server in correct order and call AddToCollection in correct order, it could still reshuffle based on hashing key. At the least the collection requires to be order preserving.

@smitpatel . Thank you for your quick and detailed response. I have a question on your second bullet. The type of the Reviews navigational property is ICollection<Review>, which is ordered. I had assumed that assigning to that would retain the order. If it doesn't then I think I would have had more problems on other sorting issues than just this one. Can you explain - thanks.

Based on this code, I think it would end up being HashSet.
https://github.com/dotnet/efcore/blob/e6674cdbc8e45ffd890ba20441230fb383087e3a/src/EFCore/Metadata/Internal/CollectionTypeFactory.cs#L26-L33

Based on my knowledge, ICollection<T> does not specify anything about ordering contract. Items in the collection returned would be in some order but that does not necessarily same as the order in which they are added.

Thanks. I didn't know that and need to ponder how that alters my approach.

Hi @smitpatel,

I researched the HashSet issue you explained and here are my conclusions. Don't change anything in EF Core but helps me understand that not initializing a navigational collection property is OK.

Over the last three years I learnt to not to initialize a navigational collection property. I do this because a few times I forgot an .Include, then I get the wrong answer, i.e. an empty collection when it should contain a collection. By not initializing a navigational collection fails safe in that if I forget .Include then the property is null, which causes an exception.

Now, before the ability of EF Core to sort in the .Include, whether the HashSet holds the data in the order they were added is immaterial, but I did notice they were in primary key order.

With EF Core 5's "sort/filter in Include" how HashSet adds new entries matters. A study of the HashSet code shows it does add new entries in the order they are presented, unless there is a duplicate (it uses Shot[] and adds to the end if the next value is unique). The HashSet documentation says

A set is a collection that contains no duplicate elements, and whose elements are in no particular order.

But that is only true because of duplicates - even with duplicates it does has some order - see HashSet documentation example.

I have written this for my own peace of mind, and I will continue to not to initialize a navigational collection property. Thanks for reading.

As I said earlier, items in HashSet has some order but it does not necessarily mean order they are added. Further since it is not part of the contract, implementation change can do anything. If you think about it then it based on implementation of HashSet and default HashCode calculation the order could change in very odd way. So I wouldn't rely on that behavior.
If you want order preserving collection type but uninitialized then you can use IList<T>. In that case since HashSet is not assignable to it, EF Core will initialize a list for it which will preserve the order in which entries are added.

Thank you @smitpatel. I will change my uninitialized navigational collection property to IList<T> and update my book to show that, and why that is the best type.

With this knowledge I revisited the code and retested it and I was WRONG about the different SQL being produced (and I have removed from my initial comment). I believe the difference is because of the first bullet in your comment.

I therefore marked this issue is closed.

@JonPSmith Just to close the loop on this, the reason we use HashSet by default is because the performance for Contains for a List with many items is very poor since it does a linear scan each time. This can make change tracking and fixup slow for large data sets.

@ajcvickers,

Thanks for the detailed feedback. This is the typical technical trade-off we meet on any project, in this case using a type where the order is defined against performance.

I will do some performance testing so that when I update my book I can provide some idea of the trade-offs.

Hi @smitpatel,

You said

As I said earlier, items in HashSet has some order but it does not necessarily mean order they are added.

While I agree that the definition of the HashSet says "A set is a collection that contains no duplicate elements, and whose elements are in no particular order." I have looked at the current implementation and it does hold unique values in the order they are added.

@ajcvickers. My performance tests (which DON'T cover all options) show that relational fixup on a query and Add get large hits but SaveChanges (change tracking) is OK. Does that fit with your knowledge of EF Core?

PS. I am currently updating my book to use IList as it is that is the safe solution, but in the chapter on performance tuning I will talk this though and let the developer decide what they want to use.

I have looked at the current implementation and it does hold unique values in the order they are added.

There is a difference between contract and implementation. Changing contract is a breaking change. But changing implementation especially when it does not change contract is not. While current implementation may hold values in the order they are added, changing that implementation in future to void that invariant is non-breaking change as user would be relying on an undocumented behavior.

@JonPSmith See the answer from Jon Skeet on this Stack Overflow issue: https://stackoverflow.com/questions/657263/does-hashset-preserve-insertion-order

In particular:

It's possible that if you never remove any items, it will preserve insertion order. I'm not sure, but I wouldn't be entirely surprised. However, I think it would be a very bad idea to rely on that:

  • It's not documented to work that way, and the documentation explicitly states that it's not sorted.
  • I haven't looked at the internal structures or source code (which I don't have, obviously) - I'd have to study them carefully before making any such claim in a firm manner.
  • The implementation could very easily change between versions of the framework. Relying on this would be like relying on the string.GetHashCode implementation not changing - which some people did back in the .NET 1.1 days, and then they got burned when the implementation did change in .NET 2.0...
Was this page helpful?
0 / 5 - 0 ratings