When querying an entity and filtering on an owned entity the SQL query that is produced includes a LEFT JOIN
that could be avoided.
Entites:
class Order
{
public int Id { get; set; }
public string Title { get; set; }
public Address Address { get; set; }
}
class Address
{
public string Street { get; set; }
public string City { get; set; }
}
Model configuration:
modelBuilder.Entity<Order>().OwnsOne(x => x.Address);
The database table that is created looks like this:
A simple query like:
context.Orders.Where(x => x.Address == null).ToList();
Produces this SQL:
SELECT o."Id", o."Title", t."Id", t."Address_City", t."Address_Street"
FROM "Orders" AS o
LEFT JOIN (
SELECT o0."Id", o0."Address_City", o0."Address_Street", o1."Id" AS "Id0"
FROM "Orders" AS o0
INNER JOIN "Orders" AS o1 ON o0."Id" = o1."Id"
WHERE (o0."Address_Street" IS NOT NULL) OR (o0."Address_City" IS NOT NULL)
) AS t ON o."Id" = t."Id"
WHERE (t."Id" IS NULL)
Which is overly complicated. The columns Address_City
and Address_Street
are available on the Orders
table without any JOIN.
Same thing when querying a specific owned entity property:
context.Orders.Where(x => x.Address.City == "Rome").ToList();
SELECT o."Id", o."Title", t."Id", t."Address_City", t."Address_Street"
FROM "Orders" AS o
LEFT JOIN (
SELECT o0."Id", o0."Address_City", o0."Address_Street", o1."Id" AS "Id0"
FROM "Orders" AS o0
INNER JOIN "Orders" AS o1 ON o0."Id" = o1."Id"
WHERE (o0."Address_Street" IS NOT NULL) OR (o0."Address_City" IS NOT NULL)
) AS t ON o."Id" = t."Id"
WHERE (t."Address_City" = 'Rome') AND (t."Address_City" IS NOT NULL)
Example project (PostgreSQL): EfCoreOwnedEntity.zip
EF Core version: 3.0.0
Database provider: Npgsql.EntityFrameworkCore.PostgreSQL 3.0.1
Target framework: .NET Core 3.0
Operating system: Windows 10 1903
IDE: e.g. Visual Studio 2019 16.3.2
@smitpatel to investigate.
Legit generated SQL.
@smitpatel I think in these cases we could get rid of the join since the outer filter is more restrictive. However we would need to first profile whether this would result in any measurable perf improvement besides just being a simpler query.
Legit generated SQL.
@smitpatel Not if you are not using owned entities with table splitting.
For example we are using owned entity for audit-related information
[Owned]
public class AuditLog
{
[Column(nameof(IsDeleted), Order = 990)]
public bool IsDeleted { get; set; }
[Column(nameof(CreatedTime), Order = 991)]
public DateTime CreatedTime { get; set; }
[Column(nameof(ModifiedTime), Order = 992)]
public DateTime? ModifiedTime { get; set; }
[Column(nameof(CreatedBy), Order = 993)]
public string CreatedBy { get; set; }
[Column(nameof(ModifiedBy), Order = 994)]
public string ModifiedBy { get; set; }
}
We put this entity on many multiple entities (e.g. manufacturers, products, product translations etc), therefore our EF Core-generated queries are monstrous.
This simple expression
var mfgsWithProducts = dbContext
.Set<Manufacturer>()
.Include(m => m.Products)
.ThenInclude(p => p.Translations)
.ToList();
results in
SELECT ....
FROM [Manufacturers] AS [m]
LEFT JOIN (
SELECT ....
FROM [Manufacturers] AS [m0]
INNER JOIN [Manufacturers] AS [m1] ON [m0].[Id] = [m1].[Id]
WHERE [m0].[IsDeleted] IS NOT NULL AND ([m0].[CreatedTime] IS NOT NULL AND [m0].[CreatedBy] IS NOT NULL)
) AS [t] ON [m].[Id] = [t].[Id]
LEFT JOIN (
SELECT ....
FROM [Products] AS [p0]
LEFT JOIN (
SELECT ....
FROM [Products] AS [p1]
INNER JOIN [Products] AS [p2] ON [p1].[Id] = [p2].[Id]
WHERE [p1].[IsDeleted] IS NOT NULL AND ([p1].[CreatedTime] IS NOT NULL AND [p1].[CreatedBy] IS NOT NULL)
) AS [t0] ON [p0].[Id] = [t0].[Id]
LEFT JOIN (
SELECT ....
FROM [ProductTranslations] AS [p3]
LEFT JOIN (
SELECT ....
FROM [ProductTranslations] AS [p4]
INNER JOIN [ProductTranslations] AS [p5] ON [p4].[Id] = [p5].[Id]
WHERE [p4].[IsDeleted] IS NOT NULL AND ([p4].[CreatedTime] IS NOT NULL AND [p4].[CreatedBy] IS NOT NULL)
) AS [t1] ON [p3].[Id] = [t1].[Id]
) AS [t2] ON [p0].[Id] = [t2].[ProductId]
) AS [t3] ON [m].[Id] = [t3].[ManufacturerId]
ORDER BY [m].[Id], [p].[ManufacturerId], [p].[Id], [t3].[Id], [t3].[Id1]
this is ridiculous, since we use owned entities without table splitting, therefore we don't need to left join table to themselves
Any news?
@salaros This issue is in the Backlog milestone. This means that it is not going to happen for the 3.1 release. We will re-assess the backlog following the 3.1 release and consider this item at that time. However, keep in mind that there are many other high priority features with which it will be competing for resources.
any workaround?
I have extensive use of Owned entities and the query with left joins on the same table generates time-out because of the complexity. It could be a simple select * but EF generates 10 left join queries on the same table. I tried to use FromSqlRaw but the result is just another join
any workaround?
for now we stopped using owned entities
Because of the extensive use of owned entities, I can't stop using it. As a workaround I built a database view as a plain object and I referenced the view from my context. By now this has been working (the real Order entity has more owned entities than showed):
Entities:
```c#
public class Order{
public string Number{get;set;}
public Person BillTo{get;set;}
public Person InvoicedTo{get;set;}
}
[Owned]
public class Person {
public string TaxID{get;set;}
public string Name{get;set;}
public string Address{get;set;}
}
this is the table:
```sql
create table Orders (
Id varchar(100)
BillTo_TaxID varchar(100)
BillTo_Name varchar(100)
BillTo_Address varchar(100)
InvoicedTo_TaxID varchar(100)
InvoicedTo_Name varchar(100)
InvoicedTo_Address varchar(100)
)
Now, I created a view from my table:
create view VwOrders as select * from dbo.Orders;
And in my dbContext I added the view as an Entity:
public DbSet<VwOrder> VwOrders { get; set; }
And in the builder:
```c#
modelBuilder.Entity
eb.HasNoKey();
eb.ToView("VwOrders");
});
The VwOrder class:
```c#
public class VwOrder {
public string Id {get;set;}
public string BillTo_TaxID {get;set;}
public string BillTo_Name {get;set;}
public string BillTo_Address {get;set;}
public string InvoicedTo_TaxID {get;set;}
public string InvoicedTo_Name {get;set;}
public string InvoicedTo_Address {get;set;}
public Order ToOrder(){
var ret = new Order {
Id = this.Id,
BillTo = createEntity<Person>("BillTo"),
InvoicedTo = createEntity<Person>("InvoicedTo")
}
return ret;
}
}
And the method createEntity
```c#
private T createEntity
var datos = new T();
var myprops = this.GetType().GetProperties().Where(x => x.CanRead).ToDictionary(x => x.Name);
var props = datos.GetType().GetProperties().Where(x => x.CanWrite).ToArray();
foreach(var prop in props) {
try {
prop.SetValue(datos, myprops[prefix + "_" + prop.Name].GetValue(this));
} catch(Exception ex) {
Console.WriteLine($"{ex}");
}
}
return datos;
}
To query, I just made something like:
```c#
var order = bd.VwOrders.Where(x=>x.Id == "xx").AsNoTracking().FirstOrDefault()?.ToOrder();
@salaros
Did you find that the nested joins hurt your query performance? Mine went from nearly negligible (around 1ms) with ef core 2.2 to 1.5 seconds with ef core 3.0 (200k rows in the table).
Did you find that the nested joins hurt your query performance? Mine went from nearly negligible (around 1ms) with ef core 2.2 to 1.5 seconds with ef core 3.0 (200k rows in the table).
Yeap, especially with orderby and global filters, but as they say
Legit generated SQL
I'm planning to create a PR ASAP with a fix for owned entities (without table splitting). Unfortunately EF Core's release policies are very strange, so there is no way to tell if it's gonna make it for 3.1. In general it seems like the main goal is to gradually kill this project.
Legit generated SQL.
Just to be clear for everyone going in tangential direction. My comment says in other words, "yes, we generated that SQL & we should fix it".
so there is no way to tell if it's gonna make it for 3.1
Not just EF Core, any other open source project on github, look at the milestone of the issue tracking any item. The milestone will tell which release it was or will be fixed. This issue is in backlog milestone and it is not going to happen for 3.1. Release 3.1 is already finalized and going through testing phase.
As for submitting a fix for this issue, this issue is not marked as "good-first-issue" hence we believe it is fairly complex issue. We still encourage you to work on it if you wish. But make sure to discuss design of the fix with us first (by commenting in this issue). If you submit PR directly and if that is incorrect fix, we will not accept the contribution.
Inefficient SQL is one thing. But is there a reason for EF to generate non-nullable owned entity property columns as nullable? If I have something like this:
public class Device {
[Key]
public string Id { get; set; }
[Required]
public DeviceBasicStatistics BasicStatistics { get; set; }
}
public class DeviceBasicStatistics {
// long is not nullable, yet the column is generated as nullable
public long ReportCount { get; set; }
}
I would expect ReportCount
to default to 0, and since 0 is not null, BasicStatistics
will always be created. However, BasicStatistics_ReportCount
is generated as nullable, and if the ReportCount
is set to null
for whatever reason, Device.BasicStatistics
is not loaded and remains null, which breaks the expectations.
Is this a separate issue?
@ZimM-LostPolygon See #12100
I think this should be marked as a type-bug instead of a type-enhancement. In fact, as more rows are added to a table, performance progressively degrades to the point it becomes unusable. Users could not be fully aware of this problem; maybe Microsoft should issue an official statement to discourage using owned types in EFCore 3.0.
My model is identical to @matteocontrini's except for the fact it has 2 owned type properties in my entity class instead of just 1. Here's the query generated by EFCore. It's way too complicated: there are LEFT JOINs of subqueries with nested INNER JOINs.
SELECT "t"."Id", "t"."Author", "t"."Description", "t"."Email", "t"."ImagePath", "t"."Rating", "t"."Title", "t2"."Id", "t2"."CurrentPrice_Amount", "t2"."CurrentPrice_Currency", "t1"."Id", "t1"."FullPrice_Amount", "t1"."FullPrice_Currency"
FROM (
SELECT "c"."Id", "c"."Author", "c"."Description", "c"."Email", "c"."ImagePath", "c"."Rating", "c"."Title"
FROM "Courses" AS "c"
WHERE ((@__model_Search_0 = '') AND @__model_Search_0 IS NOT NULL) OR (instr("c"."Title", @__model_Search_0) > 0)
ORDER BY "c"."Rating" DESC
LIMIT @__p_2 OFFSET @__p_1
) AS "t"
LEFT JOIN (
SELECT "c0"."Id", "c0"."CurrentPrice_Amount", "c0"."CurrentPrice_Currency", "c1"."Id" AS "Id0"
FROM "Courses" AS "c0"
INNER JOIN "Courses" AS "c1" ON "c0"."Id" = "c1"."Id"
WHERE "c0"."CurrentPrice_Currency" IS NOT NULL AND "c0"."CurrentPrice_Amount" IS NOT NULL
) AS "t0" ON "t"."Id" = "t0"."Id"
LEFT JOIN (
SELECT "c2"."Id", "c2"."FullPrice_Amount", "c2"."FullPrice_Currency", "c3"."Id" AS "Id0"
FROM "Courses" AS "c2"
INNER JOIN "Courses" AS "c3" ON "c2"."Id" = "c3"."Id"
WHERE "c2"."FullPrice_Currency" IS NOT NULL AND "c2"."FullPrice_Amount" IS NOT NULL
) AS "t1" ON "t"."Id" = "t1"."Id"
LEFT JOIN (
SELECT "c4"."Id", "c4"."CurrentPrice_Amount", "c4"."CurrentPrice_Currency", "c5"."Id" AS "Id0"
FROM "Courses" AS "c4"
INNER JOIN "Courses" AS "c5" ON "c4"."Id" = "c5"."Id"
WHERE "c4"."CurrentPrice_Currency" IS NOT NULL AND "c4"."CurrentPrice_Amount" IS NOT NULL
) AS "t2" ON "t"."Id" = "t2"."Id"
ORDER BY "t"."Rating" DESC
And here's a quick benchmark I performed. The blue line represents a SQL query I typed by hand and the orange line is the query generated by the LINQ provider. As you can see, performance starts degrading very fast as more rows are added to the table. I'm talking about just 2000 rows in a Sqlite database. All needed indexes are in place.
I am experiencing the same problem, was happy to use Owned entities till I realised that I have got 4 left joins to the same table. I am not going to use them, till this is fixed.
If you use Nested owned types it gets a lot worse.
If you extend the model to:
```C#
class Order
{
public int Id { get; set; }
public string Title { get; set; }
public Address Address { get; set; }
}
[Owned]
class Address
{
public string Street { get; set; }
public string City { get; set; }
public PostalCode PostalCode { get; set; }
}
[Owned]
class PostalCode
{
public string Area { get; set; }
public string Zone { get; set; }
}
then
context.Orders.ToList();
produces (postgresql):
```SQL
SELECT o."Id", o."Title", t1."Id", t1."Address_City", t1."Address_Street", t5."Id", t5."Address_PostalCode_Area", t5."Address_PostalCode_Zone"
FROM "Order" AS o
LEFT JOIN (
SELECT t0."Id", t0."Address_City", t0."Address_Street", o3."Id" AS "Id0"
FROM (
SELECT o0."Id", o0."Address_City", o0."Address_Street"
FROM "Order" AS o0
WHERE (o0."Address_Street" IS NOT NULL) OR (o0."Address_City" IS NOT NULL)
UNION
SELECT o1."Id", o1."Address_City", o1."Address_Street"
FROM "Order" AS o1
INNER JOIN (
SELECT o2."Id", o2."Address_PostalCode_Area", o2."Address_PostalCode_Zone"
FROM "Order" AS o2
WHERE (o2."Address_PostalCode_Zone" IS NOT NULL) OR (o2."Address_PostalCode_Area" IS NOT NULL)
) AS t ON o1."Id" = t."Id"
) AS t0
INNER JOIN "Order" AS o3 ON t0."Id" = o3."Id"
) AS t1 ON o."Id" = t1."Id"
LEFT JOIN (
SELECT o4."Id", o4."Address_PostalCode_Area", o4."Address_PostalCode_Zone", t4."Id" AS "Id0", t4."Id0" AS "Id00"
FROM "Order" AS o4
INNER JOIN (
SELECT t3."Id", t3."Address_City", t3."Address_Street", o8."Id" AS "Id0"
FROM (
SELECT o5."Id", o5."Address_City", o5."Address_Street"
FROM "Order" AS o5
WHERE (o5."Address_Street" IS NOT NULL) OR (o5."Address_City" IS NOT NULL)
UNION
SELECT o6."Id", o6."Address_City", o6."Address_Street"
FROM "Order" AS o6
INNER JOIN (
SELECT o7."Id", o7."Address_PostalCode_Area", o7."Address_PostalCode_Zone"
FROM "Order" AS o7
WHERE (o7."Address_PostalCode_Zone" IS NOT NULL) OR (o7."Address_PostalCode_Area" IS NOT NULL)
) AS t2 ON o6."Id" = t2."Id"
) AS t3
INNER JOIN "Order" AS o8 ON t3."Id" = o8."Id"
) AS t4 ON o4."Id" = t4."Id"
WHERE (o4."Address_PostalCode_Zone" IS NOT NULL) OR (o4."Address_PostalCode_Area" IS NOT NULL)
) AS t5 ON t1."Id" = t5."Id"
Indication of performance problem (tested using different model, but equivalent), using a table with 40,000 records:
Efcore 3.1 query: 500 ms
Manual query: 100 ms.
If you use a Where filter, the performance difference gets a lot bigger. A filter selecting only 2 records (using index) from the table:
Efcore 3.1 query: 280ms
Manual query: 1ms.
This makes the owned entity with table splitting feature not useful in practice.
Damnit, I found out about Owned types and really thought it was great. I implemented it in a table where I have ~50 properties that are owned types, and the query generated for a simple context.EntityDbSet.FirstOrDefault()
has over 330 lines of code and 57 LEFT JOIN! :( 馃憥
The manual query I would have written would have been probably a single liner for this simple scenario... that's a real bummer and should be noted in the official documentation about Owned Entity Types
Moreover, I just tested another small variation and it seems to mess up even more, if I use a variable as a param in a Where
or a FirstOrDefault
call like .FirstOrDefault(x => x.Id == variableParam)...
, the resulting query still contains the 57 Left Join... AND the result set does contain only null values for the owned types... this is really bad lol
@os1r1s110 The strangest thing is that this "feature" made it into .NET Core 3.1, which is LTS, even though this bug has been reported back in Oct 2019
Legit generated SQL
Yeah, sure, but only if owned entities live on a separate table.
What is trully amazing, is the fact that this issue is not considered to be a "bug" but an "enhancment", giving the imperssion that currently EF works as expected, in an LTS version. I would really like to hear from the team the criteria that qualify an issue as a bug.
To be fair they have a lot of work done on EF Core and it's mostly really good, this one might have fell into a crack and I just hope they will get a chance to revisit to make this feature useful :)
I just saw that the principal interested is in vacation, so let's not hope for a quick response here.
I don't know who else could be looking at it ...
@os1r1s110 I trully respect the work and effort put on EfCore, and I also accept the fact that there might be other issues with higher priority. I don't judge anybody, I am just curious about what qualifies as a bug and what not. To me, it's crystal clear that this behaviour is not working as expected, it's definatelly a bug.
This is the worst sort of bug - a latent bug that lulls you into a false sense of security that you are doing the right thing and everything is OK until BANG! Your queries are taking 2 minutes to execute and your system grinds to a halt.
Discuss with @smitpatel and @AndriySvyryd
I have the same problem, query is overly complicated, moreover result is incorrect.
Steps to reproduce:
Entities:
```C#
public class CarDto : IDto
{
public Guid Id { get; set; }
public string Name { get; set; }
public string Description { get; set; }
public EngineDto Engine { get; set; }
public Guid LanguageId { get; set; }
}
public class EngineDto
{
public Guid Id { get; set; }
public string Name { get; set; }
}
Model configuration:
```C#
modelBuilder.Entity<CarDto>().Property(q => q.Id).HasColumnName("CarId");
modelBuilder.Entity<CarDto>().Property(q => q.Name).HasColumnName("Name");
modelBuilder.Entity<CarDto>().Property(q => q.Description).HasColumnName("Description");
modelBuilder.Entity<CarDto>().OwnsOne(q => q.Engine, x =>
{
x.Property(q => q.Id).HasColumnName("EngineId");
x.Property(q => q.Name).HasColumnName("EngineName");
});
modelBuilder.Entity<CarDto>().ToView("vCars");
The database view looks like this:
The data looks like this:
The query in LINQ looks like this:
var data = _db.Set<CarDto>().Where(q => q.LanguageId == languageId).ToList();
This query produces SQL, which looks like this:
SELECT [v].[CarId], [v].[Description], [v].[LanguageId], [v].[Name], [t].[CarId], [t].[EngineId], [t].[EngineName]
FROM [vCars] AS [v]
LEFT JOIN (
SELECT [v0].[CarId], [v0].[EngineId], [v0].[EngineName], [v1].[CarId] AS [CarId0]
FROM [vCars] AS [v0]
INNER JOIN [vCars] AS [v1] ON [v0].[CarId] = [v1].[CarId]
WHERE [v0].[EngineId] IS NOT NULL
) AS [t] ON [v].[CarId] = [t].[CarId]
WHERE [v].[LanguageId] = @__languageId_0',N'@__languageId_0 uniqueidentifier',@__languageId_0='9EA1AD19-2C42-4755-B837-701C39E41D37'
and result is:
There are eight rows, sholud be only two, like that:
Any update on when this is going to be resolved?
@thijscrombeen We're investigating options for releasing before 5.0.
Update on this: we spent a long time figuring out if we could fix this in a patch release with sufficiently low risk. A complete fix is not looking very feasible, but we're still looking at tactical fixes for some cases.
Putting this in 3.1.x for that work. /cc @smitpatel
How comes it got broken? Is it going to be actually re-done completely as whatever was working before is not fixable anymore?
@VaclavElias - It is not broken. It is altogether different thing. In previous version of EF Core, owned entities were required hence it generated simpler SQL. Due to #9005, they are now optional, hence we need to add more checks in SQL making it complicated to make sure we get correct results from server back. In order to go back to previous version's behavior fully, #12100 is required and you would need to configure the model according to that.
Filed #19932 for the fix which was added to the patch.
Support for owned entities was introduced in EF Core 2.0. At that time we received significant feedback that forcing owned entities to be required was very limiting. Based on this we made owned relationships optional in EF Core 3.0.
The concept of what constitutes an "optional owned entity" is nuanced. (Or maybe it isn't, but I for one have had trouble getting my head around what it really means, both in terms of behavior and mapping.) Regardless, in adding the flexibility to allow owned entities to be optional, we didn't appropriately take into account the degraded queries produced in cases where the owned entity does not need to be optional.
So, while maybe the people asking for optional owned entities are happy, all the people who didn't need them to be optional are now seeing much worse queries.(1)
We should not have _replaced_ required owned entities with optional owned entities. We should have kept required owned entities and allowed optional owned entities to be configured. (We could also have changed the default as long as it was possible to get back to the old behavior.)
So why didn't we do this here? Because:
So, in retrospect, we should have punted optional owned entities for 3.0 and tackled in a future release when we could at the same time keep supporting the old behavior.
We have been investigating what we can do to improve these queries in a 3.1.x patch release. For example, see #19932.
Unfortunately the interaction between the model shape and the query pipeline is not making it easy to find tactical fixes that are suitable for patching. We will continue to pursue this, but it won't fix all the queries that were degraded by this change.
We're scheduling support for _both_ optional _and_ required owned entities in 5.0 for November. See #12100. We realize that November is a long time to wait. We are working hard to ensure our daily builds and previews are high-quality, so this may be an option for some people.
This kind of retrospective analysis is part of our ongoing development process. We made mistakes here, but as always we're learning and will feed these lessons into future design and planning. As always, we welcome constructive feedback on anything here. If you have other ideas for things we could do here, then please let us know!
(1) I usually refer to this as a "grass is greener" scenario. That is, given a bunch of people using something, those who are unhappy generally make a lot more noise than those who are happy. If all these people say, "we need it this way instead!" then it can start to seem, psychologically, that we made the wrong choice. So because _everybody_ is saying that the grass is greener over there, we believe this and go with it.
But this is obviously flawed. Because the people happy with the current grass _aren't being vocal about it_. So it's not really _everybody_ at all.
In cases like this we sometimes only really realize that we made a mistake when we've already jumped the fence and now _everybody_ is saying, "Hey! The grass back where we came from was much better!"
@ajcvickers Thanks for the clarification/update.
I would have a question for you though. As I understand it, you seem to say it's one or the other (non optional owned entities with good queries, or optional owned entities with the bad queries....) Is that really mutually exclusive?
We thought that optional owned entities was an important feature to deliver. (This may still be true--since we did deliver it, we're not seeing feedback from anyone who would have said something if we hadn't done it.)
I for example, am one of the people who think that optional owned entities are an important feature (really important IMO) as I want to use it extensively.
A simple example for my case is to save test results for a given entity, on which not all tests are mandatory (depending on if the user has selected it or not). Instead of saving the generated report as a PDF, I save it in a table with owned entities (one for each test result, which embeds some other meta-data about the test and result). That wouldn't be possible at all if all owned entities were required. Now I wonder though why it isn't possible to have optional owned entities AND generate reasonable queries. In my use case, I could easily select all the owned entities in a single liner SELECT * FROM <table> where Id = x
without requiring complex join queries if I wrote the query manually, but if I rely on EF Core to generate it, it does effectively create a really not optimal query as already mentioned... Wouldn't there be a way to generate these simple queries ?
TBN: I am super greatful for all the work that's been done in EF Core and I really don't want this to be taken as a "complaint" or anything, I'm legitimately trying to understand what prevents the query from being as simple as mentioned above when using a LINQ query like the following: _dbContext.TestResults.Where(x => x.Id == <id>).FirstOrDefault()
Thanks in advance!
@os1r1s110 I'll talk to the team and check on the technical details. You may be correct, but I expect if that's the case, it still requires significant changes to the model/query pipeline equivalent to supporting both. (We will, of course, strive for good queries for both cases.) This is also complicated by the difference between owned entities sharing a table (table splitting) and owned entities mapped to their own table.
(I'm very much acting as a manager here. :-) @AndriySvyryd and @smitpatel understand the complications much better than I do.)
Does "overly complicated" also mean bad performing?
Does "overly complicated" also mean bad performing?
This thread contains several benchmarks, showing how query performance degrades with each new .Include()
. In my case I had the same owned type on almost all my entities (auditlog for tracking who and when created/last modified the those entities).
EF Core-generated queries were so slow on .NET Core 3.1 LTS that I had to completely change my data model I've been successfully using since .NET Core 2.0 through all the updates.
Thank you for the additional context @ajcvickers - this is obviously a complicated issue for your team and I'm sure everyone here appreciates the effort you are putting in.
I'm sure there is a really good reason why you can't do this, but the nieve solution would be to push the optional processing up into c# land?
Eg, this is similar to what you are generating at the moment:
SELECT t1.a, t2.owned_b, t2.owned_c
FROM SomeTable t1
LEFT JOIN (
SELECT t.id, t.owned_b, t.owned_c
FROM SomeTable t
WHERE t.owned_b is not null and t.owned_c is not null
) t2 on t1.id = t2.id
I think the whole point of that complexity in the query to say "if any property is null in the db, make the entire owned object null"
I would think you could achieve the same thing with
SELECT t1.a, t1.owned_b, t1.owned_c
FROM SomeTable t1
and some c# filtering.
I don't mean to tell your team how to "suck eggs" - obviously you know a lot more about this problem space than we do, but maybe sharing why this is so difficult would help us understand, as I'm sure I'm not the only one wondering this.
I'm talking with a team that has SAME modeling as Dec 24 @msneijders (https://github.com/dotnet/efcore/issues/18299#issuecomment-568738570,) multiple levels of owned entities...and a simple query generating same monstrous SQL. After looking this over with them I agree with their solution to just use Dapper for the horribly performing queries. They do want to use and stay with EF Core, so maybe someday they'll be able to switch back. This also HUGELY impacts my conversation with devs all over the world about DDD & EF Core so I look forward to the fix. _Finally, please add this to the breaking changes page in the docs._
@salaros
Did you find that the nested joins hurt your query performance? Mine went from nearly negligible (around 1ms) with ef core 2.2 to 1.5 seconds with ef core 3.0 (200k rows in the table).
Me too. Started getting timeouts on entities with many owned entities.
We are in the middle of a large DDD project using EF Core. We have recently upgraded to 3 and all our developers are complaining about the SQL generated from this issue. It has a massive impact.
Optional owned types is a nice to have for us, and I was excited for this feature, but at least that limitation had workarounds. The queries generated now for our required owned types (the majority) are not great, creating views is a desperate solution and certainly not logical for everything, So we are going forge on and hope/pray that there is a fix soon.
@brenwebber conversation I just had with a client last week and this is their plan: if your EF calls are not already isolated/encapsulated and you have time/resources, refactor to isolate queries in their own repos/classes (always good to have that stuff separate from biz logic anyway). Find the queries that are causing pain with the change and separate them out further then switch them all to Dapper. (Not saying "all queries for app to be in one class" just standard separation of concerns to break things apart. At some point, when this is sorted out, it will be easier to switch those back to EF...if you want to. When I'd suggested EF Core/Raw SQL/Views they said that they'd done some comparisons and found Dapper, in many cases, to provide better perf that was significant enough to stick with Dapper. Also guessing you are doing views mapped to entities, not keyless entnties because those have limitations around owned entities. I need to write a blog post but haven't had a chance yet. Hope this is useful and not stating the obvious.
I am still holding out for a fix in 3.1.3. We still have some time before go live and the UAT performance (under low loads) is still fine.
I will re-evaluate in a month or 2 and if there is still no resolution, creating some Dapper based query services sounds like a better work around. Thank you @julielerman, I appreciate the advice.
@brenwebber - This is not going to be fully resolved in 3.1 release. Too risky change to put in a patch.
cc: @ajcvickers
I'm having the same issue as our fellow friends here. I'm moving most of my queries to Dapper. To be honest. once I finish, I'm pretty sure I'm not gonna change it back to EF. It's too much work.
FWIW, I haven't personally compared perf of EF Core/Views/Raw SQL to Dapper. I played with it back with EF6 but not since. I'm definitely curious and I know that team I talked with was going to do some comparisons for their own queries.
This problem should not be treated as an improvement, but treated as a bug, because performance is an essential concern in an ORM. The simplest example highlights this bug, then indicates that the feature was developed, but has not been tested. Until resolved, Owned Types must be a feature NOT recommended in the official documentation.
In addition to @rafaelfgx comments on performance being an essential concern for a CRM, we are not talking about it being 20% slower or even 500% slower -we are talking about a bug which can easily cause an entire system to crash.
@smitpatel I would be very happy with a "partially" resolved / workaround / opt-in solution (using decorators or a linq extension etc.). We would prefer to stick with a single ORM as far as possible, and we still have some time before this becomes a huge issue for us, but more queries are being written every day so we are kind of banking on a solution in the next 6 months.
It's trully amazing!
After 5 months, 21 participants in this conversation, graphs that prove the degrade of performance, too many comments to count but this issue is still not considered a bug! I don't care if _the grass is green or not_, but I was under the impression that producing a valid and decent query to fetch data, is not an "enhancement".
Maybe I'm wrong...
My projects has been upgraded 2.2 to 3.1,
so embarrassment, now whether it's publish or waitu for Net5 for 7months!
We've reclassified this as a bug, but as @smitpatel said it's still too risky to fix in a patch release.
@AndriySvyryd - what sort of timeframe are we looking at for resolution of this bug?
@AndriySvyryd We've reclassified this as a bug, but as @smitpatel said it's still too risky to fix in a patch release.
Thank you, I personally think it was a good decision.
Please, also update the documentation and/or write a blog post to let more people know about this issue. Unfortunately, it looks like there are long months ahead of us before a fix is released, despite your best (and most appreciated) efforts.
If you don't, some developers might not notice until they go to production and with enough data they'll watch their application slow down to a crawl. After so many comments, it seems clear now that people are wasting time because this issue was not addresses properly.
I agree with @MorenoGentili that documentation must be updated and a post on devblog to notify all developers about this performance bug
I'm also still looking at other potential workarounds that are not as drastic as writing raw queries. I have ideas, but it'll probably be the weekend before I get enough coding time to try them out.
(@amrbadawy wrong mention 馃槉)
@AndriySvyryd - what sort of timeframe are we looking at for resolution of this bug?
@davidames It will be fixed before November, but it's hard to say in which 5.0 preview it would be included.
@ajcvickers given that you're doing it over the weekend, let me know if I can help...guinea pig or something. I'm still sick so not getting out much anyway.
@MorenoGentili @amrbadawy are either of you able to either add an issue via the comments on the bottom of the doc's page or even make the change and submit a PR?
My team also experienced a pretty drastic performance degradation after our DB size grew over a certain threshold. FWIW, we were able to mitigate the performance impact with the help of our DBA; we re-built the statistics on our table to help the query planner. We also added a maintenance task to regularly rebuild our statistics.
More info about statistics and its relationship with query plans in this article: https://www.red-gate.com/simple-talk/sql/performance/sql-server-statistics-basics/
@julielerman I added a comment as you suggested. However, I won't send a PR since I'm not sure what the proper wording would be in this case.
Our team is also totally blocked from upgrading by this issue.
It looks like we will create our own fork where we rip out all the table splitting stuff that we don't use anyway and just fix the queries. Anyone done anything similar already?
Generated SQL after fix for query in OP
SELECT [o].[Id], [o].[Title], [o].[Address_City], [o].[Address_Street]
FROM [Order] AS [o]
WHERE [o].[Address_City] = N'Rome'
I can appreciate that this will be in EFCore 5. But that leaves those of us stuck on a .NET Framework project stranded with this incredibly poor-performing bug. Is there really no way to include this in a patch release for v3? Or is there no workaround? I tried to work around it myself by using FromSql, hoping that I could tell it "relax... I'm writing the core SQL for the data" but it just wraps my SQL in the same horrible LEFT JOIN syntax.
I make heavy use of value objects and complex owned entities. Right now my query is generating 114 LEFT JOINS to itself and takes 3 minutes to run! There has to be a way around this in v3! Saying it's a v5-only fix just doesn't cut it :(
@smitpatel any thoughts? Could your latest commit be patched into v3? Even if not offically released... I'll build it myself and use a local nuget repo if I have to! I'm desperate.
Can we make a little extension from the patch and toss it on nuget?
please fix this issue in EF Core 3.x to allow us use it without a huge of changes
@smitpatel any thoughts? Could your latest commit be patched into v3? Even if not offically released... I'll build it myself and use a local nuget repo if I have to! I'm desperate.
I will let @ajcvickers respond to possibilities other than patching. For the patching, it is not possible to patch this. Even though there is 1 commit directly linked to this item, there has been several commits went in to lay down infrastructure to make this change. Some of them involved breaking changes. Further, it touches how EF Core materialize entity, which in itself is a core path for query and too risky for a patch to change.
that makes sense, @smitpatel. I still vote for the temporary solution of insulating the relevant queries and either using raw sql or dapper for now. (with perf tests to compare those options if the diffs are significant enough to care in your solution.)
FWIW today I did a temporary workaround. Most of my 114 queries are value objects of Money, Percentage, Count, EmailAddress, ZipCode, etc. In other words, one-value value objects.
I was mapping them like like this:
builder.OwnsOne(m => m.OwnerPersonalEmail).Property(e => e.Value).HasColumnName(nameof(Recruit.OwnerPersonalEmail));
builder.OwnsOne(m => m.OwnerPreferredPhoneNumber).Property(e => e.Value).HasColumnName(nameof(Recruit.OwnerPreferredPhoneNumber));
Because they're all single-value value objects, that means they map to a single field in the DB, too. So I rewrote them using Value Converters like this:
builder.Property(m => m.OwnerPersonalEmail).HasConversion(EmailAddressValueConverter.Instance);
builder.Property(m => m.OwnerPreferredPhoneNumber).HasConversion(PhoneNumberValueConverter.Instance);
Property instead of OwnsOne. And HasConversion. The converters are pretty basic. Inline, they would look something like: HasConversion(phone => phone == null ? null : phone.Value, str => str == null ? null : new PhoneNumber(str))
Worth noting there's some risk in doing it this way that wasn't risky in the old way. That is, if my rules for PhoneNumber changes, or if I get some bad data in the DB, the old deserialization process from EF would still use the private setter to set the Value property, whereas now it's using the public constructor. If this presents a problem in the future, then I'll have to go the unfortunate route of using reflection to construct my object with my private parameterless constructor, and reflection to set the property value.
With that said, I'm REALLY REALLY hoping we get a patch for V3. 3 Minutes to run a query and 114 self joins... that's gotta be a patch for EF3. This saved me this time (I still have EIGHT self joins). But some objects won't be so easy for me to do.
If this doesn't get patched with V3, I might be looking at going to NHibernate (really don't want to have to rewrite all my configurations to NH!). This project isn't fully CQRS, so I don't feel like going Dapper for reads and slow, complex EF loads for writes is going to be a good solution here, nor is eager loading the entire object graph with Dapper. I need a DDD-ready, lazyloading ORM.
Hope this post helps others in similar situations.
Hi @grexican,
interesting. Thanks. There's an interesting discussion of value objects and value converters in this thread https://github.com/dotnet/efcore/issues/13947 if you haven't seen it yet.
Also, maybe your model specifies this but I'm curious about 'eager loading a graph' to get VO properties. Are you storing them in separate tables?
And re "lazy loading", are you lazy loading those properties?
Not challenging you, just curious. There are a lot of variables to consider. :)
Here's an update on where we are with this at the release level. Unfortunately:
So we're now getting this fixed in the master branch as soon as possible. This will mean:
As soon as the fix is in the daily builds, we will post here and would love people to start testing it. We need to make sure that we understand:
So then if we get this in a daily build/preview with enough testing and validation from customer cases, we will revisit the risk/overhead associated with getting this out as a fully supported release verses having people use 5.0 previews. (We have been brainstorming ways to release more than a preview, and will continue to do so, but I'm not promising at this time that this will happen.)
A note on daily builds. EF Core 3.0 was not normal in terms of the internal churn and state of the daily/preview builds. In EF Core 5.0, we are back to a normal dev flow where we keep all the tests passing on each build. This means that we expect the daily builds to be very stable. Of course, there will be bugs because we're writing a lot of new code. But the good thing about the daily builds is that we often have the ability to fix blocking bugs and get them out in a matter of days.
A note on 5.0 previews. We're planning on shipping these frequently. However, this is coordinated across .NET, and preview 2 requires more coordination than usual. This means the window for preview 2 changes has now closed, and the fix for this will likely make it into preview 3. This is another reason to try the daily builds and not wait for the preview.
@smitpatel @ajcvickers Sorry if this is a silly question but I noticed that this item is mentioned in #19932 which was closed and set in the 3.1.3 patch as a closed issue? Does this mean it should be fixed in 3.1.3? Thanks
@absilver808 We fixed some cases, but not the majority. Read through the discussion here to understand more.
@ajcvickers reading Smit's description in #19932 I'm trying to discern which cases are fixed. "When owned navigation is defined on base type " : does this mean when there is an explicit navigation mapping? I only tested ONE scenario that I already have in place....a simple class with one property that is an owned entity. THere is only the ownsone mapping, no navigation mappings. And the query doesnt change with 3.1.3. When I added a navigation mapping (following examples in the breaking changes doc) there was still no change. Is there an easy list of what scenarios might be covered with this little fix? (I've read the discussion and can't tell, so I"m sorry if I missed it...) Thanks
@absilver808 @julielerman #19932 removes the INNER JOIN
for owned types defined on the base entity type when the query selects the base type and all the derived types.
@AndriySvyryd ahh when inheritance is involved... got it. Glad you were at least able to find some places where you could fix it pre-EFCore 5.
We believe this issue is fixed (except maybe some edge cases) in EF Core 5.0 preview 3, which is now live on NuGet. See the announcement post for full details.
Please, please try this preview and let us know if you're still running into issues so that we can address any remaining issues in future 5.0 previews.
Note that .NET 5 is not required to use EF 5.0. You can try it with your .NET Core 3.1 applications!
This is fixed in the latest .net 5.0 preview.
Is there any chance the fix in EF 5 to be back ported into EF core 3.1? With these joins Owned entity types isn鈥檛 usable at all with EF core 3.1. And since .NET core 3.1 is a LTS version we can鈥檛 really think of upgrade until the next LTS version becomes available.
I know this is late, but maybe consider getting this listed as a breaking change on the EF Core 3 docs?
We're having roughly the same setup as in https://github.com/dotnet/efcore/issues/18299#issuecomment-580196697, and simply calling
_dataContext.Users.Find(userId);
where Users
own another entity and userId
is a primary key, will throw an Invalid operation exception stating that the sequence contains multiple elements (which is obviously nonsense as that can't happen for primary keys, and also isn't the case in the database).
Going from Find(key)
finding the key to throwing an exception for a key that only exists once in the database seems like a textbook definition of a breaking change.
@zornwal - If you are hitting an exception then your issue is not related to this one. There is no error thrown if you are hitting this issue.
@Vake93 - This cannot be backported to 3.1 release.
@smitpatel Alright we managed to find out what was going on, and it seems at least tangentially related to the issue.
We had a base class with a collection of owned entities, from which two other classes inherited. At some point, the collection was moved down into one of the inheriting entities instead, without removing the entries referencing the now collection-less other class. This led to querying the dbset of the base class failing with the mentioned exception when the entity in question was the collection-less one, but had entries remaining in the database. Don't know if this is intended, but it seems pretty obscure anyway. Might be worth checking if issue persists on 5.0 of it isn't intended.
@zornwal - Restating, if you are running into this issue then you will _not_ get any exception. If you are seeing an exception then your issue is not same as this. Please file a new issue with detailed repro.
FullText search is broken because it's being run on the joined data and it's throwing an exception
Cannot use a CONTAINS or FREETEXT predicate on column ......
[t0].[Property_14]
and [t0].[Property_13]
should be [j].[Property_14]
and [j].[Property_13]
for the problem to be fixed.
SELECT [j].[Id]
FROM [TABLE_A] AS [j]
LEFT JOIN (
SELECT [j0].[Id], [j0].[Property_1], [j0].[Property_2], [j0].[Property_3], [j0].[Property_4], [j0].[Property_5]
FROM [TABLE_A] AS [j0]
WHERE [j0].[Property_1] IS NOT NULL
) AS [t] ON [j].[Id] = [t].[Id]
LEFT JOIN (
SELECT [j1].[Id], [j1].[Property_6], [j1].[Property_7], [j1].[Property_8], [j1].[Property_9], [j1].[Property_10], [j1].[Property_11], [j1].[Property_12], [j1].[Property_13], [j1].[Property_14], [j1].[Property_15], [j1].[Property_16]
FROM [TABLE_A] AS [j1]
WHERE [j1].[Property_11] IS NOT NULL
) AS [t0] ON [j].[Id] = [t0].[Id]
WHERE (([j].[Property_22] <> CAST(1 AS bit)) AND (DATEADD(day, CAST(10.0E0 AS int), [t].[Property_1]) > CONVERT(date, GETDATE()))) AND (((([j].[Property_22] <> CAST(1 AS bit)) AND ([j].[State] = 3)) AND (FREETEXT([t0].[Property_14], 'testsearch') OR FREETEXT([t0].[Property_13], 'testsearch'))) AND (DATEADD(day, CAST(10.0E0 AS int), [t].[Property_1]) > CONVERT(date, GETDATE())))
@MoazAlkharfan Please open a new issue with a small repro project
@smitpatel @ajcvickers
When I use FromSqlRaw, it still generates complex and inefficient sql
db.LiveHistory.FromSqlRaw(@"select * from ""LiveHistory""").Select(a => a.Data.RoomId).FirstOrDefault();
SELECT t."RoomId"
FROM (
select * from "LiveHistory"
) AS l
LEFT JOIN (
SELECT l0."Id", l0."RoomId"
FROM "LiveHistory" AS l0
INNER JOIN "LiveHistory" AS l1 ON l0."Id" = l1."Id"
) AS t ON l."Id" = t."Id"
LIMIT 1
db.LiveHistory.Select(a => a.Data.RoomId).FirstOrDefault();
SELECT l."RoomId"
FROM "LiveHistory" AS l
LIMIT 1
@darkflame0 - When you use FromSql*
API on an entity type which has owned entities, we cannot simplify the SQL easily. EF Core does not parse the SQL to look into it what you are selecting from. The owner entity can be coming from a stored procedure or a view or a custom projection. In neither of those case we can assume that owned entity will also come from there only. Even with an example like yours, instead of select *
if you specified all list of columns and did not select columns for owned entities, it wouldn't work without a join. So if we tried to pull owned entity columns, it will be invalid SQL error. Hence on the safer side we generate a join. We are tracking in #21888 to improve experience around this.
Most helpful comment
I think this should be marked as a type-bug instead of a type-enhancement. In fact, as more rows are added to a table, performance progressively degrades to the point it becomes unusable. Users could not be fully aware of this problem; maybe Microsoft should issue an official statement to discourage using owned types in EFCore 3.0.
My model is identical to @matteocontrini's except for the fact it has 2 owned type properties in my entity class instead of just 1. Here's the query generated by EFCore. It's way too complicated: there are LEFT JOINs of subqueries with nested INNER JOINs.
And here's a quick benchmark I performed. The blue line represents a SQL query I typed by hand and the orange line is the query generated by the LINQ provider. As you can see, performance starts degrading very fast as more rows are added to the table. I'm talking about just 2000 rows in a Sqlite database. All needed indexes are in place.