I like the Include().ThenInclude() syntax of EF Core. It would also be great if there was an "AlsoInclude" so that when you want to include another peer several levels deep, you don't have to start at the top (with Include) again. You start getting these really long statements with many include, then include, then include (back to) include, then include, then include, then include, etc. Not a big deal or nothing. It works as is, but syntactically could certainly be more readable if we had a peer level include (ie, whatever entity was "above" the last ThenInclude). Does that make sense?
I don't see how can it be simplified. Mind writing a small example?
Take a look at this example. Note that this example is fairly small to elucidate the need, but a real world example could be more complex depending on your needs.
_dbContext.Things.Include(x => x.Level1Property)
.ThenInclude(y => y.Level2Property)
.ThenInclude(z => z.Level3Property)
.ThenInclude(a => a.Level4Property_A)
.Include(x => x.Level1Property)
.ThenInclude(y => y.Level2Property)
.ThenInclude(y => y.Level3Property)
.ThenInclude(z => z.Level4Property_B)
.Include(x => x.Level1Property)
.ThenInclude(y => y.Level2Property)
.ThenInclude(y => y.Level3Property)
.ThenInclude(z => z.Level4Property_C)
This example, the 4th level entity has 3 nav properties that we want included. With the current Api, you need to start all the way back at the top for each of these, correct?
With the "AlsoInclude" syntax, it would look like this.:
_dbContext.Things.Include(x => x.Level1Property)
.ThenInclude(y => y.Level2Property)
.ThenInclude(z => z.Level3Property)
.ThenInclude(a => a.Level4Property_A)
.AlsoInclude(z => z.Level4Property_B)
.AlsoInclude(z => z.Level4Property_C)
See how the AlsoInclude would reference the same parent that then ThenInclude before it referenced.
One issue I see with this is that it only allows you to go back up one level, so it only helps with a certain subset of cases. I'm not saying we shouldn't add it, just that it doesn't prevent you having to go back to the root entity in all cases. There may be another solution where you do some kind of nesting inside the API call... rather than just chaining calls together at the root level (something like the nested closure pattern we use elsewhere). That may end up being super complex though and the code may be unreadable :smile:.
In order to make something like AlsoInclude()
work we would need the ability to backtrack one level (as @rowanmiller mentioned) and I believe that means we would need ThenInclude()
to return a new version of IIncludeableQueryable<>
that would carry the type from the previous-to-last navigation property in addition to the query's root type and the last navigation property's type.
For completeness here is how one of the other variations previously requested would look like for this example:
``` C#
_dbContext.Things.Include(x => x.Level1Property)
.ThenInclude(y => y.Level2Property)
.ThenInclude(z => z.Level3Property)
.ThenInclude(a => new
{
a.Level4Property_A,
a.Level4Property_B,
a.Level4Property_C
});
Notice that the last call to `ThenInclude()` lands on a type that is not part of the model, so would it make sense to be able to continue chaining more calls to `ThenInclude()`s after that? :fearful:
All these approaches have in common that they require any edge added to the graph of entities to be included to have its starting point on either the query's root type or on the landing type of a previous call to `Include()` or `ThenInclude()`.
Besides specific limitations such as https://github.com/dotnet/roslyn/issues/8237 this has the potential to provide a very nice intellisense experience for simple queries. But beyond certain threshold of complexity intellisense starts getting in the way instead of helping.
Assuming we want to support simple patterns that allow expressing more complex queries in the future, I think we should look into approaches that remove the restriction that edges need to start on a previous type and allow developers to enumerate the edges that need to be traversed without following any particular order, e.g.:
``` C#
_dbContext.Things.Include(x => x.Level1Property)
.ThenInclude(y => y.Level2Property)
.ThenInclude(z => z.Level3Property) // Level3Property is of type Type3
.Include((Type3 t3) => t3.Level4Property_A)
.Include((Type3 t3) => t3.Level4Property_B)
.Include((Type3 t3) => t3.Level4Property_C);
Like AlsoInclude()
this removes the need to start over on each path because in fact on each call you can describe an edge that starts on any type. It does requires specifying the starting type for the edge but as a way to express the shape of the graph is more flexible, e.g. it allows requesting a navigation property that only exists on a derived type to be included in the results.
I remember two examples that followed this approach:
DataLoadOptions.LoadWith()
Edge<T1,T2>()
method that was long ago suggested to us (see the EF6 issue and the original article on it). Both used an object that is separate from the query to describe the shape of the graph but I think just adding the edges inline in the query could also work.
In fluid interfaces there is this concept of Mark
and Return
. I'm not too sure if it is more readable or not:
_dbContext.Things
.Include(t => t.Level1Property)
.ThenInclude(t => t.Level2Property)
.ThenInclude(t => t.Level3Property)
.Mark()
.Include(t => t.Level4Property_A)
.Include(t => t.Level4Property_B)
.Include(t => t.Level4Property_C)
.Return(); // optional if it is the last instruction
Perhaps with an expression?
_dbContext.Things
.Include(t => t.Level1Property)
.ThenInclude(t => t.Level2Property)
.ThenInclude(t => t.Level3Property)
.Mark(
lev3 => lev3
.Include(t => t.Level4Property_A)
.Include(t => t.Level4Property_B)
.Include(t => t.Level4Property_C)
);
I like the Mark / Return syntax myself, and being consistent with other apis would be good.
I wonder why should we have API that manually resolve include duplicates.
By the way once you have joined table in SQL you can work with it several times (obviously) so technically it's more than just API change.
Currently I'm working on to resolve it and do not touch API.
I'm doing includes tree instead of current flat structure.
Will post PR shortly, maybe next week.
We use a path definition API for includes / eager loading in llblgen pro: the main advantage is that you work with nodes to a graph to which the user can then add additional things, like filters / joins / excluding of fields / limits / ordering etc. for that node. We have a couple of variants on this api, here's one built with extension methods which under the hood translates to the more verbose API we have in the framework https://github.com/SolutionsDesign/LLBLGen.Linq.Prefetch. The tests give a good overview of what you can define at what level. This gives no limits on defining the graph and you can still do it from one method call. Additional documentation of our own API: http://www.llblgen.com/Documentation/5.0/LLBLGen%20Pro%20RTF/Using%20the%20generated%20code/Linq/gencode_linq_prefetchpaths.htm
@FransBouma what stops you from Join / Order / Filter / Limit within the basic pattern of Include()/ThenInclude() + LINQ operations?
When you write SQL you join table and can do everything withing your joined table.
Why do I need to specify prefetch, sub paths and a lot of stuff in code directly? How do I read this code couple months later? And how SQL knowledge could help me here with LLBLGen?
From my standpoint Include()/ThenInclude() is just smart joins that could attach data into your object, and when you use filtering or limiting it should perform this without any additional work and even more easy than write this directly in SQL. Isn't this why we all using ORMs?
I worked with LLBLGen a wile ago and it was pretty painful comparing to Entity Framework.
You have to write a lot of code and this result of your work couldn't be easily understood by your team because complex queries will look more complex than generated SQL queries...
Entity Framework Core gives more abilities to generate queries on the fly with Expression Trees so you can feel free to create dynamic query caching, partial query caching and more flexible way to synchronize data over several applications with distributed cache.
You actually can parse any of the EF Core targeted expressions manually and this isn't that painful and this is huge advantage when you can speak with ORM with the powerful Expression Trees instrument.
@FransBouma https://github.com/FransBouma what stops you from Join / Order / Filter / Limit within
the basic pattern of Include()/ThenInclude() + LINQ operations?
It doesn't if you can start from one Include().
When you write SQL you join table and can do everything withing your joined table.
Why do I need to specify prefetch, sub paths and a lot of stuff in code directly? How do I read this
code couple months later? And how SQL knowledge could help me here with LLBLGen?
Eager loading is a graph, e.g. when you have a split-graph from one root, which is the pain point here. If the nodes are joined or not, that's up to the ORM. EF6 joined everything together, EFcore seems to join m:1/1:1 together.
From my standpoint Include()/ThenInclude() is just smart joins that could attach data into your
object, and when you use filtering or limiting it should perform this without any additional work and
even more easy than write this directly in SQL. Isn't this why we all using ORMs?
includes are related sets fetched eagerly, they're not joins. If you have multi-branched graphs, joins aren't going to help you. Which is why you provided a PR for the duplicates, didn't you? :)
I worked with LLBLGen a wile ago and it was pretty painful comparing to Entity Framework.
Why? It's just a linq query, which look the same.
You have to write a lot of code and this result of your work couldn't be easily understood by your
team because complex queries will look more complex than generated SQL queries...
I wonder why considering the linq queries look the same?
Entity Framework Core gives more abilities to generate queries on the fly with Expression Trees so you
can feel free to create dynamic query caching, partial query caching and more flexible way to
synchronize data over several applications in general.
Heh, yeah, 'expression trees' are a novelty only used by EF, right? Every linq provider uses them. Besides, evaluating expression trees is very slow. Any other query system is much faster, and most ORMs have a query system with extension methods (like NHibernate does with QueryOver and we do with QuerySpec) which evaluate much faster and with simpler code.
But, that side, I didn't post the links to start a debate, I just gave a hint how we do it, you know for help? Why the snark towards my work? Isn't this a debate regarding an eager load api?
Oh well...
FB
includes are related sets fetched eagerly, they're not joins. If you have multi-branched graphs, joins aren't going to help you. Which is why you provided a PR for the duplicates, didn't you? :)
Ok ok, I just didn't name it right.
Why? It's just a linq query, which look the same.
Query matters, it depends on what extension methods you have, there the main difference between all LINQ based ORMs.
Besides, evaluating expression trees is very slow. Any other query system is much faster
I would say it's fast enough for ORM and even for in memory cache (complex query).
The slow part here is compilation it seems, expression visitor is fast itself.
And I don't say it's the best, nothing is best. It's just really flexible when you generate query.
It doesn't matter what query system you would use, API matters.
But, that side, I didn't post the links to start a debate, I just gave a hint how we do it, you know for help? Why the snark towards my work? Isn't this a debate regarding an eager load api?
I'm sorry maybe I posted too a lot critic on the LLBLGen side and mine personal feeling of it.
Query matters, it depends on what extension methods you have, there the main difference between all LINQ based ORMs.
Though the set of extension methods in that regard is small: most Linq query methods are the ones in the .NET framework. Eager loading directives are the ones setting things apart, besides smaller ones like excluding fields in a set fetch.
Eager loading APIs are a pain to design, I've now done a couple and IMHO the biggest problem to solve is to guide the developer to specify a graph (with potentially multiple branches) using just 1 method call (which contains lambdas with multiple method calls etc.). This often feels unnatural to developers however the alternatives do too. The one I linked to is done by a community member of ours and is actually very elegant and easy to use. With 'easy to use' I mean: it feels natural to define the graph like that as the statement flow in the code look like the graph paths so you see what is fetched as related set where. The biggest hurdle to solve for people is this:
fetch orders
fetch related Order details per order
fetch related Employee per order
how to specify the order details and employee split graph nodes relative to order so it makes sense in the query? This isn't that simple, especially in a sequence oriented DSL like linq.
I'm sorry maybe I posted too a lot critic on the LLBLGen side and mine personal feeling of it.
That's OK, you likely worked with an older version which had our low-level non-linq API (e.g. 2.6 from 2008 ;)). If you look at the tests I linked to the queries look like EF queries as linq queries all look the same, except of course for the eager loading api which is the point of this thread! ;)
@FransBouma I did a look at this
metaData.Project
.With(p => p.ProjectCompanyLinks
.With(pcl => pcl.Company
.With(c => c.CompanyAddresses),
pcl => pcl.Contact.ContactType));
The main difference with EF Core here from my understanding is recursive model and additionally you can add .With() to any node, instead of using Include()/ThenInclude() attached to the query itself.
By the way this will require (in EF Core) non-trivial expression rewriting and big changes on SQL generation, main query execution flow and a lot of work on relinq to get query part as result operator (I take this would be needed because you should be able work with graph you attached to a query as separated thing).
Well this would be nice to have a such model :)
UPD
But still it's a little complicated, technically this will result in several real queries (as well as every collection Include() in EF Core), but collection Include() in EF Core obvious, this thing isn't :)
@multiarc any chance that could be made as a transformation to EF Core includes statements? I'm thinking about looking into that particular line of enquiry myself.
I don't like AlsoInclude
because the words Also
and Then
aren't logically related this way.
What about .Up().ThenInclude()
instead of .AlsoInclude()
? That answers @rowanmiller's reasonable hesitation that AlsoInclude
only solves one arbitrary class of cases. With Up
, you could just do .Up().Up().ThenInclude()
and I believe that would cover all cases.
Found this thread searching for "AlsoInclude" because I was hoping someone had the same idea as me-- sure enough they did.
Ran into this today where I have
Product (1) -> Variants (n) -> SubscriptionPlan (1) -> Features (n) & Licenses (n)
As I was trying to figure out the chain (which I'm still not even sure how to do properly), it made sense to me that there should be an AlsoInclude, or just at least someway to traverse back up the chain ("also" made sense to me)
_context.Products
.Where(p => p.Id == productId)
.Include(p => p.Variants)
.ThenInclude(pv => pv.SubscriptionPlan)
.ThenInclude(sp => sp.Licenses)
.AlsoInclude(sp => sp.Features)
So, to come all the way back up to Include variants seems counter intuitive. So personally, I liked AlsoInclude. Maybe it only solves one thing, yes, but it would solve that one thing very well :)
For the record though, I liked this pattern suggested by divega
_dbContext.Things.Include(x => x.Level1Property)
.ThenInclude(y => y.Level2Property)
.ThenInclude(z => z.Level3Property) // Level3Property is of type Type3
.Include((Type3 t3) => t3.Level4Property_A)
.Include((Type3 t3) => t3.Level4Property_B)
.Include((Type3 t3) => t3.Level4Property_C);
One issue with the type-based approach is self-referential tables or tables with cycles.
I still think Up()
is simpler and clearer:
c#
_dbContext.Things.Include(x => x.Level1Property)
.ThenInclude(y => y.Level2Property)
.ThenInclude(z => z.Level3Property)
.ThenInclude(t3 => t3.Level4Property_A)
.Up().ThenInclude(t3 => t3.Level4Property_B)
.Up().ThenInclude(t3 => t3.Level4Property_C);
@jnm2 that's just confusing. I have no idea to which element the Level4Property_B is included in. To define a proper graph on one line, you can simply use the scoping of the language.
e.g.:
_context.Products
.Where(p => p.Id == productId)
.Include(p => p.Variants)
.ThenInclude(pv => pv.SubscriptionPlan)
.ThenInclude(sp => sp.Licenses)
.AlsoInclude(sp => sp.Features)
should become:
_context.Products
.Where(p => p.Id == productId)
.Include(p => p.Variants
.Include(pv => pv.SubscriptionPlan
.Include(sp => sp.Licenses)
.Include(sp => sp.Features)
)
)
It simply defines the scope the includes work on with the lambda. It's immediately clear into which elements the elements are merged with.
@FransBouma Up() is at least no more confusing than AlsoInclude, and I think less confusing. But I have to agree that yours is the least confusing so far.
Agree. That's probably the best of everything submitted so far, but would intellisense be able to parse it?
The problem is that we'd have to have an Include extension method on every type to handle navigation properties, not just collections.
We will have a perennial .Include suggested for every type everywhere in the code, along with Equals, GetHashCode, and ToString.
Note for implementer: see note on #2953 since there is overlap between these two features.
If the following
_context.Products .Where(p => p.Id == productId) .Include(p => p.Variants .Include(pv => pv.SubscriptionPlan .Include(sp => sp.Licenses) .Include(sp => sp.Features) ) )
causes
The problem is that we'd have to have an Include extension method on every type to handle navigation properties, not just collections.
We will have a perennial .Include suggested for every type everywhere in the code, along with Equals, GetHashCode, and ToString.
then that could be fixed by throwing in a wrapper, and exposing the wrapper as a second argument to Include?
_context.Products
.Where(p => p.Id == productId)
.Include(p => p.Variants, v=> v
.Include(pv => pv.SubscriptionPlan, s=>s
.Include(sp => sp.Licenses)
.Include(sp => sp.Features)
)
)
```
One of the includes would look a like this:
```cs
IQueryable<T> Include<T,TProperty>(this IQueryable<T> source, Expression<Func<T,TProperty>> selector, Action<IIncludeGraph<T,TProperty>> nested == null)
And I'm guessing, that this could even be implemented on top of the existing system purely as extension methods (and support classes/interfaces), by recursively applying the Includes and ThenIncludes from IIncludeGraph directly on the queryable?
and the same graph could be used for #2953 when mapped to the entity's model?
an .IncludeTypical()
could look up this graph and apply it to the query?
I'm liking the semantics of AlsoInclude()
in the 2nd comment.
Ideally I would love to see an option to configure sort of view
in the DB context as a DbSet, which is mapped not to a specific DB table, but to specific snapshot I can control by all of those Include/ThenInclude/.. (for table scopes) as well as by selecting specific column set(s) via something like IncludeColumns (list of columns) / IncludeExcept (table, list of columns) and so on.
Database views is a great solution for more or less complex snapshots and the only way to get them more or less efficiently is to map raw SQL query or SP call to DbSet. Fluent LINQ query for that would be much better IMHO. And it sounds quite doable. We just need to add some syntax sugar based on additional Extension methods for IQueryable
.
The way I tried to explain (ping me if was not clear, my English muscles get weak on evenings) we could potentially reduce the data we grab from DB as we can get this way only those fields we really need. This is to cover the gap we have now - when query is not too complex to consider some DocumentDB sync and query from there, but already too complex to deal with it as we do with CRUD stuff over relational DB. I mean this is for searches with sorting, paging and filtering (for "reach" UI most of the clients love) rather than for reporting / exporting the huge datasets with 10K+ records with lots of nested details.
Seems like this feature might eventually get me to what I need but maybe it doesn't so here is the use case I really need. We have lookup tables that reference other lookup tables and are heavily used in application tables. Example:
public class School
{
public StateOrProvince StateOrProvince { get; set; }
}
public class StateOrProvince
{
public Country Country { get; set; }
}
To include the children in a query we provide extension methods that pull in all the child entities for the queries that need it. This reduces all this duplicate code that we'd need in all the queries across our system.
public static IQueryable<School> WithChildren ( this IQueryable<School> source )
=> source.IncludeStateOrProvince(x => x.StateOrProvince);
public static IQueryable<T, StateOrProvince> IncludeStateOrProvince<T> ( this IQueryable<T> source, Expression<Func<T, StateOProvince>> property )
=> source.Include(property).ThenInclude(x => x.Country);
Where things fall apart is when this entity is a navigation property of a higher level object. The include
needs to be ThenInclude
.
public class MedicalEducation
{
public School School { get; set; }
}
public static IQueryable<MedicalEducation> WithChildren ( this IQueryable<MedicalEducation> source )
=> source.IncludeSchool(x => x.School);
public static IQueryable<T, School> IncludeSchool<T> ( this IQueryable<T> source, Expression<Func<T, School>> property )
=> //Works: source.Include(property).ThenInclude(x => x.StateOrProvince).ThenInclude(x => x.Country);
//Doesn't: source.Include(property).ThenIncludeStateOrProvince(x => x.StateOrProvince);
public static IIncludeQueryable<T, StateOrProvince> ThenIncludeStateOrProvince<T> ( ... )
The issue is that the return type of ThenInclude
is determined by the order in which child properties are referenced. But a caller of the extension method shouldn't be dependent upon the order of the ThenInclude
calls. Currently I don't see a way around this. One thought was to capture the include of the state which is the object I really want to return but then you lose the ThenInclude
portion of the country.
Being able to create extension methods that can include children for arbitrarily nested related objects is very useful to us and I believe a more flexible include API might give us this. Right now we are stuck replicating the include logic and it isn't fun.
Most helpful comment
Take a look at this example. Note that this example is fairly small to elucidate the need, but a real world example could be more complex depending on your needs.
_dbContext.Things.Include(x => x.Level1Property)
.ThenInclude(y => y.Level2Property)
.ThenInclude(z => z.Level3Property)
.ThenInclude(a => a.Level4Property_A)
.Include(x => x.Level1Property)
.ThenInclude(y => y.Level2Property)
.ThenInclude(y => y.Level3Property)
.ThenInclude(z => z.Level4Property_B)
.Include(x => x.Level1Property)
.ThenInclude(y => y.Level2Property)
.ThenInclude(y => y.Level3Property)
.ThenInclude(z => z.Level4Property_C)
This example, the 4th level entity has 3 nav properties that we want included. With the current Api, you need to start all the way back at the top for each of these, correct?
With the "AlsoInclude" syntax, it would look like this.:
_dbContext.Things.Include(x => x.Level1Property)
.ThenInclude(y => y.Level2Property)
.ThenInclude(z => z.Level3Property)
.ThenInclude(a => a.Level4Property_A)
.AlsoInclude(z => z.Level4Property_B)
.AlsoInclude(z => z.Level4Property_C)
See how the AlsoInclude would reference the same parent that then ThenInclude before it referenced.