Efcore: NavigationExpandingExpressionVisitor

Created on 12 Nov 2019 · 24 Comments · Source: dotnet/efcore

@smitpatel

Can you give me a brief overview of what NavigationExpandingExpressionVisitor does?

Queryable functions need some changes in here, and it would save me some time to know the purpose of this class.

Thanks

area-query closed-question customer-reported type-enhancement

Source

pmiddleton

👀2 👍2 😄1

Most helpful comment

It is the visitor which expands navigation. 😄
Sub-Expressions:

NavigationExpansionExpression (NEE) - Stores information about the current queryable: its source, structure of projection, parameter type, etc. This is needed because once navigations are expanded we still need to remember them to avoid expanding them again.
EntityReference - remembers entityType in projection structure above.
NavigationTreeNode - A node in the navigation binary tree. The navigation tree is the structure of the current parameter, which would be a transparent identifier, hence the binary structure. This allows it to be easily condensed to inner/outer member access.
NavigationTreeExpression - leaf node on navigation tree. Leaves represent projection structures of NEE. They contain Value which could be NewExpression/EntityReference.
OwnedNavigationReference - Owned navigations are not expanded (since they map differently in different providers). This remembers such references so that they can still be treated like navigations.
IncludeTreeNode - Tree structure of includes for given entityType in EntityReference.
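
The binary shape described for NavigationTreeNode can be pictured with plain nested pairs. The following is only a sketch using LINQ to Objects and value tuples: the Outer/Inner names mirror the inner/outer member access mentioned above, while the sample data and variable names are made up for illustration and are not EF Core's actual types.

```csharp
using System;
using System.Linq;

// Hypothetical sample data; none of these names come from EF Core.
var customers = new[] { (Id: 1, Name: "One"), (Id: 2, Name: "Two") };
var orders = new[]
{
    (Id: 10, CustomerId: 1, ProductId: 100),
    (Id: 11, CustomerId: 2, ProductId: 101),
};
var products = new[] { (Id: 100, Title: "Widget"), (Id: 101, Title: "Gadget") };

// Each join pairs the current shape (Outer) with the newly joined source (Inner),
// so two joins give a binary tree: ((order, customer), product).
var rows = orders
    .Join(customers, o => o.CustomerId, c => c.Id, (o, c) => (Outer: o, Inner: c))
    .Join(products, t => t.Outer.ProductId, p => p.Id, (t, p) => (Outer: t, Inner: p))
    .ToList();

// Every leaf is reachable through a chain of Outer/Inner member accesses.
var first = rows[0];
Console.WriteLine(first.Outer.Outer.Id);   // the order leaf
Console.WriteLine(first.Outer.Inner.Name); // the customer leaf
Console.WriteLine(first.Inner.Title);      // the product leaf
```

Because every node has exactly two children, locating a leaf only ever requires choosing Outer or Inner at each step.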

Sub-expression visitors:

ExpandingExpressionVisitor - Expands navigations in the given tree for given source. Also expands navigations for includes if passed the flag.
IncludeExpandingExpressionVisitor - Expands include tree. This is separate and needed because we may need to reconstruct parts of NewExpression to apply includes.
PendingSelectorExpandingExpressionVisitor - NEE remembers Pending selector so we don't expand navigations unless we need to. This visitor applies them when we need to.
ReducingExpressionVisitor - Removes custom expressions from tree and converts it to LINQ again.
EntityReferenceOptionalMarkingExpressionVisitor - Marks an EntityReference as nullable when it comes from a left join. Nullability is required to figure out whether a navigation from this entity should be a left join or an inner join.
SelfReferenceEntityQueryableRewritingExpressionVisitor - Visitor to allow self reference of query root inside queryfilter/defining query.
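
To see why the left join versus inner join decision matters, here is a small LINQ to Objects sketch. It is an illustration under assumed sample data (the tuples and names are hypothetical), not the visitor's implementation: when the entity on one side may be missing, an inner join silently drops rows that a left join keeps.

```csharp
using System;
using System.Linq;

// Hypothetical data: order 12 has no customer, so Order -> Customer is optional.
var customers = new[] { (Id: 1, Name: "One"), (Id: 2, Name: "Two") };
var orders = new[]
{
    (Id: 10, CustomerId: (int?)1),
    (Id: 11, CustomerId: (int?)2),
    (Id: 12, CustomerId: (int?)null),
};

// Inner join: only safe when every order is guaranteed to have a customer.
var inner = orders
    .Join(customers, o => o.CustomerId, c => (int?)c.Id,
        (o, c) => (Order: o.Id, Customer: c.Name))
    .ToList();

// Left join: keeps orders whose customer is missing and yields null for that side.
var left = orders
    .GroupJoin(customers, o => o.CustomerId, c => (int?)c.Id,
        (o, cs) => (Order: o, Customers: cs))
    .SelectMany(g => g.Customers.Select(c => c.Name).DefaultIfEmpty(),
        (g, name) => (Order: g.Order.Id, Customer: name))
    .ToList();

Console.WriteLine(inner.Count); // 2
Console.WriteLine(left.Count);  // 3
```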

Overall process:
When we start with the query root, we create an NEE with an EntityReference. We apply the defining query/query filters as needed. Then we translate the other queryable methods. Mainly:

Select composes over the pending selector, remembering the final shape we need to project. We defer Select till the end so that we can reuse expanded navigations.
Methods which don't change shape of selector like Where/OrderBy will expand navigations in lambda and update NEE.
OrderBy/ThenBy are kept pending till we encounter ordering chain since we may need to expand navigations and ThenBy has to follow OrderBy.
Joins combine pending selectors from both sides and create new TransparentIdentifier preserving structures.
Include/ThenInclude - populate Include tree and this tree is passed around to appropriate entity reference when expanding.
Single result operators - They mark NEE as non-queryable.
Aggregate/All/Any - They apply the pending selector, apply the aggregate operator and return a LINQ expression, since they cannot be composed over and they don't have entity information anymore.
Skip/Take/Distinct/GroupBy/Set operations/Unknown methods - They terminate the current pending selector by applying it and restart with a new NEE.
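
As a rough mental model of the process above, the sketch below shows a query written against a navigation next to a hand-expanded equivalent in LINQ to Objects. The anonymous types, data and variable names are assumptions made for illustration; the real visitor works on expression trees and produces a different concrete shape.

```csharp
using System;
using System.Linq;

// Hypothetical in-memory model; Customer plays the role of a reference navigation.
var london = new { Id = 1, City = "London" };
var paris = new { Id = 2, City = "Paris" };
var customers = new[] { london, paris };
var orders = new[]
{
    new { Id = 10, CustomerId = 1, Customer = london },
    new { Id = 11, CustomerId = 2, Customer = paris },
    new { Id = 12, CustomerId = 1, Customer = london },
};

// As the user writes it: the lambda walks the navigation o.Customer.
var written = orders
    .Where(o => o.Customer.City == "London")
    .Select(o => o.Id)
    .ToList();

// Roughly what expansion amounts to: the navigation becomes an explicit join,
// the Where lambda is rewritten against the joined shape, and the Select
// (the pending selector) is applied last.
var expanded = orders
    .Join(customers, o => o.CustomerId, c => c.Id, (o, c) => new { Outer = o, Inner = c })
    .Where(ti => ti.Inner.City == "London")
    .Select(ti => ti.Outer.Id)
    .ToList();

Console.WriteLine(string.Join(", ", written));  // 10, 12
Console.WriteLine(string.Join(", ", expanded)); // 10, 12
```

Keeping the Select until the end means a later operator that also touches Customer could reuse the same join instead of introducing a second one.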

There is a bunch of code here and there to special-case certain things, like:

ToList on collection navigation
Count on List
AsQueryable after collection navigation
Conversion to an enumerable method if the member is not a collection navigation which has AsQueryable.

The above is a high-level overview of how it works. Some of these things could change based on bugs, but I am not expecting a lot of change in this area. Let me know if you have questions about how to handle a specific method.

smitpatel on 12 Nov 2019

❤4 👀2

All 24 comments

Ok you are either the world's fastest typist or you had that stored away in a Word document :)

Thanks, this will help me work through the issues I'm having in here better.

pmiddleton on 12 Nov 2019

Typed on the fly lol!

smitpatel on 13 Nov 2019

😄1

@smitpatel, @ajcvickers

Just an update. I have about half of my unit tests working for queryable functions. My plan is to finish things up over xmas break so I can get PR ready in Jan.

I had to break down and get a new computer. My poor 8 year old Core 2 2600k was taking 2-3 minutes to build and start a unit test :)

pmiddleton on 11 Dec 2019

👍2 😄1

@pmiddleton Glad you powered-up. Looking forward to the PR!

ajcvickers on 12 Dec 2019

@smitpatel - Design question for you.

I have this test query I'm digging through while trying to figure out how things are working (it uses the data in the UDF tests)

var t = (from c in context.Customers
         select new
         {
             Order = c.Orders.Where(o => o.QuantitySold > 2).ToList()
         }).ToList();

Why do queries which have collection projections need to go down the _clientEval route in RelationalProjectionBindingExpressionVisitor? There is no actual method eval client side. Why couldn't the regular eval path be modified to make this work?

What am I missing here.

pmiddleton on 24 Dec 2019

The projection contains a collection. A SelectExpression's select clause cannot contain collections (only scalar values), hence we need to generate a join and iterate over multiple rows on the client side to produce correct results, which makes it client eval. Another way to look at it: anything that is non-client eval means there is a one-to-one mapping from the member access path (ProjectionMember) to a SqlExpression, so that if you access that particular member in a later binding (let's say OrderBy), you can use it. But when you think about a collection projection, can you really put that collection in OrderBy? Hence, after this projection the query is pretty much non-composable, hence client eval.

smitpatel on 25 Dec 2019

Thanks Smit. I knew that mapping had to happen during materialization, I just didn't connect that with the _clientEval flag. My brain was stuck thinking that only meant evaluating method calls client side when they were not translatable.

pmiddleton on 26 Dec 2019

@smitpatel @ajcvickers

I need some ideas.

I have run into a problem materializing child collections which originate from a queryable function. The result type of a queryable function might not have a primary key defined (nor may it be possible to define one). CustomShaperCompilingExpressionVisitor expects the child collection to have a key that it can use to identify which collection any given part of a result row needs to be placed into.

I've been trying to figure out a way to either generate a temp key, or work around needing one.

I've been able to get things working if you only project a single collection by using the parent's id as the key. This quickly fails if you introduce a second collection.

Have you run into this scenario before and if so how did you deal with it?

For reference this is the type of query I am talking about.

var results = (from c in context.Customers
               select new
               {
                   c.Id,
                   OrderCountYear = context.GetCustomerOrderCountByYear(c.Id).ToList()
               }).ToList();

pmiddleton on 31 Dec 2019

Likely dupe of https://github.com/aspnet/EntityFrameworkCore/issues/15873

It is not possible to project out a group if you don't have a way to correlate it with the outer row.

smitpatel on 31 Dec 2019

Yea from looking at the code that was what I was afraid of. I was holding out hope there was something I was missing :)

I can either leave in the support I have for a single collection, or remove it and blanket say all projected collections require an id.

What do you think?

pmiddleton on 31 Dec 2019

What does single collection query look like?

smitpatel on 31 Dec 2019

That is the example I posted above. You can have a single QF collection with no other collections.

You key off of the parent id. When that changes you know to start another collection.

So the SQL generated for the above is

SELECT [c].[Id], [c].[LastName], [o].[Count], [o].[CustomerId], [o].[Year]
FROM [Customers] AS [c]
OUTER APPLY [dbo].[GetCustomerOrderCountByYear]([c].[Id]) AS [o]
ORDER BY [c].[Id]

With this data from the unit tests.

Id  LastName  Count  CustomerId  Year
1   One       2      1           2000
1   One       1      1           2001
2   Two       2      2           2000
3   Three     1      3           2001
4   Four      NULL   NULL        NULL

I'm not sure if that will be useful enough vs the confusion it will generate when someone tries to do multiple collections.
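
For readers following along, the "key off the parent id" idea can be sketched in a few lines. This is a minimal illustration over the sample rows above, assuming they arrive ordered by the parent Id; the tuple shapes and variable names are hypothetical and this is not the shaper code from CustomShaperCompilingExpressionVisitor.

```csharp
using System;
using System.Collections.Generic;

// Rows as returned by the OUTER APPLY query above, already ordered by the parent Id.
// Nullable columns model the customer that has no function results.
var rows = new (int Id, string LastName, int? Count, int? CustomerId, int? Year)[]
{
    (1, "One", 2, 1, 2000),
    (1, "One", 1, 1, 2001),
    (2, "Two", 2, 2, 2000),
    (3, "Three", 1, 3, 2001),
    (4, "Four", null, null, null),
};

var results = new List<(int Id, List<(int Count, int Year)> OrderCountYear)>();
foreach (var row in rows)
{
    // A change in the parent Id marks the start of the next parent and its collection.
    if (results.Count == 0 || results[^1].Id != row.Id)
    {
        results.Add((row.Id, new List<(int Count, int Year)>()));
    }

    // An all-NULL right-hand side is treated as "no function rows for this parent".
    if (row.Count is int count && row.Year is int year)
    {
        results[^1].OrderCountYear.Add((count, year));
    }
}

foreach (var (id, items) in results)
{
    Console.WriteLine($"{id}: {items.Count} item(s)");
}
```

Note that the sketch has to guess that NULL columns mean an empty collection, and it has no way to tell repeated child rows apart, which hints at why a key on the child side is expected.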

pmiddleton on 31 Dec 2019

Not supported.

smitpatel on 31 Dec 2019

Status update.

I got all of my query unit tests passing over xmas vacation. I now just need to finish up work on some model validation changes.

I should still be on track to have a PR ready sometime in Jan.

pmiddleton on 6 Jan 2020

The return type of a QueryableFunction needs to be registered as an Entity in order for everything to work. However a table shouldn't be created in the database as this is not a persisted type.

I'm looking for a way to flag the entity so as to not generate a table. Is there something like this already in place?

pmiddleton on 8 Jan 2020

@smitpatel @ajcvickers @bricelam - Can anyone point me in the right direction for any existing code for not generating a table for entities? This is one of the last things I have yet to do.

pmiddleton on 10 Jan 2020

We need #2725. We hacked it in ToView using a marker annotation...

https://github.com/dotnet/efcore/blob/3ae7e2806c07d9e11d29bbdf5f0807a7acdcfa4a/src/EFCore.Relational/Extensions/RelationalEntityTypeBuilderExtensions.cs#L281

...and ignoring it in the model differ.

(Note well, viewDefinition below is the annotation object, not the annotation's value)

https://github.com/dotnet/efcore/blob/3ae7e2806c07d9e11d29bbdf5f0807a7acdcfa4a/src/EFCore.Relational/Extensions/RelationalEntityTypeExtensions.cs#L295-L297

bricelam on 11 Jan 2020

@ajcvickers @bricelam Does anyone there have issues with this solution when running ReSharper? When I have ReSharper on I keep running into out-of-memory issues. I had to turn it off to keep things stable.

pmiddleton on 14 Jan 2020

@pmiddleton Not sure if anyone uses R# routinely on the team. A couple of us use Rider and I'm not aware of any issues with it.

ajcvickers on 14 Jan 2020

@ajcvickers - Ok thanks. Yea R# uses up a ton of ram and I'm bumping into the 2 GB 32 bit process limit. If you could walk down the hall to the VS team and have them build as 64 bit :) haha

pmiddleton on 14 Jan 2020

@pmiddleton They will just say to uninstall R#. Visual Studio has been making a big push over the last few years to be natively productive, that is, to not require third-party extensions for productivity. The reason being that R# (and probably similar extensions) use a lot of resources.

ajcvickers on 14 Jan 2020

Weird, I was sure R# ran out-of-process precisely to avoid the 2GB limit... But who knows exactly what's going on there.