Runtime: Proposal: Queryable.Append, Queryable.Prepend

Created on 6 Dec 2016 · 20Comments · Source: dotnet/runtime

Important note: There is a chance this API could be breaking. People who are already calling Append on some Queryable will be calling into the Enumerable version, which returns an IEnumerable<T>. Meanwhile, the proposed overloads return an IQueryable<T>. However, these APIs have been out for less than a year and people seem to agree that keeping symmetry with Enumerable is more valuable.

Per discussion at https://github.com/dotnet/corefx/pull/14186#discussion_r90985817 We should add Queryable.Append and Prepend extension methods since they already exist on Enumerable.

Proposal

c# namespace System.Linq { public static class Queryable { public static IQueryable<TSource> Append<TSource>(this IQueryable<TSource> source, TSource element); public static IQueryable<TSource> Prepend<TSource>(this IQueryable<TSource> source, TSource element); } }

cc @JonHanna @bartdesmet

api-approved area-System.Linq up-for-grabs

Source

jamesqo

👍1

Most helpful comment

I think we're pretty much all on the same page based on the discussion, so we can decide whether the (re)compile-time breaking change potential of adding Append and Prepend to Queryable is a show-stopper. If we want to restore the symmetry, we better do it sooner than later though.

With regards to the factory methods such as Repeat, Empty, and Range, we've put these as extension methods on the IQueryProvider equivalent in Rx. This basically states that Enumerable.* are shorthand syntax for linqToObjects.Provider.* (modulo the quoting), and that the Queryable.* equivalents only have a meaning when the query provider is specified. The only issue today is that query providers typically don't expose themselves to the world other than through the Provider property on a returned IQueryable<T>. It'd look like this:

ctx.Provider.Range<int>(0, 10)

The CreateQuery (extension) method with an Expression<Func<T>> parameter is something we've done over here a while ago, as a shorthand syntax to capture a query expression containing things such as factory methods. It's merely a workaround for a language limitation IMO, namely the lack of general quotations (i.e. not piggybacking on lambda conversion). We just pretend that () => in () => whatever is an intuitive way to quote an expression, but IMO it's confusing syntactic noise (over e.g. <@ in F# or the quotation syntax in LISP). The whole Queryable contraption could really be seen as a workaround for the lack of more generalized quotes.

bartdesmet on 6 Dec 2016

👍2

All 20 comments

We already have [IEnumerable.Concat(IEnumerable second)](https://msdn.microsoft.com/en-us/library/bb302894(v=vs.110).aspx). We plan to add Enumerable.Singleton (or CreateForm in issue dotnet/runtime#15136. Isn't that enough?
Are these use cases so common to justify yet-another API for simplicity?

karelz on 6 Dec 2016

@karelz This is an existing API on Enumerable that needs to be added to Queryable. See https://github.com/dotnet/corefx/blob/master/src/System.Linq/src/System/Linq/AppendPrepend.cs

jamesqo on 6 Dec 2016

@karelz we already have these methods on Enumerable, the question is whether they should be also put on Queryable.

And I'm not 100% sure myself.

The likes of Repeat and Range don't have a Queryable equivalent because they're starting with elements and producing sequences and there isn't really an applicable equivalent. Whatever comes in dotnet/runtime#15136 would be the same.

Any method that works on one or more sequences (the majority of them) have parity between Enumerable and Queryable.

Append and Prepend are novel in that they do involve an element being turned into part of a sequence, but they also start with a sequence. When they were added I thought of the first part of that and reasoned that they didn't belong on queryable, but after @jamesqo mentioned it I thought further and I agreed that since they start with an IEnumerable<T> they do indeed have a reasonable equivalent for IQueryable<T> that should be provided for by Queryable.

But I can still see that this might be more trouble than it's worth for providers to try to produce a result that can then be queried again usefully. In the absence of a Queryable method, calling Append or Prepend treats the IQueryable<T> as an IEnumerable<T> and does the appending or prepending in memory. Maybe that's the best that can be reasonably hoped for.

So personally I'm torn on this as a design decision. I'd be interested to hear what @divega and @bartdesmet think in terms of implementation at the provider end (I looked a tiny bit into that side of things when I used to contribute to Npgsql, but what seems very difficult to me might seem more straightforward to those who have done more with that sort of problem).

JonHanna on 6 Dec 2016

I see - it was introduced in dotnet/runtime#14728 in .NET Core 1.0. I let @VSadov @OmarTawfik decide if they want to add it to Queryable.

karelz on 6 Dec 2016

Personally, I think Enumerable and Queryable should remain symmetric. Otherwise, it becomes hard to reason over what happens locally versus under control of a query provider, and impossible for query providers to translate intent. Also, the Enumerable methods leak into inner lambdas anyway (due to lambda conversion to expression trees), so those operators become visible to query providers anyway.

For example, in Bing, we use expression trees to transplant intent to other nodes in a compute fabric using verbatim expression serialization. Having some operators be "left behind" breaks the assumption of distribution and leaves residual computation behind. While a query provider could decide that's the best strategy, we should not prevent query providers from inspecting the intent and deciding on what to do with it.

There's a caveat with the current situation though... Adding these as extension methods to IQueryable<T> is a compile-time breaking change, because calls will get bound to Queryable rather than Enumerable, thus handing out these expressions to query providers. These providers could start failing on it (although some would simply split the remote parts from the local parts using AsEnumerable on behalf of the user). The fix would be for users to put an explicit AsEnumerable call if they run into this.

bartdesmet on 6 Dec 2016

👍1

That's a good point, and speaks also to the importance of doing such additions at the same time, when they happen.

JonHanna on 6 Dec 2016

although some would simply split the remote parts from the local parts using AsEnumerable on behalf of the user

Which also has the advantage of being pretty trivial. As long as there's an easy way for a provider to implement it, even if not necessarily the best way, a lot of my concern goes away. That's knocked me off the fence and onto supporting.

JonHanna on 6 Dec 2016

Makes sense to me to have these new operators defined for IQueryable<T>.

I also agree on the need to keep Enumerable and Queryable as much as possible symmetric. Although it might be obvious and a bit off-topic, I think some exceptions are justified:

Such plain static methods as Enumerable.Repeat(), Range() and Empty() are interesting in that they allow for the creation of IEnumerable<T> query roots from data - or from no data in the case of Empty(). Their IQueryable<T> counterparts would need to also provide the means to inject an IQueryProvider. There are probably various different patterns we could choose to support for that, but I cannot think of any that would make them really symmetric.
LINQ methods that produce collection results such as ToList(), ToDicionary() and any methods defined in ToCollection.cs by definition cross the boundary from IQueryable<T> to IEnumerable<T>. They implementations consist of an enumeration and the creation of a new data structure to hold the results. I believe it could be useful to have Queryable versions of these to allow for capturing their intent so that the query provider can have a say on how they get executed, perhaps with the goal of making them more efficient. But I doubt that the Queryable versions could be symmetric to their Enumerable counterparts (in particular the binary compatibility concern @bartdesmet already mentioned would be a factor).

In the end I value Queryable operators very much because of how they enable very smooth and terse query composition starting with an IQueryable<T> root and ending on either element or collection results. For all other needs I am generally ok with saying that the answer is to _quote_ the whole query expression from start to finish, e.g. we (or anyone else) could introduce API to enable writing code like this:

C# var someQuery = someQueryContext.CreateQuery(() => Enumerable.Range(1,3).ToList());

If everyone was happy following this approach, Queryable wouldn't be needed at all, but being able to express most common queries without a nested lambda is quite nice.

@bartdesmet, in fact I believe you suggested this kind of API to me long time ago. Have you had any new insights on this?

divega on 6 Dec 2016

ctx.Provider.Range<int>(0, 10)

bartdesmet on 6 Dec 2016

👍2

If we want to restore the symmetry, we better do it sooner than later though.

I very much agree with @bartdesmet; I think if we don't act now, we'll forget about this issue for like a year and by then it'll be too late to do anything. @karelz It seems like everyone is in favor of adding these methods, can this be marked ready for review?

jamesqo on 30 Jan 2017

I would like to have area experts (@VSadov @OmarTawfik) to make the final call on readiness of this API for review - they are the first-level review, before we do the "big one" :)

That said, it seems to be consistent with previous direction of Queryable/Enumerable, so I expect it will be simple to get ack from @VSadov / @OmarTawfik.

karelz on 30 Jan 2017

Yes. I agree with all points.
I also though initially that since one of the inputs is an element, not in a sequence form, it would prevent to have meaningful symmetry, but it looks like it is all doable.

It will be a breaking change, but the scenario should not be very common yet.
It seems better to make the change now.

VSadov on 30 Jan 2017

I agree that we should keep the symmetry. ack 👍

OmarTawfik on 30 Jan 2017

@VSadov @OmarTawfik please update the bug accordingly in that case (api-needs-work -> api-ready-for-review, milestone: Future). (yes, I could do it myself, but I won't scale to watch over every single issue in CoreFX repo, so I need area owners to learn what to do ;-))

If there is breaking change impact (even small), it should be mentioned in the top-most proposal. So that it can be discussed as part of bigger API review. All relevant info should be there (ideally in very brief way, reading pages of text is not fun either ;-)).

karelz on 30 Jan 2017

👍1

@karelz I will update the top.

jamesqo on 30 Jan 2017

👍1

@jamesqo thanks!

karelz on 30 Jan 2017

Type parameter should be TSource to match its mirror and similar cases in Queryable.

Apart from that, LG2M. Really, the API can't have any other shape and maintain the analogy with Enumerable.

JonHanna on 30 Jan 2017

T->TSource updated in top-most post.
Thanks @VSadov for marking the issue!

karelz on 30 Jan 2017