Efcore: Update: Relational: Potential optimization in MERGE SQL batching path?

Created on 9 May 2017 · 16Comments · Source: dotnet/efcore

(Based on feedback from https://www.brentozar.com/archive/2017/05/case-entity-framework-cores-odd-sql)

Right now when using the MERGE batch update path, we perform a final ORDER BY on the server to retrieve the store gen'd values in the correct batch order. However, it could be that the values are already in the correct order, so we could perform the ordering on the client, lazily, by detecting out of order rows when they occur. NB. This would sometimes require client buffering of the reader.

area-perf closed-fixed type-enhancement

Source

anpete

Most helpful comment

EDIT: Looks like @anpete beat me to the punch. The below confirms his findings.

My reading of #4080 is that non-deterministic output from the multi-row INSERT caused the client to incorrectly reconcile inserted identities with their respective entities. Is that correct?

If so, I think we'd be fine here, because the above construct still allows the explicit _Position column for deterministic ordering. Indeed, if client-side sorting were possible (as mentioned in the OP), then both variables could be eliminated, leaving only a single MERGE statement.

zac-torkelson on 10 May 2017

😄1 👍1

All 16 comments

cc @divega

anpete on 9 May 2017

Another comment from https://www.brentozar.com/archive/2017/05/case-entity-framework-cores-odd-sql/#comment-2398183:

Zac Torkelson May 9, 2017 4:42 pm
Were Table Value Constructors (https://docs.microsoft.com/en-us/sql/t-sql/queries/table-value-constructor-transact-sql) considered instead of the temporary toInsert table variable? I have had good results with this technique in the past and would be interested in your benchmarking results.

Something like (apologies for the formatting and any syntax errors as I am presently mobile):

  MERGE [Weapons] USING (VALUES
  (@p0, @p1, 0),
  (@p2, @p3, 1),
  (@p4, @p5, 2)
  ) AS i ([IsPointy], [Name], [_Position]) ON 1=0
  WHEN NOT MATCHED THEN
  INSERT ([IsPointy], [Name])
  VALUES (i.[IsPointy], i.[Name])
  OUTPUT INSERTED.[WeaponId], i._Position
  INTO @inserted0;

However this looks very similar to multi-row INSERT syntax, which turned out not to have deterministic order (https://github.com/aspnet/EntityFramework/issues/4080).

divega on 9 May 2017

@divega The Table Value Constructor pattern still allows for an explicit order sentinel and so it doesn't matter whether the order is deterministic. It looks to be a slightly cleaner syntax by omitting one TV.

anpete on 10 May 2017

👍1

EDIT: Looks like @anpete beat me to the punch. The below confirms his findings.

My reading of #4080 is that non-deterministic output from the multi-row INSERT caused the client to incorrectly reconcile inserted identities with their respective entities. Is that correct?

zac-torkelson on 10 May 2017

😄1 👍1

I think we'd be fine here, because the above construct still allows the explicit _Position column for deterministic ordering

@anpete @zac-torkelson you are right. BTW, thanks for the feedback!

divega on 10 May 2017

@AndriySvyryd Thoughts on this for 2.0? Should it be punted for now? /cc @divega

ajcvickers on 9 Jun 2017

I think it can go in. These aren't breaking changes.

AndriySvyryd on 9 Jun 2017

I removed the ORDER BY and ran a microbenchmark inserting 100x10000 rows with identity and saw no measurable difference so this doesn't seem to be worth doing.

AndriySvyryd on 10 Jun 2017

@AndriySvyryd Nice! Can you share the query plans for both?

anpete on 10 Jun 2017

The query plan shows that ordering takes up a big part of the execution time, but the actual duration is the same. I don't have enough expertise to interpret this.
noorder
order

AndriySvyryd on 12 Jun 2017

Looks like the merge takes so long that the subsequent select duration is insignificant
merge

AndriySvyryd on 12 Jun 2017

Interesting, thanks for looking at this. Great to understand what's going on here.

anpete on 12 Jun 2017

@anpete @AndriySvyryd should we close this issue as fixed then?

I wonder whether we should still track offloading ORDER BY from the database server. Perhaps some benefit would be observable under load, but we don't have time to test it now.

divega on 13 Jun 2017

@divega Agree, it's hard for me to imagine that there is no scenario where this doesn't show an improvement. There is also some benefit in just having simpler SQL.

anpete on 13 Jun 2017

We should close this and track further optimization in separate issues when we find scenarios that need improvement.
While the SQL would be simpler client code would be much more complicated and could degrade overall performance in some scenarios.

AndriySvyryd on 13 Jun 2017

Ok, created https://github.com/aspnet/EntityFramework/issues/8849 to track ORDER BY removal. Closing as fixed in preview2.

divega on 13 Jun 2017

Was this page helpful?

0 / 5 - 0 ratings