Efcore: Large DbContext Startup Time

Created on 7 Aug 2017  路  32Comments  路  Source: dotnet/efcore

Describe what is not working as expected.

Large DbContext (1000) tables takes about 40 seconds before the first query. Any plans to add view generation at build time to EF Core, just like EF 6 has?
Thanks.

Further technical details

EF Core version: 1.0.4
Database Provider: Microsoft.EntityFrameworkCore.SqlServer
Operating system: Windows 10
IDE: Visual Studio 2017

area-perf closed-fixed type-enhancement

All 32 comments

Are you able to share your DbContext?

@AndriySvyryd are there any significant improvements for this in 2.0?

@SergeyBarskiy what exact version are you using?

@divega 1.0.4 right now.
@ErikEJ I reverse engineered our main product database, so I am not sure. I could check with my boss on legal aspects of doing this. Ordinarily we require NDA in place. Is this a must?

It will most likely help the team find the hotspots to have a repro. Have you tested with 2.0 ( you wrote 2.0 in your initial message) ?

@divega There haven't been any specific significant perf improvements in model building in 2.0

@ErikEJ Typo. I am testing at work, and I do not have 2.0 on my box there, like I do at home. In any case to reveal the schema I have to get a permission from legal. I could test on 2.0 if anyone would like to know the results.

Also, back to original question, is view pre-generation a thing for EF Core?

@SergeyBarskiy we have played with the idea of "compiled models" (see https://github.com/aspnet/EntityFramework/issues/1906) which would be a way to produce an artifact or set of artifacts at design time or compile time which would include (but not be limited to) the pre-computed code first model. The feature is currently in the backlog.

I could test on 2.0 if anyone would like to know the results.

It would absolutely be great if you could try your model with 2.0 (doing so should be easier very soon).

If it turns out to still be too slow and you could share the model (it can be privately through email) so that we can profile it please let us know.

I am going to close the issue for now but if there is any follow up action feel free to re-activate.

I got an OK to share compiled model, @divega . If you provide email address, I can send it to you.
Thanks.

Thanks @SergeyBarskiy. My email is like my alias here + @ + the company I work for + .com.

Didn鈥檛 you have a chance to test the performance with 2.0 final (we released it yesterday).

@divega Re-opened and assigned to you for now so we don't forget about this. We can re-triage based on the results of looking at the repro.

I emailed you the project, @divega
Things have gotten worse in 2.0 from 1.0, I went from 40 seconds to 2 minutes. memory usage was pretty high too, over 110MB.

Here is Program.cs listing:

using Microsoft.EntityFrameworkCore;
using System.Linq;
using System;
using System.Diagnostics;

namespace EFLargeContext
{
    class Program
    {
        static void Main(string[] args)
        {
            var builder = new DbContextOptionsBuilder<MyDbContext>();
            builder.UseSqlServer("Server=.;Database=Edge-Sql;Trusted_Connection=True;MultipleActiveResultSets=true");
            var timer = new System.Diagnostics.Stopwatch();
            timer.Start();
            using (var ctx = new MyContext(builder.Options))
            {
                timer.Stop();
                Console.WriteLine($"Total ms to create {timer.ElapsedMilliseconds}");

                timer.Start();
                var setting = ctx.Settings.FirstOrDefault(s => s.Settingid == "06600547-5D1D-475C-98AB-BB22E519C2E9");
                timer.Stop();
                Console.WriteLine($"Total ms to first query {timer.ElapsedMilliseconds}");
                timer.Reset();

                timer.Start();
                setting = ctx.Settings.FirstOrDefault(s => s.Settingid == "06D8A69C-E5B2-4004-88E8-C4BB0562448D");
                timer.Stop();
                Console.WriteLine($"Total ms to second query {timer.ElapsedMilliseconds}");
                timer.Reset();


                timer.Start();
                setting.Stringvalue = "x";
                ctx.SaveChanges();
                timer.Stop();
                Console.WriteLine($"Total ms to first save {timer.ElapsedMilliseconds}");
                timer.Reset();ac

                timer.Start();
                setting.Stringvalue = "y";
                ctx.SaveChanges();
                timer.Stop();
                Console.WriteLine($"Total ms to second save {timer.ElapsedMilliseconds}");
                timer.Reset();
            }
        }
    }
}

Here is the output from dotnet run
Total ms to create 4993
Total ms to first query 126078
Total ms to second query 17
Total ms to first save 121
Total ms to second save 10

@ajcvickers I got the repro model from @SergeyBarskiy. This can now be assigned in triage to someone that is going to perform the actual investigation and then I can forward the repro. Thanks.

@divega - Can you send me project?

@SergeyBarskiy Can you share create table script for INSTOREITEMTRANSFER table. From what I suspect you generated this model from database using reverse engineering and there are bugs in rev eng causing decimal/numeric types to be scaffolded incorrectly. I would like to repro those bugs too.

Sure. Here you go, Smit.

CREATE TABLE [dbo].[INSTOREITEMTRANSFER](
    [INSTOREITEMTRANSFERID] [char](36) NOT NULL,
    [INBINIDFROM] [char](36) NOT NULL,
    [INBINIDTO] [char](36) NOT NULL,
    [TRANSFERQUANTITY] [numeric](38, 5) NOT NULL,
    [ITEMID] [char](36) NOT NULL,
 CONSTRAINT [PK_INStoreItemTransfer] PRIMARY KEY CLUSTERED 
(
    [INSTOREITEMTRANSFERID] ASC
))

@SergeyBarskiy - Sorry to ask for more info
Can you share table creation script for table BLLICENSEWFSTEP & CATRANSACTION?

CREATE TABLE [dbo].[BLLICENSEWFSTEP](
    [BLLICENSEWFSTEPID] [char](36) NOT NULL,
    [BLLICENSEID] [char](36) NOT NULL,
    [WFTEMPLATESTEPID] [char](36) NULL,
    [WFSTEPTYPEID] [int] NOT NULL,
    [PRIORITYORDER] [int] NOT NULL,
    [STARTDATE] [datetime] NULL,
    [ENDDATE] [datetime] NULL,
    [DESCRIPTION] [nvarchar](max) NULL,
    [WORKFLOWSTATUSID] [int] NOT NULL,
    [ROWVERSION] [int] NOT NULL,
    [LASTCHANGEDON] [datetime] NOT NULL,
    [LASTCHANGEDBY] [char](36) NOT NULL,
    [REQUIREDSUBPLPLANID] [char](36) NULL,
    [NAME] [nvarchar](50) NOT NULL,
    [WORKFLOWCOMPLETETYPEID] [int] NULL,
    [ICON] [varbinary](max) NULL,
    [DAYSTOCOMPLETE] [int] NULL,
    [BLLICENSEWFPARENTSTEPID] [char](36) NULL,
    [VERSIONNUMBER] [int] NULL,
    [SORTORDER] [int] NOT NULL,
    [GENERALREASON] [nvarchar](max) NULL,
    [AUTOCOMPLETED] [bit] NOT NULL,
    [NOPRIORITY] [bit] NOT NULL DEFAULT ((0)),
    [DUEDATE] [datetime] NULL,
    [IDENTITYCOLUMN] [bigint] IDENTITY(1,1) NOT NULL,
 CONSTRAINT [PK_BLLicenseWFStep] PRIMARY KEY NONCLUSTERED 
(
    [BLLICENSEWFSTEPID] ASC
)
)

CREATE TABLE [dbo].[CATRANSACTION](
    [CATRANSACTIONID] [char](36) NOT NULL,
    [CATRANSACTIONTYPEID] [int] NOT NULL,
    [CATRANSACTIONSTATUSID] [int] NOT NULL,
    [CATILLSESSIONID] [char](36) NOT NULL,
    [PARENTTRANSACTIONID] [char](36) NULL,
    [GLOBALENTITYID] [char](36) NULL,
    [GLOBALENTITYNAME] [nvarchar](max) NOT NULL,
    [TRANSACTIONDATE] [datetime] NOT NULL
) ON [PRIMARY] TEXTIMAGE_ON [PRIMARY]
SET ANSI_PADDING ON
ALTER TABLE [dbo].[CATRANSACTION] ADD [CREATEDBY] [char](36) NOT NULL
ALTER TABLE [dbo].[CATRANSACTION] ADD [RECEIPTNUMBER] [nvarchar](50) NULL
SET ANSI_PADDING OFF
ALTER TABLE [dbo].[CATRANSACTION] ADD [NOTE] [varchar](max) NULL
ALTER TABLE [dbo].[CATRANSACTION] ADD [CHANGEBACK] [money] NOT NULL CONSTRAINT [DEF_CATransacton_ChangeBack]  DEFAULT ((0))
ALTER TABLE [dbo].[CATRANSACTION] ADD [ROWVERSION] [int] NOT NULL CONSTRAINT [DF_CATransaction_RowVersion]  DEFAULT ((1))
ALTER TABLE [dbo].[CATRANSACTION] ADD [LASTCHANGEDON] [datetime] NOT NULL CONSTRAINT [DF_CATransaction_LastChangedOn]  DEFAULT (getdate())
SET ANSI_PADDING ON
ALTER TABLE [dbo].[CATRANSACTION] ADD [LASTCHANGEDBY] [char](36) NOT NULL
ALTER TABLE [dbo].[CATRANSACTION] ADD [TRANSACTIONGROUPNUMBER] [nvarchar](50) NOT NULL CONSTRAINT [DEF_CATransaction_GroupNumber]  DEFAULT ('UNASSIGNED - '+CONVERT([varchar](10),getdate(),(110)))
ALTER TABLE [dbo].[CATRANSACTION] ADD [OFFICEID] [char](36) NULL
ALTER TABLE [dbo].[CATRANSACTION] ADD [EXTERNALSYSTEMID] [nvarchar](50) NULL
ALTER TABLE [dbo].[CATRANSACTION] ADD [DEPOSITREFNBR] [nvarchar](50) NULL
ALTER TABLE [dbo].[CATRANSACTION] ADD [EXTERNALBATCHNBR] [nvarchar](50) NULL
 CONSTRAINT [PK_CATransaction] PRIMARY KEY NONCLUSTERED 
(
    [CATRANSACTIONID] ASC
)

@SergeyBarskiy - How about HEARING table script? Including all the constraints defined on it.
The project you provided indicate that reveng scaffolded nvarchar(max) for a column which is involved in a index. That is invalid for SqlServer. I am trying to get a repro for that and verify if it is working for latest codebase. Thanks.

Scaffolding created some number of configurations with just nvarchar without length. Hearing must have been one of those. I manually replaced them all with nvarchar(max) to get the sample to run. This was flagged as known issues. I think it occurs for nvarchar(4000) columns.

CREATE TABLE [dbo].[HEARING](
    [HEARINGID] [char](36) NOT NULL,
    [HEARINGTYPEID] [char](36) NOT NULL,
    [HEARINGSTATUSID] [char](36) NOT NULL,
    [SUBJECT] [nvarchar](4000) NOT NULL,
    [LOCATION] [nvarchar](4000) NOT NULL,
    [COMMENTS] [nvarchar](max) NULL,
    [STARTDATE] [datetime] NOT NULL,
    [ENDDATE] [datetime] NOT NULL,
    [ROWVERSION] [int] NOT NULL,
    [EXCHANGEITEMID] [varchar](200) NULL,
    [SHOWONCALENDAR] [bit] NOT NULL DEFAULT ((1)),
 CONSTRAINT [PK_Hearing] PRIMARY KEY CLUSTERED 
(
    [HEARINGID] ASC
)WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]
) ON [PRIMARY] TEXTIMAGE_ON [PRIMARY]

GO

SET ANSI_PADDING ON
GO

ALTER TABLE [dbo].[HEARING]  WITH CHECK ADD  CONSTRAINT [FK_Hearing_HearingStatus] FOREIGN KEY([HEARINGSTATUSID])
REFERENCES [dbo].[HEARINGSTATUS] ([HEARINGSTATUSID])
GO

ALTER TABLE [dbo].[HEARING] CHECK CONSTRAINT [FK_Hearing_HearingStatus]
GO

ALTER TABLE [dbo].[HEARING]  WITH CHECK ADD  CONSTRAINT [FK_Hearing_HearingType] FOREIGN KEY([HEARINGTYPEID])
REFERENCES [dbo].[HEARINGTYPE] ([HEARINGTYPEID])
GO

ALTER TABLE [dbo].[HEARING] CHECK CONSTRAINT [FK_Hearing_HearingType]
GO

@SergeyBarskiy Do you have index defined which would correspond to this code.

entity.HasIndex(e => new { e.Startdate, e.Comments, e.Hearingid, e.Hearingtypeid })
                    .HasName("IX_HEARING_ID_TYPE");

For some reason reveng generated it in code. But then Comments cannot be nvarchar(max) (SqlServer does not allow max when using with key/index). It seems like reveng generated some non-existent index.

Total ms to create 2137
Total ms to model gen 41238
Total ms to drop db 208
Total ms to create db 20937
Total ms to first query 86
Total ms to second query 7

Since there is no data, Queries are not returning results but fetching one result shouldn't be very slow to materialize also query compile time would be slower than runtime. SaveChanges couldn't be verified.

RevEng issues https://github.com/aspnet/EntityFrameworkCore/issues/9530 https://github.com/aspnet/EntityFrameworkCore/issues/9526

Handing over to @AndriySvyryd since Model building is the only slow step in pipeline.

@smitpatel Comment columns is nvarchar(4000). Reverse engineer did not handle it., so I manually changed it to max without looking.

We've improved the model building perf a bit, but your best bet would be waiting for Compiled Models #1906

One thing that I found that helped me with my large model which has about 1,400 tables was to use the fluent API overloads that accept a string instead of a lambda function as in the following.

            modelBuilder.Entity<A21IndirectCostRecoveryAccount>().Property(nameof(A21IndirectCostRecoveryAccount.Id)).HasColumnType("DECIMAL(10, 0)");

I was having problems running out of stack space when I used lambdas because I had so many method calls in OnModelCreating(). I forget what the exact difference was, but, in addition to cutting the memory footprint down, it also sped things up. My model initializes in about 10 seconds now. There were times in EF 6 when it was more like 2 minutes. If you have lots of fluent API calls, using strings seems to be the way to go as far as I can tell.

@jemiller0 Thanks for the info. @divega @bricelam @AndriySvyryd Should we consider a reverse engineering mode that generates string-based calls (using nameof)? Also, this may be worth thinking about for the compiled model.

We need templates.

I just double checked to make sure I was remembering things correctly. I have about 1,300 entities. I think switching to using string version of the method calls mainly only helped cut down the memory footprint. I just tried checking the performance. Two runs using lambdas took 00:00:13.2868370 and 00:00:13.5996700. Two runs using strings took 00:00:11.7494176 and 00:00:11.2098554. So, it's not much of an improvement. Mainly, I made that switch to get around running out of stack space.

One thing I've always wondered is whether it would be possible to make it so that EF lazy initializes the model as needed? I guess you would need to have different methods containing the fluent API calls for each table, and call those as the tables are referenced. I don't know NHibernate does it, but, it starts up with no lag at all. It seems like initialization is doing a push, when it should be doing a pull as needed. I'm assuming doing something like that would be a major architectural change.

Should we consider a reverse engineering mode that generates string-based calls (using nameof)?

Two runs using lambdas took 00:00:13.2868370 and 00:00:13.5996700. Two runs using strings took 00:00:11.7494176 and 00:00:11.2098554.

Something I have seen in the past (with EF6) is attribute-based code first mapping being significantly faster than fluent mappings for large models. The hypothesis was that this was because of the creation and parsing of expressions so I would also have expected the string versions to be faster as well. But I wonder if there are any cases in which we could either leverage an existing attribute or come up with a new attribute to make things cheaper. I am not even sure we or the compiler team have done any targeted effort to optimize the lambda expression path.

On my model, I use attributes where I can and use the fluent APIs where I can't. I would prefer to use attributes for everything. I'm using my own code generator now. From what I remember the EF built in one uses the fluent API. It would be nice to allow a user to choose which they prefer. I found the attributes easier to use because they are at the same place in the source code, right next to the properties. It would be nice if some of the attributes could be enhanced to support things like foreign keys that have more than one column. Personally, I don't like composite keys and can understand trying to keep the attributes simple though. I'm dealing with a legacy database though that has a lot of composite keys. I think one other problem that I ran into was certain providers defaulting things like decimals to really wide values in the database. So, I am explicitly setting the SQL data type for those. Maybe I could change those to using attributes and it would speed it up. There are a few places, where I have conditional code where I check to see what provider is being used. For example, if it's an XML field, it sets the SQL data type to XML for SQL Server or PostgreSQL, but, sets it to a CLOB for MySQL since it doesn't have an XML data type. I always wondered if the attributes were what was slowing things down compared to NHibernate, which uses XML mapping files by default. I never tried NHibernate using attributes. From what I've heard, reflection is slow. So, I wondered if that was one of the issues. It is interesting that using attributes would be faster than using the fluent API.

Was this page helpful?
0 / 5 - 0 ratings

Related issues

vsfeedback picture vsfeedback  路  98Comments

abdu292 picture abdu292  路  99Comments

0xdeafcafe picture 0xdeafcafe  路  189Comments

rowanmiller picture rowanmiller  路  85Comments

divega picture divega  路  100Comments