My app runs on .NET Framework 4.7.2 and ran Quartz 3.0.7 for over a year with low CPU utilization. A few weeks ago we upgraded Quartz to 3.2.3 and immediately saw a 40% CPU increase, because the following query is executed much more often with the newer version.
SELECT *
FROM QRTZ_LOCKS WITH(UPDLOCK,ROWLOCK)
WHERE SCHED_NAME = @schedulerName
AND LOCK_NAME = @lockName
Version used
Version: 3.2.3
To Reproduce
I don't have code to reproduce, but my app creates simple jobs, and at any given time tens of them are executing across 4 different VM instances.
Expected behavior
The CPU should remain as it was with the older version.
By the way, neither Quartz nor our app reports any errors; this is the only side effect we see with version 3.2.3.
Can you post your scheduler factory configuration (stripping credentials etc)?
Sure.
==== processorScheduler ====
<scheduler name="processorScheduler">
  <quartz>
    <property key="quartz.scheduler.instanceName" value="ProcessorScheduler" />
    <property key="quartz.scheduler.instanceId" value="AUTO" />
    <property key="quartz.scheduler.idleWaitTime" value="1000" />
    <property key="quartz.scheduler.exporter.type" value="Quartz.Simpl.RemotingSchedulerExporter, Quartz" />
    <property key="quartz.scheduler.exporter.port" value="1111" />
    <property key="quartz.scheduler.exporter.bindName" value="ProcessorScheduler" />
    <property key="quartz.scheduler.exporter.channelType" value="tcp" />
    <property key="quartz.scheduler.exporter.channelName" value="httpQuartz" />
    <property key="quartz.threadPool.type" value="Quartz.Simpl.DefaultThreadPool, Quartz" />
    <property key="quartz.threadPool.threadCount" value="20" />
    <property key="quartz.jobStore.type" value="Quartz.Impl.AdoJobStore.JobStoreTX, Quartz" />
    <property key="quartz.serializer.type" value="binary" />
    <property key="quartz.jobStore.clustered" value="true" />
    <property key="quartz.jobStore.clusterCheckinInterval" value="1000" />
    <property key="quartz.jobStore.misfireThreshold" value="60000" />
    <property key="quartz.jobStore.dataSource" value="default" />
    <property key="quartz.jobStore.driverDelegateType" value="Quartz.Impl.AdoJobStore.SqlServerDelegate, Quartz" />
    <property key="quartz.jobStore.tablePrefix" value="QRTZ_" />
    <property key="quartz.jobStore.useProperties" value="true" />
    <property key="quartz.dataSource.default.connectionString" value="Server=xxx;Database=quartz;user id=xxx;PWD=xxx;" />
    <property key="quartz.dataSource.default.provider" value="SqlServer" />
  </quartz>
</scheduler>
<scheduler name="notificationScheduler">
  <quartz>
    <property key="quartz.scheduler.instanceName" value="notificationScheduler" />
    <property key="quartz.scheduler.instanceId" value="notificationSchedulerInstance" />
    <property key="quartz.scheduler.proxy" value="true" />
    <property key="quartz.scheduler.proxy.address" value="tcp://xxx:2222/notificationScheduler" />
    <property key="quartz.threadPool.type" value="Quartz.Simpl.DefaultThreadPool, Quartz" />
    <property key="quartz.jobStore.type" value="Quartz.Impl.AdoJobStore.JobStoreTX, Quartz" />
    <property key="quartz.jobStore.misfireThreshold" value="60000" />
    <property key="quartz.jobStore.dataSource" value="default" />
    <property key="quartz.jobStore.driverDelegateType" value="Quartz.Impl.AdoJobStore.SqlServerDelegate, Quartz" />
    <property key="quartz.jobStore.lockHandler.type" value="Quartz.Impl.AdoJobStore.UpdateLockRowSemaphore, Quartz" />
    <property key="quartz.jobStore.tablePrefix" value="QRTZ_" />
    <property key="quartz.jobStore.useProperties" value="true" />
  </quartz>
</scheduler>
==== notification scheduler ====
<scheduler name="notificationScheduler">
  <quartz>
    <property key="quartz.scheduler.instanceName" value="notificationScheduler" />
    <property key="quartz.scheduler.instanceId" value="AUTO" />
    <property key="quartz.scheduler.idleWaitTime" value="1000" />
    <property key="quartz.scheduler.exporter.type" value="Quartz.Simpl.RemotingSchedulerExporter, Quartz" />
    <property key="quartz.scheduler.exporter.port" value="2222" />
    <property key="quartz.scheduler.exporter.bindName" value="notificationScheduler" />
    <property key="quartz.scheduler.exporter.channelType" value="tcp" />
    <property key="quartz.scheduler.exporter.channelName" value="httpQuartz" />
    <property key="quartz.threadPool.type" value="Quartz.Simpl.DefaultThreadPool, Quartz" />
    <property key="quartz.threadPool.threadCount" value="20" />
    <property key="quartz.jobStore.type" value="Quartz.Impl.AdoJobStore.JobStoreTX, Quartz" />
    <property key="quartz.serializer.type" value="binary" />
    <property key="quartz.jobStore.clustered" value="true" />
    <property key="quartz.jobStore.clusterCheckinInterval" value="1000" />
    <property key="quartz.jobStore.misfireThreshold" value="60000" />
    <property key="quartz.jobStore.dataSource" value="default" />
    <property key="quartz.jobStore.driverDelegateType" value="Quartz.Impl.AdoJobStore.SqlServerDelegate, Quartz" />
    <property key="quartz.jobStore.tablePrefix" value="QRTZ_" />
    <property key="quartz.jobStore.useProperties" value="true" />
    <property key="quartz.dataSource.default.connectionString" value="Server=xxx;Database=quartz;user id=xxx;PWD=xxx;" />
    <property key="quartz.dataSource.default.provider" value="SqlServer" />
  </quartz>
</scheduler>
I'll need some time to investigate, but I guess the biggest change is that the query has been parameterized. That also makes it look like it runs twice as often if you have two separate schedulers (earlier there was a different SQL string for each of them).
That's correct, I do have multiple schedulers which all have similar configs as the above.
Thanks for looking into it.
Hi Marko, touching base for any update?
I'm hoping to have time to work on this over the weekend. I haven't found anything obvious causing such a performance regression, but I think there's something between 2.x and 3.x that can be improved.
Hi Marko, did you have the chance to take a look at this, buddy?
Sorry, no big wins so far. I've discussed this with a SQL Server DBA and he couldn't find any obvious reasons by testing, so it shouldn't be on the DB side: the query runs against the same database page, which should be super fast (small row count, two columns). If you have time to profile and pinpoint something obvious that I'm missing, that would be super.
Unfortunately, we can't profile. But, I have some other insight to share with you.
Until the day we released our app with the upgraded Quartz, this query was being executed 250k times a day. Right after we released Quartz v3.2.3, the execution count increased to 350k times a day (a jump of 125k). Our application data traffic hasn't changed a bit.
Also, we noticed that your query uses UPDLOCK, ROWLOCK on a SELECT statement. Is there any reason it needs to when it's not updating? I think changing this to NOLOCK would be better.
SELECT *
FROM QRTZ_LOCKS WITH(UPDLOCK,ROWLOCK)
WHERE SCHED_NAME = @schedulerName
AND LOCK_NAME = @lockName
Something in v3.2.3 is making this query run more often.
I just talked to my DBA again. He did some more digging and found that in the old version this UPDATE query used to be run; since the upgrade, it has disappeared and the new SELECT above has shown up.
Notice also that SCHED_NAME was coming through as literal text and not as a parameter.
(@lockName nvarchar(14))
UPDATE QRTZ_LOCKS
SET LOCK_NAME = LOCK_NAME
WHERE SCHED_NAME = 'MaintenanceScheduler' AND LOCK_NAME = @lockName
For comparison with the new one, this was the old SELECT statement:
Before Jan 27
(@lockName nvarchar(14))
SELECT * FROM QRTZ_LOCKS WITH (UPDLOCK,ROWLOCK)
WHERE SCHED_NAME = 'MaintenanceScheduler'
AND LOCK_NAME = @lockName
Hope this helps.
Thank you for the update and information.
Also, we noticed that your query uses UPDLOCK, ROWLOCK on a SELECT statement. Is there any reason it needs to when it's not updating? I think changing this to NOLOCK would be better.
This would mean that there would be no locks to protect from concurrent access, so I wouldn't go there 😉
Notice also that SCHED_NAME was coming through as literal text and not as a parameter.
This is the exact behavior that was intended after #818 was merged. If you have multiple schedulers, you should now see many more executions of this exact SQL, because they all share the same query plan: the scheduler name is passed as a query parameter instead of being hard-coded into the query, which caused a different plan per scheduler.
Re-reading your notificationScheduler configuration:
<property key="quartz.jobStore.lockHandler.type" value="Quartz.Impl.AdoJobStore.UpdateLockRowSemaphore, Quartz" />
This will cause it not to use the optimized SQL statement intended for SQL Server. See https://github.com/quartznet/quartznet/blob/f612e3f66ab27e221b5632269ef5c25c0fe8bcc5/src/Quartz/Impl/AdoJobStore/JobStoreSupport.cs#L492-L521 for the logic.
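As a minimal sketch, this is how the proxy notificationScheduler block from the configuration posted earlier might look with the explicit lockHandler.type line dropped, so that the job store can pick its own lock handler. It assumes nothing else needs to change; every remaining property is copied as-is from that configuration.

```xml
<scheduler name="notificationScheduler">
  <quartz>
    <property key="quartz.scheduler.instanceName" value="notificationScheduler" />
    <property key="quartz.scheduler.instanceId" value="notificationSchedulerInstance" />
    <property key="quartz.scheduler.proxy" value="true" />
    <property key="quartz.scheduler.proxy.address" value="tcp://xxx:2222/notificationScheduler" />
    <property key="quartz.threadPool.type" value="Quartz.Simpl.DefaultThreadPool, Quartz" />
    <property key="quartz.jobStore.type" value="Quartz.Impl.AdoJobStore.JobStoreTX, Quartz" />
    <property key="quartz.jobStore.misfireThreshold" value="60000" />
    <property key="quartz.jobStore.dataSource" value="default" />
    <property key="quartz.jobStore.driverDelegateType" value="Quartz.Impl.AdoJobStore.SqlServerDelegate, Quartz" />
    <!-- quartz.jobStore.lockHandler.type intentionally omitted (assumption: no explicit lock handler is required here) -->
    <property key="quartz.jobStore.tablePrefix" value="QRTZ_" />
    <property key="quartz.jobStore.useProperties" value="true" />
  </quartz>
</scheduler>
```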
I've removed this property and also upgraded to the latest version, will let you know how it behaves.
Hi Marko, just an update. The CPU is back to normal behaviour after the above updates.
Thanks for your help buddy.
Great to hear, thanks for closing the loop.