Hello,
We were migrating a database containing the Hangfire schema.
The migration script failed, so the hangfire schema was there, but without the seed data.
As a result, when the app pool was starting up Hangfire tried to create the tables again even though they were there already, so the script was failing and the pool startup would crash out. We fixed it by copying the seed data into the tables.
Cheers,
Steven
Hi, please tell more details. Is there any error message? What migration do you mean? What does "tables are there, but empty" mean?
I would guess that he's referring to the Version that is held within the Hangfire.Schema table. If this isn't found the script would fail when it tries to create the tables:
https://github.com/HangfireIO/Hangfire/blob/master/src/Hangfire.SqlServer/Install.sql#L68
Yes, @yngndrw is right. The issue is what he has referenced.
I'll try to explain anyway.
The migration comment is only meant to give you context, but you may ignore it if it confuses matters.
"Tables are there but empty" means that the tables are there but contain no data, i.e. they are empty.
The error is cannot create table (as it exists already).
Thanks, guys. Tables, indexes, constraints and all the data that is modified by a migration script are considered as integral. So why are tables here, but empty?
I think we're getting confused here.
I will try to simplify it to avoid further confusion:
The script that Hangfire uses will fail if the hangfire tables are there, but have no data in them.
As this can happen and in our case has happened, the hangfire script needs to have some defensive code to ensure it's not trying to create a table when it's already there.
I can create a pull request if you like.
Steven, actually, I totally understand you, and know what to do to solve your problem. Defensive code will definitely help in your case. Yes, we can add some checks for the existence of tables, indexes, constraints, foreign keys, columns, their types and data, but this will complicate the migration script. In fact, schema table is here to avoid this complication.
When we add all of these checks, we are claiming that it is OK, when sometimes older migrations will be executed against a newer scheme, but it will be too hard to write a new migration in this case and test it against unknown newer schema versions.
Consider there are V1, V2, V3 versions, your database is at V3, but V1 and V2 migrations will also be executed. V1 and V2 must be compatible with state V3, however it is hard to both see the future and change migrations for each new state.
However, complicated migrations is not the only problem. Sometimes migration scripts contain data migrations as well. And sometimes older migrations executed against a newer scheme results in error, or even in data loss (simplest example - changing column type and loss the precision). Schema version prevents from these cases, because it fails fast, at the first migration as in your case.
And I'm asking about what happened that you have empty tables, because want to know what is better – delete empty tables, or think about all of the consequences of adding checks.
Ok so that's why I said ignore the word migration before. I wasn't referring to migrating hangfire schemas, which hangfire itself manages just fine (we haven't had a problem when upgrading versions ever).
We were migrating (moving from one environment to another) our database and our script (nothing to do with hangfire) failed after copying the hangfire tables. That's why we ended up with the hangfire tables being there but with no data in them.
I hope that makes more sense :)
Yep, this definitely makes sense. But why not to migrate the data as well?
Like I said the script failed so the data (that was supposed to be copied) didn't get copied. We were using redgate sql toolbelt, so missed the fact that the script didn't run successfully.
We only realised something went wrong, when we tried to fire up our site.
Ah, sorry, missed that sentence :-( So, redgate sql toolbelt failed an application migration without any notification, leaving database in inconsistent state. I think Hangfire did its job well, failing with an error in case, where the database state was really wrong.
We could improve the message in a such case – in the first migration, if the first table already exists and schema version is null, we can throw more meaningful error that Hangfire structures already created, but schema version is absent.
But what to advice a user in a message, delete tables or add the correct schema number to the corresponding table?
Well, strictly speaking Hangfire didn't work well in this case.
An assumption was made that as there is no version number, there are no tables either. As a result it tried to create tables that already existed. The error was raised by SQL saying that the table already exists not hangfire itself.
My advice would be to review that assumption as it's not entirely accurate.
As for what to do in that case, raising a meaningful error message at the very least would be good.
Perhaps adding checks for the tables and not trying to recreate them would be a good compromise.
Just continuing on with the migration process would be error prone as you don't know what state the database is in (It could be V3 with no data, or it could be V1 with no data.) and handling these scenarios would result in the migration script becoming far more complicated than it needs to be.
I would suggest that an SQL error should be thrown. (Something along the lines of "The Hangfire database is in an inconsistent state and should be either restored from backups or recreated from scratch") I'd also suggest that the Hangfire dashboard should also display the error as a large red notification across the top of the page, to make it clear than urgent administrative action is required.
I would advise against just deleting the tables as you don't know what state they are in or why it has ended up in this state. (Maybe just the Hangfire.Schema table has been cleared ?) This scenario requires manual intervention to be safe.
@TheBeardedLlama,
An assumption was made that as there is no version number, there are no tables either. {...} My advice would be to review that assumption as it's not entirely accurate.
This is not an assumption, this is the way how install.sql works. The only assumption is that _only_ install.sql script may create/update/delete database objects related to Hangfire. And this is even not an assumption, this is _requirement_, otherwise nobody can guarantee that Hangfire uses the correct objects.
As for what to do in that case, raising a meaningful error message at the very least would be good.
IMO this is the best message what can be done, how do you think?
_Hangfire database migration script failed: There is already an object named 'Job' in the database. Please fix the problem and re-run it again._
Perhaps adding checks for the tables and not trying to recreate them would be a good compromise.
Consider that somebody implemented custom schemas support for Hangfire.SqlServer. And a user already has the Job table, as this name is very generic. For this case, normal execution (without any error) is very terrific.
@yngndrw,
I would suggest that an SQL error should be thrown. (Something along the lines of "The Hangfire database is in an inconsistent state and should be either restored from backups or recreated from scratch")
Please see one of my answers above.
I'd also suggest that the Hangfire dashboard should also display the error as a large red notification across the top of the page, to make it clear than urgent administrative action is required.
IMO regular exception message is enough here, there is awesome .UseErrorPage methods that highlight all the problems.
I would advise against just deleting the tables as you don't know what state they are in or why it has ended up in this state. (Maybe just the Hangfire.Schema table has been cleared ?) This scenario requires manual intervention to be safe.
There are too many problems to advise something here, and all of them require manual intervention. IMO, if someone relies on using automatic migrations, application crash on startup is the best thing when they are failing. Fail-fast strategy is one of things why I love .NET Framework.
I appreciate what you're saying regarding the assumption. Nevertheless, we are talking about a database that is not under hangfire's control nor is it its owner. As such it cannot make that assumption.
What I mean is that the script should check first and not just assume. Because by assuming it's doing something that could potentially fail as it did in my case.
Raising an exception and stopping sounds like the most appropriate solution.
I would suggest that the error message is as specific as possible. So, in the case that I encountered, I'd expect the error raised to be along the lines of: I was expecting to find a version number, but didn't find anything, the version table should not be empty. Find the data and put it back... come on! do it! get to the chopper!
[Bump]
Any update on this issue?
Our SQL security is locked down in production so of course I am not able to use the auto creation of tables, instead we must use a deployment script, can anyone advise on how to circumvent the default behaviour which assumes the executing user has permission over the database?
Thanks in advance
Just in case anyone else ends up in the same situation, I found this page which explains how to disable auto generation : http://docs.hangfire.io/en/latest/configuration/using-sql-server.html
..this is the specific code I used :
var options = new SqlServerStorageOptions
{
PrepareSchemaIfNecessary = false
};
GlobalConfiguration.Configuration.UseSqlServerStorage("
The problem persist. Anyone have a way to solve it?
Note: This is an absolutely legitimate situation to be in if you are restoring a BACPAC file to SQL Azure in the cloud.
Restoration of a SQL Azure database is not an atomic operation, so it is very possible for the tables to be created but the data not to have been restored. In that case this error is exactly what you want to occur - but it would be nice if it was a friendlier message.
Solved with @mikemardon suggested changes
var options = new SqlServerStorageOptions
{
PrepareSchemaIfNecessary = false
};
GlobalConfiguration.Configuration.UseSqlServerStorage("", options);
@dilhan2013 .. nope mate still exist :(
Hangfire.SqlServer, Version=1.6.21.0,
This seems to be resolved in Version 1.7.2
Most helpful comment
Note: This is an absolutely legitimate situation to be in if you are restoring a BACPAC file to SQL Azure in the cloud.
Restoration of a SQL Azure database is not an atomic operation, so it is very possible for the tables to be created but the data not to have been restored. In that case this error is exactly what you want to occur - but it would be nice if it was a friendlier message.