Sorry for the long title, feel free to rename :)
The upgrade docs for 3.3 state that 3.0.any -> 3.3.1 is a supported path but mention no caveats. Additionally, the 3.3 Upgrade FAQ says upgrading from any 3.x version will be very straightforward, again with no mention of caveats.
However, I was unable to upgrade from 3.0.4 to 3.3.1 using the following steps and received several error messages, none of which were indicative of the underlying problem.
These steps were executed in a Docker environment, using the 3.0.4-enterprise and 3.3.1-enterprise tags.
Steps to reproduce
/data of each container
Expected behavior
Neo4j upgrades database and runs as a causal cluster
Actual behavior
The database fails startup with the following error message:
...
Caused by: org.neo4j.kernel.impl.storemigration.StoreUpgrader$DatabaseNotCleanlyShutDownException: The database is not cleanly shutdown. The database needs recovery, in order to recover the database, please run the old version of the database on this store.
at org.neo4j.kernel.impl.storemigration.UpgradableDatabase.checkUpgradeable(UpgradableDatabase.java:124)
at org.neo4j.kernel.impl.storemigration.StoreUpgrader.migrateIfNeeded(StoreUpgrader.java:132)
at org.neo4j.kernel.impl.storemigration.DatabaseMigrator.migrate(DatabaseMigrator.java:101)
at org.neo4j.kernel.NeoStoreDataSource.upgradeStore(NeoStoreDataSource.java:573)
at org.neo4j.kernel.NeoStoreDataSource.start(NeoStoreDataSource.java:435)
at org.neo4j.kernel.lifecycle.LifeSupport$LifecycleInstance.start(LifeSupport.java:445)
Sometimes, and I wasn't able to isolate the conditions, the error message would instead be something like:
ERROR Failed to start Neo4j: Starting Neo4j failed: Component 'org.neo4j.server.database.LifecycleManagingDatabase@f88bfbe' was successfully initialized, but failed to start. Please see the attached cause exception "Unable to find transaction 23289 in any of my logical logs: Couldn't find any log containing 23289". Starting Neo4j failed: Component 'org.neo4j.server.database.LifecycleManagingDatabase@f88bfbe' was successfully initialized, but failed to start. Please see the attached cause exception "Unable to find transaction 23289 in any of my logical logs: Couldn't find any log containing 23289".
org.neo4j.server.ServerStartupException: Starting Neo4j failed: Component 'org.neo4j.server.database.LifecycleManagingDatabase@f88bfbe' was successfully initialized, but failed to start. Please see the attached cause exception "Unable to find transaction 23289 in any of my logical logs: Couldn't find any log containing 23289".
I was able to start up the causal cluster as expected without the volume mount.
Fix/Workaround
I was able to successfully start the causal cluster if I first upgraded the 3.0.4 backup by starting a 3.3.1 node in single instance mode and letting it perform the upgrade, then using that upgraded DB as the seed for the causal cluster.
It seems like that step could be an acceptable upgrade path, but it should be well documented, and perhaps a more descriptive error message should be added in the next patch release.
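For anyone hitting the same problem, here is a rough sketch of the two-phase workaround as `docker run` command lines, built in Python so the difference between the two phases is easy to see. The environment variable names (`NEO4J_dbms_mode`, `NEO4J_dbms_allow__upgrade`, `NEO4J_ACCEPT_LICENSE_AGREEMENT`) are my assumptions about how the image maps environment variables to settings, the host path is hypothetical, and the cluster discovery settings are left out entirely. Treat it as an illustration, not a verified procedure.

```python
import shlex

IMAGE = "neo4j:3.3.1-enterprise"  # the 3.3.1-enterprise tag mentioned above


def docker_run(data_dir, mode, extra_env=None):
    """Build a `docker run` argument list that mounts data_dir at /data."""
    # Assumed env var names; check them against the image documentation.
    env = {"NEO4J_ACCEPT_LICENSE_AGREEMENT": "yes", "NEO4J_dbms_mode": mode}
    env.update(extra_env or {})
    args = ["docker", "run", "--rm", "-v", f"{data_dir}:/data"]
    for key, value in env.items():
        args += ["-e", f"{key}={value}"]
    return args + [IMAGE]


def upgrade_then_seed(data_dir):
    """Return the two commands: upgrade as a single instance, then start a core."""
    # Phase 1: let a standalone 3.3.1 instance migrate the 3.0.4 store.
    # Stop this container cleanly before moving on to phase 2.
    upgrade = docker_run(data_dir, "SINGLE", {"NEO4J_dbms_allow__upgrade": "true"})
    # Phase 2: use the already-upgraded store as the seed for a cluster member.
    # Discovery/cluster settings are omitted from this sketch.
    core = docker_run(data_dir, "CORE")
    return [upgrade, core]


if __name__ == "__main__":
    # "/path/to/3.0.4-backup" is a hypothetical host directory.
    for cmd in upgrade_then_seed("/path/to/3.0.4-backup"):
        print(shlex.join(cmd))
```

The point is only that the store upgrade and the cluster seeding happen in separate runs, with the upgrade flag set on the single-instance run alone.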
This is intended behaviour and you are doing the correct "workaround". I guess "please run the old version of the database on this store" could be slightly clearer. Database here refers to the database software. I will bring the feedback back to the team though.
Thanks for your reply @martinfurmanski.
This may be intended behavior as you understand it, but that means there is a glaring omission in the documentation. The links in the initial post show that the documentation makes no mention of exceptions to the supported upgrade path, yet this is definitely an exception. Can you link me to any official documentation that discusses this workaround or the errors that may arise when upgrading in this manner? If none exists, I'm happy to file an issue for improving the documentation, but I don't believe this issue is resolved until some actionable changes are outlined.
Thanks!
@freethejazz Thank you for your feedback! I have added notes in the documentation that an upgrade must be performed as an action separate from, for example, seeding a cluster. The updated docs will be published within a week.
Awesome, thanks for updating the documentation and adding these details!