Moor: Database is locked after hot restart and sometimes in production

Created on 24 Sep 2020  路  18Comments  路  Source: simolus3/moor

I am using a single instance of my database with VmDatabase as executor.

When the app starts , data is loaded from remote and inserted into the database in several transactions.
If I hot restart the app during this time, there is a chance, that the database is locked after the restart.

This would be ok if it was only happening during development but I am seeing the same stacktraces in Sentry from production users, very rarely but it happens. I am not sure how that can happen.

I have a similar problem with Sembast using sqflite_ffi (https://github.com/tekartik/sembast_sqflite/issues/5) which seems to have been solved by not using the ffi implementation.

Since both use sqlite3, I wonder if it is related or if you have any idea what could prevent this.

All 18 comments

Thanks for the report and for linking the original issue.

It strongly sounds like this is due to multiple VmDatabase's being active at the same time. However, I'm not yet sure how this could _sometimes_ happen after a hot reload (and even outside of that in production). If you set a breakpoint on the VmDatabase factory and hot-reload, does it get hit?

I am using get_it and the injector instance is empty after a hot-restart (everything works fine with hot-reload), Everything get's recreated.

If there is a 2nd instance then I don't know where. I am aware of the multiple instances problem, however I don't think it is the case here.

I am suspecting the production problem occurring due to the app being killed when another activity comes to foreground (camera), as this has so far only happened on low powered Android devices.

If there is a 2nd instance then I don't know where

I think I see the problem now. The Dart object gets destroyed of course, but we don't have a chance to ever call sqlite3_close() on the underlying database handle.

It sounds like we could use https://github.com/dart-lang/sdk/issues/35770 to fix this, the docs say that Dart_NewWeakPersistentHandle_DL also invokes the finalizer if the isolate shuts down.

I am suspecting the production problem occurring due to the app being killed when another activity comes to foreground

Is there a way for the Dart runtime to terminate without the entire app process terminating too? If the entire process goes away there shouldn't be an issue I think.

That looks interesting. So you think the native instance ist still around while the Dart VM restarted. Is there a way to locate an existing native instance when the VM starts and possible reconnect/cleanup?

Is there a way for the Dart runtime to terminate without the entire app process terminating too?

I have no idea, I will see check the next time it happens, need to see if there is some memory pressure before this occurs.

So you think the native instance ist still around while the Dart VM restarted

Yes, exactly. We create a database connection by calling sqlite3_open_v2 and keep a reference to that. However, we only call sqlite3_close when the database is closed in Dart. So destroying a VmDatabase object (through GC or because the Dart VM restarts) without properly closing it will leak the underlying sqlite struct.

Leaking resources is bad enough, but when the underlying database is in a transaction, it blocks a second connection trying to write. I'm pretty sure that's what you're seeing here.

Is there a way to locate an existing native instance

I'm not aware of any approach that would work in pure-Dart. Sqflite works around this by essentially storing these references in Java, so that they can survive Dart VM restarts.

When that api is available, using finalizable handles to auto-dispose the native resources is probably the best approach here. I'll try to setup a small Flutter-specific package to implement the sqflite workaround for VmDatabase until then.

I found a way to do this in pure Dart. We can open a named in-memory database to keep track of open database connections! That database would have the same lifecycle as the problematic connections locking your actual app database.

On the develop version of moor, we create an in-memory sqlite3 database to store active database connection pointers. This database can survive Dart VM restarts, but is bound to the process otherwise. I added the VmDatabase.closeExistingInstances() method to call sqlite3_close_v2 on all database pointers collected in there.

You could try out the develop version of moor and add this to your main:

void main() {
  // Put this somewhere before you open your first VmDatabase
  assert(() {
    VmDatabase.closeExistingInstances();
    return true;
  }());

  // remaining main
}

It's very important to only call closeExistingInstances when there aren't any Dart database instances (not even on another isolate). Otherwise you'd get VmDatabase instances with a closed database connection, which can cause segfaults.
There _shouldn't_ be any problem doing that on release builds too, but since we're not sure a VM stop is causing the lock here the risk might not be worth it.

Alas, I hope we won't need this workaround for long.

I'll try this tomorrow.

This crashes on iOS 14 Simulator, I have not tested it on a real device, older iOS or Android yet.

Process:               Runner [72153]
Path:                  /Users/USER/Library/Developer/CoreSimulator/Devices/05A6121D-D7CA-4461-AC34-39CD23AFD037/data/Containers/Bundle/Application/AC107C35-411E-4D3D-9D70-3972E0D68307/Runner.app/Runner
Identifier:            Runner
Version:               0.16.0 (1)
Code Type:             X86-64 (Native)
Parent Process:        launchd_sim [60982]
Responsible:           SimulatorTrampoline [54873]
User ID:               502

Date/Time:             2020-10-02 16:12:11.157 +0200
OS Version:            Mac OS X 10.15.6 (19G2021)
Report Version:        12
Bridge OS Version:     4.6 (17P6610)
Anonymous UUID:        579C5240-5ED6-9E09-51A7-87D96C0DC66C

Sleep/Wake UUID:       5221B93C-23BE-457E-A720-D9239346A299

Time Awake Since Boot: 110000 seconds
Time Since Wake:       14000 seconds

System Integrity Protection: enabled

Crashed Thread:        2  io.flutter.1.ui

Exception Type:        EXC_BAD_ACCESS (SIGSEGV)
Exception Codes:       KERN_INVALID_ADDRESS at 0x000000011332000a
Exception Note:        EXC_CORPSE_NOTIFY

Termination Signal:    Segmentation fault: 11
Termination Reason:    Namespace SIGNAL, Code 0xb
Terminating Process:   exc handler [72153]

VM Regions Near 0x11332000a:
    SQLite page cache      0000000113300000-0000000113310000 [   64K] rw-/rwx SM=PRV  
--> 
    MALLOC metadata        0000000113360000-0000000113361000 [    4K] r--/rwx SM=ZER  NWMallocZone_0x113360000 zone structure

Application Specific Information:
CoreSimulator 732.17 - Device: iPad Pro (12.9-inch) (4th generation) (05A6121D-D7CA-4461-AC34-39CD23AFD037) - Runtime: iOS 14.0 (18A372) - DeviceType: iPad Pro (12.9-inch) (4th generation)

Thread 0:: Dispatch queue: com.apple.main-thread
0   libsystem_kernel.dylib          0x00007fff5dc9fdfa mach_msg_trap + 10
1   libsystem_kernel.dylib          0x00007fff5dca0170 mach_msg + 60
2   com.apple.CoreFoundation        0x00007fff203a7e77 __CFRunLoopServiceMachPort + 316
3   com.apple.CoreFoundation        0x00007fff203a258d __CFRunLoopRun + 1284
4   com.apple.CoreFoundation        0x00007fff203a1b9e CFRunLoopRunSpecific + 567
5   com.apple.GraphicsServices      0x00007fff2b773db3 GSEventRunModal + 139
6   com.apple.UIKitCore             0x00007fff24660af3 -[UIApplication _run] + 912
7   com.apple.UIKitCore             0x00007fff24665a04 UIApplicationMain + 101
8   com.example         0x000000010cd83ddb main + 75 (AppDelegate.swift:5)
9   libdyld.dylib                   0x00007fff20257415 start + 1

Thread 1:: com.apple.uikit.eventfetch-thread
0   libsystem_kernel.dylib          0x00007fff5dc9fdfa mach_msg_trap + 10
1   libsystem_kernel.dylib          0x00007fff5dca0170 mach_msg + 60
2   com.apple.CoreFoundation        0x00007fff203a7e77 __CFRunLoopServiceMachPort + 316
3   com.apple.CoreFoundation        0x00007fff203a258d __CFRunLoopRun + 1284
4   com.apple.CoreFoundation        0x00007fff203a1b9e CFRunLoopRunSpecific + 567
5   com.apple.Foundation            0x00007fff20846e61 -[NSRunLoop(NSRunLoop) runMode:beforeDate:] + 209
6   com.apple.Foundation            0x00007fff208470d0 -[NSRunLoop(NSRunLoop) runUntilDate:] + 72
7   com.apple.UIKitCore             0x00007fff24717a45 -[UIEventFetcher threadMain] + 464
8   com.apple.Foundation            0x00007fff2086f521 __NSThread__start__ + 1042
9   libsystem_pthread.dylib         0x00007fff5dcdc109 _pthread_start + 148
10  libsystem_pthread.dylib         0x00007fff5dcd7b8b thread_start + 15

Thread 2 Crashed:: io.flutter.1.ui
0   ???                             0x000000011332000a 0 + 4617011210
1   libsqlite3.dylib                0x00007fff21bf421d sqlite3LeaveMutexAndCloseZombie + 637
2   libsqlite3.dylib                0x00007fff21c0338f sqlite3Close + 1439

@simolus3 This is brilliant! Simple et efficient. I copied your database_tracker to sqflite_common_ffi since it solves the same issue. I slightly adapt your solution: https://github.com/tekartik/sqflite/blob/master/sqflite_common_ffi/lib/src/database_tracker.dart

  • I only store the database pointer and close automatically all existing databases if the tracker table exists
  • I added a catch on the insert. sqlite tend to reuse the same pointer and somehow I had a unit test showing this issue (constraint issue inserting twice the same pointer) in some edge case scenario

I have a manual test that was showing the issue (open, begin transaction, hot restart, open, begin transaction) which is now fixed.

I will likely do something similar (close the database instead of trying to use a connection like i do in Android/iOS/MacOS)

Great! And thanks for pointing me towards your initial workaround which I used for inspiration.

Did you see the problem this caused on iOS? It might be related to sqlite re-using pointers where we might end up closing the same database twice.

Did you see the problem this caused on iOS

I only tested on Android and Linux. When a database is closed and another is opened, typically the pointer is the same (just saw this using print)

@simolus3 Do you need more information on that iOS crash or do you already have a suspicion?

I suspect that this would happen because we somehow close the same database more than once. So far I didn't have access to a Mac so I couldn't verify this though. If you want to help, adding a debug print here (something like print('sqlite3_close_v2(_handle.address)')) would help to verify this.

It is always a different pointer and only gets called once. The app immediately crashes afterwards.

flutter: #################################
flutter: sqlite3_close_v2(140232487987744)
flutter: sqlite3_close_v2(Pointer<sqlite3>: address=0x7f8a6ba1ea20)
flutter: #################################
Lost connection to device.

Same happens on a real iOS device.

On Android this actually does not get executed or I can't get it to log.

markOpen gets called, markClose doesn't seem to get called. And in closeExisting the result is always empty.

Yeah I can reproduce this on Android too. The cached in-memory database used by the tracker seems to always be empty after a stateless restart, it doesn't even contain the table :/ So far I don't know why that happens, it doesn't seem like the VM would call dlclose or anything.

Either way, that just makes it even weirder since I'd expect the real database to be closed too if in-memory databases vanish. Maybe the lock we're seeing here is file based and we can free it by deleting some files manually.

I created a sqlite(sqlcipher actually) plugin using ffi, and I got the same issue, database get locked after hot-reload. What I don't understand is that, when I reload the app, there is no database operation existing, thus there should not be any transaction happening, why it's still get locked?

any news on it?
Struggling with it too...

Was this page helpful?
0 / 5 - 0 ratings