Sonarr currently uses WAL mode for journals with SQLite. WAL mode has some advantages, but one major disadvantage is that it cannot safely be used over non-local filesystems (https://sqlite.org/wal.html); Docker for Windows and other virtualization systems using CIFS-mounted host paths often fail with SQLite locking or corruption errors when using WAL with the SQLite file on host shared paths.
Providing an option to disable WAL mode (perhaps using standard DELETE mode) for transactions would be very useful for virtualizing Sonarr, or other cases where the config files and sqlite databases need to live on a SMB/CIFS/NFS path.
Could we have some sort of config file option or command line option that disables WAL mode journaling throughout the program?
This was already addressed for OSX in #167 - may as well just make it an advanced option, so we can set it when needed, and by default on OSX.
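For anyone unsure what "disabling WAL" means at the SQLite level, here is a minimal Python sketch of switching the journal mode with a pragma. It is not Sonarr code; `set_journal_mode` and `ALLOWED_MODES` are hypothetical names used only for illustration.

```python
import sqlite3

# Journal modes this sketch accepts; PRAGMA values cannot be bound as
# parameters, so the input is checked against an allow-list instead.
ALLOWED_MODES = {"delete", "truncate", "persist", "wal"}


def set_journal_mode(db_path: str, mode: str) -> str:
    """Ask SQLite to switch journal mode and return the mode now in effect."""
    mode = mode.lower()
    if mode not in ALLOWED_MODES:
        raise ValueError(f"unsupported journal mode: {mode}")
    conn = sqlite3.connect(db_path)
    try:
        (actual,) = conn.execute(f"PRAGMA journal_mode={mode}").fetchone()
        return actual
    finally:
        conn.close()
```

SQLite reports the mode it actually ended up in, so a caller should compare the return value with the requested mode rather than assume the switch succeeded. WAL mode is stored in the database file, so it persists across connections until changed back.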
> docker and other virtualization systems often fail with sqlite locking or corruption errors when using WAL with sqlite file on host shared paths.
Do you have a link for that with some evidence? I'd expect a linux docker to behave largely the same. A linux docker in a windows hyper-v, that's different, since it's essentially a network share unless you mount a datacontainer or mobylinux mount, instead of a windows host mount/share.
I'm not saying it can't go wrong, docker is a special animal, but I need more info to be able to determine the best course of action.
> sqlite databases need to live on a SMB/CIFS/NFS path.
It shouldn't, like never. Even other synchronization modes aren't reliable over networks, it's horrible for performance too.
> This was already addressed for OSX in #167 - may as well just make it an option.
Euh, no, rule 1 of the Fool Proof Handbook is to never add an option that the user must set to avoid breaking stuff. Either detect the edge-case and deal with it automatically, or throw a big fat warning saying it's unsupported. :smile:
For example, we might be able to detect if the db is on a known reliable fs and use wal in those cases. Or detect it's on a network or cloud drive and simply refuse to start. Or inside a docker, and force non-wal mode. Things like that.
But as I said, need more info.... please
Updated the description a bit.
To be clear, as the title of the issue said, I'm only discussing docker for windows; which uses CIFS/SMB to mount host paths. It mounts them with the "nobrl" option, which causes lock requests not to be sent to the server (https://github.com/docker/for-win/issues/11). This is unique to docker for windows, though similar problems arise on docker for osx.
If your solution is that network paths are not supported for the database files, then that's fine; it just means that anyone using docker for windows will have massive problems; and perhaps a startup warning that the appdata filesystem must be local would be nice.
I agree that requiring the user to add an option for normal behavior is bad; the flip side of that rule is that anything you set automatically should be able to be overridden by the user. Go ahead and set it on OSX; but let the end user override it if they want. I don't think application code should know about every edge case; that's what configuration files or advanced command line options are for.
At any rate, there are various complaints of Sonarr (and Radarr, and Plex) not working right/at all/being corrupted on docker for windows; CIFS is, I believe, the root cause.
> If your solution is that network paths are not supported for the database files, then that's fine; ...; and perhaps a startup warning that the appdata filesystem must be local would be nice.
That's my preferred solution for network shares, coz it's just inviting disaster regardless of sync mode.
For Docker for Windows I'd just recommend not mounting on the Windows host, but mounting on the mobyvm or using a data volume or datacontainer (volumes_from). I think we might be able to detect that scenario and force non-WAL mode, but I'll have to do some testing of /proc/mounts to see how the volume is mounted.
Tnx for the info. btw: Docker for Windows via hyper-v (win10) or virtualbox (win7)?
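As a rough illustration of the /proc/mounts idea, the following Python sketch finds which mount entry covers a given path and what filesystem type it reports. `find_mount` is a hypothetical helper, not anything from Sonarr, and it ignores the octal escapes (such as `\040` for a space) that /proc/mounts uses in paths.

```python
import posixpath


def find_mount(path, mounts_text):
    """Return (mount_point, fstype) of the mount that covers `path`, or None.

    `mounts_text` is the content of /proc/mounts: whitespace-separated
    fields, with device, mount point and filesystem type coming first.
    """
    path = posixpath.normpath(path)
    best = None
    for line in mounts_text.splitlines():
        fields = line.split()
        if len(fields) < 3:
            continue
        mount_point, fstype = fields[1], fields[2]
        if mount_point == "/":
            covers = True
        else:
            prefix = mount_point.rstrip("/")
            # Match on path boundaries so /config does not cover /configs.
            covers = path == prefix or path.startswith(prefix + "/")
        # Longest mount point wins; on ties, the later entry shadows the earlier.
        if covers and (best is None or len(mount_point) >= len(best[0])):
            best = (mount_point, fstype)
    return best
```

On a real system you would pass `open("/proc/mounts").read()`. Note that a bind mount of a host directory that is itself a network share may not show the network filesystem type inside the container, which is one reason this kind of detection can miss cases.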
> I agree that requiring the user to add an option for normal behavior is bad; the flip side of that rule is that anything you set automatically should be able to be overridden by the user. Go ahead and set it on OSX; but let the end user override it if they want. I don't think application code should know about every edge case; that's what configuration files or advanced command line options are for.
In our experience you shouldn't. Yes, advanced users are quite capable of making those decisions. But Sonarr isn't intended for advanced users and any option is likely to be abused/misused (we have empirical evidence on that, and dozens of wasted support hours to drive the point home). So any (hidden/config-file only) option should be carefully considered, and avoided as much as possible. There usually is a better solution.
As long as the average user doesn't even bother reading the info tooltips in the UI... _sigh_ I digress.
I'd argue that if Sonarr errs on the side of caution and only uses WAL in cases where it knows it will work, then no option is needed. We just need to find out if that's feasible.
My understanding of the terminology is that Docker for Windows uses Hyper-V (Windows 10 only), while Docker Toolbox for Windows uses Virtualbox (Windows 7+). In this issue, I'm discussing Hyper-V with MobyLinuxVM; the host paths are from the parent Windows host; using a config path in the MobyLinuxVM is an option, but there is no easy way to tell docker to do that; and afaik, all docker volume drivers on windows will use CIFS as well.
I'm a long-time backend services coder, so don't think so well about normal user usability issues grin. That said, perhaps taking a progressive enhancement approach for things like WAL mode might be better: by default, use the most compatible journaling mode (DELETE, iirc); if you detect a supported filesystem, enable WAL mode. This will allow for usage on unknown filesystems without code changes.
Still, I agree using sqlite on what is essentially a network filesystem is a bad idea; I just don't know of a better solution. The only other options I can think of involve rsync/unison with inotify; and that has its own problems.
Really, though, this isn't a Sonarr problem; it's a Docker for Windows problem, that they made worse by disabling file locking.
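The progressive-enhancement idea (default to DELETE, opt in to WAL only on filesystems known to be fine) could be sketched like this in Python. The allow-list below is an illustrative assumption, not a vetted list, and `choose_journal_mode` is a hypothetical name.

```python
# Assumption: filesystem types treated as safe for WAL in this sketch.
# A real list would need testing on each supported platform.
WAL_SAFE_FSTYPES = {"ext4", "xfs", "btrfs", "ntfs", "apfs"}


def choose_journal_mode(fstype):
    """Return 'wal' only for known-local filesystems, otherwise 'delete'."""
    if fstype and fstype.lower() in WAL_SAFE_FSTYPES:
        return "wal"
    # Unknown, missing, or network filesystem types (cifs, nfs4, 9p, ...)
    # fall back to the most compatible rollback-journal mode.
    return "delete"
```

The point of the allow-list direction is that an unrecognised filesystem degrades to the slower but more compatible mode without any code change.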
Hi,
look at this config file. Maybe it's worth a shot: https://system.data.sqlite.org/index.html/artifact?ci=trunk&filename=System.Data.SQLite/Configurations/System.Data.SQLite.dll.config
And just for information: If docker is running on top of a Linux system (virtualized in Hyper-V or not), path mapping works as expected and the database works as expected.
I'm running a Linux VM inside Hyper-V that contains a docker environment containing Sonarr. The storage backend is LVM and the config and data paths are mapped into the container. Works.
Have a look at create_container():
# cat /etc/docker/containers/sonarr.on
container_name="sonarr"
container_hostname="$container_name"
container_image="linuxserver/sonarr"
container_update_auto=1
function stop_container {
    docker stop "$container_name"
}
function start_container {
    docker start "$container_name";
}
function delete_container {
    docker rm "$container_name"
}
function create_container {
    docker create \
        -e PUID=1002 \
        -e PGID=1006 \
        --hostname "$container_hostname" \
        --ip 10.1.1.3 \
        --name "$container_name" \
        --net vidnet \
        --restart always \
        -v /etc/ssl/certs:/etc/ssl/certs:ro \
        -v /dev/rtc:/dev/rtc:ro \
        -v /srv/data/sonarr/config:/config \
        -v /srv/data/sabnzbd/downloads:/downloads \
        -v /srv/data/sonarr/pickup:/pickup \
        -v /srv/data/sonarr/recycle:/recycle \
        -v /srv/videos/tv:/tv \
        "$container_image"
}
function canbestopped_container {
    return 0;
}
And then in /srv/data/sonarr/config:
# ls {logs,nzbdrone}.*
logs.db logs.db-shm logs.db-wal nzbdrone.db nzbdrone.db-shm nzbdrone.db-wal
Cu
@Grimeton - of course it does. We're specifically discussing Docker for Windows, which uses MobyLinuxVM running on Hyper-V, with paths on the Windows host. In this (standard) configuration, any paths on the Windows host are mounted via SMB/CIFS.
@lokkju Yeah the question came up, so I clarified it.
I have an armv7 docker swarm cluster, running a sonarr container among a lot of other things. This cluster has a glusterfs server on all the nodes, set up for replication. I mount it locally on all nodes using the glusterfs fuse filesystem to localhost. In short, I have a local filesystem on all the nodes with the same data.
This works for everything but sonarr, which corrupts the sqlite3 database about once every two days on average.
My workaround is to backup the database (.dump) every hour. If database corrupts, it automatically remove all sqlite databases and restore a new one from the last working dump.
Would be nice to have an option "use WAL" or something like that on the configuration to get rid of this. Or support external relational databases (mysql, postgres, ...). I think external databases would be a lot of work mainly because of version migrations, advanced selects, so on, but the option to use wal or not, should be simple to add.
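The hourly dump-and-restore workaround described above can be approximated in Python with the standard `sqlite3` module. This is only a sketch of the idea, not the poster's actual script; the function names are made up for illustration.

```python
import sqlite3


def is_healthy(db_path):
    """True if SQLite's integrity check reports 'ok' for the database."""
    try:
        conn = sqlite3.connect(db_path)
        try:
            return conn.execute("PRAGMA integrity_check").fetchone()[0] == "ok"
        finally:
            conn.close()
    except sqlite3.DatabaseError:
        # Raised, for example, when the file is not a readable database.
        return False


def dump_to_sql(db_path, dump_path):
    """Write a text dump (similar to the sqlite3 shell's .dump) to a file."""
    conn = sqlite3.connect(db_path)
    try:
        with open(dump_path, "w", encoding="utf-8") as out:
            for statement in conn.iterdump():
                out.write(statement + "\n")
    finally:
        conn.close()


def restore_from_sql(dump_path, new_db_path):
    """Rebuild a fresh database file from a text dump."""
    with open(dump_path, encoding="utf-8") as src:
        script = src.read()
    conn = sqlite3.connect(new_db_path)
    try:
        conn.executescript(script)
    finally:
        conn.close()
```

A scheduler would call `dump_to_sql` only while `is_healthy` returns true, and call `restore_from_sql` into a new file (with Sonarr stopped) when it does not, so that a corrupt database never overwrites the last good dump.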
Just some update, the latest way to use docker on windows is LCOW, which uses "linuxkit" running inside hyper-v. It seems they now use 9p to share the volumes, which also results in a lot of errors and makes any container that uses sqlite in WAL mode unusable or any locking operation for that matter.
I got the answer in #1385 that this is a Microsoft problem. It's still crazy to me that, after all those years, docker can't handle sqlite + WAL on windows. It's a real shame, since LCOW works great otherwise and is a huge improvement over the old mode and docker toolkit.
This isn't a Docker/Windows/CIFS issue. I get the same behaviour on Docker Swarm on Ubuntu using NFS. Oddly, this worked fine with Kubernetes even though the NFS server was the same.
As others have mentioned, there actually are valid scenarios in which you may have to mount configuration and the database from a network share.
I also run Sonarr within Docker Swarm (with only one replica), and it is quite common for the container to be moved from one node to another when the original one goes offline or while re-balancing load. Local storage isn't an option in this scenario.
For it to work on CIFS, the nobrl option was necessary, and when using NFS, it is very common for background tasks to throw the "The database is locked" error in the logs. Fortunately I haven't seen database corruption yet.
It clearly wasn't built with the intention to be deployed in this manner, so I'm quite surprised that it actually works and performance isn't bad at all. But the database locking is indeed still an issue, and at some point I assume I'll start seeing database corruption.
Given that supporting remote databases would be a huge rewrite, it seems the WAL setting might be an interesting workaround.
It would be great if Sonarr could determine what it should use on its own, but it might be a bit difficult. I mount the network shares on the hosts, and use docker bind volumes, so Sonarr would just see it as any other volume. Maybe it could try to execute one of these commands that cause locking problems, and make the suggestion to change the setting, or something like that.
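The "try it and see" suggestion could look roughly like the following Python sketch: create a throwaway database next to the real one, try to put it into WAL mode and write to it, and treat any SQLite error as a reason to fall back. `wal_works_in` and the probe file name are hypothetical.

```python
import os
import sqlite3


def wal_works_in(directory):
    """Best-effort check: can a scratch database in `directory` use WAL?"""
    probe = os.path.join(directory, ".wal_probe.db")
    try:
        conn = sqlite3.connect(probe, timeout=1.0)
        try:
            (mode,) = conn.execute("PRAGMA journal_mode=WAL").fetchone()
            if mode != "wal":
                return False
            conn.execute("CREATE TABLE IF NOT EXISTS t (x INTEGER)")
            conn.execute("INSERT INTO t VALUES (1)")
            conn.commit()
            return conn.execute("SELECT COUNT(*) FROM t").fetchone()[0] >= 1
        finally:
            conn.close()
    except sqlite3.Error:
        return False
    finally:
        # Remove the scratch database and any WAL side files left behind.
        for suffix in ("", "-wal", "-shm"):
            try:
                os.remove(probe + suffix)
            except OSError:
                pass
```

A passing probe is weak evidence only. On a share where lock requests are silently dropped (as with the `nobrl` mounts mentioned earlier), a single-process test like this may succeed even though concurrent access later fails, so a failure is more informative than a success.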
This isn't a Windows only issue. I get the same errors on Rancher 2.1 Kubernetes, using NFS Persistent Volume to a ReadyNAS NFSv4.
My research shows it is a known issue with sqlite not playing nice with NFS's locking and the answer might be to allow nolock as an option.
I'd like to chime in too with this problem. I use a Docker container for Sonarr. It only happens when I use NFS as the datastore. This would be great to get working, as others also want their persistent data stored on an NFS server. The nolock mount option does nothing in my case. Sonarr appears to be functioning just fine while giving these System.Data.SQLite.SQLiteException (0x80004005): database is locked errors, but I could see it leading to greater problems.
My NFS mount options are:
[Mount]
What=freenas.lan:/mnt/Pergamum/Docker
Where=/nas/freenas.lan/Docker
Type=nfs4
Options=noatime,nolock,soft,rsize=32768,wsize=32768,timeo=900,retrans=5,_netdev
> The `nolock` mount option does nothing in my case.
Oh well, it was a stab in the dark and thanks for eliminating that.
I would love to have the priority bumped on this issue. Out of 30 containers, Lidarr, Radarr & Sonarr are the only applications I run that cannot use NFS for application data. :(
For now I have just stored their application data to the VM instead of my NFS share.
Can confirm - this is still an issue in the Sonarrv3 previews.
REALLY sucky...
Just wanted to give my $.02. I have the same issue. Trying to run sonarr in a kubernetes cluster has been... Painful.
I ended up having a container first grabbing a copy of the data from the nfs share and then putting it on a local share. Then start sonarr and have another container do a copy back to the nfs share to have a somewhat reliable backup of it.
It's gross. The db is going to get corrupted someday because the container is gonna crash in the middle of the transfer. And while it's in no way sonarr's fault that sqlite is garbage over network share... It would be really nice to have a fix, or be able to run against mysql/postgres/...
And yeah like mentioned before, the nolock option doesn't solve that issue.
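For the copy-back step, one way to reduce the risk of a half-written backup is to take the snapshot with SQLite's online backup API and only then rename it into place. The Python sketch below assumes Python 3.7+ (for `Connection.backup`); `snapshot` is a hypothetical helper, not part of the setup described above.

```python
import os
import sqlite3


def snapshot(src_db, dst_db):
    """Copy a live SQLite database to dst_db without exposing partial files."""
    tmp = dst_db + ".tmp"
    if os.path.exists(tmp):
        os.remove(tmp)
    src = sqlite3.connect(src_db)
    dst = sqlite3.connect(tmp)
    try:
        # The backup API produces a consistent copy even if src is in use.
        src.backup(dst)
    finally:
        dst.close()
        src.close()
    # Swap the finished copy into place in a single rename.
    os.replace(tmp, dst_db)
```

If the container dies mid-copy, only the `.tmp` file is affected and the previous snapshot stays intact. How atomic the final rename is on a given network share depends on the server, so treat this as a mitigation rather than a guarantee.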
Looks like it does this with sqlite on nfs for me too. nolock did not fix the issue. @Xaelias I will probably do the same thing as you.
@markus101 @Taloth would it be possible to include a start up argument to disable wal? Seeing that it's disabled on OSX it would be nice to make it configurable for people that use NFS shares.
As linked in the comment above, I'm getting 'database disk image is malformed' errors. I have my persistent docker storage mounted from a glusterfs share. I tried using the local disk as suggested by @markus101 and I'm not getting the errors anymore, but I really want the safety and redundancy of the glusterfs server I painstakingly set up.
Can't we opt for a separate mariadb or postgres db instead of sqlite?
FYI to everyone saying "nolock didn't work", the option @yamlCase mentioned/linked is a SQLite option used when loading the database file, not an NFS mount option.
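To make the distinction concrete: SQLite accepts a `nolock=1` query parameter on URI filenames, which is separate from the NFS `nolock` mount option. The Python sketch below shows how such a URI is opened; `connect_nolock` is a hypothetical helper, and nothing here implies Sonarr exposes this setting.

```python
import sqlite3
from pathlib import Path


def connect_nolock(db_path):
    """Open a database through a URI filename with SQLite's nolock=1 option."""
    # as_uri() needs an absolute path and percent-encodes special characters.
    uri = Path(db_path).resolve().as_uri() + "?nolock=1"
    return sqlite3.connect(uri, uri=True)
```

The SQLite documentation describes `nolock` as disabling file locking in rollback-journal modes and warns that corruption can result if more than one process writes to the database, so it only makes sense when a single process is guaranteed to own the file.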
+1 would love to see an enhancement to address this.
Grafana and Sonarqube have the option to connect to external persistent data stores such as MySQL. It would be wonderful to see something similar with Sonarr.
This is still a headache for me. If a config flag to disable WAL mode isn't an option, how about just an environment variable that advanced users can set?
Did retest this with all the latest docker/lcow/windows stuff and still get disk I/O error and NzbDrone.Core.Datastore.CorruptDatabaseException.
Docker version master-dockerproject-2019-06-05, build c02f389c
Kernel Version: 10.0 18362 (18362.1.amd64fre.19h1_release.190318-1202)
Operating System: Windows 10 Pro Version 1903 (OS Build 18362.145)
4.19.27-linuxkit
Seems the 9p filesystem still lacks compatible locking options and many linux containers wont run correctly via LCOW, see: linux-containers
connectionBuilder.JournalMode = OsInfo.IsOsx ? SQLiteJournalModeEnum.Truncate : SQLiteJournalModeEnum.Wal;
If it works with OSX, why couldn't it be used with Linux?
@ggzengel that is exactly what I asked. There should be a start up parameter that disables wal
Hitting the same issues here with config directories hosted on NFS and mounted to container running via rancher2/k8s.
System.Data.SQLite.SQLiteException (0x80004005): database is locked
database is locked
at System.Data.SQLite.SQLite3.Step (System.Data.SQLite.SQLiteStatement stmt) [0x00088] in <61a20cde294d4a3eb43b9d9f6284613b>:0
at System.Data.SQLite.SQLiteDataReader.NextResult () [0x0016b] in <61a20cde294d4a3eb43b9d9f6284613b>:0
at System.Data.SQLite.SQLiteDataReader..ctor (System.Data.SQLite.SQLiteCommand cmd, System.Data.CommandBehavior behave) [0x00090] in <61a20cde294d4a3eb43b9d9f6284613b>:0
at (wrapper remoting-invoke-with-check) System.Data.SQLite.SQLiteDataReader..ctor(System.Data.SQLite.SQLiteCommand,System.Data.CommandBehavior)
at System.Data.SQLite.SQLiteCommand.ExecuteReader (System.Data.CommandBehavior behavior) [0x0000c] in <61a20cde294d4a3eb43b9d9f6284613b>:0
at System.Data.SQLite.SQLiteCommand.ExecuteNonQuery (System.Data.CommandBehavior behavior) [0x00006] in <61a20cde294d4a3eb43b9d9f6284613b>:0
at System.Data.SQLite.SQLiteCommand.ExecuteNonQuery () [0x00006] in <61a20cde294d4a3eb43b9d9f6284613b>:0
at Marr.Data.QGen.UpdateQueryBuilder1[T].Execute () [0x0003b] in C:\BuildAgent\work\5d7581516c0ee5b3\src\Marr.Data\QGen\UpdateQueryBuilder.cs:157
at Marr.Data.DataMapper.Update[T] (T entity, System.Linq.Expressions.Expression1[TDelegate] filter) [0x00000] in C:\BuildAgent\work\5d7581516c0ee5b3\src\Marr.Data\DataMapper.cs:674
at NzbDrone.Core.Datastore.BasicRepository1[TModel].Update (TModel model) [0x0002a] in C:\BuildAgent\work\5d7581516c0ee5b3\src\NzbDrone.Core\Datastore\BasicRepository.cs:125
at NzbDrone.Core.Tv.SeriesService.UpdateSeries (NzbDrone.Core.Tv.Series series, System.Boolean updateEpisodesToMatchSeason) [0x000a9] in C:\BuildAgent\work\5d7581516c0ee5b3\src\NzbDrone.Core\Tv\SeriesService.cs:160
at NzbDrone.Core.Tv.RefreshSeriesService.RefreshSeriesInfo (NzbDrone.Core.Tv.Series series) [0x00213] in C:\BuildAgent\work\5d7581516c0ee5b3\src\NzbDrone.Core\Tv\RefreshSeriesService.cs:110
at NzbDrone.Core.Tv.RefreshSeriesService.Execute (NzbDrone.Core.Tv.Commands.RefreshSeriesCommand message) [0x00072] in C:\BuildAgent\work\5d7581516c0ee5b3\src\NzbDrone.Core\Tv\RefreshSeriesService.cs:175
at NzbDrone.Core.Messaging.Commands.CommandExecutor.ExecuteCommand[TCommand] (TCommand command, NzbDrone.Core.Messaging.Commands.CommandModel commandModel) [0x000f6] in C:\BuildAgent\work\5d7581516c0ee5b3\src\NzbDrone.Core\Messaging\Commands\CommandExecutor.cs:95
at (wrapper dynamic-method) System.Object.CallSite.Target(System.Runtime.CompilerServices.Closure,System.Runtime.CompilerServices.CallSite,NzbDrone.Core.Messaging.Commands.CommandExecutor,object,NzbDrone.Core.Messaging.Commands.CommandModel)
at System.Dynamic.UpdateDelegates.UpdateAndExecuteVoid3[T0,T1,T2] (System.Runtime.CompilerServices.CallSite site, T0 arg0, T1 arg1, T2 arg2) [0x00035] in <35ad2ebb203f4577b22a9d30eca3ec1f>:0
at (wrapper dynamic-method) System.Object.CallSite.Target(System.Runtime.CompilerServices.Closure,System.Runtime.CompilerServices.CallSite,NzbDrone.Core.Messaging.Commands.CommandExecutor,object,NzbDrone.Core.Messaging.Commands.CommandModel)
at NzbDrone.Core.Messaging.Commands.CommandExecutor.ExecuteCommands () [0x00027] in C:\BuildAgent\work\5d7581516c0ee5b3\src\NzbDrone.Core\Messaging\Commands\CommandExecutor.cs:41
@markus101 @Taloth would you be open to a MR that adds a command line argument to disable WAL? It's only a work around but it's honestly all we have besides adding a way to connect to an external DB.
@onedr0p Preferably not. As mentioned before, it's a nice workaround for advanced users, but you don't want users to actively have to configure something for it to work, if it can be helped at all. (The irony of that statement doesn't escape me with respect to how long this issue has been open.)
I'd prefer if it works in reverse: use wal only if it's on a local drive.
I can whip something up on a v3 feature branch, but that will have to be tested on various setups.
In fact, if you wish you can try to make the necessary change yourself: Run IDiskProvider.GetMount(...) on the appdata dir during the ConnectionStringFactory call, it should contain the necessary info to determine whether the appdata dir is a local drive and use WAL/Journal accordingly. Getting an IDiskProvider instance is possible by adding it to the ConnectionStringFactory constructor.
> I'd prefer if it works in reverse: use wal only if it's on a local drive.
The issue has nothing to do with local vs network, as the underlying problem is about locking and other FS features, which are not implemented on some filesystems or work "funny" or in a limited way on others. As noted, we get similar errors with a 9P "local" filesystem on Docker for Windows mounts. So if WAL needs certain FS features to work correctly, then it needs to probe for the specific features on the FS itself.
@Taloth like @Andy2244 said it is much more than network filesystems, but that is what most people are struggling with including me.
I have opened PR #3180 to add a start up arg to disable wal. I have tested locally and should work :)
> then it needs to probe for the specific features on the FS itself.
That's a valid point. It might be doable but I'm not sure if we can do that properly for all supported platforms, it's worth an attempt.
Although I have to note that 9P is a network filesystem protocol, not a local filesystem. I'm not sure if it's detected as such by mono and/or /proc/mounts but that we can deal with.
moby used to connect to the windows host via CIFS, also a network filesystem protocol and is already detected as such by sonarr.
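One concrete feature that could be probed is POSIX byte-range locking, which SQLite relies on under Unix. The Python sketch below tries to take and release a lock on a temporary file in the target directory. It is POSIX-only (the `fcntl` module is unavailable on Windows), and `byte_range_lock_ok` is a hypothetical name.

```python
import fcntl
import os
import tempfile


def byte_range_lock_ok(directory):
    """Try a non-blocking exclusive byte-range lock on a temp file there."""
    fd, path = tempfile.mkstemp(dir=directory, prefix=".lockprobe-")
    try:
        try:
            # Lock one byte at offset 0, then release it again.
            fcntl.lockf(fd, fcntl.LOCK_EX | fcntl.LOCK_NB, 1, 0, os.SEEK_SET)
            fcntl.lockf(fd, fcntl.LOCK_UN, 1, 0, os.SEEK_SET)
            return True
        except OSError:
            # For example, the filesystem reports that locks are unavailable.
            return False
    finally:
        os.close(fd)
        os.remove(path)
```

This only detects filesystems that refuse the lock outright. A mount that accepts the call locally but never forwards it to the server would still pass, so the probe can rule a filesystem out but cannot fully rule it in.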
PR updated to https://github.com/Sonarr/Sonarr/pull/3183
Jumping in to share my ~shitty hack~ _workaround_ for Kubernetes users that rely on NFS for persistence; use a sidecar container, mount an ext4 image file backed by a local disk or ram, and fsfreeze it every 5 minutes to copy a snapshot of Sonarr's database files.
YMMV, but this seems to work because sqlite still makes atomic writes to disk, even in WAL mode. Using fsfreeze on a real filesystem like ext4 prevents Sonarr from writing further changes until you've finished copying them off to NFS storage.
Tradeoff is that you might lose the last 5 minutes of activity if there's an unexpected outage.
apiVersion: extensions/v1beta1
kind: Deployment
metadata:
  labels:
    app: sonarr
  name: sonarr
spec:
  replicas: 1
  selector:
    matchLabels:
      app: sonarr
  strategy:
    type: Recreate
  template:
    metadata:
      labels:
        app: sonarr
    spec:
      containers:
      # Sidecar: creates a sparse ext4 image, mounts it, seeds it from the
      # NFS-backed volume, then periodically freezes it and copies it back.
      # mount takes the image file first, then the mount point.
      - command:
        - sh
        - -c
        - |-
          dd if=/dev/zero of=/ramdisk/image.ext4 count=0 bs=1 seek=400M; \
          mkfs.ext4 /ramdisk/image.ext4; \
          mount /ramdisk/image.ext4 /mnt/sonarr-ramdisk-mount; \
          cp -fvp /sonarr-config/*.* /mnt/sonarr-ramdisk-mount; \
          while true; do \
          sleep 890; \
          sync /mnt/sonarr-ramdisk-mount/*.*; \
          fsfreeze --freeze /mnt/sonarr-ramdisk-mount; \
          sleep 10; \
          cp -fvp /mnt/sonarr-ramdisk-mount/*.* /sonarr-config/; \
          fsfreeze --unfreeze /mnt/sonarr-ramdisk-mount; \
          done;
        image: ubuntu
        imagePullPolicy: Always
        lifecycle:
          preStop:
            exec:
              command:
              - umount /mnt/sonarr-ramdisk-mount;
        name: sonarr-config
        resources: {}
        securityContext:
          privileged: true
        volumeMounts:
        - mountPath: /ramdisk
          name: ramdisk
        - mountPath: /sonarr-config
          name: sonarr-config
        - mountPath: /mnt/sonarr-ramdisk-mount
          mountPropagation: Bidirectional
          name: sonarr-ramdisk-mount
      # Sonarr itself: waits until the database has been seeded, then starts.
      - command:
        - sh
        - -c
        - until [ -f "/config/sonarr.db" ]; do sleep 1; done; /init
        env:
        - name: PUID
          value: "1000"
        - name: PGID
          value: "1000"
        image: linuxserver/sonarr:preview
        imagePullPolicy: IfNotPresent
        name: sonarr
        ports:
        - containerPort: 8989
          name: sonarr
          protocol: TCP
        readinessProbe:
          tcpSocket:
            port: 8989
        resources: {}
        volumeMounts:
        - mountPath: /config
          mountPropagation: HostToContainer
          name: sonarr-ramdisk-mount
        - mountPath: /config/Backups
          name: sonarr-config
          subPath: Backups
        - mountPath: /config/MediaCover
          name: sonarr-config
          subPath: MediaCover
        - mountPath: /config/logs
          name: sonarr-config
          subPath: logs
        - mountPath: /config/xdg
          name: sonarr-config
          subPath: xdg
        - mountPath: /media
          mountPropagation: HostToContainer
          name: media
      volumes:
      - emptyDir:
          medium: Memory
          sizeLimit: 400M
        name: ramdisk
      - name: sonarr-config
        persistentVolumeClaim:
          claimName: sonarr-config
      - emptyDir: {}
        name: sonarr-ramdisk-mount
      - hostPath:
          path: /tmp/media
          type: Directory
        name: media
A few tips for those on network filesystems:
- NFS mount options: `local_lock=all`, `lookupcache=none`, `noac`, `sync`, `sharecache`, `forcedirectio`
- GlusterFS:
  - `locks.mandatory-locking: forced`
  - `direct-io-mode=enable` when mounting
  - `performance.strict-o-direct: on`
  - `performance.stat-prefetch: off`
  - `performance.write-behind: off`
  - `performance.open-behind: off`

Note: This probably won't apply to Windows, as locks are completely broken there.
Hope this helps!
Adding yet another voice to the chorus of people here; we definitely need a workaround for filesystems which don't support WAL.
@btowntkd the issue we found was that disabling WAL causes major lag when using a large collection of series. I still want this problem to be solved don't get me wrong. It seems like SQLite and Sonarr don't play well for our use case. :(
I have also experienced this issue when running Sonarr with /config mapped to a volume on a remote NFS or CIFS share.
+1, would be awesome to see this issue get resolved.
> @btowntkd the issue we found was that disabling WAL causes major lag when using a large collection of series. I still want this problem to be solved don't get me wrong. It seems like SQLite and Sonarr don't play well for our use case. :(
So then - can we not put this behind an off by default setting? Personally, I would much rather deal with lag than deal with weekly db corruption.
> So then - can we not put this behind an off by default setting? Personally, I would much rather deal with lag than deal with weekly db corruption.
I'm not sure I agree no. Weekly DB corruption sounds like you have something else going on.
Me and a couple other people have shown how we work around this specific problem. I've been running this for months now. And never once had an issue.
Don't get me wrong, I would love for Sonarr to have a support for real postgres or something. But until someone does the work, I don't think there is a real solution. Now yes, a flag that allows enabling/disabling the offending option. Sure, that can't really hurt. The default should probably stay what it currently is IMO.
@Xaelias - so you're agreeing with me then, a flag that is off by default can't really hurt?
@SerialVelocity 's solution doesn't work for me, I'm actually doing something similar to your solution myself but as you say, "it's gross".
@fergalmoran I may have misread your proposition. Yeah we probably agree then.
Update for the Windows/docker users out there.
I just re-tested the latest Docker Desktop CE 2.1.6.0 (Edge) for Windows and got working sqlite + WAL for Sonarr/Headphones containers!
This is using the default hyper-v "Linux Container" backend + "shared drive" feature, not Lcow/WSL2. So maybe give it a try again and see if the sqlite db's don't corrupt anymore and maybe even network shares might work, not sure how smb over those new "shared drives" behaves. I did not test the latest inotify stuff, but if it works as well a lot of containers should now run correctly from windows bind mounts.
This means if all is stable, we can finally use the Docker CE to get our containers working "natively" on Win10. Currently as hack, i use a hyper-v vm with ClearLinux/Docker + reverse samba4 server for my Sonarr setup, so this "new" way via direct bind mounts might finally work.
2.1.5.0 introduced this:
New file sharing implementation: Docker Desktop introduces a new file sharing implementation which uses gRPC, FUSE, and Hypervisor sockets instead of Samba, CIFS, and Hyper-V networking. The new implementation offers improved I/O performance.
2.1.6.0 this:
Docker Desktop now supports inotify events on shared filesystems for Windows file sharing.
All my quick tests that failed before now work correctly, while using a bind mount from my host NTFS drive. _(You need to add your drive to share via settings and then can use it directly, but make sure the folders exist.)_
Examples via PowerShell:
docker run -it --name=sonarr -v f:\docker\test2:/config -e PGID=0 -e PUID=0 -p 8989:8989 linuxserver/sonarr
docker run -it --name="headphones" -v f:\docker\test1:/config -p 8181:8181 linuxserver/headphones
PS: I assume the same stack (_gRPC, FUSE, and Hypervisor sockets_) is utilized for their WSL2 backend, while the old experimental Lcow backend (_Windows Containers_) will not use it, since it has no "shared drives" option.
Maybe someone brave can test the latest DD/Edge version with latest WSL2(_19018_) and check if it behaves correctly as well? I'm confused how Docker Desktop + WSL2 actually works regarding bind mount from the host.
Just set up a homelab cluster based on Nomad and NFS as a shared data store. Quite discouraged after hours of effort to find this issue and realize I can't (at least with my skills) get Sonarr up and running. One more vote for some movement on a "real" solution for this or a flag that can be set.
@natelandau If iSCSI is an option for you, using it as persistent volume will avoid this problem with SQlite and NFS.
So I came here from a thread on reddit about Bazarr. I have the following setup:
So the interesting part is that Sonarr, Radarr and Lidarr are all running fine on the virtual DSM, with the configuration stored on the NFS share. I installed Bazarr, and it immediately failed with a locking error which is obviously related to WAL.
Moving Bazarr container onto the 'real' DSM, and storing the config in exactly the same place on the volume, just using the direct path rather than mounting that folder as an NFS share, works just fine.
What's weird is that in theory, Sonarr, Radarr and Lidarr should all fail with the NFS share, if they have WAL enabled...
Either way, another vote here for a DISABLE_SQLITE_WAL option for all of these containers. :)
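An opt-out of the kind requested here could be as small as the following Python sketch. `DISABLE_SQLITE_WAL` is the name suggested in the comment above, not an existing setting in Sonarr or the container images, and `journal_mode_from_env` is a hypothetical helper.

```python
import os


def journal_mode_from_env(environ=None):
    """Return 'delete' when DISABLE_SQLITE_WAL is set to a truthy value."""
    env = os.environ if environ is None else environ
    value = env.get("DISABLE_SQLITE_WAL", "").strip().lower()
    # Anything other than an explicit truthy value keeps the WAL default.
    return "delete" if value in {"1", "true", "yes", "on"} else "wal"
```

Keeping WAL as the default means existing installs see no change unless the variable is explicitly set, which matches the "off by default" flag discussed earlier in the thread.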
> Jumping in to share my ~shitty hack~ _workaround_ for Kubernetes users that rely on NFS for persistence; use a sidecar container, mount an ext4 image file backed by a local disk or ram, and `fsfreeze` it every 5 minutes to copy a snapshot of Sonarr's database files.
>
> YMMV, but this seems to work because sqlite still makes atomic writes to disk, even in WAL mode. Using `fsfreeze` on a real filesystem like ext4 prevents Sonarr from writing further changes until you've finished copying them off to NFS storage.
>
> Tradeoff is that you might lose the last 5 minutes of activity if there's an unexpected outage.
Came here looking for a solution for the same type of SQLite WAL database corruption issues on Gluster... Seems like the above workaround from @putty182 may be a good idea... Now just have to try to figure out how to translate the workaround to GlusterFS using docker swarm services?
> Came here looking for a solution for the same type of SQLite WAL database corruption issues on Gluster... Seems like the above workaround from @putty182 may be a good idea... Now just have to try to figure out how to translate the workaround to GlusterFS using docker swarm services?
Today I moved from glusterfs 3.x to 7.x, both Sonarr and Radarr are no longer corrupting their databases for me it seems so far, doing some testing at the moment to confirm.
I feel like SQLite's WAL mode might be unfairly attacked in this thread. The suggestion that Sonarr should introduce an option to disable it entirely seems like an unnecessarily blunt instrument.
The SQLite page on WAL does indeed say that "WAL does not work over a network filesystem". I think it says this, though, because WAL is implemented using shared memory, in this case through a memory-mapped file. Since this method of sharing memory through a memory-mapped file isn't well supported by network filesystems, SQLite can't make the necessary correctness guarantees for separate hosts that are reading the SQLite database off a network share. However, if you can guarantee that you have no more than one machine using the SQLite database, I don't think that there's anything _inherent_ to the way WAL mode works that should cause corruption. I imagine that for most deployments of Sonarr, having a single instance deployed is reasonable (I can't imagine many people are load-balancing Sonarr or have it set up in an HA configuration).
I experience `database is locked` errors myself when running Sonarr in a Kubernetes cluster, with its configuration database served from an NFSv3 share mounted with the nolock option. I wanted to create a reproducible test case that demonstrates the problem, but I have not been successful so far. What I have tried is just reading and writing to a WAL-mode SQLite database on an NFS share mounted inside a Docker container. I ran the example scripts listed in a blog post about WAL and those executed just fine. This demonstrates that at least in very simple scenarios, WAL seems like it could be just fine.
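For anyone who wants to try a slightly harsher test than sequential scripts, something along these lines might be a starting point. It is only a sketch: the function name `hammer`, the table, and the write count are made up, and it says nothing about how Sonarr itself accesses its database. It runs one writer thread and one reader thread, each with its own connection, and collects any `sqlite3` errors raised along the way.

```python
import sqlite3
import threading


def hammer(db_path: str, n_writes: int = 200) -> list:
    """Run one writer and one reader thread against db_path in WAL mode.

    Returns the sqlite3 errors that were raised (empty list if none).
    """
    errors = []
    done = threading.Event()

    # Create the schema and switch to WAL before the threads start.
    setup = sqlite3.connect(db_path)
    setup.execute("PRAGMA journal_mode=WAL")
    setup.execute("CREATE TABLE IF NOT EXISTS t (x INTEGER)")
    setup.commit()
    setup.close()

    def writer():
        con = sqlite3.connect(db_path, timeout=5)
        try:
            for i in range(n_writes):
                con.execute("INSERT INTO t VALUES (?)", (i,))
                con.commit()  # one small transaction per row
        except sqlite3.Error as exc:
            errors.append(exc)
        finally:
            con.close()
            done.set()  # tell the reader to stop

    def reader():
        con = sqlite3.connect(db_path, timeout=5)
        try:
            while not done.is_set():
                con.execute("SELECT COUNT(*) FROM t").fetchone()
        except sqlite3.Error as exc:
            errors.append(exc)
        finally:
            con.close()

    threads = [threading.Thread(target=writer), threading.Thread(target=reader)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return errors
```

Pointing `db_path` at a file on the NFS mount and comparing the result with a run on local disk would at least show whether plain concurrent access is enough to trigger the locking errors.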
I'm interested in reproducing the corruption and locking errors that we see in Sonarr, but I think I need to learn more about how Sonarr interacts with its SQLite database in order to do it. Specifically: does Sonarr _read_ from the SQLite database concurrently from separate threads? Does it _write_ to the SQLite database concurrently from separate threads? Does it use WAL in EXCLUSIVE locking mode?
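For reference, this is roughly what opening a database in WAL with EXCLUSIVE locking mode looks like from a client. Per the SQLite WAL documentation, when EXCLUSIVE locking mode is set before the first WAL-mode access, SQLite does not use the shared-memory wal-index at all, at the cost of only one connection being able to use the database. Whether Sonarr does anything like this is exactly the open question; `open_wal_exclusive` is a hypothetical helper, not Sonarr code.

```python
import sqlite3


def open_wal_exclusive(db_path: str) -> sqlite3.Connection:
    """Open db_path in WAL journal mode with EXCLUSIVE locking mode.

    The locking mode is set first, so it is already in effect at the
    first WAL-mode access on this connection.
    """
    con = sqlite3.connect(db_path)
    locking = con.execute("PRAGMA locking_mode=EXCLUSIVE").fetchone()[0]
    journal = con.execute("PRAGMA journal_mode=WAL").fetchone()[0]
    if (locking, journal) != ("exclusive", "wal"):
        con.close()
        raise RuntimeError(f"unexpected modes: {locking!r}, {journal!r}")
    return con
```

A second connection to the same file would then fail with a locked error once this one has touched the database, so it only fits a strictly single-process setup.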
no plans to change this at this time; closing per markus
IMO this is a big issue for some users @bakerboy448 & @markus101. I know having an external db may never be supported but maybe some future versions of sqlite may have some features to mitigate this.
Anyways, would it make sense to document this on the FAQ that Sonarr's application data is not supported over NFS/network shares and link to this issue?
Has anybody tried putting the DB on a GFS2 share? I have a 3-node Kubernetes cluster, and I fancy creating a "local" PV as a shared LVM thin partition formatted to GFS2 attached to the nodes. In theory, only one pod will access the DB, so it would not matter which node writes the SQLite file on the shared partition. This filesystem was specifically created for shared access, I wonder if SQLite would work on it fine.
@immanuelfodor if running in a kubernetes cluster I suggest iSCSI or other block storage like rook-ceph, longhorn or openebs.
@immanuelfodor if running in a kubernetes cluster I suggest iSCSI or other block storage like rook-ceph, longhorn or openebs.
Is iSCSI a solution that will allow the database on shared storage to work without these issues?
Based on @onedr0p's suggestion, I've started to experiment with Piraeus (a wrapper around Linstor, which is itself a wrapper around DRBD) to provide high-speed NVMe storage for my cluster. It can also use iSCSI under the hood as the network block storage protocol.
Find some of my questions about Piraeus usage here: https://github.com/piraeusdatastore/piraeus-operator/issues/125
My experiment isn't complete enough yet to share final conclusions regarding SQLite; I've had a busy week since then.
@2fst4u not using kubernetes, but on Docker my issue went away once I switched from NFS to iSCSI.
We're starting to get off in the weeds, but yes, iSCSI will "solve" this issue because under the hood, it works with local copies of the files. It just uses network (async) to report these changes to the NAS.
So there is no reason for SQLite to freak out when used with iSCSI.
Just wanted to give my $.02. I have the same issue. Trying to run sonarr in a kubernetes cluster has been... Painful.
I ended up having a container first grabbing a copy of the data from the nfs share and then putting it on a local share. Then start sonarr and have another container do a copy back to the nfs share to have a somewhat reliable backup of it.
It's gross. The db is going to get corrupted someday because the container is gonna crash in the middle of the transfer. And while it's in no way sonarr's fault that sqlite is garbage over network share... It would be really nice to have a fix, or be able to run against mysql/postgres/...
And yeah like mentioned before, the nolock option doesn't solve that issue.