The Elder Scrolls Online hangs at random points in the game. Sometimes it hangs after 30 minutes, sometimes after 5 hours, sometimes it doesn't hang at all. The hangs don't seem to be tied to specific actions in the game, although they mostly happen while I'm actively playing rather than AFK. When the game hangs, the music continues to play in the background, but the image no longer changes.
I first experienced hangs with Lutris, using lutris 5.5-2 and dxvk 1.6. I don't have the crash logs from these crashes anymore, but I still have the entries that got logged to the syslog:
Apr 12 01:15:45 benziuminator kernel: NVRM: GPU at PCI:0000:01:00: GPU-6168685b-aea9-0fec-d3d9-b7b1398e05a3
Apr 12 01:15:45 benziuminator kernel: NVRM: Xid (PCI:0000:01:00): 13, pid=693, Graphics Exception on GPC 0: 3D-CT KIND Violation. Coordinates: (0x3a0, 0x180)
Apr 12 01:15:45 benziuminator kernel: NVRM: Xid (PCI:0000:01:00): 13, pid=693, Graphics Exception: ESR 0x500420=0x80000100 0x500434=0x18003a0 0x500438=0x1800 0x50043c=0x100fb
Apr 12 01:15:45 benziuminator kernel: NVRM: Xid (PCI:0000:01:00): 13, pid=693, Graphics Exception on GPC 1: 3D-CT KIND Violation. Coordinates: (0x380, 0x180)
Apr 12 01:15:45 benziuminator kernel: NVRM: Xid (PCI:0000:01:00): 13, pid=693, Graphics Exception: ESR 0x508420=0x80000100 0x508434=0x1800380 0x508438=0x1800 0x50843c=0x100fb
Apr 12 01:15:45 benziuminator kernel: NVRM: Xid (PCI:0000:01:00): 13, pid=693, Graphics Exception on GPC 2: 3D-CT KIND Violation. Coordinates: (0x390, 0x180)
Apr 12 01:15:45 benziuminator kernel: NVRM: Xid (PCI:0000:01:00): 13, pid=693, Graphics Exception: ESR 0x510420=0x80000100 0x510434=0x1800390 0x510438=0x1800 0x51043c=0x100fb
Apr 12 01:15:45 benziuminator kernel: NVRM: Xid (PCI:0000:01:00): 13, pid=693, Graphics Exception: ChID 0042, Class 0000a097, Offset 0000194c, Data 00000000
I then compiled dxvk v1.6-44-gce3d0ab4, and installed it in a clean wineprefix. I was able to play for a day or two without crashes, but then it crashed again. Luckily I saved the logs this time:
wine.log
eso64_d3d11.log
eso64_dxgi.log
The following was logged to syslog this time:
Apr 24 16:13:01 benziuminator kernel: NVRM: GPU at PCI:0000:01:00: GPU-6168685b-aea9-0fec-d3d9-b7b1398e05a3
Apr 24 16:13:01 benziuminator kernel: NVRM: Xid (PCI:0000:01:00): 8, pid=691, Channel 0000002a
Today I recompiled dxvk from the latest master commit, set DXVK_LOG_LEVEL=debug, and the game hung again. Previously my PC was able to recover from a hang: I could bring a process manager to the foreground after a few seconds and then kill the game. This time I wasn't able to do anything except log in via ssh. I was then able to kill the game, but that didn't change anything, since Xorg had apparently crashed. The hung game was still displayed, and the terminal on my second monitor (connected via the internal intel gpu) was unresponsive as well. I wasn't able to kill Xorg, and reboot didn't work either, so I had to press the reset button.
Syslog:
Apr 27 00:58:18 benziuminator kernel: NVRM: Xid (PCI:0000:01:00): 62, pid=705, 0c83(1780) 00000000 00000000
Apr 27 01:01:19 benziuminator kernel: NVRM: Xid (PCI:0000:01:00): 32, pid=13862, Channel ID 00000016 intr 00800000
[...]
Apr 27 01:02:50 benziuminator kernel: NVRM: Xid (PCI:0000:01:00): 9, pid=94986, Channel 0000004c Intr 00000001
Apr 27 01:02:50 benziuminator kernel: NVRM: Xid (PCI:0000:01:00): 9, pid=94986, Channel 0000004c Intr 00000001
[...]
Apr 27 01:06:15 benziuminator kernel: INFO: task nv_queue:707 blocked for more than 122 seconds.
Apr 27 01:06:15 benziuminator kernel: Tainted: P OE 5.6.6-arch1-1 #1
Apr 27 01:06:15 benziuminator kernel: "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
Apr 27 01:06:15 benziuminator kernel: nv_queue D 0 707 2 0x80004080
Apr 27 01:06:15 benziuminator kernel: Call Trace:
Apr 27 01:06:15 benziuminator kernel: ? __schedule+0x2e8/0x7a0
Apr 27 01:06:15 benziuminator kernel: ? __switch_to_asm+0x34/0x70
Apr 27 01:06:15 benziuminator kernel: schedule+0x46/0xf0
Apr 27 01:06:15 benziuminator kernel: schedule_timeout+0x231/0x310
Apr 27 01:06:15 benziuminator kernel: __down+0x8d/0xe0
Apr 27 01:06:15 benziuminator kernel: ? __schedule+0x2f0/0x7a0
Apr 27 01:06:15 benziuminator kernel: down+0x3b/0x50
Apr 27 01:06:15 benziuminator kernel: os_acquire_mutex+0x31/0x40 [nvidia]
Apr 27 01:06:15 benziuminator kernel: _nv033293rm+0xc/0x20 [nvidia]
Apr 27 01:06:15 benziuminator kernel: ? _nv034168rm+0xb6/0x170 [nvidia]
Apr 27 01:06:15 benziuminator kernel: ? _nv009008rm+0x2f/0x130 [nvidia]
Apr 27 01:06:15 benziuminator kernel: ? rm_execute_work_item+0x3d/0xc0 [nvidia]
Apr 27 01:06:15 benziuminator kernel: ? os_execute_work_item+0x46/0x60 [nvidia]
Apr 27 01:06:15 benziuminator kernel: ? _main_loop+0x83/0x130 [nvidia]
Apr 27 01:06:15 benziuminator kernel: ? kthread+0xfb/0x130
Apr 27 01:06:15 benziuminator kernel: ? _raw_q_schedule+0x70/0x70 [nvidia]
Apr 27 01:06:15 benziuminator kernel: ? kthread_park+0x90/0x90
Apr 27 01:06:15 benziuminator kernel: ? ret_from_fork+0x35/0x40
Apr 27 01:06:15 benziuminator kernel: INFO: task GLXVsyncThread:30078 blocked for more than 122 seconds.
Apr 27 01:06:15 benziuminator kernel: Tainted: P OE 5.6.6-arch1-1 #1
Apr 27 01:06:15 benziuminator kernel: "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
Apr 27 01:06:15 benziuminator kernel: GLXVsyncThread D 0 30078 13867 0x00000084
Apr 27 01:06:15 benziuminator kernel: Call Trace:
Apr 27 01:06:15 benziuminator kernel: ? __schedule+0x2e8/0x7a0
Apr 27 01:06:15 benziuminator kernel: schedule+0x46/0xf0
Apr 27 01:06:15 benziuminator kernel: schedule_timeout+0x231/0x310
Apr 27 01:06:15 benziuminator kernel: __down+0x8d/0xe0
Apr 27 01:06:15 benziuminator kernel: ? preempt_count_add+0x68/0xa0
Apr 27 01:06:15 benziuminator kernel: down+0x3b/0x50
Apr 27 01:06:15 benziuminator kernel: os_acquire_mutex+0x31/0x40 [nvidia]
Apr 27 01:06:15 benziuminator kernel: _nv033293rm+0x15/0x20 [nvidia]
Apr 27 01:06:15 benziuminator kernel: ? _nv034168rm+0xb6/0x170 [nvidia]
Apr 27 01:06:15 benziuminator kernel: ? _nv034116rm+0x22/0xd0 [nvidia]
Apr 27 01:06:15 benziuminator kernel: ? _raw_spin_lock_irqsave+0x26/0x50
Apr 27 01:06:15 benziuminator kernel: ? _nv000909rm+0x1c9/0x940 [nvidia]
Apr 27 01:06:15 benziuminator kernel: ? _raw_spin_unlock_irqrestore+0x20/0x40
Apr 27 01:06:15 benziuminator kernel: ? rm_ioctl+0x54/0xb0 [nvidia]
Apr 27 01:06:15 benziuminator kernel: ? nvidia_ioctl+0x41/0x8a0 [nvidia]
Apr 27 01:06:15 benziuminator kernel: ? nvidia_ioctl+0x5b3/0x8a0 [nvidia]
Apr 27 01:06:15 benziuminator kernel: ? nvidia_frontend_unlocked_ioctl+0x37/0x50 [nvidia]
Apr 27 01:06:15 benziuminator kernel: ? ksys_ioctl+0x87/0xc0
Apr 27 01:06:15 benziuminator kernel: ? __x64_sys_ioctl+0x16/0x20
Apr 27 01:06:15 benziuminator kernel: ? do_syscall_64+0x4e/0x150
Apr 27 01:06:15 benziuminator kernel: ? entry_SYSCALL_64_after_hwframe+0x44/0xa9
Apr 27 01:07:01 benziuminator systemd-logind[543]: Power key pressed.
Apr 27 01:07:01 benziuminator systemd-logind[543]: Powering Off...
Apr 27 01:07:01 benziuminator systemd-logind[543]: System is powering down.
Other log files:
wine.log
eso64_d3d11.log
eso64_dxgi.log
I also tried recording an api trace, but I can get 15 fps with that at most and I really can't play the game with 15 fps for several hours.
I tried playing the game with vkd3d to see if that crashes as well, but because of graphic issues I can't really play it like that for several hours. The graphic artifacts are pretty funny though:
https://screens.totally.rip/2020/04/5e961c5eb13f4.mp4
And last but not least: I don't know if this really is a bug in dxvk or rather in the nvidia driver (or maybe also an issue with my gpu?). I watched the temperatures while playing, and the gpu temperature peaked at around 85°C, so I don't think it has anything to do with that.
Sounds like a driver bug. The game is known to run fine for people on more modern GPUs.
In case you're running out of memory, make sure to close everything that might be consuming VRAM (web browser!!!, disable Steam hardware acceleration, disable desktop effects, etc). Not sure if that would still cause trouble with current drivers though.
Hm, I didn't even think of the possibility that my VRAM might not be enough. I'll monitor the VRAM while playing from now on, and see how much memory is used when the game crashes again.
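For anyone who wants to do the same kind of monitoring, here is a minimal Python sketch. It assumes `nvidia-smi` is on the PATH and polls it in CSV mode. The function names `parse_vram`, `query_vram` and `log_vram` are made up for illustration and are not part of any tool mentioned in this thread.

```python
import subprocess
import time


def parse_vram(csv_line):
    """Parse one 'used, total' line (values in MiB) into a pair of ints."""
    used, total = (int(field.strip()) for field in csv_line.split(","))
    return used, total


def query_vram():
    """Query used/total VRAM of the first GPU via nvidia-smi."""
    result = subprocess.run(
        ["nvidia-smi", "--query-gpu=memory.used,memory.total",
         "--format=csv,noheader,nounits"],
        capture_output=True, text=True, check=True,
    )
    # One line per GPU; only the first GPU is read in this sketch.
    return parse_vram(result.stdout.splitlines()[0])


def log_vram(interval_s=5.0, samples=3):
    """Print a timestamped VRAM reading every interval_s seconds."""
    for _ in range(samples):
        used, total = query_vram()
        print(f"{time.strftime('%H:%M:%S')} {used} MiB / {total} MiB", flush=True)
        time.sleep(interval_s)
```

Running `log_vram` with a large `samples` value in a second terminal, with the output redirected to a file, should leave the last reading before a hang on disk.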
It just crashed again with a VRAM usage of only 1440MiB of 1996MiB.
eso64_dxgi.log
eso64_d3d11.log
dmesg:
[42412.756517] NVRM: GPU at PCI:0000:01:00: GPU-6168685b-aea9-0fec-d3d9-b7b1398e05a3
[42412.756520] NVRM: Xid (PCI:0000:01:00): 8, pid=25339, Channel 0000002a
So if this is a driver bug, there's not much I can do, right?
Out of curiosity. Have you already tried to rollback wine-staging from 5.7 to 5.6 or even 5.5 and/or using an older dxvk? Wine-staging 5.7 has the .NET CoreRT patch which might influence it.
I didn't try an older dxvk version yet, but the last game crash (where I posted the VRAM usage) was with the custom lutris wine version lutris-5.5.2-x86_64.
I don't use lutris myself so I cannot comment on it. For me it is just plain old wine-staging ;)
It did crash with wine-staging 5.6. I'll try again with 5.5, but I doubt this will fix the hangs.
I'll install Elder Scrolls to see if it will run on my setup. What was the last setup where it worked for you?
I haven't found a setup yet where it just works without game crashes. It does work on all setups I've tried so far, sometimes even for a really long time (8+ hours of playing) without crashes/hangs. But eventually it will crash on any setup for me.
So it didn't work well in wine-staging 5.5 and earlier either? What if you install dxvk without its dxgi and use wine's instead?
I didn't really test it on wine-staging 5.5 and earlier yet. I started playing TESO just a month ago and at first used lutris to run it. I just installed wine-staging 5.5 and also switched to wine's implementation of dxgi. I'll report here if it crashes again.
I think nVidia have a lot of instability in their drivers. Understandably very hard to pinpoint as you say, sometimes you can crash in 30 minutes, other times you can play for hours. (Similar to my experience as well). This means most usable logging is out of the question.
I have tried to figure out a reproducible way to cause these crashes, but have so far gotten nowhere and have kinda given up on it.
It just crashed again with wine-staging 5.5 and wine's dxgi. Same errors as before (VK_ERROR_DEVICE_LOST). Yesterday it ran for 8 hours straight without crashing.
I think nVidia have a lot of instability in their drivers.
Yes, after my experience with Elder Scrolls Online on linux that really seems to be the case. What still bugs me is that nobody else seems to be having these issues.
Tbh i think "nobody else having these issues" actually is "everyone else is having these issues, but is mostly used to Linux being an unstable gaming platform" :)
Let's face it. If you are a SERIOUS gamer, you do NOT game hours on end on Linux. So those that ARE having issues are either a) testers, or b) casual gamers.
I can bet $$ that if there were competitive gamers in e.g. LoL tournaments and whatnot that kept crashing mid fight and losing.. It would not have gone unnoticed. It is VERY few that actually game for many hours and those few that do kinda "live with it" (including me, although i sometimes post about it when i am bothered).
My opinion tho.
@SveSop please stop pulling things out of your arse. Crashing every 20 minutes with device lost errors is absolutely not normal, although certain hardware might not have the most reliable GPU drivers.
Also, "serious gamer", really?
@SveSop
i play league of legends on linux since ever and never got a crash in game with both WINED3D and DXVK.
What are you talking about dude?
If you have a broken ass setup that's your problem i guess?
@doitsujin Wow... Angry much?
Now, since you feel obliged to put words in my mouth, could you please quote me where i said it was normal to crash every 20 minutes? Did i say that? Really?
I said SOMETIMES... if you interpret SOMETIMES to mean EVERY, well... maybe you should open a dictionary. Just so we are clear:
sometimes you can crash in 30 minutes, other times you can play for hours.
The OP also did not mention EVERY 20 minutes, cos he said:
Sometimes it hangs after 30 minutes, sometimes after 5 hours, sometimes it doesn't hang at all.
So, please refrain from making things up - Or to use your own words - Pull things out of your arse.
As to "serious gamer" - I feel things like e-sport is not heavily represented by ppl playing under Linux or wine. If you have other numbers, feel free to link some statistics.
If you are doing progress raiding in a MMO, or playing in something of a gaming event - Dota2, LoL and so on, chances are you are NOT using Linux. That is what i meant with "serious gamer".
Playing random games for an hour here and there is not what I consider "serious gamer". If a casual player crashes after playing for 2 hours.. the next day after 30 minutes... When he plays the game 1 week later, it does not crash.. Things like that - It is most likely considered "meh.. it just happens sometimes with Linux/Proton/Steam/Wine/whatever".
Do you disagree with that? Do you feel that i stepped on your toes by not considering casual gamers "serious"?
@kassindornelles I used LoL as an example. I did not say that "LoL crashes every 20 minutes" or whatever you guys make up i said.
I do not play LoL. Some of the games i play SOMETIMES crash for no apparent reason - Kinda like the OP describes. Does this mean EVERY game does? Nope. Does it mean i cannot play for hours every day for weeks without issues in the games i DO play? Nope.
It means SOMETIMES it crashes for no real apparent reason. This seems to be worse/better depending on what driverbranch from nVidia I use. The Vulkan beta driver branch seems to come up as the most stable when it comes to DXVK and VKD3D.
Well so help debug the issue and fix it,
go bitch about linux being unstable on reddit or whatever, this is not the right place.
go bitch about linux being unstable on reddit or whatever, this is not the right place.
? Whaa??
lmao
I literally played Elder Scrolls Online for 8 hours straight without a single hiccup a few weeks ago.
So don't know what you're going on about.
I literally played Elder Scrolls Online for 8 hours straight without a single hiccup a few weeks ago.
Which GPU do you have? Which wine/proton/lutris version are you using? Also which dxvk version?
Also can we please put an end to the other off topic discussion now and focus on the actual issue?
@SveSop no, we're not talking about esports now. The world isn't black and white, you can still care about your performance in competitive games or tackle harder MMO content without being the kind of player who takes part in esports tournaments. Stability issues are absolutely not acceptable or expected in that kind of scenario.
And guess what, whenever some wine/dxvk/driver update fucks up popular games like World of Warcraft, Overwatch, you name it, people do notice and report these problems, which often results in TKG or GloriousEggroll bisecting wine regressions or Lutris folks reporting problems with a new Nvidia driver release to Nvidia.
So no, your assumption that most Linux users just "live with it" is just blatantly untrue.
I myself have played Clan Battles in World of Warships, done raids in FFXIV, all on Linux, and know a number of people in the community who do this somewhat regularly. This isn't the kind of thing anyone would do if they constantly had to fear game crashes.
@jkhsjdhjs Hi!
I advise you to send this information/bug-report in detail to NVIDIA via their email linux-bugs [@] nvidia.com.
The more details you provide the better to help them fix the issue.
Also don't forget to send the nvidia-bug-report.sh logs, PROTON version and other system info too.
Thanks, I'll send a report when it crashes again!
The errors OP has mentioned are similar enough to RAM issues I've previously had, thought I would leave some insight here.
With ESO under Windows 10 I used to get this kind of freeze a lot: the game visuals freeze, but background music and sound effects keep playing. Other games, such as Unreal Engine games, would occasionally crash with a "device lost" sort of error, as would DMC5 under DX12, again relating to the device being lost. The event viewer would show nvlddmkm errors, occasionally as specific as "event 13" with "subchannel mismatch" (the equivalent of linux XID 13).
Sometime later I switched to linux to figure out the exact cause. These kind of errors would manifest in linux as well, resulting in the following XID errors: 12, 13, 31, 32, 69. The most common of these would be an XID 13 Subchannel mismatch.
Later diagnosis showed that these were RAM errors, which wouldn't show up in memtest86 no matter how long I tested. I could eliminate all errors by halving the usable memory with the memmap= kernel parameter (and then kept halving the unused part until I found where it errored).
I started playing ESO again recently, but this time on linux. For the hours I've played so far (lutris shows ~60, versions 5.5 and 5.6) I've only ever had ONE freeze with background noise that occurred on the character select screen and didn't output any XID errors.
The number of different errors the OP has posted leads me to believe it's some kind of hardware issue. Those errors are different to what I was getting, but they're similar enough that it could be worth ruling out memory issues first:
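The halving approach can be sketched in Python. This assumes the `memmap=nn[KMG]$ss[KMG]` form of the kernel parameter, which marks `nn` of memory starting at address `ss` as reserved. The helper names and the example range are hypothetical, not what was actually used here.

```python
def memmap_reserve(start_mib, size_mib):
    """Build a memmap= argument that reserves size_mib MiB starting at start_mib MiB."""
    return f"memmap={size_mib}M${start_mib}M"


def halves(start_mib, size_mib):
    """Split a suspect range into a lower and an upper half as (start, size) pairs."""
    half = size_mib // 2
    return (start_mib, half), (start_mib + half, size_mib - half)


# Hypothetical example: the suspect range is the upper 8 GiB of a 16 GiB system.
suspect = (8192, 8192)
lower, upper = halves(*suspect)

# Boot once with each half reserved. If the errors disappear while a half is
# reserved, the fault is probably inside that half, so split it again.
for start, size in (lower, upper):
    print(memmap_reserve(start, size))
```

Two caveats: physical addresses don't necessarily map one-to-one onto memory sticks, and the `$` usually needs escaping in the bootloader configuration, so treat the printed strings as a starting point only.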
memmap= kernel param to artificially limit)
Also as a side note, when experiencing XID errors, having kde connect preloaded with a killall command (for example killall eso64.exe) would still function, so I could use my phone to quickly kill any app that was hanging the system, and so unfreeze it. Nice alternative to ssh :)
Thanks a lot for your experience report! A memory error would explain why I'm the only one having these issues. I currently don't know if these Xid errors also occur with other games, I only started noticing them recently when I started playing ESO. Sadly my system logs only go back until Mon 2020-02-10 22:07:59 CET. I did a grep Xid on my system logs anyways, here are the results:
Xid_errors.log
So I'm seeing Xid errors 8,13,31,32 and 62 with ESO. Not all of these can be caused by system memory corruption according to this table, but not all of the errors you mentioned can be either, so I'll definitely remove one of the two 8GB memory sticks and under-clock the video memory and test again.
If this doesn't help I'll check with only the other memory stick installed, and if that doesn't help try to lower the RAM frequency.
Also thanks for the hint with the kde connect app, I do use KDE, but unfortunately I own an iPhone, for which the app is not available :(
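The `grep Xid` tally can be reproduced with a few lines of Python. This is only a sketch: the regular expression is an assumption based on the log lines posted in this thread, and `count_xids` is a hypothetical helper name.

```python
import re
from collections import Counter

# Matches lines such as "NVRM: Xid (PCI:0000:01:00): 13, pid=693, ..."
XID_RE = re.compile(r"NVRM: Xid \([^)]*\): (\d+),")


def count_xids(lines):
    """Return a Counter mapping each Xid code to the number of lines reporting it."""
    counts = Counter()
    for line in lines:
        match = XID_RE.search(line)
        if match:
            counts[int(match.group(1))] += 1
    return counts


# Sample lines copied from the logs earlier in this thread.
sample = [
    "Apr 12 01:15:45 benziuminator kernel: NVRM: Xid (PCI:0000:01:00): 13, pid=693, Graphics Exception: ChID 0042, Class 0000a097, Offset 0000194c, Data 00000000",
    "Apr 24 16:13:01 benziuminator kernel: NVRM: GPU at PCI:0000:01:00: GPU-6168685b-aea9-0fec-d3d9-b7b1398e05a3",
    "Apr 24 16:13:01 benziuminator kernel: NVRM: Xid (PCI:0000:01:00): 8, pid=691, Channel 0000002a",
    "Apr 27 00:58:18 benziuminator kernel: NVRM: Xid (PCI:0000:01:00): 62, pid=705, 0c83(1780) 00000000 00000000",
]

for code, n in sorted(count_xids(sample).items()):
    print(f"Xid {code}: {n} line(s)")
```

Passing an open log file or `sys.stdin` to `count_xids` instead of `sample` should give the same kind of summary for a full system log.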
A memory error would explain why I'm the only one having these issues.
To be fair, you might also very well be the only person trying to run the game on a Kepler-based GPU. These things are very old and don't get much testing with DXVK (I have a GTX 670 lying around, but no ESO).
I'd assume a software issue rather than defective hardware, but with Nvidia drivers it's pretty much impossible for me to tell what exactly is going wrong, that's usually something NV have to debug on their end if it's reproducible.
Fair, I can see just two reports with a GTX 6xx on protondb.com, both say the game runs fine but they specified a play time of <= 10 hours each.
Anyways, I will still try playing the game with just one memory stick, just to rule out all possible causes.
My game crashed with one memory stick just like it crashed with the other one. Both times with Xid 8.
Thus I'm now fairly certain that it's not an issue with my system memory.
I ran nvidia-bug-report.sh both times and will now send the created log files to nvidia.
As a regular ESO player, I will add my two-cents here.
I too have encountered the hang issue, where the screen stays on the art page with music. But I have attributed it to the poor quality of the client and server. The hangs never happen to me while playing the game; it is always during a port to another zone/instance, logging in or logging out.
The fix was to alt-tab to a console and "killall eso64.exe". Then re-start the game again.
My only frustrating issue is when in large combat, FPS drops to less than 10. But the millisecond after I die, the FPS goes instantly back to 30 FPS. With the same number of players and activity in view.
This is on a gentoo AMD threadripper with nvidia system.
@Techwolf What GPU and driver are you using?
@K0bin
info: GeForce RTX 2070:
info: Driver: 440.82.0
info: Vulkan: 1.1.119
I traded my GTX 660 for my brother's GTX 750 Ti a week ago. According to benchmarks it performs slightly worse, but my game doesn't seem to crash anymore!
I currently don't play as excessively as before, but it didn't crash in a week, so it seems it really was the nvidia driver's fault.
I'll leave this issue open, but will eventually close it if it doesn't crash in the next few weeks.