Commit dba9781f26d53548baa8e9552960ec8107db55c5 (PR #7992 [Wayland] Fix toggle fullscreen from @Sunderland93 ) causes the core dump shown below when running under sway master (compiled against wlroots master) with the Vulkan video driver. It does not crash with GL.
Backtrace with debugging symbols: file
PID: 25312 (retroarch)
UID: 1000 (***)
GID: 1000 (***)
Signal: 11 (SEGV)
Timestamp: Wed 2019-01-23 22:33:04 CET (2min 1s ago)
Command Line: retroarch
Executable: /usr/bin/retroarch
Control Group: /user.slice/user-1000.slice/session-1.scope
Unit: session-1.scope
Slice: user-1000.slice
Session: 1
Owner UID: 1000 (***)
Boot ID: ***
Machine ID: ***
Hostname: ***
Message: Process 25312 (retroarch) of user 1000 dumped core.
Stack trace of thread 25312:
#0 0x00007fdf950570f8 wl_egl_window_resize (libwayland-egl.so.1)
#1 0x000055d57ec5115a n/a (retroarch)
#2 0x00007fdf8fbc36d0 ffi_call_unix64 (libffi.so.6)
#3 0x00007fdf8fbc30a0 ffi_call (libffi.so.6)
#4 0x00007fdf9504ef5f n/a (libwayland-client.so.0)
#5 0x00007fdf9504b6ca n/a (libwayland-client.so.0)
#6 0x00007fdf9504cc0c wl_display_dispatch_queue_pending (libwayland-client.so.0)
#7 0x000055d57ec5215a n/a (retroarch)
#8 0x000055d57ec74d10 n/a (retroarch)
#9 0x000055d57ea9b28e n/a (retroarch)
#10 0x000055d57ea98acc n/a (retroarch)
#11 0x000055d57ebed475 n/a (retroarch)
#12 0x000055d57ebefeae n/a (retroarch)
#13 0x000055d57ea56552 n/a (retroarch)
#14 0x000055d57ea5764d n/a (retroarch)
#15 0x000055d57eb30c1c n/a (retroarch)
#16 0x000055d57ea50d7e n/a (retroarch)
#17 0x00007fdf90a36223 __libc_start_main (libc.so.6)
#18 0x000055d57ea4dcbe n/a (retroarch)
Stack trace of thread 25313:
#0 0x00007fdf9507dafc pthread_cond_wait@@GLIBC_2.3.2 (libpthread.so.0)
#1 0x000055d57ea68e83 n/a (retroarch)
#2 0x000055d57ec4dcff n/a (retroarch)
#3 0x00007fdf95077a9d start_thread (libpthread.so.0)
#4 0x00007fdf90b0db23 __clone (libc.so.6)
Stack trace of thread 25322:
#0 0x00007fdf9507de5b pthread_cond_timedwait@@GLIBC_2.3.2 (libpthread.so.0)
#1 0x000055d57ee97215 n/a (retroarch)
#2 0x00007fdf90e31063 execute_native_thread_routine (libstdc++.so.6)
#3 0x00007fdf95077a9d start_thread (libpthread.so.0)
#4 0x00007fdf90b0db23 __clone (libc.so.6)
Stack trace of thread 25314:
#0 0x00007fdf9507dafc pthread_cond_wait@@GLIBC_2.3.2 (libpthread.so.0)
#1 0x00007fdf87fe5d14 n/a (libvulkan_intel.so)
#2 0x00007fdf87fe5a38 n/a (libvulkan_intel.so)
#3 0x00007fdf95077a9d start_thread (libpthread.so.0)
#4 0x00007fdf90b0db23 __clone (libc.so.6)
The crash occurs on the second call to wl_egl_window_resize:
GNU gdb (GDB) 8.2.1
Copyright (C) 2018 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.
Type "show copying" and "show warranty" for details.
This GDB was configured as "x86_64-pc-linux-gnu".
Type "show configuration" for configuration details.
For bug reporting instructions, please see:
<http://www.gnu.org/software/gdb/bugs/>.
Find the GDB manual and other documentation resources online at:
<http://www.gnu.org/software/gdb/documentation/>.
For help, type "help".
Type "apropos word" to search for commands related to "word"...
Reading symbols from /usr/bin/retroarch...(no debugging symbols found)...done.
[New LWP 12844]
[New LWP 12854]
[New LWP 12851]
[New LWP 12852]
Core was generated by `retroarch --verbose'.
Program terminated with signal SIGSEGV, Segmentation fault.
#0 0x00007f93054bc0f8 in ?? ()
[Current thread is 1 (LWP 12844)]
(gdb) start
warning: Unexpected size of section `.reg-xstate/12844' in core file.
Temporary breakpoint 1 at 0x8d5e0
Starting program: /usr/bin/retroarch
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/usr/lib/libthread_db.so.1".
Temporary breakpoint 1, 0x00005555555e15e0 in main ()
(gdb) break wl_egl_window_resize
Breakpoint 2 at 0x7ffff5fac0f0
(gdb) continue
Continuing.
[New Thread 0x7fffe9974700 (LWP 14756)]
xkbcommon: ERROR: Key "<LFSH>" added to modifier map for multiple modifiers; Using Lock, ignoring Shift
Thread 1 "retroarch" hit Breakpoint 2, 0x00007ffff5fac0f0 in wl_egl_window_resize ()
from /usr/lib/libwayland-egl.so.1
(gdb) continue
Continuing.
INTEL-MESA: warning: Haswell Vulkan support is incomplete
[New Thread 0x7fffe8840700 (LWP 14805)]
[New Thread 0x7fffe2b6f700 (LWP 14807)]
[Detaching after fork from child process 14808]
Thread 1 "retroarch" hit Breakpoint 2, 0x00007ffff5fac0f0 in wl_egl_window_resize ()
from /usr/lib/libwayland-egl.so.1
(gdb) stepi 4
0x00007ffff5fac0f8 in wl_egl_window_resize
() from /usr/lib/libwayland-egl.so.1
(gdb) stepi
Thread 1 "retroarch" received signal SIGSEGV, Segmentation fault.
0x00007ffff5fac0f8 in wl_egl_window_resize
() from /usr/lib/libwayland-egl.so.1
(gdb)
last good: 6ca9afbd577d5d5f4bb8cadf59c94180854a24b6
first bad: dba9781f26d53548baa8e9552960ec8107db55c5
Video driver: vulkan
https://github.com/swaywm/sway/commit/1a1133dcc5fb03773dfc3df3af04325245f7d67a
https://github.com/swaywm/wlroots/commit/c41d01306de59235256d96902cced49a8eef15e9
Would you mind getting a backtrace with debugging symbols?
@Sunderland93 Any ideas?
I know that almost all of the resizing code in the Wayland context (especially toplevel resize) is wrong. But when I added #7992 it worked fine on my Sway and Rootston setup... I'll try to find a solution, but I need help from more experienced Wayland developers.
@orbea I am not sure I will find the time today, but I should be able to do it by tomorrow at the latest.
@Sunderland93 You might get good advice if you ask in a Wayland-related IRC channel on freenode or maybe even a more general channel like #dri-devel.
I don't have a wayland setup so I don't think I'll be much help here...
Hm, I have no issues with Sway and wlroots/rootston from master, and RetroArch-master. Also works perfectly on Weston
@benutzer193 Does RetroArch crash on rootston too?
I have tested it now with rootston and I receive the same Core Dump.
You can find the full backtrace here. The dump was created with retroarch master.
Since a warning about missing assets was printed at the time of the dump, I updated the assets via a functioning install, but this did not change the core dump.
@benutzer193 did you use Vulkan or OpenGL driver?
The dump only occurs with the Vulkan driver.
I just verified that it does not dump with GL.
Ok. Unfortunately, I currently don't have Vulkan-capable hardware to test this and find a fix.
@benutzer193 You may also want to get debugging symbols for /usr/lib/libwayland-client.so.0, /usr/lib/libwayland-egl.so.1 and /usr/lib/libffi.so.6. This may be an upstream bug?
I have only compiled wayland with symbols for now and it's interesting.
The dump seems to be because egl/wayland-egl.c can't be found:
Backtrace
I tried adding egl/wayland-egl.c, but it seems like I am not doing it correctly.
I still get the dump, including the message that the file is not available, with the following changes:
https://github.com/libretro/RetroArch/compare/d99f32a...benutzer193:master
Can someone give me a hint as to what's wrong?
That is because egl/wayland-egl.c is not part of the RetroArch codebase and I'm guessing there might be a missing null pointer check in that file.
See here.
It would be worth asking a wayland developer what's going wrong here, I still think this might be an upstream bug.
Yeah, forget what I said about the file missing...this was just gdb unable to find the file.
It seems you are correct, with my wayland fork it does not dump anymore:
https://github.com/benutzer193/wayland-1/commit/da9358efcf9c8ca5f730ad6e6c6bef86c5420962
Will open an issue upstream tomorrow.
Thanks for all the help. I would say you can add an upstream label ;)
There might still be a way to avoid this issue in RetroArch, but yea it would be good to see what upstream has to say.
I just wanted to open an issue upstream and saw that the issues are on gitlab and not github.
I currently do not have a gitlab account. If someone already has an account and can open the issue I would appreciate it. Otherwise I will create an account and check out gitlab.
I was able to fix the dump on my side, but I am not sure whether the code changes are okay. Perhaps someone can take a look and test: https://github.com/libretro/RetroArch/compare/1ee66d520469d5fc7465e298b60923c93bee284e...benutzer193:8e26b13578f58af2a9779bc0a33956af7d53a947
It looks good in GDB: the breakpoint in create is reached first, and after that only resize.
You should be able to sign into gitlab with a github account and post issues.
I have created an account now and opened an issue.
That is because egl/wayland-egl.c is not part of the RetroArch codebase and I'm guessing there might be a missing null pointer check in that file.
Not really: it is just not safe to pass NULL to that function. Seemingly for RetroArch it's fine for the native window to sometimes be null, but we can't make that determination for everyone, so we don't ignore NULL windows in resize requests.
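To illustrate the point about NULL handling belonging on the caller's side, here is a minimal sketch. It does not use the real Wayland API: `struct native_window`, `library_resize` and `resize_if_present` are hypothetical stand-ins, and `library_resize` only models a library function that requires a non-NULL window.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a native window handle (not the real wl_egl_window). */
struct native_window {
    int width;
    int height;
};

/* Stand-in for a library resize call that requires a non-NULL window.
 * The precondition is the caller's responsibility. */
static void library_resize(struct native_window *win, int width, int height)
{
    assert(win != NULL);
    win->width  = width;
    win->height = height;
}

/* Caller-side guard: forward the resize only when a window currently exists.
 * Returns true if the resize was applied, false if it was skipped. */
static bool resize_if_present(struct native_window *win, int width, int height)
{
    if (win == NULL)
        return false;
    library_resize(win, width, height);
    return true;
}
```

With a guard like this, a resize event that arrives while the window is torn down is skipped instead of reaching the library with a NULL pointer.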
When the user toggles full screen with RetroArch it triggers a context_reset which will tear down the window and recreate it. I suppose this means we don't want to use wl_egl_window_resize in these cases.
Of course fixing the bigger and long standing context_reset issue would be nicer, but we have a shortage of qualified graphics programmers... There is also an issue for this. https://github.com/libretro/RetroArch/issues/4721
Heh, unfortunately we've all got a shortage of developers. Indeed if you're destroying and recreating the window, just skip the resize call and recreate the window at the correct new dimensions.
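The "skip the resize and recreate at the new size" suggestion could be sketched roughly as follows. This is only a model of the idea, not RetroArch's actual context code: `struct video_ctx`, `ctx_create_window`, `ctx_destroy_window` and `ctx_toggle_fullscreen` are hypothetical names, and plain heap allocation stands in for real window creation.

```c
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical window record owned by the video context. */
struct ctx_window {
    int width;
    int height;
};

/* Hypothetical video context; win is NULL while the context is torn down. */
struct video_ctx {
    struct ctx_window *win;
};

/* Tear down the current window, if any. */
static void ctx_destroy_window(struct video_ctx *ctx)
{
    free(ctx->win);
    ctx->win = NULL;
}

/* Create the window directly at the requested size.
 * Returns false if allocation fails. */
static bool ctx_create_window(struct video_ctx *ctx, int width, int height)
{
    ctx->win = malloc(sizeof(*ctx->win));
    if (ctx->win == NULL)
        return false;
    ctx->win->width  = width;
    ctx->win->height = height;
    return true;
}

/* Fullscreen toggle modeled as teardown plus recreate: no resize call is
 * issued; the new window is created at the new dimensions instead. */
static bool ctx_toggle_fullscreen(struct video_ctx *ctx, int new_width, int new_height)
{
    ctx_destroy_window(ctx);
    return ctx_create_window(ctx, new_width, new_height);
}
```

Because the window is always created at its final size, there is no point in the toggle path where a resize could be issued against a window that no longer exists.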
@benutzer193 or @Sunderland93 Would you be willing to apply and test the suggested change? I would guess this looks right, but I have very little knowledge of wayland. :)
https://github.com/libretro/RetroArch/compare/1ee66d520469d5fc7465e298b60923c93bee284e...benutzer193:8e26b13578f58af2a9779bc0a33956af7d53a947
I don't have any knowledge either, but I'll open a PR.