Retroarch: [Wii U] DSI Error on launching Roms with Retroarch 1.67

Created on 21 Aug 2017 · 78 Comments · Source: libretro/RetroArch

DSI error occurring randomly when launching ROMs under 1.67.

FBA mainline core seems to trigger it most often. Usually the first ROM will load ok, while subsequent ROMs may trigger the error.

Setup:
Wii U 5.5.1, CBHC, launching Retroarch from channel. Same setup working fine previously, tried fresh .cfg as well with same results.

wiiu

All 78 comments

@QuarkTheAwesome is this related to your getopt changes?

I doubt it. I don't really see a situation where it could affect much unless the Wii U port was relying on a pretty ridiculous edge case.
@retrob0t; could you upload/link to the RPX that caused this particular error? There's two frees in config_file_save, and figuring out which one it is can only really be done by cross-referencing the addresses in the error with the executable that made it.
I had a look through the config file code, and I'll admit, I'm stumped as to what it could be.

@QuarkTheAwesome Wii U 1.67 stable release, fbalpha_libretro.rpx. I can upload here if you can't download it from the buildbot for some reason.

Is this still happening?

I don't know, I've been primarily testing the N3DS port. (Haven't had a lot of luck on that front, either, but that's another story).

@retrob0t Have you tried the latest nightly?

@QuarkTheAwesome I'm seeing a very similar crash with 1.6.8.

It's also crashing in the main_exit(), although in my case the stack trace implicates the input driver. Here's the DSI error, transcribed:

DSI: Instr at 01049448 bad write to unmapped memory at 5B436F76
r0  00000000 r1  1062EC78 r2  1050B000 r3  1064C040 r4  11186600
r5  10AAD23C r6  0000000D r7  11186614 r8  00000040 r9  11186654
r10 00000000 r11 00000054 r12 5B436F6E r13 1050B000 r14 00000000
r15 00000000 r16 00000000 r17 00000000 r18 00000000 r19 00000000
r20 00000000 r21 00000000 r22 00000000 r23 00000000 r24 00000000
r25 00000000 r26 00000000 r27 00000001 r28 00000000 r29 10510000
r30 1064C000 r31 1064C040 lr  01049924 sr1 00007072 dsi 42000000
ctr 0104A304 cr  40000224 xer 00000000
01049448(518A943B):coreinit.rpl|CoreInitDefaultHeap
01049924(518A9917):coreinit.rpl|CoreInitDefaultHeap
1062ED30(        ):<unknown>
0104A35C(518AA34F):coreinit.rpl|MEMFreeToExpHeap
0D04366C(0004366C):ffl_app|input_driver_deinit_mapper
0D0249DC(000249DC):ffl_app|command_event
0D0203DC(000203DC):ffl_app|rarch_ctl

The above was nestopia_libretro.rpx launched via Haxchi/HBC, no CFW and no coldboot. I've also gotten DSI errors in SNES9x.

@gblues I think you've shed a serious amount of light on the problem. Take a look:

You crashed by writing to 5B436F76, a small offset away from r12's value, 5B436F6E. I suspect this was the value that somehow got passed into a free. Here's that free, but the code there looks pretty good and input_driver_mapper is static, so the problem isn't any explicit modification of that variable. That's when I noticed: 5B436F6E is "[Con" in ASCII.
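
For anyone who wants to check that decoding themselves, here is a minimal sketch. It assumes the register value is read as four big-endian bytes (the Wii U's PowerPC CPU is big-endian); `reg_to_ascii` is a name made up for this example and is not part of RetroArch.

```c
#include <stdint.h>

/* Render a 32-bit register value as four ASCII characters, most
 * significant byte first. Non-printable bytes are shown as '.'.
 * out must have room for 4 characters plus the terminating NUL.
 * (Hypothetical helper, for illustration only.) */
void reg_to_ascii(uint32_t value, char out[5])
{
   int i;
   for (i = 0; i < 4; i++)
   {
      unsigned char c = (unsigned char)(value >> (24 - 8 * i));
      out[i] = (c >= 0x20 && c < 0x7F) ? (char)c : '.';
   }
   out[4] = '\0';
}
```

Feeding it the r12 value from the dump (0x5B436F6E) yields the bytes 5B 43 6F 6E, i.e. the text "[Con".
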
Looking for strings like that in the codebase:

$ grep -ir "\"\[Con" *
command.c:   snprintf(s, len, "[Config]: %s \"%s\".",
command.c:      RARCH_ERR("[Config]: %s\n", msg_hash_to_str(MSG_CONFIG_DIRECTORY_NOT_SET));
command.c:      RARCH_WARN("[Config]: %s\n",
command.c:            strlcpy(msg, "[Config]: Config directory not set, cannot save configuration.",
command.c:            RARCH_LOG("[Config]: [overrides] %s\n", msg);
command.c:            RARCH_ERR("[Config]: [overrides] %s\n", msg);
configuration.c:      RARCH_LOG("[Config]: Loading default config.\n");
configuration.c:         RARCH_LOG("[Config]: found default config: %s.\n", path_get(RARCH_PATH_CONFIG));
configuration.c:   RARCH_LOG("[Config]: loading config from: %s.\n", path_get(RARCH_PATH_CONFIG));
configuration.c:   RARCH_ERR("[Config]: couldn't find config at path: \"%s\"\n", path_get(RARCH_PATH_CONFIG));

All the matches are to do with the config stuff; which is where OP's problems stemmed from! Since this problem seems to be pretty much exclusive to the config code, I'm going to pre-emptively rule out the logging stuff (at least for now). That leaves us with the snprintf and the strlcpy in command.c. The thing is, both calls write into buffers on the stack, which should be more or less safe. More investigation is needed, I say.

(Must say, that stack pointer (r1) does feel kind of close to some of the other addresses in the register dump...)

What does this mean, exactly?

Hmm.. I think you're getting warm.

Speculation:

The snprintf in command_event_save_config on line 1390 isn't writing directly to a buffer on the stack. It's writing to a pointer provided by the caller.

One of the callers is command_event_save_core_config(), which has a bunch of stack-allocated variables. If one of those was getting smashed, the value of "msg" could be getting clobbered, causing the call to command_event_save_config to overwrite an arbitrary memory location--in my case, the handler for the input driver.

It's likely that the input driver handle and the config file handle pointers are near each other in memory.
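
To make the speculation concrete, here is a rough sketch of the pattern being described: a callee that formats into a buffer it does not own. The function names and message text are hypothetical, not the actual command.c code; only the format string shape is taken from the grep output earlier in the thread.

```c
#include <stddef.h>
#include <stdio.h>

/* Hypothetical stand-in for a callee that formats into a buffer owned
 * by the caller. It is only as safe as the (s, len) pair it is handed:
 * if s has been clobbered, the text lands wherever s now points. */
int format_save_msg(char *s, size_t len, const char *what, const char *path)
{
   /* snprintf writes at most len bytes, including the terminating NUL,
    * and returns the length the complete string would have needed. */
   return snprintf(s, len, "[Config]: %s \"%s\".", what, path);
}

/* Returns 1 if the message did not fit and was truncated, 0 otherwise. */
int save_msg_truncated(char *s, size_t len, const char *what, const char *path)
{
   int wanted = format_save_msg(s, len, what, path);
   return wanted < 0 || (size_t)wanted >= len;
}
```

With a valid pointer and a correct length the worst case is truncation. The bounds check cannot help if the pointer itself has been overwritten, which is the scenario suspected here.
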

This is a blind commit, but see if it makes any difference -

https://github.com/libretro/RetroArch/commit/29b421512a4b74f67e51cf82ca90668b71715306

If it doesn't prevent the DSI crashes, I hope this will at least bring us closer to the actual problem at hand so we can fix it.

EDIT: Fixed a small logic error I made -

https://github.com/libretro/RetroArch/commit/c0567266b1f66699ed5205d9901f2fef4d7f87bc

I had a look at how the heap system works on Wii U, and I think this issue might not be as simple as "bad pointer passed into free". I've pushed some changes to my exception handler to fix the addresses in brackets (they're supposed to be relative addresses, so you can jump straight to the offending instruction in a disassembler) and print out the opcode that caused the crash. Hopefully this will help me pinpoint the exact source of the DSI and trace what actually happened - as it stands now; I'm having trouble finding the crash.

I can't test it right now (so I'm not submitting a PR) but if someone could reproduce the crash with these changes I'd appreciate it.

(related: @gblues; what firmware version are you running?)

EDIT: forgot the link

Pretty sure I'm on latest; I'll double-check when I get home.

Will BuildBot make a nightly with these commits in it? If not, I'll need some help setting up a working build system--last time I tried, it didn't go very well.

@QuarkTheAwesome firmware info:

  1. 5.5.2 U
  2. WUP-101(02)
  3. FW401779291
  4. HATC-0211-3375

I'm using Haxchi via exploited DS game.
HBL 1.4

I should be able to do some testing tonight. I managed to finally get compilation to succeed, although I’m still not getting an *.rpx but maybe I’ll give the *.elf a shot. The build bot *.elf version was even more problematic than the rpx though so I’m skeptical.

Still, getting the damn thing to build is still a step up from previous efforts.

If you need to setup a toolchain for wiiu on linux, here's an easy way using this script I made for Travis-CI:

https://github.com/libretro/libretro-super/blob/master/travis/build-wiiu.sh

just call it like this: CORE=fceumm ./build-wiiu.sh

make sure libretro-super is already cloned in your home folder. and yes, that /home/buildbot/tools path is necessary and cannot be changed because the tools hardcode that path... you could however symlink it if you wanted. also, that build-wiiu.sh script should only be called once since it downloads the toolchain each time... for subsequent runs, just use the last line in the script to compile again.

Hah! Would be nice if that was in the Wii U build documentation. Thanks, I’ll give it a shot.

That ended up not helping with whatever was going on that was preventing the rpx from being generated. Whatever it was, it was environmental because I was able to spin up an Ubuntu 17.04 VM and it built successfully (including the RPX) first time. The build-wiiu.sh script saved me some time on getting the VM ready. :)

Now I'm just waiting for the build to finish, and I'll be able to try to crash it again.

DSI: Instr at 01049448 (908C0008) bad write to unmapped memory at 5
r0  00000000 r1  1062ECB8 r2  1050B000 r3  1064C040 r4  10ACAD58 
r5  10AAD23C r6  0000000D r7  10ACAD6C r8  00000040 r9  10ACADAC
r10 00000000 r11 00000054 r12 5B436F6E r13 1050B000 r14 00000000
r15 00000000 r16 00000000 r17 00000000 r18 00000000 r19 00000000
r20 00000000 r21 00000000 r22 00000000 r23 00000000 r24 00000000
r25 00000000 r26 00000000 r27 00000002 r28 00000000 r29 10510000
r30 1064C000 r31 1064C040 lr  01049924 sr1 00007072 dsi 42000000
ctr 0104A304 cr  40000224 xer 00000000
01049448(30D2943B):coreinit.rpl|CoreInitDefaultHeap
01049924(30D29917):coreinit.rpl|CoreInitDefaultHeap
10AAE198(        ):<unknown>
0104A35C(30D2A34F):coreinit.rpl|MEMFreeToExpHeap
0D04372C(0004372C):ffl_app|input_driver_deinit_mapper
0D024BD0(00024BD0):ffl_app|command_event
0D020478(00020478):ffl_app|rarch_ctl

This is, again, with the NESTopia core, built from @QuarkTheAwesome's retroarch fork (which, yes, includes those commits @twinaphex made, confirmed via git log)

The execution sequence:

  1. Launch Retroarch via the HBL
  2. use "Open Content" and choose my game (game loads successfully)
  3. Go to menu, use "Close content" (game unloads successfully)
  4. use "Open Content" again and pick a different game (crash)

Of course, my first crash earlier in the thread was on first load. So it's not exactly what you call consistent (my favorite kinds of bugs).

@QuarkTheAwesome I noticed a logic error in your modified handler (you used strcmp() when you meant !strcmp()), and I minimized the DSI line so the bad memory location doesn't go off the screen. I'll have a new build in a few minutes.

Here's a new DSI error with the fixed strcmp and refactored DSI line that doesn't go off the screen.

DSI: Instr at 01049448 (908C0008) bad wr: unmapped memory: 5B436F76
r0  00000000 r1  1062ECB8 r2  1050B000 r3  1064C040 r4  10ACAD24 
r5  10AAD23C r6  0000000D r7  10ACAD38 r8  00000040 r9  10ACAD78
r10 00000000 r11 00000054 r12 5B436F6E r13 1050B000 r14 00000000
r15 00000000 r16 00000000 r17 00000000 r18 00000000 r19 00000000
r20 00000000 r21 00000000 r22 00000000 r23 00000000 r24 00000000
r25 00000000 r26 00000000 r27 00000002 r28 00000000 r29 10510000
r30 1064C000 r31 1064C040 lr  01049924 sr1 00007072 dsi 42000000
ctr 0104A304 cr  40000224 xer 00000000
01049448(0002D048):coreinit.rpl|CoreInitDefaultHeap
01049924(0002D524):coreinit.rpl|CoreInitDefaultHeap
10AAE198(        ):<unknown>
0104A35C(0002DF5C):coreinit.rpl|MEMFreeToExpHeap
0D04372C(000436F0):ffl_app|input_driver_deinit_mapper
0D024BD0(00024B94):ffl_app|command_event
0D020478(0002043C):ffl_app|rarch_ctl

So, having reproduced the crash several times, I notice that the crash is happening in exactly the same place every time. Next, I've modified every instance of [Con with a different string ([Co0, [Co1, etc) to see which one is the one involved in the crash.

I notice that the "bad write" address is r12+8; it will be interesting to see if that holds true.

One last follow-up for tonight:

  • The [Con in r12 is most definitely the result of the snprintf in command_event_save_config
  • The "write to unmapped memory" error is consistently r12+8
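
The r12+8 relationship also falls out of the opcode printed in the DSI line. As a sketch, assuming the standard PowerPC D-form encoding (6-bit primary opcode, two 5-bit register fields, signed 16-bit displacement; primary opcode 36 is stw), the word 908C0008 can be decoded like this. The type and function names are made up for the example.

```c
#include <stdint.h>

/* Fields of a PowerPC D-form instruction such as stw (store word). */
typedef struct
{
   unsigned opcode; /* primary opcode, top 6 bits (36 = stw) */
   unsigned rs;     /* register whose value is stored */
   unsigned ra;     /* base address register */
   int32_t  d;      /* sign-extended 16-bit displacement */
} dform_t;

dform_t decode_dform(uint32_t insn)
{
   dform_t f;
   f.opcode = (insn >> 26) & 0x3F;
   f.rs     = (insn >> 21) & 0x1F;
   f.ra     = (insn >> 16) & 0x1F;
   f.d      = (int32_t)(insn & 0xFFFF);
   if (f.d & 0x8000)
      f.d -= 0x10000; /* sign-extend the displacement */
   return f;
}

/* Effective address of the access: value of rA plus the displacement.
 * (The special case rA = 0, which means a literal zero base, is not
 * handled in this sketch.) */
uint32_t dform_ea(dform_t f, uint32_t ra_value)
{
   return ra_value + (uint32_t)f.d;
}
```

Decoding 0x908C0008 this way gives opcode 36, rS = 4, rA = 12, displacement 8, i.e. `stw r4, 8(r12)`. With r12 = 5B436F6E the store targets 5B436F76, which matches the address in the DSI line.
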

@twinaphex this seems to have helped in terms of crash frequency, but it doesn't solve the crash completely. Interestingly, the stack trace looks a bit different this time.

DSI: Instr at 01049448 (908C0008) bad wr: unmapped memory: 5B436F76
r0  00000000 r1  1062ECB8 r2  1050B000 r3  1064C040 r4  10ACA290
r5  10AAD23C r6  0000000D r7  11186614 r8  00000040 r9  10ACA2E4
r10 00000000 r11 00000054 r12 5B436F6E r13 1050B000 r14 00000000
r15 00000000 r16 00000000 r17 00000000 r18 00000000 r19 00000000
r20 00000000 r21 00000000 r22 00000000 r23 00000000 r24 00000000
r25 00000000 r26 00000000 r27 00000002 r28 00000000 r29 10510000
r30 1064C000 r31 1064C040 lr  01049924 sr1 00007072 dsi 42000000
ctr 0104A304 cr  42000224 xer 00000000
01049448(0002D048):coreinit.rpl|CoreInitDefaultHeap
01049924(0002D524):coreinit.rpl|CoreInitDefaultHeap
0104FEC4(00033AC4):coreinit.rpl|IOS_IoctlAsync
0104A35C(0002DF5C):coreinit.rpl|MEMFreeToExpHeap
0D043704(00043704):ffl_app|input_driver_deinit_mapper
0D024B50(00024B50):ffl_app|command_event
0D02043C(0002043C):ffl_app|rarch_ctl

So; I did some research into the backend of what's going on here. It looks like the actual pointer freed is fine; but there's heap corruption elsewhere that's causing the problem. In essence, everything you malloc has a bit of metadata at the address just below it which makes up a big linked-list of every allocated memory block. When freeing, the allocator looks at the previous and next items in the linked list and fixes their next and previous pointers respectively, so that the freed block is no longer part of the linked list. In our crash, this works fine, so the actual input driver mapper pointer hasn't been messed with.
Where it gets a bit weird is that the allocator then walks through the linked list again looking for a block that matches certain attributes pulled from the initially freed block. If it finds it, it gets freed too. If not, nothing happens. Either way, the code keeps track of the block just before the one that matches the conditions (or the last one in the list). This ends up in r5. The allocator then moves backwards one step in the list; however instead of a proper pointer it reads out our "[Con" text. It only does nullchecks so it doesn't see a problem. It then attempts to change the previous pointer in this block and DSIs.
This is going to be a difficult one. A few quick things we do know:

  • 10AAD248 is where the "[Con" string is
  • The actual problem isn't the input driver mapper, which was freed normally
  • At the time of the crash, either the block in r5 or the one within a few steps of it has corrupt metadata
  • We can't tell at this point whether this corruption was caused directly to the block or was copied in from another block during the allocator's other operations

This leads me to believe that there was a buffer overflow somewhere on the heap; either to the r5 block or one near it (the value may have been copied in by the allocator). One other observation: the snprintf is supposed to go into a stack variable; but the text ends up at an address significantly higher than the stack pointer at the time of the crash.
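
As a simplified model of the unlink step described above (not the real coreinit expanded-heap layout; the struct fields, function names and heap bounds below are all assumptions for illustration), this shows why a NULL check alone lets a garbage pointer like 5B436F6E through, and what a range check would catch.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical block header sitting just below each allocation. */
typedef struct block
{
   struct block *prev;
   struct block *next;
   size_t        size;
} block_t;

/* Remove b from the doubly linked list by fixing up its neighbours.
 * Only NULL checks guard the writes, so a corrupted prev/next value
 * that is non-NULL is dereferenced as if it were a real block. */
void unlink_block(block_t *b)
{
   if (b->prev)
      b->prev->next = b->next;
   if (b->next)
      b->next->prev = b->prev;
   b->prev = NULL;
   b->next = NULL;
}

/* A stricter guard: does p lie inside [heap_start, heap_end)? */
int pointer_in_heap(uintptr_t p, uintptr_t heap_start, uintptr_t heap_end)
{
   return p >= heap_start && p < heap_end;
}
```

With made-up heap bounds of 0x10000000 to 0x20000000, the r5 value 10AAD23C passes the range check while 5B436F6E does not, even though both are non-NULL.
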

@gblues

https://github.com/libretro/RetroArch/commit/7ac5eda1e1913df2424a9839a03415e301c52231

See if this prevents the crashes you are experiencing.

Negative. :(

Today, I went back to older versions. More specifically, I built 1.4.1 and 1.5.0 from source and tried to reproduce the crash. I could not. So, if nothing else, I think we can rule out any funny business happening at the compiler/toolchain level.

Both of those use the old rgui UI. I'm going to try building 1.6.9 with the XMB UI disabled, and see what happens.

Got the same crash in rgui mode.

Last-ditch attempt at a blind fix for this -

https://github.com/libretro/RetroArch/commit/fdf79e2e9b698924c89487ff65ad1a2b5d53d030

This should delay invoking the freeing/initing of the mapper until after INIT_CONTROLLERS.

If this still doesn't work, hopefully you'll be able to come up with something. From what I see in the stack trace, when we start using heap functions like free/malloc, the default heap for IOS isn't yet set up. This might be the recurring issue we are running into, since WiiU/Wii has a shared heap pool where it needs to combine all sorts of disparate hardware memory pools (MEM1/MEM2) together to form a whole.

No dice.

From what @QuarkTheAwesome said, the working theory is that the "[Con" string is ending up in the malloc metadata. So, if we assume that the snprintf() isn't directly writing to the overflow buffer, the next logical conclusion is that the buffer is being passed downstream to something that's writing its contents to a bad memory location.

Which, in this case, is logging.

Now, logging is pretty messy, because there's three states:

  • (HAVE_LOGGING is set && IS_SALAMANDER is set)
  • (HAVE_LOGGING is set && IS_SALAMANDER is not set)
  • HAVE_LOGGING is not set

If HAVE_LOGGING is set, the RARCH_LOG etc macros wrap logger_send(). Otherwise they are funky macros that I didn't really want to untangle.

So, I chose the "HAVE_LOGGING" route and put dummy implementations into frontend/drivers/platform_wiiu.c.

Interestingly, this exposed an error where tasks/task_content.c was calling RARCH_ERR() with only one parameter, which caused a compilation error. I changed these to provide an explicit format string.
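
For readers wondering how a one-parameter call can fail to compile, here is one way it can happen. This is a sketch with a made-up macro and sink (`LOG_ERR`, `log_send`), not RetroArch's actual definitions: a C99 variadic macro that forwards `__VA_ARGS__` needs at least one argument after the format string.

```c
#include <stdarg.h>
#include <stdio.h>

static char last_log[256]; /* captures the most recent message */

/* Hypothetical sink standing in for a real logger backend. */
void log_send(const char *fmt, ...)
{
   va_list ap;
   va_start(ap, fmt);
   vsnprintf(last_log, sizeof(last_log), fmt, ap);
   va_end(ap);
}

/* Prepends a tag by string-literal concatenation and forwards the
 * variadic arguments. Both steps constrain how it can be called. */
#define LOG_ERR(fmt, ...) log_send("[ERROR] " fmt, __VA_ARGS__)

const char *report_error(const char *msg)
{
   /* LOG_ERR(msg) would not compile with this macro: msg is not a
    * string literal, and the empty __VA_ARGS__ leaves a trailing
    * comma. An explicit format string avoids both problems. */
   LOG_ERR("%s", msg);
   return last_log;
}
```

A side benefit of the explicit "%s" is that any '%' characters in the message are printed as-is instead of being interpreted as conversions.
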

Anyway, it's building now, and we'll see if dummying out the logging functions has an effect.

No dice, @twinaphex

Since I know it broke somewhere between the "v1.5.0" commit and the "v1.6.0" commit, I'm attempting a bisect.

@gblues does no dice mean the same crash happened, or a crash happened?

It's still crashing, and it's still the same crash.

I'm doing a git bisect now to try to narrow down about when the behavior started.

It's worth noting that the string may not necessarily be in the wrong place: a corrupt set of metadata in the linked list could lead to random data being treated as a malloc chunk. However, I think assuming the string is in the wrong place is a good place to start.
Have we tried some printf debugging to get the addresses of the involved buffers? We know where the offending string is, after all.

@QuarkTheAwesome I tried, but it crashes before the debug statements get written to the log file. I'm not sure how to set up a network debug.

Complicating the bisect is that there are swathes of commits where the wiiu build is broken.
sigh. Thank goodness for scripts.

Re: network debugging, you need to put your computer's IP and "4405" in the Makefile for IP/port; then run wiiu/net_listen.sh, giving the Wii U's IP as an argument. Rebuild frontend_wiiu.c and you should be good to go. The usual network caveats apply (same subnet, etc.)

Bisect number 1 done.

bisect.log

Looks like the first breaking commit was: 4ed3e750d4617555b333e5e2f9d3a4e461433603. Prior to that commit, RetroArch is stable as a rock.

At 1f13d616cc293352b5423c037ed7554b6279628f I got the "problem in system memory" error--not our pet DSI error, but a close cousin I usually get in the SNES9x core.

All other "bad" commits gave me the same behavior: a black screen, no sound, when trying to load a ROM.

I will need to do another bisect between 1.6.0 and 1f13d616cc293352b5423c037ed7554b6279628f to find where the "System memory error" turns into the DSI error.

Right now, my suspicion (read: wild-ass guess) is that the DSI error was introduced when trying to fix the crash introduced at 4ed3e750d4617555b333e5e2f9d3a4e461433603.

Thanks for the bisect
Can you test master with the CC or nearest resampler? it's a shot in the dark but it would help us confirm

Yep. I'm finishing up a new bisect to find the point where "Error in system memory" became a random DSI exception. Once that's done, I'll try a master build with a different resampler.

bisect2.log

So, here's roughly where the "System memory error" (Error Code 160-2203) turns into the DSI error. I couldn't test the skipped commits due to build errors.

Hi there @gblues -

the issue you found is indeed valid - this was the issue -

https://github.com/libretro/RetroArch/blob/master/libretro-common/audio/resampler/drivers/sinc_resampler.c#L442

these lines were erroneously placed outside of the conditional. This is fixed on the latest commit though.

I am afraid we might have to bisect further in the future in order to find the REAL cause now, since the issue seems to be prevalent even on the latest commit. Try applying this diff to every commit AFTER the first bad commit to see at which next point in time things break again -

https://hastebin.com/akeyiviwod.diff

Sorry to put you through this, I do think though we are getting warm and I hope we will isolate the cause soon!

@twinaphex I'm maybe a step ahead of you. Take a look at the PR I just opened on my fork. The PR describes my search methodology.

If it's still too much to inspect, I'll start another bisect between breaking-commit and dsi-error to see if I can find another stable commit using the patch you provided and narrow it down even further.

@gblues Thank you very much for your efforts. I would indeed appreciate a further narrowing down using the diff patch I provided. I was able to reproduce the resampler crash on PC/Windows, but I doubt I would be able to recreate the DSI error similarly on PC.

OK, so after inspecting and researching in the git commit history:

  • We know the nasty bug is introduced sometime between 4ed3e750d4617555b333e5e2f9d3a4e461433603 when the sinc driver bug was introduced, and 47d0cb053e1d2510f3f27ba128ce6eff6131f829 when the fix for the sinc driver bug was committed.
  • But there are lots of commits between those two points

So, I am now executing the following procedure to try to identify the true crash point:

  1. Start a git bisect between those two hashes (i'm using the new/old dichotomy again)
  2. Apply the sinc patch
  3. Create build and test it
  4. flag as old/new
  5. repeat 2-4 until bisect is done

sigh ... this would go faster if these intermediary commits actually built successfully. /rant

Yeah, I feel a bit bad about putting you through that situation.

Meh. comes with the territory. :) I have at least one successful build that didn't crash, so it's helping a little, at least.

bisect3.log

This was the closest I was able to get. None of the skipped commits built successfully.

Builds marked "old" are stable builds that don't crash.
Builds that are marked "new" give the DSI error.

I didn't get any system memory errors in this bisect.

The commit history in this repo is... pretty bad. While git bisect found ce61db14, I can't actually check that commit out with git checkout.

So, instead, I followed its lineage and traced it back to PR 4948.

Unfortunately it's a PR from a repo that doesn't exist anymore, and it contains merges of merges so untangling it is hairy.

A process suggestion, if I may be so bold: when merging cross-fork PRs, do a squash merge to make it easy to revert in case the contributor goes away.

ControllerPatcher, eh? Interesting. I think a good test at this point is to temporarily disable it to see if that fixes the crash and we can figure out where to go from there. Commenting out the init and deinit (added here) and the input polling (this hunk) should completely neutralise that code; which should assist in the narrowing-down effort.
(P.S. did you ever get the network logger running?)

I've confirmed that it's the HID controller support that's causing the DSI error. Here's a PR showing the code changes I made to neutralize the HID driver, and a build from the neuter-wiiu-hid-driver branch does not encounter any DSI errors.

https://github.com/gblues/RetroArch/pull/3

Also: no, once I went into bisect mode I didn't have a pressing need for the network logger. Might give it a shot if I get any ideas on how to fix the HID code.

That leaves us in an interesting position. We now know there's an error in controller_patcher... somewhere. As far as I know, the only one with any experience in that code is @Maschell. We can spend the time trying to track down the bug; but due to the nature of the bug I don't think it'd be a small proposition (the exception is if Maschell comes and helps, that'd make things a lot easier imo).

The thing is, I've noticed that RetroArch has its own HID setup (here's an example). Someone experienced in that would have to fill us in on how well it works; but if it's suitable I'd like to suggest that we integrate the HID routines from controller_patcher into RetroArch's own system. I think it'd be easier than chasing down the bug on our own (if it was in the HID routines, we'd almost certainly catch it during the transfer) and we'd end up with a neater end result (imho).

There would be a few user considerations (for example, controller_patcher has special handling for DualShock 3s; Switch Pro Controllers and importantly the Wii U Gamecube Adapter) and we'd have to make the transition pretty smooth, but I reckon it's a good idea. Let me know what you think.
(cc @twinaphex)

Looking at the internal HID stuff--

Thinking out loud here. Apply NaCl to taste.

The way I'm reading this, in input/drivers_hid we have HID drivers for the various sources of HID devices:

  • btstack_hid.c for Bluetooth devices;
  • libusb_hid.c for PC USB devices;
  • iohidmanager for Mac OSX HID support;
  • null_hid.c for "none of the above" platforms;
  • wiiusb.c for libOGC HID (pretty sure this is Wii mode, not Wii U)

So, we'd need to add wiiu_usb.c for the WiiU side of things.

I've only just barely started analyzing the controller_patcher code. It looks like there's a background thread that gets created? Could be a thread safety issue. Which would explain why the frequency of the crash is so weird. And maybe that's the weird double-list-traversal-on-free behavior you were describing earlier in the thread.

I think moving the HID stuff along side the rest of the HID stuff makes sense.

Hi there @gblues,

is this the associated PR?

https://github.com/libretro/RetroArch/pull/4948

Maybe we can query the person who submitted the PR so that we might be able to fix the issue faster.

Also, I do agree that making a HID implementation for Wii U and just going through that controller interface (and just adding new pads under drivers/connect as we go along) would be the preferred option, since it will not only benefit WiiU, but all other platforms that have a backend implementation.

Currently I don't have much time, but if you have any questions about the controller_patcher feel free to ask.

@gblues Can we have a PR maybe that disables the controller patcher through compile-time ifdefs until this situation is properly resolved with the controller patcher code? Overall stability should trump features at least.
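
A compile-time switch of the kind being asked for might look roughly like this. The flag name `HAVE_CONTROLLER_PATCHER` and the functions are hypothetical and chosen for this sketch only; the actual change may use different names.

```c
/* HAVE_CONTROLLER_PATCHER is a hypothetical build flag. Pass
 * -DHAVE_CONTROLLER_PATCHER to the compiler to include the code;
 * leave it undefined to build without it. */

#ifdef HAVE_CONTROLLER_PATCHER
/* Placeholder for the real library initialisation call. */
static int patcher_init(void)
{
   return 1;
}
#endif

/* Returns the number of optional input backends that were started. */
int input_backends_init(void)
{
   int started = 0;
#ifdef HAVE_CONTROLLER_PATCHER
   started += patcher_init();
#endif
   return started;
}

/* Lets other code (or a test) ask whether the feature was built in. */
int controller_patcher_enabled(void)
{
#ifdef HAVE_CONTROLLER_PATCHER
   return 1;
#else
   return 0;
#endif
}
```

Because the guarded code is removed by the preprocessor, a build without the flag contains none of the suspect code paths, which is what makes it useful for ruling the library in or out.
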

I'll do it; give us a moment...

TY @QuarkTheAwesome, I appreciate it!

I'm taking a stab at writing a HID driver that lives in input/drivers_hid.

Cool teamwork guys! I am really glad to see there are people willing to work with us to make the WiiU port better.

Done; I'm #5840.

Some notes from my side, sorry if some things were already discussed, I didn't read the whole discussion.

In case you want to keep the controller_patcher stuff - otherwise just ignore my message

  • Did you try the .elf and .rpx version? Or just rpx? Does the crash also happen in the .rpx? I had some strange bugs with .rpx applications
  • I had some trouble getting my existing code working within retroarch ( iirc it was the sd mounting). The SAME code worked in HID to VPAD. I had to change some stuff, but don't remember what exactly.
  • I just saw Quark did some C++ related fixes. controller_patcher is using some c++ functions. Maybe this is causing problems?
  • The controller_patcher lib is also used in HID to VPAD. I never got any report of a similar crash using HID to VPAD. So it's probably the combination of retroarch and controller_patcher ?!?!
  • You could also try to disable the built-in network threads via ControllerPatcher::stopNetworkServer() and check if that helps.
  • There are also some more minor changes/fixes that are on the "normal" controller_patcher repo but not yet in RA. Maybe they will also help fixing stuff. I can do a PR/testing if I have some time this weekend.

Here also a short write up how the controller_patcher code works. Maybe this helps understanding the code.

  • The Application starts with a default configuration which will be placed into the global array "config_controller". This array contains all information about the button mapping.
    (optional) - The controller_patcher lib reads and parses config files from the SD Card. This may cause some problems, but it's only done once (at start). The application wouldn't even start if there is a parsing error.
  • The lib uses the "official" HID-API to register several callbacks:

    • an attach/detach callback via "HIDAddClient", which is called whenever a device is connected to/disconnected from the console. It also saves some additional data like the VID/PID, which configuration "slot" (in the array) it should use etc. The attached device also gets a unique ID which will be used to store the data.

    • a read callback via HIDRead which gets called whenever the HID Device provides new data. The callback copies the received data into a global array "gHID_Devices" (into a slot based on the unique id of the controller).

    • These threads are completely handled by the SDK.

  • (optional but default) The "network server" is started. The network server is used to allow users to connect devices via the network (which allows support for more (non-standard HID) devices).

    • This creates 2 threads in total:

      • One thread with a TCP server. This is where the network clients connect to. It handles the handshakes, (de/a)ttaches of devices etc. It uses the "normal attach/detach hid callbacks". It also does some PING/PONG every few secs to check if the connection is still alive.

      • Another thread with a UDP connection. This thread is receiving the actual controller data. The data is getting parsed and the "normal hid read callback" is called.

      • And it contains a UDP connection. This sends the rumble data to the network client.

    • This network stuff can be disabled via ControllerPatcher::stopNetworkServer().

  • The lib provides several functions that handle multiple controllers with the same VID/PID, Mouse/Keyboard to gamepad emulation, provide functions to remap/print inputs etc.
  • The lib was designed to replace the inputs of an actual Wii U Gamepad or emulate Pro Controllers. It provides several functions to accomplish this, do the mapping etc.
  • But the function Retroarch is using is "ControllerPatcher::gettingInputAllDevices(InputData * output,s32 array_size)".

    • This provides the parsed data for ALL connected devices (plus some metadata like the VID/PID, pad number (if more than one of the same type are connected) etc.).

    • Internally the global data array "gHID_Devices" (the place where the read callback copied the data to) gets parsed (based on the configuration in config_controller[]) and converted into a VPADBuffer.

      • This is where all the "magic" happens

The reason why I used the "normal" controller and not the HID driver:
- The controller_patcher was "ready-to-go", all I needed to do was add the (de)Init function and like 30/40 lines of code in the joypad driver.
- It provides some "extra" functions like support for controllers via the network (XInput!), for the Switch Pro Controller/Gamecube USB Adapter etc.
- Existing configuration from HID to VPAD can be used without any change => Easy/zero configuration for the user.

But I see that it would be a cleaner/better solution to use the existing HID driver, but this was simply way more effort to implement (for me). Maybe a hybrid solution could be a way to go?

Sorry for my shitty english.

The *.elf build of retroarch doesn't work at all for me. All of my analysis and testing was done with RPX. I dunno if the controller_patcher has anything to do with that or not, but I doubt it.

We've isolated the cause of the crash to "something to do with controller_patcher" but we haven't gotten any deeper than that.

I'm aiming at a hybrid approach -- rewrite the code to fit into RA's codebase, but maintain compatibility with HID2VPAD's config file.

Of course, I may inadvertently port the bug into the new code, but at least then it will be easier to analyze at that point. :)

The ELF builds can't share a config file with the RPX ones; otherwise you get a very strange bug where HBL loses its ability to load ELF homebrew at all. Renaming sd:/retroarch/retroarch.cfg should get it working. Even so, I find the ELF build tends to have strange bugs (such as XMB no longer accepting input) that make it really hard to do anything with.
I did some impromptu polling of the users on GBATemp, and all the respondents who said they used the ELF builds also said they wouldn't have a problem if they were forced to use RPX, so I don't think we have to worry too much about ELF. I might open another issue to talk about all this though.

@gblues: I'm liking the look of what you've done so far! If you want a hand implementing anything, let me know.
(for those who haven't seen it, a skeleton driver is at gblues@a17fb38)

@QuarkTheAwesome I'd love some help. I sent you a collaborator invite for my fork. I'll shoot you an e-mail with more info.

I'd like to give some info on the Wii side of things, where I believe I've struck the same bug.

When using the hid input driver, one of my controllers will give me a DSI error upon loading a game, changing core, etc (this is with the psxtousb controller adapter).

Other usb controllers don't show this issue (one registers as the generic nes controller, but it's a 10 button ps1-like controller, the other is a dual shock 3).

I'd like to point out that if one plugs in the controller while a core is running, there are no issues at all, which I found a bit strange (Edit: by this I mean that while the core is running there are no issues; if you go to the menu and do some actions it will crash as usual).

I use rgui on the wii, have a working compiler setup, and can provide more info and tests as required.

Hi there @gblues and @QuarkTheAwesome, I'd like to give some general advice - some of the macros have changed in the input drivers in latest upstream. All of the macros are now available from retro_miscellaneous.h. You can check what they are now in the latest source code, for instance by going through input/connect. Sorry about this slight inconvenience, but it was necessary for several reasons. Other than a name change, the usage of them should be near-on identical. If anything is unclear to you, let me know and I'll be happy to help.

Thanks @twinapex I’ve rebased my fork accordingly.

hey @danieljg sorry for not responding to you -- while it might be a similar class of bug, we've isolated the problem to code that's specific to the Wii U version--the Wii build doesn't use the controller_mapper at all and has its own driver. I suggest opening a new issue for it if you haven't done so already. Fixing this bug isn't going to fix the issue you're describing.

@gblues I assume this has been fixed now?

@retrob0t, @gblues and @QuarkTheAwesome Any updates on this issue, is it still a problem?

The original error was never really well-defined, but we ended up rewriting the hid subsystem and that all worked out. Can probably close this, if that's what you're looking for.

Thanks for the update! Of course if anyone still has a problem please let us know.

Not fixed, it just happened right now on Wii U 5.5.4 E (Europe).
I tried to load FBA with a Metal Slug ROM that worked fine before.

I get similar DSI errors when trying to add games inside the SCUMMVM core.

This issue is not a catch-all for DSI errors. It’s for a specific DSI error occurrence that has been resolved.

  • core-specific DSI errors should be reported on the corresponding core
  • general-case DSI errors should be reported in their own issue