Qmk_firmware: [Bug] Keyboard stops responding after power cycles (and where it goes wrong)

Created on 7 Aug 2020 · 22 Comments · Source: qmk/qmk_firmware


I was playing with the STM32F401CCU6, the MCU on the Black Pill development board. I found that the keyboard stops responding if I use it to wake the host computer. After adding some LED events for debugging, I found the code section where the QMK firmware gets stuck, and made a dumb hotfix to verify the observation.


Describe the Bug

After putting the host computer to sleep, if I press any key on the keyboard to wake the host up, the keyboard cannot send any key events to the host computer.

I can observe that the QMK firmware keeps running the loop in tmk_core/protocol/chibios/main.c that waits for the USB driver to leave USB_SUSPENDED. However, even after the host computer is up, the USB driver state is still USB_SUSPENDED, so the keyboard stays in this loop forever and never sends any new key events. I made a dumb hotfix that stops the USB driver and then restarts it; after that, the USB driver becomes USB_ACTIVE again and the keyboard is back to work.
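To make the lockup easier to reason about, here is a minimal host-side simulation of that loop. It is only a sketch: the `sim_*` names and types are hypothetical stand-ins and not the ChibiOS or QMK API, the "wakeup" is modeled as leaving the device state untouched (the behavior reported here), and the "restart" is modeled as simply setting the state back to active.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for the driver state; not the ChibiOS types. */
typedef enum { SIM_USB_SUSPENDED, SIM_USB_ACTIVE } sim_usb_state_t;
typedef struct { sim_usb_state_t state; } sim_usb_driver_t;

/* Models the reported behavior: the host wakes up, but the
 * device-side state is never updated. */
static void sim_wakeup_host(sim_usb_driver_t *usbp) {
    assert(usbp != NULL);
}

/* Models the hotfix: after a stop/start cycle the driver is active again. */
static void sim_restart(sim_usb_driver_t *usbp) {
    assert(usbp != NULL);
    usbp->state = SIM_USB_ACTIVE;
}

/* Returns the number of iterations spent in the suspend loop, or -1 if
 * the driver is still suspended after max_iterations (the lockup). */
static int sim_suspend_loop(sim_usb_driver_t *usbp, bool key_pressed,
                            bool use_hotfix, int max_iterations) {
    int iterations = 0;
    while (usbp->state == SIM_USB_SUSPENDED) {
        if (iterations >= max_iterations) {
            return -1; /* the real firmware has no such bound and spins forever */
        }
        iterations++;
        if (key_pressed) {
            sim_wakeup_host(usbp);
            if (use_hotfix) {
                sim_restart(usbp);
            }
        }
    }
    return iterations;
}
```

Calling `sim_suspend_loop` on a suspended driver with `use_hotfix` set to false hits the iteration bound, while the same call with `use_hotfix` set to true leaves the loop on the first pass.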

System Information

  • Keyboard:

    • Revision (if applicable): Phoenix (still under development; I will create a PR for this keyboard, which should be done this week)

  • Operating system: macOS 10.15
  • AVR GCC version: 9.2.0
  • ARM GCC version: 9.2.1 20191025 (release) [ARM/arm-9-branch revision 277599] (GNU Tools for Arm Embedded Processors 9-2019-q4-major)
  • QMK Firmware version: 0.9.49
  • Any keyboard related software installed?

    • [ ] AutoHotKey

    • [ ] Karabiner

    • [ ] Other:

Additional Context

The diff of my dumb hotfix:

diff --git a/tmk_core/protocol/chibios/main.c b/tmk_core/protocol/chibios/main.c
index 7d32c16ed..bdac09223 100644
--- a/tmk_core/protocol/chibios/main.c
+++ b/tmk_core/protocol/chibios/main.c
@@ -229,6 +229,7 @@ int main(void) {
                 /* Remote wakeup */
                 if (suspend_wakeup_condition()) {
                     usbWakeupHost(&USB_DRIVER);
+                    restart(&USB_DRIVER);
                 }
             }
             /* Woken up */
diff --git a/tmk_core/protocol/chibios/usb_main.c b/tmk_core/protocol/chibios/usb_main.c
index 65bd291be..d27db391a 100644
--- a/tmk_core/protocol/chibios/usb_main.c
+++ b/tmk_core/protocol/chibios/usb_main.c
@@ -559,6 +559,14 @@ void init_usb_driver(USBDriver *usbp) {
     chVTObjectInit(&keyboard_idle_timer);
 }

+void restart(USBDriver *usbp) {
+    usbStop(usbp);
+    usbDisconnectBus(usbp);
+    wait_ms(1500);
+    usbStart(usbp, &usbcfg);
+    usbConnectBus(usbp);
+}
+
 /* ---------------------------------------------------------
  *                  Keyboard functions
  * ---------------------------------------------------------
diff --git a/tmk_core/protocol/chibios/usb_main.h b/tmk_core/protocol/chibios/usb_main.h
index 94baf9b35..6083fa2bd 100644
--- a/tmk_core/protocol/chibios/usb_main.h
+++ b/tmk_core/protocol/chibios/usb_main.h
@@ -34,6 +34,7 @@

 /* Initialize the USB driver and bus */
 void init_usb_driver(USBDriver *usbp);
+void restart(USBDriver *usbp);

 /* ---------------
  * Keyboard header

bug help wanted

All 22 comments

@fredizzimo @tzarc @zvecr Any insights on this?
I'm asking because ZSA has seen similar reports of this, and I've experienced it myself, in the past

From my understanding, this bug seems to be caused by ChibiOS, or more precisely, by the usbWakeupHost call. If I wake the host with another device, the keyboard is able to detect the USB state change and thus leaves the infinite loop. Only when I wake the host with the keyboard itself do I see the keyboard finish the usbWakeupHost call while the USB state remains the same.

So I think this is either a bug in ChibiOS or a misconfiguration of the MCU. Anyhow, the workaround works well; I have not had the problem in 2 days.

A dirty hack is to modify the file
lib/chibios/os/hal/ports/STM32/LLD/OTGv1/hal_usb_lld.c
line 546

-  if (sts & GINTSTS_WKUPINT) {
+  if (sts & GINTSTS_WKUPINT || ((usbp->state == USB_SUSPENDED) && ((sts&GINTSTS_OEPINT) || (sts&GINTSTS_IEPINT)))) {

I patched this for the Matrix Noah and Matrix 2.0 Additional keyboards, and it works well under Windows, but I still get reports that it does not work on macOS. From what I know, WKUPINT is sent from the host to the keyboard when the host wakes up because of some other event. It seems the ChibiOS USB OTG driver does not correctly handle the case where the wakeup comes from the keyboard itself.
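The patched condition can be read as a small predicate: treat the bus as resumed if the host signalled wakeup, or if the driver still believes it is suspended but endpoint traffic is arriving. The sketch below restates that logic in isolation. The `SKETCH_*` names are hypothetical and the bit positions are illustrative placeholders; the real masks come from the ChibiOS STM32 OTG headers.

```c
#include <stdbool.h>
#include <stdint.h>

/* Illustrative placeholder masks; the real values live in the ChibiOS headers. */
#define SKETCH_GINTSTS_IEPINT  (UINT32_C(1) << 18)
#define SKETCH_GINTSTS_OEPINT  (UINT32_C(1) << 19)
#define SKETCH_GINTSTS_WKUPINT (UINT32_C(1) << 31)

/* Hypothetical stand-in for the driver state. */
typedef enum { SKETCH_USB_ACTIVE, SKETCH_USB_SUSPENDED } sketch_usb_state_t;

/* Same decision as the patched `if`, split into named parts. */
static bool sketch_should_handle_wakeup(uint32_t sts, sketch_usb_state_t state) {
    /* The host signalled resume on the bus. */
    bool host_resume = (sts & SKETCH_GINTSTS_WKUPINT) != 0;
    /* IN or OUT endpoint activity, which implies the host is talking to us. */
    bool endpoint_traffic =
        (sts & (SKETCH_GINTSTS_OEPINT | SKETCH_GINTSTS_IEPINT)) != 0;
    return host_resume ||
           (state == SKETCH_USB_SUSPENDED && endpoint_traffic);
}
```

Endpoint traffic while the driver is already active does not count as a wakeup here, which matches the extra `usbp->state == USB_SUSPENDED` check in the patch.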

@yulei wouldn't a better hack be to just call usb_lld_wakeup_host instead of usbWakeupHost, since all that does is check the state and then wake it up?

Actually, usbWakeupHost is just a wrapper for usb_lld_wakeup_host. The problem is that there is no spec for the host's behavior when it is resumed by a remote wakeup. Some discussion here: https://www.microchip.com/forums/m492730.aspx

What I observed on Windows is that Windows 10 does not RESET or RESUME the keyboard; it just sends control commands to endpoint 0 after it has been resumed by the keyboard.

So maybe the USB driver needs to leave the suspend state when it receives any signal such as RESET, RESUME or IN/OUT? Maybe we can dig through the LUFA code to get an answer :) The workaround by @LSChyi is great.
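As a rough illustration of that idea, the state transition could look like the following. This is a sketch under assumptions, not how ChibiOS or LUFA actually implement it: the `bus_event_t` and `dev_state_t` types and both function names are hypothetical.

```c
#include <stdbool.h>

/* Hypothetical bus events seen by the device. */
typedef enum {
    BUS_EV_NONE,
    BUS_EV_RESET,
    BUS_EV_RESUME,
    BUS_EV_IN,
    BUS_EV_OUT
} bus_event_t;

/* Hypothetical device-side driver state. */
typedef enum { DEV_ACTIVE, DEV_SUSPENDED } dev_state_t;

/* Any of these events means the host is awake and driving the bus. */
static bool bus_event_implies_host_awake(bus_event_t ev) {
    switch (ev) {
        case BUS_EV_RESET:
        case BUS_EV_RESUME:
        case BUS_EV_IN:
        case BUS_EV_OUT:
            return true;
        default:
            return false;
    }
}

/* Leave suspend on any sign of host activity; otherwise keep the state. */
static dev_state_t dev_next_state(dev_state_t state, bus_event_t ev) {
    if (state == DEV_SUSPENDED && bus_event_implies_host_awake(ev)) {
        return DEV_ACTIVE;
    }
    return state;
}
```

With this rule, the control transfers that Windows 10 sends to endpoint 0 after a remote wakeup would be enough to bring the device out of suspend, even without an explicit RESET or RESUME.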

If the workaround is ok, I can make a PR.

I'll have a look with my mac -- I don't usually put anything to sleep so I haven't really struck anything like this so far.

Yeah, we are getting a number of reports of this with the Moonlander, and some with the Planck EZ.

Which is up to date with QMK now.

Modifying lib/chibios/os/hal/ports/STM32/LLD/OTGv1/hal_usb_lld.c is definitely a dirty hack. I think it's a bad idea to modify the ChibiOS files bundled with QMK, as it adds maintenance burden. Any changes like this should be made upstream.

Just a heads up: the hal_usb_lld.c patch above (the extra check next to GINTSTS_WKUPINT) doesn't help unless OTG is enabled for the board, it looks like. For example, for many of the STM32F303 boards, it won't do anything to fix the issue.

However, the solution that @LSChyi posts does seem to work for said boards.

Ok, I can make a PR.

Not striking this issue with F411 at all -- how frequently is this actually happening?

Almost every time, and it is reproducible (that is how I found it is caused by the USB state not changing).

Puzzling, I guess I'll retry with the F401.

10 attempts with F401 black pill, zero instances of lockup after sleep.

I tested it on a Mac mini. When I put the computer to sleep, it takes some time before it tells the USB devices to enter the suspended state, and only then can the problem be reproduced. Could it be that your keyboard had not yet entered the suspended state?

@tzarc I made a video of it, you can have a look: https://www.youtube.com/watch?v=oWEjBHfVcPw&feature=youtu.be

Perfect -- you were on the right track with pointing out that the USB hadn't entered suspend. I modified the code to turn on the LED on C13, and only attempted wakeup after it turns on -- it's definitely locking up after that.

Not seeing this issue on F303 (Proton-C). I'm on macOS 10.15.6. I'm verifying that the chip is indeed getting the suspend signal with the backlight feature - although apparently the backlight_set(0) call on suspend is only implemented on AVR, so I had to add that in at the start of suspend_power_down(). Nevertheless, when I plug it in, hit the Sleep button and wait a short while, the backlight LED turns off. If I then hit a key, it lights up again, my MBP wakes up and I can immediately type.

This also doesn't seem to be an issue on Windows (10 at least) at all.

Weird, @drashna says above that there are a number of reports of this with the Moonlander, and some with the Planck EZ. As far as I know, the Planck is using the F303, so it should also have that issue.

Video proof: https://youtu.be/jO9ceiKaKfA
Unless I'm misunderstanding what the issue is?

@fauxpark you're right. I also tested the Black Pill on my MBP, and I still have the problem. Could it be an MCU misconfiguration?

Considering, as you point out, the Planck, Moonlander and EZ are also using the F303, they all are using the exact same *conf.h files as the Proton-C, so I'm not sure how that could be possible.
