After my PR #10689 was recently merged, I synced a project I'm working on up to Mbed-OS master. This project uses USB MSD.
Now, when I connect my device to a host (tried on Linux and Windows), I hit the following assert:
++ MbedOS Error Info ++
Error Status: 0x80FF0144 Code: 324 Module: 255
Error Message: Assertion failed: info->pending >= 1
Location: 0x6549
File: ./mbed-os/usb/device/USBDevice/USBDevice.cpp+955
Error Value: 0x0
Current Thread: main Id: 0x2000155C Entry: 0x76B5 StackSize: 0x1000 StackMem: 0x200023B0 SP: 0x2003FF00
For more info, visit: https://mbed.com/s/error?error=0x80FF0144&tgt=NRF52840_DK
-- MbedOS Error Info --
I then changed back to my feature branch (https://github.com/AGlass0fMilk/mbed-os/tree/nrf52840-usbphy-implementation), rebuild and the issue is gone. My device mounts as /dev/sdb fine.
Not sure if anything changed in the USB stack that might be causing this. I don't see any other recent changes.
Essentially, my main code ends with (after setting up FAT file system from example):
USBMSD usb(&qspi_blockdevice);
while (true) {
usb.process();
}
I am also printing to the debug UART. Other than that, there's not much else going on in the project.
I tested this on my USB Serial example here: https://github.com/AGlass0fMilk/mbed-usb-cdc-example.git
Same results. When I'm on my feature branch it works as expected. When I pull master (currently at 14b77c9a2e51acbfa7c675e680a516f89696f22c) the code hits the assert as soon as I plug in USB.
I haven't had luck tracing down the commit that introduced this issue. My feature branch isn't that stale so that should help narrow it down.
Testing on nRF52840_DK with GCC_ARM toolchain.
[ ] Question
[ ] Enhancement
[X] Bug
Did some testing where I rebased my feature branch against some newer commits that have been pulled into master until the issue came back. Rebasing on dc77c40fe9c9e3dacf71894e1acd320d0491f222 causes the issue to appear so it seems to be related to Cordio.
Maybe the USB interrupt priority needs to be higher?
Rebased with df84eb1b28214ac1646f66ba270a34d2af4fa27a (just before Cordio PR) and I can confirm it works as expected.
I tried adjusting the USB interrupt priority all the way to 0 (highest) and still the issue appeared, albeit maybe a little delayed compared to the lower priority.
I also tried removing Cordio from the build by removing the BLE feature from the nRF52840_DK and the issue still occurred. Not sure how that's happening.
@paul-szczepanek-arm Any idea what could be happening?
It'd be understandable if USB and BLE could not be used _at the same time_ but simply linking BLE in shouldn't disable USB.
If I revert the changes to targets.json introduced in dc77c40fe9c9e3dacf71894e1acd320d0491f222 the issue goes away. May be related to removing the soft device (although I don't use any soft-device aware functions that I know of. Maybe the power peripheral driver.)
The following changes to the MCU_NRF52840 target produces the issue:

Hopefully there isn't some weird reliance on the softdevice that the USB driver/hardware has...
Internal Jira reference: https://jira.arm.com/browse/MBOCUSTRIA-1302
My current theory is that the issue is related to how critical regions are handled when using the softdevice vs when not using the softdevice.
It seems that an IN interrupt is happening before the upper stack can officially queue a write operation. This violates the stack's logic and results in an assert. It seems to happen only on non-control endpoints? I have to do some more testing with different USB classes to see how the error changes.
Below are two screenshots of the same code (the usb serial example I linked to above). One compiled with softdevice support and one compiled without. The one without the SD abruptly stops after the assert happens.
Failure

No Failure:

It seems the cause of the issue lies in the simplified critical region API being used by the Nordic USB driver (and presumably all Nordic drivers). Normally, with a softdevice-enabled build, the critical region API used by the Nordic drivers is redirected to sd_nvic_critical_region_enter/exit:
This function is defined in nrf_nvic.h/c, which changes based on which softdevice "target" you enable (in this case, it's SOFTDEVICE_NONE):
This is the same function used by the Mbed critical_section HAL implementation for nRF52-based targets:
However, when the macro SOFTDEVICE_PRESENT is disabled, you see in the first code block above the Nordic app_util critical region API gets retargeted to app_util_disable/enable_irq.
This has a critical region counter that is entirely unaware of the number of nested critical regions Mbed may be in at the time and so if it isn't in a nested critical region, it globally enables interrupts again and completely crashes Mbed's expectation of being in a critical section.
I tested the fix in my fork and it fixed this issue. Needs to be CI tested with all the other Nordic drivers (which I'm surprised still worked with this going on):
I don't see a nice clean way of fixing this given the software architecture of Nordic's SDK. That discussion will be saved for the associated PR: #10881
That is some sterling work, thank you.
Excellent work indeed. Looks like PRs have been merged, can you close this?
Resolved by #10881
Hi @AGlass0fMilk ,
first of all thanks for the awesome patch.
After #10881 merge the bug is much less reproducible but I think I found a way to trigger it anyway.
Exposing as a SerialUSB (CDC) device and opening/closing the host serial port multiple times ( > 10 usually) triggers the same error path.
Here's the mbed debug output and the stacktrace
++ MbedOS Error Info ++
Error Status: 0x80FF0144 Code: 324 Module: 255
Error Message: Assertion failed: info->pending >= 1
Location: 0x123B9
File: ./mbed-os/usb/device/USBDevice/USBDevice.cpp+955
Error Value: 0x0
Current Thread: main Id: 0x2000126C Entry: 0x13341 StackSize: 0x8000 StackMem: 0x20002040 SP: 0x2003FF04
For more info, visit: https://mbed.com/s/error?error=0x80FF0144&tgt=ARDUINO_NANO33BLE
-- MbedOS Error Info --
#7 0x000123b8 in mbed_assert_internal ()
at ./mbed-os/platform/mbed_assert.c:26
#8 0x00019136 in USBDevice::in ()
at ./mbed-os/usb/device/USBDevice/USBDevice.cpp:955
#9 0x0001b2be in USBPhyHw::process ()
at ./mbed-os/usb/device/targets/TARGET_NORDIC/TARGET_MCU_NRF52840/USBPhy_Nordic.cpp:528
#10 0x0001908c in USBDevice::start_process ()
at ./mbed-os/usb/device/USBDevice/USBDevice.cpp:1555
#11 0x000172e6 in usbd_ep_data_handler ()
at ./mbed-os/targets/TARGET_NORDIC/TARGET_NRF5x/TARGET_SDK_15_0/integration/nrfx/legacy/nrf_drv_usbd.c:1219
#12 0x0001774c in ev_epdata_handler ()
at ./mbed-os/targets/TARGET_NORDIC/TARGET_NRF5x/TARGET_SDK_15_0/integration/nrfx/legacy/nrf_drv_usbd.c:1325
#13 0x00017fc6 in USBD_IRQHandler ()
Let me know if you need additional info or if it's enough to triage the issue :wink:
@paul-szczepanek-arm I'd ask you to reopen the issue too (or I can open another one if needed), thanks!
That's above my paygrade, @SeppoTakalo?
Reopening as per @facchinm's comments.
@facchinm When reproducing the issue, how fast do you open/close the CDC port? Is it very rapid or after 10 times in an hour or something? Are you unplugging the device and plugging it back in?
I'm thinking the driver may need to be updated to deal with this in a more robust way. My current idea is:
Phy driver keeps track of number of pending In/Out events that should happen (just as the upper layer is doing)
When it encounters a situation where this assert will be triggered (eg: no pending events on some endpoint but an event is occurring anyway) it caches the event and forwards it next time the upper layers make a call and would be expecting the event.
Thanks for reopening @AGlass0fMilk !
Both fast and slow open/close trigger the issue, and I never completely unplug the device.
Your proposal to make the driver more reliable looks very reasonable, if you need any help for testing a solution just tag me on the PR :wink:
@facchinm I haven't been able to reproduce the issue yet. I am using some example code I wrote here: https://github.com/AGlass0fMilk/mbed-usb-cdc-example
I am currently building with master (https://github.com/ARMmbed/mbed-os/commit/0b8ae1b0982b17c2bd53d0866ccb65639da012e3)
Can you provide some sample code and procedure for reproducing this issue? What software are you using to open and close the USB CDC/Serial port? What operating system is your USB host running? Can you give me any details of your hardware setup?
I'm in a very nonstandard environment (a wrapper built over mbed) but I've been able to replicate using this program (just remove the retarget if you are targeting a board with a correct porting layer)
#include "mbed.h"
#include "USBSerial.h"
USBSerial _serial(false);
PinName pin = P0_2;
mbed::UARTSerial serial(P1_3, P1_10, 115200);
namespace mbed {
FileHandle *mbed_override_console(int fd) {
return &serial;
}
FileHandle *mbed_target_override_console(int fd) {
return &serial;
}
}
int main() {
_serial.connect();
while (1) {
if (_serial.connected()) {
_serial.printf("%d\n", AnalogIn(pin).read_u16());
}
wait(0.001f);
}
}
Moving to mbed master (and swapping target to NRF52840_DK) made it more difficult to trigger the issue, but eventually I made it. I was previously branching off 9cdfe3778392dbd1286c2dc4bde9eea72f411b8e
++ MbedOS Error Info ++
Error Status: 0x80FF0144 Code: 324 Module: 255
Error Message: Assertion failed: info->pending >= 1
Location: 0x1E25
File: ./mbed-os/usb/device/USBDevice/USBDevice.cpp+955
Error Value: 0x0
Current Thread: rtx_idle Id: 0x20001888 Entry: 0x2F51 StackSize: 0x200 StackMem: 0x20001BD0 SP: 0x2003FF00
For more info, visit: https://mbed.com/s/error?error=0x80FF0144&tgt=NRF52840_DK
-- MbedOS Error Info --
= System will be rebooted due to a fatal error =
= Reboot count(=1) reached maximum, system will halt after rebooting =
As host OS I'm on Linux (kernel 5.2.2-arch1-1-ARCH), and to open/close the serial port I'm using the Arduino IDE (which uses jssc under the hood).
So I tested this with an nRF52840_DK by modifying your code a little (my dev kit requires hardware flow control for the debug uart):
#include "mbed.h"
#include "USBSerial.h"
USBSerial _serial(false);
PinName pin = P0_2;
mbed::UARTSerial serial(USBTX, USBRX, 115200);
namespace mbed {
FileHandle *mbed_override_console(int fd) {
serial.set_flow_control(mbed::SerialBase::RTSCTS, STDIO_UART_RTS, STDIO_UART_CTS);
return &serial;
}
}
int main() {
_serial.connect();
while (1) {
if (_serial.connected()) {
_serial.printf("%d\n", AnalogIn(pin).read_u16());
}
printf("hello!\r\n");
wait(0.001f);
}
}
I was able to read from both the debug serial and the target's USB serial port simultaneously without issue. I was also able to open and close the target's USB serial port well over 10 times without crashing. I was using the Arduino IDE under a Windows environment.
My first guess in your case is somehow the wrapper layer or some other part of your system config/setup is affecting the interrupt latency of the USB driver.
Feel free to reach out via email if you want to discuss details of your setup that are not yet public. I noticed you're working on porting a new Arduino BLE board based on the nRF52840.
I retested both the mbed plain and the mbed derived code on Windows and, surprise, it doesn't crash at all...
Instead, both Linux and OSX trigger the issue. It looks like both the OS send multiple line coding packets on open, one for every parameter.
I'll investigate a little bit and report back, thanks for taking a look!
@facchinm Should this issue be closed? Is the USB implementation working for the Arduino-MbedOS core?
In fact no, I added this hacky patch to the mbed tree we are using to avoid getting a million user tickets
```patch
From 788235de9cd4821884e36c98c5751902cae4c43e Mon Sep 17 00:00:00 2001
From: Martino Facchin m.facchin@arduino.cc
Date: Wed, 31 Jul 2019 12:48:04 +0200
Subject: [PATCH] HACK: avoid #10862 by not firing the assert
usb/device/USBDevice/USBDevice.cpp | 9 +++++----
1 file changed, 5 insertions(+), 4 deletions(-)
diff --git a/usb/device/USBDevice/USBDevice.cpp b/usb/device/USBDevice/USBDevice.cpp
index 8291fbce06..da69c4c1c6 100644
--- a/usb/device/USBDevice/USBDevice.cpp
+++ b/usb/device/USBDevice/USBDevice.cpp
@@ -952,10 +952,11 @@ void USBDevice::in(usb_ep_t endpoint)
endpoint_info_t *info = &_endpoint_info[EP_TO_INDEX(endpoint)];
Of course it's not acceptable upstream, so I didn't either try to publish it :slightly_smiling_face:
Any update on this issue? I'm seeing this same thing in 5.15.1
Thank you for raising this detailed GitHub issue. I am now notifying our internal issue triagers.
Internal Jira reference: https://jira.arm.com/browse/IOTOSM-2172