Mbed-os: MBED_TICKLESS causing error in rtx_idle thread (since 5.13.4)

Created on 3 Sep 2019 · 20 Comments · Source: ARMmbed/mbed-os

Description

I noticed that tickless mode, enabled in 5.13.4 for the K64F, causes the code below to fail:

#include "mbed.h"

int main() {

    wait_ms(1);
}

test command: mbed compile -t GCC_ARM -m k64f -f

 ++ MbedOS Error Info ++
Error Status: 0x80010133 Code: 307 Module: 1
Error Message: Mutex: 0x20001E08, Not allowed in ISR context
Location: 0x30D1
Error Value: 0x20001E08
Current Thread: rtx_idle  Id: 0x20000EDC Entry: 0x30DD StackSize: 0x200 StackMem: 0x20001220 SP: 0x2002FE58
For more info, visit: https://mbed.com/s/error?error=0x80010133&tgt=K64F
-- MbedOS Error Info --

= System will be rebooted due to a fatal error =
= Reboot count(=95) reached maximum, system will halt after rebooting

Issue request type


[ ] Question
[ ] Enhancement
[X] Bug

CLOSED mirrored bug

All 20 comments

cc @ARMmbed/team-nxp (https://github.com/ARMmbed/mbed-os/pull/10796 might be causing this?)

Hi @mmahadevan108, it seems the K64F is not quite working in TICKLESS mode;
it crashes due to your PR #10796.
Please review the implementation.

Meanwhile, @gpsimenos, could you please update the test results for the other NXP boards and temporarily turn off TICKLESS on the K64F?

It's strange; it passed all mbed-os tests before being enabled.

> It's strange; it passed all mbed-os tests before being enabled.

I think this is a corner case that is not covered by our Greentea tests.

It's strange that the mbed-os tests do not cover this. Something to look into?

The clue could be that adding a printf before the wait fixes the problem!
That is probably why all the Greentea tests passed: they print a lot.

PR https://github.com/ARMmbed/mbed-os/pull/11403 opened to disable tickless mode on K64F for now

In the last couple of weeks I have been enabling and testing tickless mode (by running all greentea tests on RaaS) on a number of boards that can theoretically support it.

The K66F, K82F and KW41Z passed all tests successfully.
The K22F, K64F and KW24D failed with several errors (some of them might be due to RaaS)
The KL82Z could not be tested due to RaaS problems.

I am unable to reproduce the failure on the current master branch

Can we confirm mbed_app.json contents, if any?

There was no mbed_app.json config
On our side, three people were able to reproduce this problem.

I am able to see it as well. @kjbracey-arm Thank you, I did have "mbed-trace.enable": 1, in mbed_app.json. Removing this line shows the failure.

The issue seems to be related to the below call that waits for the debug UART's TX to complete.
https://github.com/ARMmbed/mbed-os/blob/master/targets/TARGET_Freescale/TARGET_MCUXpresso_MCUS/api/sleep.c#L47

Below is the implementation. Should the code simply return with an error when busy instead of waiting for TX to complete?
https://github.com/ARMmbed/mbed-os/blob/master/targets/TARGET_Freescale/TARGET_MCUXpresso_MCUS/TARGET_MCU_K64F/serial_api.c#L769

I was able to root-cause the failure. The UART clock is not enabled when this function is called:
https://github.com/ARMmbed/mbed-os/blob/master/targets/TARGET_Freescale/TARGET_MCUXpresso_MCUS/TARGET_MCU_K64F/serial_api.c#L769

I will add a call to enable UART clock at the start of this function.

I have a question about this function. Should it wait for UART TX to complete, or should it return immediately if the UART is busy, thereby not going into deep sleep?

@mmahadevan108

> I will add a call to enable UART clock at the start of this function.

I think a better idea is to skip the serial_wait_tx_complete(STDIO_UART) call if the serial port is not enabled yet.

> I have a question about this function. Should it wait for UART TX to complete, or should it return immediately if the UART is busy, thereby not going into deep sleep?

What was the reason for adding the serial_wait_tx_complete(STDIO_UART) call to hal_deepsleep?
Maybe a better approach is to block deep sleep while a UART transmission is active?

> Below is the implementation. Should the code simply return with an error when busy instead of waiting for TX to complete?
> https://github.com/ARMmbed/mbed-os/blob/master/targets/TARGET_Freescale/TARGET_MCUXpresso_MCUS/TARGET_MCU_K64F/serial_api.c#L769

No, it should return immediately, because it's not busy. (So no need to wait).

> I have a question about this function. Should it wait for UART TX to complete, or should it return immediately if the UART is busy, thereby not going into deep sleep?

The device is being asked to deep sleep (because the core code believes there's nothing going on). If the HAL knows that it previously accepted serial bytes via serial_putc, and they're still draining out of a hardware buffer, which will stop draining if it does go into deep sleep, it has a few options:

1) Go into shallow sleep instead. This is not ideal, as it could spend an hour in shallow sleep instead of deep sleep just because the buffer was draining at the instant of the sleep request.
2) Go into shallow sleep instead, and enable the interrupt for the buffer becoming empty. (The interrupt handler doesn't have to do anything) That will wake it from its shallow sleep, and then it will go back to sleep, and this time go to deep sleep.
3) Wait for the buffer(s) to drain, then go to deep sleep.
4) Something involving interaction with the core code - eg locking deep sleep itself when it knows write is active (eg in serial_putc), and releasing it in a buffer empty interrupt. Similar to option 2, but with that interrupt handler installed all the time, not just for the sleep transition. Might lead to a lot of somewhat pointless "buffer empty" interrupts while active, but does stop the core code even considering deep sleep.

Third option is simplest good option _(no - see comment below)_, and what is being attempted here. I'm not sure option 2 or 4 are worth the complexity.
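For illustration, option 2 could be modelled on a host machine roughly as below. This is only a sketch under stated assumptions: `FakeUart`, `choose_sleep_mode` and `drain` are hypothetical names invented for this example, not part of the K64F HAL or of Mbed OS, and the "interrupt" is just a flag.

```cpp
// Hypothetical host-side model of a UART transmitter; not the K64F HAL.
struct FakeUart {
    int bytes_pending;          // bytes still draining from the TX buffer
    bool tx_empty_irq_enabled;  // models the "buffer empty" interrupt enable
};

enum class SleepMode { Shallow, Deep };

// Option 2: while TX is still draining, pick shallow sleep and arm the
// "buffer empty" interrupt so the core wakes up and retries the sleep.
// Once the buffer is empty, disarm the interrupt and allow deep sleep.
SleepMode choose_sleep_mode(FakeUart &uart)
{
    if (uart.bytes_pending > 0) {
        uart.tx_empty_irq_enabled = true;
        return SleepMode::Shallow;
    }
    uart.tx_empty_irq_enabled = false;
    return SleepMode::Deep;
}

// Models the hardware finishing the transmission during shallow sleep.
void drain(FakeUart &uart)
{
    uart.bytes_pending = 0;
}
```

In this model the first sleep request with pending bytes yields shallow sleep; after `drain()` the repeated request yields deep sleep, which is the two-step behaviour described in option 2.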

We do use something similar to option 2 for handling some low power timer complexity - eg scheduling an initial wake from deep sleep early, to allow for deep wake latency, then going back to shallow sleep until precise time expires. That is necessary to get good timing from a sleep_for(100) when it's deep - 97ms-ish of deep sleep, then shallow sleep until 100 precisely. So it is an option.
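The arithmetic behind that split can be sketched as follows. The 3 ms wake latency is an assumption inferred from the "97ms-ish" figure above, and `SleepPlan` and `plan_sleep` are hypothetical names used only for this example, not Mbed OS APIs.

```cpp
#include <cstdint>

// How a requested sleep is divided between deep and shallow sleep.
struct SleepPlan {
    std::uint32_t deep_ms;     // time spent in deep sleep
    std::uint32_t shallow_ms;  // time spent in shallow sleep afterwards
};

// Wake from deep sleep early by the (assumed) deep wake latency, then
// finish the remainder in shallow sleep so the total stays precise.
// If the request is shorter than the latency, deep sleep is skipped.
SleepPlan plan_sleep(std::uint32_t total_ms, std::uint32_t deep_wake_latency_ms)
{
    if (total_ms <= deep_wake_latency_ms) {
        return SleepPlan{0, total_ms};
    }
    return SleepPlan{total_ms - deep_wake_latency_ms, deep_wake_latency_ms};
}
```

With an assumed latency of 3 ms, `plan_sleep(100, 3)` gives 97 ms of deep sleep followed by 3 ms of shallow sleep.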

> I will add a call to enable UART clock at the start of this function.

No. Check if the UART clock is enabled, and if not, return immediately. Do the check from a hardware register, not a RAM flag, as that can reduce RAM usage (given that this function is getting pulled in by sleep whether or not any serial port is used).
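A rough host-side sketch of that check is below. The register layout and bit positions are purely illustrative assumptions (`FakeUartRegs`, `CLOCK_ENABLED_BIT` and `TX_COMPLETE_BIT` are hypothetical and do not correspond to real K64F registers); the point is only the order of the checks and that nothing loops.

```cpp
#include <cstdint>

// Hypothetical stand-ins for hardware registers; not the real K64F layout.
struct FakeUartRegs {
    std::uint32_t clock_gate;  // models a clock gating control register
    std::uint32_t status;      // models a UART status register
};

constexpr std::uint32_t CLOCK_ENABLED_BIT = 1u << 0;  // illustrative bit
constexpr std::uint32_t TX_COMPLETE_BIT   = 1u << 1;  // illustrative bit

// Returns true if this UART does not stand in the way of deep sleep.
bool uart_idle_for_deep_sleep(const FakeUartRegs &regs)
{
    // A UART whose clock is gated off cannot be transmitting, so return
    // immediately without reading its status register.
    if ((regs.clock_gate & CLOCK_ENABLED_BIT) == 0) {
        return true;
    }
    // Clock is on: report the current state instead of waiting for it.
    return (regs.status & TX_COMPLETE_BIT) != 0;
}
```

Reading the state from the (modelled) clock gate register rather than from a flag in RAM matches the suggestion above, since no extra state has to be kept per port.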

> I think a better idea is to skip the serial_wait_tx_complete(STDIO_UART) call if the serial port is not enabled yet.

The sleep code (or a loop elsewhere) should be calling this function for EVERY serial port. There's no reason to single out STDIO_UART. And the function should return immediately for any serial port that is disabled.
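As a sketch of what "every serial port" could look like, assuming a simple table of ports (`FakePort` and `all_ports_idle` are hypothetical names, not Mbed OS APIs):

```cpp
#include <array>
#include <cstddef>

// Hypothetical per-port state; a real HAL would read hardware registers.
struct FakePort {
    bool enabled;  // is the port's clock enabled?
    bool tx_busy;  // is a transmission still draining?
};

// Deep sleep is only acceptable if no enabled port is still transmitting.
// Disabled ports are skipped immediately, whatever their other state says.
template <std::size_t N>
bool all_ports_idle(const std::array<FakePort, N> &ports)
{
    for (const FakePort &port : ports) {
        if (port.enabled && port.tx_busy) {
            return false;
        }
    }
    return true;
}
```

No port is treated specially here, so the stdio UART is handled the same way as any other.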

I need to correct my comment above - I didn't properly understand what @mmahadevan108 was suggesting.

Yes, it is (much) better to have hal_deepsleep return immediately if temporarily unable to enter deep sleep than for it to sit and loop. Any "sleep" call is allowed to be a no-op and return immediately for any reason.

If the sleep returns, then the core code will repeat it - any sleep must be in some sort of "while nothing to do" loop. But importantly, during that loop interrupts get a chance to run, which will potentially cancel the sleep attempt.
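The shape of such a loop can be sketched on a host machine like this. It is a simplified model, not the actual code in mbed_os_timer.cpp: `idle_until_work`, `LoopStats` and the two callbacks are hypothetical, and real interrupt handling is reduced to whatever the callbacks do.

```cpp
#include <functional>

// Counts how often the loop attempted to sleep (for illustration only).
struct LoopStats {
    int sleep_attempts;
};

// "While nothing to do" loop: each try_sleep() call is allowed to be a
// no-op that returns immediately, for example because a UART is still
// draining. The loop simply re-checks for work and tries again.
LoopStats idle_until_work(const std::function<bool()> &work_pending,
                          const std::function<void()> &try_sleep)
{
    LoopStats stats{0};
    while (!work_pending()) {
        try_sleep();
        ++stats.sleep_attempts;
    }
    return stats;
}
```

A sleep attempt that returns early costs one more trip around the loop, and between attempts the condition is re-evaluated, which is where an interrupt that created new work would cancel the sleep.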

You can see the sleep logic here (it's abstracted away so much via the templated operation nonsense, it's actually quite clear):

https://github.com/ARMmbed/mbed-os/blob/567479792cf7a52cda7d405eaca2cc7ab8843664/platform/mbed_os_timer.cpp#L143-L180

I think @mmahadevan108's fix resolved my issue with TICKLESS. Could you confirm that the fix works for you, @gpsimenos @maciejbocianski?

Works for me
