Mbed-os: TRNG Error: Conflict between BLE and Wi-SUN

Created on 17 Jun 2020 · 22 Comments · Source: ARMmbed/mbed-os

Description of defect

When the BLE stack and the Wi-SUN interface run simultaneously, the application raises a "Fatal Run-time error" in the HAL_RNG_GenerateRandomNumber function. The error does not occur until both WisunInterface and BLE are up.

After some investigation, I found that the error occurs in stm32wbxx_hal_rng.c at line 574:

    while (__HAL_RNG_GET_FLAG(hrng, RNG_FLAG_DRDY) == RESET)
    {
      if ((HAL_GetTick() - tickstart) > RNG_TIMEOUT_VALUE)
      {
        hrng->State = HAL_RNG_STATE_READY;
        hrng->ErrorCode = HAL_RNG_ERROR_TIMEOUT;
        /* Process Unlocked */
        __HAL_UNLOCK(hrng);
        return HAL_ERROR;
      }
    }
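
To make the failure mode concrete, here is a minimal host-side sketch of the same polling pattern. It is not the HAL code: `get_tick`, `drdy_is_set`, `wait_data_ready`, `run_wait` and the timeout constant are hypothetical stand-ins so that the loop can run on a PC. It illustrates that if the data-ready flag never becomes set, the loop can only exit through the timeout branch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical timeout, in model ticks (not the HAL's RNG_TIMEOUT_VALUE). */
#define SKETCH_TIMEOUT_TICKS 2U

static uint32_t fake_tick;      /* model tick counter, advances on every read */
static uint32_t drdy_ready_at;  /* tick at which the model DRDY flag becomes set */
static bool     drdy_never;     /* true models an RNG that never produces data */

/* Stand-in for HAL_GetTick(): returns the current tick, then advances it. */
static uint32_t get_tick(void) { return fake_tick++; }

/* Stand-in for reading the DRDY flag. */
static bool drdy_is_set(void)
{
    return !drdy_never && fake_tick >= drdy_ready_at;
}

/* Returns 0 when data became ready, -1 on timeout. */
int wait_data_ready(uint32_t timeout_ticks)
{
    assert(timeout_ticks > 0U);
    uint32_t tickstart = get_tick();
    while (!drdy_is_set()) {
        /* Unsigned subtraction stays correct across tick wrap-around. */
        if ((get_tick() - tickstart) > timeout_ticks) {
            return -1;
        }
    }
    return 0;
}

/* Helper that resets the model and runs one wait. */
int run_wait(bool stuck, uint32_t ready_at)
{
    fake_tick = 0U;
    drdy_never = stuck;
    drdy_ready_at = ready_at;
    return wait_data_ready(SKETCH_TIMEOUT_TICKS);
}
```

Calling `run_wait(true, 0)` models the situation reported here, where the flag stays in the RESET state and the only way out is the timeout path.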

Target(s) affected by this defect ?

NUCLEO_WB55RG

Toolchain(s) (name and version) displaying this defect ?

GCC for Arm (gcc-arm-none-eabi-9-2019-q4-major)

What version of Mbed-os are you using (tag or sha) ?

mbed-os-6.0.0 (https://github.com/ARMmbed/mbed-os/commit/165be79392ae7b1bee4388d2bc8ed8281978f07c)

What version(s) of tools are you using. List all that apply (E.g. mbed-cli)

mbed-cli 1.10.2

How is this defect reproduced ?

This issue occurs either when the BLE stack is started after the Wi-SUN interface is connected or when the Wi-SUN interface calls the connect method after the BLE initialization. It seems that the problem is related to both instances trying to access the TRNG.

I'm using the latest version of the BLE Full stack firmware from STM32Cube_FW_WB_V1.7.0.

@jeromecoutant @mikter

CLOSED st mirrored bug

All 22 comments

Thank you for raising this detailed GitHub issue. I am now notifying our internal issue triagers.
Internal Jira reference: https://jira.arm.com/browse/MBOTRIAGE-2723

Hi @eriknayan

I am investigating with BLE experts, but in parallel you can check the application note describing RNG use in both STM32WB cores (M4 and M0+):
https://www.st.com/resource/en/application_note/dm00598033-building-wireless-applications-with-stm32wb-series-microcontrollers-stmicroelectronics.pdf

Note also that there have been a few updates to STM32WB since the 6.0 release to enable AES HW crypto and USB device, so maybe you can try with the master branch?

@LMESTM

Hi @jeromecoutant ,

Thanks for the quick answer!

Sure, I'm going to take a look at the AN you sent me to help find what may be the cause of the issue. It looks like the TRNG can't provide a new number when both BLE and Wi-SUN are up (because RNG_FLAG_DRDY is always in RESET state).

Regarding the test with the master branch, I've just pulled it here and I still get the same error.

It looks like the TRNG can't provide a new number when both BLE and Wi-SUN are up (because RNG_FLAG_DRDY is always in RESET state).

We need to check why using the CFG_HW_RNG_SEMID semaphore is not sufficient.

Maybe we can have a look at this part:
https://github.com/ARMmbed/mbed-os/blob/master/targets/TARGET_STM/trng_api.c#L41

Regarding the test with the master branch, I've just pulled it here and I still get the same error.

Thanks for the test.

Yes, I guess it's probably related to concurrent TRNG access by both interfaces.

Maybe we can have a look at this part:
https://github.com/ARMmbed/mbed-os/blob/master/targets/TARGET_STM/trng_api.c#L41

I'm going to have a look at this!

Hi @jeromecoutant , any news about the issue?

https://www.st.com/resource/en/application_note/dm00598033-building-wireless-applications-with-stm32wb-series-microcontrollers-stmicroelectronics.pdf

I've read the AN and couldn't figure out what the problem may be. It looks like Sem0 should be enough to manage sharing of the RNG IP between the two CPUs, but for some reason the RNG gets stuck and can't provide new numbers once the Wi-SUN interface and the BLE stack are up simultaneously.
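
The sharing scheme described above can be modelled on a host machine as follows. This is only a sketch of the idea, not the firmware code: `hsem_try_lock`, `hsem_release`, `shared_rng_read` and the core identifiers are hypothetical names, a plain variable stands in for the hardware semaphore (Sem0), and a simple xorshift generator stands in for the TRNG peripheral.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical core identifiers; CORE_NONE means "semaphore free". */
enum { CORE_NONE = 0, CORE_M4 = 1, CORE_M0 = 2 };

static int      sem0_owner = CORE_NONE; /* models hardware semaphore Sem0 */
static uint32_t rng_state  = 1U;        /* models the shared RNG peripheral */

/* One-step lock: succeeds only when the semaphore is free. */
bool hsem_try_lock(int core)
{
    if (sem0_owner != CORE_NONE) {
        return false;
    }
    sem0_owner = core;
    return true;
}

/* Only the owning core may release the semaphore. */
bool hsem_release(int core)
{
    if (sem0_owner != core) {
        return false;
    }
    sem0_owner = CORE_NONE;
    return true;
}

/* Reads one value from the shared RNG, or fails if another core holds it. */
bool shared_rng_read(int core, uint32_t *out)
{
    assert(out != NULL);
    if (!hsem_try_lock(core)) {
        return false;   /* the other CPU is currently using the RNG */
    }
    /* Placeholder generator (xorshift32); the real peripheral is a TRNG. */
    rng_state ^= rng_state << 13;
    rng_state ^= rng_state >> 17;
    rng_state ^= rng_state << 5;
    *out = rng_state;
    return hsem_release(core);
}
```

In this model a read fails cleanly while the other core holds the semaphore and succeeds again after release, which is the behaviour one would expect from Sem0 if nothing else interferes with the RNG.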

Hi @jeromecoutant ,

I've commented out these lines and I got the following MbedOS Error:

Error Status: 0x80FF0100 Code: 256 Module: 255
Error Message: Fatal Run-time error
Location: 0x8084D57
Error Value: 0x0
Current Thread: main Id: 0x20016D88 Entry: 0x807B895 StackSize: 0x6E20 StackMem: 0x2001D1A0 SP: 0x20023DD4 
For more info, visit: https://mbed.com/s/error?error=0x80FF0100&tgt=NUCLEO_WB55RG
-- MbedOS Error Info --
Only 1 RNG instance supported

So apparently the semaphore is working as expected...

I've run this test here and still have the same issue:

++ MbedOS Error Info ++
Error Status: 0x80FF0100 Code: 256 Module: 255
Error Message: Fatal Run-time error
Location: 0x8084D71
Error Value: 0x0
Current Thread: main Id: 0x20016D88 Entry: 0x807B885 StackSize: 0x6E20 StackMem: 0x2001D198 SP: 0x20023DCC 
For more info, visit: https://mbed.com/s/error?error=0x80FF0100&tgt=NUCLEO_WB55RG
-- MbedOS Error Info --
trng_init: HAL_RNG_GenerateRandomNumber

With the RNG_inited variable, the application runs without problems until the Wi-SUN EAP-TLS handshake:

[DBG ][tlsp]: TLS: start
[ERR ][tlsl]: drbg seed fail
[ERR ][tlsp]: TLS: library init fail
[DBG ][tlsp]: TLS: finish
[ERR ][eaps]: EAP-TLS: handshake fatal error
[INFO][eaps]: EAP-TLS finish

dd86af6

Unfortunately I have the same error as above.

Maybe I can create an application running only the Wi-SUN Interface and the BLE Stack so we can test it in an isolated scenario @jeromecoutant ?

Hi again @jeromecoutant

As I was trying to implement some possible fixes to the problem, I found something interesting:

With only the addition of LL_HSEM_1StepLock(HSEM, 5) in system_clock.c, and using the original trng_api.c, I could use BLE + Wi-SUN without problems, but only if I started the BLE stack after the Wi-SUN interface had established its connection. Otherwise, I got a different EAP-TLS handshake error:

[ERR ][tlsp]: TLS: error
[DBG ][tlsp]: TLS: finish
[ERR ][eaps]: EAP-TLS: handshake failed
[INFO][eaps]: EAP-TLS: send RESPONSE type TLS id 3 flags 0 len 17
[INFO][eaps]: EAP-TLS finish

I think that for now this can be a workaround but there's still a conflict in RNG sharing.

Yes, I think we are close !

However, I still can't init the BLE stack before the Wi-SUN interface, even with jeromecoutant@53714fa. I'm trying to figure out what may be interfering with the EAP-TLS handshake.

Hi
Just to be sure. Are you using the latest BLE FW ?
https://github.com/ARMmbed/mbed-os/tree/master/targets/TARGET_STM/TARGET_STM32WB#ble-fw-update

Hi @jeromecoutant ,

Yes, I believe so. I'm currently running the stm32wb5x_BLE_Stack_full_fw.bin which comes with the STM32Cube_FW_WB_V1.7.0 package.

In order to have an isolated scenario for this test, I created this very simple code example from BLE_Beacon, with just a small addition to verify the Mbed TLS behaviour alongside BLE:

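#include "mbedtls/entropy.h"

// BeaconDemo, event_queue and schedule_ble_events come from the BLE_Beacon example.
int main()
{
    BLE &ble = BLE::Instance();
    ble.onEventsToProcess(schedule_ble_events);

    BeaconDemo demo(ble, event_queue);
    demo.start();

    // Initialise an entropy context after the BLE stack has been started.
    mbedtls_entropy_context *entropy = new mbedtls_entropy_context;
    mbedtls_entropy_init(entropy);

    // Run the Mbed TLS entropy source self-test (0 = non-verbose).
    int ret = mbedtls_entropy_source_self_test(0);
    if (ret != 0) {
        printf("MBED TLS ERROR\r\n");
    } else {
        printf("MBED TLS OK\r\n");
    }

    // Release the entropy context before leaving main().
    mbedtls_entropy_free(entropy);
    delete entropy;

    return 0;
}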

When I run mbedtls_entropy_source_self_test before the BLE init, it works as expected. However, if I run it after the BLE init, there are two different error scenarios:

  1. When I apply your commit 53714fa, the application returns an MbedOS error, which according to the Fault Handler tool corresponds to this:
Crash Info:
    Crash location = HAL_RNG_Init [0x801840A] (based on PC value)
    Caller location = HAL_RNG_Init [0x80183F3] (based on LR value)
    Stack Pointer at the time of crash = [20004220]
    Target and Fault Info:
        Processor Arch: ARM-V7M or above
        Processor Variant: C24
        Forced exception, a fault with configurable priority has been escalated to HardFault
        Imprecise data access error has occurred
  2. If no changes are made to the mbed-os master, there is no application fault, but mbedtls_entropy_source_self_test fails. I verified that it is actually the while (LL_HSEM_1StepLock(HSEM, CFG_HW_HSI48_SEMID)); loop that is responsible for the fault, although I don't have this issue when I add it in my custom application code.

Maybe you can use this code to verify the concurrency problem as well. Meanwhile, I'll keep testing and trying to find a fix for this issue.
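
One way to narrow down a problem around a spin loop such as while (LL_HSEM_1StepLock(HSEM, CFG_HW_HSI48_SEMID)); is to bound the number of attempts, so that a semaphore that is never released shows up as an error code instead of a stall. The sketch below models this on a host. `try_lock_stub`, `lock_with_retries`, `run_lock` and the retry limits are hypothetical, and the stub follows the convention implied by the spin loop above of returning 0 on success.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stub state: number of failed attempts before the lock is
 * granted. A negative value models a semaphore that is never released. */
static int32_t attempts_until_free;

/* Returns 0 when the lock is taken, 1 otherwise. */
static uint32_t try_lock_stub(void)
{
    if (attempts_until_free < 0) {
        return 1U;
    }
    if (attempts_until_free == 0) {
        return 0U;
    }
    attempts_until_free--;
    return 1U;
}

/* Tries to take the lock at most max_attempts times.
 * Returns the number of attempts used, or -1 if the lock was never obtained. */
int32_t lock_with_retries(int32_t max_attempts)
{
    assert(max_attempts > 0);
    for (int32_t i = 1; i <= max_attempts; i++) {
        if (try_lock_stub() == 0U) {
            return i;
        }
    }
    return -1;
}

/* Helper that configures the stub and runs one bounded acquisition. */
int32_t run_lock(int32_t busy_attempts, int32_t max_attempts)
{
    attempts_until_free = busy_attempts;
    return lock_with_retries(max_attempts);
}
```

On the target, the -1 result would be the point at which to log which semaphore could not be taken, rather than spinning indefinitely.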

Hi @jeromecoutant

I have some interesting results after applying your patch:

  1. In mbed-os 6.0, I still get a TLS error during the handshake in my Wi-SUN application. However, in the modified BLE_Beacon example it worked! Now the TLS test returns OK even when the BLE stack is started at the beginning of the application.

  2. In mbed-os-5.15.4, the fix also worked for my application: I was able to start the BLE stack before the Wi-SUN interface, and the EAP-TLS handshake ran successfully.
