Mbed-os: Cypress: FPGA: UART random failure

Created on 13 Nov 2019  路  23Comments  路  Source: ARMmbed/mbed-os

Description of defect

Cypress: FPGA: UART random failure

Step to reproduce:

  1. Run UART tests
  2. Comment all test in mbed_hal_fpga_ci_test_shield/uart/main.cpp, uncomment any test for example line 313.
  3. Comment line 252 - 271
  4. Run test (repeate couple times)

Expected: PASS

Actual:
tester.reset() returned:

++ MbedOS Error Info ++
[1573648317.62][CONN][RXD] Error Status: 0x80010133 Code: 307 Module: 1
[1573648317.68][CONN][RXD] Error Message: Mutex: 0x8003B14, Not allowed in ISR context
[1573648317.70][CONN][RXD] Location: 0x100095AD
[1573648317.73][CONN][RXD] Error Value: 0x8003B14
[1573648317.84][CONN][RXD] Current Thread: main Id: 0x8003B38 Entry: 0x10009477 StackSize: 0x1000 StackMem: 0x8004A18 SP: 0x80055D4
[1573648317.94][CONN][RXD] For more info, visit: https://mbed.com/s/error?error=0x80010133&tgt=CY8CKIT_062_WIFI_BT

This test PASSED if comment line 251.

Target(s) affected by this defect ?

All Cypress boards. Tested on CY8CKIT_062_WIFI_BT

Toolchain(s) (name and version) displaying this defect ?

Tested on GCC_ARM

What version of Mbed-os are you using (tag or sha) ?

Latest from main.
https://github.com/ARMmbed/mbed-os/commit/f1848f9e6d7005ca40e5c0c4dd0aec133bab124a

What version(s) of tools are you using. List all that apply (E.g. mbed-cli)

N/A

How is this defect reproduced ?

50/50

CLOSED mirrored bug

All 23 comments

@0xc0170 @mprse

Latest from main.

please report SHA, helps to get fixed version, edit the above to contain SHA.

Latest from main.

please report SHA, helps to get fixed version, edit the above to contain SHA.

Done. Thanks.

@mprse Please take a llok if you have time.

@mprse I see this exception from tester (FPGA) side.

So if you run only one test case (any) it fails randomly?

Comment line 252 - 271

What does it mean? What is the purpose of removing these lines:
https://github.com/ARMmbed/mbed-os/blob/699372421a3b388fe568e9be85b1a985749a438f/TESTS/mbed_hal_fpga_ci_test_shield/uart/main.cpp#L252-L271

Mutex: 0x8003B14, Not allowed in ISR context
This test PASSED if comment line 251:

https://github.com/ARMmbed/mbed-os/blob/699372421a3b388fe568e9be85b1a985749a438f/TESTS/mbed_hal_fpga_ci_test_shield/uart/main.cpp#L251

It looks like something bad happens inside serial interrupt handler - some disallowed operation and we got an exception. serial_irq_set() enables serial interrupts, so if you remove this line probably interrupt mode is disabled, so you don't have the exception.

@fkjagodzinski have you seen such behavior?

@mprse

So if you run only one test case (any) it fails randomly?

Yes.

Comment line 252 - 271
What does it mean? What is the purpose of removing these lines:

>

It is not important. I commented because I wanted to isolate the problem.

@fkjagodzinski ?

3. Comment line 252 - 271

++ MbedOS Error Info ++
[1573648317.62][CONN][RXD] Error Status: 0x80010133 Code: 307 Module: 1
[1573648317.68][CONN][RXD] Error Message: Mutex: 0x8003B14, Not allowed in ISR context
[1573648317.70][CONN][RXD] Location: 0x100095AD
[1573648317.73][CONN][RXD] Error Value: 0x8003B14
[1573648317.84][CONN][RXD] Current Thread: main Id: 0x8003B38 Entry: 0x10009477 StackSize: 0x1000 StackMem: 0x8004A18 SP: 0x80055D4
[1573648317.94][CONN][RXD] For more info, visit: https://mbed.com/s/error?error=0x80010133&tgt=CY8CKIT_062_WIFI_BT

This looks suspicious. On current master (and also the SHA you mentioned -- f1848f9) you should not remove line 252 alone; you should also remove 249 & 251.

https://github.com/ARMmbed/mbed-os/blob/2e96145b7607de430235dd795ab5350c1d4d64d7/TESTS/mbed_hal_fpga_ci_test_shield/uart/main.cpp#L249-L252

I do not have any Cypress target with an Arduino form factor to test with FPGA shield but I can see a similar crash on K64F if I comment out 252-271:

[1574070560.88][CONN][RXD] ++ MbedOS Error Info ++
[1574070560.92][CONN][RXD] Error Status: 0x80010133 Code: 307 Module: 1
[1574070560.98][CONN][RXD] Error Message: Mutex: 0x2000113C, Not allowed in ISR context
[1574070560.99][CONN][RXD] Location: 0x6817
[1574070561.03][CONN][RXD] Error Value: 0x2000113C
[1574070561.13][CONN][RXD] Current Thread: main Id: 0x20001160 Entry: 0x6747 StackSize: 0x1000 StackMem: 0x200022B0 SP: 0x20002F50
[1574070561.21][CONN][RXD] For more info, visit: https://mbed.com/s/error?error=0x80010133&tgt=K64F
[1574070561.23][CONN][RXD] -- MbedOS Error Info --

If you like to skip DUT RX (IRQ) / FPGA TX part of the test, I suggest commenting out all the related code (lines 239-269):
https://github.com/ARMmbed/mbed-os/blob/2e96145b7607de430235dd795ab5350c1d4d64d7/TESTS/mbed_hal_fpga_ci_test_shield/uart/main.cpp#L239-L269

@mprse @0xc0170 Please take a look @fkjagodzinski can reproduce this issue:

I do not have any Cypress target with an Arduino form factor to test with FPGA shield but I can see a similar crash on K64F if I comment out 252-271:

I think that we are missing the point here. @fkjagodzinski reproduced your failure on K64F by removing some random part of the test that you mentioned. Original test version passes on the K64F.

I think that we should start again since I think that no one understands now where or even if we have a problem.

Please provide full test output with the failure (without modifying the test).

I think that we are missing the point here. @fkjagodzinski reproduced your failure on K64F by removing some random part of the test that you mentioned. Original test version passes on the K64F.

I think that we should start again since I think that no one understands now where or even if we have a problem.

Please provide full test output with the failure (without modifying the test).

OK.

I think that we are missing the point here. @fkjagodzinski reproduced your failure on K64F by removing some random part of the test that you mentioned. Original test version passes on the K64F.

I think that we should start again since I think that no one understands now where or even if we have a problem.

Please provide full test output with the failure (without modifying the test).

@mprse Please see test output (without modifying the test):

[1574157839.82][HTST][INF] host test executor ver. 0.0.9
[1574157839.82][HTST][INF] copy image onto target... SKIPPED!
[1574157839.82][HTST][INF] starting host test process...
[1574157839.83][CONN][INF] starting connection process...
[1574157839.83][CONN][INF] notify event queue about extra 60 sec timeout for serial port pooling
[1574157839.83][CONN][INF] initializing serial port listener...
[1574157839.83][SERI][INF] serial(port=/dev/tty.usbmodem14344302, baudrate=9600, read_timeout=0.01, write_timeout=5)
[1574157839.83][HTST][INF] setting timeout to: 60 sec
[1574157839.83][SERI][TXD] mbedmbedmbedmbedmbedmbedmbedmbedmbedmbed
[1574157839.83][CONN][INF] sending up to 2 __sync packets (specified with --sync=2)
[1574157839.83][CONN][INF] sending preamble '623e60f4-5280-4229-a350-36c6211c7437'
[1574157839.83][SERI][TXD] {{__sync;623e60f4-5280-4229-a350-36c6211c7437}}
[1574157839.96][CONN][RXD] mbedmbedmbedmbedmbedmbedmbedmbed
[1574157840.01][CONN][INF] found SYNC in stream: {{__sync;623e60f4-5280-4229-a350-36c6211c7437}} it is #0 sent, queued...
[1574157840.02][HTST][INF] sync KV found, uuid=623e60f4-5280-4229-a350-36c6211c7437, timestamp=1574157840.014832
[1574157840.04][CONN][INF] found KV pair in stream: {{__version;1.3.0}}, queued...
[1574157840.04][HTST][INF] DUT greentea-client version: 1.3.0
[1574157840.06][CONN][INF] found KV pair in stream: {{__timeout;240}}, queued...
[1574157840.06][HTST][INF] setting timeout to: 240 sec
[1574157840.09][CONN][INF] found KV pair in stream: {{__host_test_name;default_auto}}, queued...
[1574157840.09][HTST][INF] host test class: ''
[1574157840.09][HTST][INF] host test setup() call...
[1574157840.09][HTST][INF] CALLBACKs updated
[1574157840.09][HTST][INF] host test detected: default_auto
[1574157840.11][CONN][INF] found KV pair in stream: {{__testcase_count;1}}, queued...
[1574157840.14][CONN][RXD] >>> Running 1 test cases...
[1574157840.20][CONN][RXD]
[1574157840.20][CONN][INF] found KV pair in stream: {{__testcase_name;basic, 9600, 8N1, FC off}}, queued...
[1574157840.25][CONN][RXD] >>> Running case #1: 'basic, 9600, 8N1, FC off'...
[1574157840.29][CONN][INF] found KV pair in stream: {{__testcase_start;basic, 9600, 8N1, FC off}}, queued...
[1574157840.32][CONN][RXD] Skipping TX pin D1 (327681)
[1574157840.35][CONN][RXD] Skipping TX pin D14 (393217)
[1574157840.39][CONN][RXD] Skipping TX pin D12 (786433)
[1574157840.42][CONN][RXD] Skipping RX pin D0 (327680)
[1574157840.45][CONN][RXD] Skipping RX pin D15 (393216)
[1574157840.48][CONN][RXD] Skipping RX pin D11 (786432)
[1574157840.53][CONN][RXD] Testing all UART-no-FC peripherals
[1574157840.58][CONN][RXD] 1080098816 - Could not find pins to test peripheral
[1574157840.67][CONN][RXD] 1080164352 - peripheral tested on port: peripheral=(1080164352) TX=(A1) RX=(A0) ...succeeded

No responce...

@mprse This test expects data from tester (FPGA):
Last main.cpp call - line 251 (mbed_hal_fpga_ci_test_shield/uart/main.cpp)

RemoteDesktopManager64_NgpXNJVWwo

I'm on sick leave. Will chcek later.

So the last call in the test is enabling the rx interrupt. And then you land in cyhal_uart_getc()? It looks like the rx interrupt fired and it tries to get data from uart buffer... This is probably some issue if getc is called immediately after enabling rx interrupt.

@mprse Get well. @fkjagodzinski

This test consist of 4 test cases:

  1. // DUT TX / FPGA RX
  2. // DUT RX / FPGA TX
  3. // DUT TX (IRQ) / FPGA RX
  4. // DUT RX (IRQ) / FPGA TX

I tried Run 4) separatelly. But it stucked on line 255.
If run 3) and 4) test some time passed because it used tester data from 3).

It looks like test issue.

Issue is not reproducible with code change:

1111

Issue is not reproducible with code change:

I don't see what is changed. Can you rather include git diff ?

1111

Hello again. According to the defined behavior of serial_irq_set():

https://github.com/ARMmbed/mbed-os/blob/7177d8fefe2626d36bb576917cff70f466eb0fdf/hal/serial_api.h#L141-L145

In the test, we are enabling RX interrupt which should fire when Receive Data Register is full, however it looks like the interrupt is triggered just after enabling it (where no data should be received). This means that for sure this is a Cypress driver issue.

In the following line we can see that the following interrupts are enabled (CYHAL_UART_IRQ_RX_DONE | CYHAL_UART_IRQ_RX_ERROR | CYHAL_UART_IRQ_RX_NOT_EMPTY):
https://github.com/ARMmbed/mbed-os/blob/7177d8fefe2626d36bb576917cff70f466eb0fdf/targets/TARGET_Cypress/TARGET_PSOC6/cy_serial_api.c#L152-L153

I suggest leaving only CYHAL_UART_IRQ_RX_NOT_EMPTY. That would be consistent with the defined behavior. Maybe CYHAL_UART_IRQ_RX_DONE | CYHAL_UART_IRQ_RX_ERROR are causing that interrupt is fired immediately after enabling when no data is received yet.
Alternatively, try to flush RX buffer before enabling the RX interrupt.

Created #12052

Was this page helpful?
0 / 5 - 0 ratings

Related issues

drahnr picture drahnr  路  4Comments

bcostm picture bcostm  路  4Comments

bcostm picture bcostm  路  4Comments

chrissnow picture chrissnow  路  4Comments

neilt6 picture neilt6  路  4Comments