PR https://github.com/ARMmbed/mbed-os/pull/4912 introduced changes that somehow broke SPI on NRF targets. The ci-test-shield test tests-api-spi hangs on the case 'SPI - SD card exists' (the first case that involves real communication).
I have verified that the test still hangs even when I force the device not to go to sleep (I mocked the HAL sleep function with an empty one). So it looks like the problem is not related to the newly introduced sleep management but rather to the RTX changes.
Target
NRF52_DK, NRF52840_DK
Toolchain:
GCC_ARM (maybe others as well)
mbed-os rev
The bug has been reproducible since rev https://github.com/ARMmbed/mbed-os/commit/cb4e9b32a2933553630ff1ed40f9bdb77e51fda2
ci-test-shield
d1d016383e5776c7b0dd68736256cd3c4405598c (just master HEAD)
Actual behavior
Steps to reproduce
Run the tests-api-spi ci-test-shield test.
You pinpointed rev cb4e9b3: this one does not work, but any earlier commit works?
I'll try to reproduce this issue.
Yes, the failure starts with this revision (inclusive). Any earlier revision works well.
Is this only for NRF52 DK? I can't reproduce on NRF51DK
Taken from https://github.com/ARMmbed/ci-test-shield/blob/master/TESTS/API/SPI/SPI.cpp
#include "mbed.h"
#include "FATFileSystem.h"
#include "SDBlockDevice.h"
#include "mbed_error.h"

// main() runs in its own thread in the OS
int main() {
    int result = 0;
    SDBlockDevice sd(D11, D12, D13, D10);
    FATFileSystem fs("sd");
    sd.init();
    result = fs.mount(&sd);
    if (result != 0) {
        error("SD file system mount failed.\r\n");
    }
    FILE *File = fopen("/sd/card-present.txt", "w+");
    if (File == NULL) {
        error("SD Card is not present. Please insert an SD Card.\r\n");
    }
    result = fs.unmount();
    if (result != 0) {
        error("SD file system unmount failed.\r\n");
    }
    sd.deinit();
}
I used the latest mbed-os master. I do not see how the sleep manager addition can affect this test (no sleep is involved, as you noted above, so it does not affect the code).
Fortunately NRF51 works well. You need an NRF52 device.
Don't these two devices share a common HAL port? https://github.com/ARMmbed/mbed-os/tree/master/targets/TARGET_NORDIC/TARGET_NRF5 - there are only a few differences. I fail to see how the commit referenced above can affect the SD card test.
I don't have an NRF52 currently. @pan- could you help?
cc @scartmell-arm - I recall a bug report related to this. Could you rerun the example above or the test and find the root cause?
Bump @scartmell-arm
Bump @0xc0170 @scartmell-arm
CC @MarceloSalazar @dlfryar @maclobdell
I've tested this with 2 different NRF52 boards so far but I get the same TIMEOUT failure that's unrelated to this issue, both before and after the changeset that was mentioned.
I finally got the board a few days ago, and I am also getting a TIMEOUT for a simple lp ticker test on the latest mbed-os. I am using the firmware published on the mbed OS target page (version 219). Is anything else missing?
Could you try running the test by hand? You can use RealTerm to send two consecutive strings that start the on-board test:
1st: mbedmbedmbedmbedmbedmbedmbedmbedmbedmbed
2nd: {{__sync;1a2cdb58-88d8-4c5f-8db4-0d4609576611}}
This will help to check whether the timeout is caused by the interface firmware (IF) or by mbed-os.
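For anyone scripting this manual check instead of typing into a terminal, the two strings quoted above could be generated as follows. This is only a host-side sketch: the helper names make_preamble and make_sync_message are hypothetical, and any line terminator the serial link may need is not covered here.

```cpp
#include <cassert>
#include <string>

// Hypothetical helper: the first string from the comment above,
// i.e. the word "mbed" repeated `repeats` times (10 in the quoted string).
std::string make_preamble(int repeats = 10) {
    assert(repeats >= 0);
    std::string out;
    for (int i = 0; i < repeats; ++i) {
        out += "mbed";
    }
    return out;
}

// Hypothetical helper: the second string, in the {{__sync;<uuid>}} form
// quoted above. The UUID is supplied by the caller.
std::string make_sync_message(const std::string &uuid) {
    return "{{__sync;" + uuid + "}}";
}
```

Sending make_preamble() followed by make_sync_message("1a2cdb58-88d8-4c5f-8db4-0d4609576611") over the board's serial port should reproduce the manual sequence described above.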
I am having problems with the interface on my board. I flashed it a few times just in case, with the 219 version downloaded from the mbed target page. It reconnects often, which looks like a power problem. What other options do I have to debug this problem? I've been using the NRF51 DK without any problems. I am currently running the J-Link interface, and the disconnections seem to be gone.
@nvlsianpu I have verified that both targets use the same SPI implementation, so what is different between these two boards? Looking at the commit you pointed to above, the only change for RTX was the idle loop - https://github.com/ARMmbed/mbed-os/pull/4912/files#diff-a984e292501fce618b84c54859ad621d. I would also revert this change locally; then almost every target-affecting change in that commit should be gone. That leaves two additions to question: the use of the sleep manager and the critical section. However, see my local testing below. Now that I have resolved the debugger interface problem, I am able to run and debug programs! Here are my latest results:
My program above from the CI test shield runs fine on the latest master:
* 41eb565 - (HEAD -> master, upstream/master) Merge pull request #5342 from ARMmbed/feature_cortex_a (22 hours ago)
* a601d85 - (HEAD -> master, origin/master, origin/HEAD) Merge pull request #67 from deepikabhavnani/test_checks (35 hours ago)
I added a simple printf("Success \r\n"); at the end of the program. I can see the success message on the terminal, and the debugger shows the same. I ran this via uVision (using the J-Link interface J-Link OB-SAM3U128-V2-NordicSemi 160212, full chip erase), with the blinky example changed to the code from https://github.com/ARMmbed/mbed-os/issues/5297#issuecomment-335813055.
I also ran it via mbed compile for ARM, with the same result: success is printed (using the produced hex file, flashed via drag-n-drop).
It was reported to us that another team is also having SPI issues with the NRF52840 (I do not currently have that one to test). As I indicated, SPI itself does not seem to be affected, since NRF51 works and uses the same SPI implementation. Could it be something else?
I can change the test program. I'll try to make more modifications and add more SD card tests to be certain it works. @nvlsianpu Does the test program I shared above fail for you if you use the same revisions as me? Have you analyzed what is going on on the SPI lines? Is this specific to the test, i.e. the SPI test fails but an app similar to the test succeeds?
Updated the test code to read/write 10 characters.
#include "mbed.h"
#include "FATFileSystem.h"
#include "SDBlockDevice.h"
#include "mbed_error.h"

#define SD_TEST_STRING_MAX 10

char SD_TEST_STRING[SD_TEST_STRING_MAX] = {0};

// Fill SD_TEST_STRING with random uppercase letters, NUL-terminated.
void init_string()
{
    for (int x = 0; x < SD_TEST_STRING_MAX - 1; x++) {
        SD_TEST_STRING[x] = 'A' + (rand() % 26);
    }
    SD_TEST_STRING[SD_TEST_STRING_MAX - 1] = 0;
    printf("\r\n****\r\nSD Test String = %s\r\n****\r\n", SD_TEST_STRING);
}

int main() {
    int result = 0;
    init_string();
    SDBlockDevice sd(D11, D12, D13, D10);
    FATFileSystem fs("sd");
    sd.init();
    result = fs.mount(&sd);
    if (result != 0) {
        error("SD file system mount failed.\r\n");
    }

    // Write the test string to a file on the card.
    FILE *File = fopen("/sd/test_file.txt", "w+");
    if (File == NULL) {
        error("SD Card is not present. Please insert an SD Card.\r\n");
    }
    // Pass the string as an argument, not as the format string.
    if (fprintf(File, "%s", SD_TEST_STRING) < 0) {
        error("Writing file to sd card failed");
    }
    fclose(File);

    // Read the string back and compare it with what was written.
    File = fopen("/sd/test_file.txt", "r");
    if (File == NULL) {
        error("Reopening the file for reading failed");
    }
    char read_string[SD_TEST_STRING_MAX] = {0};
    fgets(read_string, SD_TEST_STRING_MAX, File);
    if (strcmp(read_string, SD_TEST_STRING) != 0) {
        printf("Written: %s \r\n", SD_TEST_STRING);
        printf("Read: %s \r\n", read_string);
        error("String written and read are not the same");
    }
    fclose(File);

    result = fs.unmount();
    if (result != 0) {
        error("SD file system unmount failed.\r\n");
    }
    sd.deinit();
    printf("SD card test done. \r\n");
}
I ran it multiple times; see the output below from my terminal:
****
SD Test String = EJDOPHABN
****
SD card test done.
****
SD Test String = EJDOPHABN
****
SD card test done.
****
SD Test String = EJDOPHABN
****
SD card test done.
****
SD Test String = EJDOPHABN
****
SD card test done.
Tested with GCC ARM and ARM.
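The identical test string in every run above is consistent with rand() being called without a prior srand(): the C standard specifies that rand() then behaves as if srand(1) had been called, so each boot produces the same sequence. A minimal host-side sketch of that behavior follows; make_test_string and its seed parameter are hypothetical and not part of the test program, and the exact letters produced depend on the C library in use.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <string>

// Hypothetical helper mirroring init_string() above: seeds rand() and
// returns `length` pseudo-random uppercase letters.
std::string make_test_string(unsigned seed, std::size_t length) {
    std::srand(seed);
    std::string out;
    for (std::size_t i = 0; i < length; ++i) {
        out += static_cast<char>('A' + (std::rand() % 26));
    }
    return out;
}
```

Calling make_test_string twice with the same seed returns the same string, which is why the repeated runs do not exercise different data. If varied data were wanted, the seed would have to come from something that changes between boots.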
I found out that DEVICE_SPI_ASYNCH is enabled on NRF52_DK, but it shouldn't be, for the same reason it was disabled for nRF52840_xxAA (https://github.com/ARMmbed/mbed-os/pull/4088).
[edit]
This is OK.
I recall that HW bug. Does this fix the failures you were experiencing?
Strange - now it works even without this (on my desk).
[edit]
One difference is that I have disabled flow control for the UART.
I updated my J-Link IF to the newest version from http://www.nordicsemi.com/eng/nordic/Products/nRF52-DK/nRF5x-OB-JLink-IF/52275 (170724: COM port stability improvement). Now it works well with hardware flow control too.
I think this whole issue was caused by problems with our DAP/J-Link firmware, so I think it can be closed as a false-negative.
I am reopening the issue; it is still visible on NRF52840_DK.
The issue is solved if the NRF52840 target defines MBED_TICKLESS; it might be a good thing to add that. Otherwise I've observed the following behavior:
If the content of default_idle_hook is commented out, the test example works, but if default_idle_hook is reduced to:
static void default_idle_hook(void)
{
    core_util_critical_section_enter();
    core_util_critical_section_exit();
}
then it doesn't, strange as that is.
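The observation above is surprising because a balanced enter/exit pair is normally expected to leave the interrupt state exactly as it found it. A minimal host-side model of that expectation is sketched below. It is not the mbed-os or Nordic implementation: FakeCpu and NestedCriticalSection are hypothetical names, and a plain bool stands in for the real interrupt mask.

```cpp
#include <cassert>

// Hypothetical stand-in for the CPU interrupt mask.
struct FakeCpu {
    bool irq_enabled = true;
};

// Model of a nestable critical section: interrupts are masked on entry and
// the state seen by the outermost enter() is restored by the outermost exit().
class NestedCriticalSection {
public:
    explicit NestedCriticalSection(FakeCpu &cpu) : cpu_(cpu) {}

    void enter() {
        const bool was_enabled = cpu_.irq_enabled;
        cpu_.irq_enabled = false;          // mask interrupts first
        if (depth_ == 0) {
            saved_enabled_ = was_enabled;  // remember state on outermost entry only
        }
        ++depth_;
    }

    void exit() {
        assert(depth_ > 0);                // exit() must match an earlier enter()
        --depth_;
        if (depth_ == 0) {
            cpu_.irq_enabled = saved_enabled_;  // restore on outermost exit only
        }
    }

    unsigned depth() const { return depth_; }

private:
    FakeCpu &cpu_;
    unsigned depth_ = 0;
    bool saved_enabled_ = true;
};
```

In this model an idle hook that only calls enter() and exit() is a no-op with respect to the interrupt state, so a behavior change caused by such a hook points at the critical section implementation rather than at the test.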
@nvlsianpu Can you validate the addition of the MBED_TICKLESS macro to the NRF52840 target? I'd suggest you also look at the critical section issue.
@pan- I will check this next week.
@pan- Regarding the default_idle_hook(void) problem mentioned above, which looks like a critical section bug, I have provided a fix in PR https://github.com/ARMmbed/mbed-os/pull/5595. This fixes tests-api-spi.
Anyway, MBED_TICKLESS is required for NRF52840 too, as it behaves similarly to NRF52832 (systick).
Could you make a PR to enable MBED_TICKLESS if Nordic confirms that it is safe to use on NRF52840?
@pan- #5606 is merged, and tests-concurrent-comms passes on nRF52840 as well. This issue is therefore fixed.
+---------------------+---------------+------------------------+--------+--------------------+-------------+
| target | platform_name | test suite | result | elapsed_time (sec) | copy_method |
+---------------------+---------------+------------------------+--------+--------------------+-------------+
| NRF52840_DK-GCC_ARM | NRF52840_DK | tests-concurrent-comms | OK | 26.72 | shell |
+---------------------+---------------+------------------------+--------+--------------------+-------------+
Fixed via the PR referenced above; closing.