Mbed-os: NUCLEO_F429ZI occasionally fails to set up ethernet

Created on 23 Jan 2019 · 9 Comments · Source: ARMmbed/mbed-os

Description

Target: NUCLEO_F429ZI
Toolchain: any
Tools:
mbed-cli 1.8.3
mbed-os @ af52c30234d61ff136a98728e7119d277b421b32 <- this is today's latest, but I also tried releases 5.11.1, 5.10.3 and 5.9.4

Steps to reproduce:

The issue is rather hard to reproduce - which is probably why it went unnoticed for so long.

Basically, any repeated loop of:
1) setting the ethernet up on NUCLEO_F429ZI
2) resetting the board without setting ethernet down
will eventually lead to step (1) failing, because the PHY_LINKED_STATUS flag does not get set in PHY_BSR (the transceiver's Basic Status Register).

We sometimes (roughly once in 100 runs) see this happen in the features-netsocket-* greentea tests of mbed-os. The most recent example is NUCLEO_F429ZI-IAR.mbed-os-tests-netsocket-dns.mbed-os-tests-netsocket-dns failing with:

[1547720172.68][CONN][RXD] :156::FAIL: Expected 0 Was -3004
[1547720172.75][CONN][RXD] >>> failure with reason 'Assertion Failed' during 'Test Setup Handler'

(-3004 stands for NSAPI_ERROR_NO_CONNECTION)

We see this much more often with our icetea tests, as they reset the board before every test case and the inability to connect is much more visible.

Finally, I wrote my own simple icetea test for cliapp, which basically calls the following repeatedly:

Bench.reset_dut(self, 1)
interfaceUp(self, ["dut1"])

With NUCLEO_F429ZI this test fails after a random number of iterations, claiming that no connection can be established. On average it takes about 100 resets (one run failed after 17 resets, another after 130). Note - we do not explicitly deinitialize Ethernet before resetting.
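For illustration only, the shape of that reproduction loop can be sketched without icetea. This is not the actual test: `count_resets_until_failure`, `reset`, `bring_up` and `make_flaky_bring_up` are hypothetical names, and the flaky stand-in merely simulates a board that fails to connect with some fixed probability. In a real test the two callables would wrap the `Bench.reset_dut` and `interfaceUp` calls shown above.

```python
import random


def count_resets_until_failure(reset, bring_up, max_iterations=1000):
    """Repeat reset + bring-up and report when bring-up first fails.

    reset    -- hypothetical callable that resets the board
    bring_up -- hypothetical callable returning True if the interface came up
    Returns the 1-based iteration of the first failure, or None if every
    iteration up to max_iterations succeeded.
    """
    for iteration in range(1, max_iterations + 1):
        reset()
        if not bring_up():
            return iteration
    return None


def make_flaky_bring_up(failure_rate, seed):
    """Simulated bring-up that fails with the given probability (assumption,
    used only so the sketch can run without hardware)."""
    rng = random.Random(seed)
    return lambda: rng.random() >= failure_rate


if __name__ == "__main__":
    first_failure = count_resets_until_failure(
        reset=lambda: None,
        bring_up=make_flaky_bring_up(failure_rate=0.01, seed=1),
    )
    print("first failure at iteration:", first_failure)
```

Recording the iteration number of the first failure, rather than just pass/fail, is what makes it possible to compare boards (for example "17 resets" versus "never in 1000 runs").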

I ran this test on K64F, which has a different EMAC driver, and it did not fail a single time in 1000 runs.

Most importantly, however, I ran this test on UBLOX_EVK_ODIN_W2, which has exactly the same STM32_EMAC driver as NUCLEO_F429ZI, and it also never failed in 1000 runs. I checked that the two boards (UBLOX and NUCLEO) have identical Ethernet PHY configuration and should execute exactly the same source code.

I took the effort to minimize the possibility that this is a network configuration issue by running the tests on multiple platforms (K64F, UBLOX_EVK_ODIN_W2 and NUCLEO_F429ZI) on two different networks (ARM's internal testing network and my local office network). Only F429ZI had connectivity issues, and it happened on both networks. I also configured a static IP address locally, but even then the issue was reproducible.

Digging into the code, I found that the STM32 EMAC driver calls back to the higher layer depending on the PHY_LINKED_STATUS flag. I therefore suppose that the root cause is this flag not always being set.
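To make the flag check concrete, here is a minimal Python model of polling the link bit with a timeout. It is a sketch under stated assumptions, not the STM32 EMAC driver's code: the register address 0x01 and the mask 0x0004 follow the standard MII Basic Status Register layout, while `read_phy_register`, `link_is_up` and `wait_for_link` are hypothetical names introduced for this example.

```python
import time

PHY_BSR = 0x01               # Basic Status Register address (standard MII layout)
PHY_LINKED_STATUS = 0x0004   # link status bit within the BSR


def link_is_up(read_phy_register):
    """Return True if the link bit is set in the Basic Status Register.

    read_phy_register -- hypothetical callable: register address -> 16-bit value
    """
    return bool(read_phy_register(PHY_BSR) & PHY_LINKED_STATUS)


def wait_for_link(read_phy_register, timeout_s=5.0, poll_interval_s=0.1,
                  sleep=time.sleep, clock=time.monotonic):
    """Poll the link bit until it is set or the timeout expires.

    sleep and clock are injectable so the logic can be exercised without
    real delays. Returns True if the link came up in time, False otherwise.
    """
    deadline = clock() + timeout_s
    while True:
        if link_is_up(read_phy_register):
            return True
        if clock() >= deadline:
            return False
        sleep(poll_interval_s)
```

If the flag never gets set, `wait_for_link` returns False, which corresponds to the higher layer never being told that the link is up and the connect attempt ending in NSAPI_ERROR_NO_CONNECTION.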

Issue request type

[ ] Question
[ ] Enhancement
[x] Bug
IOTOSM-2257 DONE st mirrored bug

All 9 comments

@ARMmbed/team-st-mcd Please review

@jeromecoutant can you look into this issue? It's impacting the results in our CI.

@ARMmbed/team-st-mcd - any runtime to look into this?

Hi Janne
This is in our "long" to-do list.... :-(

Hi all
Do you think it's possible to check #12464 and #12457 ?
Thx

@MarceloSalazar can u close if resolved

Many changes have been introduced and we haven't seen issues reported since then.
Closing for now and may reopen in the future if required.

Thank you for raising this detailed GitHub issue. I am now notifying our internal issue triagers.
Internal Jira reference: https://jira.arm.com/browse/IOTOSM-2257
