Describe the bug
nxp_fmuk66 reboot command hangs if SD card popped in and out
To Reproduce
Steps to reproduce the behavior:
Expected behavior
shutdown line 201 never calls callback. May be related to card insert or logger. I am assuming the Work queue is blocked.
fyi @igalloway
I'm going through some old NXP / FMUK66 related issues. I cannot reproduce this one, but I do get a mag timeout when an SD card is inserted while the FMU is powered.
Mag timeout. GPS timeout as well? Sometimes I also get a baro timeout.
Reboot from console
nsh> ERROR [sensors] Mag #0 fail: TIMEOUT!
ERROR [sensors] Sensor Mag #0 failed. Reconfiguring sensor priorities.
WARN [sensors] Remaining sensors after failover event 0: Mag #0 priority: 1
WARN [sensors] Remaining sensors after failover event 0: Mag #1 priority: 255
INFO [ekf2] Mag sensor ID changed to 4395025
WARN [ecl/EKF] EKF gps hgt timeout - reset to baro
WARN [ecl/EKF] EKF GPS data stopped
WARN [ecl/EKF] EKF stopping navigation
INFO [ecl/EKF] EKF commencing GPS fusion
@jarivanewijk I will have to determine the root cause. But I would suggest not making removal/insertion while powered a common practice.
@davids5 Of course we should not make this a common practice. However, I think this might cause in flight issues when there is bad contact between SD card and the reader (vibrations?). You would (temporarily?) lose a sensor.
We see quite a lot of logging drop outs with FMUK66, so this might not be too far fetched.
@jarivanewijk yes I was thinking along the same lines, hence the comment on determining the root cause. I will not be able to get to this right away, but have it on my list. Have you tested on current master with the workqueue changes?
@davids5 This issue has been open since January. If it was a big problem we would have come back to this earlier. There's no rush. ;)
Just tested with latest master (c66fc85). I get the same behavior as yesterday (with 1.9.2 stable). I now always see both the mag and the baro fail.
NuttShell (NSH)
nsh> INFO [ekf2] Mag sensor ID changed to 396809
INFO [ecl/EKF] EKF aligned, (pressure height, IMU buf: 22, OBS buf: 14)
INFO [ecl/EKF] EKF GPS checks passed (WGS-84 origin set, using GPS height)
INFO [ecl/EKF] EKF commencing GPS fusion
ERROR [sensors] Mag #0 fail: TIMEOUT!
ERROR [sensors] Sensor Mag #0 failed. Reconfiguring sensor priorities.
WARN [sensors] Remaining sensors after failover event 0: Mag #0 priority: 1
WARN [sensors] Remaining sensors after failover event 0: Mag #1 priority: 255
ERROR [sensors] Baro #0 fail: TIMEOUT!
ERROR [sensors] Sensor Baro #0 failed. Reconfiguring sensor priorities.
WARN [sensors] Remaining sensors after failover event 0: Baro #0 priority: 1
WARN [sensors] Remaining sensors after failover event 0: Baro #1 priority: 75
INFO [ekf2] Mag sensor ID changed to 4395025
WARN [ecl/EKF] EKF gps hgt timeout - reset to baro
WARN [ecl/EKF] EKF GPS data stopped
WARN [ecl/EKF] EKF stopping navigation
We are getting some reports of flight logs being cut off mid flight. I think we need to have another look at this issue, because it is probably all related. Can we help in any way to find the root cause of these SD card issues? @davids5
I had the flight cut of twice now. Running 1.9.2 stable.
Wondering if related to issue https://github.com/PX4/Firmware/issues/11815
Here are my cut-off flight logs:
@redxeth, @jarivanewijk Please test https://github.com/PX4/Firmware/pull/13440
Thanks @davids5 ! Will try it.
Thank you @davids5 - But I now get your original issue: the FMU hangs immediately when it is (re)booted without SD card inserted. Reinserting the SD card while PX4 is already running no longer gives the results I described in August (which was still the case for current master version).
@davids5 - I just noticed that with PX4 1.10 the FMUK66 does not boot when no SD card is inserted. The message "Successfully bound SDHC to the MMC/SD driver" is printed in the debug console and then it hangs. I assume this is related to this issue and PR #13440. It was not the case for PX4 1.9.2. When the board is powered when an SD card is inserted, it works as it should (also with 1.10).
This issue has been automatically marked as stale because it has not had recent activity. Thank you for your contributions.
Closing, please re-open if this is still an issue.
I tested this with a resent master. With no issue. @jarivanewijk does the unit in question have the latest bootloader?
Sorry that it took me so long to get back to this, @davids5 - No, it was an old bootloader. We were still using the bootloader from May 2019, because of https://github.com/PX4/Bootloader/issues/149 - It was fixed in November, but I forgot to test it and update the bootloader binary that we were distributing.
With the latest bootloader and firmware I have no issues. Sorry for the inconvenience!
I was too quick. It works fine with latest master firmware, but with latest stable (1.10.1) it still hangs after the bootloader (RGB LED remains white). @davids5
Thank you @jarivanewijk for retesting. Given the 1.11 is to be released soon, I would say it is good this is documented and will be resolved in the new release.