Px4-autopilot: Unexpected mode switch from OFFBOARD mode

Created on 19 Nov 2020  路  7Comments  路  Source: PX4/PX4-Autopilot

Describe the bug
It has been reported by several users that while flying in offboard the Pixhawk randomly changes to a different mode for no apparent reason, i.e. not due to offboard control failure, or sensor failure etc. Usually this change is extremely fast and the reason for is not registered in the flight logs making this issue quite hard to debug. In our case this mode change coincided with the mode that was set in the Pilot RC at the time, in this case ACRO, though @bperseghetti reported haaving the same issue in the past and it even happened without an RC on which seems to indicate it is not an RC issue. He also mentioned that this mode switch seemed random but it's still to be verified. @ThomasRigi also reported facing the same issue more than once and one time also in the kill switch.

This issue was discussed in PX4 Dev call on November 18th 2020: https://discuss.px4.io/t/px4-dev-call-november-18-2020/19554

It was first raised in slack here: https://px4.slack.com/archives/C0W2KUFFT/p1605613677266600

So far it has been reported on V1.09 and V1.10. No instances on V1.11 but most likely it's just from lack of usage.

All the logs will be provided further below.

To Reproduce
Unfortunately this issue seems super hard to reproduce as it does happen quite rarely (@bperseghetti put it in the range of happening once every 150/200 flights). But the steps are to simply fly offboard often and for large periods of time.

Expected behavior
The UAV should NOT exit offboard mode for no apparent reason, and while there are many reasons that could explain this mode change, for our current architecture the UAV would have changed into HOLD mode given our parameters, or in more extreme situations it could switch into either POSITION, ALTITUDE, or as a last resort into STABILIZED.

A secondary issue is that even if the UAV is indeed changing modes, that should be explicitly recorded in the logs which is currently not happening as it seems that since it's such a quick change it goes in between the two log points.

Log Files and Screenshots

Log 1
UAV changed from OFFBOARD mode to ACRO. No change in the RC Input:

RC_Input

https://logs.px4.io/plot_app?log=e83a1c3d-99a5-484a-b908-6eff075edc56

User: @RicardoM17

Log 2

Kill switch was engaged for an extremely short time but no action is recorded from the Pilot RC

image

image

https://logs.px4.io/plot_app?log=7ba7eefa-ed3e-4247-96c8-1a0294aea8d6

User: @ThomasRigi

Log 3

Momentary offboard loss with no Pilot RC input.

image

image

User: @ThomasRigi

@bperseghetti also mentioned being affected by this issue several times. I'm not sure if you have any logs that you could share?

Possible Solutions/Changes
These Solutions/Changes were discussed on the Dev Call when this was first raised.

  1. Add some hysteresis or more simply a timeout for the mode change. It really shouldn't matter for most of the user base if the mode change takes 50-100ms extra vs being basically instant. For the small use cases where this can be a concern perhaps this could be disabled via a parameter or even the precise timout period could be defined there.

  2. Throw an INFO message on mode change. If possible with the reason for mode change. It doesn't really solve this issue but it would help gather data to solve it in a different way in the future. IMO it also makes sense that this is enabled anyway.

  3. Change the message logging structure to ensure that no relevant data escapes the log in the presence of such a glitch.

  4. ???

FYI @ThomasRigi @bperseghetti @dagar @MaEtUgR

Many thansk in advance.

P.s. @dagar We would especially like to ask if a fix regarding 1. could be fast tracked as there is a good chance that we are blocked from flying atleast we implement a solution, even if temporary. So this is quite high priority/critical from our side. Of course whatever help that you need just let me know and we'll help out in any way I can but I would assume it would just be changed in commander. If you do not have the bandwidth let me know and I can try to get a PR for you to review but in that case any pointers that you can provide would be helpful.

bug

Most helpful comment

I've found the root cause here. https://github.com/PX4/PX4-Autopilot/pull/16264

All 7 comments

Small comment on Log 3: I think the offboard loss was due to a poorly set COM_OF_LOSS_T, but the resulting action was certainly not as it should have been. It should have switched to Position mode, but instead it switched to... offboard mode :/

About the possible solutions, I think that 1) makes sense for mode changes that are triggered over RC switches. But you need to be careful to not interfere with mode changes over a GCS or some other mavlink channel that get sent only once, let alone failsafe actions. (I guess I'm just stating the obvious here.)

2) Agree, but only with adding the reason for the mode change.

3) I don't see how you'd implement this, but if possible why not

We've also seen this, perhaps once or twice across hundreds of offboard flight hours. I'll try to locate a log, but IIRC there wasn't any useful information in them which could track this down. I think an explicit print in the log with the reason for the flight mode switch would help catch it.

@ThomasRigi 3. was a suggestion by @dagar himself. I'll admit that I'm a bit fuzzy on the details now but it did indeed make sense.

@mhkabir If you could locate a log that would be great, just to have more data. Do you happen to use V1.11 already?

Also what you mention is what we're suggesting in 2. and 3. Like I said it doesn't really prevent the issue, just makes it easier to spot and debug afterwards.

@julianoes since this seems to involve commander perhaps you can also contribute with some feedback here? Specifically:

1: Add some hysteresis or more simply a timeout for the mode change. It really shouldn't matter for most of the user base if the mode change takes 50-100ms extra vs being basically instant. For the small use cases where this can be a concern perhaps this could be disabled via a parameter or even the precise timout period could be defined there.

2: Throw an INFO message on mode change. If possible with the reason for mode change. It doesn't really solve this issue but it would help gather data to solve it in a different way in the future. IMO it also makes sense that this is enabled anyway.

P.s. @dagar We would especially like to ask if a fix regarding 1. could be fast tracked as there is a good chance that we are blocked from flying atleast we implement a solution, even if temporary. So this is quite high priority/critical from our side. Of course whatever help that you need just let me know and we'll help out in any way I can but I would assume it would just be changed in commander. If you do not have the bandwidth let me know and I can try to get a PR for you to review but in that case any pointers that you can provide would be helpful.

@ThomasRigi 3. was a suggestion by @dagar himself. I'll admit that I'm a bit fuzzy on the details now but it did indeed make sense.

I'm going to split the switches out from manual_control_setpoint into a new message manual_control_switches that only publishes at a low rate (or immediately on change). Then we can log 100% of the messages instead of sampling manual_control_setpoint at 5 Hz.

I've found the root cause here. https://github.com/PX4/PX4-Autopilot/pull/16264

Just wanted to leave here in this thread that while the main fix is being fixed in #16264 , subsequent improvements are being worked on in #16270 . These correspond mostly to possible solutions that were discussed here when it was believed to be an RC issue.

Was this page helpful?
0 / 5 - 0 ratings

Related issues

julianoes picture julianoes  路  3Comments

mwiatt picture mwiatt  路  5Comments

bthnekn picture bthnekn  路  4Comments

wuxianshen picture wuxianshen  路  3Comments

alexcherpi picture alexcherpi  路  4Comments