Ardupilot: Copter: disarmed mid-air when radio failsafe recovered (FrSky receiver)

Created on 30 Dec 2018 · 39 Comments · Source: ArduPilot/ardupilot

Bug report

Issue details

ArduCopter 3.6.3 disarmed while flying an AUTO mission when radio reception came back. Please see the details in the log at second 746.39.

I had seen weird behaviour earlier (the quad yawing 360 degrees when the radio comes back).

I am flying with the failsafe option "continue with mission when RX lost".

Version

ARDUCOPTER 3.6.3

Platform
[ ] All
[ ] AntennaTracker
[X] Copter
[ ] Plane
[ ] Rover
[ ] Submarine

Airframe type

quad, X

Hardware type

cube

Logs
_Please provide a link to any relevant logs that show the issue_

Copter Safety

All 39 comments

Here is my theory. I recently added ArmDisarm on a switch.

With a faint radio signal, is it possible that channel values are read at their low (1000) positions?

If so, this would explain my 360 left yaw during a prior mission and this mid-air disarm during the last mission.

Expected behaviour is uninterrupted AUTO mission behaviour during fluctuating radio reception.

This should probably be in the discuss forum, as the issue list is for bugs and enhancements. But nonetheless, you are probably right. If you set the ARM/DISARM switch low, it will disarm, whether it is in flight or on the ground. So if you use an arm/disarm or a motor interlock switch, you have to be VERY VERY careful with your receiver failsafe configurations.

09:20:20 Radio failsafe begins
09:28:57 Radio failsafe clears
09:29:04 Copter disarms

The log then terminates because LOG_DISARMED is disabled.

Understood. I never changed the arm switch position throughout the mission. This is why I consider this a bug. The radio had the switch in the armed position.

Furthermore, this unpredictability means that the copter might change the mode from AUTO to whatever is set for the lowest PWM range on the mode switch. It is quite dangerous.

It's not a bug because ArduPilot only does what your radio receiver tells it to do. So if your radio receiver is configured improperly, allowing that switch to go low when entering or exiting failsafe, it will do exactly as it's told and disarm. The problem is in your receiver configuration, not a bug in ArduPilot.

I can accept that. What is the guideline for setting up failsafe behavior on an FrSky RX (X8R)?

I had my RX set up per the ArduPilot recommended method, no pulses:

http://ardupilot.org/copter/docs/radio-failsafe.html

Either the recommendation should change or the code should be altered. The recommended method does not work with AUTO mode and leads to unpredictable behavior.

Yes it does work fine, and it is used by thousands of vehicles without disarming for no reason. The problem again is your receiver configuration or receiver itself sending undesired input. No further diagnosis of your configuration issue is even possible because the log ends when disarmed due to having LOG_DISARMED disabled. If you had logging while disarmed enabled, it may have led to something useful about your configuration issue.

This should be posted in the discuss forum, the issue list is not for troubleshooting vehicle issues.

@Pedals2Paddles Matt: don't be rude to users. Courtesy doesn't cost you anything. I know you don't usually mean it, but you're coming across very poorly, and it reflects on all of us. It's not on. Period.

@maciek01 This should be in the forum at this stage, but I'll have a quick look. At first glance this does appear to be caused by the Rx failsafe behavior as Matt indicated. Have you tested failsafe behaviour of that particular radio setup on the ground? If you have a range test mode, I'd give that a go rather than just on/off, so the degradation is observable.

Yes, I am trying to validate the failsafe behaviour by turning off the TX while flying. At first glance it looks like HOLD mode is more appropriate here with AUTO missions, but I haven't concluded yet.

With Taranis, try putting the transmitter in range check mode, and walk away until it failsafes. The signal degradation, rather than just a binary on/off test, will give better insight.
As an SBus rx, it should be setting a flag in the signal when the quality drops below a given level (this is receiver firmware, not ArduPilot), so that ArduPilot knows when to stop trusting the signal. It may be that your rx doesn't set that flag soon enough for some reason.

@auturgy Not sure what you think was rude. None of it was, and you've told him the same thing I did a few times now. I see you've reopened the issue (but also directed him to discuss again), so I will unsubscribe from this one and let you handle it.

I have heard something similar to this from @R-Lefebvre before to do with an FrSky receiver (I think). The issue Rob reported was that the receiver outputs were not recovering from the failsafe all at the same time. The lower channels were recovering before the upper channels. If this is happening then ArduPilot will think the radio failsafe is over (because the lower channels are coming in) and start processing all channels including the switches.

To be clear though, from ArduPilot's point of view, it just receives a block of data from the receiver with the values for all channels. AP does not have a way to recognise the issue on its own.
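The failure mode described above can be illustrated with a minimal sketch. It is not ArduPilot code: every name is hypothetical, and the thresholds (975 for the throttle failsafe, 1200/1800 for the switch) are assumptions chosen only for illustration. It shows a frame in which the throttle channel has recovered while a switch channel still reports 874, so a throttle-only failsafe test passes and a switch decoder with no lower bound reads "low".

```cpp
#include <array>
#include <cstdint>

// All names and thresholds here are illustrative assumptions, not ArduPilot code.
constexpr uint16_t kThrottleFailsafePwm = 975;  // assumed throttle failsafe threshold
constexpr uint16_t kSwitchLowPwm = 1200;        // assumed "low" trigger for a switch
constexpr uint16_t kSwitchHighPwm = 1800;       // assumed "high" trigger for a switch

enum class SwitchPos { Low, Middle, High };

// Throttle-based failsafe test: only channel 3 (index 2) is inspected.
bool throttle_failsafe_active(const std::array<uint16_t, 8>& channels) {
    return channels[2] < kThrottleFailsafePwm;
}

// Naive switch decoding with no lower validity bound: 874 decodes as Low.
SwitchPos naive_switch_read(uint16_t pwm) {
    if (pwm < kSwitchLowPwm) return SwitchPos::Low;
    if (pwm > kSwitchHighPwm) return SwitchPos::High;
    return SwitchPos::Middle;
}

// A frame during recovery: throttle is back, channel 7 (index 6) still reports 874.
inline std::array<uint16_t, 8> partially_recovered_frame() {
    return {1500, 1500, 1500, 1500, 1500, 1500, 874, 1500};
}
```

With this frame the failsafe test reports "not active" while the switch on channel 7 decodes as Low, which is the combination that would trigger a disarm on an arm/disarm switch.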

@maciek01, can you tell us the frsky receiver and transmitter model being used?

I have no problem moving this to the forum. The challenge is that, so far, my experience is that no one responds to forum posts, specifically the non-trivial issues that I experience, like this one here. I'm out $200 after this crash and very skeptical about continuing with autonomous long-range missions if stuff like this continues to happen.

@rmackay9 thanks for getting back:

frsky Taranis x9d plus
frsky X8R - initially with no pulses failsafe

Ah, very nice find. That's the one. So we could certainly add a warning to the wiki somewhere.. not exactly sure where about this particular combination.

OK, reading that report from Rob and looking at his fix, I think we could add an out-of-bounds check on the auxiliary switches. Basically just throw away the change if it's at or below the magical 874... perhaps round up a bit so if it's below 900 we throw it away.

if i may add - i experienced left yaw 360 turn upon failsafe recovery another time. this is on one of the 4 basic channels. video link: https://youtu.be/6lV0UsyuPro?list=PLvrRKTth9HIwtH_eycT6PRivtJK_RC3tc&t=103

the mission leg then completed on non waypoint facing yaw angle. i can attach logs from this event in a few days if helpful.

Also related to Taranis SBUS failsafe problems: https://github.com/ArduPilot/ardupilot/issues/9389

video recording at the moment of disarm

https://youtu.be/5xjCsWYWbSU?t=614

Also, in https://github.com/ArduPilot/ardupilot/issues/7516 there is a magical 874 value reported by the FrSky receiver.
This would be an ugly FrSky-specific fix, but I believe ArduPilot should do a post-validation and, if any channel is <= 874, interpret it as a radio FS.

Well, it's pretty clear we need a fix.

It looks like RC_Channel.cpp's read_mode_switch() has a fix for the flight mode channel. This part of the code has changed in master since Copter-3.6 was forked though so we probably need a similar check in the Copter-3.6 branch.

We also need a check in the aux switch code (in Copter-3.6 and master). It looks like in master it's RC_Channel.cpp::read_3pos_switch() that needs changing. My guess is we probably need to change the function to return a bool (i.e. true on success) so it can indicate it was unable to read the switch position.
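A rough sketch of the "return a bool" idea follows. It is not the actual ArduPilot change: the function name read_3pos_switch_checked, the ArmState type, and the 1200/1800 trigger values are assumptions made for illustration, and the 900 lower bound is the value floated earlier in this thread.

```cpp
#include <cstdint>

// Hypothetical sketch; names and thresholds are assumptions, not the ArduPilot API.
enum class SwitchPos : uint8_t { Low, Middle, High };

constexpr uint16_t kMinValidPwm = 900;      // lower bound suggested in this thread
constexpr uint16_t kLowTriggerPwm = 1200;   // assumed "low" trigger
constexpr uint16_t kHighTriggerPwm = 1800;  // assumed "high" trigger

// Returns true and writes `out` on success. Returns false and leaves `out`
// untouched when the pulse width is too low to be trusted (e.g. 874).
bool read_3pos_switch_checked(uint16_t pwm, SwitchPos& out) {
    if (pwm < kMinValidPwm) {
        return false;
    }
    if (pwm < kLowTriggerPwm) {
        out = SwitchPos::Low;
    } else if (pwm > kHighTriggerPwm) {
        out = SwitchPos::High;
    } else {
        out = SwitchPos::Middle;
    }
    return true;
}

// Example caller: an arm/disarm switch that ignores unreadable positions.
struct ArmState {
    bool armed;
};

void handle_arm_switch(uint16_t pwm, ArmState& state) {
    SwitchPos pos;
    if (!read_3pos_switch_checked(pwm, pos)) {
        return;  // keep the previous state instead of acting on a bad value
    }
    if (pos == SwitchPos::High) {
        state.armed = true;
    } else if (pos == SwitchPos::Low) {
        state.armed = false;
    }
}
```

The point of the bool return is that the caller can tell "switch is low" apart from "switch could not be read", so an 874 pulse leaves the armed state unchanged.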

Normally we fix these issues in master before backporting to the release branch..

Would it also address the yaw stick problem (360 counter-clockwise rotation, as if the stick was read at its low position)?


Unrelated to this issue: the following code in RC_Channel.cpp's read_mode_switch() causes mode 5 to kick in momentarily any time the Taranis gets a faulty read from the 6-pos pot. It's clearly noticeable with the FlightDeck Lua software running on the Taranis. The firmware reads the intermediate switch transition signals and defaults to mode 5 for a quick moment until the pot settles in its final position. I would suggest checking for precise ranges and not defaulting the mode to 5 if the signal is outside the 6 predefined ranges (>= 1750). It adds just a few CPU cycles but eliminates unpredictability.

if      (pulsewidth < 1231) position = 0;
else if (pulsewidth < 1361) position = 1;
else if (pulsewidth < 1491) position = 2;
else if (pulsewidth < 1621) position = 3;
else if (pulsewidth < 1750) position = 4;
else position = 5;

Alternatively, MODE_SWITCH_DEBOUNCE_TIME_MS could be set a bit higher than 200ms.
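For readers unfamiliar with the debounce idea, here is a minimal sketch of how a mode switch debounce could work. It is not ArduPilot's implementation: the class ModeSwitchDebouncer is hypothetical, and the debounce time is simply a constructor argument so that a longer value can be tried.

```cpp
#include <cstdint>

// Hypothetical debouncer; not ArduPilot's implementation.
// A new position is accepted only after it has been seen continuously
// for at least `debounce_ms` milliseconds.
class ModeSwitchDebouncer {
public:
    ModeSwitchDebouncer(uint32_t debounce_ms, int initial_position)
        : debounce_ms_(debounce_ms),
          accepted_(initial_position),
          candidate_(initial_position) {}

    // Feed the latest decoded position and a millisecond timestamp.
    // Returns the position that is currently accepted.
    int update(int position, uint32_t now_ms) {
        if (position != candidate_) {
            // Position changed: restart the timer for the new candidate.
            candidate_ = position;
            candidate_since_ms_ = now_ms;
        } else if (candidate_ != accepted_ &&
                   now_ms - candidate_since_ms_ >= debounce_ms_) {
            // Candidate has been stable long enough: accept it.
            accepted_ = candidate_;
        }
        return accepted_;
    }

private:
    uint32_t debounce_ms_;
    int accepted_;
    int candidate_;
    uint32_t candidate_since_ms_ = 0;
};
```

With a debouncer like this, a momentary reading of position 5 while the pot travels to another position never becomes the accepted mode, provided the glitch is shorter than the debounce time.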

@maciek01, I'm afraid the fix I'm proposing above won't address the roll, pitch, yaw inputs from the user but it would address the more critical issue of disarms in flight when using an arm/disarm switch with an FrSky receiver which is behaving badly by sending only half the channels (sorry to point the finger at the receiver but objectively speaking I think this behaviour is really bad).

We could/should certainly add a check on read_mode_switch(), but note that the valid range extends above 1750; a valid upper limit might be more like 2100 or 2200. If you have a log of the value that's being momentarily sent, we could use that as a reference to decide what the upper limit should be. We need to be careful not to set it too low because it's very possible for users to set up their transmitters to send values above 2000 for some channels. 2100 is probably the lowest we could make the upper limit.
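As a sketch of what a range-checked mode switch read could look like: the position thresholds below are the ones quoted earlier in this thread, while the function name, the std::optional return type, and the choice of 900 and 2200 as validity limits are assumptions for illustration only.

```cpp
#include <cstdint>
#include <optional>

// Assumed validity window; 2200 is one of the two upper limits floated above.
constexpr uint16_t kMinValidPwm = 900;
constexpr uint16_t kMaxValidPwm = 2200;

// Hypothetical helper: returns the mode switch position (0..5), or
// std::nullopt when the pulse width is outside the validity window,
// so the caller can keep the current mode instead of defaulting to 5.
std::optional<int> mode_switch_position(uint16_t pulsewidth) {
    if (pulsewidth < kMinValidPwm || pulsewidth > kMaxValidPwm) {
        return std::nullopt;
    }
    if (pulsewidth < 1231) return 0;
    if (pulsewidth < 1361) return 1;
    if (pulsewidth < 1491) return 2;
    if (pulsewidth < 1621) return 3;
    if (pulsewidth < 1750) return 4;
    return 5;
}
```

Values between 1750 and the upper limit still map to position 5, so transmitters that send more than 2000 on the mode channel keep working; only readings such as 874 are rejected.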

Thank you. Just wanted to clarify the following part of your statement:

"... address the roll, pitch, yaw inputs from the user ..." - the situation i am referring to was not triggered by the user input. Similar to the disarm switch this occurred during failsafe recovery and the copter decided to yaw left 360 deg as if the yaw stick was at its low (which it wasn't). This looks identical (at first sight at least) to the frsky sending bad value for the high range switches.

Anyway, just stating it. It's not safety related on copters and it's not interfering with the copter flight, just annoying. However, it may be perceived differently on ArduPlane, as yaw triggered during autonomous flight might alter the flight pattern.

(btw - i will record and submit the log for the transitional mode switch values sent by taranis - in a separate issue).

@rmackay9 you're absolutely right this is a bad receiver behavior. But there are a lot of people using FrSky.
I don't understand why there is a pinpoint fix for switches only.
Why is a <900 value considered not good for switches but at the same time good for Pitch/Roll/Yaw etc.?
Why not do a generic fix and treat the case where any channel is <= 900 as a complete radio FS?
You can see in the logs that the receiver provides 874 for some channels for a very short time just after the radio link has been restored. So it would just prolong the failsafe by a small amount.

@SergeyBokhantsev it's not a bad idea to check all channels but I worry that we are simply unaware of some receivers sending values below 900. I can imagine that there are tx/rx systems out there (perhaps FrSky systems) where a receiver capable of receiving say 12 channels is paired with a transmitter that can only send 8 and then it sends 874 for channels 9 to 12. I really don't know, I'm just guessing.. but I'm also looking for a very small change that we can get into Copter-3.6 without a drawn out beta testing period.

Perhaps interpret any channel below 900 as still in failsafe as described above... But only if that channel actually has a function assigned? That should prevent unused channels from "getting in the way".
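A minimal sketch of that suggestion, under stated assumptions: the function name, the fixed 8-channel frame, and the has_function flags are hypothetical and only stand in for however the autopilot tracks which channels are in use.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>

constexpr std::size_t kNumChannels = 8;  // assumed frame size for this sketch
constexpr uint16_t kMinValidPwm = 900;   // threshold discussed in this thread

// Hypothetical check: the frame is treated as "still in failsafe" if any
// channel that has a function assigned reports a pulse width below 900.
// Channels without a function are ignored, so an unused channel stuck
// at 874 does not hold the vehicle in failsafe.
bool frame_still_in_failsafe(const std::array<uint16_t, kNumChannels>& pwm,
                             const std::array<bool, kNumChannels>& has_function) {
    for (std::size_t i = 0; i < kNumChannels; ++i) {
        if (has_function[i] && pwm[i] < kMinValidPwm) {
            return true;
        }
    }
    return false;
}
```

The trade-off is the one raised above: checking every channel is simpler, but masking by assigned function avoids false failsafes on receivers that may send low values on channels the transmitter does not drive.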

btw - betaflight allows user to define valid ranges in failsafe configuration:
their defaults are min 885 and max 2215.
I never experienced unpredictable failsafe behavior with BF/frsky
They do also react to the failsafe bit in the SBUS data in addition to invalid ranges. i thought i'd just mention this.

On Thu, 3 Jan 2019, Maciek Kolesnik wrote:

btw - betaflight allows user to define valid ranges in failsafe configuration:
their defaults are min 885 and max 2215.

We check against 900 and 2200 for the mode switch (constants in the code), but not for the aux switches. I'm not sure why we don't also check against channel min/max.

Does anybody here have a reproducible test case that they can use to test
patches to fix this?

Strange, I have taranis x9d radio and frsky receivers and fly auto like this, and probably hundred plus missions and never ever have had an issue with radio signal coming back within range and the copter disarming or doing 360s

@imrj is your failsafe mode configured to HOLD?

I'm not sure why we don't also check against channel min/max.

BTW we need to take into account that Channel 5, as well as all others after it, aren't usually calibrated

On Fri, 4 Jan 2019, Sergey Bokhantsev wrote:

  I'm not sure why we don't also check against channel min/max.
  BTW we need to take into account that Channel 5, as well as all
  others after it, aren't usually calibrated

After discussions with Randy I made the PR check the same max/min as the
mode switch does - 2200 and 900

This issue appears semi-related to the THR_FS logic discussion, especially about needing to set THR_FS to 0: https://github.com/ArduPilot/ardupilot/issues/1748

Just to be clear, it's not that high channels "come back" later than earlier channels. The timing of when the channels "come back" is random per channel, though they tend to be in groups. The problem occurs when Ch3 comes back before a switch channel. So the failsafe is cleared, and then the switch logic acts on the "low" signal.

And why have people had 100's of successful failsafe events? Because that's how insidious this FrSky Rx bug is. Also, if you use Hold or Pre-programmed failsafes, then it's fine. This only occurs with the "no pulses" failsafe on FrSky Rx using SBUS. You can do 100 bench tests and it's all good, then get a bad one. In my extensive investigation into the problem, it appears it might be related to actual corrupt wireless data when the signal is marginal. But it's really just a hunch. All I know for sure is that it is very difficult to replicate on the bench, and then it can happen in the real world.

The proper fix for this, IMO, is that the radio library should look at any given channel input, and if it is outside of the Min/Max, then it should flag an error down to the vehicle code. Simple as that. There is no good reason why an input parsing program, would accept input that is outside of the defined boundaries.

On Mon, 7 Jan 2019, Robert Lefebvre wrote:

The proper fix for this, IMO, is that the radio library should look at any given channel input, and if it is outside of the Min/Max, then it should flag
an error down to the vehicle code. Simple as that. There is no good reason why an input parsing program, would accept input that is outside of the defined
boundaries.

This should fix the 874 issue:
https://github.com/ArduPilot/ardupilot/pull/10199

We will backport this fix to the Copter-3.6 branch and release with Copter-3.6.5 which I hope will start beta testing within a week.

This is fixed in the Copter-3.6 branch and will be released later today with 3.6.5-rc1.
