Px4-autopilot: SITL intermittent failure: param indices changing during sync

Created on 5 Feb 2019 · 18 Comments · Source: PX4/PX4-Autopilot

http://ci.px4.io:8080/blue/organizations/jenkins/PX4_misc%2FFirmware-SITL_tests/detail/master/593/pipeline/

bug wontfix

All 18 comments

One possibility here is params being marked active mid-sync.

@bkueng tagging you in case anything comes to mind for parameter flux early on in SITL.

To be really paranoid we'd have to do something in mavlink parameters (either copy the list or lock the param system).
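
As an illustration of the "copy the list" option, here is a minimal Python sketch. It is not PX4 code: `ParamStore`, `mark_active`, `active_names`, `sync_live`, and `sync_snapshot` are hypothetical names. The one assumption carried over from this thread is that a parameter's sync index is its position among the currently active parameters, so marking another parameter active mid-sync shifts the later indices and the total count.

```python
class ParamStore:
    """Toy parameter store (hypothetical, not the PX4 param API)."""

    def __init__(self, names):
        self._names = sorted(names)  # fixed overall ordering
        self._active = set()

    def mark_active(self, name):
        self._active.add(name)

    def active_names(self):
        # A param's index is its position in this list.
        return [n for n in self._names if n in self._active]


def sync_live(store, activate_after_first):
    """Send (index, count, name) while re-reading the live list each step."""
    sent = []
    i = 0
    while i < len(store.active_names()):
        names = store.active_names()
        sent.append((i, len(names), names[i]))
        if i == 0:
            # Simulates another module marking a param active mid-sync.
            store.mark_active(activate_after_first)
        i += 1
    return sent


def sync_snapshot(store, activate_after_first):
    """Same sync, but iterating over a copy taken once at the start."""
    names = store.active_names()
    sent = []
    for i, name in enumerate(names):
        sent.append((i, len(names), name))
        if i == 0:
            store.mark_active(activate_after_first)
    return sent
```

With A, C, and D active and B activated after the first message, `sync_live` reports the count changing from 3 to 4 and C moving from index 1 to index 2, while `sync_snapshot` keeps every index and the count consistent for the whole transfer.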

This is the reason the mavlink boot_complete call exists, and all param references should be initialized on app launch. Which app is offending the design?

Not sure yet, looking at adding debugging to catch the change. Anything that isn't using the new param wrapper (px4_params.h) or doing a param_find immediately is suspect.

param_find is also comparatively expensive and should not be run regularly.
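
A rough sketch of the pattern being asked for here: resolve the handle once at launch and reuse it afterwards. This is plain Python, not the PX4 API. `Params`, `find`, `get`, and `Module` are hypothetical stand-ins, and the parameter values are only illustrative.

```python
class Params:
    """Toy registry; `find` stands in for a costly name lookup (hypothetical)."""

    def __init__(self, values):
        self._names = list(values)
        self._values = list(values.values())
        self.find_calls = 0

    def find(self, name):
        self.find_calls += 1
        return self._names.index(name)  # name search on every call

    def get(self, handle):
        return self._values[handle]  # cheap indexed access


class Module:
    def __init__(self, params):
        self._params = params
        # Resolve the handle once, at launch, before any sync starts.
        self._rc16_min = params.find("RC16_MIN")

    def update(self):
        # The regular loop only uses the cached handle.
        return self._params.get(self._rc16_min)
```

However often `update` runs, `find_calls` stays at 1, and the parameter is already referenced before a sync could begin.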

Hack to help find offenders - https://github.com/PX4/Firmware/pull/11388

I think this one might also be related to lock step, at least why we're catching it regularly now. Something to do with sleep (now px4_sleep) before lock step is fully working? @julianoes FYI

Dev call: Needs analysis on boot and fixes for offending nested param find calls.

Still need to keep an eye on this one.

@julianoes This is still a problem.

Another example.
http://ci.px4.io:8080/blue/organizations/jenkins/PX4_misc%2FFirmware-SITL_tests/detail/PR-11505/2/pipeline

@julianoes @dagar we see a lot of these errors

[ WARN] [1554283456.615858849, 13.772000000]: PR: request param #419 timeout, retries left 2, and 1 params still missing
[ WARN] [1554283456.616271258, 13.772000000]: PR: Param RC16_MIN (418/559): <value><double>1000</double></value> different index: 419/560

when retrieving the PX4 parameters from mavlink/mavros on the avoidance side.
Is there anything we can do about it?

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.

@dagar I haven't seen this in a while.

Likewise, but there's still nothing fundamentally preventing it. How about we get the "hack" (https://github.com/PX4/Firmware/pull/11388) into an acceptable state for merging and move on? At least then we'd capture it in logs if it did happen in the field.

The alternative is locking the param system during mavlink param sync.
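
A minimal sketch of what the locking alternative could look like, again in plain Python rather than PX4 code. `LockedParamStore` and its methods are hypothetical; the sketch assumes a parameter's index is its position among the active parameters and that activation goes through the same lock as the sync.

```python
import threading


class LockedParamStore:
    """Toy store where a sync holds a lock that activation must also take."""

    def __init__(self, names):
        self._names = sorted(names)
        self._active = set()
        self._lock = threading.Lock()

    def mark_active(self, name):
        with self._lock:  # blocks while a sync is in progress
            self._active.add(name)

    def _active_names(self):
        return [n for n in self._names if n in self._active]

    def sync(self, on_first_sent=None):
        """Return (index, count, name) tuples, reading the live list each step."""
        sent = []
        with self._lock:
            i = 0
            while i < len(self._active_names()):
                names = self._active_names()
                sent.append((i, len(names), names[i]))
                if i == 0 and on_first_sent is not None:
                    on_first_sent()  # hook to simulate concurrent activity
                i += 1
        return sent
```

Any thread calling `mark_active` during a sync waits until the sync finishes, so indices stay stable. The cost is that every activation stalls for the full duration of the transfer.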

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.

I believe this has been resolved.
