Apollo: Running launch/perception_camera.launch crashes intermittently (apollo 3.5)

Created on 31 Jan 2019  ·  4 Comments  ·  Source: ApolloAuto/apollo

System information

  • Linux Ubuntu 14.04
  • Apollo installed from source
  • Apollo version 3.5

Steps to reproduce the issue:

  • Run $ cyber_launch start modules/drivers/camera/launch/camera.launch or play a bag with the topic /apollo/sensor/camera/front_6mm/image in it

  • Run $ cyber_launch start modules/perception/production/launch/perception_camera.launch

  • It crashes most of the time, but occasionally launches successfully.

Supporting materials (screenshots, command lines, code/script snippets):

  • Crash message: [cyber_launch_24544] ERROR Process [mainboard_default_24544] has finished. [pid 24545, cmd mainboard -d /apollo/modules/perception/production/dag/dag_streaming_perception_camera.dag -p mainboard_default_24544 -s CYBER_DEFAULT].

  • Stack trace (please ignore the line numbers in the stack trace, as they won't be the same as yours):

    1 0x00007f70b987d8b8 in _int_malloc (av=0x7f70b9bbf760, bytes=1344) at malloc.c:3425
    2 0x00007f70b987fae0 in __GI___libc_malloc (bytes=1344) at malloc.c:2893
    3 0x00007f70b9e3adad in operator new(unsigned long) () from /usr/lib/x86_64-linux-gnu/libstdc++.so.6
    4 0x00007f70bbb7b3a9 in google::protobuf::DescriptorPool::Tables::AllocateBytes (this=0x1935a30, size=1344) at external/com_google_protobuf/src/google/protobuf/descriptor.cc:1159
    5 0x00007f70bbbac36a in google::protobuf::DescriptorPool::Tables::AllocateArray (this=0x1935a30, count=8) at external/com_google_protobuf/src/google/protobuf/descriptor.cc:1124
    6 0x00007f70bbba533a in google::protobuf::DescriptorBuilder::AllocateArray (this=0x7ffeb8fc4a20, size=8, output=0x48defb00) at external/com_google_protobuf/src/google/protobuf/descriptor.cc:3130
    7 0x00007f70bbb89e8e in google::protobuf::DescriptorBuilder::BuildMessage (this=0x7ffeb8fc4a20, proto=..., parent=0x0, result=0x48defad0) at external/com_google_protobuf/src/google/protobuf/descriptor.cc:4348
    8 0x00007f70bbb89577 in google::protobuf::DescriptorBuilder::BuildFileImpl (this=0x7ffeb8fc4a20, proto=...) at external/com_google_protobuf/src/google/protobuf/descriptor.cc:4267
    9 0x00007f70bbb887b0 in google::protobuf::DescriptorBuilder::BuildFile (this=0x7ffeb8fc4a20, proto=...) at external/com_google_protobuf/src/google/protobuf/descriptor.cc:4083
    10 0x00007f70bbb8586d in google::protobuf::DescriptorPool::BuildFileFromDatabase (this=0x1935970, proto=...) at external/com_google_protobuf/src/google/protobuf/descriptor.cc:3421
    11 0x00007f70bbb7cfb9 in google::protobuf::DescriptorPool::TryFindFileInFallbackDatabase (this=0x1935970, name="modules/drivers/proto/sensor_image.proto") at external/com_google_protobuf/src/google/protobuf/descriptor.cc:1732
    12 0x00007f70bbb7bcb2 in google::protobuf::DescriptorPool::FindFileByName (this=0x1935970, name="modules/drivers/proto/sensor_image.proto") at external/com_google_protobuf/src/google/protobuf/descriptor.cc:1334
    13 0x00007f70bbc1a3b0 in google::protobuf::internal::AssignDescriptors (filename="modules/drivers/proto/sensor_image.proto", schemas=0x7f70ab26e5b0 <apollo::drivers::protobuf_modules_2fdrivers_2fproto_2fsensor_5fimage_2eproto::schemas>, default_instances_=0x7f70ab4a5770 <apollo::drivers::protobuf_modules_2fdrivers_2fproto_2fsensor_5fimage_2eproto::file_default_instances>, offsets=0x7f70ab26e1c0, factory=0x0, file_level_metadata=0x7f70ab4afb80, file_level_enum_descriptors=0x7f70ab4afba0, file_level_service_descriptors=0x0) at external/com_google_protobuf/src/google/protobuf/generated_message_reflection.cc:2311
    14 0x00007f70ab256343 in apollo::drivers::protobuf_modules_2fdrivers_2fproto_2fsensor_5fimage_2eproto::(anonymous namespace)::protobuf_AssignDescriptors () at bazel-out/local-dbg/genfiles/modules/drivers/proto/sensor_image.pb.cc:108
    15 0x00007f70bb680e85 in google::protobuf::internal::FunctionClosure0::Run (this=0x7ffeb8fc4df0) at external/com_google_protobuf/src/google/protobuf/stubs/callback.h:129
    16 0x00007f70bb682ba0 in google::protobuf::GoogleOnceInitImpl (once=0x7f70ab4afba8 <apollo::drivers::protobuf_modules_2fdrivers_2fproto_2fsensor_5fimage_2eproto::(anonymous namespace)::protobuf_AssignDescriptorsOnce()::once>, closure=0x7ffeb8fc4df0) at external/com_google_protobuf/src/google/protobuf/stubs/once.cc:83
    17 0x00007f70c71b908b in google::protobuf::GoogleOnceInit (once=0x7f70ab4afba8 <apollo::drivers::protobuf_modules_2fdrivers_2fproto_2fsensor_5fimage_2eproto::(anonymous namespace)::protobuf_AssignDescriptorsOnce()::once>, init_func=0x7f70ab2562c4 <apollo::drivers::protobuf_modules_2fdrivers_2fproto_2fsensor_5fimage_2eproto::(anonymous namespace)::protobuf_AssignDescriptors()>) at external/com_google_protobuf/src/google/protobuf/stubs/once.h:128
    18 0x00007f70ab2563a6 in apollo::drivers::protobuf_modules_2fdrivers_2fproto_2fsensor_5fimage_2eproto::(anonymous namespace)::protobuf_AssignDescriptorsOnce () at bazel-out/local-dbg/genfiles/modules/drivers/proto/sensor_image.pb.cc:113
    19 0x00007f70ab256a95 in apollo::drivers::Image::descriptor () at bazel-out/local-dbg/genfiles/modules/drivers/proto/sensor_image.pb.cc:342
    20 0x00007f70abbcd4a5 in apollo::cyber::message::MessageType () at ./cyber/message/protobuf_traits.h:35
    21 0x00007f70abbc6abf in apollo::cyber::NodeChannelImpl::FillInAttr (this=0x2148c60, attr=0x7ffeb8fc4f40) at ./cyber/node/node_channel_impl.h:218
    22 0x00007f70abbbdfb5 in apollo::cyber::NodeChannelImpl::CreateReader(apollo::cyber::proto::RoleAttributes const&, std::function const&)> const&, unsigned int) (this=0x2148c60, role_attr=..., reader_func=..., pending_queue_size=1) at ./cyber/node/node_channel_impl.h:188
    23 0x00007f70abbb38a6 in apollo::cyber::NodeChannelImpl::CreateReader(std::string const&, std::function const&)> const&) (this=0x2148c60, channel_name="/apollo/sensor/camera/front_6mm/image", reader_func=...) at ./cyber/node/node_channel_impl.h:163
    24 0x00007f70abbabd3c in apollo::cyber::Node::CreateReader(std::string const&, std::function const&)> const&) (this=0x2148750, channel_name="/apollo/sensor/camera/front_6mm/image", reader_func=...) at ./cyber/node/node.h:158
    25 0x00007f70abb96906 in apollo::perception::onboard::FusionCameraDetectionComponent::InitCameraListeners (this=0x2142170) at modules/perception/onboard/component/fusion_camera_detection_component.cc:495
    26 0x00007f70abb941d6 in apollo::perception::onboard::FusionCameraDetectionComponent::Init (this=0x2142170) at modules/perception/onboard/component/fusion_camera_detection_component.cc:193
    27 0x0000000000410441 in apollo::cyber::Component::Initialize (this=0x2142170, config=...) at ./cyber/component/component.h:120
    28 0x0000000000413484 in apollo::cyber::mainboard::ModuleController::LoadModule (this=0x7ffeb8fc5650, dag_config=...) at cyber/mainboard/module_controller.cc:99
    29 0x00000000004138b2 in apollo::cyber::mainboard::ModuleController::LoadModule (this=0x7ffeb8fc5650, path="/apollo/modules/perception/production/dag/dag_streaming_perception_camera.dag") at cyber/mainboard/module_controller.cc:128
    30 0x0000000000412eca in apollo::cyber::mainboard::ModuleController::LoadAll (this=0x7ffeb8fc5650) at cyber/mainboard/module_controller.cc:64
    31 0x0000000000412ad4 in apollo::cyber::mainboard::ModuleController::Init (this=0x7ffeb8fc5650) at cyber/mainboard/module_controller.cc:33
    32 0x000000000040f462 in main (argc=7, argv=0x7ffeb8fc5808) at cyber/mainboard/mainboard.cc:41

Perception Help wanted

All 4 comments

Resolved this issue by tracking down an out-of-bounds memory access in the same process.

@DevMMI / @techoe could you please provide details about your fix? I have encountered the same problem.
Thanks in advance.

@natashadsouza

Hi, the issue was that I was indexing an array out of bounds in the component I was running. C++ won't give you a compile error when you do this, and therefore you'll have to track it down yourself.

I recommend you look through the core dump using GDB and try to spot roughly where it crashed, then use that to narrow down which files you'll need to audit. I also recommend you check out a previous commit that worked and see what changed in high-risk files, using a diff tool such as www.diffchecker.com. Good luck!
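
For anyone auditing their own component for the kind of out-of-bounds indexing described above: an out-of-bounds write on the heap can corrupt allocator bookkeeping, so the crash may only show up later inside an unrelated allocation, which would be consistent with the trace above ending in _int_malloc. Below is a minimal sketch of two ways to make such accesses fail loudly or safely instead. The helper names GetOr and AtThrows are hypothetical, made up for illustration; they are not part of Apollo or Cyber RT, and this is not the fix that was applied in this issue.

```cpp
#include <cstddef>
#include <stdexcept>
#include <vector>

// Hypothetical helper (not an Apollo API): returns values[index] when the
// index is in range, otherwise returns `fallback` instead of reading past
// the end of the buffer.
float GetOr(const std::vector<float>& values, std::size_t index,
            float fallback) {
  return index < values.size() ? values[index] : fallback;
}

// Hypothetical helper: reports whether values.at(index) throws.
// std::vector::at() checks the index on every call and throws
// std::out_of_range, whereas operator[] performs no check and an
// out-of-range index there is undefined behavior.
bool AtThrows(const std::vector<float>& values, std::size_t index) {
  try {
    (void)values.at(index);
    return false;
  } catch (const std::out_of_range&) {
    return true;
  }
}
```

Temporarily swapping operator[] for at() in the suspect files while debugging turns a silent corruption into an exception at the faulty access, which tends to give a much more useful backtrace in GDB than a crash inside malloc.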
