song@in_dev_docker:/apollo$ bash apollo.sh build_gpu
System check passed. Build continue ...
[WARNING] ESD CAN library supplied by ESD Electronics does not exist. If you need ESD CAN, please refer to third_party/can_card_library/esd_can/README.md.
[INFO] Start building, please wait ...
INFO: Reading 'startup' options from /apollo/tools/bazel.rc: --batch_cpu_scheduling
[INFO] Building on x86_64...
INFO: Reading 'startup' options from /apollo/tools/bazel.rc: --batch_cpu_scheduling
INFO: Analysed 1470 targets (12 packages loaded).
INFO: Found 1470 targets...
INFO: From Executing genrule //modules/drivers/pandora:pandora_genrule:
<command-line>:0:15: warning: ISO C99 requires whitespace after the macro name [enabled by default]
<command-line>:0:15: warning: ISO C99 requires whitespace after the macro name [enabled by default]
INFO: From Executing genrule //modules/perception/cuda_util:cuda_util_genrule:
-- Configuring done
-- Generating done
-- Build files have been written to: /home/song/.cache/bazel/_bazel_song/540135163923dd7d5820f3ee4b306b32/execroot/apollo/modules/perception/cuda_util/cmake_build
[100%] Built target cuda_util
/home/song/.cache/bazel/_bazel_song/540135163923dd7d5820f3ee4b306b32/execroot/apollo
INFO: From Compiling modules/perception/obstacle/camera/visualizer/glfw_fusion_viewer.cc:
modules/perception/obstacle/camera/visualizer/glfw_fusion_viewer.cc: In member function 'bool apollo::perception::lowcostvisualizer::GLFWFusionViewer::draw_analysis_curve()':
modules/perception/obstacle/camera/visualizer/glfw_fusion_viewer.cc:1327:1: warning: no return statement in function returning non-void [-Wreturn-type]
}
^
INFO: Elapsed time: 1918.030s, Critical Path: 1458.57s
INFO: Build completed successfully, 7253 total actions
============================
[ OK ] Build passed!
[INFO] Took 1922 seconds
============================
song@in_dev_docker:/apollo$ bash scripts/bootstrap.sh
Started supervisord with dev conf
Start roscore...
voice_detector: started
Dreamview is running at http://localhost:8888
song@in_dev_docker:/apollo$ rostopic list
-------NOTHING SHOWS HERE----
song@in_dev_docker:/apollo$ rosnode list
-------NOTHING SHOWS HERE----
song@in_dev_docker:/apollo$ env
CPLUS_INCLUDE_PATH=/usr/local/cuda-8.0/include:
HOSTNAME=in_dev_docker
TERM=xterm
ROS_ROOT=/home/tmp/ros/share/ros
ROS_PACKAGE_PATH=/home/tmp/ros/share:/home/tmp/ros/stacks
APOLLO_BIN_PREFIX=/apollo/bazel-bin
ROS_MASTER_URI=http://localhost:11311
USER=song
LD_LIBRARY_PATH=/usr/local/cuda-8.0/lib64:/home/tmp/ros/lib:/apollo/lib:/apollo/bazel-genfiles/external/caffe/lib:/home/caros/secure_upgrade/depend_lib
LS_COLORS=rs=0:di=01;34:ln=01;36:mh=00:pi=40;33:so=01;35:do=01;35:bd=40;33;01:cd=40;33;01:or=40;31;01:su=37;41:sg=30;43:ca=30;41:tw=30;42:ow=34;42:st=37;44:ex=01;32:*.tar=01;31:*.tgz=01;31:*.arj=01;31:*.taz=01;31:*.lzh=01;31:*.lzma=01;31:*.tlz=01;31:*.txz=01;31:*.zip=01;31:*.z=01;31:*.Z=01;31:*.dz=01;31:*.gz=01;31:*.lz=01;31:*.xz=01;31:*.bz2=01;31:*.bz=01;31:*.tbz=01;31:*.tbz2=01;31:*.tz=01;31:*.deb=01;31:*.rpm=01;31:*.jar=01;31:*.war=01;31:*.ear=01;31:*.sar=01;31:*.rar=01;31:*.ace=01;31:*.zoo=01;31:*.cpio=01;31:*.7z=01;31:*.rz=01;31:*.jpg=01;35:*.jpeg=01;35:*.gif=01;35:*.bmp=01;35:*.pbm=01;35:*.pgm=01;35:*.ppm=01;35:*.tga=01;35:*.xbm=01;35:*.xpm=01;35:*.tif=01;35:*.tiff=01;35:*.png=01;35:*.svg=01;35:*.svgz=01;35:*.mng=01;35:*.pcx=01;35:*.mov=01;35:*.mpg=01;35:*.mpeg=01;35:*.m2v=01;35:*.mkv=01;35:*.webm=01;35:*.ogm=01;35:*.mp4=01;35:*.m4v=01;35:*.mp4v=01;35:*.vob=01;35:*.qt=01;35:*.nuv=01;35:*.wmv=01;35:*.asf=01;35:*.rm=01;35:*.rmvb=01;35:*.flc=01;35:*.avi=01;35:*.fli=01;35:*.flv=01;35:*.gl=01;35:*.dl=01;35:*.xcf=01;35:*.xwd=01;35:*.yuv=01;35:*.cgm=01;35:*.emf=01;35:*.axv=01;35:*.anx=01;35:*.ogv=01;35:*.ogx=01;35:*.aac=00;36:*.au=00;36:*.flac=00;36:*.mid=00;36:*.midi=00;36:*.mka=00;36:*.mp3=00;36:*.mpc=00;36:*.ogg=00;36:*.ra=00;36:*.wav=00;36:*.axa=00;36:*.oga=00;36:*.spx=00;36:*.xspf=00;36:
APOLLO_IN_DOCKER=true
CPATH=/home/tmp/ros/include
ROS_DOMAIN_ID=68321777
PATH=/usr/local/cuda-8.0/bin:/home/tmp/ros/bin:/apollo/scripts:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin
C_INCLUDE_PATH=/usr/local/cuda-8.0/include:
DOCKER_GRP=song
PWD=/apollo
DOCKER_GRP_ID=1000
ROSLISP_PACKAGE_DIRECTORIES=
DOCKER_USER_ID=1000
DOCKER_USER=song
SHLVL=1
HOME=/home/song
ROS_DISTRO=indigo
PYTHONPATH=/usr/local/lib/python2.7/dist-packages:/apollo/py_proto:/usr/local/apollo/snowboy/Python:/apollo/modules/tools:/home/tmp/ros/lib/python2.7/dist-packages
PKG_CONFIG_PATH=/home/tmp/ros/lib/pkgconfig
LESSOPEN=| /usr/bin/lesspipe %s
DOCKER_IMG=registry.docker-cn.com/apolloauto/apollo:dev-x86_64-20180413_2000
CMAKE_PREFIX_PATH=/home/tmp/ros
DISPLAY=:0.0
LESSCLOSE=/usr/bin/lesspipe %s %s
APOLLO_BASE_SOURCED=1
ROS_ETC_DIR=/home/tmp/ros/etc/ros
_=/usr/bin/env
OLDPWD=/apollo/third_party/ros_x86_64
song@in_dev_docker:/apollo$ ps aux
USER PID %CPU %MEM VSZ RSS TTY STAT START TIME COMMAND
root 1 0.0 0.0 18240 280 pts/23 Ss+ 08:47 0:00 /bin/bash
song 51 0.0 0.0 20972 3116 pts/24 Ss 08:47 0:00 /bin/bash
song 119 0.3 0.1 50472 9196 ? Ss 08:48 0:12 /usr/bin/python /usr/local/bin/supervisord -c /apollo/modules/tools/supervisord/dev.conf
song 127 0.1 0.0 542808 3560 pts/24 Sl 08:48 0:05 /usr/bin/python /home/tmp/ros/bin/roscore
song 133 0.7 0.0 1039604 7372 ? Sl 08:48 0:28 /apollo/bazel-bin/modules/monitor/monitor --flagfile=/apollo/modules/monitor/conf/monitor.conf
song 152 0.1 0.0 747584 4060 ? Ssl 08:48 0:06 /usr/bin/python /home/tmp/ros/bin/rosmaster --core -p 11311 -w 3 __log:=/home/song/.ros/log/738498b2-735a-11e8-b711-509a4c2d2f83/master.log
song 169 0.3 0.0 499240 368 ? Ssl 08:48 0:14 /home/tmp/ros/lib/rosout/rosout __name:=rosout __log:=/home/song/.ros/log/738498b2-735a-11e8-b711-509a4c2d2f83/rosout-1.log
song 185 0.1 0.0 711548 1064 pts/24 Sl 08:48 0:05 python modules/tools/voice_detection/snowboy_detector.py
song 191 2.1 0.0 5045096 1444 ? Sl 08:48 1:26 /apollo/bazel-bin/modules/dreamview/dreamview --flagfile=/apollo/modules/dreamview/conf/dreamview.conf
song 2422 0.1 0.5 711548 45284 pts/24 Sl 09:47 0:00 python modules/tools/voice_detection/snowboy_detector.py
song 2467 0.0 0.0 15584 2012 pts/24 R+ 09:55 0:00 ps aux
A week ago, Apollo and ROS both worked fine. I have not committed any changes to the Apollo Docker images since then.
bootstrap.sh redirects the stdout/stderr of the processes it starts to various files, so errors do not show up on the console.
Can you please paste the contents of these files (from within the dev docker)?
/apollo/data/log/roscore.out
/tmp/supervisord.start.log
That will give us clues as to what is failing.
BTW, I see roscore as part of your running processes, but /rosout isn't part of your registered topics. This is already pointing to a potential problem with your ROS because /rosout should have appeared under both topics and nodes. If you just run
roscore
manually you should see this:
...
process[master]: started with pid [504]
ROS_MASTER_URI=http://in_dev_docker:11311/
setting /run_id to aebeff98-73e7-11e8-a890-7085c2287315
process[rosout-1]: started with pid [524]
started core service [/rosout]
...
Please share with us what you see.
thanks a lot.
[ OK ] Enjoy!
song@in_dev_docker:/apollo$ uname -a
Linux in_dev_docker 4.2.0-27-generic #32~14.04.1-Ubuntu SMP Fri Jan 22 15:32:26 UTC 2016 x86_64 x86_64 x86_64 GNU/Linux
song@in_dev_docker:/apollo$ bash scripts/bootstrap.sh
Started supervisord with dev conf
Start roscore...
voice_detector: started
Dreamview is running at http://localhost:8888
song@in_dev_docker:/apollo$ cat /tmp/supervisord.start.log
song@in_dev_docker:/apollo$ cat /apollo/data/log/roscore.out
song@in_dev_docker:/apollo$ roscore
... logging to /home/song/.ros/log/6d4d2904-7437-11e8-bfa5-509a4c2d2f83/roslaunch-in_dev_docker-452.log
Checking log directory for disk usage. This may take awhile.
Press Ctrl-C to interrupt
Done checking log file disk usage. Usage is <1GB.
started roslaunch server http://in_dev_docker:36205/
ros_comm version 1.11.21
SUMMARY
========
PARAMETERS
* /rosdistro: indigo
* /rosversion: 1.11.21
NODES
roscore cannot run as another roscore/master is already running.
Please kill other roscore/master processes before relaunching.
The ROS_MASTER_URI is http://in_dev_docker:11311/
The traceback for the exception was written to the log file
song@in_dev_docker:/apollo$ bash scripts/bootstrap.sh stop
dreamview: stopped
voice_detector: stopped
monitor: stopped
roscore: stopped
song@in_dev_docker:/apollo$ roscore
... logging to /home/song/.ros/log/d520bbd6-7437-11e8-8e62-509a4c2d2f83/roslaunch-in_dev_docker-506.log
Checking log directory for disk usage. This may take awhile.
Press Ctrl-C to interrupt
Done checking log file disk usage. Usage is <1GB.
started roslaunch server http://in_dev_docker:40673/
ros_comm version 1.11.21
SUMMARY
========
PARAMETERS
* /rosdistro: indigo
* /rosversion: 1.11.21
NODES
auto-starting new master
process[master]: started with pid [521]
ROS_MASTER_URI=http://in_dev_docker:11311/
setting /run_id to d520bbd6-7437-11e8-8e62-509a4c2d2f83
process[rosout-1]: started with pid [538]
started core service [/rosout]
-----IN ANOTHER TERMINAL------
song@in_dev_docker:/apollo$ rosnode list; rostopic list
song@in_dev_docker:/apollo$ exit
exit
song@songPC:~$ docker ps
CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES
f9e1e92aa48e registry.docker-cn.com/apolloauto/apollo:dev-x86_64-20180413_2000 "/bin/bash" 8 minutes ago Up 8 minutes apollo_dev
a9ea6ec560ab registry.docker-cn.com/apolloauto/apollo:yolo3d_volume-x86_64-latest "/bin/sh" 8 minutes ago Up 8 minutes apollo_yolo3d_volume
d36b77463c6e registry.docker-cn.com/apolloauto/apollo:localization_volume-x86_64-latest "/bin/sh" 8 minutes ago Up 8 minutes apollo_localization_volume
b82bd9b04fd9 registry.docker-cn.com/apolloauto/apollo:map_volume-sunnyvale_loop-latest "/bin/bash" 8 minutes ago Up 8 minutes apollo_map_volume-sunnyvale_loop
a4a259dbf978 registry.docker-cn.com/apolloauto/apollo:map_volume-sunnyvale_big_loop-latest "/bin/sh" 8 minutes ago Up 8 minutes apollo_map_volume-sunnyvale_big_loop
song@songPC:~$
I also encountered the same problem. Deleting the Docker image and the Bazel cache files does not fix it.
Solved. It was caused by a ROS_DOMAIN_ID conflict.
Solution 1:
[in_dev_docker] In each bash terminal that participates in ROS, run
env |grep -i domain|cut -c21-22
And
export ROS_DOMAIN_ID=`hostname -I | sed 's/[^0-9]//g' | cut -c5-10`"XX"; env |grep -i domain_id
ATTENTION: replace "XX" with a two-digit number such as "88" or "99" that differs from the one printed by env |grep -i domain|cut -c21-22
Solution 2:
[in_dev_docker] Open /home/tmp/ros/setup.sh. On the last line you will find
export ROS_DOMAIN_ID=`hostname -I | sed 's/[^0-9]//g' | cut -c5-10`"77"
Change 77 to another number such as "88" or "99".
Commit the Docker image with the tag "YOURIMAGETAG" and restart Docker with the "local" option:
bash docker/scripts/dev_start.sh -t YOURIMAGETAG -l
These solutions can still fail. If they do, change the number XX again and retry.
To Apollo team:
ROS_DOMAIN_ID=`hostname -I | sed 's/[^0-9]//g' | cut -c5-10`"77" takes digits from the middle of the host's IP addresses as the ROS_DOMAIN_ID, so different hosts on the same LAN can end up with the same ROS_DOMAIN_ID.
For example, "hostname -I" may return IPs like
192.168.3.2 172.17.0.1
or 192.168.3.217 172.17.0.1
In both cases we get the same ROS_DOMAIN_ID, 68321777.
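To make the collision concrete, here is a minimal sketch that runs the same sed/cut pipeline on a fixed string instead of the live `hostname -I` output. The function name `apollo_domain_id` is made up for illustration; only the pipeline itself comes from setup.sh.

```shell
# Hypothetical helper: applies the setup.sh pipeline to a given
# "hostname -I"-style string so the result can be checked offline.
apollo_domain_id() (
    digits=$(printf '%s\n' "$1" | sed 's/[^0-9]//g' | cut -c5-10)
    printf '%s77\n' "$digits"
)

apollo_domain_id "192.168.3.2 172.17.0.1"     # prints 68321777
apollo_domain_id "192.168.3.217 172.17.0.1"   # prints 68321777
```

Both hosts differ in their last octet, but characters 5 to 10 of the concatenated digits are "683217" either way.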
Moreover, in apollo-platform/ros/third_party/fast-rtps_x86_64/include/fastrtps/rtps/attributes/RTPSParticipantAttributes.h:
class PortParameters
{
    public:
    PortParameters()
    {
        portBase = 7400;
        participantIDGain = 2;
        domainIDGain = 250;
        offsetd0 = 0;
        offsetd1 = 10;
        offsetd2 = 1;
        offsetd3 = 11;
    };
    virtual ~PortParameters(){}
    /**
     * Get a multicast port based on the domain ID.
     *
     * @param domainId Domain ID.
     * @return Multicast port
     */
    inline uint32_t getMulticastPort(uint32_t domainId)
    {
        return portBase+ domainIDGain * domainId+ offsetd0;
    }
    ............
}
getMulticastPort() is not guaranteed to return a distinct multicast port for each host on the LAN, even when the domain IDs are distinct from each other.
A permanent fix would be to modify the source code mentioned above and republish the Docker image.
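A rough sketch of why distinct domain IDs need not give distinct ports, using the default values from the PortParameters constructor quoted above. The helper names are hypothetical. The 32-bit wrap follows from the uint32_t return type; the further narrowing to 16 bits is my assumption about what happens once the value is used as a UDP port, not something taken from the Fast-RTPS source.

```shell
# Sketch of getMulticastPort() with the quoted defaults
# (portBase=7400, domainIDGain=250, offsetd0=0).
# The "% 4294967296" mimics uint32_t wrap-around.
multicast_port32() (
    echo $(( (7400 + 250 * $1 + 0) % 4294967296 ))
)

# ASSUMPTION: the value is eventually narrowed to a 16-bit UDP port.
multicast_port16() (
    echo $(( $(multicast_port32 "$1") % 65536 ))
)

multicast_port32 68321777   # prints 4195549762
multicast_port16 68321777   # prints 578
multicast_port16 0          # prints 7400
multicast_port16 32768      # prints 7400
```

Under that assumption, domain IDs 0 and 32768 would land on the same port, and an 8-digit ID such as 68321777 would map to a port that has little to do with the ID itself. This needs a shell with 64-bit arithmetic.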
I guess it would work better if the value were derived as follows:
`hostname -I | sed 's/[^0-9]//g' | cut -c5-10`"77" # replace this expression so that it produces the following
-> 192.168.3.2 172.17.0.1 # find the index of the first space, then take the 6 digits before it
->216832
->216832+randNumber+same_randNumber
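One possible way to sketch the first two steps of this idea in shell. The function name `last6_of_first_ip` is hypothetical, and the random suffix is left out because its exact form is not specified above.

```shell
# Hypothetical helper: take the first address from a "hostname -I"-style
# string, drop everything except digits, and keep the last 6 digits.
last6_of_first_ip() (
    first=${1%% *}                                   # text before the first space
    printf '%s' "$first" | tr -cd '0-9' | tail -c 6
    echo
)

last6_of_first_ip "192.168.3.2 172.17.0.1"     # prints 216832
last6_of_first_ip "192.168.3.217 172.17.0.1"   # prints 683217
```

With this variant the two hosts from the earlier example no longer share a value, since the last octet always contributes to the result.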
To Apollo team:
We have also had trouble with this formula for deriving ROS_DOMAIN_ID.
We had the opposite problem: machines on the same subnet ended up with different ROS_DOMAIN_IDs.
E.g. if the two nodes have short IPs such as:
192.168.1.13
192.168.1.18
the ROS_DOMAIN_IDs produced by Apollo's original formula end up being:
68113177
68118177
and the devices didn't connect.
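For anyone who wants to reproduce these two values offline, here is a minimal sketch of the same pipeline applied to fixed strings. The function name is hypothetical. Note the assumption: the IDs quoted above only come out if each host also reports a second address starting with "1" (I used the Docker bridge 172.17.0.1), because a short address alone does not supply six digits.

```shell
# Hypothetical helper: the setup.sh pipeline applied to a given string.
apollo_domain_id() (
    digits=$(printf '%s\n' "$1" | sed 's/[^0-9]//g' | cut -c5-10)
    printf '%s77\n' "$digits"
)

# ASSUMPTION: a second address (172.17.0.1) is present on both hosts.
apollo_domain_id "192.168.1.13 172.17.0.1"   # prints 68113177
apollo_domain_id "192.168.1.18 172.17.0.1"   # prints 68118177

# With only the short address, the ID has just 7 digits.
apollo_domain_id "192.168.1.13"              # prints 6811377
```

So the last octet leaks into the ID whenever the first address is short, which is why hosts on the same subnet end up in different domains.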
It looks like multiple people are being affected by this problem in different ways. To address Song's point: we expect two machines on the same subnet (addresses equal except for the last number) to get the same ROS_DOMAIN_ID, whereas he expects them to be different. So the intent needs to be clarified first, and then the formula can be fixed to match it.
Thanks
@songhanchen @osaman88 Thank you for reporting this issue and providing a detailed analysis. RTPS selects a communication port based on the domain ID, and sometimes the selected port is already occupied, so communication fails. We are looking for a more effective mechanism for setting the domain ID that avoids port conflicts.
In addition, if communication fails, you can set the domain ID manually as follows:
export ROS_DOMAIN_ID=212
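A quick sanity check on a manually chosen value, as a sketch only. It applies the default multicast port formula from RTPSParticipantAttributes.h (portBase 7400, domainIDGain 250, offsetd0 0). The helper name is hypothetical, and treating 65535 as the upper limit assumes the result is used directly as a UDP port.

```shell
# Hypothetical helper: default multicast port for a given domain ID.
domain_id_port() (
    echo $(( 7400 + 250 * $1 ))
)

port=$(domain_id_port 212)
echo "$port"                                   # prints 60400
# ASSUMPTION: the ID is only usable if the port fits in 16 bits.
[ "$port" -le 65535 ] && echo "fits in a UDP port"
```

By the same arithmetic, 232 gives 65400 and 233 gives 65650, so small IDs like 212 stay within range while the 8-digit IDs generated by setup.sh do not.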
Closed. Reopen if you still have questions.