Here is an example of the error:
Assertion failed: (reinterpret_cast<size_t>(array) & (15)) == 0 && "this assertion is explained here: " "http://eigen.tuxfamily.org/dox-devel/group__TopicUnalignedArrayAssert.html" " **** READ THIS WEB PAGE !!! ****", file c:\projects\xnuv8cec\build\include\eigen3\eigen\src/Core/DenseStorage.h, line 109
Earlier in the log file, a compiler warning points to the real problem:
18:32:14 C:\projects\xNUV8Cec\build\include\gtest/internal/gtest-internal.h(484): warning C4316: 'drake::systems::plants::test::`anonymous-namespace'::RigidBodyTreeTest_TestAddFloatingJointWeldToLink_Test': object allocated on the heap may not be aligned 16 [C:\projects\xNUV8Cec\pod-build\drake\systems\plants\test\rigid_body_tree_test.vcxproj] [C:\projects\xNUV8Cec\pod-build\drake.vcxproj]
This problem was not detected during the PR CI tests because those only perform release-mode tests where assertions are disabled.
Related issue: https://github.com/RobotLocomotion/drake/issues/1854
There is an Eigen 3.3 bug fix that _might_ solve this. See #2106.
Using Windows 10 64-bit Professional, I compiled a 32-bit version of Drake but could not reproduce the error.
Here is the check confirming that the executable is 32-bit:
$ file ./bin/RelWithDebInfo/rigid_body_tree_test.exe
./bin/RelWithDebInfo/rigid_body_tree_test.exe: PE32 executable (console) Intel 80386, for MS Windows
Here is the output when I run the unit test:
$ ctest -VV -C RelWithDebInfo -R rigid_body_tree_test
UpdateCTestConfiguration from :C:/Users/liang/dev/drake_build/drake/DartConfiguration.tcl
Parse Config file:C:/Users/liang/dev/drake_build/drake/DartConfiguration.tcl
UpdateCTestConfiguration from :C:/Users/liang/dev/drake_build/drake/DartConfiguration.tcl
Parse Config file:C:/Users/liang/dev/drake_build/drake/DartConfiguration.tcl
Test project C:/Users/liang/dev/drake_build/drake
Constructing a list of tests
Done constructing a list of tests
Checking test dependency graph...
Checking test dependency graph end
test 21
Start 21: rigid_body_tree_test
21: Test command: C:\Users\liang\dev\drake_build\drake\bin\RelWithDebInfo\rigid_body_tree_test.exe
21: Test timeout computed to be: 1500
21: Running main() from gtest_main.cc
21: [==========] Running 3 tests from 1 test case.
21: [----------] Global test environment set-up.
21: [----------] 3 tests from RigidBodyTreeTest
21: [ RUN ] RigidBodyTreeTest.TestAddFloatingJointNoOffset
21: could not find any links named body2
21: could not find any links named body2
21: [ OK ] RigidBodyTreeTest.TestAddFloatingJointNoOffset (5 ms)
21: [ RUN ] RigidBodyTreeTest.TestAddFloatingJointWithOffset
21: [ OK ] RigidBodyTreeTest.TestAddFloatingJointWithOffset (5 ms)
21: [ RUN ] RigidBodyTreeTest.TestAddFloatingJointWeldToLink
21: [ OK ] RigidBodyTreeTest.TestAddFloatingJointWeldToLink (6 ms)
21: [----------] 3 tests from RigidBodyTreeTest (16 ms total)
21:
21: [----------] Global test environment tear-down
21: [==========] 3 tests from 1 test case ran. (16 ms total)
21: [ PASSED ] 3 tests.
1/1 Test #21: rigid_body_tree_test ............. Passed 0.06 sec
The following tests passed:
rigid_body_tree_test
100% tests passed, 0 tests failed out of 1
Total Test time (real) = 0.12 sec
I suspect it passed because it was compiled in RelWithDebInfo mode, which might have assertions disabled. I'm next going to try Debug mode.
The RelWithDebInfo binary is about 4X smaller than the Debug mode binary:
$ ls -sh drake/bin/RelWithDebInfo/rigid_body_tree_test.exe
72K drake/bin/RelWithDebInfo/rigid_body_tree_test.exe*
$ ls -sh drake/bin/Debug/rigid_body_tree_test.exe
288K drake/bin/Debug/rigid_body_tree_test.exe*
This gives me some level of confidence that the Debug mode binary actually does contain the debug symbols, etc.
Bingo! I'm able to replicate the CI Windows error by using 32-bit Debug mode:
$ ctest -VV -C Debug -R rigid_body_tree_test
UpdateCTestConfiguration from :C:/Users/liang/dev/drake_build/drake/DartConfiguration.tcl
Parse Config file:C:/Users/liang/dev/drake_build/drake/DartConfiguration.tcl
UpdateCTestConfiguration from :C:/Users/liang/dev/drake_build/drake/DartConfiguration.tcl
Parse Config file:C:/Users/liang/dev/drake_build/drake/DartConfiguration.tcl
Test project C:/Users/liang/dev/drake_build/drake
Constructing a list of tests
Done constructing a list of tests
Checking test dependency graph...
Checking test dependency graph end
test 21
Start 21: rigid_body_tree_test
21: Test command: C:\Users\liang\dev\drake_build\drake\bin\Debug\rigid_body_tree_test.exe
21: Test timeout computed to be: 1500
21: Running main() from gtest_main.cc
21: [==========] Running 3 tests from 1 test case.
21: [----------] Global test environment set-up.
21: [----------] 3 tests from RigidBodyTreeTest
21: [ RUN ] RigidBodyTreeTest.TestAddFloatingJointNoOffset
21: Assertion failed: (reinterpret_cast<size_t>(array) & (15)) == 0 && "this assertion is explained here: " "http://eigen.tuxfamily.org/dox-devel/group__TopicUnalignedArrayAssert.html" " **** READ THIS WEB PAGE !!! ****", file c:\users\liang\dev\drake_build\install\include\eigen3\eigen\src/Core/DenseStorage.h, line 109
1/1 Test #21: rigid_body_tree_test .............***Failed 0.09 sec
0% tests passed, 1 tests failed out of 1
Total Test time (real) = 0.33 sec
The following tests FAILED:
21 - rigid_body_tree_test (Failed)
Errors while running CTest
Liang, if you want to check whether the alleged fix in Eigen 3.3 solves the problem, you could try hacking in this small change to Eigen.
Using windbg, I get the following back trace:
liang@DESKTOP-VN4SS6S MINGW64 ~/dev/drake_build/drake
$ /c/Program\ Files\ \(x86\)/Windows\ Kits/10/Debuggers/x86/windbg.exe -QY -g ./bin/Debug/rigid_body_tree_test.exe
0:000> k
# ChildEBP RetAddr
00 004fed38 77407dde ntdll!NtTerminateProcess+0xc
01 004fee14 770a7b42 ntdll!RtlExitUserProcess+0x9e
02 004fee28 64e67ac8 KERNEL32!ExitProcessImplementation+0x12
03 004fee34 64e67a5f ucrtbased!exit_or_terminate_process+0x38 [d:\th\minkernel\crts\ucrt\src\appcrt\startup\exit.cpp @ 130]
04 004fee78 64e67cc2 ucrtbased!common_exit+0x15f [d:\th\minkernel\crts\ucrt\src\appcrt\startup\exit.cpp @ 269]
05 004fee8c 64e62b4a ucrtbased!_exit+0x12 [d:\th\minkernel\crts\ucrt\src\appcrt\startup\exit.cpp @ 292]
06 004fee9c 64e6727d ucrtbased!abort+0x6a [d:\th\minkernel\crts\ucrt\src\appcrt\startup\abort.cpp @ 90]
07 004ff334 64e66f19 ucrtbased!common_assert_to_stderr_direct+0xcd [d:\th\minkernel\crts\ucrt\src\appcrt\startup\assert.cpp @ 124]
08 004ff354 64e6558a ucrtbased!common_assert_to_stderr<wchar_t>+0x19 [d:\th\minkernel\crts\ucrt\src\appcrt\startup\assert.cpp @ 138]
09 004ff36c 64e6774a ucrtbased!common_assert<wchar_t>+0x2a [d:\th\minkernel\crts\ucrt\src\appcrt\startup\assert.cpp @ 378]
*** WARNING: Unable to verify checksum for C:\Users\liang\dev\drake_build\drake\lib\Debug\drakeRBM.dll
0a 004ff384 65697534 ucrtbased!_wassert+0x1a [d:\th\minkernel\crts\ucrt\src\appcrt\startup\assert.cpp @ 404]
0b 004ff3a0 65616e66 drakeRBM!Eigen::internal::plain_array<double,36,0,16>::plain_array<double,36,0,16>+0x34 [c:\users\liang\dev\drake_build\install\include\eigen3\eigen\src\core\densestorage.h @ 109]
0c 004ff3ac 65634e1e drakeRBM!Eigen::DenseStorage<double,36,6,6,0>::DenseStorage<double,36,6,6,0>+0x16 [c:\users\liang\dev\drake_build\install\include\eigen3\eigen\src\core\densestorage.h @ 187]
0d 004ff3b8 656262a6 drakeRBM!Eigen::PlainObjectBase<Eigen::Matrix<double,6,6,0,6,6> >::PlainObjectBase<Eigen::Matrix<double,6,6,0,6,6> >+0x1e [c:\users\liang\dev\drake_build\install\include\eigen3\eigen\src\core\plainobjectbase.h @ 461]
0e 004ff3c4 65863ecc drakeRBM!Eigen::Matrix<double,6,6,0,6,6>::Matrix<double,6,6,0,6,6>+0x16 [c:\users\liang\dev\drake_build\install\include\eigen3\eigen\src\core\matrix.h @ 261]
*** WARNING: Unable to verify checksum for rigid_body_tree_test.exe
0f 004ff414 00d779ee drakeRBM!RigidBody::RigidBody+0x11c [c:\users\liang\dev\drake\drake\systems\plants\rigidbody.cpp @ 12]
10 004ff440 00d771a1 rigid_body_tree_test!std::_Ref_count_obj<RigidBody>::_Ref_count_obj<RigidBody><>+0x7e [c:\program files (x86)\microsoft visual studio 14.0\vc\include\memory @ 901]
11 004ff47c 00d80f04 rigid_body_tree_test!std::make_shared<RigidBody>+0x71 [c:\program files (x86)\microsoft visual studio 14.0\vc\include\memory @ 971]
*** WARNING: Unable to verify checksum for C:\Users\liang\dev\drake_build\install\lib\gtest.dll
12 004ff4c0 65a65ec6 rigid_body_tree_test!drake::systems::plants::test::`anonymous namespace'::RigidBodyTreeTest::SetUp+0x24 [c:\users\liang\dev\drake\drake\systems\plants\test\rigid_body_tree_test.cc @ 17]
13 004ff504 65a6569d gtest!testing::internal::HandleSehExceptionsInMethodIfSupported<testing::Test,void>+0x56 [c:\users\liang\dev\drake\externals\googletest\googletest\src\gtest.cc @ 2387]
14 004ff590 65a4329b gtest!testing::internal::HandleExceptionsInMethodIfSupported<testing::Test,void>+0x6d [c:\users\liang\dev\drake\externals\googletest\googletest\src\gtest.cc @ 2438]
15 004ff5bc 65a43cdd gtest!testing::Test::Run+0x6b [c:\users\liang\dev\drake\externals\googletest\googletest\src\gtest.cc @ 2470]
16 004ff5e8 65a442ff gtest!testing::TestInfo::Run+0xdd [c:\users\liang\dev\drake\externals\googletest\googletest\src\gtest.cc @ 2660]
17 004ff614 65a4a5ec gtest!testing::TestCase::Run+0xef [c:\users\liang\dev\drake\externals\googletest\googletest\src\gtest.cc @ 2775]
18 004ff674 65a662ac gtest!testing::internal::UnitTestImpl::RunAllTests+0x2cc [c:\users\liang\dev\drake\externals\googletest\googletest\src\gtest.cc @ 4650]
19 004ff6c0 65a65ccd gtest!testing::internal::HandleSehExceptionsInMethodIfSupported<testing::internal::UnitTestImpl,bool>+0x5c [c:\users\liang\dev\drake\externals\googletest\googletest\src\gtest.cc @ 2387]
1a 004ff750 65a44971 gtest!testing::internal::HandleExceptionsInMethodIfSupported<testing::internal::UnitTestImpl,bool>+0x6d [c:\users\liang\dev\drake\externals\googletest\googletest\src\gtest.cc @ 2438]
*** WARNING: Unable to verify checksum for C:\Users\liang\dev\drake_build\install\lib\gtest_main.dll
1b 004ff79c 6536c9af gtest!testing::UnitTest::Run+0x131 [c:\users\liang\dev\drake\externals\googletest\googletest\src\gtest.cc @ 4257]
1c 004ff7a4 65362d35 gtest_main!RUN_ALL_TESTS+0xf [c:\users\liang\dev\drake\externals\googletest\googletest\include\gtest\gtest.h @ 2234]
1d 004ff7ac 00d8989e gtest_main!main+0x25 [c:\users\liang\dev\drake\externals\googletest\googletest\src\gtest_main.cc @ 38]
1e 004ff7c0 00d896ea rigid_body_tree_test!invoke_main+0x1e [f:\dd\vctools\crt\vcstartup\src\startup\exe_common.inl @ 64]
1f 004ff818 00d8957d rigid_body_tree_test!__scrt_common_main_seh+0x15a [f:\dd\vctools\crt\vcstartup\src\startup\exe_common.inl @ 255]
20 004ff820 00d898b8 rigid_body_tree_test!__scrt_common_main+0xd [f:\dd\vctools\crt\vcstartup\src\startup\exe_common.inl @ 300]
21 004ff828 770938f4 rigid_body_tree_test!mainCRTStartup+0x8 [f:\dd\vctools\crt\vcstartup\src\startup\exe_main.cpp @ 17]
22 004ff83c 77415de3 KERNEL32!BaseThreadInitThunk+0x24
23 004ff884 77415dae ntdll!__RtlUserThreadStart+0x2f
24 004ff894 00000000 ntdll!_RtlUserThreadStart+0x1b
The back trace seems to indicate there's something wrong with RigidBody.cpp line 20. Unfortunately, it already has the magic EIGEN_MAKE_ALIGNED_OPERATOR_NEW macro defined.
I'm going to dig a little further and then, if still not successful, I will try @sherm1's suggestion of applying the Eigen 3.3 bug fix.
I tried adding EIGEN_MAKE_ALIGNED_OPERATOR_NEW to class RigidBodyTreeTest and removing the #ifndef SWIG preprocessor guard that could potentially prevent the inclusion of EIGEN_MAKE_ALIGNED_OPERATOR_NEW from RigidBodyTree.h, but neither fixed the problem.
Thus, I shall try that Eigen 3.3 bugfix now.
@sherm1, when doing an out-of-source build of Drake on Windows, how do you modify Eigen's source code? I'm seeing two copies of Eigen in my build-artifacts directory:
I will for now try to modify both.
I believe you want to hack the version in externals/ but I'm not certain.
I hacked both versions but unfortunately the error persists. I'm now trying to find the .dll or object file that Drake is linking against so I can delete it and be absolutely certain that my hacks are being compiled. Do you happen to know where Eigen's library is located when compiling out of source?
So far, I've searched in the following directories but cannot find Eigen's compiled shared library:
I notice that the latest unstable release of Eigen contains the patch we want to test. I'm currently modifying Drake's build system to make use of it. Hopefully this'll answer the question of whether Eigen 3.3 will resolve this problem.
Unfortunately, switching to the latest unstable version of Eigen did not solve the problem. I verified the patch was applied in the code in both install/include/eigen3/ and externals/eigen/eigen-prefix/src/eigen/. At this point, I will go back to learning how to interpret the back trace output of windbg.
Oh well. Thanks for trying that, Liang. :cry:
Using the tried-and-true debugging technique of commenting out code until the problem goes away and then slowly adding code back until the problem returns, I learned the following:

- When using shared_ptr<RigidBody> or shared_ptr<RigidBodyTree>, care must be taken when initializing it.
- When using std::make_shared, a default allocator is used. This is a problem if the class is supposed to use a non-default allocator like Eigen's 16-byte aligned one.

From the findings above, I think it is currently best practice _not_ to use std::make_shared with RigidBody and RigidBodyTree.
Instead of using:
foo = std::make_shared<RigidBody>();
do this:
foo = std::shared_ptr<RigidBody>(new RigidBody());
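Why the two forms differ can be shown without Eigen. The following is only a minimal sketch: `Tracked` is a hypothetical stand-in for a class such as RigidBody, and its counting `operator new` is an illustrative stub, not what EIGEN_MAKE_ALIGNED_OPERATOR_NEW actually expands to. It shows that std::make_shared never calls the class's own `operator new`, while a plain `new` expression does.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <memory>
#include <new>

// Hypothetical stand-in for a class that overloads operator new
// (as EIGEN_MAKE_ALIGNED_OPERATOR_NEW does for Eigen-holding classes).
struct Tracked {
  static int custom_new_calls;

  // Counting stub: records every call to the class-specific operator new.
  static void* operator new(std::size_t size) {
    ++custom_new_calls;
    void* p = std::malloc(size);
    if (p == nullptr) throw std::bad_alloc();
    return p;
  }
  static void operator delete(void* p) noexcept { std::free(p); }

  double data[36];  // same size as a 6x6 matrix of doubles
};
int Tracked::custom_new_calls = 0;

// make_shared allocates object and control block together through
// std::allocator, so the class-specific operator new is bypassed.
int CallsViaMakeShared() {
  Tracked::custom_new_calls = 0;
  auto p = std::make_shared<Tracked>();
  (void)p;
  return Tracked::custom_new_calls;
}

// A new expression looks up operator new in the class first.
int CallsViaNew() {
  Tracked::custom_new_calls = 0;
  std::shared_ptr<Tracked> p(new Tracked());
  (void)p;
  return Tracked::custom_new_calls;
}

void CheckOperatorNewUsage() {
  assert(CallsViaMakeShared() == 0);
  assert(CallsViaNew() == 1);
}
```

Under these assumptions, any alignment guarantee that lives in the class's `operator new` is lost with std::make_shared, which matches the behavior seen in the 32-bit Debug build.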
I shall submit a PR to close this issue shortly.
Hi, Liang. This problem has come up before in Drake -- the formal cure is std::allocate_shared which I believe exists because std::make_shared doesn't make use of a custom allocator when there is one.
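For reference, a rough sketch of the std::allocate_shared approach follows. It makes several assumptions: `Body` and `AlignedAllocator` are hypothetical names made up for illustration, and the allocator relies on C++17 aligned `operator new`, which was not available to Drake at the time. With Eigen itself, the allocator one would normally pass is `Eigen::aligned_allocator<T>`.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <memory>
#include <new>

// Hypothetical stand-in for a class holding a fixed-size Eigen matrix
// that requires 16-byte alignment.
struct alignas(16) Body {
  double inertia[36];
};

// Illustrative minimal allocator that honors alignof(T).
// Assumes C++17 (aligned operator new / delete).
template <typename T>
struct AlignedAllocator {
  using value_type = T;

  AlignedAllocator() = default;
  template <typename U>
  AlignedAllocator(const AlignedAllocator<U>&) noexcept {}

  T* allocate(std::size_t n) {
    return static_cast<T*>(
        ::operator new(n * sizeof(T), std::align_val_t(alignof(T))));
  }
  void deallocate(T* p, std::size_t) noexcept {
    ::operator delete(p, std::align_val_t(alignof(T)));
  }
};

template <typename T, typename U>
bool operator==(const AlignedAllocator<T>&, const AlignedAllocator<U>&) noexcept {
  return true;
}
template <typename T, typename U>
bool operator!=(const AlignedAllocator<T>&, const AlignedAllocator<U>&) noexcept {
  return false;
}

// allocate_shared rebinds the allocator to its internal control-block type,
// which embeds the Body, so the whole block is allocated with Body's alignment.
std::shared_ptr<Body> MakeAlignedBody() {
  return std::allocate_shared<Body>(AlignedAllocator<Body>());
}

bool IsAligned16(const void* p) {
  return reinterpret_cast<std::uintptr_t>(p) % 16 == 0;
}
```

Unlike `std::shared_ptr<T>(new T())`, this keeps the single combined allocation that std::make_shared provides. On 64-bit platforms the default `operator new` typically already returns 16-byte aligned memory, which would be consistent with the failure showing up only in the 32-bit build.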
Related issue: #1854