The goal here is to get the entire Julia test suite working under Mozilla's "record and replay framework" (otherwise known as rr). This should help people debug intermittent bugs that only show up occasionally when running the test suite.
Currently, all tests pass except for a few profile and spawn tests.
profile.jl tests fail, rr: timer_create, timer_settime, and timer_delete; [rr.8416] Warning: task 16746 (process 16746) dying from fatal signal SIGUSR2. (yyc: passes on linux x64 with rr 4.1.0.r111.gef7681a)
spawn.jl: the line @unix_only @test (run(pipe(`yes`, `head`, DevNull)); true) leads to a process dying from fatal signal SIGPIPE. The tests then seem to hang there.
spawn.jl: the line @test_throws Base.UVError run(`foo_is_not_a_valid_command`) results in an ErrorException instead of UVError.
spawn.jl: the four lines beginning with readall(setenv( lead to an assertion failure within rr itself. This (presumably) is because rr expects LD_PRELOAD to be set in all subprocesses, but the environment has been cleared by the call to setenv.
exec failure is not correctly reported. (Reported upstream as https://github.com/mozilla/rr/issues/1741)
For many (if not all) of these, the best path forward is simply to disable these tests when running under rr.
@rocallahan, is there a good way for a program to detect whether it is running under rr, and behave slightly differently if so? At the moment the best "hack" I can think of is to test whether LD_PRELOAD contains the string rrpreload.
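For illustration, the LD_PRELOAD "hack" might look roughly like the sketch below. It is written for current Julia (1.x) rather than the version in use in this thread, and `running_under_rr` is a hypothetical helper name, not something Julia or rr provides. It is only a heuristic: it assumes rr's preload library is injected through LD_PRELOAD under a name containing rrpreload.

```julia
# Heuristic sketch: guess whether this process runs under rr by checking
# LD_PRELOAD for the substring "rrpreload".
# `running_under_rr` is a hypothetical helper, not part of Julia or rr.
function running_under_rr(env = ENV)
    preload = get(env, "LD_PRELOAD", "")
    return occursin("rrpreload", preload)
end

# Example use: skip rr-sensitive tests when the heuristic fires.
if running_under_rr()
    println("rr detected via LD_PRELOAD; skipping rr-sensitive tests")
else
    println("rr not detected; running all tests")
end
```

Passing the environment as an argument keeps the check easy to exercise with a plain Dict. Note that a subprocess whose environment has been cleared would not see LD_PRELOAD at all, so the check would report false there.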
Yeah, we don't have an officially supported way to do that right now. One approach that we should probably make officially supported is to write data to RR_MAGIC_SAVE_DATA_FD (999) (see rr/rr.h); under rr, that works, but otherwise it normally wouldn't work.
Please file rr issues on the bugs mentioned above; we should fix them. The timer syscalls would be easy to support. For the others, I'd need some debugging help.
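A rough sketch of the fd-based probe suggested above, again in current Julia. The value 999 is taken from the comment above; `fd_accepts_write` and `probably_under_rr` are hypothetical names. This is not an officially supported interface, it assumes a POSIX system, and it would give a false positive (and write a stray byte) if the program happened to have an unrelated file open on descriptor 999.

```julia
# Sketch of the fd-based probe. The value 999 for RR_MAGIC_SAVE_DATA_FD is
# taken from the comment above (see rr/rr.h). `fd_accepts_write` and
# `probably_under_rr` are hypothetical helper names.
const RR_MAGIC_SAVE_DATA_FD = 999

# Try to write one byte to `fd` using the C library's write(2).
# Returns true if the byte was written, false otherwise (for example when
# the descriptor is not open and write fails with EBADF).
function fd_accepts_write(fd::Integer)
    buf = UInt8[0x00]
    n = ccall(:write, Cssize_t, (Cint, Ptr{UInt8}, Csize_t), fd, buf, length(buf))
    return n == length(buf)
end

# Assumption: outside rr nothing is open on fd 999, so the write fails.
probably_under_rr() = fd_accepts_write(RR_MAGIC_SAVE_DATA_FD)
```

Compared with inspecting LD_PRELOAD, this does not depend on the environment, so it should keep working in a child process whose environment was replaced.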
I added timer_ support in https://github.com/mozilla/rr/commit/a32f60fdaaf3b3c558989005ae33b4c13a835793.
This is exciting progress!
The MIT cloud guys are making sure setting the proper KVM options for this doesn't break anything, and if it turns out okay, then they're going to set it up early next week. I can't guarantee it will work, but I think we've got a good shot at getting the buildbots rr-capable.
Alright, we've got rr running on our 32 and 64-bit Ubuntu 14.04 buildbots now. What is the desired interface here? An rr record output posted for every commit? A web-accessible button that you can push to run rr record on one of the buildbots? It's not outside of the realm of possibility that we could get pretty close to the "have an ssh session open and waiting", but I don't think we'd be able to do it for every commit. :P
Pinging @carnaval because it sounded like he was interested in this.
The whole test suite now passes under rr!
Woohoo! I'm guessing the bootstrap process works now too under rr?
IIUC the bootstrap has always been working.