Servo: "Too many open files" crash after loading many small test cases via webdriver

Created on 1 Aug 2019  ·  137 Comments  ·  Source: servo/servo

Built from git: 7adf022 on 20190724. Running on Ubuntu 16.04.2 LTS with Radeon HD 7770.

I have a test system (for my PhD dissertation) that loads a bunch of smallish test cases repeatedly via webdriver on multiple browsers (currently Chrome, Firefox and Servo). I can trigger a "Too many open files" panic in Servo after enough test cases have been loaded. I'm pretty sure I can reproduce this fairly easily by running long enough (I believe this is the third time I've hit it today). I'm including a crash dump below, but is there anything additional I can do that would be particularly helpful for debugging this?

Here is an example that crashed after the 587th test case was loaded:

servo-20190724.git $ ./mach run --release -z --webdriver=7002 --resolution=400x300
VMware, Inc.
softpipe
3.3 (Core Profile) Mesa 18.3.0-devel
[2019-08-01T03:03:00Z ERROR script::dom::bindings::error] Error at http://127.0.0.1:3000/gen/19183-2-34/3.html:1:2 fields are not currently supported
[2019-08-01T03:03:03Z ERROR script::dom::bindings::error] Error at http://127.0.0.1:3000/gen/19183-2-34/4.html:1:2 fields are not currently supported
[2019-08-01T03:03:07Z ERROR script::dom::bindings::error] Error at http://127.0.0.1:3000/gen/19183-2-34/5.html:1:2 fields are not currently supported
[2019-08-01T03:03:17Z ERROR script::dom::bindings::error] Error at http://127.0.0.1:3000/gen/19183-2-34/8.html:1:2 fields are not currently supported
[2019-08-01T03:03:20Z ERROR script::dom::bindings::error] Error at http://127.0.0.1:3000/gen/19183-2-34/9.html:1:2 fields are not currently supported
[2019-08-01T03:03:33Z ERROR script::dom::bindings::error] Error at http://127.0.0.1:3000/gen/19183-2-34/13.html:1:2 fields are not currently supported
[2019-08-01T03:03:43Z ERROR script::dom::bindings::error] Error at http://127.0.0.1:3000/gen/19183-2-34/16.html:1:2 fields are not currently supported
[2019-08-01T03:03:52Z ERROR script::dom::bindings::error] Error at http://127.0.0.1:3000/gen/19183-2-34/19.html:1:2 fields are not currently supported
[2019-08-01T03:04:02Z ERROR script::dom::bindings::error] Error at http://127.0.0.1:3000/gen/19183-2-34/22.html:1:2 fields are not currently supported
Could not stop player Backend("Missing dependency: playbin")
called `Result::unwrap()` on an `Err` value: Os { code: 24, kind: Other, message: "Too many open files" } (thread ScriptThread PipelineId { namespace_id: PipelineNamespaceId(1), index: PipelineIndex(2) }, at src/libcore/result.rs:1051)
stack backtrace:
   0: servo::main::{{closure}}::h1a74bd812d203c72 (0x55f41009633f)
   1: std::panicking::rust_panic_with_hook::h0529069ab88f357a (0x55f413330105)
             at src/libstd/panicking.rs:481
   2: std::panicking::continue_panic_fmt::h6a820a3cd2914e74 (0x55f41332fba1)
             at src/libstd/panicking.rs:384
   3: rust_begin_unwind (0x55f41332fa85)
             at src/libstd/panicking.rs:311
   4: core::panicking::panic_fmt::he00cfaca5555542a (0x55f4133527ec)
             at src/libcore/panicking.rs:85
   5: core::result::unwrap_failed::h4239b9d80132a0db (0x55f4133528e6)
             at src/libcore/result.rs:1051
   6: script::stylesheet_loader::StylesheetLoader::load::he3c6d2f4f6b2d86d (0x55f410cc94b4)
   7: script::dom::htmllinkelement::HTMLLinkElement::handle_stylesheet_url::h08d096fd325f038f (0x55f410da0119)
   8: <script::dom::htmllinkelement::HTMLLinkElement as script::dom::virtualmethods::VirtualMethods>::bind_to_tree::h4b49aadc206a8bb4 (0x55f410d9f9d5)
   9: script::dom::node::Node::insert::h5f83017d8cd2b0c4 (0x55f410e5f494)
  10: script::dom::node::Node::pre_insert::hc92be38a4a5c65ff (0x55f410e5e7d2)
  11: script::dom::servoparser::insert::hc2bdfe4c53d2659e (0x55f410c611de)
  12: html5ever::tree_builder::TreeBuilder<Handle,Sink>::insert_element::h483a7708bc71a57c (0x55f410fe59cd)
  13: html5ever::tree_builder::TreeBuilder<Handle,Sink>::step::h53264c73d52d73b7 (0x55f410ffaed7)
  14: <html5ever::tree_builder::TreeBuilder<Handle,Sink> as html5ever::tokenizer::interface::TokenSink>::process_token::h8d8307514ab0ae38 (0x55f410fb0653)
  15: _ZN9html5ever9tokenizer21Tokenizer$LT$Sink$GT$13process_token17h75019972a696a3eeE.llvm.7363365923994120909 (0x55f41062d408)
  16: html5ever::tokenizer::Tokenizer<Sink>::emit_current_tag::h45460aa41ef12f15 (0x55f41062daf1)
  17: html5ever::tokenizer::Tokenizer<Sink>::step::h866e9b63f9f503ff (0x55f41063dcf1)
  18: _ZN9html5ever9tokenizer21Tokenizer$LT$Sink$GT$3run17h8ce04a69e91aa671E.llvm.7363365923994120909 (0x55f41063194a)
  19: script::dom::servoparser::html::Tokenizer::feed::h9fdf5468e43321e7 (0x55f41065532a)
  20: script::dom::servoparser::ServoParser::do_parse_sync::hed9498713ee0e8aa (0x55f410c5cb3e)
  21: profile_traits::time::profile::h4e1d5a89a249ef96 (0x55f410e1b2c6)
  22: _ZN6script3dom11servoparser11ServoParser10parse_sync17h88e07764d212f834E.llvm.530561247764639767 (0x55f410c5c72f)
  23: <script::dom::servoparser::ParserContext as net_traits::FetchResponseListener>::process_response_chunk::h61914a25d0137f92 (0x55f410c60885)
  24: script::script_thread::ScriptThread::handle_msg_from_constellation::h600baf3383c41b0e (0x55f410caa068)
  25: _ZN6script13script_thread12ScriptThread11handle_msgs17haf4ee7b3baaa2b69E.llvm.530561247764639767 (0x55f410ca488d)
  26: profile_traits::mem::ProfilerChan::run_with_memory_reporting::hce1ba89d5eb53e5e (0x55f410e1a1e7)
  27: std::sys_common::backtrace::__rust_begin_short_backtrace::hc93943e85aa6fa26 (0x55f411198e52)
  28: _ZN3std9panicking3try7do_call17hb93b842dfb1f8343E.llvm.11073267061158468562 (0x55f411245103)
  29: __rust_maybe_catch_panic (0x55f413339db9)
             at src/libpanic_unwind/lib.rs:82
  30: core::ops::function::FnOnce::call_once{{vtable.shim}}::h7217876d16e085d4 (0x55f410a082b2)
  31: <alloc::boxed::Box<F> as core::ops::function::FnOnce<A>>::call_once::h10faab75e4737451 (0x55f41331e8ee)
             at /rustc/273f42b5964c29dda2c5a349dd4655529767b07f/src/liballoc/boxed.rs:766
  32: <alloc::boxed::Box<F> as core::ops::function::FnOnce<A>>::call_once::h1aad66e3be56d28c (0x55f4133390df)
             at /rustc/273f42b5964c29dda2c5a349dd4655529767b07f/src/liballoc/boxed.rs:766
      std::sys_common::thread::start_thread::h9a8131af389e9a10
             at src/libstd/sys_common/thread.rs:13
      std::sys::unix::thread::Thread::new::thread_start::h5c5575a5d923977b
             at src/libstd/sys/unix/thread.rs:79
  33: start_thread (0x7fab3e4cc6b9)
  34: clone (0x7fab3cd6841c)
  35: <unknown> (0x0)
[2019-08-01T04:24:22Z ERROR servo] called `Result::unwrap()` on an `Err` value: Os { code: 24, kind: Other, message: "Too many open files" }
Pipeline script content process shutdown chan: Os { code: 24, kind: Other, message: "Too many open files" } (thread Constellation, at src/libcore/result.rs:1051)
stack backtrace:
   0: servo::main::{{closure}}::h1a74bd812d203c72 (0x55f41009633f)
   1: std::panicking::rust_panic_with_hook::h0529069ab88f357a (0x55f413330105)
             at src/libstd/panicking.rs:481
   2: std::panicking::continue_panic_fmt::h6a820a3cd2914e74 (0x55f41332fba1)
             at src/libstd/panicking.rs:384
   3: rust_begin_unwind (0x55f41332fa85)
             at src/libstd/panicking.rs:311
   4: core::panicking::panic_fmt::he00cfaca5555542a (0x55f4133527ec)
             at src/libcore/panicking.rs:85
   5: core::result::unwrap_failed::h4239b9d80132a0db (0x55f4133528e6)
             at src/libcore/result.rs:1051
   6: constellation::pipeline::Pipeline::spawn::hc4537e426638cac3 (0x55f4101814af)
   7: constellation::constellation::Constellation<Message,LTF,STF>::new_pipeline::h03e845e5b2cd3635 (0x55f410196bdf)
   8: constellation::constellation::Constellation<Message,LTF,STF>::handle_panic::h9c59071776586bee (0x55f41019574b)
   9: constellation::constellation::Constellation<Message,LTF,STF>::handle_log_entry::hff41c6262b0a65ca (0x55f410198de1)
  10: constellation::constellation::Constellation<Message,LTF,STF>::handle_request_from_compositor::he6733721600a19c9 (0x55f4101ae38d)
  11: constellation::constellation::Constellation<Message,LTF,STF>::run::hc89c0bcdfed723c1 (0x55f4101b64d9)
  12: std::sys_common::backtrace::__rust_begin_short_backtrace::he8170c3b0c1833bc (0x55f41020014f)
  13: _ZN3std9panicking3try7do_call17h6580004dd9fdb22cE.llvm.4941401487006519239 (0x55f4100db513)
  14: __rust_maybe_catch_panic (0x55f413339db9)
             at src/libpanic_unwind/lib.rs:82
  15: core::ops::function::FnOnce::call_once{{vtable.shim}}::hd78ccb91aec8cf3a (0x55f4100efe12)
  16: <alloc::boxed::Box<F> as core::ops::function::FnOnce<A>>::call_once::h10faab75e4737451 (0x55f41331e8ee)
             at /rustc/273f42b5964c29dda2c5a349dd4655529767b07f/src/liballoc/boxed.rs:766
  17: <alloc::boxed::Box<F> as core::ops::function::FnOnce<A>>::call_once::h1aad66e3be56d28c (0x55f4133390df)
             at /rustc/273f42b5964c29dda2c5a349dd4655529767b07f/src/liballoc/boxed.rs:766
      std::sys_common::thread::start_thread::h9a8131af389e9a10
             at src/libstd/sys_common/thread.rs:13
      std::sys::unix::thread::Thread::new::thread_start::h5c5575a5d923977b
             at src/libstd/sys/unix/thread.rs:79
  18: start_thread (0x7fab3e4cc6b9)
  19: clone (0x7fab3cd6841c)
  20: <unknown> (0x0)
[2019-08-01T04:24:23Z ERROR servo] Pipeline script content process shutdown chan: Os { code: 24, kind: Other, message: "Too many open files" }
called `Result::unwrap()` on an `Err` value: RecvError (thread LayoutThread PipelineId { namespace_id: PipelineNamespaceId(1), index: PipelineIndex(695) }, at src/libcore/result.rs:1051)
stack backtrace:
   0: servo::main::{{closure}}::h1a74bd812d203c72 (0x55f41009633f)
   1: std::panicking::rust_panic_with_hook::h0529069ab88f357a (0x55f413330105)
             at src/libstd/panicking.rs:481
   2: std::panicking::continue_panic_fmt::h6a820a3cd2914e74 (0x55f41332fba1)
             at src/libstd/panicking.rs:384
   3: rust_begin_unwind (0x55f41332fa85)
             at src/libstd/panicking.rs:311
   4: core::panicking::panic_fmt::he00cfaca5555542a (0x55f4133527ec)
             at src/libcore/panicking.rs:85
   5: core::result::unwrap_failed::h4239b9d80132a0db (0x55f4133528e6)
             at src/libcore/result.rs:1051
   6: layout_thread::LayoutThread::start::hb675360933e398d4 (0x55f41047e063)
   7: profile_traits::mem::ProfilerChan::run_with_memory_reporting::h8a534464fbe2083e (0x55f4104f3d23)
   8: std::sys_common::backtrace::__rust_begin_short_backtrace::hab512255cc628b7e (0x55f4105a9144)
   9: _ZN3std9panicking3try7do_call17h15789f162d15d069E.llvm.197603739779113467 (0x55f410513fd3)
  10: __rust_maybe_catch_panic (0x55f413339db9)
             at src/libpanic_unwind/lib.rs:82
  11: core::ops::function::FnOnce::call_once{{vtable.shim}}::h1b24f55261127c94 (0x55f41053f242)
  12: <alloc::boxed::Box<F> as core::ops::function::FnOnce<A>>::call_once::h10faab75e4737451 (0x55f41331e8ee)
             at /rustc/273f42b5964c29dda2c5a349dd4655529767b07f/src/liballoc/boxed.rs:766
  13: <alloc::boxed::Box<F> as core::ops::function::FnOnce<A>>::call_once::h1aad66e3be56d28c (0x55f4133390df)
             at /rustc/273f42b5964c29dda2c5a349dd4655529767b07f/src/liballoc/boxed.rs:766
      std::sys_common::thread::start_thread::h9a8131af389e9a10
             at src/libstd/sys_common/thread.rs:13
      std::sys::unix::thread::Thread::new::thread_start::h5c5575a5d923977b
             at src/libstd/sys/unix/thread.rs:79
  14: start_thread (0x7fab3e4cc6b9)
  15: clone (0x7fab3cd6841c)
  16: <unknown> (0x0)
[2019-08-01T04:24:23Z ERROR servo] called `Result::unwrap()` on an `Err` value: RecvError
called `Option::unwrap()` on a `None` value (thread ScriptThread PipelineId { namespace_id: PipelineNamespaceId(1), index: PipelineIndex(2) }, at src/libcore/option.rs:378)
stack backtrace:
   0: servo::main::{{closure}}::h1a74bd812d203c72 (0x55f41009633f)
   1: std::panicking::rust_panic_with_hook::h0529069ab88f357a (0x55f413330105)
             at src/libstd/panicking.rs:481
   2: std::panicking::continue_panic_fmt::h6a820a3cd2914e74 (0x55f41332fba1)
             at src/libstd/panicking.rs:384
   3: rust_begin_unwind (0x55f41332fa85)
             at src/libstd/panicking.rs:311
   4: core::panicking::panic_fmt::he00cfaca5555542a (0x55f4133527ec)
             at src/libcore/panicking.rs:85
   5: core::panicking::panic::h73f4a74a29ff704a (0x55f41335272b)
             at src/libcore/panicking.rs:49
   6: <script::dom::window::Window as script::dom::bindings::codegen::Bindings::WindowBinding::WindowBinding::WindowMethods>::Frames::h70e9762199220474 (0x55f410ab3346)
   7: <script::dom::htmlmediaelement::HTMLMediaElement as core::ops::drop::Drop>::drop::h82f851c26fc8d1fe (0x55f4107912f6)
   8: _ZN4core3ptr18real_drop_in_place17hb15edba9f2627c0eE.llvm.9711593096204109679 (0x55f410a27005)
   9: _ZN3std9panicking3try7do_call17hf7bcf8d4fe880bf2E.llvm.11073267061158468562 (0x55f411258803)
  10: __rust_maybe_catch_panic (0x55f413339db9)
             at src/libpanic_unwind/lib.rs:82
  11: script::dom::bindings::codegen::Bindings::HTMLAudioElementBinding::HTMLAudioElementBinding::_finalize::h510de5481af6c7d4 (0x55f410e9bcc0)
  12: _ZNK2js5Class10doFinalizeEPNS_6FreeOpEP8JSObject (0x55f411880a25)
             at /data/joelm/personal/UTA/dissertation/servo/servo-20190724.git/target/release/build/mozjs_sys-0c8fde13422c462f/out/dist/include/js/Class.h:872
      _ZN8JSObject8finalizeEPN2js6FreeOpE
             at /home/joelm/.cargo/git/checkouts/mozjs-fa11ffc7d4f1cc2d/b2f8393/mozjs/js/src/vm/JSObject-inl.h:83
      _ZN2js2gc5Arena8finalizeI8JSObjectEEmPNS_6FreeOpENS0_9AllocKindEm
             at /home/joelm/.cargo/git/checkouts/mozjs-fa11ffc7d4f1cc2d/b2f8393/mozjs/js/src/gc/GC.cpp:591
  13: _ZL19FinalizeTypedArenasI8JSObjectEbPN2js6FreeOpEPPNS1_2gc5ArenaERNS4_15SortedArenaListENS4_9AllocKindERNS1_11SliceBudgetENS4_10ArenaLists14KeepArenasEnumE (0x55f411880872)
             at /home/joelm/.cargo/git/checkouts/mozjs-fa11ffc7d4f1cc2d/b2f8393/mozjs/js/src/gc/GC.cpp:651
  14: _ZL14FinalizeArenasPN2js6FreeOpEPPNS_2gc5ArenaERNS2_15SortedArenaListENS2_9AllocKindERNS_11SliceBudgetENS2_10ArenaLists14KeepArenasEnumE (0x55f411861c3b)
             at /home/joelm/.cargo/git/checkouts/mozjs-fa11ffc7d4f1cc2d/b2f8393/mozjs/js/src/gc/GC.cpp:683
  15: _ZN2js2gc10ArenaLists18foregroundFinalizeEPNS_6FreeOpENS0_9AllocKindERNS_11SliceBudgetERNS0_15SortedArenaListE (0x55f41186e169)
             at /home/joelm/.cargo/git/checkouts/mozjs-fa11ffc7d4f1cc2d/b2f8393/mozjs/js/src/gc/GC.cpp:5820
  16: _ZN2js2gc9GCRuntime17finalizeAllocKindEPNS_6FreeOpERNS_11SliceBudgetEPN2JS4ZoneENS0_9AllocKindE (0x55f41186f0fa)
             at /home/joelm/.cargo/git/checkouts/mozjs-fa11ffc7d4f1cc2d/b2f8393/mozjs/js/src/gc/GC.cpp:6110
  17: _ZN11sweepaction18SweepActionForEachI13ContainerIterIN7mozilla7EnumSetIN2js2gc9AllocKindEjEEES7_JPNS5_9GCRuntimeEPNS4_6FreeOpERNS4_11SliceBudgetEPN2JS4ZoneEEE3runESA_SC_SE_SH_ (0x55f4118891ab)
             at /home/joelm/.cargo/git/checkouts/mozjs-fa11ffc7d4f1cc2d/b2f8393/mozjs/js/src/gc/GC.cpp:6327
  18: _ZN11sweepaction19SweepActionSequenceIJPN2js2gc9GCRuntimeEPNS1_6FreeOpERNS1_11SliceBudgetEPN2JS4ZoneEEE3runES4_S6_S8_SB_ (0x55f4118893d4)
             at /home/joelm/.cargo/git/checkouts/mozjs-fa11ffc7d4f1cc2d/b2f8393/mozjs/js/src/gc/GC.cpp:6296
  19: _ZN11sweepaction18SweepActionForEachIN2js2gc19SweepGroupZonesIterEP9JSRuntimeJPNS2_9GCRuntimeEPNS1_6FreeOpERNS1_11SliceBudgetEEE3runES7_S9_SB_ (0x55f4118898b1)
             at /home/joelm/.cargo/git/checkouts/mozjs-fa11ffc7d4f1cc2d/b2f8393/mozjs/js/src/gc/GC.cpp:6327
  20: _ZN11sweepaction19SweepActionSequenceIJPN2js2gc9GCRuntimeEPNS1_6FreeOpERNS1_11SliceBudgetEEE3runES4_S6_S8_ (0x55f411889af1)
             at /home/joelm/.cargo/git/checkouts/mozjs-fa11ffc7d4f1cc2d/b2f8393/mozjs/js/src/gc/GC.cpp:6296
  21: _ZN11sweepaction20SweepActionRepeatForIN2js2gc15SweepGroupsIterEP9JSRuntimeJPNS2_9GCRuntimeEPNS1_6FreeOpERNS1_11SliceBudgetEEE3runES7_S9_SB_ (0x55f411889fc5)
             at /home/joelm/.cargo/git/checkouts/mozjs-fa11ffc7d4f1cc2d/b2f8393/mozjs/js/src/gc/GC.cpp:6356
  22: _ZN2js2gc9GCRuntime19performSweepActionsERNS_11SliceBudgetE (0x55f41186f5dd)
             at /home/joelm/.cargo/git/checkouts/mozjs-fa11ffc7d4f1cc2d/b2f8393/mozjs/js/src/gc/GC.cpp:6528
  23: _ZN2js2gc9GCRuntime16incrementalSliceERNS_11SliceBudgetEN2JS8GCReasonERNS0_13AutoGCSessionE (0x55f411871550)
             at /home/joelm/.cargo/git/checkouts/mozjs-fa11ffc7d4f1cc2d/b2f8393/mozjs/js/src/gc/GC.cpp:7049
  24: _ZN2js2gc9GCRuntime7gcCycleEbNS_11SliceBudgetEN2JS8GCReasonE (0x55f4118727d8)
             at /home/joelm/.cargo/git/checkouts/mozjs-fa11ffc7d4f1cc2d/b2f8393/mozjs/js/src/gc/GC.cpp:7398
  25: _ZN2js2gc9GCRuntime7collectEbNS_11SliceBudgetEN2JS8GCReasonE (0x55f41187335f)
             at /home/joelm/.cargo/git/checkouts/mozjs-fa11ffc7d4f1cc2d/b2f8393/mozjs/js/src/gc/GC.cpp:7569
  26: _ZN2js2gc9GCRuntime2gcE18JSGCInvocationKindN2JS8GCReasonE (0x55f41185640c)
             at /home/joelm/.cargo/git/checkouts/mozjs-fa11ffc7d4f1cc2d/b2f8393/mozjs/js/src/gc/GC.cpp:7657
  27: _ZN9JSRuntime14destroyRuntimeEv (0x55f41162eaa3)
             at /home/joelm/.cargo/git/checkouts/mozjs-fa11ffc7d4f1cc2d/b2f8393/mozjs/js/src/vm/Runtime.cpp:284
  28: _ZN2js14DestroyContextEP9JSContext (0x55f4115c25d1)
             at /home/joelm/.cargo/git/checkouts/mozjs-fa11ffc7d4f1cc2d/b2f8393/mozjs/js/src/vm/JSContext.cpp:197
  29: <mozjs::rust::Runtime as core::ops::drop::Drop>::drop::hed132fc3d12a947c (0x55f4114bfe75)
  30: _ZN4core3ptr18real_drop_in_place17h22626fd093207c2fE.llvm.9711593096204109679 (0x55f410a0f6dc)
  31: <alloc::rc::Rc<T> as core::ops::drop::Drop>::drop::h025c8fd65330ce43 (0x55f410a3c0e9)
  32: core::ptr::real_drop_in_place::hfb055ca4c92ce1d8 (0x55f4111a72b4)
  33: std::sys_common::backtrace::__rust_begin_short_backtrace::hc93943e85aa6fa26 (0x55f41119907d)
  34: _ZN3std9panicking3try7do_call17hb93b842dfb1f8343E.llvm.11073267061158468562 (0x55f411245103)
  35: __rust_maybe_catch_panic (0x55f413339db9)
             at src/libpanic_unwind/lib.rs:82
  36: core::ops::function::FnOnce::call_once{{vtable.shim}}::h7217876d16e085d4 (0x55f410a082b2)
  37: <alloc::boxed::Box<F> as core::ops::function::FnOnce<A>>::call_once::h10faab75e4737451 (0x55f41331e8ee)
             at /rustc/273f42b5964c29dda2c5a349dd4655529767b07f/src/liballoc/boxed.rs:766
  38: <alloc::boxed::Box<F> as core::ops::function::FnOnce<A>>::call_once::h1aad66e3be56d28c (0x55f4133390df)
             at /rustc/273f42b5964c29dda2c5a349dd4655529767b07f/src/liballoc/boxed.rs:766
      std::sys_common::thread::start_thread::h9a8131af389e9a10
             at src/libstd/sys_common/thread.rs:13
      std::sys::unix::thread::Thread::new::thread_start::h5c5575a5d923977b
             at src/libstd/sys/unix/thread.rs:79
  39: start_thread (0x7fab3e4cc6b9)
  40: clone (0x7fab3cd6841c)
  41: <unknown> (0x0)
[2019-08-01T04:24:23Z ERROR servo] called `Option::unwrap()` on a `None` value
thread panicked while panicking. aborting.
Servo exited with return value -4
A-webdriver I-panic

Most helpful comment

Things are MUCH improved with this build!

I did a test run of about 2000 loads with no crash/panic. There was still a slight increase in fds over time, but it's very slow and certainly not happening with every page load. The fds started at about 220 and increased to 263 by the 2000th load. The number of fds wobbles up and down some (perhaps an artifact of how I'm counting them via /proc), but on average it increases slowly.
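(For reference, here is a minimal sketch of one way such counts can be sampled on Linux, by listing /proc/<pid>/fd. The pid below is a hypothetical placeholder, and this may not be exactly how the numbers above were gathered; note that read_dir itself briefly opens a descriptor, which could explain part of the wobble.)

use std::fs;

// Count the open file descriptors of a process by listing /proc/<pid>/fd (Linux only).
fn count_fds(pid: u32) -> std::io::Result<usize> {
    Ok(fs::read_dir(format!("/proc/{}/fd", pid))?.count())
}

fn main() -> std::io::Result<()> {
    let pid = 12345; // hypothetical: the pid of the running servo process
    println!("open fds: {}", count_fds(pid)?);
    Ok(())
}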

Here are the first few fd counts and then every 100 page loads after that:

Tue Sep  3 12:42:46 CDT 2019: Run 0, (fds: 220)
Tue Sep  3 12:42:47 CDT 2019: Run 1, (fds: 209)
Tue Sep  3 12:42:47 CDT 2019: Run 2, (fds: 212)
Tue Sep  3 12:42:47 CDT 2019: Run 3, (fds: 214)
Tue Sep  3 12:42:47 CDT 2019: Run 4, (fds: 214)
Tue Sep  3 12:42:47 CDT 2019: Run 5, (fds: 214)
Tue Sep  3 12:42:48 CDT 2019: Run 6, (fds: 214)
Tue Sep  3 12:42:49 CDT 2019: Run 7, (fds: 212)
Tue Sep  3 12:42:49 CDT 2019: Run 8, (fds: 214)
Tue Sep  3 12:42:49 CDT 2019: Run 9, (fds: 214)
Tue Sep  3 12:42:49 CDT 2019: Run 10, (fds: 214)
...
Tue Sep  3 12:43:06 CDT 2019: Run 100, (fds: 220)
Tue Sep  3 12:43:17 CDT 2019: Run 200, (fds: 223)
Tue Sep  3 12:43:27 CDT 2019: Run 300, (fds: 226)
Tue Sep  3 12:43:37 CDT 2019: Run 400, (fds: 232)
Tue Sep  3 12:43:47 CDT 2019: Run 500, (fds: 235)
Tue Sep  3 12:43:57 CDT 2019: Run 600, (fds: 241)
Tue Sep  3 12:44:06 CDT 2019: Run 700, (fds: 238)
Tue Sep  3 12:44:16 CDT 2019: Run 800, (fds: 238)
Tue Sep  3 12:44:25 CDT 2019: Run 900, (fds: 241)
Tue Sep  3 12:44:34 CDT 2019: Run 1000, (fds: 245)
Tue Sep  3 12:44:44 CDT 2019: Run 1100, (fds: 250)
Tue Sep  3 12:44:53 CDT 2019: Run 1200, (fds: 245)
Tue Sep  3 12:45:02 CDT 2019: Run 1300, (fds: 245)
Tue Sep  3 12:45:12 CDT 2019: Run 1400, (fds: 248)
Tue Sep  3 12:45:21 CDT 2019: Run 1500, (fds: 254)
Tue Sep  3 12:45:31 CDT 2019: Run 1600, (fds: 257)
Tue Sep  3 12:45:40 CDT 2019: Run 1700, (fds: 257)
Tue Sep  3 12:45:49 CDT 2019: Run 1800, (fds: 257)
Tue Sep  3 12:45:59 CDT 2019: Run 1900, (fds: 263)
Tue Sep  3 12:46:08 CDT 2019: Run 2000, (fds: 263)

All 137 comments

If there's some way to increase the maximum number of file handles on your system, that might help delay it. This suggests that we might be leaking handles somehow, though.
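(As a rough illustration of raising the limit, not Servo-specific advice: a process's soft RLIMIT_NOFILE can be bumped up to its hard limit, which is essentially what `ulimit -n` does in the launching shell. A sketch using the libc crate, assuming Linux:)

use std::io;

// Raise this process's soft fd limit (RLIMIT_NOFILE) up to its hard limit.
fn raise_fd_limit() -> io::Result<()> {
    unsafe {
        let mut lim = libc::rlimit { rlim_cur: 0, rlim_max: 0 };
        if libc::getrlimit(libc::RLIMIT_NOFILE, &mut lim) != 0 {
            return Err(io::Error::last_os_error());
        }
        lim.rlim_cur = lim.rlim_max;
        if libc::setrlimit(libc::RLIMIT_NOFILE, &lim) != 0 {
            return Err(io::Error::last_os_error());
        }
    }
    Ok(())
}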

@jdm This is a single webdriver session, and it's just repeatedly loading different pages and then taking a screenshot of each one. So yeah, I suspect a leak somewhere in Servo. Once I get a few more kinks worked out of my testing system, I'm intending to load > 100,000 test cases, so even a 10x longer run before crashing won't help me that much. My goal is to test browser rendering engine differences in the happy path (e.g. valid/safe pages), but so far it's proven to be pretty adept at crashing Servo too :-)

Are there any knobs I can set or files I can provide that would help to track this down if it turns out I can fairly reliably reproduce it?

Yep, happened again after the 710th test case was loaded.

The stack is shorter and seems different, so I'm including it, although I suspect this is basically just panicking wherever the file-descriptor exhaustion happens to occur:

servo-20190724.git $ ./mach run --release -z --webdriver=7002 --resolution=400x300
VMware, Inc.
softpipe
3.3 (Core Profile) Mesa 18.3.0-devel
[2019-08-01T05:02:36Z ERROR script::dom::bindings::error] Error at http://127.0.0.1:3000/gen/12072-2-34/3.html:1:2 fields are not currently supported
[2019-08-01T05:02:39Z ERROR script::dom::bindings::error] Error at http://127.0.0.1:3000/gen/12072-2-34/4.html:1:2 fields are not currently supported
[2019-08-01T05:02:42Z ERROR script::dom::bindings::error] Error at http://127.0.0.1:3000/gen/12072-2-34/5.html:1:2 fields are not currently supported
[2019-08-01T05:02:52Z ERROR script::dom::bindings::error] Error at http://127.0.0.1:3000/gen/12072-2-34/8.html:1:2 fields are not currently supported
[2019-08-01T05:02:56Z ERROR script::dom::bindings::error] Error at http://127.0.0.1:3000/gen/12072-2-34/9.html:1:2 fields are not currently supported
[2019-08-01T05:03:08Z ERROR script::dom::bindings::error] Error at http://127.0.0.1:3000/gen/12072-2-34/13.html:1:2 fields are not currently supported
[2019-08-01T05:03:17Z ERROR script::dom::bindings::error] Error at http://127.0.0.1:3000/gen/12072-2-34/16.html:1:2 fields are not currently supported
[2019-08-01T05:03:27Z ERROR script::dom::bindings::error] Error at http://127.0.0.1:3000/gen/12072-2-34/19.html:1:2 fields are not currently supported
[2019-08-01T05:03:36Z ERROR script::dom::bindings::error] Error at http://127.0.0.1:3000/gen/12072-2-34/22.html:1:2 fields are not currently supported
called `Result::unwrap()` on an `Err` value: Os { code: 24, kind: Other, message: "Too many open files" } (thread ScriptThread PipelineId { namespace_id: PipelineNamespaceId(1), index: PipelineIndex(2) }, at src/libcore/result.rs:1051)
stack backtrace:
   0: servo::main::{{closure}}::h1a74bd812d203c72 (0x5613fb31d33f)
   1: std::panicking::rust_panic_with_hook::h0529069ab88f357a (0x5613fe5b7105)
             at src/libstd/panicking.rs:481
   2: std::panicking::continue_panic_fmt::h6a820a3cd2914e74 (0x5613fe5b6ba1)
             at src/libstd/panicking.rs:384
   3: rust_begin_unwind (0x5613fe5b6a85)
             at src/libstd/panicking.rs:311
   4: core::panicking::panic_fmt::he00cfaca5555542a (0x5613fe5d97ec)
             at src/libcore/panicking.rs:85
   5: core::result::unwrap_failed::h4239b9d80132a0db (0x5613fe5d98e6)
             at src/libcore/result.rs:1051
   6: script::fetch::FetchCanceller::initialize::hc2be3bff8c1dfac0 (0x5613fbf1fa95)
   7: script::document_loader::DocumentLoader::fetch_async_background::hc49c09d7e9f54ad4 (0x5613fc435a2d)
   8: script::dom::document::Document::fetch_async::h82c1d743aa88b6a2 (0x5613fbeaebf4)
   9: script::stylesheet_loader::StylesheetLoader::load::he3c6d2f4f6b2d86d (0x5613fbf503ac)
  10: script::dom::htmllinkelement::HTMLLinkElement::handle_stylesheet_url::h08d096fd325f038f (0x5613fc027119)
  11: <script::dom::htmllinkelement::HTMLLinkElement as script::dom::virtualmethods::VirtualMethods>::bind_to_tree::h4b49aadc206a8bb4 (0x5613fc0269d5)
  12: script::dom::node::Node::insert::h5f83017d8cd2b0c4 (0x5613fc0e6494)
  13: script::dom::node::Node::pre_insert::hc92be38a4a5c65ff (0x5613fc0e57d2)
  14: script::dom::servoparser::insert::hc2bdfe4c53d2659e (0x5613fbee81de)
  15: html5ever::tree_builder::TreeBuilder<Handle,Sink>::insert_element::h483a7708bc71a57c (0x5613fc26c9cd)
  16: html5ever::tree_builder::TreeBuilder<Handle,Sink>::step::h53264c73d52d73b7 (0x5613fc281ed7)
  17: <html5ever::tree_builder::TreeBuilder<Handle,Sink> as html5ever::tokenizer::interface::TokenSink>::process_token::h8d8307514ab0ae38 (0x5613fc237653)
  18: _ZN9html5ever9tokenizer21Tokenizer$LT$Sink$GT$13process_token17h75019972a696a3eeE.llvm.7363365923994120909 (0x5613fb8b4408)
  19: html5ever::tokenizer::Tokenizer<Sink>::emit_current_tag::h45460aa41ef12f15 (0x5613fb8b4af1)
  20: html5ever::tokenizer::Tokenizer<Sink>::step::h866e9b63f9f503ff (0x5613fb8c4cf1)
  21: _ZN9html5ever9tokenizer21Tokenizer$LT$Sink$GT$3run17h8ce04a69e91aa671E.llvm.7363365923994120909 (0x5613fb8b894a)
  22: script::dom::servoparser::html::Tokenizer::feed::h9fdf5468e43321e7 (0x5613fb8dc32a)
  23: script::dom::servoparser::ServoParser::do_parse_sync::hed9498713ee0e8aa (0x5613fbee3b3e)
  24: profile_traits::time::profile::h4e1d5a89a249ef96 (0x5613fc0a22c6)
  25: _ZN6script3dom11servoparser11ServoParser10parse_sync17h88e07764d212f834E.llvm.530561247764639767 (0x5613fbee372f)
  26: <script::dom::servoparser::ParserContext as net_traits::FetchResponseListener>::process_response_chunk::h61914a25d0137f92 (0x5613fbee7885)
  27: script::script_thread::ScriptThread::handle_msg_from_constellation::h600baf3383c41b0e (0x5613fbf31068)
  28: _ZN6script13script_thread12ScriptThread11handle_msgs17haf4ee7b3baaa2b69E.llvm.530561247764639767 (0x5613fbf2b88d)
  29: profile_traits::mem::ProfilerChan::run_with_memory_reporting::hce1ba89d5eb53e5e (0x5613fc0a11e7)
  30: std::sys_common::backtrace::__rust_begin_short_backtrace::hc93943e85aa6fa26 (0x5613fc41fe52)
  31: _ZN3std9panicking3try7do_call17hb93b842dfb1f8343E.llvm.11073267061158468562 (0x5613fc4cc103)
  32: __rust_maybe_catch_panic (0x5613fe5c0db9)
             at src/libpanic_unwind/lib.rs:82
  33: core::ops::function::FnOnce::call_once{{vtable.shim}}::h7217876d16e085d4 (0x5613fbc8f2b2)
  34: <alloc::boxed::Box<F> as core::ops::function::FnOnce<A>>::call_once::h10faab75e4737451 (0x5613fe5a58ee)
             at /rustc/273f42b5964c29dda2c5a349dd4655529767b07f/src/liballoc/boxed.rs:766
  35: <alloc::boxed::Box<F> as core::ops::function::FnOnce<A>>::call_once::h1aad66e3be56d28c (0x5613fe5c00df)
             at /rustc/273f42b5964c29dda2c5a349dd4655529767b07f/src/liballoc/boxed.rs:766
      std::sys_common::thread::start_thread::h9a8131af389e9a10
             at src/libstd/sys_common/thread.rs:13
      std::sys::unix::thread::Thread::new::thread_start::h5c5575a5d923977b
             at src/libstd/sys/unix/thread.rs:79
  36: start_thread (0x7f90ef5906b9)
  37: clone (0x7f90ede2c41c)
  38: <unknown> (0x0)
[2019-08-01T05:43:09Z ERROR servo] called `Result::unwrap()` on an `Err` value: Os { code: 24, kind: Other, message: "Too many open files" }
Pipeline script content process shutdown chan: Os { code: 24, kind: Other, message: "Too many open files" } (thread Constellation, at src/libcore/result.rs:1051)
stack backtrace:
   0: servo::main::{{closure}}::h1a74bd812d203c72 (0x5613fb31d33f)
   1: std::panicking::rust_panic_with_hook::h0529069ab88f357a (0x5613fe5b7105)
             at src/libstd/panicking.rs:481
   2: std::panicking::continue_panic_fmt::h6a820a3cd2914e74 (0x5613fe5b6ba1)
             at src/libstd/panicking.rs:384
   3: rust_begin_unwind (0x5613fe5b6a85)
             at src/libstd/panicking.rs:311
   4: core::panicking::panic_fmt::he00cfaca5555542a (0x5613fe5d97ec)
             at src/libcore/panicking.rs:85
   5: core::result::unwrap_failed::h4239b9d80132a0db (0x5613fe5d98e6)
             at src/libcore/result.rs:1051
   6: constellation::pipeline::Pipeline::spawn::hc4537e426638cac3 (0x5613fb4084af)
   7: constellation::constellation::Constellation<Message,LTF,STF>::new_pipeline::h03e845e5b2cd3635 (0x5613fb41dbdf)
   8: constellation::constellation::Constellation<Message,LTF,STF>::handle_panic::h9c59071776586bee (0x5613fb41c74b)
   9: constellation::constellation::Constellation<Message,LTF,STF>::handle_log_entry::hff41c6262b0a65ca (0x5613fb41fde1)
  10: constellation::constellation::Constellation<Message,LTF,STF>::handle_request_from_compositor::he6733721600a19c9 (0x5613fb43538d)
  11: constellation::constellation::Constellation<Message,LTF,STF>::run::hc89c0bcdfed723c1 (0x5613fb43d4d9)
  12: std::sys_common::backtrace::__rust_begin_short_backtrace::he8170c3b0c1833bc (0x5613fb48714f)
  13: _ZN3std9panicking3try7do_call17h6580004dd9fdb22cE.llvm.4941401487006519239 (0x5613fb362513)
  14: __rust_maybe_catch_panic (0x5613fe5c0db9)
             at src/libpanic_unwind/lib.rs:82
  15: core::ops::function::FnOnce::call_once{{vtable.shim}}::hd78ccb91aec8cf3a (0x5613fb376e12)
  16: <alloc::boxed::Box<F> as core::ops::function::FnOnce<A>>::call_once::h10faab75e4737451 (0x5613fe5a58ee)
             at /rustc/273f42b5964c29dda2c5a349dd4655529767b07f/src/liballoc/boxed.rs:766
  17: <alloc::boxed::Box<F> as core::ops::function::FnOnce<A>>::call_once::h1aad66e3be56d28c (0x5613fe5c00df)
             at /rustc/273f42b5964c29dda2c5a349dd4655529767b07f/src/liballoc/boxed.rs:766
      std::sys_common::thread::start_thread::h9a8131af389e9a10
             at src/libstd/sys_common/thread.rs:13
      std::sys::unix::thread::Thread::new::thread_start::h5c5575a5d923977b
             at src/libstd/sys/unix/thread.rs:79
  18: start_thread (0x7f90ef5906b9)
  19: clone (0x7f90ede2c41c)
  20: <unknown> (0x0)
[2019-08-01T05:43:09Z ERROR servo] Pipeline script content process shutdown chan: Os { code: 24, kind: Other, message: "Too many open files" }
called `Result::unwrap()` on an `Err` value: RecvError (thread LayoutThread PipelineId { namespace_id: PipelineNamespaceId(1), index: PipelineIndex(692) }, at src/libcore/result.rs:1051)
stack backtrace:
   0: servo::main::{{closure}}::h1a74bd812d203c72 (0x5613fb31d33f)
   1: std::panicking::rust_panic_with_hook::h0529069ab88f357a (0x5613fe5b7105)
             at src/libstd/panicking.rs:481
   2: std::panicking::continue_panic_fmt::h6a820a3cd2914e74 (0x5613fe5b6ba1)
             at src/libstd/panicking.rs:384
   3: rust_begin_unwind (0x5613fe5b6a85)
             at src/libstd/panicking.rs:311
   4: core::panicking::panic_fmt::he00cfaca5555542a (0x5613fe5d97ec)
             at src/libcore/panicking.rs:85
   5: core::result::unwrap_failed::h4239b9d80132a0db (0x5613fe5d98e6)
             at src/libcore/result.rs:1051
   6: layout_thread::LayoutThread::start::hb675360933e398d4 (0x5613fb705063)
   7: profile_traits::mem::ProfilerChan::run_with_memory_reporting::h8a534464fbe2083e (0x5613fb77ad23)
   8: std::sys_common::backtrace::__rust_begin_short_backtrace::hab512255cc628b7e (0x5613fb830144)
   9: _ZN3std9panicking3try7do_call17h15789f162d15d069E.llvm.197603739779113467 (0x5613fb79afd3)
  10: __rust_maybe_catch_panic (0x5613fe5c0db9)
             at src/libpanic_unwind/lib.rs:82
  11: core::ops::function::FnOnce::call_once{{vtable.shim}}::h1b24f55261127c94 (0x5613fb7c6242)
  12: <alloc::boxed::Box<F> as core::ops::function::FnOnce<A>>::call_once::h10faab75e4737451 (0x5613fe5a58ee)
             at /rustc/273f42b5964c29dda2c5a349dd4655529767b07f/src/liballoc/boxed.rs:766
  13: <alloc::boxed::Box<F> as core::ops::function::FnOnce<A>>::call_once::h1aad66e3be56d28c (0x5613fe5c00df)
             at /rustc/273f42b5964c29dda2c5a349dd4655529767b07f/src/liballoc/boxed.rs:766
      std::sys_common::thread::start_thread::h9a8131af389e9a10
             at src/libstd/sys_common/thread.rs:13
      std::sys::unix::thread::Thread::new::thread_start::h5c5575a5d923977b
             at src/libstd/sys/unix/thread.rs:79
  14: start_thread (0x7f90ef5906b9)
  15: clone (0x7f90ede2c41c)
  16: <unknown> (0x0)
[2019-08-01T05:43:09Z ERROR servo] called `Result::unwrap()` on an `Err` value: RecvError
called `Result::unwrap()` on an `Err` value: Io(Custom { kind: ConnectionReset, error: "All senders for this socket closed" }) (thread StorageManager, at src/libcore/result.rs:1051)
stack backtrace:
   0: servo::main::{{closure}}::h1a74bd812d203c72 (0x5613fb31d33f)
   1: std::panicking::rust_panic_with_hook::h0529069ab88f357a (0x5613fe5b7105)
             at src/libstd/panicking.rs:481
   2: std::panicking::continue_panic_fmt::h6a820a3cd2914e74 (0x5613fe5b6ba1)
             at src/libstd/panicking.rs:384
   3: rust_begin_unwind (0x5613fe5b6a85)
             at src/libstd/panicking.rs:311
   4: core::panicking::panic_fmt::he00cfaca5555542a (0x5613fe5d97ec)
             at src/libcore/panicking.rs:85
   5: core::result::unwrap_failed::h4239b9d80132a0db (0x5613fe5d98e6)
             at src/libcore/result.rs:1051
   6: net::storage_thread::StorageManager::start::h466bcb1173168185 (0x5613fd32d91c)
   7: std::sys_common::backtrace::__rust_begin_short_backtrace::h210ea594b8a67611 (0x5613fd27c7b2)
   8: _ZN3std9panicking3try7do_call17h615e22bdb7f28b31E.llvm.2387109977385246948 (0x5613fd2d242b)
   9: __rust_maybe_catch_panic (0x5613fe5c0db9)
             at src/libpanic_unwind/lib.rs:82
  10: core::ops::function::FnOnce::call_once{{vtable.shim}}::he74969e58c7b4f78 (0x5613fd3660bf)
  11: <alloc::boxed::Box<F> as core::ops::function::FnOnce<A>>::call_once::h10faab75e4737451 (0x5613fe5a58ee)
             at /rustc/273f42b5964c29dda2c5a349dd4655529767b07f/src/liballoc/boxed.rs:766
  12: <alloc::boxed::Box<F> as core::ops::function::FnOnce<A>>::call_once::h1aad66e3be56d28c (0x5613fe5c00df)
             at /rustc/273f42b5964c29dda2c5a349dd4655529767b07f/src/liballoc/boxed.rs:766
      std::sys_common::thread::start_thread::h9a8131af389e9a10
             at src/libstd/sys_common/thread.rs:13
      std::sys::unix::thread::Thread::new::thread_start::h5c5575a5d923977b
             at src/libstd/sys/unix/thread.rs:79
  13: start_thread (0x7f90ef5906b9)
  14: clone (0x7f90ede2c41c)
  15: <unknown> (0x0)
[2019-08-01T05:43:09Z ERROR servo] called `Result::unwrap()` on an `Err` value: Io(Custom { kind: ConnectionReset, error: "All senders for this socket closed" })
Servo exited with return value -2

I think the most useful thing for narrowing down the cause in Servo would be if you can find the smallest set of unique webdriver commands that can reproduce this. In particular, if loading one particular file more times will trigger it, that's going to be easier to debug than loading a lot of different files fewer times.

I suspect it is

https://github.com/servo/servo/blob/799490a02e9bea575bf34c39f045ef0883539f05/components/script/stylesheet_loader.rs#L291

@kanaka Could you try again off of https://github.com/servo/servo/pull/23906 ? (Sorry for asking you to rebuild script)

If that is the case, we would have to find a way to not create an ipc channel for each stylesheet loader.

I can also see https://github.com/servo/servo/blob/45af8a34fe6cc6f304f935ababdbc90fb4c50ede/components/constellation/pipeline.rs#L279

So the question is whether this is because there are too many stylesheet loaders creating channels, or if it's more that webdriver is creating too many pipelines in general, and any creation of an ipc channel will eventually fail.

Would it be acceptable for webdriver to run every pipeline in the same event-loop? That might be a way to naturally serialize webdriver commands and prevent them from flooding the system by creating many parallel event-loops (if that is the problem, of course).

@kanaka actually could you please try running off of this branch https://github.com/servo/servo/pull/23761 (don't bother with the other one)

I was playing around with something the other day and realized we don't close event-loops when they're not managing any documents anymore, and I assume your script closes pages after taking a screenshot (you're not planning on keeping 100k pages open, are you?).

In that case it might be that we're leaking event-loops...

@gterzian I will attempt to build and test with #23761. I'm not issuing an explicit close, just doing RemoteWebDriver.get calls to load a new page in the same window. My understanding was that a close command would close the browser if there is no other window loaded. I'm not creating new windows, so my assumption was that I can just load a new page in the same window; I don't see issues with this in other browsers.

@gterzian hmm, we're not closing event loops? That's not good. What's meant to happen is that https://github.com/servo/servo/blob/master/components/constellation/constellation.rs#L3797 trim_history evicts old pipelines, and when there's no pipeline keeping an event loop alive it gets shut down by the Drop impl at https://github.com/servo/servo/blob/196c511d5ebf81b3fe202c5e0c5c1972a6408ab7/components/constellation/event_loop.rs#L21

Is webdriver not trimming the session history for some reason?

@gterzian with servo built from #23761 I still got a crash, somewhere around 800 test cases. This one has some extra messages at the front (most of which I've elided; I've also elided all but the first trace):

servo-20190724.git $ ./mach run --release -z --webdriver=7002 --resolution=400x300



Exiting pipeline (1,1).
shutting down layout for page (1,1)
Event loop empty
Exiting pipeline (1,2).
shutting down layout for page (1,2)
Exiting pipeline (1,3).
shutting down layout for page (1,3)
Exiting pipeline (1,4).
shutting down layout for page (1,4)
Exiting pipeline (1,5).
shutting down layout for page (1,5)
Exiting pipeline (1,6).
shutting down layout for page (1,6)
Exiting pipeline (1,7).
shutting down layout for page (1,7)
Exiting pipeline (1,8).
shutting down layout for page (1,8)
Exiting pipeline (1,9).
shutting down layout for page (1,9)
Exiting pipeline (1,10).
shutting down layout for page (1,10)
Exiting pipeline (1,11).
shutting down layout for page (1,11)
Exiting pipeline (1,12).
shutting down layout for page (1,12)
Exiting pipeline (1,13).
shutting down layout for page (1,13)
Exiting pipeline (1,14).
shutting down layout for page (1,14)
Exiting pipeline (1,15).
shutting down layout for page (1,15)
Exiting pipeline (1,16).
shutting down layout for page (1,16)
Exiting pipeline (1,17).
shutting down layout for page (1,17)
Exiting pipeline (1,18).
shutting down layout for page (1,18)
Exiting pipeline (1,19).
shutting down layout for page (1,19)
Exiting pipeline (1,20).
...
Exiting pipeline (1,446).
shutting down layout for page (1,446)
Exiting pipeline (1,447).
shutting down layout for page (1,447)
Could not stop player Backend("Missing dependency: playbin")
Exiting pipeline (1,448).
shutting down layout for page (1,448)
Exiting pipeline (1,449).
shutting down layout for page (1,449)
Exiting pipeline (1,450).
shutting down layout for page (1,450)
Exiting pipeline (1,451).
shutting down layout for page (1,451)
...
Exiting pipeline (1,705).
shutting down layout for page (1,705)
Exiting pipeline (1,706).
shutting down layout for page (1,706)
Exiting pipeline (1,707).
shutting down layout for page (1,707)
Exiting pipeline (1,708).
shutting down layout for page (1,708)
called `Result::unwrap()` on an `Err` value: Os { code: 24, kind: Other, message: "Too many open files" } (thread ScriptThread PipelineId { namespace_id: PipelineNamespaceId(1), index: PipelineIndex(2) }, at src/libcore/result.rs:1051)
stack backtrace:
   0: servo::main::{{closure}}::h8c27ca2bbd15260c (0x5582aa0070af)
   1: std::panicking::rust_panic_with_hook::h0529069ab88f357a (0x5582ad21bda5)
             at src/libstd/panicking.rs:481
   2: std::panicking::continue_panic_fmt::h6a820a3cd2914e74 (0x5582ad21b841)
             at src/libstd/panicking.rs:384
   3: rust_begin_unwind (0x5582ad21b725)
             at src/libstd/panicking.rs:311
   4: core::panicking::panic_fmt::he00cfaca5555542a (0x5582ad23e48c)
             at src/libcore/panicking.rs:85
   5: core::result::unwrap_failed::h4239b9d80132a0db (0x5582ad23e586)
             at src/libcore/result.rs:1051
   6: script::stylesheet_loader::StylesheetLoader::load::h3dd3ec4cbe06d8f4 (0x5582aac30664)
   7: script::dom::htmllinkelement::HTMLLinkElement::handle_stylesheet_url::ha79b800c69addf04 (0x5582aad25714)
   8: <script::dom::htmllinkelement::HTMLLinkElement as script::dom::virtualmethods::VirtualMethods>::bind_to_tree::hcf3684abf360a4d0 (0x5582aad24fa5)
   9: script::dom::node::Node::insert::h671b04c4cfccf944 (0x5582aadd2464)
  10: script::dom::node::Node::pre_insert::hcd7ab729106469bb (0x5582aadd17a2)
  11: script::dom::servoparser::insert::h023231dd0bb029f6 (0x5582aaa43e76)
  12: html5ever::tree_builder::TreeBuilder<Handle,Sink>::insert_element::hca0ba5a42a6a47a6 (0x5582aaf83abf)
  13: html5ever::tree_builder::TreeBuilder<Handle,Sink>::step::h6293e7deefac81b3 (0x5582aafaf4da)
  14: <html5ever::tree_builder::TreeBuilder<Handle,Sink> as html5ever::tokenizer::interface::TokenSink>::process_token::hab5b17dae1a3ba13 (0x5582aaf400f3)
  15: _ZN9html5ever9tokenizer21Tokenizer$LT$Sink$GT$13process_token17h22febec6a83030edE.llvm.8593906748013008867 (0x5582aa5b4f48)
  16: html5ever::tokenizer::Tokenizer<Sink>::emit_current_tag::h94be9c6ea3b74d2c (0x5582aa5b59d1)
  17: html5ever::tokenizer::Tokenizer<Sink>::step::ha584afa55150868b (0x5582aa5be263)
  18: _ZN9html5ever9tokenizer21Tokenizer$LT$Sink$GT$3run17hb2242a222c9c977bE.llvm.8593906748013008867 (0x5582aa5b948a)
  19: script::dom::servoparser::html::Tokenizer::feed::h35b993cc2c817cb6 (0x5582aa5f70ea)
  20: script::dom::servoparser::ServoParser::do_parse_sync::h2df0c4cbffab8da1 (0x5582aaa3e88c)
  21: profile_traits::time::profile::h7d5f5cd3fcac1ddd (0x5582aad8dc2f)
  22: _ZN6script3dom11servoparser11ServoParser10parse_sync17h635a7466468263c3E.llvm.2781459336530327564 (0x5582aaa3e3cf)
  23: <script::dom::servoparser::ParserContext as net_traits::FetchResponseListener>::process_response_chunk::h5501101f641181cb (0x5582aaa42e7a)
  24: script::script_thread::ScriptThread::handle_msg_from_constellation::hf6b2b6295212cd4c (0x5582aac11758)
  25: _ZN6script13script_thread12ScriptThread11handle_msgs17hd3872b21368d5c05E.llvm.1792426946693623503 (0x5582aac0ba91)
  26: profile_traits::mem::ProfilerChan::run_with_memory_reporting::h62e13d53efdc9cd6 (0x5582aad8bf17)
  27: std::sys_common::backtrace::__rust_begin_short_backtrace::h9104709f54b740ee (0x5582ab0c4862)
  28: _ZN3std9panicking3try7do_call17h491abb1e9f5ae8a8E.llvm.4875581894382309184 (0x5582ab1d3c95)
  29: __rust_maybe_catch_panic (0x5582ad225a59)
             at src/libpanic_unwind/lib.rs:82
  30: core::ops::function::FnOnce::call_once{{vtable.shim}}::hedfa976ea06cd78a (0x5582aa6e5175)
  31: <alloc::boxed::Box<F> as core::ops::function::FnOnce<A>>::call_once::h10faab75e4737451 (0x5582ad20a58e)
             at /rustc/273f42b5964c29dda2c5a349dd4655529767b07f/src/liballoc/boxed.rs:766
  32: <alloc::boxed::Box<F> as core::ops::function::FnOnce<A>>::call_once::h1aad66e3be56d28c (0x5582ad224d7f)
             at /rustc/273f42b5964c29dda2c5a349dd4655529767b07f/src/liballoc/boxed.rs:766
      std::sys_common::thread::start_thread::h9a8131af389e9a10
             at src/libstd/sys_common/thread.rs:13
      std::sys::unix::thread::Thread::new::thread_start::h5c5575a5d923977b
             at src/libstd/sys/unix/thread.rs:79
  33: start_thread (0x7fcfcee776b9)
  34: clone (0x7fcfcd71341c)
  35: <unknown> (0x0)
...

@gterzian hmm, we're not closing event loops? That's not good. What's meant to happen is that https://github.com/servo/servo/blob/master/components/constellation/constellation.rs#L3797 trim_history evicts old pipelines, and when there's no pipeline keeping an event loop alive it gets shut down by the Drop impl at

Oh yeah, you're right, the constellation only keeps a Weak<EventLoop>, which should be dropped once every pipeline in a given script-thread has been exited (since they keep an Rc<EventLoop>, and those are dropped when history is trimmed).
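(In other words, the intended lifecycle is the usual Rc/Weak pattern. A simplified, hypothetical sketch of that shape, not Servo's actual types:)

use std::rc::{Rc, Weak};

// Hypothetical stand-in for Servo's EventLoop: the Drop impl is where shutdown
// of the underlying script thread would be triggered.
struct EventLoop { id: u32 }

impl Drop for EventLoop {
    fn drop(&mut self) {
        println!("event loop {} shut down", self.id);
    }
}

fn main() {
    // The constellation keeps only a Weak reference...
    let weak: Weak<EventLoop>;
    {
        // ...while each live pipeline holds a strong Rc.
        let pipeline_a = Rc::new(EventLoop { id: 1 });
        let pipeline_b = Rc::clone(&pipeline_a);
        weak = Rc::downgrade(&pipeline_a);
        assert!(weak.upgrade().is_some());
        // Trimming history / exiting pipelines drops the strong references,
        // and Drop runs as soon as the last Rc goes away.
        drop(pipeline_a);
        drop(pipeline_b);
    }
    assert!(weak.upgrade().is_none());
}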

So this might be StylesheetLoader::load creating an ipc-channel, which gives "Too many open files" when a page loads lots of stylesheets?

Is webdriver not trimming the session history for some reason?

Could be; however, it seems webdriver just ends up using the "normal pipeline flow", so unless there is some webdriver-specific logic that would prevent it, I think it should trim.

Maybe in general we shouldn't be creating an ipc-channel for each route, instead hiding the routing behind some interface that would clone a sender and route based on some sort of Id, re-using the same receiver?

I think it's the call to socketpair on each ipc::channel() that is causing this.

Something like:

GlobalScope {
    fn add_ipc_callback<T>(&self, callback: Box<Fn(T)>) -> IPCHandle<T>;
}

struct IPCHandle<T> {
    callback_id: CallbackId, // based on the pipeline namespace
    sender: IpcSender<CallbackMsg<T>>
}

impl IPCHandle<T> {
    fn send(&self, msg: T) {
        self.sender.send((self.callback_id.clone(), msg));
    }
}

struct CallbackMsg<T>((CallbackId, T));

impl IpcRouter {
    fn handle_msg(&self, msg: OpaqueIpcMessage) {
        let (id, msg) = msg.to().unwrap();
        let callback = self.callbacks.remove(&id).unwrap();
        callback(msg);
    }
    fn add_callback(&self, callback: Box<Fn(T)>) -> IPCHandle<T> {
        let callback_id = CallbackId::new();
        self.callbacks.borrow_mut().insert(callback_id.clone(), callback);
        let sender = self.opaque_sender.clone().to();
        IPCHandle {
            callback_id, 
            sender,
        }
    }
}

Not completely sure about the generics sketched above, but I think one could make it work.

then for each global in a script-process, upon init, do:

let (router_sender, router_receiver) = ipc::channel().unwrap();
let ipc_router = IpcRouter::new(router_sender);
ROUTER.add_route(
     router_receiver.to_opaque(),
     Box::new(move |message| {
          ipc_router.handle_msg(message);
     }),
);

// Store ipc_router on the global

In practice it would be used like:

impl<'a> StylesheetLoader<'a> {
    pub fn load(..) {
        let listener = NetworkListener {
            context,
            task_source,
            canceller: Some(canceller),
        };
        let callback = Box::new(move |message| {
            listener.notify_fetch(message);
        });
        let handle = self.global().add_ipc_callback(callback);
        document.fetch_async(LoadType::Stylesheet(url), request, handle);
    }
}

There's a prototype on the way: https://github.com/servo/servo/pull/23909

I'm not issuing an explicit close, just doing RemoteWebDriver.get calls to load a new page in the same window. My understanding was that a close command would close the browser if there is no other window loaded. I'm not creating new windows, so my assumption was that I can just load a new page in the same window; I don't see issues with this in other browsers.

@kanaka

That's right, yes "close" was the wrong wording on my part, and it would indeed close the browser.

You could give it a try with the above-mentioned branch, although I have only done some type checking; I haven't actually run it, and I wouldn't be surprised if it crashes.

Also, even if this could fix one issue, we might still get the same crash as the current one for some other reason with another part of your test suite, since the problem of creating an ipc-channel (which creates a pair of sockets) for each operation is prevalent across the script component.

Besides the current problem, which seems to happen when lots of stylesheets are loaded, we could get it with images, websockets, and a few other DOM objects. But if the proposed approach works, it shouldn't be hard to make the switch for all of them.

However it would be interesting to see if with this change we don't get a crash anymore with script::stylesheet_loader::StylesheetLoader::load.

Not sure that #23909 will help, because AFAICT a new fd is generated every time an IPC channel is sent, even if it's an IPC channel that's been sent before. @kanaka can you test that PR and see if it fixes your problem?

@gterzian @asajeffrey I will test with #23909 and report my results. I may or may not be able to get to it in the next couple of hours, but hopefully at least by tonight sometime.

@gterzian I built it, but I get several stack traces on startup, and trying to connect via webdriver causes more exceptions and fails to stay connected. Here are the first couple of traces:

./mach run --release -z --webdriver=7002 --resolution=400x300



called `Option::unwrap()` on a `None` value (thread <unnamed>, at src/libcore/option.rs:378)
stack backtrace:
   0: servo::main::{{closure}}::h80e145b734dbd0e6 (0x55e752214f1f)
   1: std::panicking::rust_panic_with_hook::h0529069ab88f357a (0x55e755424985)
             at src/libstd/panicking.rs:481
   2: std::panicking::continue_panic_fmt::h6a820a3cd2914e74 (0x55e755424421)
             at src/libstd/panicking.rs:384
   3: rust_begin_unwind (0x55e755424305)
             at src/libstd/panicking.rs:311
   4: core::panicking::panic_fmt::he00cfaca5555542a (0x55e75544706c)
             at src/libcore/panicking.rs:85
   5: core::panicking::panic::h73f4a74a29ff704a (0x55e755446fab)
             at src/libcore/panicking.rs:49
   6: script::dom::globalscope::IpcScriptRouter::new::{{closure}}::h75ab22a00e7dfa42 (0x55e752e7d797)
   7: ipc_channel::router::Router::run::hbd6147421ea41418 (0x55e7553f0843)
   8: std::sys_common::backtrace::__rust_begin_short_backtrace::h6fd18e901e895f8b (0x55e7553f2566)
   9: _ZN3std9panicking3try7do_call17h1b444f82dc748186E.llvm.5578204044005263353 (0x55e7553f3fcb)
  10: __rust_maybe_catch_panic (0x55e75542e639)
             at src/libpanic_unwind/lib.rs:82
  11: core::ops::function::FnOnce::call_once{{vtable.shim}}::h8d0a0826419b9608 (0x55e7553f2afe)
  12: <alloc::boxed::Box<F> as core::ops::function::FnOnce<A>>::call_once::h10faab75e4737451 (0x55e75541316e)
             at /rustc/273f42b5964c29dda2c5a349dd4655529767b07f/src/liballoc/boxed.rs:766
  13: <alloc::boxed::Box<F> as core::ops::function::FnOnce<A>>::call_once::h1aad66e3be56d28c (0x55e75542d95f)
             at /rustc/273f42b5964c29dda2c5a349dd4655529767b07f/src/liballoc/boxed.rs:766
      std::sys_common::thread::start_thread::h9a8131af389e9a10
             at src/libstd/sys_common/thread.rs:13
      std::sys::unix::thread::Thread::new::thread_start::h5c5575a5d923977b
             at src/libstd/sys/unix/thread.rs:79
  14: start_thread (0x7f6fd42d66b9)
  15: clone (0x7f6fd2b7241c)
  16: <unknown> (0x0)
[2019-08-02T19:04:15Z ERROR servo] called `Option::unwrap()` on a `None` value
Unexpected BHM channel panic in constellation: RecvError (thread Constellation, at src/libcore/result.rs:1051)
called `Result::unwrap()` on an `Err` value: RecvError (thread LayoutThread PipelineId { namespace_id: PipelineNamespaceId(1), index: PipelineIndex(1) }, at src/libcore/result.rs:1051)
stack backtrace:
   0: servo::main::{{closure}}::h80e145b734dbd0e6 (0x55e752214f1f)
   1: std::panicking::rust_panic_with_hook::h0529069ab88f357a (0x55e755424985)
             at src/libstd/panicking.rs:481
   2: std::panicking::continue_panic_fmt::h6a820a3cd2914e74 (0x55e755424421)
             at src/libstd/panicking.rs:384
   3: rust_begin_unwind (0x55e755424305)
             at src/libstd/panicking.rs:311
   4: core::panicking::panic_fmt::he00cfaca5555542a (0x55e75544706c)
             at src/libcore/panicking.rs:85
   5: core::result::unwrap_failed::h4239b9d80132a0db (0x55e755447166)
             at src/libcore/result.rs:1051
   6: constellation::constellation::Constellation<Message,LTF,STF>::run::h47db071c3d684e72 (0x55e7523d1897)
   7: std::sys_common::backtrace::__rust_begin_short_backtrace::hfad83bbe04097343 (0x55e7522c72c3)
   8: std::panicking::try::do_call::hd86cbdbacff4f086 (0x55e7522e0db5)
   9: __rust_maybe_catch_panic (0x55e75542e639)
             at src/libpanic_unwind/lib.rs:82
  10: core::ops::function::FnOnce::call_once{{vtable.shim}}::hdecb57578c643b74 (0x55e7522e0fd5)
  11: <alloc::boxed::Box<F> as core::ops::function::FnOnce<A>>::call_once::h10faab75e4737451 (0x55e75541316e)
             at /rustc/273f42b5964c29dda2c5a349dd4655529767b07f/src/liballoc/boxed.rs:766
  12: <alloc::boxed::Box<F> as core::ops::function::FnOnce<A>>::call_once::h1aad66e3be56d28c (0x55e75542d95f)
             at /rustc/273f42b5964c29dda2c5a349dd4655529767b07f/src/liballoc/boxed.rs:766
      std::sys_common::thread::start_thread::h9a8131af389e9a10
             at src/libstd/sys_common/thread.rs:13
      std::sys::unix::thread::Thread::new::thread_start::h5c5575a5d923977b
             at src/libstd/sys/unix/thread.rs:79
  13: start_thread (0x7f6fd42d66b9)
  14: clone (0x7f6fd2b7241c)
  15: <unknown> (0x0)

Related: https://github.com/servo/ipc-channel/issues/240

TL;DR sending the same ipc channel over ipc many times uses up an fd each time it's sent, so recycling the ipc channel doesn't help.
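(A rough in-process sketch of the kind of experiment behind that TL;DR, assuming the ipc-channel crate and Linux's /proc for counting descriptors; names like reused_tx and carrier_tx are just for illustration, and exact numbers will vary:)

use ipc_channel::ipc::{self, IpcSender};
use std::fs;

// Linux-only: count this process's open file descriptors.
fn open_fds() -> usize {
    fs::read_dir("/proc/self/fd").map(|d| d.count()).unwrap_or(0)
}

fn main() {
    // One channel whose sender gets re-sent repeatedly...
    let (reused_tx, _reused_rx) = ipc::channel::<String>().unwrap();
    // ...over a second "carrier" channel.
    let (carrier_tx, carrier_rx) = ipc::channel::<IpcSender<String>>().unwrap();

    let before = open_fds();
    let mut received = Vec::new();
    for _ in 0..100 {
        carrier_tx.send(reused_tx.clone()).unwrap();
        // Each sender that comes out the other side carries its own
        // descriptor; keeping them alive makes the growth visible.
        received.push(carrier_rx.recv().unwrap());
    }
    println!("fds before: {}, after: {}", before, open_fds());
}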

I have developed a very small script that can crash servo:

#!/bin/bash

set -e

SESSIONID=$(curl -X POST -d "{}" http://localhost:7002/session | jq -r ".value.sessionId")
echo "SESSIONID: ${SESSIONID}"

mkdir -p data

i=0
while true; do
    echo "Run ${i}"
    i=$(( i + 1 ))
    curl -s -X POST -d '{"url": "http://localhost:9080/test3.html"}' http://localhost:7002/session/${SESSIONID}/url > /dev/null
    curl -s http://localhost:7002/session/${SESSIONID}/screenshot | jq -r ".value" | base64 -d > data/test${i}.png
done

The HTML and CSS files that I used, along with the script and the commands I used to start the various pieces, are captured in the following gist: https://gist.github.com/kanaka/119f5ed9841e23e35d07e8944cca6aa7

I built servo just now from d2856ce8aeca and then ran the test against it. I got a crash on the 819th page load. The crash is not a "too many open files" crash, but I'm posting it here because I suspect it may be related; it's a simplified version of the process I used earlier, and it crashed after a similar number of page loads. I've included the crash below. If you determine that this is likely a different issue, I'm happy to file a new ticket.

$ ./mach run --release -z --webdriver=7002 --resolution=400x300



assertion failed: self.is_double() (thread ScriptThread PipelineId { namespace_id: PipelineNamespaceId(1), index: PipelineIndex(2) }, at /home/joelm/.cargo/git/checkouts/mozjs-fa11ffc7d4f1cc2d/6dff104/src/jsval.rs:439)
stack backtrace:
   0: servo::main::{{closure}}::ha22eb134993c4424 (0x555692dcbe8f)
   1: std::panicking::rust_panic_with_hook::hec63884fa234b28d (0x555695fc4ae5)
             at src/libstd/panicking.rs:481
   2: std::panicking::begin_panic::hdcff812b537809ad (0x555693f72804)
   3: script::dom::windowproxy::trace::h0eb5d450dfe2e522 (0x555693af3cf7)
   4: _ZNK2js5Class7doTraceEP8JSTracerP8JSObject (0x5556945a3366)
             at /data/joelm/personal/UTA/dissertation/servo/servo-20190724.git/target/release/build/mozjs_sys-79c54c059d530e2a/out/dist/include/js/Class.h:872
      _ZL13CallTraceHookIZN2js14TenuringTracer11traceObjectEP8JSObjectE4$_11EPNS0_12NativeObjectEOT_P8JSTracerS3_15CheckGeneration
             at /home/joelm/.cargo/git/checkouts/mozjs-fa11ffc7d4f1cc2d/6dff104/mozjs/js/src/gc/Marking.cpp:1545
      _ZN2js14TenuringTracer11traceObjectEP8JSObject
             at /home/joelm/.cargo/git/checkouts/mozjs-fa11ffc7d4f1cc2d/6dff104/mozjs/js/src/gc/Marking.cpp:2907
   5: _ZL14TraceWholeCellRN2js14TenuringTracerEP8JSObject (0x5556945a315b)
             at /home/joelm/.cargo/git/checkouts/mozjs-fa11ffc7d4f1cc2d/6dff104/mozjs/js/src/gc/Marking.cpp:2798
      _ZL18TraceBufferedCellsI8JSObjectEvRN2js14TenuringTracerEPNS1_2gc5ArenaEPNS4_12ArenaCellSetE
             at /home/joelm/.cargo/git/checkouts/mozjs-fa11ffc7d4f1cc2d/6dff104/mozjs/js/src/gc/Marking.cpp:2835
      _ZN2js2gc11StoreBuffer15WholeCellBuffer5traceERNS_14TenuringTracerE
             at /home/joelm/.cargo/git/checkouts/mozjs-fa11ffc7d4f1cc2d/6dff104/mozjs/js/src/gc/Marking.cpp:2852
   6: _ZN2js2gc11StoreBuffer15traceWholeCellsERNS_14TenuringTracerE (0x5556945ba215)
             at /home/joelm/.cargo/git/checkouts/mozjs-fa11ffc7d4f1cc2d/6dff104/mozjs/js/src/gc/StoreBuffer.h:479
      _ZN2js7Nursery12doCollectionEN2JS8GCReasonERNS_2gc16TenureCountCacheE
             at /home/joelm/.cargo/git/checkouts/mozjs-fa11ffc7d4f1cc2d/6dff104/mozjs/js/src/gc/Nursery.cpp:946
   7: _ZN2js7Nursery7collectEN2JS8GCReasonE (0x5556945b94a5)
             at /home/joelm/.cargo/git/checkouts/mozjs-fa11ffc7d4f1cc2d/6dff104/mozjs/js/src/gc/Nursery.cpp:783
   8: _ZN2js2gc9GCRuntime7minorGCEN2JS8GCReasonENS_7gcstats9PhaseKindE (0x55569459aa83)
             at /home/joelm/.cargo/git/checkouts/mozjs-fa11ffc7d4f1cc2d/6dff104/mozjs/js/src/gc/GC.cpp:7787
   9: _ZN2js2gc9GCRuntime13gcIfRequestedEv (0x55569457e074)
             at /home/joelm/.cargo/git/checkouts/mozjs-fa11ffc7d4f1cc2d/6dff104/mozjs/js/src/gc/GC.cpp:7846
  10: _ZN2js2gc9GCRuntime22gcIfNeededAtAllocationEP9JSContext (0x55569457a63c)
             at /home/joelm/.cargo/git/checkouts/mozjs-fa11ffc7d4f1cc2d/6dff104/mozjs/js/src/gc/Allocator.cpp:343
      _ZN2js2gc9GCRuntime19checkAllocatorStateILNS_7AllowGCE1EEEbP9JSContextNS0_9AllocKindE
             at /home/joelm/.cargo/git/checkouts/mozjs-fa11ffc7d4f1cc2d/6dff104/mozjs/js/src/gc/Allocator.cpp:300
      _ZN2js14AllocateObjectILNS_7AllowGCE1EEEP8JSObjectP9JSContextNS_2gc9AllocKindEmNS6_11InitialHeapEPKNS_5ClassE
             at /home/joelm/.cargo/git/checkouts/mozjs-fa11ffc7d4f1cc2d/6dff104/mozjs/js/src/gc/Allocator.cpp:55
  11: _ZN2js11ProxyObject6createEP9JSContextPKNS_5ClassEN2JS6HandleINS_11TaggedProtoEEENS_2gc9AllocKindENS_13NewObjectKindE (0x5556943
4b63f)
             at /home/joelm/.cargo/git/checkouts/mozjs-fa11ffc7d4f1cc2d/6dff104/mozjs/js/src/vm/ProxyObject.cpp:199
  12: _ZN2js11ProxyObject3NewEP9JSContextPKNS_16BaseProxyHandlerEN2JS6HandleINS6_5ValueEEENS_11TaggedProtoERKNS_12ProxyOptionsE (0x555
69434b0e4)
             at /home/joelm/.cargo/git/checkouts/mozjs-fa11ffc7d4f1cc2d/6dff104/mozjs/js/src/vm/ProxyObject.cpp:100
  13: _ZN2js14NewProxyObjectEP9JSContextPKNS_16BaseProxyHandlerEN2JS6HandleINS5_5ValueEEEP8JSObjectRKNS_12ProxyOptionsE (0x5556944bd7a
1)
             at /home/joelm/.cargo/git/checkouts/mozjs-fa11ffc7d4f1cc2d/6dff104/mozjs/js/src/proxy/Proxy.cpp:779
      _ZN2js7Wrapper3NewEP9JSContextP8JSObjectPKS0_RKNS_14WrapperOptionsE
             at /home/joelm/.cargo/git/checkouts/mozjs-fa11ffc7d4f1cc2d/6dff104/mozjs/js/src/proxy/Wrapper.cpp:282
  14: WrapperNew (0x5556941eb3d7)
  15: _ZN2JS11Compartment18getOrCreateWrapperEP9JSContextNS_6HandleIP8JSObjectEENS_13MutableHandleIS5_EE (0x55569427b51c)
             at /home/joelm/.cargo/git/checkouts/mozjs-fa11ffc7d4f1cc2d/6dff104/mozjs/js/src/vm/Compartment.cpp:268
  16: _ZN2JS11Compartment6rewrapEP9JSContextNS_13MutableHandleIP8JSObjectEENS_6HandleIS5_EE (0x55569427b901)
             at /home/joelm/.cargo/git/checkouts/mozjs-fa11ffc7d4f1cc2d/6dff104/mozjs/js/src/vm/Compartment.cpp:357
  17: _ZN2js12RemapWrapperEP9JSContextP8JSObjectS3_ (0x5556944acf0e)
             at /home/joelm/.cargo/git/checkouts/mozjs-fa11ffc7d4f1cc2d/6dff104/mozjs/js/src/proxy/CrossCompartmentWrapper.cpp:582
  18: _ZN2js25RemapAllWrappersForObjectEP9JSContextP8JSObjectS3_ (0x5556944ad462)
             at /home/joelm/.cargo/git/checkouts/mozjs-fa11ffc7d4f1cc2d/6dff104/mozjs/js/src/proxy/CrossCompartmentWrapper.cpp:633
  19: _Z19JS_TransplantObjectP9JSContextN2JS6HandleIP8JSObjectEES5_ (0x55569447ec9c)
             at /home/joelm/.cargo/git/checkouts/mozjs-fa11ffc7d4f1cc2d/6dff104/mozjs/js/src/jsapi.cpp:731
  20: _ZN6script3dom11windowproxy11WindowProxy10set_window17h52ce1951a2216063E.llvm.1583787634904419619 (0x555693af1fd3)
  21: script::dom::window::Window::resume::h6e86d312ad70993d (0x5556934f5dbd)
  22: script::script_thread::ScriptThread::load::hf15b75fcfcd60129 (0x5556939e0c6e)
  23: script::script_thread::ScriptThread::handle_page_headers_available::ha41037a7982b0452 (0x5556939dc581)
  24: std::thread::local::LocalKey<T>::with::h3bc7497eb1a5059a (0x55569386eefd)
  25: <script::dom::servoparser::ParserContext as net_traits::FetchResponseListener>::process_response::h72bfea20576003c5 (0x5556937fe
d92)
  26: script::script_thread::ScriptThread::handle_msg_from_constellation::h37e0394a1b85329d (0x5556939cb48f)
  27: _ZN6script13script_thread12ScriptThread11handle_msgs17hac0c1aacd24f0633E.llvm.18161580367316437218 (0x5556939c57f8)
  28: profile_traits::mem::ProfilerChan::run_with_memory_reporting::h1b629c9f7392a890 (0x555693b450b7)
  29: std::sys_common::backtrace::__rust_begin_short_backtrace::h0097fbef27bdc8ce (0x555693e7a4f2)
  30: _ZN3std9panicking3try7do_call17h47262c4eba228992E.llvm.2547905980101764475 (0x555693f89d45)
  31: __rust_maybe_catch_panic (0x555695fce789)
             at src/libpanic_unwind/lib.rs:80
  32: core::ops::function::FnOnce::call_once{{vtable.shim}}::h51b7628431ca69a2 (0x5556934a3285)
  33: <alloc::boxed::Box<F> as core::ops::function::FnOnce<A>>::call_once::h82a57145aa4239a7 (0x555695fb32ee)
             at /rustc/dddb7fca09dc817ba275602b950bb81a9032fb6d/src/liballoc/boxed.rs:770
  34: <alloc::boxed::Box<F> as core::ops::function::FnOnce<A>>::call_once::h167c1ef971e93086 (0x555695fcdaaf)
             at /rustc/dddb7fca09dc817ba275602b950bb81a9032fb6d/src/liballoc/boxed.rs:770
      std::sys_common::thread::start_thread::h739b9b99c7f25b24
             at src/libstd/sys_common/thread.rs:13
      std::sys::unix::thread::Thread::new::thread_start::h79a2f27ba62f96ae
             at src/libstd/sys/unix/thread.rs:79
  35: start_thread (0x7fb6f6dd06b9)
  36: clone (0x7fb6f566c41c)
  37: <unknown> (0x0)
[2019-08-04T03:33:58Z ERROR servo] assertion failed: self.is_double()
Stack trace for thread "ScriptThread PipelineId { namespace_id: PipelineNamespaceId(1), index: PipelineIndex(2) }"
stack backtrace:
   0: servo::install_crash_handler::handler::h3244c709fa5cd0dc (0x555692dcb0d0)
   1: _ZL15WasmTrapHandleriP9siginfo_tPv (0x5556948c718e)
             at /home/joelm/.cargo/git/checkouts/mozjs-fa11ffc7d4f1cc2d/6dff104/mozjs/js/src/wasm/WasmSignalHandlers.cpp:967
   2: <unknown> (0x7fb6f6dda38f)
   3: script::dom::windowproxy::trace::h0eb5d450dfe2e522 (0x555693af3d32)
   4: _ZNK2js5Class7doTraceEP8JSTracerP8JSObject (0x5556945a3366)
             at /data/joelm/personal/UTA/dissertation/servo/servo-20190724.git/target/release/build/mozjs_sys-79c54c059d530e2a/out/dis
t/include/js/Class.h:872
      _ZL13CallTraceHookIZN2js14TenuringTracer11traceObjectEP8JSObjectE4$_11EPNS0_12NativeObjectEOT_P8JSTracerS3_15CheckGeneration
             at /home/joelm/.cargo/git/checkouts/mozjs-fa11ffc7d4f1cc2d/6dff104/mozjs/js/src/gc/Marking.cpp:1545
      _ZN2js14TenuringTracer11traceObjectEP8JSObject
             at /home/joelm/.cargo/git/checkouts/mozjs-fa11ffc7d4f1cc2d/6dff104/mozjs/js/src/gc/Marking.cpp:2907
   5: _ZL14TraceWholeCellRN2js14TenuringTracerEP8JSObject (0x5556945a315b)
             at /home/joelm/.cargo/git/checkouts/mozjs-fa11ffc7d4f1cc2d/6dff104/mozjs/js/src/gc/Marking.cpp:2798
      _ZL18TraceBufferedCellsI8JSObjectEvRN2js14TenuringTracerEPNS1_2gc5ArenaEPNS4_12ArenaCellSetE
             at /home/joelm/.cargo/git/checkouts/mozjs-fa11ffc7d4f1cc2d/6dff104/mozjs/js/src/gc/Marking.cpp:2835
      _ZN2js2gc11StoreBuffer15WholeCellBuffer5traceERNS_14TenuringTracerE
             at /home/joelm/.cargo/git/checkouts/mozjs-fa11ffc7d4f1cc2d/6dff104/mozjs/js/src/gc/Marking.cpp:2852
   6: _ZN2js2gc11StoreBuffer15traceWholeCellsERNS_14TenuringTracerE (0x5556945ba215)
             at /home/joelm/.cargo/git/checkouts/mozjs-fa11ffc7d4f1cc2d/6dff104/mozjs/js/src/gc/StoreBuffer.h:479
      _ZN2js7Nursery12doCollectionEN2JS8GCReasonERNS_2gc16TenureCountCacheE
             at /home/joelm/.cargo/git/checkouts/mozjs-fa11ffc7d4f1cc2d/6dff104/mozjs/js/src/gc/Nursery.cpp:946
   7: _ZN2js7Nursery7collectEN2JS8GCReasonE (0x5556945b94a5)
             at /home/joelm/.cargo/git/checkouts/mozjs-fa11ffc7d4f1cc2d/6dff104/mozjs/js/src/gc/Nursery.cpp:783
   8: _ZN2js2gc9GCRuntime7minorGCEN2JS8GCReasonENS_7gcstats9PhaseKindE (0x55569459aa83)
             at /home/joelm/.cargo/git/checkouts/mozjs-fa11ffc7d4f1cc2d/6dff104/mozjs/js/src/gc/GC.cpp:7787
   9: _ZN2js2gc9GCRuntime13gcIfRequestedEv (0x55569457e074)
             at /home/joelm/.cargo/git/checkouts/mozjs-fa11ffc7d4f1cc2d/6dff104/mozjs/js/src/gc/GC.cpp:7846
  10: _ZN2js2gc9GCRuntime22gcIfNeededAtAllocationEP9JSContext (0x55569457a63c)
             at /home/joelm/.cargo/git/checkouts/mozjs-fa11ffc7d4f1cc2d/6dff104/mozjs/js/src/gc/Allocator.cpp:343
      _ZN2js2gc9GCRuntime19checkAllocatorStateILNS_7AllowGCE1EEEbP9JSContextNS0_9AllocKindE
             at /home/joelm/.cargo/git/checkouts/mozjs-fa11ffc7d4f1cc2d/6dff104/mozjs/js/src/gc/Allocator.cpp:300
      _ZN2js14AllocateObjectILNS_7AllowGCE1EEEP8JSObjectP9JSContextNS_2gc9AllocKindEmNS6_11InitialHeapEPKNS_5ClassE
             at /home/joelm/.cargo/git/checkouts/mozjs-fa11ffc7d4f1cc2d/6dff104/mozjs/js/src/gc/Allocator.cpp:55
  11: _ZN2js11ProxyObject6createEP9JSContextPKNS_5ClassEN2JS6HandleINS_11TaggedProtoEEENS_2gc9AllocKindENS_13NewObjectKindE (0x5556943
4b63f)
             at /home/joelm/.cargo/git/checkouts/mozjs-fa11ffc7d4f1cc2d/6dff104/mozjs/js/src/vm/ProxyObject.cpp:199
  12: _ZN2js11ProxyObject3NewEP9JSContextPKNS_16BaseProxyHandlerEN2JS6HandleINS6_5ValueEEENS_11TaggedProtoERKNS_12ProxyOptionsE (0x555
69434b0e4)
             at /home/joelm/.cargo/git/checkouts/mozjs-fa11ffc7d4f1cc2d/6dff104/mozjs/js/src/vm/ProxyObject.cpp:100
  13: _ZN2js14NewProxyObjectEP9JSContextPKNS_16BaseProxyHandlerEN2JS6HandleINS5_5ValueEEEP8JSObjectRKNS_12ProxyOptionsE (0x5556944bd7a
1)
             at /home/joelm/.cargo/git/checkouts/mozjs-fa11ffc7d4f1cc2d/6dff104/mozjs/js/src/proxy/Proxy.cpp:779
      _ZN2js7Wrapper3NewEP9JSContextP8JSObjectPKS0_RKNS_14WrapperOptionsE
             at /home/joelm/.cargo/git/checkouts/mozjs-fa11ffc7d4f1cc2d/6dff104/mozjs/js/src/proxy/Wrapper.cpp:282
  14: WrapperNew (0x5556941eb3d7)
  15: _ZN2JS11Compartment18getOrCreateWrapperEP9JSContextNS_6HandleIP8JSObjectEENS_13MutableHandleIS5_EE (0x55569427b51c)
             at /home/joelm/.cargo/git/checkouts/mozjs-fa11ffc7d4f1cc2d/6dff104/mozjs/js/src/vm/Compartment.cpp:268
  16: _ZN2JS11Compartment6rewrapEP9JSContextNS_13MutableHandleIP8JSObjectEENS_6HandleIS5_EE (0x55569427b901)
             at /home/joelm/.cargo/git/checkouts/mozjs-fa11ffc7d4f1cc2d/6dff104/mozjs/js/src/vm/Compartment.cpp:357
  17: _ZN2js12RemapWrapperEP9JSContextP8JSObjectS3_ (0x5556944acf0e)
             at /home/joelm/.cargo/git/checkouts/mozjs-fa11ffc7d4f1cc2d/6dff104/mozjs/js/src/proxy/CrossCompartmentWrapper.cpp:582
  18: _ZN2js25RemapAllWrappersForObjectEP9JSContextP8JSObjectS3_ (0x5556944ad462)
             at /home/joelm/.cargo/git/checkouts/mozjs-fa11ffc7d4f1cc2d/6dff104/mozjs/js/src/proxy/CrossCompartmentWrapper.cpp:633
  19: _Z19JS_TransplantObjectP9JSContextN2JS6HandleIP8JSObjectEES5_ (0x55569447ec9c)
             at /home/joelm/.cargo/git/checkouts/mozjs-fa11ffc7d4f1cc2d/6dff104/mozjs/js/src/jsapi.cpp:731
  20: _ZN6script3dom11windowproxy11WindowProxy10set_window17h52ce1951a2216063E.llvm.1583787634904419619 (0x555693af1fd3)
  21: script::dom::window::Window::resume::h6e86d312ad70993d (0x5556934f5dbd)
  22: script::script_thread::ScriptThread::load::hf15b75fcfcd60129 (0x5556939e0c6e)
  23: script::script_thread::ScriptThread::handle_page_headers_available::ha41037a7982b0452 (0x5556939dc581)
  24: std::thread::local::LocalKey<T>::with::h3bc7497eb1a5059a (0x55569386eefd)
  25: <script::dom::servoparser::ParserContext as net_traits::FetchResponseListener>::process_response::h72bfea20576003c5 (0x5556937fe
d92)
  26: script::script_thread::ScriptThread::handle_msg_from_constellation::h37e0394a1b85329d (0x5556939cb48f)
  27: _ZN6script13script_thread12ScriptThread11handle_msgs17hac0c1aacd24f0633E.llvm.18161580367316437218 (0x5556939c57f8)
  28: profile_traits::mem::ProfilerChan::run_with_memory_reporting::h1b629c9f7392a890 (0x555693b450b7)
  29: std::sys_common::backtrace::__rust_begin_short_backtrace::h0097fbef27bdc8ce (0x555693e7a4f2)
  30: _ZN3std9panicking3try7do_call17h47262c4eba228992E.llvm.2547905980101764475 (0x555693f89d45)
  31: __rust_maybe_catch_panic (0x555695fce789)
             at src/libpanic_unwind/lib.rs:80
  32: core::ops::function::FnOnce::call_once{{vtable.shim}}::h51b7628431ca69a2 (0x5556934a3285)
  33: <alloc::boxed::Box<F> as core::ops::function::FnOnce<A>>::call_once::h82a57145aa4239a7 (0x555695fb32ee)
             at /rustc/dddb7fca09dc817ba275602b950bb81a9032fb6d/src/liballoc/boxed.rs:770
  34: <alloc::boxed::Box<F> as core::ops::function::FnOnce<A>>::call_once::h167c1ef971e93086 (0x555695fcdaaf)
             at /rustc/dddb7fca09dc817ba275602b950bb81a9032fb6d/src/liballoc/boxed.rs:770
      std::sys_common::thread::start_thread::h739b9b99c7f25b24
             at src/libstd/sys_common/thread.rs:13
      std::sys::unix::thread::Thread::new::thread_start::h79a2f27ba62f96ae
             at src/libstd/sys/unix/thread.rs:79
  35: start_thread (0x7fb6f6dd06b9)
  36: clone (0x7fb6f566c41c)
  37: <unknown> (0x0)
Servo exited with return value 4

I ran it again and it crashed at exactly the same point (819th page load) and with the same traces. I'm going to try and make the page a touch more complex and see if that affects anything as that might give a clue about where the crash is being triggered.

Okay, so with the more complicated test case (the addition of a second stylesheet file), it now panics with "Too many open files" traces after the 696th load. I've updated the gist with the new rend.css stylesheet and updated the HTML file to include it. Here is the trace/panic that was triggered this time:

$ ./mach run --release -z --webdriver=7002 --resolution=400x300



called `Result::unwrap()` on an `Err` value: Os { code: 24, kind: Other, message: "Too many open files" } (thread ScriptThread PipelineId { namespace_id: PipelineNamespaceId(1), index: PipelineIndex(2) }, at src/libcore/result.rs:1084)
stack backtrace:
   0: servo::main::{{closure}}::ha22eb134993c4424 (0x556fbdcc8e8f)
   1: std::panicking::rust_panic_with_hook::hec63884fa234b28d (0x556fc0ec1ae5)
             at src/libstd/panicking.rs:481
   2: std::panicking::continue_panic_fmt::h1272190bb9afc9ca (0x556fc0ec1581)
             at src/libstd/panicking.rs:384
   3: rust_begin_unwind (0x556fc0ec1465)
             at src/libstd/panicking.rs:311
   4: core::panicking::panic_fmt::h25bfecd575ec5ea2 (0x556fc0ee41bc)
             at src/libcore/panicking.rs:85
   5: core::result::unwrap_failed::hc09c6b44d3dd6f08 (0x556fc0ee42b6)
             at src/libcore/result.rs:1084
   6: script::fetch::FetchCanceller::initialize::hf310501f87637b73 (0x556fbe8b6c85)
   7: script::document_loader::DocumentLoader::fetch_async_background::h8687ec5c482a6f20 (0x556fbee1d09f)
   8: script::document_loader::DocumentLoader::fetch_async::h915b5e1878d99c28 (0x556fbee1cffc)
   9: script::dom::document::Document::fetch_async::h4749657441058e6d (0x556fbe857c18)
  10: script::stylesheet_loader::StylesheetLoader::load::hb13f68bb9bd0715c (0x556fbe8e6c5e)
  11: script::dom::htmllinkelement::HTMLLinkElement::handle_stylesheet_url::hbfdd56f3b098b5fc (0x556fbe9dc06f)
  12: <script::dom::htmllinkelement::HTMLLinkElement as script::dom::virtualmethods::VirtualMethods>::bind_to_tree::h0a3ee608fcdb3fa4 
(0x556fbe9db925)
  13: script::dom::node::Node::insert::h55baed3a814aa657 (0x556fbea880b4)
  14: script::dom::node::Node::pre_insert::hfd38fed65d3bff83 (0x556fbea87412)
  15: script::dom::servoparser::insert::hbe38dce40e96d280 (0x556fbe6ffd46)
  16: html5ever::tree_builder::TreeBuilder<Handle,Sink>::insert_element::h8083b8e2fe4adefe (0x556fbec39d1f)
  17: html5ever::tree_builder::TreeBuilder<Handle,Sink>::step::hd2875f6f9563d33d (0x556fbec656ea)
  18: <html5ever::tree_builder::TreeBuilder<Handle,Sink> as html5ever::tokenizer::interface::TokenSink>::process_token::he7420a061148e
71b (0x556fbebf4f63)
  19: _ZN9html5ever9tokenizer21Tokenizer$LT$Sink$GT$13process_token17hff7c4f1817192eebE.llvm.18110466598814287016 (0x556fbe271988)
  20: html5ever::tokenizer::Tokenizer<Sink>::emit_current_tag::he37613754012cdc7 (0x556fbe2722e1)
  21: html5ever::tokenizer::Tokenizer<Sink>::step::h5bfde99192403598 (0x556fbe27ab73)
  22: _ZN9html5ever9tokenizer21Tokenizer$LT$Sink$GT$3run17h161c6703aa6bbb12E.llvm.18110466598814287016 (0x556fbe27604a)
  23: script::dom::servoparser::html::Tokenizer::feed::hb6ebdb521cf099cc (0x556fbe2b327a)
  24: script::dom::servoparser::ServoParser::do_parse_sync::h30e5abb471f6aaa8 (0x556fbe6fa7ac)
  25: profile_traits::time::profile::h19e93b2dd772b80a (0x556fbea42b8f)
  26: _ZN6script3dom11servoparser11ServoParser10parse_sync17h9ef69ffecf72c88fE.llvm.8010111341268598948 (0x556fbe6fa305)
  27: <script::dom::servoparser::ParserContext as net_traits::FetchResponseListener>::process_response_chunk::h82f963e90fd00dac (0x556
fbe6fed4a)
  28: script::script_thread::ScriptThread::handle_msg_from_constellation::h37e0394a1b85329d (0x556fbe8c8368)
  29: _ZN6script13script_thread12ScriptThread11handle_msgs17hac0c1aacd24f0633E.llvm.18161580367316437218 (0x556fbe8c27f8)
  30: profile_traits::mem::ProfilerChan::run_with_memory_reporting::h1b629c9f7392a890 (0x556fbea420b7)
  31: std::sys_common::backtrace::__rust_begin_short_backtrace::h0097fbef27bdc8ce (0x556fbed774f2)
  32: _ZN3std9panicking3try7do_call17h47262c4eba228992E.llvm.2547905980101764475 (0x556fbee86d45)
  33: __rust_maybe_catch_panic (0x556fc0ecb789)
             at src/libpanic_unwind/lib.rs:80
  34: core::ops::function::FnOnce::call_once{{vtable.shim}}::h51b7628431ca69a2 (0x556fbe3a0285)
  35: <alloc::boxed::Box<F> as core::ops::function::FnOnce<A>>::call_once::h82a57145aa4239a7 (0x556fc0eb02ee)
             at /rustc/dddb7fca09dc817ba275602b950bb81a9032fb6d/src/liballoc/boxed.rs:770
  36: <alloc::boxed::Box<F> as core::ops::function::FnOnce<A>>::call_once::h167c1ef971e93086 (0x556fc0ecaaaf)
             at /rustc/dddb7fca09dc817ba275602b950bb81a9032fb6d/src/liballoc/boxed.rs:770
      std::sys_common::thread::start_thread::h739b9b99c7f25b24
             at src/libstd/sys_common/thread.rs:13
      std::sys::unix::thread::Thread::new::thread_start::h79a2f27ba62f96ae
             at src/libstd/sys/unix/thread.rs:79
  37: start_thread (0x7f2a1d9536b9)
  38: clone (0x7f2a1c1ef41c)
  39: <unknown> (0x0)
[2019-08-04T04:42:51Z ERROR servo] called `Result::unwrap()` on an `Err` value: Os { code: 24, kind: Other, message: "Too many open files" }
Pipeline script content process shutdown chan: Os { code: 24, kind: Other, message: "Too many open files" } (thread Constellation, at src/libcore/result.rs:1084)
stack backtrace:
   0: servo::main::{{closure}}::ha22eb134993c4424 (0x556fbdcc8e8f)
   1: std::panicking::rust_panic_with_hook::hec63884fa234b28d (0x556fc0ec1ae5)
             at src/libstd/panicking.rs:481
   2: std::panicking::continue_panic_fmt::h1272190bb9afc9ca (0x556fc0ec1581)
             at src/libstd/panicking.rs:384
   3: rust_begin_unwind (0x556fc0ec1465)
             at src/libstd/panicking.rs:311
   4: core::panicking::panic_fmt::h25bfecd575ec5ea2 (0x556fc0ee41bc)
             at src/libcore/panicking.rs:85
   5: core::result::unwrap_failed::hc09c6b44d3dd6f08 (0x556fc0ee42b6)
             at src/libcore/result.rs:1084
   6: constellation::pipeline::Pipeline::spawn::h5d0c2ec40f26cd14 (0x556fbde4e4a8)
   7: constellation::constellation::Constellation<Message,LTF,STF>::new_pipeline::h52453b5b66a96577 (0x556fbde63c91)
   8: constellation::constellation::Constellation<Message,LTF,STF>::handle_panic::hc5c580ab97accd1e (0x556fbde627ab)
   9: constellation::constellation::Constellation<Message,LTF,STF>::handle_log_entry::h90f95fbfb6405087 (0x556fbde65f01)
  10: constellation::constellation::Constellation<Message,LTF,STF>::handle_request_from_compositor::h84ca30df4fac3450 (0x556fbde7b45d)
  11: constellation::constellation::Constellation<Message,LTF,STF>::run::hab0833f05fdcdb29 (0x556fbde835a9)
  12: std::sys_common::backtrace::__rust_begin_short_backtrace::h8e5b89d53f23b04f (0x556fbdd7b023)
  13: std::panicking::try::do_call::hf7a763f365204be2 (0x556fbdd943c5)
  14: __rust_maybe_catch_panic (0x556fc0ecb789)
             at src/libpanic_unwind/lib.rs:80
  15: core::ops::function::FnOnce::call_once{{vtable.shim}}::h5abd434a6e2da7db (0x556fbdd945e5)
  16: <alloc::boxed::Box<F> as core::ops::function::FnOnce<A>>::call_once::h82a57145aa4239a7 (0x556fc0eb02ee)
             at /rustc/dddb7fca09dc817ba275602b950bb81a9032fb6d/src/liballoc/boxed.rs:770
  17: <alloc::boxed::Box<F> as core::ops::function::FnOnce<A>>::call_once::h167c1ef971e93086 (0x556fc0ecaaaf)
             at /rustc/dddb7fca09dc817ba275602b950bb81a9032fb6d/src/liballoc/boxed.rs:770
      std::sys_common::thread::start_thread::h739b9b99c7f25b24
             at src/libstd/sys_common/thread.rs:13
      std::sys::unix::thread::Thread::new::thread_start::h79a2f27ba62f96ae
             at src/libstd/sys/unix/thread.rs:79
  18: start_thread (0x7f2a1d9536b9)
  19: clone (0x7f2a1c1ef41c)
  20: <unknown> (0x0)
[2019-08-04T04:42:51Z ERROR servo] Pipeline script content process shutdown chan: Os { code: 24, kind: Other, message: "Too many open files" }
called `Result::unwrap()` on an `Err` value: RecvError (thread LayoutThread PipelineId { namespace_id: PipelineNamespaceId(1), index: PipelineIndex(677) }, at src/libcore/result.rs:1084)
stack backtrace:
   0: servo::main::{{closure}}::ha22eb134993c4424 (0x556fbdcc8e8f)
   1: std::panicking::rust_panic_with_hook::hec63884fa234b28d (0x556fc0ec1ae5)
             at src/libstd/panicking.rs:481
   2: std::panicking::continue_panic_fmt::h1272190bb9afc9ca (0x556fc0ec1581)
             at src/libstd/panicking.rs:384
   3: rust_begin_unwind (0x556fc0ec1465)
             at src/libstd/panicking.rs:311
   4: core::panicking::panic_fmt::h25bfecd575ec5ea2 (0x556fc0ee41bc)
             at src/libcore/panicking.rs:85
   5: core::result::unwrap_failed::hc09c6b44d3dd6f08 (0x556fc0ee42b6)
             at src/libcore/result.rs:1084
   6: layout_thread::LayoutThread::start::h2c49640050891be9 (0x556fbe0e7303)
   7: profile_traits::mem::ProfilerChan::run_with_memory_reporting::h0ca43ddf4d481a98 (0x556fbe15cde3)
   8: std::sys_common::backtrace::__rust_begin_short_backtrace::h662b9825ec83f9f6 (0x556fbe211774)
   9: _ZN3std9panicking3try7do_call17hc64d38bd13a9e5fbE.llvm.13006379931387932559 (0x556fbe17d5a3)
  10: __rust_maybe_catch_panic (0x556fc0ecb789)
             at src/libpanic_unwind/lib.rs:80
  11: core::ops::function::FnOnce::call_once{{vtable.shim}}::h27113effb06c5b11 (0x556fbe2123c2)
  12: <alloc::boxed::Box<F> as core::ops::function::FnOnce<A>>::call_once::h82a57145aa4239a7 (0x556fc0eb02ee)
             at /rustc/dddb7fca09dc817ba275602b950bb81a9032fb6d/src/liballoc/boxed.rs:770
  13: <alloc::boxed::Box<F> as core::ops::function::FnOnce<A>>::call_once::h167c1ef971e93086 (0x556fc0ecaaaf)
             at /rustc/dddb7fca09dc817ba275602b950bb81a9032fb6d/src/liballoc/boxed.rs:770
      std::sys_common::thread::start_thread::h739b9b99c7f25b24
             at src/libstd/sys_common/thread.rs:13
      std::sys::unix::thread::Thread::new::thread_start::h79a2f27ba62f96ae
             at src/libstd/sys/unix/thread.rs:79
  14: start_thread (0x7f2a1d9536b9)
  15: clone (0x7f2a1c1ef41c)
  16: <unknown> (0x0)
[2019-08-04T04:42:51Z ERROR servo] called `Result::unwrap()` on an `Err` value: RecvError
called `Result::unwrap()` on an `Err` value: Io(Custom { kind: ConnectionReset, error: "All senders for this socket closed" }) (thread StorageManager, at src/libcore/result.rs:1084)
stack backtrace:
   0: servo::main::{{closure}}::ha22eb134993c4424 (0x556fbdcc8e8f)
   1: std::panicking::rust_panic_with_hook::hec63884fa234b28d (0x556fc0ec1ae5)
             at src/libstd/panicking.rs:481
   2: std::panicking::continue_panic_fmt::h1272190bb9afc9ca (0x556fc0ec1581)
             at src/libstd/panicking.rs:384
   3: rust_begin_unwind (0x556fc0ec1465)
             at src/libstd/panicking.rs:311
   4: core::panicking::panic_fmt::h25bfecd575ec5ea2 (0x556fc0ee41bc)
             at src/libcore/panicking.rs:85
   5: core::result::unwrap_failed::hc09c6b44d3dd6f08 (0x556fc0ee42b6)
             at src/libcore/result.rs:1084
   6: net::storage_thread::StorageManager::start::h4a6d83404313abbe (0x556fbfcb570c)
   7: std::sys_common::backtrace::__rust_begin_short_backtrace::haa3dfb6db761201d (0x556fbfc08712)
   8: _ZN3std9panicking3try7do_call17h92e35334f31bfdfaE.llvm.9729766681674225750 (0x556fbfc5df5b)
   9: __rust_maybe_catch_panic (0x556fc0ecb789)
             at src/libpanic_unwind/lib.rs:80
  10: core::ops::function::FnOnce::call_once{{vtable.shim}}::hdaa65cd474a13c4d (0x556fbfcedd8f)
  11: <alloc::boxed::Box<F> as core::ops::function::FnOnce<A>>::call_once::h82a57145aa4239a7 (0x556fc0eb02ee)
             at /rustc/dddb7fca09dc817ba275602b950bb81a9032fb6d/src/liballoc/boxed.rs:770
  12: <alloc::boxed::Box<F> as core::ops::function::FnOnce<A>>::call_once::h167c1ef971e93086 (0x556fc0ecaaaf)
             at /rustc/dddb7fca09dc817ba275602b950bb81a9032fb6d/src/liballoc/boxed.rs:770
      std::sys_common::thread::start_thread::h739b9b99c7f25b24
             at src/libstd/sys_common/thread.rs:13
      std::sys::unix::thread::Thread::new::thread_start::h79a2f27ba62f96ae
             at src/libstd/sys/unix/thread.rs:79
  13: start_thread (0x7f2a1d9536b9)
  14: clone (0x7f2a1c1ef41c)
  15: <unknown> (0x0)
[2019-08-04T04:42:51Z ERROR servo] called `Result::unwrap()` on an `Err` value: Io(Custom { kind: ConnectionReset, error: "All senders for this socket closed" })

@kanaka Thanks for the test case!

I ran it locally and didn't get the crash (at least not until somewhere past 2000 loads, when I stopped), but I'm on Mac, so it might not be as easy to reproduce.

Interesting to see from the last traceback that it no longer crashes at script::stylesheet_loader::StylesheetLoader::load; it now crashes at script::fetch::FetchCanceller::initialize, which happens a bit later in the same call and which also creates an ipc channel.

So I've turned off fetch cancelling in https://github.com/servo/servo/pull/23909, and expanded the approach to almost every place in script where we create an ipc-channel (with the notable exception of the image cache, which is harder to do because the images are in shared memory...).

(I think the fetch canceller could be re-implemented with shared memory, maybe by just having a byte be 0 or 1?)
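
As a rough illustration of that "one shared byte" idea, here is a hypothetical in-process analogue (this is not Servo's current FetchCanceller, and a real cross-process version would need a writable shared-memory mapping rather than an Arc):

// Hypothetical in-process analogue of the "byte be 0 or 1" cancellation idea;
// NOT how Servo's FetchCanceller currently works. A real cross-process version
// would need a writable shared-memory mapping instead of an Arc.
use std::sync::atomic::{AtomicBool, Ordering};
use std::sync::Arc;
use std::thread;
use std::time::Duration;

#[derive(Clone)]
struct CancelFlag(Arc<AtomicBool>);

impl CancelFlag {
    fn new() -> Self {
        CancelFlag(Arc::new(AtomicBool::new(false)))
    }
    // Set by the initiator (e.g. script) to request cancellation.
    fn cancel(&self) {
        self.0.store(true, Ordering::SeqCst);
    }
    // Polled by the fetch side between chunks; no channel or extra fd involved.
    fn is_cancelled(&self) -> bool {
        self.0.load(Ordering::SeqCst)
    }
}

fn main() {
    let flag = CancelFlag::new();
    let fetch_flag = flag.clone();
    let worker = thread::spawn(move || {
        for chunk in 0..1000 {
            if fetch_flag.is_cancelled() {
                println!("fetch cancelled before chunk {}", chunk);
                return;
            }
            thread::sleep(Duration::from_millis(1)); // pretend to fetch a chunk
        }
    });
    flag.cancel();
    worker.join().unwrap();
}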

sending the same ipc channel over ipc many times uses up an fd each time it's sent, so recycling the ipc channel doesn't help

@asajeffrey Thanks for pointing that out; I've had a look at the various issues you linked to.

Ok, so each time we use the ROUTER in script, we create 3 fds? Two because we create a channel (one fd per endpoint), and a third because we then ipc the sender to fetch or elsewhere.

In that case, the "shared router" approach still saves us 2/3 of the fds for each callback, since we only need to ipc the sender half and don't need to re-create a channel each time.

What would be more feasible to "fix" inside ipc-channel: avoiding the extra fd upon sending a sender, or avoiding the two fds when creating the channel?

And could a change like the one below, even if just at the interface level, give us more flexibility to optimize what an "ipc callback" is later down the road?

For example, an IpcHandle could, instead of internally containing a sender, contain something else that allows it to create a sender after having been transferred into the process where it will be used.

[Screenshot (2019-08-05, 10:38 AM): sketch of the proposed interface change]

sending the same ipc channel over ipc many times uses up an fd each time it's sent, so recycling the ipc channel doesn't help

@asajeffrey

I think the problem we're facing here might be less one of optimizing the underlying OS operations and more one of rationalizing the higher-level interfaces, so that we don't create a channel for each operation, and also avoid sending a clone of a sender across processes for each operation.

I can imagine the IpcHandle only including its callback id, not an actual sender.

Then, each operation would consist of sending a handle across processes, with each handle corresponding to a callback in the "receiving" process.

The SharedRouter would have to be paired with some sort of SharedDispatcher, which would use a single sender per process it is paired with, and the actual underlying "message" would be routed using the CallbackId.

So you'd have only one channel with fetch per script process, and each channel would be used for all operations between a given script process and fetch (the resource process). And when a callback is registered, no sender would have to be sent across processes; we would only send the handle containing a unique callback id.

And we should somehow do this while retaining the ability for script to just register a callback "locally", as is currently done with the router (versus having to go through the main event loop).
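
To make that concrete, here is a hypothetical sketch of the handle/dispatcher split described above; none of these types exist in Servo or ipc-channel, and the single shared ipc channel that would carry the (handle, payload) pairs is elided:

// Hypothetical sketch of the CallbackId / SharedDispatcher idea; none of these
// types exist in Servo or ipc-channel. The one shared ipc channel that would
// carry (IpcHandle, payload) pairs between processes is elided.
use std::collections::HashMap;

type CallbackId = u64;

// Sent across processes instead of an IpcSender: it only names a callback,
// so transferring it costs no extra file descriptor.
#[derive(Clone, Copy, Debug)]
struct IpcHandle {
    callback: CallbackId,
}

// Lives in the process that registered the callbacks; reads every message off
// the shared channel and routes it to the callback named by its id.
struct SharedDispatcher {
    next_id: CallbackId,
    callbacks: HashMap<CallbackId, Box<dyn FnMut(Vec<u8>) + Send>>,
}

impl SharedDispatcher {
    fn new() -> Self {
        SharedDispatcher { next_id: 0, callbacks: HashMap::new() }
    }

    // Register a local callback and get a handle that can be sent to the other
    // process (e.g. the resource process) so it can name this callback later.
    fn register(&mut self, callback: Box<dyn FnMut(Vec<u8>) + Send>) -> IpcHandle {
        let id = self.next_id;
        self.next_id += 1;
        self.callbacks.insert(id, callback);
        IpcHandle { callback: id }
    }

    // Called for each (handle, payload) message read off the shared channel.
    fn dispatch(&mut self, handle: IpcHandle, payload: Vec<u8>) {
        if let Some(cb) = self.callbacks.get_mut(&handle.callback) {
            cb(payload);
        }
    }
}

fn main() {
    let mut dispatcher = SharedDispatcher::new();
    let handle = dispatcher.register(Box::new(|payload| {
        println!("got a response of {} bytes", payload.len());
    }));
    // In the real design the other process would send (handle, bytes) over the
    // single shared ipc channel; here we invoke the dispatcher directly.
    dispatcher.dispatch(handle, vec![0u8; 1024]);
}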

Opening a dedicated issue for this...

@gterzian well, that's unfortunate that it didn't reproduce for you, although I suppose it's not too surprising that the behavior of open-file limits/handling is significantly different on OS X.

It's pretty easy for me to reproduce a crash. I made 10 copies of each of the normalize.css and rend.css files and included them in the test HTML. It now crashes after 155 loads for me and takes less than 4 minutes. That's in the noise compared to the time it takes to rebuild a new git version/tag. I'm happy to be the build/test "automation" for a while since this is high priority for me. Is there anything else I can do besides building/testing and posting stack traces?

The latest trace (after 155 loads of the HTML file with 20 CSS includes) has yet another signature (although the final line does mention too many open files):

$ ./mach run --release -z --webdriver=7002 --resolution=400x300



index out of bounds: the len is 1 but the index is 1 (thread ResourceManager, at /rustc/dddb7fca09dc817ba275602b950bb81a9032fb6d/src/libcore/slice/mod.rs:2696)
called `Result::unwrap()` on an `Err` value: Os { code: 24, kind: Other, message: "Too many open files" } (thread ScriptThread PipelineId { namespace_id: PipelineNamespaceId(1), index: PipelineIndex(2) }, at src/libcore/result.rs:1084)
stack backtrace:
   0: servo::main::{{closure}}::ha22eb134993c4424 (0x55e52757ee8f)
   1: std::panicking::rust_panic_with_hook::hec63884fa234b28d (0x55e52a777ae5)
             at src/libstd/panicking.rs:481
   2: std::panicking::continue_panic_fmt::h1272190bb9afc9ca (0x55e52a777581)
             at src/libstd/panicking.rs:384
   3: rust_begin_unwind (0x55e52a777465)
             at src/libstd/panicking.rs:311
   4: core::panicking::panic_fmt::h25bfecd575ec5ea2 (0x55e52a79a1bc)
             at src/libcore/panicking.rs:85
   5: core::panicking::panic_bounds_check::h0f74a512f14c92eb (0x55e52a79a174)
             at src/libcore/panicking.rs:61
   6: std::thread::local::LocalKey<T>::with::h4c5abe575f5a6d42 (0x55e52959c25e)
   7: <&mut bincode::de::Deserializer<R,O> as serde::de::Deserializer>::deserialize_option::h5b5c756e4e2c376a (0x55e529525cbf)
   8: <&mut bincode::de::Deserializer<R,O> as serde::de::VariantAccess>::tuple_variant::h2f734c1b3c5ad465 (0x55e529537586)
   9: <&mut bincode::de::Deserializer<R,O> as serde::de::VariantAccess>::tuple_variant::ha2dd0101cdbdbe28 (0x55e5295421ec)
  10: <net_traits::_IMPL_DESERIALIZE_FOR_CoreResourceMsg::<impl serde::de::Deserialize for net_traits::CoreResourceMsg>::deserialize::__Visitor as serde::de::Visitor>::visit_enum::h03259753b02c44fe (0x55e529511543)
  11: std::thread::local::LocalKey<T>::with::hf3db807819165627 (0x55e5295a0399)
  12: net::resource_thread::ResourceChannelManager::start::hdf906af053ccbf40 (0x55e5294fd6b9)
  13: profile_traits::mem::ProfilerChan::run_with_memory_reporting::h2ef8ee0a6562427f (0x55e5295c5de9)
  14: std::sys_common::backtrace::__rust_begin_short_backtrace::h32fbf41692eccddf (0x55e5294bcf33)
  15: _ZN3std9panicking3try7do_call17hf2b65c8c835e4623E.llvm.9729766681674225750 (0x55e5295140f5)
  16: __rust_maybe_catch_panic (0x55e52a781789)
             at src/libpanic_unwind/lib.rs:80
  17: core::ops::function::FnOnce::call_once{{vtable.shim}}::h6a2a4ed203f25f75 (0x55e5295a38b5)
  18: <alloc::boxed::Box<F> as core::ops::function::FnOnce<A>>::call_once::h82a57145aa4239a7 (0x55e52a7662ee)
             at /rustc/dddb7fca09dc817ba275602b950bb81a9032fb6d/src/liballoc/boxed.rs:770
  19: <alloc::boxed::Box<F> as core::ops::function::FnOnce<A>>::call_once::h167c1ef971e93086 (0x55e52a780aaf)
             at /rustc/dddb7fca09dc817ba275602b950bb81a9032fb6d/src/liballoc/boxed.rs:770
      std::sys_common::thread::start_thread::h739b9b99c7f25b24
             at src/libstd/sys_common/thread.rs:13
      std::sys::unix::thread::Thread::new::thread_start::h79a2f27ba62f96ae
             at src/libstd/sys/unix/thread.rs:79
  20: start_thread (0x7f69fefd36b9)
  21: clone (0x7f69fd86f41c)
  22: <unknown> (0x0)
[2019-08-05T04:47:46Z ERROR servo] index out of bounds: the len is 1 but the index is 1
stack backtrace:
   0: servo::main::{{closure}}::ha22eb134993c4424 (0x55e52757ee8f)
   1: std::panicking::rust_panic_with_hook::hec63884fa234b28d (0x55e52a777ae5)
             at src/libstd/panicking.rs:481
   2: std::panicking::continue_panic_fmt::h1272190bb9afc9ca (0x55e52a777581)
             at src/libstd/panicking.rs:384
   3: rust_begin_unwind (0x55e52a777465)
             at src/libstd/panicking.rs:311
   4: core::panicking::panic_fmt::h25bfecd575ec5ea2 (0x55e52a79a1bc)
             at src/libcore/panicking.rs:85
   5: core::result::unwrap_failed::hc09c6b44d3dd6f08 (0x55e52a79a2b6)
             at src/libcore/result.rs:1084
   6: script::fetch::FetchCanceller::initialize::hf310501f87637b73 (0x55e52816cc85)
   7: script::document_loader::DocumentLoader::fetch_async_background::h8687ec5c482a6f20 (0x55e5286d309f)
   8: script::document_loader::DocumentLoader::fetch_async::h915b5e1878d99c28 (0x55e5286d2ffc)
   9: script::dom::document::Document::fetch_async::h4749657441058e6d (0x55e52810dc18)
  10: script::stylesheet_loader::StylesheetLoader::load::hb13f68bb9bd0715c (0x55e52819cc5e)
  11: script::dom::htmllinkelement::HTMLLinkElement::handle_stylesheet_url::hbfdd56f3b098b5fc (0x55e52829206f)
  12: <script::dom::htmllinkelement::HTMLLinkElement as script::dom::virtualmethods::VirtualMethods>::bind_to_tree::h0a3ee608fcdb3fa4 (0x55e528291925)
  13: script::dom::node::Node::insert::h55baed3a814aa657 (0x55e52833e0b4)
  14: script::dom::node::Node::pre_insert::hfd38fed65d3bff83 (0x55e52833d412)
  15: script::dom::servoparser::insert::hbe38dce40e96d280 (0x55e527fb5d46)
  16: html5ever::tree_builder::TreeBuilder<Handle,Sink>::insert_element::h8083b8e2fe4adefe (0x55e5284efd1f)
  17: html5ever::tree_builder::TreeBuilder<Handle,Sink>::step::hd2875f6f9563d33d (0x55e52851b6ea)
  18: <html5ever::tree_builder::TreeBuilder<Handle,Sink> as html5ever::tokenizer::interface::TokenSink>::process_token::he7420a061148e71b (0x55e5284aaf63)
  19: _ZN9html5ever9tokenizer21Tokenizer$LT$Sink$GT$13process_token17hff7c4f1817192eebE.llvm.18110466598814287016 (0x55e527b27988)
  20: html5ever::tokenizer::Tokenizer<Sink>::emit_current_tag::he37613754012cdc7 (0x55e527b282e1)
  21: html5ever::tokenizer::Tokenizer<Sink>::step::h5bfde99192403598 (0x55e527b30b73)
  22: _ZN9html5ever9tokenizer21Tokenizer$LT$Sink$GT$3run17h161c6703aa6bbb12E.llvm.18110466598814287016 (0x55e527b2c04a)
  23: script::dom::servoparser::html::Tokenizer::feed::hb6ebdb521cf099cc (0x55e527b6927a)
  24: script::dom::servoparser::ServoParser::do_parse_sync::h30e5abb471f6aaa8 (0x55e527fb07ac)
  25: profile_traits::time::profile::h19e93b2dd772b80a (0x55e5282f8b8f)
  26: _ZN6script3dom11servoparser11ServoParser10parse_sync17h9ef69ffecf72c88fE.llvm.8010111341268598948 (0x55e527fb0305)
  27: <script::dom::servoparser::ParserContext as net_traits::FetchResponseListener>::process_response_chunk::h82f963e90fd00dac (0x55e527fb4d4a)
  28: script::script_thread::ScriptThread::handle_msg_from_constellation::h37e0394a1b85329d (0x55e52817e368)
  29: _ZN6script13script_thread12ScriptThread11handle_msgs17hac0c1aacd24f0633E.llvm.18161580367316437218 (0x55e5281787f8)
  30: profile_traits::mem::ProfilerChan::run_with_memory_reporting::h1b629c9f7392a890 (0x55e5282f80b7)
  31: std::sys_common::backtrace::__rust_begin_short_backtrace::h0097fbef27bdc8ce (0x55e52862d4f2)
  32: _ZN3std9panicking3try7do_call17h47262c4eba228992E.llvm.2547905980101764475 (0x55e52873cd45)
  33: __rust_maybe_catch_panic (0x55e52a781789)
             at src/libpanic_unwind/lib.rs:80
  34: core::ops::function::FnOnce::call_once{{vtable.shim}}::h51b7628431ca69a2 (0x55e527c56285)
  35: <alloc::boxed::Box<F> as core::ops::function::FnOnce<A>>::call_once::h82a57145aa4239a7 (0x55e52a7662ee)
             at /rustc/dddb7fca09dc817ba275602b950bb81a9032fb6d/src/liballoc/boxed.rs:770
  36: <alloc::boxed::Box<F> as core::ops::function::FnOnce<A>>::call_once::h167c1ef971e93086 (0x55e52a780aaf)
             at /rustc/dddb7fca09dc817ba275602b950bb81a9032fb6d/src/liballoc/boxed.rs:770
      std::sys_common::thread::start_thread::h739b9b99c7f25b24
             at src/libstd/sys_common/thread.rs:13
      std::sys::unix::thread::Thread::new::thread_start::h79a2f27ba62f96ae
             at src/libstd/sys/unix/thread.rs:79
  37: start_thread (0x7f69fefd36b9)
  38: clone (0x7f69fd86f41c)
  39: <unknown> (0x0)
[2019-08-05T04:47:46Z ERROR servo] called `Result::unwrap()` on an `Err` value: Os { code: 24, kind: Other, message: "Too many open files" }

Thanks!

index out of bounds: the len is 1 but the index is 1

I've seen this one before in the context of shared memory with the image cache, although now it seems to come up with _IMPL_DESERIALIZE_FOR_CoreResourceMsg.

Does anyone know if we have some shared-memory somewhere in the CoreResourceMsg?

 9: script::dom::document::Document::fetch_async::h4749657441058e6d (0x55e52810dc18)
  10: script::stylesheet_loader::StylesheetLoader::load::hb13f68bb9bd0715c (0x55e52819cc5e)
  11: script::dom::htmllinkelement::HTMLLinkElement::handle_stylesheet_url::hbfdd56f3b098b5fc (0x55e52829206f)

Did you rebuild from master by any chance? The traceback seems to imply that the branch at https://github.com/servo/servo/pull/23909 was not used (since it would have included a call to script::dom::document::Document::fetch_async_with_handle instead)...

I've also just updated the branch about 25 min ago with something that might be relevant.

@gterzian Yes, sorry that wasn't clear. I updated to master yesterday since I wasn't able to start servo with the #23761 branch that I previously tried. I will re-target at #23909 and report back (although maybe not tonight, as it's already fairly late here).

Thanks a lot, and sorry for making you re-compile so often!

Also, to be clear, my branch is really just a band-aid which may or may not help...

We're going to have to review our overall approach to this issue more fundamentally, which might not be doable within an acceptable time-frame for your own project...

In any case, thank you for stressing Servo, it helps to make it better!

@gterzian no problem. One of my original inspirations for this dissertation topic was discovering Servo and realizing that there was an interesting gap in the space of "automated testing of software with multiple implementations". I do hope that I can contribute to the quality of Servo's rendering engine (and browser rendering standardization in general). So I'm fairly motivated to include Servo in my results/data, but depending on timing I may have to make do with comparing Chrome and Firefox for my actual dissertation results. On the other hand, since I'm spending a lot of time writing, it's pretty easy for me to do builds/tests in the background.

Well, it does appear to have done something positive. The test ran for 428 loads and then the screenshot command hung. I manually loaded a page again and that seems to have unstuck it. So I'll add a little more intelligent error handling to the script so that it will keep going in that case (there was no error from servo itself). But that will have to wait for tomorrow because it's 1:45 in the morning here.

BTW, for anybody curious, the system I'm building as part of my research is at https://github.com/kanaka/bartender. I presented an early version of it at COMPSAC last year ("Property-Based Testing of Browser Rendering Engines with a Consensus Oracle"). The code base is still really rough and lacking in friendly documentation, and likely to remain so for a while, so caveat emptor.

Note that the branch at https://github.com/servo/servo/pull/23909 has been updated again (hopefully for the better).

Looks like a very interesting project, good luck with the dissertation!

@gterzian I've rebuilt using #23909 just now and am doing a test run now.

Okay, things look better: 585 page loads (the same test case crashed after 155 loads on master), but I just hit a "too many open files" again. BTW, is posting the stack traces here the right process flow for this? I'm happy to keep doing that, but it's pretty noisy, so I wanted to make sure I'm using the preferred process for this sort of debugging. Here is the trace:

$ ./mach run --release -z --webdriver=7002 --resolution=400x300                             



called `Result::unwrap()` on an `Err` value: Os { code: 24, kind: Other, message: "Too many open files" } (thread FontCacheThread, at src/libcore/result.rs:1051)
stack backtrace:
   0: servo::main::{{closure}}::hc640cc81285f1d17 (0x55dd03eb855f)
   1: std::panicking::rust_panic_with_hook::h0529069ab88f357a (0x55dd07104ca5)
             at src/libstd/panicking.rs:481
   2: std::panicking::continue_panic_fmt::h6a820a3cd2914e74 (0x55dd07104741)
             at src/libstd/panicking.rs:384
   3: rust_begin_unwind (0x55dd07104625)
             at src/libstd/panicking.rs:311
   4: core::panicking::panic_fmt::he00cfaca5555542a (0x55dd0712738c)
             at src/libcore/panicking.rs:85
   5: core::result::unwrap_failed::h4239b9d80132a0db (0x55dd07127486)
             at src/libcore/result.rs:1051
   6: net_traits::fetch_async::ha94b8325e48d695a (0x55dd0603689b)
   7: gfx::font_cache_thread::FontCache::run::h4e3b77fafc00098e (0x55dd060511a9)
   8: _ZN3std10sys_common9backtrace28__rust_begin_short_backtrace17ha24f6f6eeeeab227E.llvm.8879056639272367168 (0x55dd06031281)
   9: _ZN3std9panicking3try7do_call17hf50e2a3403b07585E.llvm.8342416727357815644 (0x55dd06025d94)
  10: __rust_maybe_catch_panic (0x55dd0710e959)
             at src/libpanic_unwind/lib.rs:82
  11: core::ops::function::FnOnce::call_once{{vtable.shim}}::h69f7be2cb98e9948 (0x55dd06025e49)
  12: <alloc::boxed::Box<F> as core::ops::function::FnOnce<A>>::call_once::h10faab75e4737451 (0x55dd070f348e)
             at /rustc/273f42b5964c29dda2c5a349dd4655529767b07f/src/liballoc/boxed.rs:766
  13: <alloc::boxed::Box<F> as core::ops::function::FnOnce<A>>::call_once::h1aad66e3be56d28c (0x55dd0710dc7f)
             at /rustc/273f42b5964c29dda2c5a349dd4655529767b07f/src/liballoc/boxed.rs:766
      std::sys_common::thread::start_thread::h9a8131af389e9a10
             at src/libstd/sys_common/thread.rs:13
      std::sys::unix::thread::Thread::new::thread_start::h5c5575a5d923977b
             at src/libstd/sys/unix/thread.rs:79
  14: start_thread (0x7f766aa046b9)
  15: clone (0x7f76692a041c)
  16: <unknown> (0x0)
[2019-08-05T16:33:36Z ERROR servo] called `Result::unwrap()` on an `Err` value: Os { code: 24, kind: Other, message: "Too many open files" }
Font cache thread has already exited. (thread LayoutThread PipelineId { namespace_id: PipelineNamespaceId(1), index: PipelineIndex(587) }, at components/gfx/font_cache_thread.rs:547)
stack backtrace:
   0: servo::main::{{closure}}::hc640cc81285f1d17 (0x55dd03eb855f)
   1: std::panicking::rust_panic_with_hook::h0529069ab88f357a (0x55dd07104ca5)
             at src/libstd/panicking.rs:481
   2: std::panicking::begin_panic::h7d65f24c8a25d032 (0x55dd06025d64)
   3: <gfx::font_cache_thread::FontCacheThread as gfx::font_context::FontSource>::font_template::he984596d9d227d2c (0x55dd0605337d)
   4: gfx::font_context::FontContext<S>::font::h9f8f8f2b3d94829f (0x55dd05c275d8)
   5: <core::iter::adapters::FilterMap<I,F> as core::iter::traits::iterator::Iterator>::try_fold::{{closure}}::h3feef5a4d56b0114 (0x55
dd05bdf558)
   6: gfx::font::FontGroup::find_by_codepoint::h6e8943b4b03009c7 (0x55dd05bdfd80)
   7: layout::text::TextRunScanner::scan_for_runs::h9cf52ebdbef6abe6 (0x55dd05b94d43)
   8: std::thread::local::LocalKey<T>::with::h345ac84dd5d8d399 (0x55dd042b7deb)
   9: layout::construct::FlowConstructor<ConcreteThreadSafeLayoutNode>::build_flow_for_block_starting_with_fragments::h5f3cbfad4aa463c
7 (0x55dd0432d32e)
  10: layout::construct::FlowConstructor<ConcreteThreadSafeLayoutNode>::build_flow_for_block_like::h9cbefffe4162378a (0x55dd04329d84)
  11: layout::construct::FlowConstructor<ConcreteThreadSafeLayoutNode>::build_flow_for_block::h53b7a57c6cbd9488 (0x55dd043292f7)
  12: <layout::construct::FlowConstructor<ConcreteThreadSafeLayoutNode> as layout::traversal::PostorderNodeMutTraversal<ConcreteThread
SafeLayoutNode>>::process::had0ede1d5e30fcd5 (0x55dd04315e0f)
  13: style::traversal::DomTraversal::handle_postorder_traversal::hfe569770760df667 (0x55dd042ce47d)
  14: style::driver::traverse_dom::h3a9fa037b95b9c1d (0x55dd04320e85)
  15: profile_traits::time::profile::h22a7d3486c3037bd (0x55dd043fdd2d)
  16: layout_thread::LayoutThread::handle_reflow::h6eef906207d3532e (0x55dd042e0d21)
  17: profile_traits::time::profile::hc7aed90799e144a5 (0x55dd043ffee0)
  18: layout_thread::LayoutThread::handle_request_helper::h2ae9f6dced700516 (0x55dd042d9741)
  19: layout_thread::LayoutThread::start::h08e3cf758da40028 (0x55dd042d8142)
  20: profile_traits::mem::ProfilerChan::run_with_memory_reporting::h35bdb6813930e34c (0x55dd0434e1f3)
  21: std::sys_common::backtrace::__rust_begin_short_backtrace::h0a5fc8dbd652687d (0x55dd04402c94)
  22: _ZN3std9panicking3try7do_call17h4d6a9707ab958757E.llvm.16612881806986651592 (0x55dd0436e6a3)
  23: __rust_maybe_catch_panic (0x55dd0710e959)
             at src/libpanic_unwind/lib.rs:82
  24: core::ops::function::FnOnce::call_once{{vtable.shim}}::hf1597540cbda1157 (0x55dd044038e2)
  25: <alloc::boxed::Box<F> as core::ops::function::FnOnce<A>>::call_once::h10faab75e4737451 (0x55dd070f348e)
             at /rustc/273f42b5964c29dda2c5a349dd4655529767b07f/src/liballoc/boxed.rs:766
  26: <alloc::boxed::Box<F> as core::ops::function::FnOnce<A>>::call_once::h1aad66e3be56d28c (0x55dd0710dc7f)
             at /rustc/273f42b5964c29dda2c5a349dd4655529767b07f/src/liballoc/boxed.rs:766
      std::sys_common::thread::start_thread::h9a8131af389e9a10
             at src/libstd/sys_common/thread.rs:13
      std::sys::unix::thread::Thread::new::thread_start::h5c5575a5d923977b
             at src/libstd/sys/unix/thread.rs:79
  27: start_thread (0x7f766aa046b9)
  28: clone (0x7f76692a041c)
  29: <unknown> (0x0)
[2019-08-05T16:33:36Z ERROR servo] Font cache thread has already exited.
assertion failed: !self.Document().needs_reflow() ||
    (!for_display && self.Document().needs_paint()) ||
    self.suppress_reflow.get() (thread ScriptThread PipelineId { namespace_id: PipelineNamespaceId(1), index: PipelineIndex(2) }, at components/script/dom/window.rs:1569)
stack backtrace:
   0: servo::main::{{closure}}::hc640cc81285f1d17 (0x55dd03eb855f)
   1: std::panicking::rust_panic_with_hook::h0529069ab88f357a (0x55dd07104ca5)
             at src/libstd/panicking.rs:481
   2: std::panicking::begin_panic::h837454fe99a7b859 (0x55dd05072e74)
   3: script::dom::window::Window::reflow::h4ae77664da473ef4 (0x55dd045f76ff)
   4: _ZN6script13script_thread12ScriptThread11handle_msgs17h3d6e4667a46c1aacE.llvm.9635453818626840606 (0x55dd04acb54b)
   5: profile_traits::mem::ProfilerChan::run_with_memory_reporting::ha4fa5c2683b8610a (0x55dd04c44647)
   6: std::sys_common::backtrace::__rust_begin_short_backtrace::h08e8a7427af58b63 (0x55dd04f80bd2)
   7: _ZN3std9panicking3try7do_call17h1f7f1756cff56eb2E.llvm.431560401037778374 (0x55dd0507e065)
   8: __rust_maybe_catch_panic (0x55dd0710e959)
             at src/libpanic_unwind/lib.rs:82
   9: core::ops::function::FnOnce::call_once{{vtable.shim}}::h0c22c75b1f15d589 (0x55dd04591a65)
  10: <alloc::boxed::Box<F> as core::ops::function::FnOnce<A>>::call_once::h10faab75e4737451 (0x55dd070f348e)
             at /rustc/273f42b5964c29dda2c5a349dd4655529767b07f/src/liballoc/boxed.rs:766
  11: <alloc::boxed::Box<F> as core::ops::function::FnOnce<A>>::call_once::h1aad66e3be56d28c (0x55dd0710dc7f)
             at /rustc/273f42b5964c29dda2c5a349dd4655529767b07f/src/liballoc/boxed.rs:766
      std::sys_common::thread::start_thread::h9a8131af389e9a10
             at src/libstd/sys_common/thread.rs:13
      std::sys::unix::thread::Thread::new::thread_start::h5c5575a5d923977b
             at src/libstd/sys/unix/thread.rs:79
  12: start_thread (0x7f766aa046b9)
  13: clone (0x7f76692a041c)
  14: <unknown> (0x0)
[2019-08-05T16:33:36Z ERROR servo] assertion failed: !self.Document().needs_reflow() ||
    (!for_display && self.Document().needs_paint()) ||
    self.suppress_reflow.get()
called `Result::unwrap()` on an `Err` value: "SendError(..)" (thread ScriptThread PipelineId { namespace_id: PipelineNamespaceId(1), i
ndex: PipelineIndex(588) }, at src/libcore/result.rs:1051)
stack backtrace:
   0: servo::main::{{closure}}::hc640cc81285f1d17 (0x55dd03eb855f)
   1: std::panicking::rust_panic_with_hook::h0529069ab88f357a (0x55dd07104ca5)
             at src/libstd/panicking.rs:481
   2: std::panicking::continue_panic_fmt::h6a820a3cd2914e74 (0x55dd07104741)
             at src/libstd/panicking.rs:384
   3: rust_begin_unwind (0x55dd07104625)
             at src/libstd/panicking.rs:311
   4: core::panicking::panic_fmt::he00cfaca5555542a (0x55dd0712738c)
             at src/libcore/panicking.rs:85
   5: core::result::unwrap_failed::h4239b9d80132a0db (0x55dd07127486)
             at src/libcore/result.rs:1051
   6: script::script_thread::InProgressLoad::new::hecd58dcd9b7d57f2 (0x55dd04ac2ca7)
   7: std::sys_common::backtrace::__rust_begin_short_backtrace::h08e8a7427af58b63 (0x55dd04f80a5c)
   8: _ZN3std9panicking3try7do_call17h1f7f1756cff56eb2E.llvm.431560401037778374 (0x55dd0507e065)
   9: __rust_maybe_catch_panic (0x55dd0710e959)
             at src/libpanic_unwind/lib.rs:82
  10: core::ops::function::FnOnce::call_once{{vtable.shim}}::h0c22c75b1f15d589 (0x55dd04591a65)
  11: <alloc::boxed::Box<F> as core::ops::function::FnOnce<A>>::call_once::h10faab75e4737451 (0x55dd070f348e)
             at /rustc/273f42b5964c29dda2c5a349dd4655529767b07f/src/liballoc/boxed.rs:766
  12: <alloc::boxed::Box<F> as core::ops::function::FnOnce<A>>::call_once::h1aad66e3be56d28c (0x55dd0710dc7f)
             at /rustc/273f42b5964c29dda2c5a349dd4655529767b07f/src/liballoc/boxed.rs:766
      std::sys_common::thread::start_thread::h9a8131af389e9a10
             at src/libstd/sys_common/thread.rs:13
      std::sys::unix::thread::Thread::new::thread_start::h5c5575a5d923977b
             at src/libstd/sys/unix/thread.rs:79
  13: start_thread (0x7f766aa046b9)
  14: clone (0x7f76692a041c)
  15: <unknown> (0x0)
[2019-08-05T16:33:36Z ERROR servo] called `Result::unwrap()` on an `Err` value: "SendError(..)"
Stack trace for thread "ScriptThread PipelineId { namespace_id: PipelineNamespaceId(1), index: PipelineIndex(588) }"
stack backtrace:
   0: servo::install_crash_handler::handler::h899be081ae6e691f (0x55dd03eb77a0)
   1: _ZL15WasmTrapHandleriP9siginfo_tPv (0x55dd059c463e)
             at /home/joelm/.cargo/git/checkouts/mozjs-fa11ffc7d4f1cc2d/6dff104/mozjs/js/src/wasm/WasmSignalHandlers.cpp:967
   2: <unknown> (0x7f766aa0e38f)
   3: isfree (0x55dd060cb9a0)
             at /home/joelm/.cargo/registry/src/github.com-1ecc6299db9ec823/jemalloc-sys-0.1.4/jemalloc/src/jemalloc.c:0
   4: core::ptr::real_drop_in_place::h3e12bc60007f3f42 (0x55dd04f88e05)
   5: std::sys_common::backtrace::__rust_begin_short_backtrace::h08e8a7427af58b63 (0x55dd04f80dfd)
   6: <unknown> (0x55dd08c08d17)
Servo exited with return value 11

I'm running again to see if the failure is consistent.

Looks consistent. The second run crashed on the 586th page load. The trace looks similar, so I won't post it again.

Thanks.

Things are going in the right direction, because the current crash appears at gfx::font_cache_thread::FontCache::run instead of in script. I've only changed things in script so far, so it's encouraging that script doesn't crash anymore (at least not before that particular crash).

I'll look into applying the same pattern to the font-cache...

is posting the stack traces here the right process flow for this?

Works well, at least for me...

@kanaka Ok, it's worth another try with the updated branch; the font-cache crash should have been fixed.

I updated the build. I got a crash at the 588th load, but the stack frames are missing location data. I'm going to try a clean full build to see if it's a build issue. Here is the trace:

$ ./mach run --release -z --webdriver=7002 --resolution=400x300



index out of bounds: the len is 0 but the index is 0 (thread ResourceManager, at /rustc/273f42b5964c29dda2c5a349dd4655529767b07f/src/libcore/slice/mod.rs:2687)
stack backtrace:
   0: <no info> (0x56347618d68f)
   1: <no info> (0x5634793e8525)
   2: <no info> (0x5634793e7fc1)
   3: <no info> (0x5634793e7ea5)
   4: <no info> (0x56347940ac0c)
   5: <no info> (0x56347940abc4)
   6: <no info> (0x5634781fd4c8)
   7: <no info> (0x563478184aab)
   8: <no info> (0x563478155e8c)
   9: <no info> (0x5634781ed177)
  10: <no info> (0x5634781f6a7d)
  11: <no info> (0x563478143785)
  12: <no info> (0x563478210f29)
  13: <no info> (0x56347810aa33)
  14: <no info> (0x563478127f65)
  15: <no info> (0x5634793f21d9)
  16: <no info> (0x5634781e3955)
  17: <no info> (0x5634793d6d0e)
  18: <no info> (0x5634793f14ff)
  19: <no info> (0x7f8cc26b86b9)
  20: <no info> (0x7f8cc0f5441c)
  21: <no info> (0x0)
[2019-08-06T15:48:33Z ERROR servo] index out of bounds: the len is 0 but the index is 0
called `Result::unwrap()` on an `Err` value: Io(Os { code: 32, kind: BrokenPipe, message: "Broken pipe" }) (thread ScriptThread PipelineId { namespace_id: PipelineNamespaceId(2), index: PipelineIndex(2) }, at src/libcore/result.rs:1051)
stack backtrace:
   0: <no info> (0x56347618d68f)
   1: <no info> (0x5634793e8525)
   2: <no info> (0x5634793e7fc1)
   3: <no info> (0x5634793e7ea5)
   4: <no info> (0x56347940ac0c)
   5: <no info> (0x56347940ad06)
   6: <no info> (0x5634771f01b5)
   7: <no info> (0x5634771f0075)
   8: <no info> (0x563476d35d39)
   9: <no info> (0x563476dc9bdf)
  10: <no info> (0x563476eb3b84)
  11: <no info> (0x563476eb3415)
  12: <no info> (0x563476f5bde4)
  13: <no info> (0x563476f5b122)
  14: <no info> (0x563476d6d22e)
  15: <no info> (0x563477103cdf)
  16: <no info> (0x56347712f838)
  17: <no info> (0x5634770b3723)
  18: <no info> (0x563476786ad8)
  19: <no info> (0x563476787561)
  20: <no info> (0x5634767973b3)
  21: <no info> (0x56347678b01a)
  22: <no info> (0x5634767be19a)
  23: <no info> (0x563476d68a2e)
  24: <no info> (0x563476f22d6f)
  25: <no info> (0x563476d6861f)
  26: <no info> (0x563476d6c8d5)
  27: <no info> (0x563476dab5b8)
  28: <no info> (0x563476da5d4d)
  29: <no info> (0x563476f21ad7)
  30: <no info> (0x56347725f9d2)
  31: <no info> (0x56347735ce65)
  32: <no info> (0x5634793f21d9)
  33: <no info> (0x56347686c675)
  34: <no info> (0x5634793d6d0e)
  35: <no info> (0x5634793f14ff)
  36: <no info> (0x7f8cc26b86b9)
  37: <no info> (0x7f8cc0f5441c)
  38: <no info> (0x0)
[2019-08-06T15:48:33Z ERROR servo] called `Result::unwrap()` on an `Err` value: Io(Os { code: 32, kind: BrokenPipe, message: "Broken pipe" })
Pipeline script content process shutdown chan: Os { code: 24, kind: Other, message: "Too many open files" } (thread Constellation, at src/libcore/result.rs:1051)
stack backtrace:
   0: <no info> (0x56347618d68f)
   1: <no info> (0x5634793e8525)
   2: <no info> (0x5634793e7fc1)
   3: <no info> (0x5634793e7ea5)
   4: <no info> (0x56347940ac0c)
   5: <no info> (0x56347940ad06)
   6: <no info> (0x563476315888)
   7: <no info> (0x56347632b071)
   8: <no info> (0x563476329b8b)
   9: <no info> (0x56347632d2e1)
  10: <no info> (0x56347634283d)
  11: <no info> (0x56347634a989)
  12: <no info> (0x563476242323)
  13: <no info> (0x56347625b615)
  14: <no info> (0x5634793f21d9)
  15: <no info> (0x56347625b835)
  16: <no info> (0x5634793d6d0e)
  17: <no info> (0x5634793f14ff)
  18: <no info> (0x7f8cc26b86b9)
  19: <no info> (0x7f8cc0f5441c)
  20: <no info> (0x0)
[2019-08-06T15:48:33Z ERROR servo] Pipeline script content process shutdown chan: Os { code: 24, kind: Other, message: "Too many open files" }
called `Result::unwrap()` on an `Err` value: RecvError (thread LayoutThread PipelineId { namespace_id: PipelineNamespaceId(2), index: PipelineIndex(569) }, at src/libcore/result.rs:1051)
stack backtrace:
   0: <no info> (0x56347618d68f)
   1: <no info> (0x5634793e8525)
   2: <no info> (0x5634793e7fc1)
   3: <no info> (0x5634793e7ea5)
   4: <no info> (0x56347940ac0c)
   5: <no info> (0x56347940ad06)
   6: <no info> (0x5634765afa63)
   7: <no info> (0x563476625723)
   8: <no info> (0x5634766dace4)
   9: <no info> (0x563476645ea3)
  10: <no info> (0x5634793f21d9)
  11: <no info> (0x5634766db932)
  12: <no info> (0x5634793d6d0e)
  13: <no info> (0x5634793f14ff)
  14: <no info> (0x7f8cc26b86b9)
  15: <no info> (0x7f8cc0f5441c)
  16: <no info> (0x0)
[2019-08-06T15:48:33Z ERROR servo] called `Result::unwrap()` on an `Err` value: RecvError
called `Result::unwrap()` on an `Err` value: Io(Custom { kind: ConnectionReset, error: "All senders for this socket closed" }) (thread StorageManager, at src/libcore/result.rs:1051)
stack backtrace:
   0: <no info> (0x56347618d68f)
   1: <no info> (0x5634793e8525)
   2: <no info> (0x5634793e7fc1)
   3: <no info> (0x5634793e7ea5)
   4: <no info> (0x56347940ac0c)
   5: <no info> (0x56347940ad06)
   6: <no info> (0x5634781b75cf)
   7: <no info> (0x56347810acd2)
   8: <no info> (0x56347812809b)
   9: <no info> (0x5634793f21d9)
  10: <no info> (0x5634781e2d7f)
  11: <no info> (0x5634793d6d0e)
  12: <no info> (0x5634793f14ff)
  13: <no info> (0x7f8cc26b86b9)
  14: <no info> (0x7f8cc0f5441c)
  15: <no info> (0x0)
[2019-08-06T15:48:34Z ERROR servo] called `Result::unwrap()` on an `Err` value: Io(Custom { kind: ConnectionReset, error: "All senders for this socket closed" })
^CServo exited with return value -2

With the clean build (2a6c93f1ba) I got a crash at the 596th page load, and this time the frames are labelled. It looks like there may still be a font_cache issue. One thing I noticed this time is that the process starts at 275MB of resident memory and slowly grows to about 600MB right before it crashes. I haven't looked at memory usage prior to this, so I don't know whether this is new behavior. My expectation is that repeatedly loading the same content in the same window/tab shouldn't produce continual memory growth if things are behaving correctly.

$ ./mach run --release -z --webdriver=7002 --resolution=400x300                             



called `Result::unwrap()` on an `Err` value: Os { code: 24, kind: Other, message: "Too many open files" } (thread main, at src/libcore/result.rs:1051)
failed to create IPC channel: Os { code: 24, kind: Other, message: "Too many open files" } (thread LayoutThread PipelineId { namespace_id: PipelineNamespaceId(2), index: PipelineIndex(598) }, at src/libcore/result.rs:1051)
stack backtrace:
   0: servo::main::{{closure}}::he6dd7bec62c94245 (0x55acfaa4168f)
   1: std::panicking::rust_panic_with_hook::h0529069ab88f357a (0x55acfdc9c525)
             at src/libstd/panicking.rs:481
   2: std::panicking::continue_panic_fmt::h6a820a3cd2914e74 (0x55acfdc9bfc1)
             at src/libstd/panicking.rs:384
   3: rust_begin_unwind (0x55acfdc9bea5)
             at src/libstd/panicking.rs:311
   4: core::panicking::panic_fmt::he00cfaca5555542a (0x55acfdcbec0c)
             at src/libcore/panicking.rs:85
   5: core::result::unwrap_failed::h4239b9d80132a0db (0x55acfdcbed06)
             at src/libcore/result.rs:1051
   6: webrender_api::api::RenderApi::get_scroll_node_state::hd293254e30b6f279 (0x55acfdb5a837)
   7: compositing::compositor::IOCompositor<Window>::send_viewport_rects::had43b3860988bdba (0x55acfaa75900)
   8: _ZN11compositing10compositor26IOCompositor$LT$Window$GT$22handle_browser_message17h35ede344d68d17fbE.llvm.11108698028848252580 (0x55acfaa76d50)
   9: compositing::compositor::IOCompositor<Window>::receive_messages::h9c21b79ecc7864f4 (0x55acfaa7453c)
  10: servo::Servo<Window>::handle_events::h2e8a235109d8b416 (0x55acfaa921b6)
  11: servo::app::App::handle_events::h719e551468bebb07 (0x55acfaa884c8)
  12: servo::app::App::run::hfb05858ec4b0a2e1 (0x55acfaa87278)
  13: servo::main::hbf1ae120024a1f3f (0x55acfaa40f97)
  14: std::rt::lang_start::{{closure}}::ha3e485d15a4449ac (0x55acfaa490a2)
  15: std::rt::lang_start_internal::{{closure}}::hbf11e2eac4637ca8 (0x55acfdc9be42)
             at src/libstd/rt.rs:49
      std::panicking::try::do_call::hce2b88a55d321203
             at src/libstd/panicking.rs:296
  16: __rust_maybe_catch_panic (0x55acfdca61d9)
             at src/libpanic_unwind/lib.rs:82
  17: std::panicking::try::hcfbcbb3944be5b74 (0x55acfdc9ca0c)
             at src/libstd/panicking.rs:275
      std::panic::catch_unwind::h65d3049f65e755e2
             at src/libstd/panic.rs:394
      std::rt::lang_start_internal::h97d8c129acb51f99
             at src/libstd/rt.rs:48
  18: main (0x55acfaa41841)
  19: __libc_start_main (0x7fd8c09bc82f)
  20: _start (0x55acfa9af6c8)
  21: <unknown> (0x0)
[2019-08-06T18:06:08Z ERROR servo] called `Result::unwrap()` on an `Err` value: Os { code: 24, kind: Other, message: "Too many open files" }
stack backtrace:
   0: servo::main::{{closure}}::he6dd7bec62c94245 (0x55acfaa4168f)
   1: std::panicking::rust_panic_with_hook::h0529069ab88f357a (0x55acfdc9c525)
             at src/libstd/panicking.rs:481
   2: std::panicking::continue_panic_fmt::h6a820a3cd2914e74 (0x55acfdc9bfc1)
             at src/libstd/panicking.rs:384
   3: rust_begin_unwind (0x55acfdc9bea5)
             at src/libstd/panicking.rs:311
   4: core::panicking::panic_fmt::he00cfaca5555542a (0x55acfdcbec0c)
             at src/libcore/panicking.rs:85
   5: core::result::unwrap_failed::h4239b9d80132a0db (0x55acfdcbed06)
             at src/libcore/result.rs:1051
   6: <gfx::font_cache_thread::FontCacheThread as gfx::font_context::FontSource>::font_template::hde8935ea5f085623 (0x55acfcbeabfb)
   7: gfx::font_context::FontContext<S>::font::h533cd1ded1f8cf6b (0x55acfc7ba428)
   8: <core::iter::adapters::FilterMap<I,F> as core::iter::traits::iterator::Iterator>::try_fold::{{closure}}::h67c77a70dd5dd718 (0x55acfc772168)
   9: gfx::font::FontGroup::find_by_codepoint::h6a060cb6143e5db0 (0x55acfc772ac0)
  10: layout::text::TextRunScanner::scan_for_runs::hcb8f27e81a0ec327 (0x55acfc727ca3)
  11: std::thread::local::LocalKey<T>::with::h64ecd24f28675f35 (0x55acfae433fb)
  12: layout::construct::FlowConstructor<ConcreteThreadSafeLayoutNode>::build_flow_for_block_starting_with_fragments::h4fd5c7a08d78ec7c (0x55acfaeb89ee)
  13: layout::construct::FlowConstructor<ConcreteThreadSafeLayoutNode>::build_flow_for_block_like::hc6c869d4885b16a7 (0x55acfaeb5444)
  14: layout::construct::FlowConstructor<ConcreteThreadSafeLayoutNode>::build_flow_for_block::hcfe65738a5c4f4f6 (0x55acfaeb49b7)
  15: <layout::construct::FlowConstructor<ConcreteThreadSafeLayoutNode> as layout::traversal::PostorderNodeMutTraversal<ConcreteThreadSafeLayoutNode>>::process::h47e97aa5f57c3e64 (0x55acfaea133f)
  16: style::traversal::DomTraversal::handle_postorder_traversal::heafa09f64e169fe2 (0x55acfae599ad)
  17: style::driver::traverse_dom::h29338c1fc58056e5 (0x55acfaeac545)
  18: profile_traits::time::profile::h9aa760a7046a2e92 (0x55acfaf8c11d)
  19: layout_thread::LayoutThread::handle_reflow::h8bcbcf556a92250c (0x55acfae6c251)
  20: profile_traits::time::profile::h0ec36fd3809228a0 (0x55acfaf89d80)
  21: layout_thread::LayoutThread::handle_request_helper::h8cd8cac302aa1d62 (0x55acfae64c71)
  22: layout_thread::LayoutThread::start::h8165d87f493745a3 (0x55acfae63672)
  23: profile_traits::mem::ProfilerChan::run_with_memory_reporting::h614fc3ed4f2f5507 (0x55acfaed9723)
  24: std::sys_common::backtrace::__rust_begin_short_backtrace::hf77a46581014d943 (0x55acfaf8ece4)
  25: _ZN3std9panicking3try7do_call17ha1e1ccbc2a987704E.llvm.3426839180471264104 (0x55acfaef9ea3)
  26: __rust_maybe_catch_panic (0x55acfdca61d9)
             at src/libpanic_unwind/lib.rs:82
  27: core::ops::function::FnOnce::call_once{{vtable.shim}}::h42c58161f8eea005 (0x55acfaf8f932)
  28: <alloc::boxed::Box<F> as core::ops::function::FnOnce<A>>::call_once::h10faab75e4737451 (0x55acfdc8ad0e)
             at /rustc/273f42b5964c29dda2c5a349dd4655529767b07f/src/liballoc/boxed.rs:766
  29: <alloc::boxed::Box<F> as core::ops::function::FnOnce<A>>::call_once::h1aad66e3be56d28c (0x55acfdca54ff)
             at /rustc/273f42b5964c29dda2c5a349dd4655529767b07f/src/liballoc/boxed.rs:766
      std::sys_common::thread::start_thread::h9a8131af389e9a10
             at src/libstd/sys_common/thread.rs:13
      std::sys::unix::thread::Thread::new::thread_start::h5c5575a5d923977b
             at src/libstd/sys/unix/thread.rs:79
  30: start_thread (0x7fd8c22076b9)
  31: clone (0x7fd8c0aa341c)
  32: <unknown> (0x0)
[2019-08-06T18:06:08Z ERROR servo] failed to create IPC channel: Os { code: 24, kind: Other, message: "Too many open files" }
Pipeline layout content shutdown chan: Os { code: 24, kind: Other, message: "Too many open files" } (thread Constellation, at src/libcore/result.rs:1051)
thread panicked while processing panic. aborting.
thread panicked while processing panic. aborting.
Servo exited with return value 101

OK, still progressing, since those are different crashes. There are two different issues:

https://github.com/servo/webrender/blob/1e044fe1862373a055dbfb39930ead035e55b30f/webrender_api/src/api.rs#L1345

That's in webrender, so I filed https://github.com/servo/webrender/issues/3732

The other is at:

https://github.com/servo/servo/blob/b6cdf931987e71707c217f2953e0b2a62a92868c/components/gfx/font_cache_thread.rs#L525

It looks like layout blocks on getting a font from the font-cache by creating an ipc-channel and blocking on the reply.

That pattern loops and makes a blocking ipc call on each iteration, each call creating a channel and sending the sender across the process boundary:

https://github.com/servo/servo/blob/b6cdf931987e71707c217f2953e0b2a62a92868c/components/gfx/font.rs#L431

For this I've filed https://github.com/servo/servo/issues/23925

I'm not sure these two are easy fixes like the previous stuff...
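For reference, here is a minimal standalone sketch of the pattern being described (a sketch only; the names and types are hypothetical stand-ins, not Servo's actual ones, and it assumes the ipc-channel and serde crates): every lookup creates a fresh ipc channel, ships the sender to the cache end, and blocks on the reply, so each call costs a new pair of OS-level sockets.

// Sketch only: hypothetical stand-ins for the layout/font-cache exchange.
use ipc_channel::ipc::{self, IpcSender};

#[derive(serde::Serialize, serde::Deserialize)]
enum CacheRequest {
    // The reply sender is embedded in the request and sent across the process boundary.
    GetTemplate { name: String, reply: IpcSender<Option<String>> },
}

fn font_template_blocking(cache: &IpcSender<CacheRequest>, name: &str) -> Option<String> {
    // A brand new ipc channel (and new file descriptors) for every single lookup:
    let (reply_tx, reply_rx) = ipc::channel().expect("failed to create IPC channel");
    cache
        .send(CacheRequest::GetTemplate { name: name.to_owned(), reply: reply_tx })
        .expect("cache end gone");
    reply_rx.recv().expect("cache end hung up")
}

fn main() {
    let (cache_tx, cache_rx) = ipc::channel::<CacheRequest>().unwrap();
    // Hypothetical "font cache" end: answer each request on its embedded reply sender.
    std::thread::spawn(move || {
        while let Ok(CacheRequest::GetTemplate { name, reply }) = cache_rx.recv() {
            let _ = reply.send(Some(format!("template-for-{}", name)));
        }
    });
    // A loop over missing glyphs/fonts hits this path once per iteration.
    for i in 0..1000 {
        let _ = font_template_blocking(&cache_tx, &format!("font-{}", i));
    }
}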

OK, I've come up with a conceptual way to make the router support blocking IPC calls, which should fix the layout/font-cache issue. I'll try to implement it today.
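To make the idea concrete, here is one way to read "blocking calls through the router" (a conceptual sketch with hypothetical types, not necessarily what the eventual PR implements): keep a single long-lived reply receiver per component, tag each outgoing request with an id, and have callers block on cheap in-process channels, so no new ipc channel (and no new fds) is needed per blocking call.

// Conceptual sketch only; standard library types used for illustration.
use std::collections::HashMap;
use std::sync::{mpsc, Arc, Mutex};

#[derive(Default)]
struct BlockingRouter {
    next_id: Mutex<u64>,
    pending: Mutex<HashMap<u64, mpsc::Sender<String>>>,
}

impl BlockingRouter {
    // Caller side: get an id to embed in the outgoing ipc request, plus a
    // local receiver to block on for the reply.
    fn register(&self) -> (u64, mpsc::Receiver<String>) {
        let mut id = self.next_id.lock().unwrap();
        *id += 1;
        let (tx, rx) = mpsc::channel();
        self.pending.lock().unwrap().insert(*id, tx);
        (*id, rx)
    }

    // Router side: called from the single route that drains the one shared ipc receiver.
    fn dispatch(&self, id: u64, reply: String) {
        if let Some(tx) = self.pending.lock().unwrap().remove(&id) {
            let _ = tx.send(reply);
        }
    }
}

fn main() {
    let router = Arc::new(BlockingRouter::default());
    let (id, rx) = router.register();
    let r = Arc::clone(&router);
    // Stand-in for the thread draining the shared ipc receiver.
    std::thread::spawn(move || r.dispatch(id, "font template".to_string()));
    // The caller blocks here without having created any ipc channel.
    println!("got reply: {}", rx.recv().unwrap());
}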

@gterzian sounds good. I'll check back tomorrow and if you've come up with something I'll give it a test.

Ok, this one https://github.com/servo/servo/pull/23909 is ready for another spin.

I've applied the same pattern to <gfx::font_cache_thread::FontCacheThread as gfx::font_context::FontSource>::font_template, and for the webrender issue I just commented out the code! I think your tests can run without those scroll-state updates (I can actually still scroll; I'm not completely sure what that code relates to). At least it should get us to the next unexpected crash...

Hang on, this crashes with parallel layout (because the rayon thread doesn't have a pipeline namespace, and the ipc is done from within rayon). I'll try to fix it and let you know...

@kanaka OK, good for another spin (I had to turn off parallel layout).

Panic after 155 loads this time. This is the same test case that made it to 596 loads last time. Just a note that every time the page loads I'm getting a number of the "Layout threadpool None" message printed. I've elided them from the beginning of the output:

...
...
Layout threadpool None
Layout threadpool None
Layout threadpool None
assertion failed: address != MAP_FAILED (thread ScriptThread PipelineId { namespace_id: PipelineNamespaceId(2), index: PipelineIndex(2) }, at /home/joelm/.cargo/registry/src/github.com-1ecc6299db9ec823/ipc-channel-0.11.3/src/platform/unix/mod.rs:697)
stack backtrace:
   0: servo::main::{{closure}}::h60525228e0a9f42a (0x557d8f6b381f)
   1: std::panicking::rust_panic_with_hook::h0529069ab88f357a (0x557d92911815)
             at src/libstd/panicking.rs:481
   2: std::panicking::begin_panic::hd50f1f70850a0b10 (0x557d928e0e34)
   3: <ipc_channel::platform::unix::OsIpcSharedMemory as core::clone::Clone>::clone::ha02b3c7c4e1b15e6 (0x557d928d802b)
   4: script::fetch::FetchCanceller::initialize::hc74c6a86a8247a50 (0x557d902b0700)
   5: script::document_loader::DocumentLoader::fetch_async_background_with_handle::h2965fb25ffc885e5 (0x557d90551747)
   6: script::document_loader::DocumentLoader::fetch_async_with_handle::h299656c5c6d57ecd (0x557d90551695)
   7: script::dom::document::Document::fetch_async_with_handle::h1d53b40cbd452601 (0x557d9024c119)
   8: script::stylesheet_loader::StylesheetLoader::load::h6101ee193845a111 (0x557d902dff6f)
   9: script::dom::htmllinkelement::HTMLLinkElement::handle_stylesheet_url::hc9f31003c31ba34b (0x557d903ca0e4)
  10: <script::dom::htmllinkelement::HTMLLinkElement as script::dom::virtualmethods::VirtualMethods>::bind_to_tree::h20cb9a65218ec3cb (0x557d903c9975)
  11: script::dom::node::Node::insert::h0cf554a892e1e2b4 (0x557d90471e24)
  12: script::dom::node::Node::pre_insert::h15f66df088fd0d56 (0x557d90471162)
  13: script::dom::servoparser::insert::h07eaf568032e87bf (0x557d902837be)
  14: html5ever::tree_builder::TreeBuilder<Handle,Sink>::insert_element::h515de15a3384bc7d (0x557d9061d2ef)
  15: html5ever::tree_builder::TreeBuilder<Handle,Sink>::step::h14e2e1c7863cd3e3 (0x557d90634a08)
  16: <html5ever::tree_builder::TreeBuilder<Handle,Sink> as html5ever::tokenizer::interface::TokenSink>::process_token::h3c726807d3a250c4 (0x557d905ca5d3)
  17: _ZN9html5ever9tokenizer21Tokenizer$LT$Sink$GT$13process_token17h72287ad448a59040E.llvm.16483899024605272016 (0x557d8fc9c4d8)
  18: html5ever::tokenizer::Tokenizer<Sink>::emit_current_tag::hf862aa7472a90c5a (0x557d8fc9cf61)
  19: html5ever::tokenizer::Tokenizer<Sink>::step::hca8288eb4826f188 (0x557d8fcacdb3)
  20: _ZN9html5ever9tokenizer21Tokenizer$LT$Sink$GT$3run17h5d9c658a7f18217dE.llvm.16483899024605272016 (0x557d8fca0cca)
  21: script::dom::servoparser::html::Tokenizer::feed::h2a8250c58eba2f33 (0x557d8fcd434a)
  22: script::dom::servoparser::ServoParser::do_parse_sync::h008a9762ab6e11fd (0x557d9027efbe)
  23: profile_traits::time::profile::h68e51e1dd99e1a9d (0x557d9043815f)
  24: _ZN6script3dom11servoparser11ServoParser10parse_sync17hd8eeaf9d5d6d97e1E.llvm.11352473516434932923 (0x557d9027ebaf)
  25: <script::dom::servoparser::ParserContext as net_traits::FetchResponseListener>::process_response_chunk::h47277bfd0c50f210 (0x557d90282e65)
  26: script::script_thread::ScriptThread::handle_msg_from_constellation::hb6598e64e0108a83 (0x557d902c1b18)
  27: _ZN6script13script_thread12ScriptThread11handle_msgs17h5e27f5d6a2fa27a7E.llvm.11352473516434932923 (0x557d902bc27d)
  28: profile_traits::mem::ProfilerChan::run_with_memory_reporting::hd3dc945019b6cd59 (0x557d90437af7)
  29: std::sys_common::backtrace::__rust_begin_short_backtrace::he8d3dd1d346dafa0 (0x557d9077960d)
  30: _ZN3std9panicking3try7do_call17h8ff058ddb9dca993E.llvm.1398661747740031626 (0x557d90896e65)
  31: __rust_maybe_catch_panic (0x557d9291b4c9)
             at src/libpanic_unwind/lib.rs:82
  32: core::ops::function::FnOnce::call_once{{vtable.shim}}::hde4c9792015a3419 (0x557d8fd82e65)
  33: <alloc::boxed::Box<F> as core::ops::function::FnOnce<A>>::call_once::h10faab75e4737451 (0x557d928ffffe)
             at /rustc/273f42b5964c29dda2c5a349dd4655529767b07f/src/liballoc/boxed.rs:766
  34: <alloc::boxed::Box<F> as core::ops::function::FnOnce<A>>::call_once::h1aad66e3be56d28c (0x557d9291a7ef)
             at /rustc/273f42b5964c29dda2c5a349dd4655529767b07f/src/liballoc/boxed.rs:766
      std::sys_common::thread::start_thread::h9a8131af389e9a10
             at src/libstd/sys_common/thread.rs:13
      std::sys::unix::thread::Thread::new::thread_start::h5c5575a5d923977b
             at src/libstd/sys/unix/thread.rs:79
  35: start_thread (0x7f7d855e86b9)
  36: clone (0x7f7d83e8441c)
  37: <unknown> (0x0)
[2019-08-08T18:04:01Z ERROR servo] assertion failed: address != MAP_FAILED
Pipeline layout content shutdown chan: Os { code: 24, kind: Other, message: "Too many open files" } (thread Constellation, at src/libcore/result.rs:1051)
stack backtrace:
   0: servo::main::{{closure}}::h60525228e0a9f42a (0x557d8f6b381f)
   1: std::panicking::rust_panic_with_hook::h0529069ab88f357a (0x557d92911815)
             at src/libstd/panicking.rs:481
   2: std::panicking::continue_panic_fmt::h6a820a3cd2914e74 (0x557d929112b1)
             at src/libstd/panicking.rs:384
   3: rust_begin_unwind (0x557d92911195)
             at src/libstd/panicking.rs:311
   4: core::panicking::panic_fmt::he00cfaca5555542a (0x557d92933efc)
             at src/libcore/panicking.rs:85
   5: core::result::unwrap_failed::h4239b9d80132a0db (0x557d92933ff6)
             at src/libcore/result.rs:1051
   6: constellation::pipeline::Pipeline::spawn::h9c704d00bbd7e0d8 (0x557d8f82f3a0)
   7: constellation::constellation::Constellation<Message,LTF,STF>::new_pipeline::h5ed9224e89096273 (0x557d8f84dc81)
   8: constellation::constellation::Constellation<Message,LTF,STF>::handle_panic::h7e5373f125733019 (0x557d8f84c79b)
   9: constellation::constellation::Constellation<Message,LTF,STF>::handle_log_entry::h8beeb081ec7b187e (0x557d8f84fef1)
  10: constellation::constellation::Constellation<Message,LTF,STF>::handle_request_from_compositor::hc424a6fc34ada869 (0x557d8f8654bd)
  11: constellation::constellation::Constellation<Message,LTF,STF>::run::h08fa371efa88aa3a (0x557d8f86d613)
  12: std::sys_common::backtrace::__rust_begin_short_backtrace::h677f09aabbca1ec1 (0x557d8f76ceef)
  13: std::panicking::try::do_call::hc170fd9aa1bedd4a (0x557d8f7717e5)
  14: __rust_maybe_catch_panic (0x557d9291b4c9)
             at src/libpanic_unwind/lib.rs:82
  15: core::ops::function::FnOnce::call_once{{vtable.shim}}::h70e5314c6c21c735 (0x557d8f771b85)
  16: <alloc::boxed::Box<F> as core::ops::function::FnOnce<A>>::call_once::h10faab75e4737451 (0x557d928ffffe)
             at /rustc/273f42b5964c29dda2c5a349dd4655529767b07f/src/liballoc/boxed.rs:766
  17: <alloc::boxed::Box<F> as core::ops::function::FnOnce<A>>::call_once::h1aad66e3be56d28c (0x557d9291a7ef)
             at /rustc/273f42b5964c29dda2c5a349dd4655529767b07f/src/liballoc/boxed.rs:766
      std::sys_common::thread::start_thread::h9a8131af389e9a10
             at src/libstd/sys_common/thread.rs:13
      std::sys::unix::thread::Thread::new::thread_start::h5c5575a5d923977b
             at src/libstd/sys/unix/thread.rs:79
  18: start_thread (0x7f7d855e86b9)
  19: clone (0x7f7d83e8441c)
  20: <unknown> (0x0)
[2019-08-08T18:04:01Z ERROR servo] Pipeline layout content shutdown chan: Os { code: 24, kind: Other, message: "Too many open files" }
called `Result::unwrap()` on an `Err` value: RecvError (thread LayoutThread PipelineId { namespace_id: PipelineNamespaceId(2), index: PipelineIndex(136) }, at src/libcore/result.rs:1051)
stack backtrace:
   0: servo::main::{{closure}}::h60525228e0a9f42a (0x557d8f6b381f)
   1: std::panicking::rust_panic_with_hook::h0529069ab88f357a (0x557d92911815)
             at src/libstd/panicking.rs:481
   2: std::panicking::continue_panic_fmt::h6a820a3cd2914e74 (0x557d929112b1)
             at src/libstd/panicking.rs:384
   3: rust_begin_unwind (0x557d92911195)
             at src/libstd/panicking.rs:311
   4: core::panicking::panic_fmt::he00cfaca5555542a (0x557d92933efc)
             at src/libcore/panicking.rs:85
   5: core::result::unwrap_failed::h4239b9d80132a0db (0x557d92933ff6)
             at src/libcore/result.rs:1051
   6: layout_thread::LayoutThread::start::had947240cbd73635 (0x557d8fad1cf3)
   7: profile_traits::mem::ProfilerChan::run_with_memory_reporting::hf45b025b46065f78 (0x557d8fb47d83)
   8: _ZN3std10sys_common9backtrace28__rust_begin_short_backtrace17hb8136fc9a528dee3E.llvm.6633224336944688299 (0x557d8fbd00de)
   9: _ZN3std9panicking3try7do_call17hc12b321bb854f3e4E.llvm.9267663342173610026 (0x557d8fb66503)
  10: __rust_maybe_catch_panic (0x557d9291b4c9)
             at src/libpanic_unwind/lib.rs:82
  11: core::ops::function::FnOnce::call_once{{vtable.shim}}::h483b26e3078f7b57 (0x557d8fb91d92)
  12: <alloc::boxed::Box<F> as core::ops::function::FnOnce<A>>::call_once::h10faab75e4737451 (0x557d928ffffe)
             at /rustc/273f42b5964c29dda2c5a349dd4655529767b07f/src/liballoc/boxed.rs:766
  13: <alloc::boxed::Box<F> as core::ops::function::FnOnce<A>>::call_once::h1aad66e3be56d28c (0x557d9291a7ef)
             at /rustc/273f42b5964c29dda2c5a349dd4655529767b07f/src/liballoc/boxed.rs:766
      std::sys_common::thread::start_thread::h9a8131af389e9a10
             at src/libstd/sys_common/thread.rs:13
      std::sys::unix::thread::Thread::new::thread_start::h5c5575a5d923977b
             at src/libstd/sys/unix/thread.rs:79
  14: start_thread (0x7f7d855e86b9)
  15: clone (0x7f7d83e8441c)
  16: <unknown> (0x0)
[2019-08-08T18:04:01Z ERROR servo] called `Result::unwrap()` on an `Err` value: RecvError
called `Result::unwrap()` on an `Err` value: Io(Custom { kind: ConnectionReset, error: "All senders for this socket closed" }) (thread StorageManager, at src/libcore/result.rs:1051)
stack backtrace:
   0: servo::main::{{closure}}::h60525228e0a9f42a (0x557d8f6b381f)
   1: std::panicking::rust_panic_with_hook::h0529069ab88f357a (0x557d92911815)
             at src/libstd/panicking.rs:481
   2: std::panicking::continue_panic_fmt::h6a820a3cd2914e74 (0x557d929112b1)
             at src/libstd/panicking.rs:384
   3: rust_begin_unwind (0x557d92911195)
             at src/libstd/panicking.rs:311
   4: core::panicking::panic_fmt::he00cfaca5555542a (0x557d92933efc)
             at src/libcore/panicking.rs:85
   5: core::result::unwrap_failed::h4239b9d80132a0db (0x557d92933ff6)
             at src/libcore/result.rs:1051
   6: net::storage_thread::StorageManager::start::h8eeece525ce345e8 (0x557d916d162f)
   7: std::sys_common::backtrace::__rust_begin_short_backtrace::ha904cc00c6b0933c (0x557d91628112)
   8: _ZN3std9panicking3try7do_call17he2da217e7601c238E.llvm.16610627900611519721 (0x557d916446eb)
   9: __rust_maybe_catch_panic (0x557d9291b4c9)
             at src/libpanic_unwind/lib.rs:82
  10: core::ops::function::FnOnce::call_once{{vtable.shim}}::h0143da5e67cc8fb6 (0x557d916ff66f)
  11: <alloc::boxed::Box<F> as core::ops::function::FnOnce<A>>::call_once::h10faab75e4737451 (0x557d928ffffe)
             at /rustc/273f42b5964c29dda2c5a349dd4655529767b07f/src/liballoc/boxed.rs:766
  12: <alloc::boxed::Box<F> as core::ops::function::FnOnce<A>>::call_once::h1aad66e3be56d28c (0x557d9291a7ef)
             at /rustc/273f42b5964c29dda2c5a349dd4655529767b07f/src/liballoc/boxed.rs:766
      std::sys_common::thread::start_thread::h9a8131af389e9a10
             at src/libstd/sys_common/thread.rs:13
      std::sys::unix::thread::Thread::new::thread_start::h5c5575a5d923977b
             at src/libstd/sys/unix/thread.rs:79
  13: start_thread (0x7f7d855e86b9)
  14: clone (0x7f7d83e8441c)
  15: <unknown> (0x0)
[2019-08-08T18:04:01Z ERROR servo] called `Result::unwrap()` on an `Err` value: Io(Custom { kind: ConnectionReset, error: "All senders for this socket closed" })

OK, I was able to remove the use of ipc-channel there: the content process only needs to wait on the shutdown of the script thread in multiprocess mode, and for that we can use crossbeam channels created inside the content process, as opposed to ipc-senders sent across processes as part of UnprivilegedPipelineContent.

I also removed layout from that workflow, since the script thread itself waits on the layout threads to shut down, so the main thread of the process only needs to wait on the script thread.

See https://github.com/servo/servo/pull/23909/commits/c1d43137a87b60833a30c6dfc7a9f7577381a300

Good for another spin. Wondering what will come up next...
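For clarity, a minimal sketch of the in-process shutdown wait described above (hypothetical names; the commit linked above is the real change, and this assumes the crossbeam-channel crate, which Servo already uses): since the wait happens entirely inside the content process, a plain crossbeam channel is enough and consumes no file descriptors.

// Sketch only: in-process shutdown signalling, no ipc-sender involved.
use crossbeam_channel::{unbounded, Sender};
use std::thread;

fn spawn_script_thread(shutdown_ack: Sender<()>) -> thread::JoinHandle<()> {
    thread::spawn(move || {
        // ... run the script thread's event loop; it waits on its own layout threads ...
        // On exit, signal the content-process main thread.
        let _ = shutdown_ack.send(());
    })
}

fn main() {
    let (ack_tx, ack_rx) = unbounded::<()>();
    let script = spawn_script_thread(ack_tx);
    // Block until the script thread reports shutdown; nothing here has to be
    // created as an ipc channel or shipped across processes.
    let _ = ack_rx.recv();
    let _ = script.join();
}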

Just a note that every time the page loads I'm getting a number of the "Layout threadpool None" message printed.

Thanks, I removed that printout.

Sorry for the slow response. I had a busy work day (I'm a part-time PhD student with a full-time job as well). I'm rebuilding to test right now.

Crash at 156, but it looks like the frame labels were lost again. I'm going to do a clean build to try to get the fully annotated trace. Watching the resident memory usage, it looks like it's increasing by about 1MB every two or three page loads, so I suspect we haven't hit the last of the issues. Anyway, here is the trace missing labels in case it's interesting. More to come.

$ ./mach run --release -z --webdriver=7002 --resolution=400x300



assertion failed: address != MAP_FAILED (thread ScriptThread PipelineId { namespace_id: PipelineNamespaceId(2), index: PipelineIndex(2) }, at /home/joelm/.cargo/registry/src/github.com-1ecc6299db9ec823/ipc-channel-0.11.3/src/platform/unix/mod.rs:697)
stack backtrace:
   0: <no info> (0x55d3d26e77cf)
   1: <no info> (0x55d3d593f415)
   2: <no info> (0x55d3d590ea34)
   3: <no info> (0x55d3d5905c2b)
   4: <no info> (0x55d3d3184f70)
   5: <no info> (0x55d3d2de8133)
   6: <no info> (0x55d3d3525ffb)
   7: <no info> (0x55d3d31e6a78)
   8: <no info> (0x55d3d3184467)
   9: <no info> (0x55d3d2db2b4f)
  10: <no info> (0x55d3d357e907)
  11: <no info> (0x55d3d357e6f5)
  12: <no info> (0x55d3d3275bf9)
  13: <no info> (0x55d3d33095df)
  14: <no info> (0x55d3d33f73d4)
  15: <no info> (0x55d3d33f6c65)
  16: <no info> (0x55d3d349f694)
  17: <no info> (0x55d3d349e9d2)
  18: <no info> (0x55d3d32ad3ce)
  19: <no info> (0x55d3d364969f)
  20: <no info> (0x55d3d3660db8)
  21: <no info> (0x55d3d35f7663)
  22: <no info> (0x55d3d2cc8108)
  23: <no info> (0x55d3d2cc8b91)
  24: <no info> (0x55d3d2cd89e3)
  25: <no info> (0x55d3d2ccc8fa)
  26: <no info> (0x55d3d2cfff7a)
  27: <no info> (0x55d3d32a8bce)
  28: <no info> (0x55d3d34663bf)
  29: <no info> (0x55d3d32a87bf)
  30: <no info> (0x55d3d32aca75)
  31: <no info> (0x55d3d32eb298)
  32: <no info> (0x55d3d32e5aac)
  33: <no info> (0x55d3d3465757)
  34: <no info> (0x55d3d37a5310)
  35: <no info> (0x55d3d38c2643)
  36: <no info> (0x55d3d59490c9)
  37: <no info> (0x55d3d2dc2792)
  38: <no info> (0x55d3d592dbfe)
  39: <no info> (0x55d3d59483ef)
  40: <no info> (0x7f787eea86b9)
  41: <no info> (0x7f787d74441c)
  42: <no info> (0x0)
[2019-08-10T20:50:17Z ERROR servo] assertion failed: address != MAP_FAILED
^CServo exited with return value -2

Hmm, I did a clean rebuild and the test fails at the same point with the same trace and I still don't have any stack frame labels.

@kanaka, thanks, that looks like the shared memory in the fetch canceller, so I've turned it off completely. It's worth another try...

Progress! 819 loads this time. The signature of this one looks fairly different (maybe GC in mozjs?).

I monitored the memory usage during the run and it appears that there is a pretty consistent/steady increase of 50-60MB resident size for every 100 page loads.

$ ./mach run --release -z --webdriver=7002 --resolution=400x300



assertion failed: self.is_double() (thread ScriptThread PipelineId { namespace_id: PipelineNamespaceId(2), index: PipelineIndex(2) }, at /home/joelm/.cargo/git/checkouts/mozjs-fa11ffc7d4f1cc2d/6dff104/src/jsval.rs:439)
stack backtrace:
   0: servo::main::{{closure}}::h60525228e0a9f42a (0x55c4763597cf)
   1: std::panicking::rust_panic_with_hook::h0529069ab88f357a (0x55c4795b0455)
             at src/libstd/panicking.rs:481
   2: std::panicking::begin_panic::h8a8ec1e1f80328f5 (0x55c477504ad4)
   3: script::dom::windowproxy::trace::h562b81e7ed033c33 (0x55c477081108)
   4: _ZNK2js5Class7doTraceEP8JSTracerP8JSObject (0x55c477b326c6)
             at /data/joelm/personal/UTA/dissertation/servo/servo-20190724.git/target/release/build/mozjs_sys-f383b7e9a63fc67f/out/dist/include/js/Class.h:872
      _ZL13CallTraceHookIZN2js14TenuringTracer11traceObjectEP8JSObjectE4$_11EPNS0_12NativeObjectEOT_P8JSTracerS3_15CheckGeneration
             at /home/joelm/.cargo/git/checkouts/mozjs-fa11ffc7d4f1cc2d/6dff104/mozjs/js/src/gc/Marking.cpp:1545
      _ZN2js14TenuringTracer11traceObjectEP8JSObject
             at /home/joelm/.cargo/git/checkouts/mozjs-fa11ffc7d4f1cc2d/6dff104/mozjs/js/src/gc/Marking.cpp:2907
   5: _ZL14TraceWholeCellRN2js14TenuringTracerEP8JSObject (0x55c477b324bb)
             at /home/joelm/.cargo/git/checkouts/mozjs-fa11ffc7d4f1cc2d/6dff104/mozjs/js/src/gc/Marking.cpp:2798
      _ZL18TraceBufferedCellsI8JSObjectEvRN2js14TenuringTracerEPNS1_2gc5ArenaEPNS4_12ArenaCellSetE
             at /home/joelm/.cargo/git/checkouts/mozjs-fa11ffc7d4f1cc2d/6dff104/mozjs/js/src/gc/Marking.cpp:2835
      _ZN2js2gc11StoreBuffer15WholeCellBuffer5traceERNS_14TenuringTracerE
             at /home/joelm/.cargo/git/checkouts/mozjs-fa11ffc7d4f1cc2d/6dff104/mozjs/js/src/gc/Marking.cpp:2852
   6: _ZN2js2gc11StoreBuffer15traceWholeCellsERNS_14TenuringTracerE (0x55c477b49575)
             at /home/joelm/.cargo/git/checkouts/mozjs-fa11ffc7d4f1cc2d/6dff104/mozjs/js/src/gc/StoreBuffer.h:479
      _ZN2js7Nursery12doCollectionEN2JS8GCReasonERNS_2gc16TenureCountCacheE
             at /home/joelm/.cargo/git/checkouts/mozjs-fa11ffc7d4f1cc2d/6dff104/mozjs/js/src/gc/Nursery.cpp:946
   7: _ZN2js7Nursery7collectEN2JS8GCReasonE (0x55c477b48805)
             at /home/joelm/.cargo/git/checkouts/mozjs-fa11ffc7d4f1cc2d/6dff104/mozjs/js/src/gc/Nursery.cpp:783
   8: _ZN2js2gc9GCRuntime7minorGCEN2JS8GCReasonENS_7gcstats9PhaseKindE (0x55c477b29de3)
             at /home/joelm/.cargo/git/checkouts/mozjs-fa11ffc7d4f1cc2d/6dff104/mozjs/js/src/gc/GC.cpp:7787
   9: _ZN2js2gc9GCRuntime13gcIfRequestedEv (0x55c477b0d3d4)
             at /home/joelm/.cargo/git/checkouts/mozjs-fa11ffc7d4f1cc2d/6dff104/mozjs/js/src/gc/GC.cpp:7846
  10: _ZN2js2gc9GCRuntime22gcIfNeededAtAllocationEP9JSContext (0x55c477b0999c)
             at /home/joelm/.cargo/git/checkouts/mozjs-fa11ffc7d4f1cc2d/6dff104/mozjs/js/src/gc/Allocator.cpp:343
      _ZN2js2gc9GCRuntime19checkAllocatorStateILNS_7AllowGCE1EEEbP9JSContextNS0_9AllocKindE
             at /home/joelm/.cargo/git/checkouts/mozjs-fa11ffc7d4f1cc2d/6dff104/mozjs/js/src/gc/Allocator.cpp:300
      _ZN2js14AllocateObjectILNS_7AllowGCE1EEEP8JSObjectP9JSContextNS_2gc9AllocKindEmNS6_11InitialHeapEPKNS_5ClassE
             at /home/joelm/.cargo/git/checkouts/mozjs-fa11ffc7d4f1cc2d/6dff104/mozjs/js/src/gc/Allocator.cpp:55
  11: _ZN2js11ProxyObject6createEP9JSContextPKNS_5ClassEN2JS6HandleINS_11TaggedProtoEEENS_2gc9AllocKindENS_13NewObjectKindE (0x55c4778da99f)
             at /home/joelm/.cargo/git/checkouts/mozjs-fa11ffc7d4f1cc2d/6dff104/mozjs/js/src/vm/ProxyObject.cpp:199
  12: _ZN2js11ProxyObject3NewEP9JSContextPKNS_16BaseProxyHandlerEN2JS6HandleINS6_5ValueEEENS_11TaggedProtoERKNS_12ProxyOptionsE (0x55c4778da444)
             at /home/joelm/.cargo/git/checkouts/mozjs-fa11ffc7d4f1cc2d/6dff104/mozjs/js/src/vm/ProxyObject.cpp:100
  13: _ZN2js14NewProxyObjectEP9JSContextPKNS_16BaseProxyHandlerEN2JS6HandleINS5_5ValueEEEP8JSObjectRKNS_12ProxyOptionsE (0x55c477a4cb01)
             at /home/joelm/.cargo/git/checkouts/mozjs-fa11ffc7d4f1cc2d/6dff104/mozjs/js/src/proxy/Proxy.cpp:779
      _ZN2js7Wrapper3NewEP9JSContextP8JSObjectPKS0_RKNS_14WrapperOptionsE
             at /home/joelm/.cargo/git/checkouts/mozjs-fa11ffc7d4f1cc2d/6dff104/mozjs/js/src/proxy/Wrapper.cpp:282
  14: WrapperNew (0x55c47777a737)
  15: _ZN2JS11Compartment18getOrCreateWrapperEP9JSContextNS_6HandleIP8JSObjectEENS_13MutableHandleIS5_EE (0x55c47780a87c)
             at /home/joelm/.cargo/git/checkouts/mozjs-fa11ffc7d4f1cc2d/6dff104/mozjs/js/src/vm/Compartment.cpp:268
  16: _ZN2JS11Compartment6rewrapEP9JSContextNS_13MutableHandleIP8JSObjectEENS_6HandleIS5_EE (0x55c47780ac61)
             at /home/joelm/.cargo/git/checkouts/mozjs-fa11ffc7d4f1cc2d/6dff104/mozjs/js/src/vm/Compartment.cpp:357
  17: _ZN2js12RemapWrapperEP9JSContextP8JSObjectS3_ (0x55c477a3c26e)
             at /home/joelm/.cargo/git/checkouts/mozjs-fa11ffc7d4f1cc2d/6dff104/mozjs/js/src/proxy/CrossCompartmentWrapper.cpp:582
  18: _ZN2js25RemapAllWrappersForObjectEP9JSContextP8JSObjectS3_ (0x55c477a3c7c2)
             at /home/joelm/.cargo/git/checkouts/mozjs-fa11ffc7d4f1cc2d/6dff104/mozjs/js/src/proxy/CrossCompartmentWrapper.cpp:633
  19: _Z19JS_TransplantObjectP9JSContextN2JS6HandleIP8JSObjectEES5_ (0x55c477a0dffc)
             at /home/joelm/.cargo/git/checkouts/mozjs-fa11ffc7d4f1cc2d/6dff104/mozjs/js/src/jsapi.cpp:731
  20: _ZN6script3dom11windowproxy11WindowProxy10set_window17h7e22b4eb25e69698E.llvm.14539727287563764700 (0x55c47707f3b3)
  21: script::dom::window::Window::resume::h463e05b98e601c21 (0x55c476a8311d)
  22: script::script_thread::ScriptThread::load::h840f42e218b48e42 (0x55c476f72344)
  23: script::script_thread::ScriptThread::handle_page_headers_available::hc7e938eff32254e6 (0x55c476f6dd42)
  24: std::thread::local::LocalKey<T>::with::h242cf5a3248d4b16 (0x55c476df4a9d)
  25: <script::dom::servoparser::ParserContext as net_traits::FetchResponseListener>::process_response::hfea17b04486e1186 (0x55c476f1bc2d)
  26: script::script_thread::ScriptThread::handle_msg_from_constellation::hb6598e64e0108a83 (0x55c476f5ce4f)
  27: _ZN6script13script_thread12ScriptThread11handle_msgs17h5e27f5d6a2fa27a7E.llvm.9809419130639410203 (0x55c476f5753c)
  28: profile_traits::mem::ProfilerChan::run_with_memory_reporting::hd3dc945019b6cd59 (0x55c4770d70e7)
  29: std::sys_common::backtrace::__rust_begin_short_backtrace::he8d3dd1d346dafa0 (0x55c4774163b0)
  30: _ZN3std9panicking3try7do_call17h8ff058ddb9dca993E.llvm.11103976371393050535 (0x55c4775336e3)
  31: __rust_maybe_catch_panic (0x55c4795ba109)
             at src/libpanic_unwind/lib.rs:82
  32: core::ops::function::FnOnce::call_once{{vtable.shim}}::hde4c9792015a3419 (0x55c476a34392)
  33: <alloc::boxed::Box<F> as core::ops::function::FnOnce<A>>::call_once::h10faab75e4737451 (0x55c47959ec3e)
             at /rustc/273f42b5964c29dda2c5a349dd4655529767b07f/src/liballoc/boxed.rs:766
  34: <alloc::boxed::Box<F> as core::ops::function::FnOnce<A>>::call_once::h1aad66e3be56d28c (0x55c4795b942f)
             at /rustc/273f42b5964c29dda2c5a349dd4655529767b07f/src/liballoc/boxed.rs:766
      std::sys_common::thread::start_thread::h9a8131af389e9a10
             at src/libstd/sys_common/thread.rs:13
      std::sys::unix::thread::Thread::new::thread_start::h5c5575a5d923977b
             at src/libstd/sys/unix/thread.rs:79
  35: start_thread (0x7f2c7eb0b6b9)
  36: clone (0x7f2c7d3a741c)
  37: <unknown> (0x0)
[2019-08-12T18:08:11Z ERROR servo] assertion failed: self.is_double()
Stack trace for thread "ScriptThread PipelineId { namespace_id: PipelineNamespaceId(2), index: PipelineIndex(2) }"
stack backtrace:
   0: servo::install_crash_handler::handler::h5aff865ed5432d52 (0x55c476358a3a)
   1: _ZL15WasmTrapHandleriP9siginfo_tPv (0x55c477e564ee)
             at /home/joelm/.cargo/git/checkouts/mozjs-fa11ffc7d4f1cc2d/6dff104/mozjs/js/src/wasm/WasmSignalHandlers.cpp:967
   2: <unknown> (0x7f2c7eb1538f)
   3: script::dom::windowproxy::trace::h562b81e7ed033c33 (0x55c477081125)
   4: _ZNK2js5Class7doTraceEP8JSTracerP8JSObject (0x55c477b326c6)
             at /data/joelm/personal/UTA/dissertation/servo/servo-20190724.git/target/release/build/mozjs_sys-f383b7e9a63fc67f/out/dist/include/js/Class.h:872
      _ZL13CallTraceHookIZN2js14TenuringTracer11traceObjectEP8JSObjectE4$_11EPNS0_12NativeObjectEOT_P8JSTracerS3_15CheckGeneration
             at /home/joelm/.cargo/git/checkouts/mozjs-fa11ffc7d4f1cc2d/6dff104/mozjs/js/src/gc/Marking.cpp:1545
      _ZN2js14TenuringTracer11traceObjectEP8JSObject
             at /home/joelm/.cargo/git/checkouts/mozjs-fa11ffc7d4f1cc2d/6dff104/mozjs/js/src/gc/Marking.cpp:2907
   5: _ZL14TraceWholeCellRN2js14TenuringTracerEP8JSObject (0x55c477b324bb)
             at /home/joelm/.cargo/git/checkouts/mozjs-fa11ffc7d4f1cc2d/6dff104/mozjs/js/src/gc/Marking.cpp:2798
      _ZL18TraceBufferedCellsI8JSObjectEvRN2js14TenuringTracerEPNS1_2gc5ArenaEPNS4_12ArenaCellSetE
             at /home/joelm/.cargo/git/checkouts/mozjs-fa11ffc7d4f1cc2d/6dff104/mozjs/js/src/gc/Marking.cpp:2835
      _ZN2js2gc11StoreBuffer15WholeCellBuffer5traceERNS_14TenuringTracerE
             at /home/joelm/.cargo/git/checkouts/mozjs-fa11ffc7d4f1cc2d/6dff104/mozjs/js/src/gc/Marking.cpp:2852
   6: _ZN2js2gc11StoreBuffer15traceWholeCellsERNS_14TenuringTracerE (0x55c477b49575)
             at /home/joelm/.cargo/git/checkouts/mozjs-fa11ffc7d4f1cc2d/6dff104/mozjs/js/src/gc/StoreBuffer.h:479
      _ZN2js7Nursery12doCollectionEN2JS8GCReasonERNS_2gc16TenureCountCacheE
             at /home/joelm/.cargo/git/checkouts/mozjs-fa11ffc7d4f1cc2d/6dff104/mozjs/js/src/gc/Nursery.cpp:946
   7: _ZN2js7Nursery7collectEN2JS8GCReasonE (0x55c477b48805)
             at /home/joelm/.cargo/git/checkouts/mozjs-fa11ffc7d4f1cc2d/6dff104/mozjs/js/src/gc/Nursery.cpp:783
   8: _ZN2js2gc9GCRuntime7minorGCEN2JS8GCReasonENS_7gcstats9PhaseKindE (0x55c477b29de3)
             at /home/joelm/.cargo/git/checkouts/mozjs-fa11ffc7d4f1cc2d/6dff104/mozjs/js/src/gc/GC.cpp:7787
   9: _ZN2js2gc9GCRuntime13gcIfRequestedEv (0x55c477b0d3d4)
             at /home/joelm/.cargo/git/checkouts/mozjs-fa11ffc7d4f1cc2d/6dff104/mozjs/js/src/gc/GC.cpp:7846
  10: _ZN2js2gc9GCRuntime22gcIfNeededAtAllocationEP9JSContext (0x55c477b0999c)
             at /home/joelm/.cargo/git/checkouts/mozjs-fa11ffc7d4f1cc2d/6dff104/mozjs/js/src/gc/Allocator.cpp:343
      _ZN2js2gc9GCRuntime19checkAllocatorStateILNS_7AllowGCE1EEEbP9JSContextNS0_9AllocKindE
             at /home/joelm/.cargo/git/checkouts/mozjs-fa11ffc7d4f1cc2d/6dff104/mozjs/js/src/gc/Allocator.cpp:300
      _ZN2js14AllocateObjectILNS_7AllowGCE1EEEP8JSObjectP9JSContextNS_2gc9AllocKindEmNS6_11InitialHeapEPKNS_5ClassE
             at /home/joelm/.cargo/git/checkouts/mozjs-fa11ffc7d4f1cc2d/6dff104/mozjs/js/src/gc/Allocator.cpp:55
  11: _ZN2js11ProxyObject6createEP9JSContextPKNS_5ClassEN2JS6HandleINS_11TaggedProtoEEENS_2gc9AllocKindENS_13NewObjectKindE (0x55c4778da99f)
             at /home/joelm/.cargo/git/checkouts/mozjs-fa11ffc7d4f1cc2d/6dff104/mozjs/js/src/vm/ProxyObject.cpp:199
  12: _ZN2js11ProxyObject3NewEP9JSContextPKNS_16BaseProxyHandlerEN2JS6HandleINS6_5ValueEEENS_11TaggedProtoERKNS_12ProxyOptionsE (0x55c4778da444)
             at /home/joelm/.cargo/git/checkouts/mozjs-fa11ffc7d4f1cc2d/6dff104/mozjs/js/src/vm/ProxyObject.cpp:100
  13: _ZN2js14NewProxyObjectEP9JSContextPKNS_16BaseProxyHandlerEN2JS6HandleINS5_5ValueEEEP8JSObjectRKNS_12ProxyOptionsE (0x55c477a4cb01)
             at /home/joelm/.cargo/git/checkouts/mozjs-fa11ffc7d4f1cc2d/6dff104/mozjs/js/src/proxy/Proxy.cpp:779
      _ZN2js7Wrapper3NewEP9JSContextP8JSObjectPKS0_RKNS_14WrapperOptionsE
             at /home/joelm/.cargo/git/checkouts/mozjs-fa11ffc7d4f1cc2d/6dff104/mozjs/js/src/proxy/Wrapper.cpp:282
  14: WrapperNew (0x55c47777a737)
  15: _ZN2JS11Compartment18getOrCreateWrapperEP9JSContextNS_6HandleIP8JSObjectEENS_13MutableHandleIS5_EE (0x55c47780a87c)
             at /home/joelm/.cargo/git/checkouts/mozjs-fa11ffc7d4f1cc2d/6dff104/mozjs/js/src/vm/Compartment.cpp:268
  16: _ZN2JS11Compartment6rewrapEP9JSContextNS_13MutableHandleIP8JSObjectEENS_6HandleIS5_EE (0x55c47780ac61)
             at /home/joelm/.cargo/git/checkouts/mozjs-fa11ffc7d4f1cc2d/6dff104/mozjs/js/src/vm/Compartment.cpp:357
  17: _ZN2js12RemapWrapperEP9JSContextP8JSObjectS3_ (0x55c477a3c26e)
             at /home/joelm/.cargo/git/checkouts/mozjs-fa11ffc7d4f1cc2d/6dff104/mozjs/js/src/proxy/CrossCompartmentWrapper.cpp:582
  18: _ZN2js25RemapAllWrappersForObjectEP9JSContextP8JSObjectS3_ (0x55c477a3c7c2)
             at /home/joelm/.cargo/git/checkouts/mozjs-fa11ffc7d4f1cc2d/6dff104/mozjs/js/src/proxy/CrossCompartmentWrapper.cpp:633
  19: _Z19JS_TransplantObjectP9JSContextN2JS6HandleIP8JSObjectEES5_ (0x55c477a0dffc)
             at /home/joelm/.cargo/git/checkouts/mozjs-fa11ffc7d4f1cc2d/6dff104/mozjs/js/src/jsapi.cpp:731
  20: _ZN6script3dom11windowproxy11WindowProxy10set_window17h7e22b4eb25e69698E.llvm.14539727287563764700 (0x55c47707f3b3)
  21: script::dom::window::Window::resume::h463e05b98e601c21 (0x55c476a8311d)
  22: script::script_thread::ScriptThread::load::h840f42e218b48e42 (0x55c476f72344)
  23: script::script_thread::ScriptThread::handle_page_headers_available::hc7e938eff32254e6 (0x55c476f6dd42)
  24: std::thread::local::LocalKey<T>::with::h242cf5a3248d4b16 (0x55c476df4a9d)
  25: <script::dom::servoparser::ParserContext as net_traits::FetchResponseListener>::process_response::hfea17b04486e1186 (0x55c476f1bc2d)
  26: script::script_thread::ScriptThread::handle_msg_from_constellation::hb6598e64e0108a83 (0x55c476f5ce4f)
  27: _ZN6script13script_thread12ScriptThread11handle_msgs17h5e27f5d6a2fa27a7E.llvm.9809419130639410203 (0x55c476f5753c)
  28: profile_traits::mem::ProfilerChan::run_with_memory_reporting::hd3dc945019b6cd59 (0x55c4770d70e7)
  29: std::sys_common::backtrace::__rust_begin_short_backtrace::he8d3dd1d346dafa0 (0x55c4774163b0)
  30: _ZN3std9panicking3try7do_call17h8ff058ddb9dca993E.llvm.11103976371393050535 (0x55c4775336e3)
  31: __rust_maybe_catch_panic (0x55c4795ba109)
             at src/libpanic_unwind/lib.rs:82
  32: core::ops::function::FnOnce::call_once{{vtable.shim}}::hde4c9792015a3419 (0x55c476a34392)
  33: <alloc::boxed::Box<F> as core::ops::function::FnOnce<A>>::call_once::h10faab75e4737451 (0x55c47959ec3e)
             at /rustc/273f42b5964c29dda2c5a349dd4655529767b07f/src/liballoc/boxed.rs:766
  34: <alloc::boxed::Box<F> as core::ops::function::FnOnce<A>>::call_once::h1aad66e3be56d28c (0x55c4795b942f)
             at /rustc/273f42b5964c29dda2c5a349dd4655529767b07f/src/liballoc/boxed.rs:766
      std::sys_common::thread::start_thread::h9a8131af389e9a10
             at src/libstd/sys_common/thread.rs:13
      std::sys::unix::thread::Thread::new::thread_start::h5c5575a5d923977b
             at src/libstd/sys/unix/thread.rs:79
  35: start_thread (0x7f2c7eb0b6b9)
  36: clone (0x7f2c7d3a741c)
  37: <unknown> (0x0)
Servo exited with return value 4

maybe GC in mozjs?

It looks related to tracing the windowproxy, but I'm not sure what this is. It can be summarized as:

  1. A new page loads
  2. script::dom::window::Window::resume is called
  3. This calls into the window proxy WindowProxy.set_window
  4. It ends in a crash at script::dom::windowproxy::trace.

@asajeffrey any ideas?

I monitored the memory usage during the run and it appears that there is a pretty consistent/steady increase of 50-60MB resident size for every 100 page loads.

@kanaka I've just made a change that removes all caching of pages in the session history; this might have an effect and keep the memory usage down.

Oh gosh, that's a nasty stack trace. A GC got triggered during the middle of a brain transplant (https://github.com/servo/servo/blob/9b247983902d1fca945a0caff029c42d13c846e7/components/script/dom/windowproxy.rs#L586), and that crashed while tracing the window proxy that was in the middle of being transplanted.

Can you file a separate issue for this and cc me? I doubt it has anything to do with the issues around fds; it's more likely caused by not rooting enough objects in set_window.

@gterzian I'm rebuilding to test again with your session history caching change. Shouldn't the session history caching detect that the file hasn't changed and re-use the existing content? Maybe the simple Python web server I'm using for this doesn't allow proper caching?

Shouldn't the session history caching detect that the file hasn't changed and re-use the existing content?

I think so; it would only cache previous history states for a given "tab". It might not do anything if you are indeed reloading the same page.

I'm not sure where the steady memory build-up would come from when only reloading the same page many times. It might be because we create a new layout thread each time? I'm not sure; there are quite a few moving pieces involved in loading a page...

Although in theory we should also remove layout-threads when a page is unloaded...

I'm not completely sure, but I think each load at the same url would create a new pipeline, with a new layout-thread, and those would stay alive until the session history containing them is trimmed (and even if the content is cached at the HTTP layer, I think we still create new pipelines).

It's worth trying to run this with a session-history length of 0, and then looking into the details...

Yes, it's reloading http://localhost:9080/test3.html every time. I did a manual curl of the page and it looks like python's simple web server is in fact returning a correct Last-Modified header.

I just started a test run and should have a result in a few minutes.

it looks like python's simple web server is in fact returning a correct Last-Modified header.

Yes, the HTTP caching behavior would be unrelated to creating a new pipeline, I think. So the caching would save the network round-trip, while still creating a new layout thread (need to check).

The pipeline would be retrieved ("cached") from the session history only if you were to do something like history.back (or push the back button, if there were one).

So loading a page many times might actually fill up the session history; I'm not quite sure.

I'll let it run to conclusion, but it's up to 400+ loads so far and the resident memory is growing at a steady 57MB per 100 page loads.

Forgot to post an update here since #23959 is the tracker for that. It crashed at the same spot after 819 loads and the memory growth was at the same rate.

I have built using 316532e0cfc81 (part of #23909) and I've also increased the complexity of one of the CSS files that is included 10 times. I got the following crash on the 1236th page load:

$ ./mach run --release -z --webdriver=7002 --resolution=400x300



index out of bounds: the len is 0 but the index is 0 (thread Memory profiler, at /rustc/273f42b5964c29dda2c5a349dd4655529767b07f/src/libcore/slice/mod.rs:2687)
Failed to create IPC channel!: Os { code: 24, kind: Other, message: "Too many open files" } (thread Constellation, at src/libcore/result.rs:1051)
stack backtrace:
   0: servo::main::{{closure}}::h60525228e0a9f42a (0x55c4073aa70f)
   1: std::panicking::rust_panic_with_hook::h0529069ab88f357a (0x55c40a601135)
             at src/libstd/panicking.rs:481
   2: std::panicking::continue_panic_fmt::h6a820a3cd2914e74 (0x55c40a600bd1)
             at src/libstd/panicking.rs:384
   3: rust_begin_unwind (0x55c40a600ab5)
             at src/libstd/panicking.rs:311
   4: core::panicking::panic_fmt::he00cfaca5555542a (0x55c40a62381c)
             at src/libcore/panicking.rs:85
   5: core::panicking::panic_bounds_check::h3334a903dc8fe229 (0x55c40a6237d4)
             at src/libcore/panicking.rs:61
   6: std::thread::local::LocalKey<T>::with::hf394f06cd5f96290 (0x55c407720500)
   7: _ZN11ipc_channel3ipc25deserialize_os_ipc_sender17h028f5a20169aec4fE.llvm.426881690177531104 (0x55c4076f78f9)
   8: <&mut bincode::de::Deserializer<R,O> as serde::de::VariantAccess>::tuple_variant::h901b33aeccdedf74 (0x55c407709a76)
   9: <profile_traits::mem::_IMPL_DESERIALIZE_FOR_ProfilerMsg::<impl serde::de::Deserialize for profile_traits::mem::ProfilerMsg>::deserialize::__Visitor as serde::de::Visitor>::visit_enum::h7639dceb33562ecc (0x55c407703e46)
  10: bincode::deserialize::hfffe5305b07d63bf (0x55c407702727)
  11: std::thread::local::LocalKey<T>::with::h327d59b0449ee740 (0x55c407720c2d)
  12: ipc_channel::ipc::IpcReceiver<T>::recv::h86d3d5985252ac64 (0x55c4076f74dd)
  13: profile::mem::Profiler::start::h669e1ee0f0648ed2 (0x55c4076f4a48)
  14: std::sys_common::backtrace::__rust_begin_short_backtrace::hfd5194097962eef1 (0x55c40770da0f)
  15: __rust_maybe_catch_panic (0x55c40a60ade9)
             at src/libpanic_unwind/lib.rs:82
  16: core::ops::function::FnOnce::call_once{{vtable.shim}}::hc6335b49c5f364c9 (0x55c4076f1ad6)
  17: <alloc::boxed::Box<F> as core::ops::function::FnOnce<A>>::call_once::h10faab75e4737451 (0x55c40a5ef91e)
             at /rustc/273f42b5964c29dda2c5a349dd4655529767b07f/src/liballoc/boxed.rs:766
  18: <alloc::boxed::Box<F> as core::ops::function::FnOnce<A>>::call_once::h1aad66e3be56d28c (0x55c40a60a10f)
             at /rustc/273f42b5964c29dda2c5a349dd4655529767b07f/src/liballoc/boxed.rs:766
      std::sys_common::thread::start_thread::h9a8131af389e9a10
             at src/libstd/sys_common/thread.rs:13
      std::sys::unix::thread::Thread::new::thread_start::h5c5575a5d923977b
             at src/libstd/sys/unix/thread.rs:79
  19: start_thread (0x7ff9a50136b9)
  20: clone (0x7ff9a38af41c)
  21: <unknown> (0x0)
[2019-08-14T16:19:52Z ERROR servo] index out of bounds: the len is 0 but the index is 0
stack backtrace:
   0: servo::main::{{closure}}::h60525228e0a9f42a (0x55c4073aa70f)
   1: std::panicking::rust_panic_with_hook::h0529069ab88f357a (0x55c40a601135)
             at src/libstd/panicking.rs:481
   2: std::panicking::continue_panic_fmt::h6a820a3cd2914e74 (0x55c40a600bd1)
             at src/libstd/panicking.rs:384
   3: rust_begin_unwind (0x55c40a600ab5)
             at src/libstd/panicking.rs:311
   4: core::panicking::panic_fmt::he00cfaca5555542a (0x55c40a62381c)
             at src/libcore/panicking.rs:85
   5: core::result::unwrap_failed::h4239b9d80132a0db (0x55c40a623916)
             at src/libcore/result.rs:1051
   6: constellation::network_listener::NetworkListener::initiate_fetch::h3af52c632e2be6c4 (0x55c4091c16eb)
   7: constellation::constellation::Constellation<Message,LTF,STF>::handle_request_from_script::h41903a232fe17de6 (0x55c4075510c9)
   8: constellation::constellation::Constellation<Message,LTF,STF>::run::h08fa371efa88aa3a (0x55c4075664b1)
   9: std::sys_common::backtrace::__rust_begin_short_backtrace::h677f09aabbca1ec1 (0x55c407468465)
  10: std::panicking::try::do_call::hc170fd9aa1bedd4a (0x55c40746cd35)
  11: __rust_maybe_catch_panic (0x55c40a60ade9)
             at src/libpanic_unwind/lib.rs:82
  12: core::ops::function::FnOnce::call_once{{vtable.shim}}::h70e5314c6c21c735 (0x55c40746d0d5)
  13: <alloc::boxed::Box<F> as core::ops::function::FnOnce<A>>::call_once::h10faab75e4737451 (0x55c40a5ef91e)
             at /rustc/273f42b5964c29dda2c5a349dd4655529767b07f/src/liballoc/boxed.rs:766
  14: <alloc::boxed::Box<F> as core::ops::function::FnOnce<A>>::call_once::h1aad66e3be56d28c (0x55c40a60a10f)
             at /rustc/273f42b5964c29dda2c5a349dd4655529767b07f/src/liballoc/boxed.rs:766
      std::sys_common::thread::start_thread::h9a8131af389e9a10
             at src/libstd/sys_common/thread.rs:13
      std::sys::unix::thread::Thread::new::thread_start::h5c5575a5d923977b
             at src/libstd/sys/unix/thread.rs:79
  15: start_thread (0x7ff9a50136b9)
  16: clone (0x7ff9a38af41c)
  17: <unknown> (0x0)
[2019-08-14T16:19:52Z ERROR servo] Failed to create IPC channel!: Os { code: 24, kind: Other, message: "Too many open files" }
^CServo exited with return value -2

OK, so that's constellation::network_listener::NetworkListener::initiate_fetch, which also creates an ipc-channel for each fetch. It should not be a super hard fix, since the constellation can just use its own local router, like I've already done in script.

Since the ipc-channel is created repeatedly on a "per-operation" basis, this is still a feasible refactoring.
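To illustrate the general shape of that refactoring (a sketch with hypothetical types, not necessarily how the "local router" change in the PR is implemented): create the reply channel once when the constellation starts, and hand a clone of its long-lived sender to each fetch, instead of calling ipc::channel() inside initiate_fetch for every request.

// Sketch only: one long-lived channel, cloned sender per operation.
use ipc_channel::ipc::{self, IpcReceiver, IpcSender};

#[derive(serde::Serialize, serde::Deserialize)]
struct FetchUpdate {
    request_id: u64,
}

struct Constellation {
    fetch_reply_tx: IpcSender<FetchUpdate>,
    fetch_reply_rx: IpcReceiver<FetchUpdate>,
}

impl Constellation {
    fn new() -> Self {
        // One channel for the lifetime of the constellation.
        let (tx, rx) = ipc::channel().expect("failed to create IPC channel");
        Constellation { fetch_reply_tx: tx, fetch_reply_rx: rx }
    }

    fn initiate_fetch(&self, request_id: u64) {
        // No per-fetch ipc::channel() call; just pass a clone of the
        // long-lived sender along with the request.
        let reply = self.fetch_reply_tx.clone();
        let _ = reply.send(FetchUpdate { request_id }); // stand-in for the real fetch
    }
}

fn main() {
    let c = Constellation::new();
    c.initiate_fetch(1);
    let update = c.fetch_reply_rx.recv().expect("recv failed");
    println!("fetch {} done", update.request_id);
}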

A harder limit to overcome will be a crash on creating an essential ipc-channel: one that is used for as long as the browser is running to set up a communication line between two long-running components, or one that is used for as long as a page is running for communication between that page and another long-running component. If we're lucky we won't hit such a limit, since those channels aren't created that often.

Also, adding lots of images to the test will make it crash, and that's not something that is fixable with the router prototype, because it does not support messages containing shared-memory, which ipc-messages from the image-cache contain.

@kanaka OK, this was taken care of in https://github.com/servo/servo/pull/23909, ready for another spin...

I'll give it another shot.

I rebuilt and tested. Panic on 1235:

$ ./mach run --release -z --webdriver=7002 --resolution=400x300



index out of bounds: the len is 0 but the index is 0 (thread FontCacheThread, at /rustc/273f42b5964c29dda2c5a349dd4655529767b07f/src/libcore/slice/mod.rs:2687)
stack backtrace:
   0: servo::main::{{closure}}::h4626533f66eeeaff (0x55a7caa7d8df)
   1: std::panicking::rust_panic_with_hook::h0529069ab88f357a (0x55a7cdcda155)
             at src/libstd/panicking.rs:481
   2: std::panicking::continue_panic_fmt::h6a820a3cd2914e74 (0x55a7cdcd9bf1)
             at src/libstd/panicking.rs:384
   3: rust_begin_unwind (0x55a7cdcd9ad5)
             at src/libstd/panicking.rs:311
   4: core::panicking::panic_fmt::he00cfaca5555542a (0x55a7cdcfc83c)
             at src/libcore/panicking.rs:85
   5: core::panicking::panic_bounds_check::h3334a903dc8fe229 (0x55a7cdcfc7f4)
             at src/libcore/panicking.rs:61
   6: std::thread::local::LocalKey<T>::with::he24c106e0d8aa549 (0x55a7ccc131e0)
   7: _ZN11ipc_channel3ipc25deserialize_os_ipc_sender17hb93c845a1ccecf09E.llvm.10778825050291159297 (0x55a7ccbf4729)
   8: <&mut bincode::de::Deserializer<R,O> as serde::de::VariantAccess>::tuple_variant::hb3ea552309d411b6 (0x55a7ccc52eda)
   9: <gfx::font_cache_thread::_IMPL_DESERIALIZE_FOR_Command::<impl serde::de::Deserialize for gfx::font_cache_thread::Command>::deserialize::__Visitor as serde::de::Visitor>::visit_enum::h76b38db0cc975890 (0x55a7ccc38eb9)
  10: bincode::deserialize::he738282d38e18dcd (0x55a7ccbfc467)
  11: std::thread::local::LocalKey<T>::with::hb36f88408bc774e8 (0x55a7ccc132fd)
  12: ipc_channel::ipc::IpcReceiver<T>::recv::he4170025608adad8 (0x55a7ccbf43ad)
  13: gfx::font_cache_thread::FontCache::run::h097efedd8dbdbe23 (0x55a7ccc31714)
  14: std::sys_common::backtrace::__rust_begin_short_backtrace::h0b5b51c7cd8fa7ba (0x55a7ccbec825)
  15: _ZN3std9panicking3try7do_call17h8f79c11b9595428eE.llvm.11595909282632282622 (0x55a7ccbfbb04)
  16: __rust_maybe_catch_panic (0x55a7cdce3e09)
             at src/libpanic_unwind/lib.rs:82
  17: core::ops::function::FnOnce::call_once{{vtable.shim}}::hecf6ea68b5a7a379 (0x55a7ccbfbbb9)
  18: <alloc::boxed::Box<F> as core::ops::function::FnOnce<A>>::call_once::h10faab75e4737451 (0x55a7cdcc893e)
             at /rustc/273f42b5964c29dda2c5a349dd4655529767b07f/src/liballoc/boxed.rs:766
  19: <alloc::boxed::Box<F> as core::ops::function::FnOnce<A>>::call_once::h1aad66e3be56d28c (0x55a7cdce312f)
             at /rustc/273f42b5964c29dda2c5a349dd4655529767b07f/src/liballoc/boxed.rs:766
      std::sys_common::thread::start_thread::h9a8131af389e9a10
             at src/libstd/sys_common/thread.rs:13
      std::sys::unix::thread::Thread::new::thread_start::h5c5575a5d923977b
             at src/libstd/sys/unix/thread.rs:79
  20: start_thread (0x7f1cefd856b9)
  21: clone (0x7f1cee62141c)
  22: <unknown> (0x0)
[2019-08-14T21:05:50Z ERROR servo] index out of bounds: the len is 0 but the index is 0
^CServo exited with return value -2

This looks like a panic when receiving an ipc message containing an ipc-sender, hit when trying to deserialize the sender.

Essentially it hits:

https://github.com/servo/ipc-channel/blob/ca6d860118fdb3fd7d0d0872866ded9f89cbde7b/src/ipc.rs#L818

However, it's not clear why. Is this another kind of resource exhaustion? I don't know.


On another matter, I was suddenly wondering whether the root cause of this might be that we're somehow leaking some resource related to taking screenshots. Could it be that we're not closing those image files properly?
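One way to check that hypothesis while the test loop runs: a tiny standalone watcher (a sketch, not part of Servo; assumes Linux with a readable /proc) that prints how many file descriptors a given pid has open, so the growth can be correlated with page loads and with the screenshot step, regardless of which thread panics first.

// Standalone helper. Usage: fdwatch <pid>
use std::{env, fs, thread, time::Duration};

fn open_fd_count(pid: u32) -> std::io::Result<usize> {
    // Each entry in /proc/<pid>/fd is one open file descriptor.
    Ok(fs::read_dir(format!("/proc/{}/fd", pid))?.count())
}

fn main() -> std::io::Result<()> {
    let pid: u32 = env::args()
        .nth(1)
        .and_then(|s| s.parse().ok())
        .expect("usage: fdwatch <pid>");
    loop {
        println!("open fds: {}", open_fd_count(pid)?);
        thread::sleep(Duration::from_secs(5));
    }
}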

@kanaka Could you perhaps try running without this line:

https://gist.github.com/kanaka/119f5ed9841e23e35d07e8944cca6aa7#file-load_test3-sh-L15

Also, on my Mac I can't actually open any of the screenshots; they seem to be corrupted in some way. Can you?

@gterzian Okay, I ran a test without the screenshot (a 0.3 second sleep instead, although I plan to shrink that later to see if it's actually needed). I updated the gist with the current test script that I used. I also do the CSS file duplication in the script and download the AHEM font if it isn't already there. I'm actually wondering whether the reason you don't see any panics/crashes at all is that you don't have the AHEM font. My fault for not making that clear in the gist, but the new script should download it automatically if it's not there.

The test certainly runs a lot faster without the screenshot. However, it still crashed (after 1151 loads) and I still see steady memory growth (about 71MB per 100 loads). One data point that may or may not be interesting is that the loads speed up quite a bit after the first 9 or 10. Here is the trace. The "No font found for text run!" in the middle sticks out to me, not sure if that's interesting or not.

$ ./mach run --release -z --webdriver=7002 --resolution=400x300



No font found for text run! (thread LayoutThread PipelineId { namespace_id: PipelineNamespaceId(2), index: PipelineIndex(2305) }, at src/libcore/option.rs:1155)
stack backtrace:
   0: servo::main::{{closure}}::h4626533f66eeeaff (0x559331d868df)
   1: std::panicking::rust_panic_with_hook::h0529069ab88f357a (0x559334fe3155)
             at src/libstd/panicking.rs:481
   2: std::panicking::continue_panic_fmt::h6a820a3cd2914e74 (0x559334fe2bf1)
             at src/libstd/panicking.rs:384
   3: rust_begin_unwind (0x559334fe2ad5)
             at src/libstd/panicking.rs:311
   4: core::panicking::panic_fmt::he00cfaca5555542a (0x55933500583c)
             at src/libcore/panicking.rs:85
   5: core::option::expect_failed::hda3371b431c2ae7c (0x5593350058a6)
             at src/libcore/option.rs:1155
   6: layout::text::TextRunScanner::scan_for_runs::hbe2e999d911c9c22 (0x559333a57234)
   7: std::thread::local::LocalKey<T>::with::h1ae8f1e904247562 (0x559332184d3d)
   8: layout::construct::FlowConstructor<ConcreteThreadSafeLayoutNode>::build_flow_for_block_starting_with_fragments::h2046a790499a0421 (0x5593321f1646)
   9: layout::construct::FlowConstructor<ConcreteThreadSafeLayoutNode>::build_flow_for_block_like::h0d5acb411a1f7bee (0x5593321ee403)
  10: layout::construct::FlowConstructor<ConcreteThreadSafeLayoutNode>::build_flow_for_block::hb857f1248aa6f0fc (0x5593321ed9a7)
  11: <layout::construct::FlowConstructor<ConcreteThreadSafeLayoutNode> as layout::traversal::PostorderNodeMutTraversal<ConcreteThreadSafeLayoutNode>>::process::heb2614a25083e698 (0x5593321dff50)
  12: style::traversal::DomTraversal::handle_postorder_traversal::ha55581bf46408a23 (0x55933219ae1d)
  13: style::driver::traverse_dom::h5dfd1bc593a7ea93 (0x5593321ebd15)
  14: profile_traits::time::profile::h42151df1cd5c051d (0x5593322bff0d)
  15: layout_thread::LayoutThread::handle_reflow::h5590aa3ae833f52a (0x5593321ad261)
  16: profile_traits::time::profile::hbe91ab6345b23f61 (0x5593322c20e0)
  17: layout_thread::LayoutThread::handle_request_helper::h7828d77a13e4a469 (0x5593321a5f57)
  18: layout_thread::LayoutThread::start::had947240cbd73635 (0x5593321a4952)
  19: profile_traits::mem::ProfilerChan::run_with_memory_reporting::hf45b025b46065f78 (0x55933221ac43)
  20: std::sys_common::backtrace::__rust_begin_short_backtrace::hb2625d2eeaccc082 (0x559332264e7b)
  21: _ZN3std9panicking3try7do_call17h446a09e2d3abf1feE.llvm.18010773716281589967 (0x55933221dab5)
  22: __rust_maybe_catch_panic (0x559334fece09)
             at src/libpanic_unwind/lib.rs:82
  23: core::ops::function::FnOnce::call_once{{vtable.shim}}::h8343d019ca476207 (0x5593322c52e5)
  24: <alloc::boxed::Box<F> as core::ops::function::FnOnce<A>>::call_once::h10faab75e4737451 (0x559334fd193e)
             at /rustc/273f42b5964c29dda2c5a349dd4655529767b07f/src/liballoc/boxed.rs:766
  25: <alloc::boxed::Box<F> as core::ops::function::FnOnce<A>>::call_once::h1aad66e3be56d28c (0x559334fec12f)
             at /rustc/273f42b5964c29dda2c5a349dd4655529767b07f/src/liballoc/boxed.rs:766
      std::sys_common::thread::start_thread::h9a8131af389e9a10
             at src/libstd/sys_common/thread.rs:13
      std::sys::unix::thread::Thread::new::thread_start::h5c5575a5d923977b
             at src/libstd/sys/unix/thread.rs:79
  26: start_thread (0x7fb948bf86b9)
  27: clone (0x7fb94749441c)
  28: <unknown> (0x0)
[2019-08-15T15:23:26Z ERROR servo] No font found for text run!
ipc channel failure: Os { code: 24, kind: Other, message: "Too many open files" } (thread LayoutThread PipelineId { namespace_id: PipelineNamespaceId(2), index: PipelineIndex(2307) }, at src/libcore/result.rs:1051)
stack backtrace:
   0: servo::main::{{closure}}::h4626533f66eeeaff (0x559331d868df)
   1: std::panicking::rust_panic_with_hook::h0529069ab88f357a (0x559334fe3155)
             at src/libstd/panicking.rs:481
   2: std::panicking::continue_panic_fmt::h6a820a3cd2914e74 (0x559334fe2bf1)
             at src/libstd/panicking.rs:384
   3: rust_begin_unwind (0x559334fe2ad5)
             at src/libstd/panicking.rs:311
   4: core::panicking::panic_fmt::he00cfaca5555542a (0x55933500583c)
             at src/libcore/panicking.rs:85
   5: core::result::unwrap_failed::h4239b9d80132a0db (0x559335005936)
             at src/libcore/result.rs:1051
   6: std::sys_common::backtrace::__rust_begin_short_backtrace::hb2625d2eeaccc082 (0x559332264f43)
   7: _ZN3std9panicking3try7do_call17h446a09e2d3abf1feE.llvm.18010773716281589967 (0x55933221dab5)
   8: __rust_maybe_catch_panic (0x559334fece09)
             at src/libpanic_unwind/lib.rs:82
   9: core::ops::function::FnOnce::call_once{{vtable.shim}}::h8343d019ca476207 (0x5593322c52e5)
  10: <alloc::boxed::Box<F> as core::ops::function::FnOnce<A>>::call_once::h10faab75e4737451 (0x559334fd193e)
             at /rustc/273f42b5964c29dda2c5a349dd4655529767b07f/src/liballoc/boxed.rs:766
  11: <alloc::boxed::Box<F> as core::ops::function::FnOnce<A>>::call_once::h1aad66e3be56d28c (0x559334fec12f)
             at /rustc/273f42b5964c29dda2c5a349dd4655529767b07f/src/liballoc/boxed.rs:766
      std::sys_common::thread::start_thread::h9a8131af389e9a10
             at src/libstd/sys_common/thread.rs:13
      std::sys::unix::thread::Thread::new::thread_start::h5c5575a5d923977b
             at src/libstd/sys/unix/thread.rs:79
  12: start_thread (0x7fb948bf86b9)
  13: clone (0x7fb94749441c)
  14: <unknown> (0x0)
[2019-08-15T15:23:26Z ERROR servo] ipc channel failure: Os { code: 24, kind: Other, message: "Too many open files" }
Pipeline main chan: Os { code: 24, kind: Other, message: "Too many open files" } (thread Constellation, at src/libcore/result.rs:1051)
assertion failed: !self.Document().needs_reflow() ||
    (!for_display && self.Document().needs_paint()) ||
    self.suppress_reflow.get() (thread ScriptThread PipelineId { namespace_id: PipelineNamespaceId(2), index: PipelineIndex(3) }, at components/script/dom/window.rs:1569)
called `Result::unwrap()` on an `Err` value: "SendError(..)" (thread ScriptThread PipelineId { namespace_id: PipelineNamespaceId(2), index: PipelineIndex(2307) }, at src/libcore/result.rs:1051)
stack backtrace:
   0: servo::main::{{closure}}::h4626533f66eeeaff (0x559331d868df)
   1: std::panicking::rust_panic_with_hook::h0529069ab88f357a (0x559334fe3155)
             at src/libstd/panicking.rs:481
   2: std::panicking::continue_panic_fmt::h6a820a3cd2914e74 (0x559334fe2bf1)
             at src/libstd/panicking.rs:384
   3: rust_begin_unwind (0x559334fe2ad5)
             at src/libstd/panicking.rs:311
   4: core::panicking::panic_fmt::he00cfaca5555542a (0x55933500583c)
             at src/libcore/panicking.rs:85
   5: core::result::unwrap_failed::h4239b9d80132a0db (0x559335005936)
             at src/libcore/result.rs:1051
   6: constellation::pipeline::Pipeline::spawn::h2e0dd6502433886e (0x559331f03c32)
   7: constellation::constellation::Constellation<Message,LTF,STF>::new_pipeline::h351ec532d56d1c3e (0x559331f23330)
   8: constellation::constellation::Constellation<Message,LTF,STF>::handle_panic::h042f04c7b3b9871a (0x559331f21ecb)
   9: constellation::constellation::Constellation<Message,LTF,STF>::handle_log_entry::h8b6d45b4c1ba0fdb (0x559331f25391)
  10: constellation::constellation::Constellation<Message,LTF,STF>::handle_request_from_compositor::h654ac91ab92dcc30 (0x559331f3a95d)
  11: constellation::constellation::Constellation<Message,LTF,STF>::run::h34106f2ed53bd9f4 (0x559331f42ab3)
  12: std::sys_common::backtrace::__rust_begin_short_backtrace::h117d5309a5f2acd1 (0x559331e44565)
  13: std::panicking::try::do_call::hb2ecaaff51af18d9 (0x559331e48e35)
  14: __rust_maybe_catch_panic (0x559334fece09)
             at src/libpanic_unwind/lib.rs:82
  15: core::ops::function::FnOnce::call_once{{vtable.shim}}::hb04874afd1aa4e55 (0x559331e491d5)
  16: <alloc::boxed::Box<F> as core::ops::function::FnOnce<A>>::call_once::h10faab75e4737451 (0x559334fd193e)
             at /rustc/273f42b5964c29dda2c5a349dd4655529767b07f/src/liballoc/boxed.rs:766
  17: <alloc::boxed::Box<F> as core::ops::function::FnOnce<A>>::call_once::h1aad66e3be56d28c (0x559334fec12f)
             at /rustc/273f42b5964c29dda2c5a349dd4655529767b07f/src/liballoc/boxed.rs:766
      std::sys_common::thread::start_thread::h9a8131af389e9a10
             at src/libstd/sys_common/thread.rs:13
      std::sys::unix::thread::Thread::new::thread_start::h5c5575a5d923977b
             at src/libstd/sys/unix/thread.rs:79
  18: start_thread (0x7fb948bf86b9)
  19: clone (0x7fb94749441c)
  20: <unknown> (0x0)
[2019-08-15T15:23:26Z ERROR servo] Pipeline main chan: Os { code: 24, kind: Other, message: "Too many open files" }
stack backtrace:
   0: servo::main::{{closure}}::h4626533f66eeeaff (0x559331d868df)
   1: std::panicking::rust_panic_with_hook::h0529069ab88f357a (0x559334fe3155)
             at src/libstd/panicking.rs:481
   2: std::panicking::begin_panic::h8a8ec1e1f80328f5 (0x559332f318b4)
   3: script::dom::window::Window::reflow::h18e2fc28b2f65e02 (0x5593324adcb0)
   4: script::dom::document::Document::finish_load::hfee20734d6704f7b (0x559332914483)
   5: script::dom::servoparser::ServoParser::do_parse_sync::h008a9762ab6e11fd (0x559332947667)
   6: profile_traits::time::profile::h68e51e1dd99e1a9d (0x559332b04b6f)
   7: _ZN6script3dom11servoparser11ServoParser10parse_sync17hd8eeaf9d5d6d97e1E.llvm.9251349843847225484 (0x559332946f1f)
   8: <script::dom::servoparser::ParserContext as net_traits::FetchResponseListener>::process_response_eof::h165014bd10d4e390 (0x55933294b54d)
   9: script::script_thread::ScriptThread::handle_msg_from_constellation::hb6598e64e0108a83 (0x5593329899b1)
  10: _ZN6script13script_thread12ScriptThread11handle_msgs17h5e27f5d6a2fa27a7E.llvm.9251349843847225484 (0x55933298420c)
  11: profile_traits::mem::ProfilerChan::run_with_memory_reporting::hd3dc945019b6cd59 (0x559332b03f07)
  12: std::sys_common::backtrace::__rust_begin_short_backtrace::he8d3dd1d346dafa0 (0x559332e43190)
  13: _ZN3std9panicking3try7do_call17h8ff058ddb9dca993E.llvm.11103976371393050535 (0x559332f604c3)
  14: __rust_maybe_catch_panic (0x559334fece09)
             at src/libpanic_unwind/lib.rs:82
  15: core::ops::function::FnOnce::call_once{{vtable.shim}}::hde4c9792015a3419 (0x5593324610e2)
  16: <alloc::boxed::Box<F> as core::ops::function::FnOnce<A>>::call_once::h10faab75e4737451 (0x559334fd193e)
             at /rustc/273f42b5964c29dda2c5a349dd4655529767b07f/src/liballoc/boxed.rs:766
  17: <alloc::boxed::Box<F> as core::ops::function::FnOnce<A>>::call_once::h1aad66e3be56d28c (0x559334fec12f)
             at /rustc/273f42b5964c29dda2c5a349dd4655529767b07f/src/liballoc/boxed.rs:766
      std::sys_common::thread::start_thread::h9a8131af389e9a10
             at src/libstd/sys_common/thread.rs:13
      std::sys::unix::thread::Thread::new::thread_start::h5c5575a5d923977b
             at src/libstd/sys/unix/thread.rs:79
  18: start_thread (0x7fb948bf86b9)
  19: clone (0x7fb94749441c)
  20: <unknown> (0x0)
[2019-08-15T15:23:26Z ERROR servo] assertion failed: !self.Document().needs_reflow() ||
    (!for_display && self.Document().needs_paint()) ||
    self.suppress_reflow.get()
stack backtrace:
   0: servo::main::{{closure}}::h4626533f66eeeaff (0x559331d868df)
   1: std::panicking::rust_panic_with_hook::h0529069ab88f357a (0x559334fe3155)
             at src/libstd/panicking.rs:481
   2: std::panicking::continue_panic_fmt::h6a820a3cd2914e74 (0x559334fe2bf1)
             at src/libstd/panicking.rs:384
   3: rust_begin_unwind (0x559334fe2ad5)
             at src/libstd/panicking.rs:311
   4: core::panicking::panic_fmt::he00cfaca5555542a (0x55933500583c)
             at src/libcore/panicking.rs:85
   5: core::result::unwrap_failed::h4239b9d80132a0db (0x559335005936)
             at src/libcore/result.rs:1051
   6: script::script_thread::InProgressLoad::new::h130dfff9ee7e9eb0 (0x55933297beda)
   7: std::sys_common::backtrace::__rust_begin_short_backtrace::he8d3dd1d346dafa0 (0x559332e4307f)
   8: _ZN3std9panicking3try7do_call17h8ff058ddb9dca993E.llvm.11103976371393050535 (0x559332f604c3)
   9: __rust_maybe_catch_panic (0x559334fece09)
             at src/libpanic_unwind/lib.rs:82
  10: core::ops::function::FnOnce::call_once{{vtable.shim}}::hde4c9792015a3419 (0x5593324610e2)
  11: <alloc::boxed::Box<F> as core::ops::function::FnOnce<A>>::call_once::h10faab75e4737451 (0x559334fd193e)
             at /rustc/273f42b5964c29dda2c5a349dd4655529767b07f/src/liballoc/boxed.rs:766
  12: <alloc::boxed::Box<F> as core::ops::function::FnOnce<A>>::call_once::h1aad66e3be56d28c (0x559334fec12f)
             at /rustc/273f42b5964c29dda2c5a349dd4655529767b07f/src/liballoc/boxed.rs:766
      std::sys_common::thread::start_thread::h9a8131af389e9a10
             at src/libstd/sys_common/thread.rs:13
      std::sys::unix::thread::Thread::new::thread_start::h5c5575a5d923977b
             at src/libstd/sys/unix/thread.rs:79
  13: start_thread (0x7fb948bf86b9)
  14: clone (0x7fb94749441c)
  15: <unknown> (0x0)
[2019-08-15T15:23:26Z ERROR servo] called `Result::unwrap()` on an `Err` value: "SendError(..)"
Stack trace for thread "ScriptThread PipelineId { namespace_id: PipelineNamespaceId(2), index: PipelineIndex(2307) }"
stack backtrace:
   0: servo::install_crash_handler::handler::h8504517799a5dbb4 (0x559331d85b4a)
   1: _ZL15WasmTrapHandleriP9siginfo_tPv (0x5593338832ce)
             at /home/joelm/.cargo/git/checkouts/mozjs-fa11ffc7d4f1cc2d/6dff104/mozjs/js/src/wasm/WasmSignalHandlers.cpp:967
   2: <unknown> (0x7fb948c0238f)
   3: je_rtree_val_read (0x559333fbf157)
             at /home/joelm/.cargo/registry/src/github.com-1ecc6299db9ec823/jemalloc-sys-0.1.4/jemalloc/include/jemalloc/internal/rtree.h:200
      je_rtree_get
             at /home/joelm/.cargo/registry/src/github.com-1ecc6299db9ec823/jemalloc-sys-0.1.4/jemalloc/include/jemalloc/internal/rtree.h:325
      je_chunk_lookup
             at /home/joelm/.cargo/registry/src/github.com-1ecc6299db9ec823/jemalloc-sys-0.1.4/jemalloc/include/jemalloc/internal/chunk.h:89
      huge_node_get
             at /home/joelm/.cargo/registry/src/github.com-1ecc6299db9ec823/jemalloc-sys-0.1.4/jemalloc/src/huge.c:11
   4: je_huge_dalloc (0x559333fbedf8)
             at /home/joelm/.cargo/registry/src/github.com-1ecc6299db9ec823/jemalloc-sys-0.1.4/jemalloc/src/huge.c:424
   5: je_arena_sdalloc (0x559333fa6cd1)
             at /home/joelm/.cargo/registry/src/github.com-1ecc6299db9ec823/jemalloc-sys-0.1.4/jemalloc/include/jemalloc/internal/arena.h:1522
      je_isdalloct
             at include/jemalloc/internal/jemalloc_internal.h:1195
      je_isqalloc
             at include/jemalloc/internal/jemalloc_internal.h:1205
      isfree
             at /home/joelm/.cargo/registry/src/github.com-1ecc6299db9ec823/jemalloc-sys-0.1.4/jemalloc/src/jemalloc.c:1921
   6: core::ptr::real_drop_in_place::hf5d5b4cd0f620e71 (0x559332e50767)
   7: std::sys_common::backtrace::__rust_begin_short_backtrace::he8d3dd1d346dafa0 (0x559332e433bd)
   8: <unknown> (0x559336ae9d87)
Servo exited with return value 11

I removed the sleep and this has revealed a new symptom. At the start of the run there is some high jitter in the amount of time each load takes to complete and return a response to the script. The first 10 seem to be slow, then there are some significant hesitations in the first 100 loads or so. After that things seem pretty consistent per load. However, there is a very noticeable slowdown over time. I'll try to characterize the nature of the slowdown a bit better for the next run.

This run (without the sleep) panicked on load 972 with a short stack:

$ ./mach run --release -z --webdriver=7002 --resolution=400x300



index out of bounds: the len is 0 but the index is 0 (thread FontCacheThread, at /rustc/273f42b5964c29dda2c5a349dd4655529767b07f/src/libcore/slice/mod.rs:2687)
stack backtrace:
   0: servo::main::{{closure}}::h4626533f66eeeaff (0x55706d7748df)
   1: std::panicking::rust_panic_with_hook::h0529069ab88f357a (0x5570709d1155)
             at src/libstd/panicking.rs:481
   2: std::panicking::continue_panic_fmt::h6a820a3cd2914e74 (0x5570709d0bf1)
             at src/libstd/panicking.rs:384
   3: rust_begin_unwind (0x5570709d0ad5)
             at src/libstd/panicking.rs:311
   4: core::panicking::panic_fmt::he00cfaca5555542a (0x5570709f383c)
             at src/libcore/panicking.rs:85
   5: core::panicking::panic_bounds_check::h3334a903dc8fe229 (0x5570709f37f4)
             at src/libcore/panicking.rs:61
   6: std::thread::local::LocalKey<T>::with::he24c106e0d8aa549 (0x55706f90a1e0)
   7: _ZN11ipc_channel3ipc25deserialize_os_ipc_sender17hb93c845a1ccecf09E.llvm.10778825050291159297 (0x55706f8eb729)
   8: <&mut bincode::de::Deserializer<R,O> as serde::de::VariantAccess>::tuple_variant::h5659fddb694847fd (0x55706f9479f6)
   9: <gfx::font_cache_thread::_IMPL_DESERIALIZE_FOR_Command::<impl serde::de::Deserialize for gfx::font_cache_thread::Command>::deserialize::__Visitor as serde::de::Visitor>::visit_enum::h76b38db0cc975890 (0x55706f92fecb)
  10: bincode::deserialize::he738282d38e18dcd (0x55706f8f3467)
  11: std::thread::local::LocalKey<T>::with::hb36f88408bc774e8 (0x55706f90a2fd)
  12: ipc_channel::ipc::IpcReceiver<T>::recv::he4170025608adad8 (0x55706f8eb3ad)
  13: gfx::font_cache_thread::FontCache::run::h097efedd8dbdbe23 (0x55706f928714)
  14: std::sys_common::backtrace::__rust_begin_short_backtrace::h0b5b51c7cd8fa7ba (0x55706f8e3825)
  15: _ZN3std9panicking3try7do_call17h8f79c11b9595428eE.llvm.11595909282632282622 (0x55706f8f2b04)
  16: __rust_maybe_catch_panic (0x5570709dae09)
             at src/libpanic_unwind/lib.rs:82
  17: core::ops::function::FnOnce::call_once{{vtable.shim}}::hecf6ea68b5a7a379 (0x55706f8f2bb9)
  18: <alloc::boxed::Box<F> as core::ops::function::FnOnce<A>>::call_once::h10faab75e4737451 (0x5570709bf93e)
             at /rustc/273f42b5964c29dda2c5a349dd4655529767b07f/src/liballoc/boxed.rs:766
  19: <alloc::boxed::Box<F> as core::ops::function::FnOnce<A>>::call_once::h1aad66e3be56d28c (0x5570709da12f)
             at /rustc/273f42b5964c29dda2c5a349dd4655529767b07f/src/liballoc/boxed.rs:766
      std::sys_common::thread::start_thread::h9a8131af389e9a10
             at src/libstd/sys_common/thread.rs:13
      std::sys::unix::thread::Thread::new::thread_start::h5c5575a5d923977b
             at src/libstd/sys/unix/thread.rs:79
  20: start_thread (0x7f1d8b2b26b9)
  21: clone (0x7f1d89b4e41c)
  22: <unknown> (0x0)
[2019-08-15T15:56:07Z ERROR servo] index out of bounds: the len is 0 but the index is 0
^CServo exited with return value -2

I added timestamps to every run and tried again. This time the script hung on the 739th load. No messages from the running Servo process. I tried doing a manual verbose curl POST to the webdriver port, and while it opens a network connection and accepts the POST, there is no response at all.
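
(For reference, a minimal hand-rolled WebDriver request against that port would look something like the one below; this is just an illustrative W3C new-session request, not necessarily the exact request that was used:)

curl -v -X POST -H 'Content-Type: application/json' -d '{"capabilities": {}}' http://127.0.0.1:7002/session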

However, the timing behavior looked pretty similar to before, so here are the timestamps for the first 20 loads and for every 100th load after that, along with the delta in seconds from the previous line:

Thu Aug 15 11:02:00 CDT 2019: Run 0
Thu Aug 15 11:02:01 CDT 2019: Run 1, Delta: 1
Thu Aug 15 11:02:02 CDT 2019: Run 2, Delta: 1
Thu Aug 15 11:02:03 CDT 2019: Run 3, Delta: 1
Thu Aug 15 11:02:04 CDT 2019: Run 4, Delta: 1
Thu Aug 15 11:02:05 CDT 2019: Run 5, Delta: 1
Thu Aug 15 11:02:06 CDT 2019: Run 6, Delta: 1
Thu Aug 15 11:02:07 CDT 2019: Run 7, Delta: 1
Thu Aug 15 11:02:08 CDT 2019: Run 8, Delta: 1
Thu Aug 15 11:02:09 CDT 2019: Run 9, Delta: 1
Thu Aug 15 11:02:10 CDT 2019: Run 10, Delta: 1
Thu Aug 15 11:02:11 CDT 2019: Run 11, Delta: 1
Thu Aug 15 11:02:12 CDT 2019: Run 12, Delta: 1
Thu Aug 15 11:02:13 CDT 2019: Run 13, Delta: 1
Thu Aug 15 11:02:13 CDT 2019: Run 14, Delta: 0
Thu Aug 15 11:02:13 CDT 2019: Run 15, Delta: 0
Thu Aug 15 11:02:13 CDT 2019: Run 16, Delta: 0
Thu Aug 15 11:02:13 CDT 2019: Run 17, Delta: 0
Thu Aug 15 11:02:13 CDT 2019: Run 18, Delta: 0
Thu Aug 15 11:02:13 CDT 2019: Run 19, Delta: 0
Thu Aug 15 11:02:13 CDT 2019: Run 20, Delta: 0
...
Thu Aug 15 11:02:25 CDT 2019: Run 100, Delta: 12 (25 from Run 0)
Thu Aug 15 11:02:35 CDT 2019: Run 200, Delta: 10
Thu Aug 15 11:02:48 CDT 2019: Run 300, Delta: 13
Thu Aug 15 11:03:05 CDT 2019: Run 400, Delta: 17
Thu Aug 15 11:03:25 CDT 2019: Run 500, Delta: 20
Thu Aug 15 11:03:49 CDT 2019: Run 600, Delta: 24
Thu Aug 15 11:04:16 CDT 2019: Run 700, Delta: 27

Ran again and got a panic on load 974. Here are the number of seconds for each 100 loads: 20s, 11s, 14s, 16s, 21s, 24s, 29s, 32s, 35s, 29s (for the last 74).

Here is the panic for this run:

$ ./mach run --release -z --webdriver=7002 --resolution=400x300



called `Result::unwrap()` on an `Err` value: Os { code: 24, kind: Other, message: "Too many open files" } (thread ScriptThread PipelineId { namespace_id: PipelineNamespaceId(2), index: PipelineIndex(3) }, at src/libcore/result.rs:1051)
stack backtrace:
   0: servo::main::{{closure}}::h4626533f66eeeaff (0x559f402398df)
   1: std::panicking::rust_panic_with_hook::h0529069ab88f357a (0x559f43496155)
             at src/libstd/panicking.rs:481
   2: std::panicking::continue_panic_fmt::h6a820a3cd2914e74 (0x559f43495bf1)
             at src/libstd/panicking.rs:384
   3: rust_begin_unwind (0x559f43495ad5)
             at src/libstd/panicking.rs:311
   4: core::panicking::panic_fmt::he00cfaca5555542a (0x559f434b883c)
             at src/libcore/panicking.rs:85
   5: core::result::unwrap_failed::h4239b9d80132a0db (0x559f434b8936)
             at src/libcore/result.rs:1051
   6: script::script_thread::ScriptThread::load::h840f42e218b48e42 (0x559f40e52470)
   7: script::script_thread::ScriptThread::handle_page_headers_available::hc7e938eff32254e6 (0x559f40e4db62)
   8: std::thread::local::LocalKey<T>::with::h242cf5a3248d4b16 (0x559f40cd476d)
   9: <script::dom::servoparser::ParserContext as net_traits::FetchResponseListener>::process_response::hfea17b04486e1186 (0x559f40dfb8fd)
  10: script::script_thread::ScriptThread::handle_msg_from_constellation::hb6598e64e0108a83 (0x559f40e3cbef)
  11: _ZN6script13script_thread12ScriptThread11handle_msgs17h5e27f5d6a2fa27a7E.llvm.9251349843847225484 (0x559f40e3720c)
  12: profile_traits::mem::ProfilerChan::run_with_memory_reporting::hd3dc945019b6cd59 (0x559f40fb6f07)
  13: std::sys_common::backtrace::__rust_begin_short_backtrace::he8d3dd1d346dafa0 (0x559f412f6190)
  14: _ZN3std9panicking3try7do_call17h8ff058ddb9dca993E.llvm.11103976371393050535 (0x559f414134c3)
  15: __rust_maybe_catch_panic (0x559f4349fe09)
             at src/libpanic_unwind/lib.rs:82
  16: core::ops::function::FnOnce::call_once{{vtable.shim}}::hde4c9792015a3419 (0x559f409140e2)
  17: <alloc::boxed::Box<F> as core::ops::function::FnOnce<A>>::call_once::h10faab75e4737451 (0x559f4348493e)
             at /rustc/273f42b5964c29dda2c5a349dd4655529767b07f/src/liballoc/boxed.rs:766
  18: <alloc::boxed::Box<F> as core::ops::function::FnOnce<A>>::call_once::h1aad66e3be56d28c (0x559f4349f12f)
             at /rustc/273f42b5964c29dda2c5a349dd4655529767b07f/src/liballoc/boxed.rs:766
      std::sys_common::thread::start_thread::h9a8131af389e9a10
             at src/libstd/sys_common/thread.rs:13
      std::sys::unix::thread::Thread::new::thread_start::h5c5575a5d923977b
             at src/libstd/sys/unix/thread.rs:79
  19: start_thread (0x7f4316fe06b9)
  20: clone (0x7f431587c41c)
  21: <unknown> (0x0)
[2019-08-15T16:18:40Z ERROR servo] called `Result::unwrap()` on an `Err` value: Os { code: 24, kind: Other, message: "Too many open files" }
Pipeline main chan: Os { code: 24, kind: Other, message: "Too many open files" } (thread Constellation, at src/libcore/result.rs:1051)
stack backtrace:
   0: servo::main::{{closure}}::h4626533f66eeeaff (0x559f402398df)
   1: std::panicking::rust_panic_with_hook::h0529069ab88f357a (0x559f43496155)
             at src/libstd/panicking.rs:481
   2: std::panicking::continue_panic_fmt::h6a820a3cd2914e74 (0x559f43495bf1)
             at src/libstd/panicking.rs:384
   3: rust_begin_unwind (0x559f43495ad5)
             at src/libstd/panicking.rs:311
   4: core::panicking::panic_fmt::he00cfaca5555542a (0x559f434b883c)
             at src/libcore/panicking.rs:85
   5: core::result::unwrap_failed::h4239b9d80132a0db (0x559f434b8936)
             at src/libcore/result.rs:1051
   6: constellation::pipeline::Pipeline::spawn::h2e0dd6502433886e (0x559f403b6c32)
   7: constellation::constellation::Constellation<Message,LTF,STF>::new_pipeline::h351ec532d56d1c3e (0x559f403d6330)
   8: constellation::constellation::Constellation<Message,LTF,STF>::handle_panic::h042f04c7b3b9871a (0x559f403d4ecb)
   9: constellation::constellation::Constellation<Message,LTF,STF>::handle_log_entry::h8b6d45b4c1ba0fdb (0x559f403d8391)
  10: constellation::constellation::Constellation<Message,LTF,STF>::handle_request_from_compositor::h654ac91ab92dcc30 (0x559f403ed95d)
  11: constellation::constellation::Constellation<Message,LTF,STF>::run::h34106f2ed53bd9f4 (0x559f403f5ab3)
  12: std::sys_common::backtrace::__rust_begin_short_backtrace::h117d5309a5f2acd1 (0x559f402f7565)
  13: std::panicking::try::do_call::hb2ecaaff51af18d9 (0x559f402fbe35)
  14: __rust_maybe_catch_panic (0x559f4349fe09)
             at src/libpanic_unwind/lib.rs:82
  15: core::ops::function::FnOnce::call_once{{vtable.shim}}::hb04874afd1aa4e55 (0x559f402fc1d5)
  16: <alloc::boxed::Box<F> as core::ops::function::FnOnce<A>>::call_once::h10faab75e4737451 (0x559f4348493e)
             at /rustc/273f42b5964c29dda2c5a349dd4655529767b07f/src/liballoc/boxed.rs:766
  17: <alloc::boxed::Box<F> as core::ops::function::FnOnce<A>>::call_once::h1aad66e3be56d28c (0x559f4349f12f)
             at /rustc/273f42b5964c29dda2c5a349dd4655529767b07f/src/liballoc/boxed.rs:766
      std::sys_common::thread::start_thread::h9a8131af389e9a10
             at src/libstd/sys_common/thread.rs:13
      std::sys::unix::thread::Thread::new::thread_start::h5c5575a5d923977b
             at src/libstd/sys/unix/thread.rs:79
  18: start_thread (0x7f4316fe06b9)
  19: clone (0x7f431587c41c)
  20: <unknown> (0x0)
[2019-08-15T16:18:40Z ERROR servo] Pipeline main chan: Os { code: 24, kind: Other, message: "Too many open files" }
called `Result::unwrap()` on an `Err` value: Io(Custom { kind: ConnectionReset, error: "All senders for this socket closed" }) (thread StorageManager, at src/libcore/result.rs:1051)
stack backtrace:
   0: servo::main::{{closure}}::h4626533f66eeeaff (0x559f402398df)
   1: std::panicking::rust_panic_with_hook::h0529069ab88f357a (0x559f43496155)
             at src/libstd/panicking.rs:481
   2: std::panicking::continue_panic_fmt::h6a820a3cd2914e74 (0x559f43495bf1)
             at src/libstd/panicking.rs:384
   3: rust_begin_unwind (0x559f43495ad5)
             at src/libstd/panicking.rs:311
   4: core::panicking::panic_fmt::he00cfaca5555542a (0x559f434b883c)
             at src/libcore/panicking.rs:85
   5: core::result::unwrap_failed::h4239b9d80132a0db (0x559f434b8936)
             at src/libcore/result.rs:1051
   6: net::storage_thread::StorageManager::start::h8eeece525ce345e8 (0x559f422534ef)
   7: std::sys_common::backtrace::__rust_begin_short_backtrace::ha904cc00c6b0933c (0x559f421a9ab2)
   8: _ZN3std9panicking3try7do_call17he2da217e7601c238E.llvm.10592537453742836018 (0x559f421c608b)
   9: __rust_maybe_catch_panic (0x559f4349fe09)
             at src/libpanic_unwind/lib.rs:82
  10: core::ops::function::FnOnce::call_once{{vtable.shim}}::h0143da5e67cc8fb6 (0x559f4228152f)
  11: <alloc::boxed::Box<F> as core::ops::function::FnOnce<A>>::call_once::h10faab75e4737451 (0x559f4348493e)
             at /rustc/273f42b5964c29dda2c5a349dd4655529767b07f/src/liballoc/boxed.rs:766
  12: <alloc::boxed::Box<F> as core::ops::function::FnOnce<A>>::call_once::h1aad66e3be56d28c (0x559f4349f12f)
             at /rustc/273f42b5964c29dda2c5a349dd4655529767b07f/src/liballoc/boxed.rs:766
      std::sys_common::thread::start_thread::h9a8131af389e9a10
             at src/libstd/sys_common/thread.rs:13
      std::sys::unix::thread::Thread::new::thread_start::h5c5575a5d923977b
             at src/libstd/sys/unix/thread.rs:79
  13: start_thread (0x7f4316fe06b9)
  14: clone (0x7f431587c41c)
  15: <unknown> (0x0)
[2019-08-15T16:18:40Z ERROR servo] called `Result::unwrap()` on an `Err` value: Io(Custom { kind: ConnectionReset, error: "All senders for this socket closed" })
^CServo exited with return value -2

Thanks for all the new data.

So at least we know it's not related to taking the screenshots.

It looks like we're hitting the panic now when creating "essential" channels, for example with Pipeline main chan: Os { code: 24, kind: Other, message: "Too many open files" }.

And those unfortunately cannot be refactored easily, since each is really a channel that needs to be created and used repeatedly, versus the other cases where channels were created on a "per operation" basis.

I do wonder what is causing this: in theory we should not be constantly adding open files to the system, since we do drop those channels on each new load (your branch doesn't keep any pages cached in the session history).

I don't have an immediate solution in mind for this. Is there a way to count open files on Linux, so that you could print the count on each load? It would be interesting to see whether it does indeed build up gradually over time.
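
(Something along these lines should work on Linux, assuming the Servo PID is known; <servo PID> is a placeholder:)

ls /proc/<servo PID>/fd | wc -l
watch -n 5 'ls /proc/<servo PID>/fd | wc -l'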

Regarding the slowdown you noticed, I have no idea what the reason might be.

The "No font found for text run!" in the middle sticks out to me, not sure if that's interesting or not.

I think that is a result of one of the other panics.

It's definitely using more file descriptors over time. It looks like it uses 4 more FDs for every page load. It starts with about 200 and shows 4080 right before the final failing page load. The soft limit for that process is in fact 4096:

$ cat /proc/PID/limits
...
Max open files            4096                 65536                files     
...
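
(As a stopgap to keep a long run going, the soft limit could presumably be raised in the shell that launches Servo, e.g. with something like the line below; that doesn't fix the leak, it only delays the crash:)

ulimit -n 65536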

I ran the following command periodically during the run:

ls -l /proc/PID/fd/ > fdX

Here is the first part of the diff from around 200 page loads to somewhere in the 900 range (the diff is 3028 lines long, so I've omitted most of it):

--- fd1s        2019-08-15 13:37:04.590485158 -0500
+++ fd10s       2019-08-15 13:39:55.088990482 -0500
@@ -39,7 +39,7 @@ lrwx------ 1 joelm joelm 64 Aug 15 13:27
 lrwx------ 1 joelm joelm 64 Aug 15 13:27 37 -> socket:[90796561]
 lrwx------ 1 joelm joelm 64 Aug 15 13:27 38 -> socket:[90796562]
 lrwx------ 1 joelm joelm 64 Aug 15 13:27 39 -> socket:[90796563]
-lrwx------ 1 joelm joelm 64 Aug 15 13:27 40 -> socket:[90814589]
+lrwx------ 1 joelm joelm 64 Aug 15 13:27 40 -> socket:[90851512]
 lrwx------ 1 joelm joelm 64 Aug 15 13:27 41 -> socket:[90796565]
 lrwx------ 1 joelm joelm 64 Aug 15 13:27 42 -> socket:[90798049]
 lrwx------ 1 joelm joelm 64 Aug 15 13:27 43 -> socket:[90798050]
@@ -84,8 +84,8 @@ lr-x------ 1 joelm joelm 64 Aug 15 13:27
 l-wx------ 1 joelm joelm 64 Aug 15 13:27 82 -> pipe:[90799269]
 lrwx------ 1 joelm joelm 64 Aug 15 13:27 83 -> socket:[90796564]
 lrwx------ 1 joelm joelm 64 Aug 15 13:27 84 -> socket:[90798598]
-lrwx------ 1 joelm joelm 64 Aug 15 13:27 85 -> socket:[90801330]
-lrwx------ 1 joelm joelm 64 Aug 15 13:27 86 -> socket:[90798056]
+lrwx------ 1 joelm joelm 64 Aug 15 13:27 85 -> socket:[90827745]
+lrwx------ 1 joelm joelm 64 Aug 15 13:27 86 -> socket:[90830654]
 lrwx------ 1 joelm joelm 64 Aug 15 13:27 87 -> anon_inode:[eventpoll]
 lr-x------ 1 joelm joelm 64 Aug 15 13:27 88 -> pipe:[90798057]
 l-wx------ 1 joelm joelm 64 Aug 15 13:27 89 -> pipe:[90798057]
@@ -858,7 +858,7 @@ lrwx------ 1 joelm joelm 64 Aug 15 13:27
 lrwx------ 1 joelm joelm 64 Aug 15 13:27 856 -> socket:[90809054]
 lrwx------ 1 joelm joelm 64 Aug 15 13:27 857 -> socket:[90813832]
 lrwx------ 1 joelm joelm 64 Aug 15 13:27 858 -> socket:[90810984]
-lrwx------ 1 joelm joelm 64 Aug 15 13:27 859 -> socket:[90812729]
+lrwx------ 1 joelm joelm 64 Aug 15 13:27 859 -> socket:[90814612]
 lrwx------ 1 joelm joelm 64 Aug 15 13:27 860 -> socket:[90810985]
 lrwx------ 1 joelm joelm 64 Aug 15 13:27 861 -> /dev/shm/ipc-channel-shared-memory.173.14850.1565893668.490793947 (deleted)
 lrwx------ 1 joelm joelm 64 Aug 15 13:27 862 -> socket:[90814543]
@@ -868,20 +868,20 @@ lrwx------ 1 joelm joelm 64 Aug 15 13:27
 lrwx------ 1 joelm joelm 64 Aug 15 13:27 866 -> socket:[90811823]
 lrwx------ 1 joelm joelm 64 Aug 15 13:27 867 -> socket:[90813840]
 lrwx------ 1 joelm joelm 64 Aug 15 13:27 868 -> /dev/shm/ipc-channel-shared-memory.172.14850.1565893668.379359854 (deleted)
-lrwx------ 1 joelm joelm 64 Aug 15 13:27 869 -> socket:[90813841]
-lrwx------ 1 joelm joelm 64 Aug 15 13:27 870 -> socket:[90814591]
+lrwx------ 1 joelm joelm 64 Aug 15 13:27 869 -> socket:[90814617]
+lrwx------ 1 joelm joelm 64 Aug 15 13:27 870 -> socket:[90812776]
 lrwx------ 1 joelm joelm 64 Aug 15 13:27 871 -> socket:[90813776]
 lrwx------ 1 joelm joelm 64 Aug 15 13:27 872 -> /dev/shm/ipc-channel-shared-memory.174.14850.1565893668.587668573 (deleted)
 lrwx------ 1 joelm joelm 64 Aug 15 13:27 873 -> /dev/shm/ipc-channel-shared-memory.177.14850.1565893668.895284886 (deleted)
 lrwx------ 1 joelm joelm 64 Aug 15 13:27 874 -> socket:[90814544]
-lrwx------ 1 joelm joelm 64 Aug 15 13:27 875 -> socket:[90809058]
+lrwx------ 1 joelm joelm 64 Aug 15 13:27 875 -> socket:[90811112]
 lrwx------ 1 joelm joelm 64 Aug 15 13:27 876 -> socket:[90814545]
 lrwx------ 1 joelm joelm 64 Aug 15 13:27 877 -> socket:[90813805]
 lrwx------ 1 joelm joelm 64 Aug 15 13:27 878 -> socket:[90813806]
 lrwx------ 1 joelm joelm 64 Aug 15 13:27 879 -> socket:[90812702]
 lrwx------ 1 joelm joelm 64 Aug 15 13:27 880 -> socket:[90812703]
-lrwx------ 1 joelm joelm 64 Aug 15 13:27 881 -> socket:[90809059]
-lrwx------ 1 joelm joelm 64 Aug 15 13:27 882 -> socket:[90812730]
+lrwx------ 1 joelm joelm 64 Aug 15 13:27 881 -> socket:[90812781]
+lrwx------ 1 joelm joelm 64 Aug 15 13:27 882 -> socket:[90812793]
 lrwx------ 1 joelm joelm 64 Aug 15 13:27 883 -> /dev/shm/ipc-channel-shared-memory.176.14850.1565893668.805121950 (deleted)
 lrwx------ 1 joelm joelm 64 Aug 15 13:27 884 -> socket:[90813777]
 lrwx------ 1 joelm joelm 64 Aug 15 13:27 885 -> socket:[90812735]
@@ -889,25 +889,2956 @@ lrwx------ 1 joelm joelm 64 Aug 15 13:27
 lrwx------ 1 joelm joelm 64 Aug 15 13:27 887 -> socket:[90814572]
 lrwx------ 1 joelm joelm 64 Aug 15 13:27 888 -> socket:[90811922]
 lrwx------ 1 joelm joelm 64 Aug 15 13:27 889 -> /dev/shm/ipc-channel-shared-memory.178.14850.1565893668.985107936 (deleted)
-lrwx------ 1 joelm joelm 64 Aug 15 13:27 890 -> socket:[90814592]
+lrwx------ 1 joelm joelm 64 Aug 15 13:27 890 -> socket:[90809639]
 lrwx------ 1 joelm joelm 64 Aug 15 13:27 891 -> socket:[90809068]
 lrwx------ 1 joelm joelm 64 Aug 15 13:27 892 -> socket:[90814573]
-lrwx------ 1 joelm joelm 64 Aug 15 13:27 893 -> socket:[90809062]
+lrwx------ 1 joelm joelm 64 Aug 15 13:27 893 -> socket:[90814659]
 lrwx------ 1 joelm joelm 64 Aug 15 13:27 894 -> socket:[90813821]
-lrwx------ 1 joelm joelm 64 Aug 15 13:27 895 -> socket:[90809063]
+lrwx------ 1 joelm joelm 64 Aug 15 13:27 895 -> socket:[90811936]
 lrwx------ 1 joelm joelm 64 Aug 15 13:27 896 -> socket:[90813822]
 lrwx------ 1 joelm joelm 64 Aug 15 13:27 897 -> /dev/shm/ipc-channel-shared-memory.180.14850.1565893669.175446700 (deleted)
-lrwx------ 1 joelm joelm 64 Aug 15 13:27 898 -> socket:[90814593]
+lrwx------ 1 joelm joelm 64 Aug 15 13:27 898 -> socket:[90806178]
 lrwx------ 1 joelm joelm 64 Aug 15 13:27 899 -> socket:[90811923]
-lrwx------ 1 joelm joelm 64 Aug 15 13:27 900 -> socket:[90812740]
-lrwx------ 1 joelm joelm 64 Aug 15 13:27 901 -> socket:[90809061]
+lrwx------ 1 joelm joelm 64 Aug 15 13:27 900 -> socket:[90811149]
+lrwx------ 1 joelm joelm 64 Aug 15 13:27 901 -> /dev/shm/ipc-channel-shared-memory.181.14850.1565893669.270539619 (deleted)
 lrwx------ 1 joelm joelm 64 Aug 15 13:27 902 -> socket:[90812736]
-lrwx------ 1 joelm joelm 64 Aug 15 13:27 903 -> socket:[90814594]
-lrwx------ 1 joelm joelm 64 Aug 15 13:27 904 -> socket:[90809067]
-lrwx------ 1 joelm joelm 64 Aug 15 13:27 905 -> socket:[90809066]
-lrwx------ 1 joelm joelm 64 Aug 15 13:27 906 -> socket:[90812741]
-lrwx------ 1 joelm joelm 64 Aug 15 13:27 907 -> socket:[90812742]
+lrwx------ 1 joelm joelm 64 Aug 15 13:27 903 -> /dev/shm/ipc-channel-shared-memory.182.14850.1565893669.361700004 (deleted)
+lrwx------ 1 joelm joelm 64 Aug 15 13:27 904 -> socket:[90811155]
+lrwx------ 1 joelm joelm 64 Aug 15 13:27 905 -> socket:[90806207]
+lrwx------ 1 joelm joelm 64 Aug 15 13:27 906 -> socket:[90811938]
+lrwx------ 1 joelm joelm 64 Aug 15 13:27 907 -> socket:[90811165]
 lrwx------ 1 joelm joelm 64 Aug 15 13:27 908 -> socket:[90809069]
-lrwx------ 1 joelm joelm 64 Aug 15 13:27 909 -> socket:[90812743]
-lrwx------ 1 joelm joelm 64 Aug 15 13:27 910 -> socket:[90812744]
-lr-x------ 1 joelm joelm 64 Aug 15 13:27 911 -> /usr/share/fonts/truetype/dejavu/DejaVuSerif.ttf
+lrwx------ 1 joelm joelm 64 Aug 15 13:27 909 -> /dev/shm/ipc-channel-shared-memory.183.14850.1565893669.457169862 (deleted)
+lrwx------ 1 joelm joelm 64 Aug 15 13:27 910 -> socket:[90811937]
+lrwx------ 1 joelm joelm 64 Aug 15 13:27 911 -> socket:[90809081]
+lrwx------ 1 joelm joelm 64 Aug 15 13:27 912 -> /dev/shm/ipc-channel-shared-memory.185.14850.1565893669.643429018 (deleted)
+lrwx------ 1 joelm joelm 64 Aug 15 13:27 913 -> /dev/shm/ipc-channel-shared-memory.184.14850.1565893669.547205403 (deleted)
+lrwx------ 1 joelm joelm 64 Aug 15 13:27 914 -> socket:[90811939]
+lrwx------ 1 joelm joelm 64 Aug 15 13:27 915 -> socket:[90809690]
+lrwx------ 1 joelm joelm 64 Aug 15 13:27 916 -> /dev/shm/ipc-channel-shared-memory.186.14850.1565893669.737494344 (deleted)
+lrwx------ 1 joelm joelm 64 Aug 15 13:27 917 -> socket:[90814652]
+lrwx------ 1 joelm joelm 64 Aug 15 13:27 918 -> /dev/shm/ipc-channel-shared-memory.187.14850.1565893669.856791973 (deleted)
+lrwx------ 1 joelm joelm 64 Aug 15 13:27 919 -> socket:[90814628]
+lrwx------ 1 joelm joelm 64 Aug 15 13:27 920 -> socket:[90814629]
+lrwx------ 1 joelm joelm 64 Aug 15 13:27 921 -> socket:[90809115]
+lrwx------ 1 joelm joelm 64 Aug 15 13:27 922 -> socket:[90809093]
+lrwx------ 1 joelm joelm 64 Aug 15 13:27 923 -> socket:[90814638]
+lrwx------ 1 joelm joelm 64 Aug 15 13:27 924 -> socket:[90814639]
+lrwx------ 1 joelm joelm 64 Aug 15 13:27 925 -> socket:[90811189]
+lrwx------ 1 joelm joelm 64 Aug 15 13:27 926 -> socket:[90809694]
+lrwx------ 1 joelm joelm 64 Aug 15 13:27 927 -> /dev/shm/ipc-channel-shared-memory.189.14850.1565893670.63582502 (deleted)
+lrwx------ 1 joelm joelm 64 Aug 15 13:27 928 -> socket:[90809082]
+lrwx------ 1 joelm joelm 64 Aug 15 13:27 929 -> /dev/shm/ipc-channel-shared-memory.188.14850.1565893669.957956083 (deleted)
+lrwx------ 1 joelm joelm 64 Aug 15 13:27 930 -> /dev/shm/ipc-channel-shared-memory.190.14850.1565893670.288127870 (deleted)
+lrwx------ 1 joelm joelm 64 Aug 15 13:27 931 -> socket:[90811200]
+lrwx------ 1 joelm joelm 64 Aug 15 13:27 932 -> socket:[90809094]
+lrwx------ 1 joelm joelm 64 Aug 15 13:27 933 -> socket:[90814745]
+lrwx------ 1 joelm joelm 64 Aug 15 13:27 934 -> socket:[90814747]
+lrwx------ 1 joelm joelm 64 Aug 15 13:27 935 -> socket:[90814751]
+lrwx------ 1 joelm joelm 64 Aug 15 13:27 936 -> socket:[90814653]
+lrwx------ 1 joelm joelm 64 Aug 15 13:27 937 -> /dev/shm/ipc-channel-shared-memory.192.14850.1565893670.485832586 (deleted)
+lrwx------ 1 joelm joelm 64 Aug 15 13:27 938 -> socket:[90809116]
+lrwx------ 1 joelm joelm 64 Aug 15 13:27 939 -> socket:[90806176]
+lrwx------ 1 joelm joelm 64 Aug 15 13:27 940 -> socket:[90806177]
+lrwx------ 1 joelm joelm 64 Aug 15 13:27 941 -> socket:[90809741]
+lrwx------ 1 joelm joelm 64 Aug 15 13:27 942 -> /dev/shm/ipc-channel-shared-memory.191.14850.1565893670.391665158 (deleted)
...
...

Thanks. Then the question is why they are building up steadily, since in theory we should be trimming pages and dropping the associated channels...

Indeed. @gterzian, can you think of an easy way to find out if the Document and Window objects are being dropped?

I can take a look.

OK, so it looks like documents are properly dropped and layout threads are exited once we start trimming the session history. So the number of documents stored by the script-thread is stable.

However, I can see the thread count going up with each load; for some reason it stops climbing at around 350 and then seems to go down and fluctuate above 300.

The number of Mach ports goes up steadily.
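
(On the Linux side, the per-process thread count could be tracked alongside the FD count with something like the following; <servo PID> is a placeholder:)

ls /proc/<servo PID>/task | wc -l
ps -o nlwp= -p <servo PID>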

Hmm, is each document going into its own script thread? Are the documents similar-origin?

All loads are in the same script-thread, and, except for the first one, they hit the "we've got an existing event-loop, just create a new layout" branch at

https://github.com/servo/servo/blob/3658a8cc591ef4ca827ce1cda9565a1bca7d7b3c/components/constellation/pipeline.rs#L227

The script is at https://gist.github.com/kanaka/119f5ed9841e23e35d07e8944cca6aa7

Hmm, but you say there are threads being created but not shut down? Do you know what kind of thread?


I don't know what those threads are; I can only see in the Mac Activity Monitor that the thread count goes up to 350 and then somehow levels off. Perhaps threads are kept around for a while even though they have logically exited?

Can you run it in a debugger and see what it reckons the thread names are? You can see this in gdb; not sure about the Mac debuggers.
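
(For example, attaching to the running process with gdb on Linux or lldb on a Mac; <servo PID> is a placeholder:)

gdb -p <servo PID>
(gdb) info threads
(gdb) thread apply all bt

lldb -p <servo PID>
(lldb) thread list
(lldb) bt all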


For now I used the Mac process sampler.
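
(For anyone reproducing this, a sample like the ones below can also be taken from the command line with the macOS sample tool, e.g.:)

sample <servo PID> 10 -file servo-sample.txt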

There seems to be something starting threads for each load.

It's either this sequence that repeats itself:

    188 Thread_24060
    + 188 thread_start  (in libsystem_pthread.dylib) + 13  [0x7fff7de5140d]
    +   188 _pthread_start  (in libsystem_pthread.dylib) + 66  [0x7fff7de55249]
    +     188 _pthread_body  (in libsystem_pthread.dylib) + 126  [0x7fff7de522eb]
    +       188 std::sys::unix::thread::Thread::new::thread_start::h4fa68360d5b30757  (in servo) + 142  [0x10def053e]
    +         188 _$LT$alloc..boxed..Box$LT$F$GT$$u20$as$u20$core..ops..function..FnOnce$LT$A$GT$$GT$::call_once::h96c14a1408a74534  (in servo) + 62  [0x10ded8ade]
    +           188 core::ops::function::FnOnce::call_once$u7b$$u7b$vtable.shim$u7d$$u7d$::hac8341c880067512  (in servo) + 118  [0x10b116e96]
    +             188 __rust_maybe_catch_panic  (in servo) + 31  [0x10def100f]
    +               188 std::sys_common::backtrace::__rust_begin_short_backtrace::hffc558be7d81a611  (in servo) + 66  [0x10b16be52]
    +                 188 std::thread::sleep::h166d71ab05a9767f  (in servo) + 80  [0x10ded8dd0]
    +                   188 nanosleep  (in libsystem_c.dylib) + 199  [0x7fff7dd22914]
    +                     188 __semwait_signal  (in libsystem_kernel.dylib) + 10  [0x7fff7dd96f32]
    188 Thread_24086
    + 188 thread_start  (in libsystem_pthread.dylib) + 13  [0x7fff7de5140d]
    +   188 _pthread_start  (in libsystem_pthread.dylib) + 66  [0x7fff7de55249]
    +     188 _pthread_body  (in libsystem_pthread.dylib) + 126  [0x7fff7de522eb]
    +       188 std::sys::unix::thread::Thread::new::thread_start::h4fa68360d5b30757  (in servo) + 142  [0x10def053e]
    +         188 _$LT$alloc..boxed..Box$LT$F$GT$$u20$as$u20$core..ops..function..FnOnce$LT$A$GT$$GT$::call_once::h96c14a1408a74534  (in servo) + 62  [0x10ded8ade]
    +           188 core::ops::function::FnOnce::call_once$u7b$$u7b$vtable.shim$u7d$$u7d$::hac8341c880067512  (in servo) + 118  [0x10b116e96]
    +             188 __rust_maybe_catch_panic  (in servo) + 31  [0x10def100f]
    +               188 std::sys_common::backtrace::__rust_begin_short_backtrace::hffc558be7d81a611  (in servo) + 66  [0x10b16be52]
    +                 188 std::thread::sleep::h166d71ab05a9767f  (in servo) + 80  [0x10ded8dd0]
    +                   188 nanosleep  (in libsystem_c.dylib) + 199  [0x7fff7dd22914]
    +                     188 __semwait_signal  (in libsystem_kernel.dylib) + 10  [0x7fff7dd96f32]
    188 Thread_24107
    + 188 thread_start  (in libsystem_pthread.dylib) + 13  [0x7fff7de5140d]
    +   188 _pthread_start  (in libsystem_pthread.dylib) + 66  [0x7fff7de55249]
    +     188 _pthread_body  (in libsystem_pthread.dylib) + 126  [0x7fff7de522eb]
    +       188 std::sys::unix::thread::Thread::new::thread_start::h4fa68360d5b30757  (in servo) + 142  [0x10def053e]
    +         188 _$LT$alloc..boxed..Box$LT$F$GT$$u20$as$u20$core..ops..function..FnOnce$LT$A$GT$$GT$::call_once::h96c14a1408a74534  (in servo) + 62  [0x10ded8ade]
    +           188 core::ops::function::FnOnce::call_once$u7b$$u7b$vtable.shim$u7d$$u7d$::hac8341c880067512  (in servo) + 118  [0x10b116e96]
    +             188 __rust_maybe_catch_panic  (in servo) + 31  [0x10def100f]
    +               188 std::sys_common::backtrace::__rust_begin_short_backtrace::hffc558be7d81a611  (in servo) + 66  [0x10b16be52]
    +                 188 std::thread::sleep::h166d71ab05a9767f  (in servo) + 80  [0x10ded8dd0]
    +                   188 nanosleep  (in libsystem_c.dylib) + 199  [0x7fff7dd22914]
    +                     188 __semwait_signal  (in libsystem_kernel.dylib) + 10  [0x7fff7dd96f32]
    59 Thread_23706
      59 thread_start  (in libsystem_pthread.dylib) + 13  [0x7fff7de5140d]
        59 _pthread_start  (in libsystem_pthread.dylib) + 66  [0x7fff7de55249]
          59 _pthread_body  (in libsystem_pthread.dylib) + 126  [0x7fff7de522eb]
            59 std::sys::unix::thread::Thread::new::thread_start::h4fa68360d5b30757  (in servo) + 142  [0x10def053e]
              59 _$LT$alloc..boxed..Box$LT$F$GT$$u20$as$u20$core..ops..function..FnOnce$LT$A$GT$$GT$::call_once::h96c14a1408a74534  (in servo) + 62  [0x10ded8ade]
                59 core::ops::function::FnOnce::call_once$u7b$$u7b$vtable.shim$u7d$$u7d$::hac8341c880067512  (in servo) + 118  [0x10b116e96]
                  59 __rust_maybe_catch_panic  (in servo) + 31  [0x10def100f]
                    59 std::sys_common::backtrace::__rust_begin_short_backtrace::hffc558be7d81a611  (in servo) + 66  [0x10b16be52]
                      59 std::thread::sleep::h166d71ab05a9767f  (in servo) + 80  [0x10ded8dd0]
                        59 nanosleep  (in libsystem_c.dylib) + 199  [0x7fff7dd22914]
                          59 __semwait_signal  (in libsystem_kernel.dylib) + 10  [0x7fff7dd96f32]

Or this, with layout-thread creation interleaved:

   189 Thread_29185
    + 189 thread_start  (in libsystem_pthread.dylib) + 13  [0x7fff7de5140d]
    +   189 _pthread_start  (in libsystem_pthread.dylib) + 66  [0x7fff7de55249]
    +     189 _pthread_body  (in libsystem_pthread.dylib) + 126  [0x7fff7de522eb]
    +       189 std::sys::unix::thread::Thread::new::thread_start::h4fa68360d5b30757  (in servo) + 142  [0x10def053e]
    +         189 _$LT$alloc..boxed..Box$LT$F$GT$$u20$as$u20$core..ops..function..FnOnce$LT$A$GT$$GT$::call_once::h96c14a1408a74534  (in servo) + 62  [0x10ded8ade]
    +           189 core::ops::function::FnOnce::call_once$u7b$$u7b$vtable.shim$u7d$$u7d$::hac8341c880067512  (in servo) + 118  [0x10b116e96]
    +             189 __rust_maybe_catch_panic  (in servo) + 31  [0x10def100f]
    +               189 std::sys_common::backtrace::__rust_begin_short_backtrace::hffc558be7d81a611  (in servo) + 66  [0x10b16be52]
    +                 189 std::thread::sleep::h166d71ab05a9767f  (in servo) + 80  [0x10ded8dd0]
    +                   189 nanosleep  (in libsystem_c.dylib) + 199  [0x7fff7dd22914]
    +                     189 __semwait_signal  (in libsystem_kernel.dylib) + 10  [0x7fff7dd96f32]
    189 Thread_29186
    + 189 thread_start  (in libsystem_pthread.dylib) + 13  [0x7fff7de5140d]
    +   189 _pthread_start  (in libsystem_pthread.dylib) + 66  [0x7fff7de55249]
    +     189 _pthread_body  (in libsystem_pthread.dylib) + 126  [0x7fff7de522eb]
    +       189 std::sys::unix::thread::Thread::new::thread_start::h4fa68360d5b30757  (in servo) + 142  [0x10def053e]
    +         189 _$LT$alloc..boxed..Box$LT$F$GT$$u20$as$u20$core..ops..function..FnOnce$LT$A$GT$$GT$::call_once::h96c14a1408a74534  (in servo) + 62  [0x10ded8ade]
    +           189 core::ops::function::FnOnce::call_once$u7b$$u7b$vtable.shim$u7d$$u7d$::hdc11b6039746e213  (in servo) + 134  [0x10b2e0e96]
    +             189 __rust_maybe_catch_panic  (in servo) + 31  [0x10def100f]
    +               189 std::panicking::try::do_call::h1b949dc0208e6809 (.llvm.7221400381382098667)  (in servo) + 43  [0x10b434bbb]
    +                 189 std::sys_common::backtrace::__rust_begin_short_backtrace::hfa7fe4ae900483c1  (in servo) + 1080  [0x10b2dff38]
    +                   189 profile_traits::mem::ProfilerChan::run_with_memory_reporting::h8f593171927c2ba6  (in servo) + 441  [0x10b4166e9]
    +                     189 layout_thread::LayoutThread::start::h7084f4b577a47674  (in servo) + 749  [0x10b3911ed]
    +                       189 crossbeam_channel::select::Select::select::h56a6cc1d6f6992d0  (in servo) + 63  [0x10ded230f]
    +                         189 crossbeam_channel::select::run_select::h221757acda86a7d6  (in servo) + 516  [0x10ded1c24]
    +                           189 std::thread::local::LocalKey$LT$T$GT$::try_with::hf82a44f465639aa3  (in servo) + 96  [0x10ded27a0]
    +                             189 crossbeam_channel::context::Context::with::_$u7b$$u7b$closure$u7d$$u7d$::hd32d0c94c9ffa5ca  (in servo) + 485  [0x10ded2af5]
    +                               189 crossbeam_channel::context::Context::wait_until::h11bcf5c54180eca7  (in servo) + 325  [0x10ded2e65]
    +                                 189 std::thread::park::h43fa24e67028fcd9  (in servo) + 258  [0x10ded8f92]
    +                                   189 _pthread_cond_wait  (in libsystem_pthread.dylib) + 722  [0x7fff7de5556e]
    +                                     189 __psynch_cvwait  (in libsystem_kernel.dylib) + 10  [0x7fff7dd9686a]
    189 Thread_29200
    + 189 thread_start  (in libsystem_pthread.dylib) + 13  [0x7fff7de5140d]
    +   189 _pthread_start  (in libsystem_pthread.dylib) + 66  [0x7fff7de55249]
    +     189 _pthread_body  (in libsystem_pthread.dylib) + 126  [0x7fff7de522eb]
    +       189 std::sys::unix::thread::Thread::new::thread_start::h4fa68360d5b30757  (in servo) + 142  [0x10def053e]
    +         189 _$LT$alloc..boxed..Box$LT$F$GT$$u20$as$u20$core..ops..function..FnOnce$LT$A$GT$$GT$::call_once::h96c14a1408a74534  (in servo) + 62  [0x10ded8ade]
    +           189 core::ops::function::FnOnce::call_once$u7b$$u7b$vtable.shim$u7d$$u7d$::hac8341c880067512  (in servo) + 118  [0x10b116e96]
    +             189 __rust_maybe_catch_panic  (in servo) + 31  [0x10def100f]
    +               189 std::sys_common::backtrace::__rust_begin_short_backtrace::hffc558be7d81a611  (in servo) + 66  [0x10b16be52]
    +                 189 std::thread::sleep::h166d71ab05a9767f  (in servo) + 80  [0x10ded8dd0]
    +                   189 nanosleep  (in libsystem_c.dylib) + 199  [0x7fff7dd22914]
    +                     189 __semwait_signal  (in libsystem_kernel.dylib) + 10  [0x7fff7dd96f32]
    189 Thread_29201
    + 189 thread_start  (in libsystem_pthread.dylib) + 13  [0x7fff7de5140d]
    +   189 _pthread_start  (in libsystem_pthread.dylib) + 66  [0x7fff7de55249]
    +     189 _pthread_body  (in libsystem_pthread.dylib) + 126  [0x7fff7de522eb]
    +       189 std::sys::unix::thread::Thread::new::thread_start::h4fa68360d5b30757  (in servo) + 142  [0x10def053e]
    +         189 _$LT$alloc..boxed..Box$LT$F$GT$$u20$as$u20$core..ops..function..FnOnce$LT$A$GT$$GT$::call_once::h96c14a1408a74534  (in servo) + 62  [0x10ded8ade]
    +           189 core::ops::function::FnOnce::call_once$u7b$$u7b$vtable.shim$u7d$$u7d$::hdc11b6039746e213  (in servo) + 134  [0x10b2e0e96]
    +             189 __rust_maybe_catch_panic  (in servo) + 31  [0x10def100f]
    +               189 std::panicking::try::do_call::h1b949dc0208e6809 (.llvm.7221400381382098667)  (in servo) + 43  [0x10b434bbb]
    +                 189 std::sys_common::backtrace::__rust_begin_short_backtrace::hfa7fe4ae900483c1  (in servo) + 1080  [0x10b2dff38]
    +                   189 profile_traits::mem::ProfilerChan::run_with_memory_reporting::h8f593171927c2ba6  (in servo) + 441  [0x10b4166e9]
    +                     189 layout_thread::LayoutThread::start::h7084f4b577a47674  (in servo) + 749  [0x10b3911ed]
    +                       189 crossbeam_channel::select::Select::select::h56a6cc1d6f6992d0  (in servo) + 63  [0x10ded230f]
    +                         189 crossbeam_channel::select::run_select::h221757acda86a7d6  (in servo) + 516  [0x10ded1c24]
    +                           189 std::thread::local::LocalKey$LT$T$GT$::try_with::hf82a44f465639aa3  (in servo) + 96  [0x10ded27a0]
    +                             189 crossbeam_channel::context::Context::with::_$u7b$$u7b$closure$u7d$$u7d$::hd32d0c94c9ffa5ca  (in servo) + 485  [0x10ded2af5]
    +                               189 crossbeam_channel::context::Context::wait_until::h11bcf5c54180eca7  (in servo) + 325  [0x10ded2e65]
    +                                 189 std::thread::park::h43fa24e67028fcd9  (in servo) + 258  [0x10ded8f92]
    +                                   189 _pthread_cond_wait  (in libsystem_pthread.dylib) + 722  [0x7fff7de5556e]
    +                                     189 __psynch_cvwait  (in libsystem_kernel.dylib) + 10  

It's especially weird that sometimes this "anonymous" thread is created interleaved with the creation of layout threads, and sometimes you just get a long sequence where only that anonymous thread is created.

Hmm, any way either to get the thread name, or to run a debug build to get a full stack trace?


And it doesn't seem to be the layout-thread, since it shows-up differently:

189 Thread_29201
+ 189 thread_start (in libsystem_pthread.dylib) + 13 [0x7fff7de5140d]
+ 189 _pthread_start (in libsystem_pthread.dylib) + 66 [0x7fff7de55249]
+ 189 _pthread_body (in libsystem_pthread.dylib) + 126 [0x7fff7de522eb]
+ 189 std::sys::unix::thread::Thread::new::thread_start::h4fa68360d5b30757 (in servo) + 142 [0x10def053e]
+ 189 _$LT$alloc..boxed..Box$LT$F$GT$$u20$as$u20$core..ops..function..FnOnce$LT$A$GT$$GT$::call_once::h96c14a1408a74534 (in servo) + 62 [0x10ded8ade]
+ 189 core::ops::function::FnOnce::call_once$u7b$$u7b$vtable.shim$u7d$$u7d$::hdc11b6039746e213 (in servo) + 134 [0x10b2e0e96]
+ 189 __rust_maybe_catch_panic (in servo) + 31 [0x10def100f]
+ 189 std::panicking::try::do_call::h1b949dc0208e6809 (.llvm.7221400381382098667) (in servo) + 43 [0x10b434bbb]
+ 189 std::sys_common::backtrace::__rust_begin_short_backtrace::hfa7fe4ae900483c1 (in servo) + 1080 [0x10b2dff38]
+ 189 profile_traits::mem::ProfilerChan::run_with_memory_reporting::h8f593171927c2ba6 (in servo) + 441 [0x10b4166e9]
+ 189 layout_thread::LayoutThread::start::h7084f4b577a47674 (in servo) + 749 [0x10b3911ed]
+ 189 crossbeam_channel::select::Select::select::h56a6cc1d6f6992d0 (in servo) + 63 [0x10ded230f]
+ 189 crossbeam_channel::select::run_select::h221757acda86a7d6 (in servo) + 516 [0x10ded1c24]
+ 189 std::thread::local::LocalKey$LT$T$GT$::try_with::hf82a44f465639aa3 (in servo) + 96 [0x10ded27a0]
+ 189 crossbeam_channel::context::Context::with::_$u7b$$u7b$closure$u7d$$u7d$::hd32d0c94c9ffa5ca (in servo) + 485 [0x10ded2af5]
+ 189 crossbeam_channel::context::Context::wait_until::h11bcf5c54180eca7 (in servo) + 325 [0x10ded2e65]
+ 189 std::thread::park::h43fa24e67028fcd9 (in servo) + 258 [0x10ded8f92]
+ 189 _pthread_cond_wait (in libsystem_pthread.dylib) + 722 [0x7fff7de5556e]
+ 189 __psynch_cvwait (in libsystem_kernel.dylib) + 10


You are receiving this because you were mentioned.
Reply to this email directly, view it on GitHub
https://github.com/servo/servo/issues/23905?email_source=notifications&email_token=AADCPBJ6GOUGPRFSQ4HKUY3QFKLWVA5CNFSM4IIMWPS2YY3PNVWWK3TUL52HS4DFVREXG43VMVBW63LNMVXHJKTDN5WW2ZLOORPWSZGOD4S34TI#issuecomment-522567245,
or mute the thread
https://github.com/notifications/unsubscribe-auth/AADCPBKXUJOCXCJAUPXW75LQFKLWVANCNFSM4IIMWPSQ
.

I'll try to do a debug build; it's been a while, and last time I tried, my Mac crashed.

Regarding the name, the thread seems not to have one, because some other threads do show up in the sample with a name, such as 1189 Thread_17936: tokio-runtime-worker-3. But this might be a function of it being a release build, I don't know.

Given that we're running webdriver tests and all of those new threads are sitting in sleep calls, I suspect this code is spawning the threads.

@jdm do you know why that code is spawning threads, rather than, for example, adding a ROUTER route to a crossbeam channel and then using recv_timeout?
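For reference, the alternative being suggested might look roughly like this (a sketch only: the function and its signature are illustrative rather than the actual webdriver code, and it assumes ipc-channel's ROUTER bridging of an IPC receiver onto a crossbeam receiver):

use std::time::Duration;

use ipc_channel::ipc::IpcReceiver;
use ipc_channel::router::ROUTER;

// Wait for a "load complete" signal with a timeout, without spawning a
// dedicated sleep thread per load: ROUTER forwards incoming IPC messages
// onto a crossbeam channel, and recv_timeout does the timed blocking wait.
fn wait_for_load(load_receiver: IpcReceiver<()>, timeout: Duration) -> bool {
    let crossbeam_receiver = ROUTER.route_ipc_receiver_to_new_crossbeam_receiver(load_receiver);
    crossbeam_receiver.recv_timeout(timeout).is_ok()
}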

@asajeffrey Probably because it was written 4 years ago and the webdriver code is more or less passively maintained.

Thanks for pointing that one out! @jdm

@asajeffrey I think it can easily be refactored as you suggest. I'll give it a try.

OK, sounds like a plan!

Ok, so with https://github.com/servo/servo/pull/24007 the number of threads stays stable (at 87); however, the number of mach ports still keeps going up...

Ok so I checked again, and while we are dropping documents once trimming the session history kicks in, we're not dropping windows (unless the script-thread shuts down).

So it looks like if we load, say, 1000 documents in a script-thread, we're actually keeping 1000 windows somewhere.

I can't tell what is keeping a reference to them; could it be some JS engine thing? Maybe the WindowProxy needs to unset its window? Currently unset_currently_active sets the window to a dissimilar-origin one (which clones all the ipc senders), and I can't see any code actually setting the window to nothing, or the equivalent (although each new load will set_window to a new one, and I'm not sure what that does to the previous one).

Note that each new window does a WindowBinding::Wrap(JSContext::from_ptr(runtime.cx()), win); maybe we need to do "the opposite" when shutting down that pipeline? It seems to create a global object, and I can imagine we need to explicitly undo that?

As far as I know there is no explicit "delete this global object" operation. I'm checking in #jsapi on IRC, but we may just be storing a pointer to the obsolete Window object somewhere that is considered a root that will prevent it from being GCed.

I tried removing the window proxies from the map held by the script-thread, so that they would not be re-used across loads, and that doesn't help. So even if the document is dropped and the window-proxy is dropped, the window is still held by something else.

What I've done, by the way, is just implement Drop for Window with a println. The only print I see is for the window that drops when the initial page (the one loaded when you do ./mach run --release -z --webdriver=7002) is shut down, and that is in a separate script-thread. I think that's some sort of about:blank equivalent; I'm not sure why it's created, by the way. When it drops it's managing only one doc, and the window drops along with it, I guess because the entire script-thread is shut down, although I wonder why it shares a pipeline namespace with the rest. Then I can see the other script-thread start trimming documents, with 21 docs currently managed, at which point it drops a document each time a new one is loaded, but no window ever drops.
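For clarity, the diagnostic mentioned above is just something along these lines (a throwaway sketch, not meant to be merged):

impl Drop for Window {
    fn drop(&mut self) {
        // Only fires when the Window is actually freed, which is exactly
        // what we're trying to observe here.
        println!("Window dropped");
    }
}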

Screen Shot 2019-08-20 at 12 11 24 AM

Ouch, we managed to drop the Document without dropping the Window??? How is that possible, I thought we never null'd a window's document once it was set?

Correction, we're not dropping documents either; we're only removing them from the map held by the script-thread, but something else is keeping them alive.

I'm looking into whether it's the task-queue that could be keeping tasks somehow, since those tasks can then hold a Trusted document/window/globalscope...

Phew, at least I understand what's going on!

The idea of it being a Trusted object that's keeping everything alive sounds like a good one. Not sure how to deal with that though, without (for example) removing rAF callbacks from any page that navigates forwards-then-backwards.

I'm also not sure what's meant to happen if document A creates a promise, document B gets hold of it, then document A navigates away to the point that it would normally be reclaimed. Should the promise stop the document from being reclaimed? Stop it from being GC'd? Should the promise get rejected?

Ok so I checked and the task-queue isn't holding back/storing tasks for longer than expected, so that doesn't seem like a source of leaking Trusted.

I've also done some printing out of the refcounts of Trusted, and it doesn't seem to steadily go up: it goes up to 9 for each load and then drops back (to 4, but I think that's related to the next load). If we were leaking Trusted somewhere I would expect it to steadily go up for each load, wouldn't you?

So it appears to me that the Trusted created either in tasks or in IPC router callbacks are eventually all dropped.

Since the only thing that drops documents/windows is exiting a script-thread, I'm back to thinking it's the runtime that somehow internally keeps the windows or documents stored somewhere, and that we're missing some low-level SM call as part of window.clear_js_runtime...

I'm also not sure what's meant to happen if document A creates a promise, document B gets hold of it, then document A navigates away to the point that it would normally be reclaimed. Should the promise stop the document from being reclaimed? Stop it from being GC'd? Should the promise get rejected?

I would tend to think the promise would be resolved before the active document that would navigate away is unloaded, since the microtask checkpoint would occur immediately after the current task. There could be some weird edge cases, and I think it would be hard for document B to hold on to a promise and see it resolve after document A has been unloaded.

Also fetches, and tasks queued for them, would be cancelled, as part of https://html.spec.whatwg.org/multipage/browsing-the-web.html#navigating-across-documents:abort-a-document-2

Hmm, are you talking about the refcount of the objects, or the size of the hashtables in LiveDOMReferences? The size of the hashtables is the thing that matters, I suspect.

The code that removes the entry from the script thread's documents is https://github.com/servo/servo/blob/47aa1ccaa2d0f482dcdc1332c8dfe4a7c44f2b11/components/script/script_thread.rs#L2771

This is meant to be called when the pipeline exits; is it not getting called?

or the size of the hashtables in LiveDOMReferences? The size of the hashtables is the thing that matters I suspect.

Yes, I looked at LiveDOMReferences (actually I didn't look at the promise table).

This is meant to be called when the pipeline exits, is it not getting called?

It's getting called, and the size of self.documents stays at 21 across loads (once it has reached 21); however, the document that is removed does not drop, and neither does the associated window.

So I think the mach ports/fds we're leaking are in the ipc senders held by the windows/documents, which don't appear to be dropped unless the script-thread as a whole exits. I assume this has to do with the runtime itself dropping at that point, hence my suspicion that the window or document is kept as a reference somewhere in there...

Hmm, so something is keeping a live reference to the document; oh, that's annoying!

I wonder if there's some way we can find out what's keeping the document alive? Can we temporarily add an assert! to the implementation of Document::trace() which panics if we end up tracing a document that's been discarded? The stack trace might help figure out what's keeping the document alive.
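Roughly something like this, perhaps (a sketch only: Document's trace impl is normally derived, so this would be a temporary hand-written hack, and the discarded flag is hypothetical, set when the pipeline exits):

unsafe impl JSTraceable for Document {
    unsafe fn trace(&self, trc: *mut JSTracer) {
        // Panic while tracing a document that has already been discarded,
        // so the resulting backtrace shows what is still rooting it.
        assert!(!self.discarded.get(), "tracing a discarded Document");
        // ... trace the fields as the derived impl normally would ...
        self.node.trace(trc);
    }
}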

Can we temporarily add an assert! to the implementation of Document::trace() which panics if we end up tracing a document that's been discarded?

It appears that the tracing always happens as part of a call to window.clear_js_runtime, which calls into the GC, which I guess then traces everything.

So it doesn't look like the window/document is actually used somewhere in a Trusted, and the only thing using it is the garbage collector when tracing...

I'm able to get the window/document/global to drop now; however, it's through a hack and I'm not sure how to do it right. This gets you there:

unsafe impl JSTraceable for Window {
    unsafe fn trace(&self, trc: *mut JSTracer) {
        match self.current_state.get() {
            WindowState::Alive => self.globalscope.trace(trc),
            _ => {}
        }
    }
}

I've tried nulling everything in Window that keeps a Dom<Window>, and it's hard to know if I missed anything; in any case it wasn't working. However, not tracing works well.

However, that still doesn't prevent the leaking of ipc senders, which must happen elsewhere.

Screen Shot 2019-08-22 at 11 59 33 PM

I guess I can try to panic when tracing it, see what it tells us...

Screen Shot 2019-08-23 at 12 09 48 AM

The good news is there are only 20 places where an ipc-sender is used or stored in something traceable, so it must be one of them that is leaking! Yeah!

Gosh, not tracing a Window is dangerous! It'll end up gc'ing the window's children, but not the window itself. We have to be very confident that the window is unreachable for this to be safe!

not tracing a Window is dangerous! It'll end up gc'ing the window's children, but not the window itself.

Does the fact that the window drops when it isn't traced indicate that what is keeping it alive is a child holding a reference back to the window?

A couple of other questions:

  1. If the window drops when we don't trace it (when it's in the zombie state), what does that tell us about potentially leaked Dom<Window>s and so on? Logically, if there were a Dom<Window> stored somewhere, would the window still drop if we didn't trace it?
  2. When is trace_rust_roots called? Is it some sort of interrupt? I'm asking because I got a few double-borrow errors on a DomRefCell, which seems to only be possible if the refcell was traced while the script-thread was interrupted in the middle of borrowing it.

By the way, to narrow things down a bit, it seems we are after all leaking ipc-senders somewhere in a refcounted object, see:

Screen Shot 2019-08-23 at 11 25 51 AM
Screen Shot 2019-08-23 at 11 26 04 AM

Correction: the ipc-senders we're leaking are not in a refcounted object, they're on the "settings stack"...
Screen Shot 2019-08-23 at 1 20 02 PM

Correction again: the tracing of the ipc-sender happens completely outside of trace_rust_roots...

It's really incredible how you sometimes don't add a print somewhere because you think you "know" something...

By the way I've found a way to drop the windows without the no-trace, I'll make a PR for that separately...

Can we get a trace that shows how we're ending up tracing discarded windows? Can you add a log entry for the WindowState of each traced window?

Also, can you add a log entry for the size of the settings stack at https://github.com/servo/servo/blob/f1dd31f70440fa9c7a40525bd1e03eede568f74d/components/script/dom/bindings/settings_stack.rs#L34 so we can see if it's growing unexpectedly?

Also, can you add a log entry for the size of the settings stack at

So it's not the settings stack after all; in fact, the tracing of the IpcSender didn't occur as part of trace_rust_roots. It turns out it was the FetchCanceller (although I don't fully understand why it didn't show up as part of the tracing in trace_rust_roots).

Can we get a trace that shows how we're ending up tracing discarded windows?

I don't have a trace handy now; however, I remember it was always as part of clear_js_runtime, which calls into the GC. So essentially they were kept alive and traced each time a new pipeline was closed and clear_js_runtime was called, but they were never dropped.


So I was able to stop the FetchCanceller, or any traced IpcSender for that matter, from growing steadily, and I was also able to get the docs/windows/globalscopes to actually drop when we close a pipeline. I'll have a PR ready with those changes soon (it does not involve any changes to tracing).

However, this still doesn't stop the mach ports count from going up for each load. So I assume some ipc senders/receivers are being leaked somewhere outside of script. Actually, I haven't checked whether IpcReceivers are traced; I should.

Yay, good find on the FetchCanceller, which explains what's going on with the IPC senders.

If the windows are being traced but not dropped, then something must be keeping a live pointer to them! Adding logging to the trace which shows the window's WindowState would help with finding the culprit.

If the windows are being traced but not dropped, then something must be keeping a live pointer to them! Adding logging to the trace which shows the window's WindowState would help with finding the culprit.

What do you mean by "Adding logging to the trace"?

I'll open a PR soon with some fixes. Basically, it involves setting all MutNullableDom fields on a window to None inside clear_js_runtime, and I guess it helps because most of them contain a Dom<Window>. I can't say I fully understand what is going on, but they drop with those changes, and I assume it's less hacky than what I did previously with not tracing the window.
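A rough sketch of what that might look like (the field names are illustrative and the list is not exhaustive):

impl Window {
    pub fn clear_js_runtime(&self) {
        // Null out the MutNullableDom fields that can transitively keep a
        // Dom<Window> alive, so the GC can collect the whole graph once the
        // pipeline is closed.
        self.document.set(None);
        self.window_proxy.set(None);
        self.performance.set(None);
        // ... and so on for the remaining MutNullableDom fields ...
    }
}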

Also, I don't understand why the FetchCanceller wouldn't drop, since the document, which contains the document loader that contains the cancellers, does drop.

Something like:

#[derive(Clone, Copy, Debug, MallocSizeOf, PartialEq)] // removed JSTraceable
enum WindowState {
    Alive,
    Zombie, // Pipeline is closed, but the window hasn't been GCed yet.
}
unsafe impl JSTraceable for WindowState {
    unsafe fn trace(&self, trc: *mut JSTracer) {
        debug!("Tracing window in state {:?}", self);
    }
}

Setting all the nullable fields of a Window to null is a much better solution than not tracing the fields!

Perhaps we should have:

enum WindowState {
    Alive(WindowImpl),
    Dead,
}
struct Window {
    globalscope: GlobalScope,
    state: WindowState,
}

This would avoid introducing bugs when forgetting to clear fields of Window in the future.

Even better!

Thanks, I will try that approach.

I'm just doing some browsing, instead of running the webdriver script, and some weird things are happening. Any idea how this is possible:

Screen Shot 2019-08-23 at 9 40 33 PM

must be iframes...

That shouldn't happen; each iframe should get a different pipeline_id, so you shouldn't be seeing the same id over and over.

It happens quite a lot when browsing around GitHub, although not always; you also get this kind of expected behavior:

Screen Shot 2019-08-23 at 10 12 35 PM

Which is weird, because this happens after the big chunk above, and it's for the same pipeline...

The "multiple drop" of a document, seems to consistently occur right before layout exits, if that gives any indication...

Ok so those big chunks of documents with the same pipeline id dropping, they were created as part of https://github.com/servo/servo/blob/174bcc443435100da31b20db5ab684f4f3c8255b/components/script/dom/servoparser/mod.rs#L162

Basically, if a page uses an API like SetInnerHTML, Servo creates a new document for each fragment that is parsed. There are a few other places where new documents are created, like when using HTMLTemplateElement or CreateHTMLDocument.

Those documents do not have a browsing context, and they share the window of the document in which they are created (so they share the pipeline).


On another note, there is a PR at https://github.com/servo/servo/pull/24047 which fixes the issues noted with the FetchCanceller and windows not dropping as part of clear_js_runtime.

However, the mach ports count still steadily goes up on each load.

And I'm not so sure anymore that it really has to do with leaking ipc-channels as part of the load workflow, because I can get the count to go up steadily just by scrolling the page.

So it looks like the mach ports leak is somewhere in the glutin/compositing/rendering part.
Also, I'm not completely sure "mach ports" equals ipc-sender/receiver, since from sampling I can see quite a lot of mach_msg calls that are unrelated to ipc-channel, for example as part of running the glutin event-loop, handling input events, swapping buffers as part of the call to self.window.present(); in the compositor, and so on...

Anyway, when https://github.com/servo/servo/pull/24047 merges, @kanaka it might be worth another try using the master branch, since we were leaking ipc-channels along with the DOM objects, and the mach ports count I see on Mac might not actually be related to the fd problem on Linux.

@gterzian okay, I've been watching the progress on #24047 and I'll give it a run once it is merged.

Hmm, Servo shouldn't create a new Document for the same Window, should it? http://html.spec.whatwg.org/multipage/#concept-document-window says that the only case where a window's document is mutable is the initial about:blank.

I don't believe that is what is being described? There can be multiple Document objects that share the same global, because Document objects can be created through various APIs. window.document should only ever return the initial about:blank or the subsequent document.

Oh I see what you mean, it's not changing the associated document for the Window, but creating a new Document.

I did a test run with #24072 (39bd455). Panic after 182 page loads.

$ ./mach run --release -z --webdriver=7002 --resolution=400x300



called `Result::unwrap()` on an `Err` value: Os { code: 24, kind: Other, message: "Too many open files" } (thread ScriptThread PipelineId { namespace_id: PipelineNamespaceId(1), index: PipelineIndex(2) }, at src/libcore/result.rs:1084)
stack backtrace:
   0: servo::main::{{closure}}::hf72b2e0d1b59dc91 (0x555d7738e05c)
   1: std::panicking::rust_panic_with_hook::h2e01ca08bfa50f6c (0x555d7a4a33cb)
             at src/libstd/panicking.rs:481
   2: std::panicking::continue_panic_fmt::h89ac412ecc105abb (0x555d7a4a2e81)
             at src/libstd/panicking.rs:384
   3: rust_begin_unwind (0x555d7a4a2d75)
             at src/libstd/panicking.rs:311
   4: core::panicking::panic_fmt::hc72bd3ff9f8b21ef (0x555d7a4c5209)
             at src/libcore/panicking.rs:85
   5: core::result::unwrap_failed::h8fe51b7d2d9e1a4f (0x555d7a4c5306)
             at src/libcore/result.rs:1084
   6: script::fetch::FetchCanceller::initialize::h0667583d5c208824 (0x555d77ff0385)
   7: script::document_loader::DocumentLoader::fetch_async_background::ha953132b150c11ec (0x555d77939abf)
   8: script::dom::document::Document::fetch_async::hcefa6c68c10aa6df (0x555d77f7e878)
   9: script::stylesheet_loader::StylesheetLoader::load::h4189056b1f0b82bc (0x555d780189ce)
  10: script::dom::htmllinkelement::HTMLLinkElement::handle_stylesheet_url::h36d04d50d21c673c (0x555d780caa97)
  11: <script::dom::htmllinkelement::HTMLLinkElement as script::dom::virtualmethods::VirtualMethods>::bind_to_tree::h8d0d4e2c17978558 (0x555d780ca365)
  12: script::dom::node::Node::insert::he0df1dffeda1d560 (0x555d78167c74)
  13: script::dom::node::Node::pre_insert::h5d249f3646f58260 (0x555d78166fe2)
  14: script::dom::servoparser::insert::hcc57a1a7a2519d59 (0x555d77fba29e)
  15: html5ever::tree_builder::TreeBuilder<Handle,Sink>::insert_element::h8415e0a35347bf7a (0x555d7830e183)
  16: html5ever::tree_builder::TreeBuilder<Handle,Sink>::step::h685ee3f370b0fab0 (0x555d78337de5)
  17: <html5ever::tree_builder::TreeBuilder<Handle,Sink> as html5ever::tokenizer::interface::TokenSink>::process_token::h9908a5e0076b3921 (0x555d782bd033)
  18: _ZN9html5ever9tokenizer21Tokenizer$LT$Sink$GT$13process_token17h41c90dc793c1d2a6E.llvm.2797139391679541069 (0x555d779f7698)
  19: html5ever::tokenizer::Tokenizer<Sink>::emit_current_tag::he67fc36c50ce4cfe (0x555d779f813f)
  20: html5ever::tokenizer::Tokenizer<Sink>::step::h04bdbd1ae281e490 (0x555d77a008e7)
  21: _ZN9html5ever9tokenizer21Tokenizer$LT$Sink$GT$3run17h03bfa67c7500ca1fE.llvm.2797139391679541069 (0x555d779fbb0a)
  22: script::dom::servoparser::html::Tokenizer::feed::h8ddf61a5fc89c8e5 (0x555d77a28bfa)
  23: script::dom::servoparser::ServoParser::do_parse_sync::h958bbe7bd160caf6 (0x555d77fb5b0e)
  24: profile_traits::time::profile::hb3a88f3f29bcec16 (0x555d78129b6f)
  25: _ZN6script3dom11servoparser11ServoParser10parse_sync17h0d22e0838491f6baE.llvm.5820365630158477671 (0x555d77fb5706)
  26: <script::dom::servoparser::ParserContext as net_traits::FetchResponseListener>::process_response_chunk::h1c9239edfd5cb712 (0x555d77fb9af1)
  27: script::script_thread::ScriptThread::handle_msg_from_constellation::h92b3fa635181d023 (0x555d780011c8)
  28: _ZN6script13script_thread12ScriptThread11handle_msgs17h24da03ff2c7389e7E.llvm.5820365630158477671 (0x555d77ffbc87)
  29: profile_traits::mem::ProfilerChan::run_with_memory_reporting::h607d88fbbd02ee1e (0x555d78125b67)
  30: std::sys_common::backtrace::__rust_begin_short_backtrace::hd6d1fca432969577 (0x555d779049f7)
  31: _ZN3std9panicking3try7do_call17ha2efe18ca68ba1f6E.llvm.2884171771275279049 (0x555d7854ebc3)
  32: __rust_maybe_catch_panic (0x555d7a4ad029)
             at src/libpanic_unwind/lib.rs:80
  33: core::ops::function::FnOnce::call_once{{vtable.shim}}::hcfddfc1508cf5b2a (0x555d77ae43a2)
  34: <alloc::boxed::Box<F> as core::ops::function::FnOnce<A>>::call_once::hf92c7ee34ebffad1 (0x555d7a491b3e)
             at /rustc/521d78407471cb78e9bbf47160f6aa23047ac499/src/liballoc/boxed.rs:922
  35: <alloc::boxed::Box<F> as core::ops::function::FnOnce<A>>::call_once::ha94d3a20f0e8dd18 (0x555d7a4ac36f)
             at /rustc/521d78407471cb78e9bbf47160f6aa23047ac499/src/liballoc/boxed.rs:922
      std::sys_common::thread::start_thread::h4ab30187e03381f3
             at src/libstd/sys_common/thread.rs:13
      std::sys::unix::thread::Thread::new::thread_start::hb63d4d1c8078a565
             at src/libstd/sys/unix/thread.rs:79
  36: start_thread (0x7feaa00a56b9)
  37: clone (0x7fea9e94141c)
  38: <unknown> (0x0)
[2019-08-29T15:52:55Z ERROR servo] called `Result::unwrap()` on an `Err` value: Os { code: 24, kind: Other, message: "Too many open files" }

The file-descriptor counts show some patterns that might be interesting in tracking this down, so I'm including them below. The growth rate seems to be about 29 fds per load until about run 20, then it wobbles and stabilizes until about run 33, after which it seems to grow at about 22 fds per load until it hits the 4096 limit.

Thu Aug 29 10:52:23 CDT 2019: Run 0, (fds: 221)
Thu Aug 29 10:52:25 CDT 2019: Run 1, (fds: 267)
Thu Aug 29 10:52:26 CDT 2019: Run 2, (fds: 296)
Thu Aug 29 10:52:27 CDT 2019: Run 3, (fds: 325)
Thu Aug 29 10:52:28 CDT 2019: Run 4, (fds: 354)
Thu Aug 29 10:52:29 CDT 2019: Run 5, (fds: 383)
Thu Aug 29 10:52:30 CDT 2019: Run 6, (fds: 412)
Thu Aug 29 10:52:31 CDT 2019: Run 7, (fds: 438)
Thu Aug 29 10:52:32 CDT 2019: Run 8, (fds: 467)
Thu Aug 29 10:52:33 CDT 2019: Run 9, (fds: 496)
Thu Aug 29 10:52:34 CDT 2019: Run 10, (fds: 525)
Thu Aug 29 10:52:34 CDT 2019: Run 11, (fds: 554)
Thu Aug 29 10:52:34 CDT 2019: Run 12, (fds: 583)
Thu Aug 29 10:52:34 CDT 2019: Run 13, (fds: 612)
Thu Aug 29 10:52:34 CDT 2019: Run 14, (fds: 641)
Thu Aug 29 10:52:34 CDT 2019: Run 15, (fds: 670)
Thu Aug 29 10:52:34 CDT 2019: Run 16, (fds: 699)
Thu Aug 29 10:52:34 CDT 2019: Run 17, (fds: 728)
Thu Aug 29 10:52:34 CDT 2019: Run 18, (fds: 757)
Thu Aug 29 10:52:34 CDT 2019: Run 19, (fds: 786)
Thu Aug 29 10:52:34 CDT 2019: Run 20, (fds: 815)
Thu Aug 29 10:52:34 CDT 2019: Run 21, (fds: 830)
Thu Aug 29 10:52:34 CDT 2019: Run 22, (fds: 825)
Thu Aug 29 10:52:35 CDT 2019: Run 23, (fds: 847)
Thu Aug 29 10:52:36 CDT 2019: Run 24, (fds: 825)
Thu Aug 29 10:52:36 CDT 2019: Run 25, (fds: 847)
Thu Aug 29 10:52:36 CDT 2019: Run 26, (fds: 847)
Thu Aug 29 10:52:36 CDT 2019: Run 27, (fds: 847)
Thu Aug 29 10:52:36 CDT 2019: Run 28, (fds: 847)
Thu Aug 29 10:52:36 CDT 2019: Run 29, (fds: 847)
Thu Aug 29 10:52:36 CDT 2019: Run 30, (fds: 847)
Thu Aug 29 10:52:36 CDT 2019: Run 31, (fds: 847)
Thu Aug 29 10:52:36 CDT 2019: Run 32, (fds: 847)
Thu Aug 29 10:52:36 CDT 2019: Run 33, (fds: 847)
Thu Aug 29 10:52:36 CDT 2019: Run 34, (fds: 869)
Thu Aug 29 10:52:38 CDT 2019: Run 35, (fds: 891)
Thu Aug 29 10:52:38 CDT 2019: Run 36, (fds: 913)
Thu Aug 29 10:52:38 CDT 2019: Run 37, (fds: 935)
Thu Aug 29 10:52:38 CDT 2019: Run 38, (fds: 957)
Thu Aug 29 10:52:38 CDT 2019: Run 39, (fds: 979)
Thu Aug 29 10:52:38 CDT 2019: Run 40, (fds: 1001)
Thu Aug 29 10:52:38 CDT 2019: Run 41, (fds: 1023)
Thu Aug 29 10:52:38 CDT 2019: Run 42, (fds: 1045)
Thu Aug 29 10:52:38 CDT 2019: Run 43, (fds: 1045)
Thu Aug 29 10:52:38 CDT 2019: Run 44, (fds: 1067)
Thu Aug 29 10:52:39 CDT 2019: Run 45, (fds: 1089)
Thu Aug 29 10:52:39 CDT 2019: Run 46, (fds: 1111)
Thu Aug 29 10:52:39 CDT 2019: Run 47, (fds: 1111)
Thu Aug 29 10:52:39 CDT 2019: Run 48, (fds: 1133)
Thu Aug 29 10:52:39 CDT 2019: Run 49, (fds: 1155)
Thu Aug 29 10:52:39 CDT 2019: Run 50, (fds: 1177)
Thu Aug 29 10:52:39 CDT 2019: Run 51, (fds: 1199)
Thu Aug 29 10:52:39 CDT 2019: Run 52, (fds: 1221)
Thu Aug 29 10:52:39 CDT 2019: Run 53, (fds: 1243)
Thu Aug 29 10:52:39 CDT 2019: Run 54, (fds: 1265)
Thu Aug 29 10:52:40 CDT 2019: Run 55, (fds: 1287)
Thu Aug 29 10:52:41 CDT 2019: Run 56, (fds: 1309)
Thu Aug 29 10:52:41 CDT 2019: Run 57, (fds: 1309)
Thu Aug 29 10:52:41 CDT 2019: Run 58, (fds: 1331)
Thu Aug 29 10:52:41 CDT 2019: Run 59, (fds: 1353)
Thu Aug 29 10:52:41 CDT 2019: Run 60, (fds: 1375)
Thu Aug 29 10:52:41 CDT 2019: Run 61, (fds: 1397)
Thu Aug 29 10:52:41 CDT 2019: Run 62, (fds: 1419)
Thu Aug 29 10:52:41 CDT 2019: Run 63, (fds: 1441)
Thu Aug 29 10:52:41 CDT 2019: Run 64, (fds: 1463)
Thu Aug 29 10:52:42 CDT 2019: Run 65, (fds: 1485)
Thu Aug 29 10:52:42 CDT 2019: Run 66, (fds: 1507)
Thu Aug 29 10:52:42 CDT 2019: Run 67, (fds: 1529)
Thu Aug 29 10:52:42 CDT 2019: Run 68, (fds: 1551)
Thu Aug 29 10:52:42 CDT 2019: Run 69, (fds: 1573)
Thu Aug 29 10:52:42 CDT 2019: Run 70, (fds: 1595)
Thu Aug 29 10:52:42 CDT 2019: Run 71, (fds: 1617)
Thu Aug 29 10:52:42 CDT 2019: Run 72, (fds: 1639)
Thu Aug 29 10:52:42 CDT 2019: Run 73, (fds: 1661)
Thu Aug 29 10:52:42 CDT 2019: Run 74, (fds: 1683)
Thu Aug 29 10:52:42 CDT 2019: Run 75, (fds: 1705)
Thu Aug 29 10:52:43 CDT 2019: Run 76, (fds: 1727)
Thu Aug 29 10:52:43 CDT 2019: Run 77, (fds: 1749)
Thu Aug 29 10:52:43 CDT 2019: Run 78, (fds: 1771)
Thu Aug 29 10:52:43 CDT 2019: Run 79, (fds: 1771)
Thu Aug 29 10:52:43 CDT 2019: Run 80, (fds: 1793)
Thu Aug 29 10:52:43 CDT 2019: Run 81, (fds: 1815)
Thu Aug 29 10:52:43 CDT 2019: Run 82, (fds: 1837)
Thu Aug 29 10:52:43 CDT 2019: Run 83, (fds: 1859)
Thu Aug 29 10:52:43 CDT 2019: Run 84, (fds: 1881)
Thu Aug 29 10:52:44 CDT 2019: Run 85, (fds: 1903)
Thu Aug 29 10:52:44 CDT 2019: Run 86, (fds: 1925)
Thu Aug 29 10:52:44 CDT 2019: Run 87, (fds: 1947)
Thu Aug 29 10:52:44 CDT 2019: Run 88, (fds: 1969)
Thu Aug 29 10:52:44 CDT 2019: Run 89, (fds: 1991)
Thu Aug 29 10:52:44 CDT 2019: Run 90, (fds: 2013)
Thu Aug 29 10:52:44 CDT 2019: Run 91, (fds: 2035)
Thu Aug 29 10:52:44 CDT 2019: Run 92, (fds: 2057)
Thu Aug 29 10:52:44 CDT 2019: Run 93, (fds: 2079)
Thu Aug 29 10:52:44 CDT 2019: Run 94, (fds: 2101)
Thu Aug 29 10:52:45 CDT 2019: Run 95, (fds: 2123)
Thu Aug 29 10:52:45 CDT 2019: Run 96, (fds: 2145)
Thu Aug 29 10:52:45 CDT 2019: Run 97, (fds: 2167)
Thu Aug 29 10:52:45 CDT 2019: Run 98, (fds: 2189)
Thu Aug 29 10:52:45 CDT 2019: Run 99, (fds: 2211)
Thu Aug 29 10:52:45 CDT 2019: Run 100, (fds: 2233)
Thu Aug 29 10:52:45 CDT 2019: Run 101, (fds: 2255)
Thu Aug 29 10:52:45 CDT 2019: Run 102, (fds: 2277)
Thu Aug 29 10:52:45 CDT 2019: Run 103, (fds: 2299)
Thu Aug 29 10:52:45 CDT 2019: Run 104, (fds: 2321)
Thu Aug 29 10:52:46 CDT 2019: Run 105, (fds: 2343)
Thu Aug 29 10:52:46 CDT 2019: Run 106, (fds: 2365)
Thu Aug 29 10:52:46 CDT 2019: Run 107, (fds: 2387)
Thu Aug 29 10:52:46 CDT 2019: Run 108, (fds: 2409)
Thu Aug 29 10:52:46 CDT 2019: Run 109, (fds: 2431)
Thu Aug 29 10:52:46 CDT 2019: Run 110, (fds: 2453)
Thu Aug 29 10:52:46 CDT 2019: Run 111, (fds: 2475)
Thu Aug 29 10:52:46 CDT 2019: Run 112, (fds: 2497)
Thu Aug 29 10:52:46 CDT 2019: Run 113, (fds: 2519)
Thu Aug 29 10:52:47 CDT 2019: Run 114, (fds: 2541)
Thu Aug 29 10:52:47 CDT 2019: Run 115, (fds: 2563)
Thu Aug 29 10:52:47 CDT 2019: Run 116, (fds: 2585)
Thu Aug 29 10:52:47 CDT 2019: Run 117, (fds: 2607)
Thu Aug 29 10:52:47 CDT 2019: Run 118, (fds: 2629)
Thu Aug 29 10:52:47 CDT 2019: Run 119, (fds: 2651)
Thu Aug 29 10:52:47 CDT 2019: Run 120, (fds: 2673)
Thu Aug 29 10:52:47 CDT 2019: Run 121, (fds: 2695)
Thu Aug 29 10:52:47 CDT 2019: Run 122, (fds: 2717)
Thu Aug 29 10:52:47 CDT 2019: Run 123, (fds: 2739)
Thu Aug 29 10:52:48 CDT 2019: Run 124, (fds: 2761)
Thu Aug 29 10:52:48 CDT 2019: Run 125, (fds: 2783)
Thu Aug 29 10:52:48 CDT 2019: Run 126, (fds: 2805)
Thu Aug 29 10:52:48 CDT 2019: Run 127, (fds: 2827)
Thu Aug 29 10:52:48 CDT 2019: Run 128, (fds: 2849)
Thu Aug 29 10:52:48 CDT 2019: Run 129, (fds: 2871)
Thu Aug 29 10:52:48 CDT 2019: Run 130, (fds: 2893)
Thu Aug 29 10:52:48 CDT 2019: Run 131, (fds: 2915)
Thu Aug 29 10:52:48 CDT 2019: Run 132, (fds: 2937)
Thu Aug 29 10:52:48 CDT 2019: Run 133, (fds: 2959)
Thu Aug 29 10:52:48 CDT 2019: Run 134, (fds: 2981)
Thu Aug 29 10:52:49 CDT 2019: Run 135, (fds: 3003)
Thu Aug 29 10:52:49 CDT 2019: Run 136, (fds: 3025)
Thu Aug 29 10:52:49 CDT 2019: Run 137, (fds: 3047)
Thu Aug 29 10:52:49 CDT 2019: Run 138, (fds: 3069)
Thu Aug 29 10:52:49 CDT 2019: Run 139, (fds: 3091)
Thu Aug 29 10:52:49 CDT 2019: Run 140, (fds: 3113)
Thu Aug 29 10:52:49 CDT 2019: Run 141, (fds: 3135)
Thu Aug 29 10:52:49 CDT 2019: Run 142, (fds: 3157)
Thu Aug 29 10:52:49 CDT 2019: Run 143, (fds: 3179)
Thu Aug 29 10:52:50 CDT 2019: Run 144, (fds: 3201)
Thu Aug 29 10:52:50 CDT 2019: Run 145, (fds: 3223)
Thu Aug 29 10:52:50 CDT 2019: Run 146, (fds: 3245)
Thu Aug 29 10:52:50 CDT 2019: Run 147, (fds: 3267)
Thu Aug 29 10:52:50 CDT 2019: Run 148, (fds: 3289)
Thu Aug 29 10:52:50 CDT 2019: Run 149, (fds: 3311)
Thu Aug 29 10:52:50 CDT 2019: Run 150, (fds: 3333)
Thu Aug 29 10:52:50 CDT 2019: Run 151, (fds: 3355)
Thu Aug 29 10:52:50 CDT 2019: Run 152, (fds: 3377)
Thu Aug 29 10:52:50 CDT 2019: Run 153, (fds: 3399)
Thu Aug 29 10:52:51 CDT 2019: Run 154, (fds: 3421)
Thu Aug 29 10:52:51 CDT 2019: Run 155, (fds: 3443)
Thu Aug 29 10:52:51 CDT 2019: Run 156, (fds: 3465)
Thu Aug 29 10:52:51 CDT 2019: Run 157, (fds: 3487)
Thu Aug 29 10:52:51 CDT 2019: Run 158, (fds: 3509)
Thu Aug 29 10:52:51 CDT 2019: Run 159, (fds: 3531)
Thu Aug 29 10:52:51 CDT 2019: Run 160, (fds: 3553)
Thu Aug 29 10:52:51 CDT 2019: Run 161, (fds: 3575)
Thu Aug 29 10:52:51 CDT 2019: Run 162, (fds: 3597)
Thu Aug 29 10:52:51 CDT 2019: Run 163, (fds: 3619)
Thu Aug 29 10:52:52 CDT 2019: Run 164, (fds: 3641)
Thu Aug 29 10:52:52 CDT 2019: Run 165, (fds: 3663)
Thu Aug 29 10:52:52 CDT 2019: Run 166, (fds: 3685)
Thu Aug 29 10:52:52 CDT 2019: Run 167, (fds: 3707)
Thu Aug 29 10:52:52 CDT 2019: Run 168, (fds: 3729)
Thu Aug 29 10:52:52 CDT 2019: Run 169, (fds: 3751)
Thu Aug 29 10:52:52 CDT 2019: Run 170, (fds: 3773)
Thu Aug 29 10:52:52 CDT 2019: Run 171, (fds: 3795)
Thu Aug 29 10:52:52 CDT 2019: Run 172, (fds: 3817)
Thu Aug 29 10:52:53 CDT 2019: Run 173, (fds: 3839)
Thu Aug 29 10:52:53 CDT 2019: Run 174, (fds: 3861)
Thu Aug 29 10:52:53 CDT 2019: Run 175, (fds: 3883)
Thu Aug 29 10:52:53 CDT 2019: Run 176, (fds: 3905)
Thu Aug 29 10:52:53 CDT 2019: Run 177, (fds: 3927)
Thu Aug 29 10:52:53 CDT 2019: Run 178, (fds: 3949)
Thu Aug 29 10:52:53 CDT 2019: Run 179, (fds: 3971)
Thu Aug 29 10:52:53 CDT 2019: Run 180, (fds: 3993)
Thu Aug 29 10:52:53 CDT 2019: Run 181, (fds: 4015)
Thu Aug 29 10:52:53 CDT 2019: Run 182, (fds: 4037)

Thanks, @kanaka! So I think those fds are not the ones leaked by keeping the window/globalscope alive, and that matches the mach ports count I noticed on Mac.

Also, I don't think those relate to actually loading the page anymore, since I can get that count to go up simply by opening a page and scrolling it up and down. Perhaps you can confirm you see that happening on Linux too?

I used 1291d52f (2019-08-29) to test scrolling. I'm not seeing an obvious increase due to scrolling. I loaded cnn, and while it consumed something like 1800 fds and the number wobbled around a bit during scrolling, I couldn't make it consistently increase. I loaded this page as well, and it stayed around 482 no matter how much I scrolled.

That is a lot of fds!

@asajeffrey I thought so too. For reference, the about page (servo started with no URL) uses about 218 fds for me (and scrolling doesn't change it there either).

Ok, so what I saw with mach ports going up while scrolling on Mac doesn't reproduce with fds on Linux.

I rebased https://github.com/servo/servo/pull/23909 so that it includes the recent fix to the leaking windows; it might be worth trying again with that one, to see if it takes us any further.

@gterzian Building to test with #23909 now.

Things are MUCH improved with this build!

I did a test run for about 2000 loads with no crash/panic. There was still a slight increase in fds over time, but it's very slow and certainly not happening with every page load. The fds started at about 220 and increased to 263 on the 2000th load. The number of fds wobbles up and down some (perhaps an artifact of how I'm counting them via /proc) but on average shows a slow increase.
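For reference, the counting described above amounts to listing /proc/<pid>/fd; a minimal sketch (Linux only, with the servo pid assumed to be supplied by whatever drives the test loop):

use std::fs;
use std::io;

// Count the open file descriptors of a process by listing /proc/<pid>/fd.
fn fd_count(pid: u32) -> io::Result<usize> {
    Ok(fs::read_dir(format!("/proc/{}/fd", pid))?.count())
}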

Here are the first few fd counts and then every 100 page loads after that:

Tue Sep  3 12:42:46 CDT 2019: Run 0, (fds: 220)
Tue Sep  3 12:42:47 CDT 2019: Run 1, (fds: 209)
Tue Sep  3 12:42:47 CDT 2019: Run 2, (fds: 212)
Tue Sep  3 12:42:47 CDT 2019: Run 3, (fds: 214)
Tue Sep  3 12:42:47 CDT 2019: Run 4, (fds: 214)
Tue Sep  3 12:42:47 CDT 2019: Run 5, (fds: 214)
Tue Sep  3 12:42:48 CDT 2019: Run 6, (fds: 214)
Tue Sep  3 12:42:49 CDT 2019: Run 7, (fds: 212)
Tue Sep  3 12:42:49 CDT 2019: Run 8, (fds: 214)
Tue Sep  3 12:42:49 CDT 2019: Run 9, (fds: 214)
Tue Sep  3 12:42:49 CDT 2019: Run 10, (fds: 214)
...
Tue Sep  3 12:43:06 CDT 2019: Run 100, (fds: 220)
Tue Sep  3 12:43:17 CDT 2019: Run 200, (fds: 223)
Tue Sep  3 12:43:27 CDT 2019: Run 300, (fds: 226)
Tue Sep  3 12:43:37 CDT 2019: Run 400, (fds: 232)
Tue Sep  3 12:43:47 CDT 2019: Run 500, (fds: 235)
Tue Sep  3 12:43:57 CDT 2019: Run 600, (fds: 241)
Tue Sep  3 12:44:06 CDT 2019: Run 700, (fds: 238)
Tue Sep  3 12:44:16 CDT 2019: Run 800, (fds: 238)
Tue Sep  3 12:44:25 CDT 2019: Run 900, (fds: 241)
Tue Sep  3 12:44:34 CDT 2019: Run 1000, (fds: 245)
Tue Sep  3 12:44:44 CDT 2019: Run 1100, (fds: 250)
Tue Sep  3 12:44:53 CDT 2019: Run 1200, (fds: 245)
Tue Sep  3 12:45:02 CDT 2019: Run 1300, (fds: 245)
Tue Sep  3 12:45:12 CDT 2019: Run 1400, (fds: 248)
Tue Sep  3 12:45:21 CDT 2019: Run 1500, (fds: 254)
Tue Sep  3 12:45:31 CDT 2019: Run 1600, (fds: 257)
Tue Sep  3 12:45:40 CDT 2019: Run 1700, (fds: 257)
Tue Sep  3 12:45:49 CDT 2019: Run 1800, (fds: 257)
Tue Sep  3 12:45:59 CDT 2019: Run 1900, (fds: 263)
Tue Sep  3 12:46:08 CDT 2019: Run 2000, (fds: 263)

Ok so:

  1. Running the test off of master, including the "window leak fix", produces unsatisfactory results.
  2. Running the test off of #23909 without the "window leak fix", produces unsatisfactory results.
  3. Running the test off of #23909 rebased on master, including the "window leak fix", produces satisfactory results.

I'm not sure what to make of it, since 1 fixed a leak of ipc-channels (hence fds), but 2 does not; it only reduces the number of ipc-channels created, and each one created is promptly dropped (I think).

Maybe #23909 inadvertently also fixes a leak somewhere else; however, it's hard to tell. The main change it makes is not using the "create an ipc-channel, send the sender in a message, and block on the receiver" pattern, whose usage would seem to imply not leaking the ipc-channel, since neither the sender nor the receiver is stored anywhere.
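For context, the "one-shot" pattern referred to above looks roughly like this (a sketch with illustrative types, not the actual Servo code); every call creates a fresh channel pair, which costs file descriptors/mach ports that are only reclaimed if both ends are promptly dropped:

use ipc_channel::ipc::{self, IpcSender};
use serde::{Deserialize, Serialize};

#[derive(Serialize, Deserialize)]
enum Msg {
    // The request carries the sender to use for the reply.
    Query { reply: IpcSender<u32> },
}

fn blocking_query(other_end: &IpcSender<Msg>) -> u32 {
    // Create a fresh ipc channel just for this one reply.
    let (reply_sender, reply_receiver) = ipc::channel().expect("failed to create ipc channel");
    other_end
        .send(Msg::Query { reply: reply_sender })
        .expect("other end disconnected");
    // Block until the reply arrives; both halves drop when this returns.
    reply_receiver.recv().expect("reply sender was dropped")
}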
