I haven't got a small reproducible case, but it looks like I'm seeing leaking memory. Using wgpu-rs on macOS. AMD Radeon Pro Vega 56 DiscreteGpu Metal
thread 'main' panicked at 'called `Result::unwrap()` on an `Err` value: AllocationError(OutOfMemory(Device))', /Users/ln/.cargo/git/checkouts/wgpu-53e70f8674b08dd4/05ba7a5/wgpu-core/src/device/mod.rs:328:22
stack backtrace:
0: 0x107737575 - backtrace::backtrace::libunwind::trace::h5c84db184dbe55ed
at /Users/runner/.cargo/registry/src/github.com-1ecc6299db9ec823/backtrace-0.3.40/src/backtrace/libunwind.rs:88
1: 0x107737575 - backtrace::backtrace::trace_unsynchronized::hfabf504f184a4062
at /Users/runner/.cargo/registry/src/github.com-1ecc6299db9ec823/backtrace-0.3.40/src/backtrace/mod.rs:66
2: 0x107737575 - std::sys_common::backtrace::_print_fmt::hb18545a457444b58
at src/libstd/sys_common/backtrace.rs:77
3: 0x107737575 - <std::sys_common::backtrace::_print::DisplayBacktrace as core::fmt::Display>::fmt::hff7732c2e44ef8b9
at src/libstd/sys_common/backtrace.rs:59
4: 0x107754bad - core::fmt::write::hd42cb3dea57bae40
at src/libcore/fmt/mod.rs:1052
5: 0x107734f3b - std::io::Write::write_fmt::ha39f6009af02b1d2
at src/libstd/io/mod.rs:1426
6: 0x10773938a - std::sys_common::backtrace::_print::h5cfb8cdd320f1e64
at src/libstd/sys_common/backtrace.rs:62
7: 0x10773938a - std::sys_common::backtrace::print::hef683e3bc77ce269
at src/libstd/sys_common/backtrace.rs:49
8: 0x10773938a - std::panicking::default_hook::{{closure}}::h389f076017b5df43
at src/libstd/panicking.rs:204
9: 0x10773908a - std::panicking::default_hook::h04b06ec20c41bf02
at src/libstd/panicking.rs:224
10: 0x1077399dd - std::panicking::rust_panic_with_hook::hccde7faed9a5c398
at src/libstd/panicking.rs:472
11: 0x1077395a2 - rust_begin_unwind
at src/libstd/panicking.rs:380
12: 0x107767ebf - std::panicking::begin_panic::hc520a8a43176ea4c
13: 0x107767dc5 - std::panicking::begin_panic::hc520a8a43176ea4c
14: 0x106eff899 - core::result::Result<T,E>::unwrap::hfdd5819c5946a227
at /rustc/b8cedc00407a4c56a3bda1ed605c6fc166655447/src/libcore/result.rs:963
15: 0x106e3fd8c - wgpu_core::device::Device<B>::create_buffer::h40f8e83dda14e0f2
at /Users/ln/.cargo/git/checkouts/wgpu-53e70f8674b08dd4/05ba7a5/wgpu-core/src/device/mod.rs:328
16: 0x106da5b15 - wgpu_core::device::<impl wgpu_core::hub::Global<G>>::device_create_buffer_mapped::h93275aab33da2756
at /Users/ln/.cargo/git/checkouts/wgpu-53e70f8674b08dd4/05ba7a5/wgpu-core/src/device/mod.rs:554
17: 0x106e43bbc - wgpu_device_create_buffer_mapped
at /Users/ln/.cargo/git/checkouts/wgpu-53e70f8674b08dd4/05ba7a5/wgpu-native/src/device.rs:215
18: 0x106d39d19 - wgpu::Device::create_buffer_mapped::h9cd4b0ad4b3c5391
at /Users/ln/.cargo/git/checkouts/wgpu-rs-40ea39809c03c5d8/e198d63/src/lib.rs:946
19: 0x106d39e75 - wgpu::Device::create_buffer_with_data::h2645df23584f5aa9
at /Users/ln/.cargo/git/checkouts/wgpu-rs-40ea39809c03c5d8/e198d63/src/lib.rs:964
In Instruments, it looks like "All Heap Allocations" looks ok. There is growth consistently frame by frame that eventually gets freed. The biggest of which is I'm seeing allocations by [GFX9_MtlCmdBuffer blitCommandEncoder] through wgpu::CommandEncoder::copy_buffer_to_buffer growing to more than 150k total allocations over minutes, that then eventually gets released. So that looks like a red herring.
The one that looks like the root cause is from "All VM Regions". VM: IOAccelerator - this doesn't seem to leak all the time, my app was running for 11 minutes with no incremental allocations from IOAccelerator, but once it starts leaking it grew to 1GB and then the app panicked as above. I'm not sure whether anything in my code triggered it. I'll dig some more and see if I can find something.
Here's the allocation call trace of the call that is definitely problematic:
0 IOAccelerator IOAccelResourceCreate
1 Metal -[MTLIOAccelResource initWithDevice:remoteStorageResource:options:args:argsSize:]
2 Metal -[MTLIOAccelResource initWithDevice:options:args:argsSize:]
3 Metal -[MTLIOAccelBuffer initWithDevice:pointer:length:options:sysMemSize:vidMemSize:args:argsSize:deallocator:]
4 AMDRadeonX5000MTLDriver -[AMD_MtlBuffer initWithDevice:pointer:length:options:sysMemSize:vidMemSize:args:argsSize:deallocator:]
5 AMDRadeonX5000MTLDriver -[AMD_MtlBuffer initInternalWithDevice:pointer:length:options:deallocator:]
6 AMDRadeonX5000MTLDriver -[AMD_MtlDevice newBufferWithLength:options:]
7 myapp _$LT$$LP$A$C$$u20$B$RP$$u20$as$u20$objc..message..MessageArguments$GT$::invoke::hfd269ed3df516a3b /Users/ln/.cargo/registry/src/github.com-1ecc6299db9ec823/objc-0.2.7/src/message/mod.rs:128
8 myapp objc::message::platform::send_unverified::ha3bb8f66d92d4c47 /Users/ln/.cargo/registry/src/github.com-1ecc6299db9ec823/objc-0.2.7/src/message/apple/mod.rs:27
9 myapp objc::message::send_message::he77867efdfe3ea45 /Users/ln/.cargo/registry/src/github.com-1ecc6299db9ec823/objc-0.2.7/src/message/mod.rs:178
10 myapp metal::device::DeviceRef::new_buffer::h610be1f6ca0bf17b /Users/ln/.cargo/registry/src/github.com-1ecc6299db9ec823/metal-0.18.0/src/device.rs:1653
11 myapp _$LT$gfx_backend_metal..device..Device$u20$as$u20$gfx_hal..device..Device$LT$gfx_backend_metal..Backend$GT$$GT$::allocate_memory::hf145286829816ae2 /Users/ln/.cargo/registry/src/github.com-1ecc6299db9ec823/gfx-backend-metal-0.5.1/src/device.rs:2275
12 myapp gfx_memory::allocator::allocate_memory_helper::h8514d362e5d363d4 /Users/ln/.cargo/registry/src/github.com-1ecc6299db9ec823/gfx-memory-0.1.1/src/allocator/mod.rs:61
13 myapp gfx_memory::allocator::general::GeneralAllocator$LT$B$GT$::alloc_chunk_from_device::h3f5e412513142799 /Users/ln/.cargo/registry/src/github.com-1ecc6299db9ec823/gfx-memory-0.1.1/src/allocator/general.rs:236
14 myapp gfx_memory::allocator::general::GeneralAllocator$LT$B$GT$::alloc_chunk::hfb5f92c151ec8ce1 /Users/ln/.cargo/registry/src/github.com-1ecc6299db9ec823/gfx-memory-0.1.1/src/allocator/general.rs:271
15 myapp gfx_memory::allocator::general::GeneralAllocator$LT$B$GT$::alloc_from_entry::h1d010a782d4c4b3f /Users/ln/.cargo/registry/src/github.com-1ecc6299db9ec823/gfx-memory-0.1.1/src/allocator/general.rs:364
16 myapp gfx_memory::allocator::general::GeneralAllocator$LT$B$GT$::alloc_block::h5349f67da9b9b5f4 /Users/ln/.cargo/registry/src/github.com-1ecc6299db9ec823/gfx-memory-0.1.1/src/allocator/general.rs:418
17 myapp _$LT$gfx_memory..allocator..general..GeneralAllocator$LT$B$GT$$u20$as$u20$gfx_memory..allocator..Allocator$LT$B$GT$$GT$::alloc::hbeda516620781f02 /Users/ln/.cargo/registry/src/github.com-1ecc6299db9ec823/gfx-memory-0.1.1/src/allocator/general.rs:498
18 myapp gfx_memory::heaps::memory_type::MemoryType$LT$B$GT$::alloc::h0228881f8dde543d /Users/ln/.cargo/registry/src/github.com-1ecc6299db9ec823/gfx-memory-0.1.1/src/heaps/memory_type.rs:92
19 myapp gfx_memory::heaps::Heaps$LT$B$GT$::allocate_from::h6008f56da3692568 /Users/ln/.cargo/registry/src/github.com-1ecc6299db9ec823/gfx-memory-0.1.1/src/heaps/mod.rs:168
20 myapp gfx_memory::heaps::Heaps$LT$B$GT$::allocate::h87c66e912681c07a /Users/ln/.cargo/registry/src/github.com-1ecc6299db9ec823/gfx-memory-0.1.1/src/heaps/mod.rs:137
21 myapp wgpu_core::device::Device$LT$B$GT$::create_buffer::h40f8e83dda14e0f2 /Users/ln/.cargo/git/checkouts/wgpu-53e70f8674b08dd4/05ba7a5/wgpu-core/src/device/mod.rs:328
22 myapp wgpu_core::device::_$LT$impl$u20$wgpu_core..hub..Global$LT$G$GT$$GT$::device_create_buffer_mapped::h93275aab33da2756 /Users/ln/.cargo/git/checkouts/wgpu-53e70f8674b08dd4/05ba7a5/wgpu-core/src/device/mod.rs:554
23 myapp wgpu_device_create_buffer_mapped /Users/ln/.cargo/git/checkouts/wgpu-53e70f8674b08dd4/05ba7a5/wgpu-native/src/device.rs:215
24 myapp wgpu::Device::create_buffer_mapped::h9cd4b0ad4b3c5391 /Users/ln/.cargo/git/checkouts/wgpu-rs-40ea39809c03c5d8/e198d63/src/lib.rs:946
25 myapp wgpu::Device::create_buffer_with_data::h2645df23584f5aa9 /Users/ln/.cargo/git/checkouts/wgpu-rs-40ea39809c03c5d8/e198d63/src/lib.rs:964
One more piece of data, the problematic call from my code is creating the vertex buffer. I dynamically create a vec of vertex data every frame, then create the wgpu buffer from that. The byte array passed to create_buffer_with_data is in the range of 1.16-1.24 mb every frame consistently. That's still true of the last call into create_buffer_with_data which then panics (so it's not my code suddenly generating some crazy buffer size or something).
Thank you for the info!
Sounds like a basic thing of not properly destroying the buffers that are created with mapping.
Could be a regression from https://github.com/gfx-rs/wgpu/pull/547
I added more logging and tested our skybox example on macos. It appears so far that all the tracking and freeing works as expected (this fragment repeats every frame):
[2020-04-04T23:31:48Z TRACE gfx_memory::heaps] Allocate memory block: type '1', kind 'Linear', size: '256', align: '256'
[2020-04-04T23:31:48Z TRACE gfx_memory::heaps] Free memory block: type '1', size: '256'
Could you share your test case with us to test?
It's not really easily isolated as a test case right now. I'm trying to figure out when it triggers. It's absolutely not every frame (like one in thousands of frames) even when I can get it to trigger. It's somehow linked to exactly what is being rendered on each frame - and I think it needs to be significant buffer sizes (> 1MB from my testing so far). When I simply render nothing, but create the vertex buffer, but clip its size to under a megabyte, I can't reproduce it. But even then it's not obvious when the problem occurs to me yet. I'll turn on trace logging for gfx_memory and see what I can see in the log.
If it's not happening consistently, are you sure this is a leak? versus, say memory fragmentation.
It's consistent in that I can get it to fail every time, just not consistent timing. There are certainly allocations from the Instruments logging that are not getting freed. It's just very few of them, but each of them are 10MB, so after I have about 100 of them I'm hitting a memory cap somewhere.
I've got a VERY large trace log from gfx_memory, but finding any problematic entries is difficult.
Right near the beginning I get:
[2020-04-04T23:51:55Z WARN gfx_memory::heaps] Unable to allocate 4194304 with Linear: TooManyObjects
Not sure whether that's a red herring or not (it occurs way before anything starts not getting deallocated).
no, that should be ok. Linear allocator doesn't handle arbitrary sizes, and we are just falling back to a dedicated allocation.
Please try to share your work in some way that I could test it.
Sorry, I really appreciate the attempt to help here. Let me see if I can minimize the reproducible case. I'm trying to avoid giving you 6k lines of code.
OK. I think I found reproducible code for my problem.
Am I doing something dumb here?
This panics in under 5secs on my machine.
Note that the variable size vertex buffer seem to matter here:
let vertices = vec![0u8; 2_000_000 + (count % 500) * 1000];
If I do a constant allocation for vertices, it doesn't reproduce the problem:
let vertices = vec![0u8; 2_000_000];
cargo.toml
[package]
name = "memleak"
version = "0.1.0"
authors = ["lordnoriyuki <[email protected]>"]
edition = "2018"
[dependencies]
winit = "0.22"
wgpu = { git = "https://github.com/gfx-rs/wgpu-rs" }
futures = "0.3"
main.rs
use winit;
fn main() {
let winit_event_loop = winit::event_loop::EventLoop::new();
let logical_size = winit::dpi::LogicalSize::new(400, 400);
let window = winit::window::WindowBuilder::new()
.with_inner_size(logical_size)
.build(&winit_event_loop).unwrap();
let surface = wgpu::Surface::create(&window);
let (adapter, device, queue) = futures::executor::block_on(create_device());
let mut swap_chain = device.create_swap_chain(
&surface,
&wgpu::SwapChainDescriptor {
usage: wgpu::TextureUsage::OUTPUT_ATTACHMENT,
format: wgpu::TextureFormat::Bgra8Unorm,
width: 400,
height: 400,
present_mode: wgpu::PresentMode::Immediate,
},
);
let mut count = 0;
winit_event_loop.run(move |event, _, control_flow| {
*control_flow = winit::event_loop::ControlFlow::Poll;
match event {
winit::event::Event::MainEventsCleared => {
*control_flow = winit::event_loop::ControlFlow::Poll;
window.request_redraw();
},
winit::event::Event::RedrawRequested(_) => {
let frame = swap_chain.get_next_texture().unwrap();
let mut encoder = device.create_command_encoder(&wgpu::CommandEncoderDescriptor { label: None });
let vertices = vec![0u8; 2_000_000 + (count % 500) * 1000];
device.create_buffer_with_data(&vertices, wgpu::BufferUsage::VERTEX);
encoder.begin_render_pass(&wgpu::RenderPassDescriptor {
color_attachments: &[wgpu::RenderPassColorAttachmentDescriptor {
attachment: &frame.view,
resolve_target: None,
load_op: wgpu::LoadOp::Clear,
store_op: wgpu::StoreOp::Store,
clear_color: wgpu::Color { r: 0.0, g: 0.0, b: 0.5, a: 1.0 },
}],
depth_stencil_attachment: None,
});
queue.submit(&[encoder.finish()]);
count += 1;
},
winit::event::Event::WindowEvent { event, .. } => match event {
winit::event::WindowEvent::CloseRequested => {
*control_flow = winit::event_loop::ControlFlow::Exit
}
_ => {}
},
_ => {}
};
})
}
async fn create_device() -> (wgpu::Adapter, wgpu::Device, wgpu::Queue) {
let adapter = wgpu::Adapter::request(&wgpu::RequestAdapterOptions {
power_preference: wgpu::PowerPreference::HighPerformance,
compatible_surface: None,
}, wgpu::BackendBit::PRIMARY).await.unwrap();
let (device, queue) = adapter.request_device(&wgpu::DeviceDescriptor {
extensions: wgpu::Extensions {
anisotropic_filtering: false,
},
limits: wgpu::Limits::default(),
}).await;
(adapter, device, queue)
}
Oh interesting, you are just creating a buffer that is being dropped! I can test this for sure, thank you 馃憤
I tried creating 16Mb buffers in the cube example this way, and it appears that we are freeing all of them correctly. Could you push your repro code in a branch for me to check out?
Code is here:
It was this commit to wgpu-rs:
https://github.com/gfx-rs/wgpu-rs/commit/31e80d99b3f06a711a8ee7cc9474016a77c64590
Which pulled in latest wgpu. The commit before this doesn't panic on my machine.
Looking at that, the wgpu commit when it worked was https://github.com/gfx-rs/wgpu/commit/08e8d406c175579da5ef18c1abf4d6c00e2a9726. Between then and the current wgpu commit (https://github.com/gfx-rs/wgpu/commit/306554600ab7479ec3e54d0c076c71f02474237a) used in wgpu-rs, I'm guessing it's likely it was the "Port to gfx-extras and gfx-hal-0.5" changes. Only other one that looks like it could be related was "Recycled identity management (#533)".
Was running your testcase, and I think I got a lead: we never end up calling free_chunk. So the problem is definitely in gfx-memory.
Thanks to your test case, it was not too difficult to track down.
Now fixed upstream in https://github.com/gfx-rs/gfx-extras/pull/8
Please pick up the new version by doing cargo update -p gfx-memory.
Confirmed fixes it in the test case, and in my original code. Thank you very much!
Most helpful comment
Thanks to your test case, it was not too difficult to track down.
Now fixed upstream in https://github.com/gfx-rs/gfx-extras/pull/8
Please pick up the new version by doing
cargo update -p gfx-memory.