Rust: Segfault using `vec!` when running under valgrind

Created on 19 Mar 2018 · 16 Comments · Source: rust-lang/rust

Allocations bigger than 0x2000 with vec! crash with a segfault when running under valgrind.

fn main() {
    let mut data = vec![0; 0x2001]; // while 0x2000 runs fine
}

This was first noticed and reported as an issue to image.

Meta

rustc - 1.26.0-nightly


rustc 1.26.0-nightly (adf2135ad 2018-03-17)
binary: rustc
commit-hash: adf2135adc4a65a78ba053f04c29d7fe0468eb87
commit-date: 2018-03-17
host: x86_64-unknown-linux-gnu
release: 1.26.0-nightly
LLVM version: 6.0

valgrind - 3.13.0

valgrind-3.13.0-16445:16446-vex-3396

Full backtrace



$ valgrind target/release/hello_world

==10398== Memcheck, a memory error detector

==10398== Copyright (C) 2002-2017, and GNU GPL'd, by Julian Seward et al.

==10398== Using Valgrind-3.13.0 and LibVEX; rerun with -h for copyright info

==10398== Command: target/release/hello_world

==10398==

==10398== Invalid read of size 8

==10398==    at 0x129DD7: je_arena_sdalloc (arena.h:1516)

==10398==    by 0x129DD7: je_isdalloct (jemalloc_internal.h:1195)

==10398==    by 0x129DD7: je_isqalloc (jemalloc_internal.h:1205)

==10398==    by 0x129DD7: isfree (jemalloc.c:1921)

==10398==    by 0x129DD7: sdallocx (jemalloc.c:2669)

==10398==    by 0x10F24E: hello_world::main (in /home/andreas/code/sandbox/image_issue/target/release/hello_world)

==10398==    by 0x10F1F2: std::rt::lang_start::{{closure}} (in /home/andreas/code/sandbox/image_issue/target/release/hello_world)

==10398==    by 0x110FE7: {{closure}} (rt.rs:59)

==10398==    by 0x110FE7: _ZN3std9panicking3try7do_call17h64f9d8eb811b164fE.llvm.10326519180759561980 (panicking.rs:306)

==10398==    by 0x121C0E: __rust_maybe_catch_panic (lib.rs:102)

==10398==    by 0x110205: try (panicking.rs:285)

==10398==    by 0x110205: catch_unwind (panic.rs:361)

==10398==    by 0x110205: std::rt::lang_start_internal (rt.rs:58)

==10398==    by 0x10F2A3: main (in /home/andreas/code/sandbox/image_issue/target/release/hello_world)

==10398==  Address 0x5e00000 is in a rwx anonymous segment

==10398==

==10398== Invalid read of size 4

==10398==    at 0x524E9E0: pthread_mutex_lock (in /usr/lib/libpthread-2.26.so)

==10398==    by 0x1336F5: je_malloc_mutex_lock (mutex.h:101)

==10398==    by 0x1336F5: je_arena_dalloc_large (arena.c:3075)

==10398==    by 0x10F24E: hello_world::main (in /home/andreas/code/sandbox/image_issue/target/release/hello_world)

==10398==    by 0x10F1F2: std::rt::lang_start::{{closure}} (in /home/andreas/code/sandbox/image_issue/target/release/hello_world)

==10398==    by 0x110FE7: {{closure}} (rt.rs:59)

==10398==    by 0x110FE7: _ZN3std9panicking3try7do_call17h64f9d8eb811b164fE.llvm.10326519180759561980 (panicking.rs:306)

==10398==    by 0x121C0E: __rust_maybe_catch_panic (lib.rs:102)

==10398==    by 0x110205: try (panicking.rs:285)

==10398==    by 0x110205: catch_unwind (panic.rs:361)

==10398==    by 0x110205: std::rt::lang_start_internal (rt.rs:58)

==10398==    by 0x10F2A3: main (in /home/andreas/code/sandbox/image_issue/target/release/hello_world)

==10398==  Address 0x400000 is not stack'd, malloc'd or (recently) free'd

==10398==

==10398==

==10398== Process terminating with default action of signal 11 (SIGSEGV): dumping core

==10398==  Access not within mapped region at address 0x400000

==10398==    at 0x524E9E0: pthread_mutex_lock (in /usr/lib/libpthread-2.26.so)

==10398==    by 0x1336F5: je_malloc_mutex_lock (mutex.h:101)

==10398==    by 0x1336F5: je_arena_dalloc_large (arena.c:3075)

==10398==    by 0x10F24E: hello_world::main (in /home/andreas/code/sandbox/image_issue/target/release/hello_world)

==10398==    by 0x10F1F2: std::rt::lang_start::{{closure}} (in /home/andreas/code/sandbox/image_issue/target/release/hello_world)

==10398==    by 0x110FE7: {{closure}} (rt.rs:59)

==10398==    by 0x110FE7: _ZN3std9panicking3try7do_call17h64f9d8eb811b164fE.llvm.10326519180759561980 (panicking.rs:306)

==10398==    by 0x121C0E: __rust_maybe_catch_panic (lib.rs:102)

==10398==    by 0x110205: try (panicking.rs:285)

==10398==    by 0x110205: catch_unwind (panic.rs:361)

==10398==    by 0x110205: std::rt::lang_start_internal (rt.rs:58)

==10398==    by 0x10F2A3: main (in /home/andreas/code/sandbox/image_issue/target/release/hello_world)

==10398==  If you believe this happened as a result of a stack

==10398==  overflow in your program's main thread (unlikely but

==10398==  possible), you can try to increase the size of the

==10398==  main thread stack using the --main-stacksize= flag.

==10398==  The main thread stack size used in this run was 8388608.

==10398==

==10398== HEAP SUMMARY:

==10398==     in use at exit: 32,804 bytes in 2 blocks

==10398==   total heap usage: 7 allocs, 5 frees, 34,772 bytes allocated

==10398==

==10398== LEAK SUMMARY:

==10398==    definitely lost: 32,772 bytes in 1 blocks

==10398==    indirectly lost: 0 bytes in 0 blocks

==10398==      possibly lost: 0 bytes in 0 blocks

==10398==    still reachable: 32 bytes in 1 blocks

==10398==         suppressed: 0 bytes in 0 blocks

==10398== Rerun with --leak-check=full to see details of leaked memory

==10398==

==10398== For counts of detected and suppressed errors, rerun with: -v

==10398== ERROR SUMMARY: 3 errors from 2 contexts (suppressed: 0 from 0)

Segmentation fault (core dumped)

C-bug I-unsound 💥

HeroicKatora

All 16 comments

Another snippet that doesn't crash:

fn main() {
    let mut data = vec![1; 0x2001];
    // or Vec::with_capacity(0x2001)
}

If I read the source correctly, that means the offending part is located within RawVec::with_capacity_zeroed https://github.com/rust-lang/rust/blob/63739ab7b210c1a8c890c2ea5238a3284877daa3/src/liballoc/vec.rs#L1460-L1465

HeroicKatora on 19 Mar 2018
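To make the difference between the two cases concrete, here is a minimal sketch of the two allocation paths: a request for already-zeroed memory (what the zero-filled `vec!` ends up asking for, per the reading of the source above) versus a plain allocation that is then filled by hand. It assumes the stable `std::alloc` functions, which postdate this thread; the helper names `zeroed_path` and `filled_path` are made up for illustration and are not part of liballoc.

```rust
use std::alloc::{alloc, alloc_zeroed, dealloc, Layout};

// Hypothetical helper: request zeroed memory directly from the allocator
// and check that every byte really is zero.
fn zeroed_path(len: usize) -> bool {
    assert!(len > 0, "zero-sized allocations are not allowed here");
    let layout = Layout::array::<i32>(len).expect("layout overflow");
    unsafe {
        let ptr = alloc_zeroed(layout);
        assert!(!ptr.is_null(), "allocation failed");
        let all_zero = std::slice::from_raw_parts(ptr, layout.size())
            .iter()
            .all(|&b| b == 0);
        dealloc(ptr, layout);
        all_zero
    }
}

// Hypothetical helper: request uninitialized memory, then write the fill
// value into every element, as a non-zero fill has to do.
fn filled_path(len: usize, value: i32) -> bool {
    assert!(len > 0, "zero-sized allocations are not allowed here");
    let layout = Layout::array::<i32>(len).expect("layout overflow");
    unsafe {
        let ptr = alloc(layout) as *mut i32;
        assert!(!ptr.is_null(), "allocation failed");
        for i in 0..len {
            ptr.add(i).write(value);
        }
        let all_set = std::slice::from_raw_parts(ptr, len)
            .iter()
            .all(|&x| x == value);
        dealloc(ptr as *mut u8, layout);
        all_set
    }
}

fn main() {
    println!("zeroed path ok: {}", zeroed_path(0x2001));
    println!("filled path ok: {}", filled_path(0x2001, 1));
}
```

Only the first helper asks the allocator for zeroed memory, which is the path the crash seems to depend on.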

Does valgrind work properly with jemalloc?

Using the system allocator, this doesn't crash:

#![feature(alloc_system, global_allocator, allocator_api)]

extern crate alloc_system;

use alloc_system::System;

#[global_allocator]
static A: System = System;

fn main() {
    let mut data = vec![0; 0x2001];
}

crumblingstatue on 20 Mar 2018
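The feature gates in the snippet above were nightly-only at the time. A minimal sketch of the same experiment, assuming a toolchain where `std::alloc::System` and `#[global_allocator]` are stable (Rust 1.28 or later); the helper name `zeroed_buffer` is made up for illustration:

```rust
use std::alloc::System;

// Route every heap allocation in this binary through the platform allocator
// instead of a bundled jemalloc.
#[global_allocator]
static A: System = System;

// Hypothetical helper so the allocation is used rather than dropped unseen.
fn zeroed_buffer(len: usize) -> Vec<i32> {
    vec![0; len]
}

fn main() {
    let data = zeroed_buffer(0x2001);
    println!("allocated {} zeroed elements", data.len());
}
```

On toolchains released after the jemalloc removal mentioned at the end of this thread, the system allocator is already the default, so the attribute is harmless but redundant there.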

It likely messes with it somehow. Curiously, valgrind reports 32,804 bytes in 2 blocks, but we have 0x2001 * 4 = 32,772 bytes, and that should be a single block. Where do the remaining bytes come from? Does valgrind's allocator prepend or append trace data?

HeroicKatora on 20 Mar 2018
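The arithmetic can be checked against the log in the report. The sketch below assumes the elements are `i32` (the default integer type for `vec![0; 0x2001]`) and takes the 32,804 figure from the HEAP SUMMARY above; `payload_bytes` is a made-up helper name.

```rust
use std::mem::size_of;

// Hypothetical helper: bytes needed for `len` elements of i32.
fn payload_bytes(len: usize) -> usize {
    len * size_of::<i32>()
}

fn main() {
    let payload = payload_bytes(0x2001);
    // "in use at exit" as reported by valgrind in the log above.
    let in_use_at_exit: usize = 32_804;
    let remainder = in_use_at_exit - payload;
    println!("vector payload: {} bytes", payload);
    println!("unexplained remainder: {} bytes", remainder);
}
```

The remainder comes out to 32 bytes, which matches the "still reachable: 32 bytes in 1 blocks" line of the LEAK SUMMARY, while "definitely lost: 32,772 bytes in 1 blocks" matches the vector itself. So the second block looks like a separate small allocation rather than trace data attached to the vector, although the log does not say what it is.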

jemalloc does not support valgrind in the sense that its allocations will not be tracked, but I've never seen Valgrind cause it to crash.

sfackler on 20 Mar 2018

There is some code in jemalloc which explicitly works together with Valgrind, informing it about allocations as well as undefined and defined sections (VALGRIND_MALLOCLIKE_BLOCK). Another macro (VALGRIND_FREELIKE_BLOCK) gets called when the block is freed. This is invoked for every one of its internal allocation functions, malloc and calloc alike, for example je_calloc.

Then why are the statistics given by valgrind (see the above comment) not what I'd expect? Notice how valgrind reports only 6 allocations when running with let mut data: Vec<i32> = Vec::with_capacity(0x2001); but reports 7 before the crash with the given code. Somehow that seems very dubious.

HeroicKatora on 20 Mar 2018

@HeroicKatora we don't enable valgrind support when building jemalloc.

sfackler on 20 Mar 2018

jemalloc does not work and it does not intend to work with valgrind; valgrind support was completely removed from jemalloc in version 5.0 (https://github.com/jemalloc/jemalloc/issues/369).

@HeroicKatora valgrind needs a special allocator to work properly, and it appears that it segfaults if the wrong allocator is used. Therefore, valgrind should detect whether the allocator is appropriate and, if it isn't, either emit a nice error message or work around this somehow. There is nothing we can do about it from the Rust side, so I think you should report this to valgrind upstream, and @Centril can just close this issue here. (BTW @Centril, how is this unsound? AFAICT it is just a crash. It would be cool if we had a flag to distinguish between unsound stable Rust issues and unsound nightly Rust issues, since the unsound flag is being used a lot.)

gnzlbg on 3 Jul 2018

@HeroicKatora can you try with jemallocator? That is, by adding this to your Cargo.toml:

[dependencies]
jemallocator = "0.1.9"

and this to your main file:

extern crate jemallocator;

#[global_allocator]
static ALLOC: jemallocator::Jemalloc = jemallocator::Jemalloc;

I can reproduce this on linux with rust's jemalloc, but I can't reproduce this with jemallocator.

I also get a completely different error on MacOSX with both rust's jemalloc and jemallocator.

I have dumps of all outputs in this (gist).

gnzlbg on 5 Jul 2018

I've also tried enabling the alloc_trait feature in jemallocator, just in case this issue is triggered by vec! calling some of the non-std jemalloc APIs, which are not used if Jemallocator does not implement the Alloc trait (IIUC how all of this works, cc @SimonSapin).

gnzlbg on 5 Jul 2018

The alloc_trait feature does not make any difference as far as #[global_allocator] is concerned.

I can reproduce this on linux with rust's jemalloc, but I can't reproduce this with jemallocator.

They are different versions of jemalloc, for what it’s worth.

SimonSapin on 5 Jul 2018

They are different versions of jemalloc, for what it’s worth.

Yeah, I forgot to mention that (rustc uses jemalloc 4.5, and jemallocator uses jemalloc 5.1).

@SimonSapin my point was that the impl of Alloc for Global implements everything on top of a couple of methods, and that might mean that through Alloc only the parts of the malloc-API of jemalloc get called. If valgrind intercepts those correctly, then things might work fine. The moment the Alloc impl is overridden to use xallocx or whatnot, if valgrind does not intercept those correctly, then you are allocating/freeing through valgrind, but reallocating through jemalloc directly, and that might cause issues.

gnzlbg on 5 Jul 2018

@gnzlbg I cannot reproduce this with jemallocator either.

Also, if the allocator integration were fully disabled, would that not mean that valgrind simply observes the effects of the underlying allocator? I don't see why it would falsely detect uninitialized reads in these cases. In any case, since rust still uses a version of jemalloc from before the complete removal, I suspect this might instead be related to one of the reasons motivating the removal of the integration, namely that it was not reliable in all system configurations.

HeroicKatora on 5 Jul 2018

@gnzlbg Currently the behavior of Global is not specialized based on whether the static under #[global_allocator] also implements the Alloc trait. And I suspect it won’t ever be, rather we’d add more methods to GlobalAlloc with defaults that call the existing methods.

As to what is or isn’t a valid use of valgrind + jemalloc together, I don’t know anything so I can’t help there.

SimonSapin on 5 Jul 2018
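The "defaults that call the existing methods" design can be observed on the stable `GlobalAlloc` trait. The sketch below is an illustration only: `CountingAlloc` and `count_default_calls` are made-up names, the wrapper forwards to `std::alloc::System`, and it is called directly instead of being installed with `#[global_allocator]` so that the counts are deterministic. It overrides just `alloc` and `dealloc`, so `alloc_zeroed` and `realloc` fall back to the trait's provided implementations.

```rust
use std::alloc::{GlobalAlloc, Layout, System};
use std::sync::atomic::{AtomicUsize, Ordering};

// Hypothetical wrapper that counts calls to the two required methods.
struct CountingAlloc {
    allocs: AtomicUsize,
    deallocs: AtomicUsize,
}

unsafe impl GlobalAlloc for CountingAlloc {
    unsafe fn alloc(&self, layout: Layout) -> *mut u8 {
        self.allocs.fetch_add(1, Ordering::Relaxed);
        unsafe { System.alloc(layout) }
    }

    unsafe fn dealloc(&self, ptr: *mut u8, layout: Layout) {
        self.deallocs.fetch_add(1, Ordering::Relaxed);
        unsafe { System.dealloc(ptr, layout) }
    }
    // alloc_zeroed and realloc are deliberately not overridden.
}

// Returns (number of alloc calls, number of dealloc calls) observed.
fn count_default_calls() -> (usize, usize) {
    let a = CountingAlloc {
        allocs: AtomicUsize::new(0),
        deallocs: AtomicUsize::new(0),
    };
    let layout = Layout::array::<i32>(0x2001).expect("layout overflow");
    let grown = Layout::from_size_align(layout.size() * 2, layout.align())
        .expect("layout overflow");
    unsafe {
        // Provided alloc_zeroed: calls alloc, then zeroes the block.
        let p = a.alloc_zeroed(layout);
        assert!(!p.is_null(), "allocation failed");
        // Provided realloc: calls alloc, copies, then calls dealloc.
        let p = a.realloc(p, layout, grown.size());
        assert!(!p.is_null(), "reallocation failed");
        a.dealloc(p, grown);
    }
    (
        a.allocs.load(Ordering::Relaxed),
        a.deallocs.load(Ordering::Relaxed),
    )
}

fn main() {
    let (allocs, deallocs) = count_default_calls();
    println!("alloc calls: {}, dealloc calls: {}", allocs, deallocs);
}
```

An allocator that does override `realloc` or `alloc_zeroed` takes a different path for those calls, which is the situation the preceding comments worry about when a tool intercepts only part of the API.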

Jemalloc was removed in https://github.com/rust-lang/rust/pull/55238, so I'm going to close this.

alexcrichton on 3 Nov 2018

The jemallocator crate should work fine with valgrind on the supported platforms.

gnzlbg on 3 Nov 2018

@gnzlbg is there any chance I would be accidentally using jemalloc with this version of Rust?

https://github.com/PistonDevelopers/image/issues/719#issuecomment-468258882 Comment reproduced below:

I ran into this last night/today/whatever

rustc 1.34.0-nightly (633d75ac1 2019-02-21)

Changing any vec![0u8; N] and vec.resize(N, 0u8) calls to vec![1u8; N] and vec.resize(N, 1u8), followed by writing all the zeroes in a for loop, made the crash stop.
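A minimal sketch of the workaround described in that comment, assuming the goal is only to avoid the zero-fill special case of `vec!`; the helper name `zeroed_without_zero_fill` is made up for illustration. An optimizing compiler is free to turn this back into a zeroed allocation, so this is not a guaranteed way to steer clear of that path.

```rust
// Hypothetical helper: build a zeroed buffer without using a zero fill
// value in vec!, then overwrite every byte with 0 in a loop.
fn zeroed_without_zero_fill(len: usize) -> Vec<u8> {
    let mut buf = vec![1u8; len];
    for byte in buf.iter_mut() {
        *byte = 0;
    }
    buf
}

fn main() {
    let buf = zeroed_without_zero_fill(0x2001);
    println!("len = {}, all zero = {}", buf.len(), buf.iter().all(|&b| b == 0));
}
```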