Rust: Under certain circumstances, Rust segfaults when using compiler_builtins memset due to stack misalignment

Created on 18 Mar 2019 · 4 Comments · Source: rust-lang/rust

Rust generates a call to memset when zeroing stack objects, as is reasonable. Normally, this would go to libc, but in no_std programs, it's usually done with compiler_builtins. However, because compiler_builtins's memset is in the same binary and LLVM "knows" what it does, LLVM inlines it and, on x86_64, optimizes it to movaps instructions. movaps requires 16-byte alignment. Rust does not 16-byte align the stack. As a result, under these circumstances, Rust will segfault if it needs to zero a stack object that by chance isn't 16-byte aligned.

When compiled with nightly-x86_64-unknown-linux-gnu rustc 1.35.0-nightly (2c8bbf50d 2019-03-16), the following program will trigger the bug.

#![no_std]
#![no_main]
#![feature(lang_items)]
#![feature(compiler_builtins_lib)] // Forces us to use rustc -O.

use core::ptr::write_volatile;

struct SomeStruct ([u8; 64]);

#[no_mangle]
extern "C" fn _start() { // This is the interesting bit.
    let mut instance = SomeStruct([0; 64]);
    unsafe {
        // The volatile write stops LLVM from optimizing everything away under -O.
        write_volatile::<[u8; 64]>(instance.0.as_mut_ptr() as *mut [u8; 64], [0; 64]);
    }
    loop {};
}

// Standard stubs for no_std.

use core::panic::PanicInfo;

#[panic_handler]
fn rust_panic(_info: &PanicInfo) -> ! {
    loop {};
}

#[lang = "eh_personality"]
fn eh_personality() {
    loop {};
}

The following assembly is produced on x86_64 when compiling with rustc src/main.rs -O -Z pre-link-arg=-nostartfiles (the instruction pointer is on the instruction where SIGSEGV occurs):

(gdb) disassemble
Dump of assembler code for function _start:
   0x0000555555555000 <+0>:     sub    $0x88,%rsp
   0x0000555555555007 <+7>:     xorps  %xmm0,%xmm0
=> 0x000055555555500a <+10>:    movaps %xmm0,(%rsp)
   0x000055555555500e <+14>:    movaps %xmm0,0x10(%rsp)
   0x0000555555555013 <+19>:    movaps %xmm0,0x20(%rsp)
   0x0000555555555018 <+24>:    movaps %xmm0,0x30(%rsp)
   0x000055555555501d <+29>:    movaps (%rsp),%xmm0
   0x0000555555555021 <+33>:    movaps %xmm0,0x40(%rsp)
   0x0000555555555026 <+38>:    movaps 0x10(%rsp),%xmm0
   0x000055555555502b <+43>:    movaps %xmm0,0x50(%rsp)
   0x0000555555555030 <+48>:    movaps 0x20(%rsp),%xmm0
   0x0000555555555035 <+53>:    movaps %xmm0,0x60(%rsp)
   0x000055555555503a <+58>:    movaps 0x30(%rsp),%xmm0
   0x000055555555503f <+63>:    movaps %xmm0,0x70(%rsp)
   0x0000555555555044 <+68>:    nopw   %cs:0x0(%rax,%rax,1)
   0x000055555555504e <+78>:    xchg   %ax,%ax
   0x0000555555555050 <+80>:    jmp    0x555555555050 <_start+80>
End of assembler dump.

Possible fixes include reporting the issue to LLVM and awaiting their fix, somehow convincing LLVM not to make this optimization, aligning the stack if we are going to use memset, or somehow avoiding memset entirely.

All 4 comments

I should note that prior testing has shown that the kernel provides a 16-byte aligned stack, but Rust doesn't preserve that alignment.

AFAIK, on entry to a function, rsp is not supposed to be a multiple of 16, but a multiple of 16 plus 8. This is because the caller function is expected to have rsp aligned before executing the call instruction, which subtracts 8. Thus sub $0x88, %rsp is meant to bring rsp back into alignment. I guess the kernel does not follow this rule when jumping to the start routine, so you should probably define _start in assembly instead and have it fix up rsp before calling into Rust code.
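
The arithmetic in this comment can be checked with a small model. The sketch below is only an illustration: the two entry addresses are hypothetical, and the helper names are not from the issue. The frame size 0x88 is taken from the `sub $0x88,%rsp` in the disassembly, and the 16-byte requirement is that of `movaps`.

```rust
/// `movaps` faults unless its memory operand is 16-byte aligned.
const MOVAPS_ALIGN: u64 = 16;
/// Frame size from `sub $0x88,%rsp` in the disassembly of `_start`.
const FRAME_SIZE: u64 = 0x88;

/// Value of rsp after the prologue has reserved the frame.
fn rsp_after_prologue(rsp_at_entry: u64) -> u64 {
    rsp_at_entry - FRAME_SIZE
}

/// True if `movaps %xmm0,(addr)` would not fault on alignment grounds.
fn is_movaps_safe(addr: u64) -> bool {
    addr % MOVAPS_ALIGN == 0
}

fn main() {
    // Hypothetical addresses, chosen only to show the two cases.
    // Entered via `call`: rsp is a multiple of 16 plus 8.
    let entered_via_call: u64 = 0x7fff_ffff_e008;
    // Entered directly from the kernel: rsp is a multiple of 16.
    let entered_from_kernel: u64 = 0x7fff_ffff_e010;

    for (label, entry) in [
        ("via call", entered_via_call),
        ("from kernel", entered_from_kernel),
    ] {
        let rsp = rsp_after_prologue(entry);
        println!(
            "{label}: rsp after prologue = {rsp:#x}, movaps safe = {}",
            is_movaps_safe(rsp)
        );
    }

    // 0x88 is 8 mod 16, so the prologue only restores alignment
    // when the entry value was itself 8 mod 16.
    assert!(is_movaps_safe(rsp_after_prologue(entered_via_call)));
    assert!(!is_movaps_safe(rsp_after_prologue(entered_from_kernel)));
}
```

Because 0x88 leaves a remainder of 8 when divided by 16, the prologue is correct for a function reached through `call` and off by 8 for one the kernel jumps to directly, which matches the faulting `movaps %xmm0,(%rsp)`.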

What @comex said. The _start for Linux ELF executables does not follow the conventional calling convention and therefore assembly is necessary to implement it. Namely it is expected that _start looks like this:

xor    %ebp,%ebp
and    $0xfffffff0,%esp
; perhaps initialize the other registers to something as well
call     <actual_entry_point>

Closing as a not-a-bug.
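
One caveat for anyone adapting the snippet above: it appears to be the 32-bit form. On x86_64, writing `%esp` zero-extends into `%rsp`, so the mask would presumably need to be applied to the full register instead (for example `and $-16, %rsp`). The sketch below models both variants as plain integer arithmetic; the function names and the sample address are hypothetical and only illustrate the masking, they are not an implementation of `_start`.

```rust
/// Round `rsp` down to a multiple of 16, modelling `and $-16, %rsp`.
fn align_down_16(rsp: u64) -> u64 {
    rsp & !0xf
}

/// Models `and $0xfffffff0, %esp` executed in 64-bit mode: a write to a
/// 32-bit register clears the upper 32 bits of the 64-bit register.
fn and_esp_in_long_mode(rsp: u64) -> u64 {
    u64::from(rsp as u32 & 0xffff_fff0)
}

/// Models the `call` that follows: it pushes an 8-byte return address.
fn rsp_after_call(rsp: u64) -> u64 {
    rsp - 8
}

fn main() {
    // Hypothetical stack pointer above the 4 GiB boundary.
    let rsp: u64 = 0x7fff_ffff_e018;

    let aligned = align_down_16(rsp);
    let truncated = and_esp_in_long_mode(rsp);
    println!("64-bit mask: {aligned:#x}");
    println!("32-bit mask: {truncated:#x}");

    // The 64-bit mask keeps the address and only clears the low 4 bits.
    assert_eq!(aligned, 0x7fff_ffff_e010);
    // The 32-bit form aligns too, but drops the upper half of the address.
    assert_eq!(truncated, 0xffff_e010);
    // After the call, the callee sees rsp as a multiple of 16 plus 8,
    // which is what a compiled function prologue expects.
    assert_eq!(rsp_after_call(aligned) % 16, 8);
}
```

The last assertion is the point of the trampoline: aligning first and then executing `call` leaves the callee with the entry state the calling convention promises.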

@comex Ah. I suppose I misread the calling convention. In my larger tests, I had an assembly _start, but it had the stack aligned after the call. I'll switch that around and hopefully that'll work.
