Rust generates a call to memset when zeroing stack objects, as is reasonable. Normally, this would go to libc, but in no_std programs, it's usually done with compiler_builtins. However, because compiler_builtins's memset is in the same binary and LLVM "knows" what it does, LLVM inlines it and, on x86_64, optimizes it to movaps instructions. movaps requires 16-byte alignment. Rust does not 16-byte align the stack. As a result, under these circumstances, Rust will segfault if it needs to zero a stack object that by chance isn't 16-byte aligned.
When compiled with nightly-x86_64-unknown-linux-gnu rustc 1.35.0-nightly (2c8bbf50d 2019-03-16), the following program will trigger the bug.
#![no_std]
#![no_main]
#![feature(lang_items)]
#![feature(compiler_builtins_lib)] // Forces us to use rustc -O.

use core::ptr::write_volatile;

struct SomeStruct([u8; 64]);

#[no_mangle]
extern "C" fn _start() {
    // This is the interesting bit.
    let mut instance = SomeStruct([0; 64]);
    unsafe {
        // Stop LLVM from optimizing out everything due to -O.
        write_volatile::<[u8; 64]>(instance.0.as_mut_ptr() as *mut [u8; 64], [0; 64]);
    }
    loop {}
}

// Standard stubs for no_std.
use core::panic::PanicInfo;

#[panic_handler]
fn rust_panic(_info: &PanicInfo) -> ! {
    loop {}
}

#[lang = "eh_personality"]
fn eh_personality() {
    loop {}
}
The following assembly is produced on x86_64 when compiling with rustc src/main.rs -O -Z pre-link-arg=-nostartfiles (the instruction pointer is on the instruction where SIGSEGV occurs):
(gdb) disassemble
Dump of assembler code for function _start:
0x0000555555555000 <+0>: sub $0x88,%rsp
0x0000555555555007 <+7>: xorps %xmm0,%xmm0
=> 0x000055555555500a <+10>: movaps %xmm0,(%rsp)
0x000055555555500e <+14>: movaps %xmm0,0x10(%rsp)
0x0000555555555013 <+19>: movaps %xmm0,0x20(%rsp)
0x0000555555555018 <+24>: movaps %xmm0,0x30(%rsp)
0x000055555555501d <+29>: movaps (%rsp),%xmm0
0x0000555555555021 <+33>: movaps %xmm0,0x40(%rsp)
0x0000555555555026 <+38>: movaps 0x10(%rsp),%xmm0
0x000055555555502b <+43>: movaps %xmm0,0x50(%rsp)
0x0000555555555030 <+48>: movaps 0x20(%rsp),%xmm0
0x0000555555555035 <+53>: movaps %xmm0,0x60(%rsp)
0x000055555555503a <+58>: movaps 0x30(%rsp),%xmm0
0x000055555555503f <+63>: movaps %xmm0,0x70(%rsp)
0x0000555555555044 <+68>: nopw %cs:0x0(%rax,%rax,1)
0x000055555555504e <+78>: xchg %ax,%ax
0x0000555555555050 <+80>: jmp 0x555555555050 <_start+80>
End of assembler dump.
Possible fixes include reporting the issue to LLVM and awaiting their fix, somehow convincing LLVM not to make this optimization, aligning the stack if we are going to use memset, or somehow avoiding memset entirely.
I should note that prior testing has shown that the kernel provides a 16-byte aligned stack, but Rust doesn't preserve that alignment.
AFAIK, on entry to a function, rsp is not supposed to be a multiple of 16, but a multiple of 16 plus 8. This is because the caller function is expected to have rsp aligned before executing the call instruction, which subtracts 8. Thus sub $0x88, %rsp is meant to bring rsp back into alignment. I guess the kernel does not follow this rule when jumping to the start routine, so you should probably define _start in assembly instead and have it fix up rsp before calling into Rust code.
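The arithmetic behind this can be checked with a small sketch. The helper names and the starting address 0x7fff_ffff_e000 are made up for illustration; only the frame size 0x88 and the 16-byte requirement of movaps come from the disassembly above.

```rust
/// Alignment, in bytes, that `movaps` requires of its memory operand.
const SSE_ALIGN: u64 = 16;

/// rsp on entry to a function reached through an ordinary `call`,
/// given a caller whose rsp was 16-byte aligned: pushing the return
/// address subtracts 8.
fn rsp_after_call(aligned_rsp: u64) -> u64 {
    aligned_rsp - 8
}

/// rsp after the prologue `sub $frame_size, %rsp`.
fn rsp_after_prologue(entry_rsp: u64, frame_size: u64) -> u64 {
    entry_rsp - frame_size
}

/// True if `movaps %xmm0, (%rsp)` would pass the alignment check.
fn movaps_ok(rsp: u64) -> bool {
    rsp % SSE_ALIGN == 0
}
```

With a 16-byte-aligned rsp handed over directly at process entry, rsp_after_prologue(0x7fff_ffff_e000, 0x88) leaves rsp at a multiple of 16 plus 8, so movaps_ok returns false, which matches the fault at _start+10. Going through rsp_after_call first, as an assembly _start that aligns and then calls would, the same prologue lands on a multiple of 16.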
What @comex said. The _start for Linux ELF executables does not follow the conventional calling convention and therefore assembly is necessary to implement it. Namely it is expected that _start looks like this:
xor %ebp,%ebp
and $-16,%rsp
# perhaps initialize the other registers to something as well
call <actual_entry_point>
Closing as a not-a-bug.
@comex Ah. I suppose I misread the calling convention. In my larger tests, I had an assembly _start, but it had the stack aligned after the call. I'll switch that around and hopefully that'll work.