The following code will segfault on playground due to stack overflow:
fn do_stuff() {
let data = [0; 10000000];
}
The problem is that Rust arrays are created on the stack, but this array is too large to fit on the stack.
What's worse, the naive solution doesn't work either:
fn do_stuff() {
let data = Box::new([0; 10000000]);
}
This still instantiates the array on the stack and segfaults. The proper solution is this:
fn do_stuff() {
let data = vec![0; 10000000].into_boxed_slice();
}
This issue is particularly tricky if the array size is dynamic, and does not typically manifest on tests, resulting in unexpected crashes in production. Example:
fn do_stuff(len: usize) {
let data = [0; len];
}
Here len can be set to an arbitrarily large number that would overflow the stack. Only length values of types u8, i8, u16, i16 are definitely safe. The solution is to use one of them or create the array on the heap as described above.
@Shnatsel your last example fails to compile because len is non-const. Once const generics are widespread, it could be a generic parameter, though, like
fn do_stuff<const N>() { // I probably got the syntax wrong here
let _data = [0usize; N];
..
}
So we should detect large numbers – though how large is too large? Should we take the type into account?
Stack size is different between systems and there is no way to confidently tell if array will fit without asking the kernel.
The default stack size in Rust is 2Mb: https://doc.rust-lang.org/std/thread/index.html#stack-size
Any dynamic length type larger than u16 will definitely allow stack overflow.
In fact, if you're making an array of u64, then even a u16 length is already potentially problematic: every u64 is 8 bytes, so you're taking up 512Kb out of the entire 2Mb allotment with this one array. But that's potentially, not definitely, so that's only applicable as a pedantic lint.
As for large fixed sizes, that's more of a rookie mistake, this code would crash more or less up front, so I'd err on the side of people knowing what they're doing and set a rather high limit such as 512Kb.
And yes, the type needs to be taken into account because the memory limit counts against size_of(T) * len, not just len. If you're putting huge sized structs in an array you could get an overflow with very few of them.
@Shnatsel, that's the default stack size for spawned threads. As the page says, "the stack size of the main thread is not determined by Rust".
I think a lint like this would best be made to sum all worst-case stack allocations within a given function/stack frame. It, of course, could not reason about the possible stack growth scenarios of the whole program but at the very least it could trigger on:
fn main() {
let data1 = [0; 2000000];
if f() {
let data2 = [0; 2000000];
}
}
As indeed it's impossible to know the maximum stack size of the system the program would run on, this would essentially be a lint against stack frames that are simply very large (where how large would be configurable like in some other lints) and not necessarily bound to overflow the stack.
good first issue: A basic implementation of the lint with a configurable parameter
E-hard: Extending this to const generics and sum up required stack size in a bigger scope, like @jakubadamw suggested
I would like to tackle this one.
I am not sure about local variable allocation in Rust (I've read LLVM documentation and Rust Reference about local variables, but they left me with the feeling that it's not really specified). Is it ok to just sum up all the array allocations on the worst path that are binded to a variable + temp variable with max size?
I am not sure about local variable allocation in Rust
Yeah, me neither. I changed the classification to E-hard. No idea, why I thought, this is E-medium.
At first, only the "easy" implementation would be enough. So to just search for array allocations and check the size against a threshold.
Is it ok to just sum up all the array allocations on the worst path that are binded to a variable + temp variable with max size?
I would say that this is a good heuristic.
Hasn't this been implemented? https://rust-lang.github.io/rust-clippy/master/index.html#large_stack_arrays
Or is something still missing?
@cauebs only the "good first issue" part of this comment was implemented.
@ebroto Thanks! Missed that. Maybe the labels should be updated to reflect that, then. Or even close and open another issue to make it clearer.
Most helpful comment
I would like to tackle this one.