Rust: Stack overflow compiling slice pattern for big array

Created on 30 Aug 2018  Â·  20Comments  Â·  Source: rust-lang/rust

The following fails with a stack overflow:

#![feature(slice_patterns)]

fn main() {
    match [0u8; 16*1024] {
        [..] => {}
    }
}

I see tests failing on miri's CI on Windows when the size is just 1024, so my suspicion is that this code will fail to compile on Windows with a smaller slice -- but I do not have a Windows machine to test that.

A-mir F-slice_patterns I-compiletime T-compiler

Most helpful comment

Since no one has done it publicly so far, I thought I'd investigate this error a bit.
First, I reduced the example to (playground link):

match [0u8; 16*1024] {
    [..] => {}
}

From the stack trace, we know that the overflow occurs at the only line is_useful calls itself directly: https://github.com/rust-lang/rust/blob/fdaf594bab31eec75fb6d582cd33e5a5b43de7f4/src/librustc_mir/hair/pattern/_match.rs#L1199
From my understanding of the code and the algorithm, this means that matrix has way too many columns. This is precisely what we would expect to happen if we considered a fixed-size array as a large tuple and naively applied the algorithm from the paper.
And this is in fact what seems to be happening: in the case of a fixed-size array, constructor_sub_pattern_tys returns a Vec of the same size as the array, and is_useful_specialized ends up making a Matrix with as many columns. Whereas the code for matching slices only uses as many matrix columns as (essentially) the longest pattern of the match, the code for fixed-size arrays brutally considers them as big tuples and calls it a day. It essentially considers [a, ..] as if it was [a, _, _, _, ..., _]. The algorithm then proceeds with these subpatterns one-by-one recursively, hence the stack overflow.

I looks to me like this issue doesn't come from a small oversight somewhere, but rather is a structural consequence of how fixed-size array patterns where shoehorned into the usefulness algorithm. So I feel that a good solution to this issue would need to rethink how to hande arrays (and slices too probably).
My idea for a solution would be the following: We would view a pattern [a, b, .., c] as something like Cons(a, Cons(b, Snoc(c, [..]))), which means we keep the number of matrix columns to a minimum, but we treat [..] specially when trying to specialize. For example, if we encounter a [..] when computing S(Cons, P), we would pretend [..] was actually [] | Cons(_, [..]) (or just [] if we know the size is 0), and then continue as usual.

Am I making sense ?

All 20 comments

Seems like we already had that once with https://github.com/rust-lang/rust/issues/17877, and it came back?

Running this in gdb, the top end of the stack looks like

#0  0x00007ffff6a91a75 in <core::iter::FlatMap<I, U, F> as core::iter::iterator::Iterator>::next () at /home/r/src/rust/rustc.2/src/libcore/iter/mod.rs:2523
#1  0x00007ffff6a7ef11 in <alloc::vec::Vec<T> as alloc::vec::SpecExtend<T, I>>::from_iter () at /home/r/src/rust/rustc.2/src/liballoc/vec.rs:1860
#2  0x00007ffff6dcaa1a in <alloc::vec::Vec<T> as core::iter::traits::FromIterator<T>>::from_iter () at /home/r/src/rust/rustc.2/src/liballoc/vec.rs:1772
#3  core::iter::iterator::Iterator::collect () at /home/r/src/rust/rustc.2/src/libcore/iter/iterator.rs:1415
#4  rustc_mir::hair::pattern::_match::is_useful () at librustc_mir/hair/pattern/_match.rs:1058
#5  0x00007ffff6dcc218 in rustc_mir::hair::pattern::_match::is_useful () at librustc_mir/hair/pattern/_match.rs:1111
#6  0x00007ffff6dcc218 in rustc_mir::hair::pattern::_match::is_useful () at librustc_mir/hair/pattern/_match.rs:1111
#7  0x00007ffff6dcc218 in rustc_mir::hair::pattern::_match::is_useful () at librustc_mir/hair/pattern/_match.rs:1111
#8  0x00007ffff6dcc218 in rustc_mir::hair::pattern::_match::is_useful () at librustc_mir/hair/pattern/_match.rs:1111
#9  0x00007ffff6dcc218 in rustc_mir::hair::pattern::_match::is_useful () at librustc_mir/hair/pattern/_match.rs:1111
#10 0x00007ffff6dcc218 in rustc_mir::hair::pattern::_match::is_useful () at librustc_mir/hair/pattern/_match.rs:1111
#11 0x00007ffff6dcc218 in rustc_mir::hair::pattern::_match::is_useful () at librustc_mir/hair/pattern/_match.rs:1111
#12 0x00007ffff6dcc218 in rustc_mir::hair::pattern::_match::is_useful () at librustc_mir/hair/pattern/_match.rs:1111
#13 0x00007ffff6dcc218 in rustc_mir::hair::pattern::_match::is_useful () at librustc_mir/hair/pattern/_match.rs:1111
#14 0x00007ffff6dcc218 in rustc_mir::hair::pattern::_match::is_useful () at librustc_mir/hair/pattern/_match.rs:1111
#15 0x00007ffff6dcc218 in rustc_mir::hair::pattern::_match::is_useful () at librustc_mir/hair/pattern/_match.rs:1111
#16 0x00007ffff6dcc218 in rustc_mir::hair::pattern::_match::is_useful () at librustc_mir/hair/pattern/_match.rs:1111
#17 0x00007ffff6dcc218 in rustc_mir::hair::pattern::_match::is_useful () at librustc_mir/hair/pattern/_match.rs:1111
#18 0x00007ffff6dcc218 in rustc_mir::hair::pattern::_match::is_useful () at librustc_mir/hair/pattern/_match.rs:1111
#19 0x00007ffff6dcc218 in rustc_mir::hair::pattern::_match::is_useful () at librustc_mir/hair/pattern/_match.rs:1111
#20 0x00007ffff6dcc218 in rustc_mir::hair::pattern::_match::is_useful () at librustc_mir/hair/pattern/_match.rs:1111
#21 0x00007ffff6dcc218 in rustc_mir::hair::pattern::_match::is_useful () at librustc_mir/hair/pattern/_match.rs:1111
#22 0x00007ffff6dcc218 in rustc_mir::hair::pattern::_match::is_useful () at librustc_mir/hair/pattern/_match.rs:1111
#23 0x00007ffff6dcc218 in rustc_mir::hair::pattern::_match::is_useful () at librustc_mir/hair/pattern/_match.rs:1111
#24 0x00007ffff6dcc218 in rustc_mir::hair::pattern::_match::is_useful () at librustc_mir/hair/pattern/_match.rs:1111
#25 0x00007ffff6dcc218 in rustc_mir::hair::pattern::_match::is_useful () at librustc_mir/hair/pattern/_match.rs:1111
#26 0x00007ffff6dcc218 in rustc_mir::hair::pattern::_match::is_useful () at librustc_mir/hair/pattern/_match.rs:1111
#27 0x00007ffff6dcc218 in rustc_mir::hair::pattern::_match::is_useful () at librustc_mir/hair/pattern/_match.rs:1111
#28 0x00007ffff6dcc218 in rustc_mir::hair::pattern::_match::is_useful () at librustc_mir/hair/pattern/_match.rs:1111
#29 0x00007ffff6dcc218 in rustc_mir::hair::pattern::_match::is_useful () at librustc_mir/hair/pattern/_match.rs:1111
#30 0x00007ffff6dcc218 in rustc_mir::hair::pattern::_match::is_useful () at librustc_mir/hair/pattern/_match.rs:1111
#31 0x00007ffff6dcc218 in rustc_mir::hair::pattern::_match::is_useful () at librustc_mir/hair/pattern/_match.rs:1111
#32 0x00007ffff6dcc218 in rustc_mir::hair::pattern::_match::is_useful () at librustc_mir/hair/pattern/_match.rs:1111
#33 0x00007ffff6dcc218 in rustc_mir::hair::pattern::_match::is_useful () at librustc_mir/hair/pattern/_match.rs:1111
#34 0x00007ffff6dcc218 in rustc_mir::hair::pattern::_match::is_useful () at librustc_mir/hair/pattern/_match.rs:1111
#35 0x00007ffff6dcc218 in rustc_mir::hair::pattern::_match::is_useful () at librustc_mir/hair/pattern/_match.rs:1111
#36 0x00007ffff6dcc218 in rustc_mir::hair::pattern::_match::is_useful () at librustc_mir/hair/pattern/_match.rs:1111
#37 0x00007ffff6dcc218 in rustc_mir::hair::pattern::_match::is_useful () at librustc_mir/hair/pattern/_match.rs:1111
#38 0x00007ffff6dcc218 in rustc_mir::hair::pattern::_match::is_useful () at librustc_mir/hair/pattern/_match.rs:1111
#39 0x00007ffff6dcc218 in rustc_mir::hair::pattern::_match::is_useful () at librustc_mir/hair/pattern/_match.rs:1111
#40 0x00007ffff6dcc218 in rustc_mir::hair::pattern::_match::is_useful () at librustc_mir/hair/pattern/_match.rs:1111
#41 0x00007ffff6dcc218 in rustc_mir::hair::pattern::_match::is_useful () at librustc_mir/hair/pattern/_match.rs:1111
#42 0x00007ffff6dcc218 in rustc_mir::hair::pattern::_match::is_useful () at librustc_mir/hair/pattern/_match.rs:1111
#43 0x00007ffff6dcc218 in rustc_mir::hair::pattern::_match::is_useful () at librustc_mir/hair/pattern/_match.rs:1111
#44 0x00007ffff6dcc218 in rustc_mir::hair::pattern::_match::is_useful () at librustc_mir/hair/pattern/_match.rs:1111
#45 0x00007ffff6dcc218 in rustc_mir::hair::pattern::_match::is_useful () at librustc_mir/hair/pattern/_match.rs:1111
#46 0x00007ffff6dcc218 in rustc_mir::hair::pattern::_match::is_useful () at librustc_mir/hair/pattern/_match.rs:1111
#47 0x00007ffff6dcc218 in rustc_mir::hair::pattern::_match::is_useful () at librustc_mir/hair/pattern/_match.rs:1111
#48 0x00007ffff6dcc218 in rustc_mir::hair::pattern::_match::is_useful () at librustc_mir/hair/pattern/_match.rs:1111
#49 0x00007ffff6dcc218 in rustc_mir::hair::pattern::_match::is_useful () at librustc_mir/hair/pattern/_match.rs:1111
#50 0x00007ffff6dcc218 in rustc_mir::hair::pattern::_match::is_useful () at librustc_mir/hair/pattern/_match.rs:1111
#51 0x00007ffff6dcc218 in rustc_mir::hair::pattern::_match::is_useful () at librustc_mir/hair/pattern/_match.rs:1111
#52 0x00007ffff6dcc218 in rustc_mir::hair::pattern::_match::is_useful () at librustc_mir/hair/pattern/_match.rs:1111
#53 0x00007ffff6dcc218 in rustc_mir::hair::pattern::_match::is_useful () at librustc_mir/hair/pattern/_match.rs:1111
#54 0x00007ffff6dcc218 in rustc_mir::hair::pattern::_match::is_useful () at librustc_mir/hair/pattern/_match.rs:1111
#55 0x00007ffff6dcc218 in rustc_mir::hair::pattern::_match::is_useful () at librustc_mir/hair/pattern/_match.rs:1111
#56 0x00007ffff6dcc218 in rustc_mir::hair::pattern::_match::is_useful () at librustc_mir/hair/pattern/_match.rs:1111
#57 0x00007ffff6dcc218 in rustc_mir::hair::pattern::_match::is_useful () at librustc_mir/hair/pattern/_match.rs:1111
#58 0x00007ffff6dcc218 in rustc_mir::hair::pattern::_match::is_useful () at librustc_mir/hair/pattern/_match.rs:1111

Further down:

#16319 0x00007ffff6dcc218 in rustc_mir::hair::pattern::_match::is_useful () at librustc_mir/hair/pattern/_match.rs:1111
#16320 0x00007ffff6dccf32 in rustc_mir::hair::pattern::_match::is_useful_specialized () at librustc_mir/hair/pattern/_match.rs:1212
#16321 0x00007ffff6af5cd3 in rustc_mir::hair::pattern::_match::is_useful::{{closure}} () at librustc_mir/hair/pattern/_match.rs:1101
#16322 <core::iter::Map<I, F> as core::iter::iterator::Iterator>::try_fold::{{closure}} () at /home/r/src/rust/rustc.2/src/libcore/iter/mod.rs:1406
#16323 core::iter::iterator::Iterator::try_fold () at /home/r/src/rust/rustc.2/src/libcore/iter/iterator.rs:1522
#16324 <core::iter::Map<I, F> as core::iter::iterator::Iterator>::try_fold () at /home/r/src/rust/rustc.2/src/libcore/iter/mod.rs:1406
#16325 0x00007ffff6b0eb29 in <core::iter::Map<I, F> as core::iter::iterator::Iterator>::try_fold ()
   from /home/r/src/rust/rustc.2/build/x86_64-unknown-linux-gnu/stage2/bin/../lib/../lib/librustc_mir-336e26fd9f56cf35.so
#16326 0x00007ffff6dca95b in core::iter::iterator::Iterator::try_for_each () at /home/r/src/rust/rustc.2/src/libcore/iter/iterator.rs:1559
#16327 core::iter::iterator::Iterator::find () at /home/r/src/rust/rustc.2/src/libcore/iter/iterator.rs:1782
#16328 rustc_mir::hair::pattern::_match::is_useful () at librustc_mir/hair/pattern/_match.rs:1052
#16329 0x00007ffff6dccf32 in rustc_mir::hair::pattern::_match::is_useful_specialized () at librustc_mir/hair/pattern/_match.rs:1212
#16330 0x00007ffff6af5cd3 in rustc_mir::hair::pattern::_match::is_useful::{{closure}} () at librustc_mir/hair/pattern/_match.rs:1101
#16331 <core::iter::Map<I, F> as core::iter::iterator::Iterator>::try_fold::{{closure}} () at /home/r/src/rust/rustc.2/src/libcore/iter/mod.rs:1406
#16332 core::iter::iterator::Iterator::try_fold () at /home/r/src/rust/rustc.2/src/libcore/iter/iterator.rs:1522
#16333 <core::iter::Map<I, F> as core::iter::iterator::Iterator>::try_fold () at /home/r/src/rust/rustc.2/src/libcore/iter/mod.rs:1406
#16334 0x00007ffff6b0eb29 in <core::iter::Map<I, F> as core::iter::iterator::Iterator>::try_fold ()
   from /home/r/src/rust/rustc.2/build/x86_64-unknown-linux-gnu/stage2/bin/../lib/../lib/librustc_mir-336e26fd9f56cf35.so
#16335 0x00007ffff6dca95b in core::iter::iterator::Iterator::try_for_each () at /home/r/src/rust/rustc.2/src/libcore/iter/iterator.rs:1559
#16336 core::iter::iterator::Iterator::find () at /home/r/src/rust/rustc.2/src/libcore/iter/iterator.rs:1782
#16337 rustc_mir::hair::pattern::_match::is_useful () at librustc_mir/hair/pattern/_match.rs:1052
#16338 0x00007ffff6dd00fe in rustc_mir::hair::pattern::check_match::check_arms () at librustc_mir/hair/pattern/check_match.rs:369
#16339 rustc_mir::hair::pattern::check_match::MatchVisitor::check_match::{{closure}} () at librustc_mir/hair/pattern/check_match.rs:220
#16340 rustc_mir::hair::pattern::_match::MatchCheckCtxt::create_and_enter () at librustc_mir/hair/pattern/_match.rs:325

cc @varkor

cc @nagisa also?

I updated the example to the changed syntax; it still reproduces with the latest nightly.

Since no one has done it publicly so far, I thought I'd investigate this error a bit.
First, I reduced the example to (playground link):

match [0u8; 16*1024] {
    [..] => {}
}

From the stack trace, we know that the overflow occurs at the only line is_useful calls itself directly: https://github.com/rust-lang/rust/blob/fdaf594bab31eec75fb6d582cd33e5a5b43de7f4/src/librustc_mir/hair/pattern/_match.rs#L1199
From my understanding of the code and the algorithm, this means that matrix has way too many columns. This is precisely what we would expect to happen if we considered a fixed-size array as a large tuple and naively applied the algorithm from the paper.
And this is in fact what seems to be happening: in the case of a fixed-size array, constructor_sub_pattern_tys returns a Vec of the same size as the array, and is_useful_specialized ends up making a Matrix with as many columns. Whereas the code for matching slices only uses as many matrix columns as (essentially) the longest pattern of the match, the code for fixed-size arrays brutally considers them as big tuples and calls it a day. It essentially considers [a, ..] as if it was [a, _, _, _, ..., _]. The algorithm then proceeds with these subpatterns one-by-one recursively, hence the stack overflow.

I looks to me like this issue doesn't come from a small oversight somewhere, but rather is a structural consequence of how fixed-size array patterns where shoehorned into the usefulness algorithm. So I feel that a good solution to this issue would need to rethink how to hande arrays (and slices too probably).
My idea for a solution would be the following: We would view a pattern [a, b, .., c] as something like Cons(a, Cons(b, Snoc(c, [..]))), which means we keep the number of matrix columns to a minimum, but we treat [..] specially when trying to specialize. For example, if we encounter a [..] when computing S(Cons, P), we would pretend [..] was actually [] | Cons(_, [..]) (or just [] if we know the size is 0), and then continue as usual.

Am I making sense ?

Am I making sense ?

Thanks for the investigation! cc @varkor, @arielb1, and @matthewjasper.

@Nadrieril: thanks for investigating! Your analysis looks correct to me. However, rather than coming up with a new solution for fixed-size arrays, I wonder why we don't just use the handling for slices. Other than the mismatched size error (e.g. error[E0527]: pattern requires 1 elements but array has 2), fixed-size arrays should be matched just like slices: the size of the array is completely irrelevant. Maybe we could get away with folding the fixed-size array code path into the slice code path (though keeping E0527).

@varkor Oh right, that sounds easier. We can probably just use the smallest between the max_slice_length of the patterns and the length of the array ? It might just be tricky to produce witnesses that have the correct length.

@Nadrieril: I think it'd be easier just to go with max_slice_length — if the slice length is greater than the size of the array, then E0527 will report the error before we attempt to start checking the match, so by the time we get to is_useful, we know all the patterns are valid ones for the array.

If you'd like to take this issue, you're very welcome to; you seem to have a good understanding of what's going on! I'm happy to help with any questions you have. (Though no pressure if not!)

@Nadrieril: I think it'd be easier just to go with max_slice_length — if the slice length is greater than the size of the array, then E0527 will report the error before we attempt to start checking the match, so by the time we get to is_useful, we know all the patterns are valid ones for the array.

I'm not quite sure: the following is a valid exhaustive match for a size-1 array:

match x {
  [true, ..] => {}
  [.., false] => {}
}

However, if I understand the doc of max_slice_length correctly, this match's max_slice_length is 2. So this would trigger E0527 when it shouldn't. I think that using the array's size when it's smaller than the match's max_slice_length solves this correctly.

If you'd like to take this issue, you're very welcome to; you seem to have a good understanding of what's going on! I'm happy to help with any questions you have. (Though no pressure if not!)

Thanks ! I'm not sure if I'll find a lot of time to do it, but I will gladly try !

So this would trigger E0527 when it shouldn't.

E0527 isn't based on max_slice_length; I think it's completely independent. In any case, we should be able to work out whether we need to take this into account during implementation.

https://github.com/rust-lang/rust/blob/7b3f72906ffea5a8aec9e3d109d8e012f771a672/src/librustc_typeck/check/pat.rs#L1060-L1065

Ok, I see what you meant. I've started reading the rustc guide, then I'll have a go at this :)

I've thought about this some more, and there's an issue that might prevent the simple solution from working: witnesses. Take the following:

match big_array {
    [false, ..] => {}
}

This is non-exhaustive, and the compiler will report something like pattern [true, _, _, _, _] not covered, with enough underscores to get the correct length. This is a problem because in order to generate this witness, we need the matrix to have as many columns as the length of the array, which is exactly what we need to avoid.
This makes me think that we need to treat the slice constructor as something more complex than just a tuple, so that we can represent a witness like [true, ..] without expanding it to its full length. I fear that this would be quite a complex change.
I hope I'm mistaken and there's a way around this, otherwise I'll start investigating how I could implement the more complex solution.

@Nadrieril: ah, that's a very good point. I think you're right: we're going to have to handle these large witnesses specially. (Maybe after a certain cutoff, we start using .. rather than being explicit.) I wonder whether we can still use a modified version of the "treat arrays like slices" approach. However, when we get to the point where we're going to print diagnostic messages involving the witnesses, we expand as necessary if it's an array rather than a slice (by adding a .., or some explicit _). That is, the matching code itself treats slices just like arrays, but the diagnostics have a special-case.

This is just a vague idea, as the code isn't so fresh in my mind at the moment, so it may not be plausible. In any case, thanks for your continuing investigation — it's really good to have someone willing to dig down into a complicated issue like this!

Ok, so running tests takes > 10min which makes it quite hard for me to try things out. Since the functionality of match exhaustiveness is quite independent from the rest of rustc, I thought I'd add some unit tests directly in the file instead of adding them in src/test/ui/....
Would that be ok ? Would anyone know the correct x.py/cargo invocation to only rebuild and test librustc_mir ?

technically it should be ./x.py test src/librustc_mir --stage 1, but I don't know if that does what it appears it should, never tested it

@oli-obk Thanks !

So I've started working on this, and so far I've refactored some things here and there to make the code easier to follow. I'm not sure my decisions are good and this actually improves the code though. Also this is not strictly necessary to fix this particular issue. Is that an ok thing for me to be doing ? Should I maybe push those changes as a separate refactor PR ?

Is that an ok thing for me to be doing? Should I maybe push those changes as a separate refactor PR ?

Not only is it OK, it's awesome to first refactor and then implement. :) If it is a large refactor it might make sense to split things into different PRs.

I've put up a large refactor over there: #65160.
If and once it gets merged, fixing the issue should just be a case of using the VarLenSlice constructor for fixed-length arrays as well.

Was this page helpful?
0 / 5 - 0 ratings

Related issues

pedrohjordao picture pedrohjordao  Â·  3Comments

mcarton picture mcarton  Â·  3Comments

dtolnay picture dtolnay  Â·  3Comments

behnam picture behnam  Â·  3Comments

zhendongsu picture zhendongsu  Â·  3Comments