Rust: [MIR] double drop with slice patterns

Created on 7 Jul 2016  路  8Comments  路  Source: rust-lang/rust

Version

$ rustc -V
rustc 1.11.0-dev (42b7c32ac 2016-07-02)

STR

#![feature(slice_patterns, rustc_attrs)]

struct NoisyDrop;
impl Drop for NoisyDrop {
    fn drop(&mut self) { println!("splat!"); }
}

#[rustc_mir]
fn main() {
    let [x] = [NoisyDrop];
}

Expected Result

NoisyDrop gets dropped once

Actual Result

NoisyDrop is dropped twice

A-mir C-bug E-mentor P-medium T-compiler T-lang

Most helpful comment

This is the last thing that is stopping slice patterns from being stabilized, so I think we should go forward with it. I would like to mentor this issue so that I will not be the sole maintainer of the drop elaboration code.

Mentoring Notes

The issue is that drop elaboration does not handle arrays correctly.

Background

During MIR generation, Drop terminators are generated for each local that goes out of scope. This means that the Drop terminator is encountered by a value that is already dropped. E.g. the issue code:

#![feature(slice_patterns)]

struct NoisyDrop;
impl Drop for NoisyDrop {
    fn drop(&mut self) { println!("splat!"); }
}

fn main() {
    let [x] = [NoisyDrop];
}

generates the following MIR:

fn main() -> () {
    let mut _0: ();                      // return pointer
    scope 1 {
        let _1: NoisyDrop;               // "x" in scope 1 at src/main.rs:10:10: 10:11
    }
    scope 2 {
    }
    let mut _2: [NoisyDrop; 1];
    let mut _3: NoisyDrop;

    bb0: {                              
        StorageLive(_2);                 // scope 1 at src/main.rs:10:15: 10:26
        StorageLive(_3);                 // scope 1 at src/main.rs:10:16: 10:25
        _3 = NoisyDrop::{{constructor}}; // scope 1 at src/main.rs:10:16: 10:25
        _2 = [_3];                       // scope 1 at src/main.rs:10:15: 10:26
        StorageDead(_3);                 // scope 1 at src/main.rs:10:26: 10:26
        StorageLive(_1);                 // scope 1 at src/main.rs:10:10: 10:11
        // COMMENT MINE - value moved out here
        _1 = _2[0 of 1];                 // scope 1 at src/main.rs:10:10: 10:11
        drop(_2) -> [return: bb3, unwind: bb2]; // scope 1 at src/main.rs:10:27: 10:27
    }

    bb1: {                               // cleanup
        resume;                          // scope 0 at src/main.rs:9:1: 11:2
    }

    bb2: {                               // cleanup
        drop(_1) -> bb1;                 // scope 0 at src/main.rs:11:2: 11:2
    }

    bb3: {                              
        StorageDead(_2);                 // scope 1 at src/main.rs:10:27: 10:27
        _0 = ();                         // scope 0 at src/main.rs:9:11: 11:2
        // COMMENT MINE - ...and local is dropped at end of scope here
        drop(_1) -> bb4;                 // scope 0 at src/main.rs:11:2: 11:2
    }

    bb4: {                              
        StorageDead(_1);                 // scope 0 at src/main.rs:11:2: 11:2
        return;                          // scope 0 at src/main.rs:11:2: 11:2
    }
}

To ensure that locals that have already been moved aren't dropped again, Rust uses [flag-based dynamic drop]. The compiler tracks which locals and parts of locals have been moved out, so a Drop terminator only drops the part of its argument that is initialized.

This is done by drop elaboration, which is implemented in [rustc_mir::util::elaborate_drops] (the codegen part for each individual drop), [rustc_mir::transform::elaborate_drops] (the transform sequence), and [rustc_mir::dataflow] (the analysis).

Drop elaboration uses local variable "drop flags" to track the initialization state of each local fragment (called a "move path"), and rewrites each MIR drop into code that checks these drop flags and only drops what is initialized.

The move path analysis is currently not implemented for arrays, so array fragments are currently [shortcutted]. This causes the double-drop - moves out of array elements are ignored when checking drop flags.

Implementation notes

So, in order to remove the shortcut, we have to implement move paths for arrays. The code for handling move paths at [rustc_mir::dataflow::move_paths::builder] needs to be adapted, and the code at [open_drop_for_array] needs to be changed to consume the new move paths and only drop what is initialized (see e.g. the code for open_drop_for_tuple in the same function).

I think that a first implementation that works like structs, and creates a field for every array index, would work. Before we stabilize slice patterns, I would like to modify the implementation to consolidate ranges of indexes and generate only a single entry for them.

Afterwards, I would like to see a good amount of tests in [the dynamic-drop test] using the runner there (see the existing tests - this tests that drops are handled correctly, including in panic paths), including tests for patterns matching from both ends with potential overlap - e.g.

if maybe {
    let [_, ..b] = array;
} else {
    let [..b, _, _x] = array;
}

The drop elaboration code is subtle code that can easily cause miscompilations, so the more tests, the better (if you find something that looks suspicious, adding a test and potentionally a bugfix would be very cool).

Bonus Points - Slice Patterns

I'll note that move paths off slices, which can have several lengths, aren't supported, because I have no good idea of how to track drop flags for something like this (you can match slices for both ends, and for small lengths you might have overlap):

fn foo(x: Box<[Box<[Box<u32>]>]>) {
    match rand() {
        0 => match x {
            box [box [x, ..], ..] => use(x)
        },
        1 => match x {
            box [.., box [x, ..]] => use(x)
        },
        2 => match x {
            box [_, box [_, x, ..], ..] => use(x)
        }
        // etc.
    }
}

Because I expect this sort of thing to be rare, for bonus points I would like it if you find some "brute-force" way of handling this that is exponential at these ugly cases but linear in sane situations.

All 8 comments

Should this go on the mir milestone?

@brson

slice_patterns is feature-gated and quite buggy.

Fixing the issue for _arrays_ would be quite easy. The primary problem is that you can move out of _slices_

#![feature(slice_patterns, advanced_slice_patterns, box_syntax, box_patterns)]

fn cond() -> bool { false }

fn make() -> Box<[Box<[Box<[String]>]>]> {
    box [box [box ["hello".to_string(), "world".to_string()]]]
}

fn main() {
    let b3 = make();

    match (b3, cond()) {
        (box [box [box [s, ..], ..], ..], true) => {
            println!("{}", s);
        }
        (box [.., box [.., box [.., s]]], _) => {
            println!("{}", s);
        }
        _ => {}
    }
}

Here we must drop only "hello" in the first branch, and only "world" in the second branch, but I do not see any reliable way of doing it.

This is not blocking the move to MIR, as @arielb1 notes.

Still an issue in rustc 1.15.0-nightly (0bd2ce62b 2016-11-19)

Note that since #36353 prohibits moves out of slices, we could presumably fix things for arrays alone at this point without too much trouble.

This is the last thing that is stopping slice patterns from being stabilized, so I think we should go forward with it. I would like to mentor this issue so that I will not be the sole maintainer of the drop elaboration code.

Mentoring Notes

The issue is that drop elaboration does not handle arrays correctly.

Background

During MIR generation, Drop terminators are generated for each local that goes out of scope. This means that the Drop terminator is encountered by a value that is already dropped. E.g. the issue code:

#![feature(slice_patterns)]

struct NoisyDrop;
impl Drop for NoisyDrop {
    fn drop(&mut self) { println!("splat!"); }
}

fn main() {
    let [x] = [NoisyDrop];
}

generates the following MIR:

fn main() -> () {
    let mut _0: ();                      // return pointer
    scope 1 {
        let _1: NoisyDrop;               // "x" in scope 1 at src/main.rs:10:10: 10:11
    }
    scope 2 {
    }
    let mut _2: [NoisyDrop; 1];
    let mut _3: NoisyDrop;

    bb0: {                              
        StorageLive(_2);                 // scope 1 at src/main.rs:10:15: 10:26
        StorageLive(_3);                 // scope 1 at src/main.rs:10:16: 10:25
        _3 = NoisyDrop::{{constructor}}; // scope 1 at src/main.rs:10:16: 10:25
        _2 = [_3];                       // scope 1 at src/main.rs:10:15: 10:26
        StorageDead(_3);                 // scope 1 at src/main.rs:10:26: 10:26
        StorageLive(_1);                 // scope 1 at src/main.rs:10:10: 10:11
        // COMMENT MINE - value moved out here
        _1 = _2[0 of 1];                 // scope 1 at src/main.rs:10:10: 10:11
        drop(_2) -> [return: bb3, unwind: bb2]; // scope 1 at src/main.rs:10:27: 10:27
    }

    bb1: {                               // cleanup
        resume;                          // scope 0 at src/main.rs:9:1: 11:2
    }

    bb2: {                               // cleanup
        drop(_1) -> bb1;                 // scope 0 at src/main.rs:11:2: 11:2
    }

    bb3: {                              
        StorageDead(_2);                 // scope 1 at src/main.rs:10:27: 10:27
        _0 = ();                         // scope 0 at src/main.rs:9:11: 11:2
        // COMMENT MINE - ...and local is dropped at end of scope here
        drop(_1) -> bb4;                 // scope 0 at src/main.rs:11:2: 11:2
    }

    bb4: {                              
        StorageDead(_1);                 // scope 0 at src/main.rs:11:2: 11:2
        return;                          // scope 0 at src/main.rs:11:2: 11:2
    }
}

To ensure that locals that have already been moved aren't dropped again, Rust uses [flag-based dynamic drop]. The compiler tracks which locals and parts of locals have been moved out, so a Drop terminator only drops the part of its argument that is initialized.

This is done by drop elaboration, which is implemented in [rustc_mir::util::elaborate_drops] (the codegen part for each individual drop), [rustc_mir::transform::elaborate_drops] (the transform sequence), and [rustc_mir::dataflow] (the analysis).

Drop elaboration uses local variable "drop flags" to track the initialization state of each local fragment (called a "move path"), and rewrites each MIR drop into code that checks these drop flags and only drops what is initialized.

The move path analysis is currently not implemented for arrays, so array fragments are currently [shortcutted]. This causes the double-drop - moves out of array elements are ignored when checking drop flags.

Implementation notes

So, in order to remove the shortcut, we have to implement move paths for arrays. The code for handling move paths at [rustc_mir::dataflow::move_paths::builder] needs to be adapted, and the code at [open_drop_for_array] needs to be changed to consume the new move paths and only drop what is initialized (see e.g. the code for open_drop_for_tuple in the same function).

I think that a first implementation that works like structs, and creates a field for every array index, would work. Before we stabilize slice patterns, I would like to modify the implementation to consolidate ranges of indexes and generate only a single entry for them.

Afterwards, I would like to see a good amount of tests in [the dynamic-drop test] using the runner there (see the existing tests - this tests that drops are handled correctly, including in panic paths), including tests for patterns matching from both ends with potential overlap - e.g.

if maybe {
    let [_, ..b] = array;
} else {
    let [..b, _, _x] = array;
}

The drop elaboration code is subtle code that can easily cause miscompilations, so the more tests, the better (if you find something that looks suspicious, adding a test and potentionally a bugfix would be very cool).

Bonus Points - Slice Patterns

I'll note that move paths off slices, which can have several lengths, aren't supported, because I have no good idea of how to track drop flags for something like this (you can match slices for both ends, and for small lengths you might have overlap):

fn foo(x: Box<[Box<[Box<u32>]>]>) {
    match rand() {
        0 => match x {
            box [box [x, ..], ..] => use(x)
        },
        1 => match x {
            box [.., box [x, ..]] => use(x)
        },
        2 => match x {
            box [_, box [_, x, ..], ..] => use(x)
        }
        // etc.
    }
}

Because I expect this sort of thing to be rare, for bonus points I would like it if you find some "brute-force" way of handling this that is exponential at these ugly cases but linear in sane situations.

triage: P-medium

@rust-lang/lang team agrees that when this is done we could consider stabilizing slice patterns.

Was this page helpful?
0 / 5 - 0 ratings

Related issues

dtolnay picture dtolnay  路  3Comments

nikomatsakis picture nikomatsakis  路  3Comments

mcarton picture mcarton  路  3Comments

Robbepop picture Robbepop  路  3Comments

SharplEr picture SharplEr  路  3Comments