Rust: optimisation idea: look ahead to deduce better vector capacities

Created on 18 Jul 2018  路  9Comments  路  Source: rust-lang/rust

````rust
fn main() {
foo(false);
}

fn foo(yes: bool) {
let mut x = Vec::new();
println!("capacity: {}", x.capacity());
println!("length : {}", x.len());
println!("pushing data");
if yes {
x.push(1);
x.push(2);
x.push(3);
x.push(4);
} else {
x.push(1);
x.push(2);
x.push(3);
x.push(4);
x.push(5);
x.push(6);
}
println!("capacity: {}", x.capacity());
println!("length : {}", x.len());
println!("vec: {:?}", x);
}

capacity: 0
length : 0
pushing data
capacity: 8
length : 6
vec: [1, 2, 3, 4, 5, 6]
````

foo() might have a none const parameter.

I wonder if in this code the compiler could deduce that x will have at least 4 elts pushed to it in any case, so we construct the vector x with a capacity of 4 by default instead of 0.

Most helpful comment

That's some dark optimizer magic specialized to Vec. While I'm totally for thinking about this, we should be thinking about a generic way to do this.

The easy way out of doing such black arts, is to detect these cases and have a lint suggest Vec::with_capacity(if yes { 4 } else { 6 }). This absolutely feels like something we could have in clippy right now in the perf lint group.

All 9 comments

That's some dark optimizer magic specialized to Vec. While I'm totally for thinking about this, we should be thinking about a generic way to do this.

The easy way out of doing such black arts, is to detect these cases and have a lint suggest Vec::with_capacity(if yes { 4 } else { 6 }). This absolutely feels like something we could have in clippy right now in the perf lint group.

Oh that's an interesting idea!

I already thought if we could somehow sink the Vec::new() call into the if branches, sorta like
rust if yes { let mut x = Vec::new(); x.push(1); x.push(2); x.push(3); x.push(4); } else { let mut x = Vec::new(); x.push(1); x.push(2); x.push(3); x.push(4); x.push(5); x.push(6); } println!("vec: {:?}", x);
and have const fn Vec::push() handle the rest, but that would open another can of scope problems I guess...

What you have raised is not really a real-world example. There's already optimizations exist on collect() and IMO everyone should be using it instead of direct mutation.

I don't think we can really act on this. Feels free to open a clippy isuse though so we can brainstorm lint ideas.

This might make a good first MIR lint!

Hmm, I found a similar example with strings that however has no if-statements; if we push strings that both have constant sizes, we could precalculate a better vector capacity:
rust fn main() { let mut x = String::from("constant size of 19"); println!("size: {}, cap: {}", x.len(), x.capacity()); x.push_str("const size of 16"); println!("size: {}, cap: {}", x.len(), x.capacity()); }
=>

size: 19, cap: 19 size: 35, cap: 38

This already seems to work nicely for vectors though? :confused:

````rust
fn main() {
let mut x = vec![1, 2, 3, 4];
println!("size: {}, cap: {}", x.len(), x.capacity());

x.append(&mut vec![5, 6, 7, 8, 9, 10, 11]);
println!("size: {}, cap: {}", x.len(), x.capacity());
}
````

size: 4, cap: 4 size: 11, cap: 11

Whether overallocating or not is a non-issue. The issue that worth prevent is the reallocation, however IMO you should use iterator functions and format! everywhere which eliminates most reallocations.

I agree. This is not actionable on the compiler/optimizer side.

Was this page helpful?
0 / 5 - 0 ratings