For some reason, a direct
`println!("{}", var);`
seems to be a lot slower than
`let s = format!("{}", var);`
`println!("{}", s);`
Is it possible to make both versions perform equally?
I timed this via this benchmarking code:
````rust
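extern crate rand;

use rand::{thread_rng, Rng};

fn main() {
    // Uncomment one call at a time; the trailing comments are the measured timings.
    for _ in 0..10_000_000u64 {
        //large_print_fmt(); // 12.333s
        //large_print(); // 39.78s
    }
}

// Returns a random integer in 0..10 (two-argument `gen_range` is the pre-0.8 `rand` API).
fn get_rand_number() -> i32 {
    thread_rng().gen_range(0, 10)
}

// Formats straight to stdout with a single `println!`.
fn large_print() {
    println!(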
"1 random number: {}
2 random number: {}
3 random number: {}
4 random number: {}
5 random number: {}
6 random number: {}
7 random number: {}
8 random number: {}
9 random number: {}
10 random number: {}",
        get_rand_number(),
        get_rand_number(),
        get_rand_number(),
        get_rand_number(),
        get_rand_number(),
        get_rand_number(),
        get_rand_number(),
        get_rand_number(),
        get_rand_number(),
        get_rand_number()
    );
}

// Builds the whole string with `format!` first, then prints it.
fn large_print_fmt() {
    let print = format!(
"1 random number: {}
2 random number: {}
3 random number: {}
4 random number: {}
5 random number: {}
6 random number: {}
7 random number: {}
8 random number: {}
9 random number: {}
10 random number: {}",
        get_rand_number(),
        get_rand_number(),
        get_rand_number(),
        get_rand_number(),
        get_rand_number(),
        get_rand_number(),
        get_rand_number(),
        get_rand_number(),
        get_rand_number(),
        get_rand_number()
    );
    println!("{}", print);
}
````
When printing a single formatted string, a single call to `write` with all of the lines happens, resulting in a single syscall. When directly `println!`-ing, there will be a separate call to `write` for `1 random number: `, `1234556`, `\n2 random number: `, ...etc. Standard output is line-buffered, so there'll be one syscall after each `write` call that contains a newline.
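The effect described above can be observed without touching the real stdout. The following is only a rough sketch: `CountingSink`, `count_direct` and `count_preformatted` are hypothetical names made up for illustration, and wrapping the sink in `std::io::LineWriter` is an assumption standing in for the line-buffered standard output. It counts how often the underlying sink's `write` is called in each case.

```rust
use std::io::{self, LineWriter, Write};

/// Hypothetical sink that counts calls to `write`, standing in for the
/// number of syscalls a real stdout would make.
struct CountingSink {
    writes: usize,
}

impl Write for CountingSink {
    fn write(&mut self, buf: &[u8]) -> io::Result<usize> {
        self.writes += 1;
        Ok(buf.len()) // pretend everything was written
    }

    fn flush(&mut self) -> io::Result<()> {
        Ok(())
    }
}

/// Formats directly into the line-buffered writer, like a multi-line `println!`.
fn count_direct(values: &[i32; 4]) -> usize {
    let mut out = LineWriter::new(CountingSink { writes: 0 });
    writeln!(
        out,
        "1: {}\n2: {}\n3: {}\n4: {}",
        values[0], values[1], values[2], values[3]
    )
    .unwrap();
    out.get_ref().writes
}

/// Builds the string with `format!` first, then prints it with `{}`.
fn count_preformatted(values: &[i32; 4]) -> usize {
    let mut out = LineWriter::new(CountingSink { writes: 0 });
    let s = format!(
        "1: {}\n2: {}\n3: {}\n4: {}",
        values[0], values[1], values[2], values[3]
    );
    writeln!(out, "{}", s).unwrap();
    out.get_ref().writes
}

fn main() {
    let values = [3, 1, 4, 1];
    println!("direct:       {} writes", count_direct(&values));
    println!("preformatted: {} writes", count_preformatted(&values));
}
```

The exact counts depend on the standard library version, but the direct variant should reach the sink more often, and the gap grows with the number of lines in the format string.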
One potential change we could make is to disable the newline detection logic while inside of a single `print_fmt` call. I'm not sure off the top of my head if that would be a good idea or not, but we could check to see what C does with line buffered `FILE`s and formatted printing.
Slightly off-topic, but replacing `format!` with other methods would give you even more performance: https://github.com/hoodie/concatenation_benchmarks-rs
Triage: it seems like this isn't a bug, just maybe a possible optimization, but frankly it seems a bit questionable to me. I'd probably close this, but I'll leave it open in case @rust-lang/libs thinks it's valuable.
I personally think we should close things like this, so I'll go ahead and do so. It's getting really into the weeds of the implementation, and if you're optimizing then you probably don't want to use `println!` anyway (instead locking stdout and such).
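For reference, "locking stdout and such" could look roughly like the sketch below. It is one possible approach, not something prescribed in this thread: `write_report` is a hypothetical helper, and wrapping the lock in a `BufWriter` is an extra assumption so that output reaches stdout in large chunks instead of line by line.

```rust
use std::io::{self, BufWriter, Write};

/// Hypothetical helper: writes one numbered line per value to any writer.
fn write_report<W: Write>(out: &mut W, values: &[i32]) -> io::Result<()> {
    for (i, v) in values.iter().enumerate() {
        writeln!(out, "{} random number: {}", i + 1, v)?;
    }
    Ok(())
}

fn main() -> io::Result<()> {
    let stdout = io::stdout();
    // Take the lock once instead of once per `println!`, and buffer on top of it.
    let mut out = BufWriter::new(stdout.lock());
    for _ in 0..3 {
        write_report(&mut out, &[4, 8, 15])?;
    }
    // Flush explicitly so write errors are reported rather than lost on drop.
    out.flush()
}
```

Taking `W: Write` as a parameter also makes the formatting code easy to point at a `Vec<u8>` or a file when benchmarking.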