Create and copy 256MB data, Node.js 317 ms VS Rust 12 sec!
let ins0 = Instant::now();
let mut buf1 = vec!['a'; 268435455];
let mut buf2 = Vec::with_capacity(268435455);
let ins1 = Instant::now();
println!("create {:?}", ins1.duration_since(ins0));
buf2.append(&mut buf1);
let ins2 = Instant::now();
println!("copy {:?}", ins2.duration_since(ins1));
console.time('create')
var buf1 = new Buffer(268435455)
var buf2 = new Buffer(268435455)
buf1.fill('a')
console.timeEnd('create')
console.time('copy')
buf1.copy(buf2)
console.timeEnd('copy')
// create: 145ms
// copy: 174ms
Did you compile in release mode?
@jonas-schievink
release mode is faster than debug mode, but still slower than Node.js
➜ test target/release/hello_world
create Duration { secs: 0, nanos: 871399084 }
copy Duration { secs: 2, nanos: 215459729 }
➜ test target/debug/hello_world
create Duration { secs: 11, nanos: 601338291 }
copy Duration { secs: 1, nanos: 516260097 }
I don't think append matches copy. copy will overwrite the target, while append, well, appends buf1 to buf2.
No idea why copy became slower in release mode. Is that reproducible?
append also consumes the source, as per the example
(EDIT: this should not affect much the performance of the test you are performing, but is another difference in behaviour, which was only partly explained by @jonas-schievink )
@jonas-schievink @ranma42 There is no copy method for vec. When I use push with Iterator, it is much slower than append
@zensh I mean the copy from Node.js doesn't do the same thing as Rust's append
There is no copy method for vec.
The way you do the copy in Node is equivalent to
let mut buf2 = Vec::with_capacity(268435455);
unsafe {
buf2.set_len(268435455);
buf2.copy_from_slice(&buf1);
}
in Rust, all the uninitialized stuff and unsafety included.
That being said, I still find this bug very valid. We could certainly:
Vec to do memcpy with T: Copy elements (i.e. do what the snippet I just posted above does);Vec<T: Copy>::extend_with_element to a memsetbecause LLVM apparently cannot figure either optimisation in the example provided.
I have implemented a copy_vec function for my use case, it is very fast now.
use std::ptr;
use std::time::Instant;
fn copy_vec(src: &Vec<u8>, dst: &mut Vec<u8>, index: usize) {
let len = src.len();
let dst_len = dst.len();
unsafe {
ptr::copy_nonoverlapping(src.as_ptr(), dst.get_unchecked_mut(index), len);
dst.set_len(dst_len + len);
}
}
fn main() {
let ins0 = Instant::now();
let buf1 = vec![b'a'; 268435455];
let mut buf2 = Vec::with_capacity(268435455);
let ins1 = Instant::now();
println!("create {:?}", ins1.duration_since(ins0));
copy_vec(&buf1, &mut buf2, 0);
// buf2.append(&mut buf1);
let ins2 = Instant::now();
println!("copy {:?}", ins2.duration_since(ins1));
}
➜ test target/debug/hello_world
create Duration { secs: 11, nanos: 308244219 }
copy Duration { secs: 0, nanos: 250651955 }
➜ test target/release/hello_world
create Duration { secs: 0, nanos: 73 }
copy Duration { secs: 0, nanos: 63015 }
@nagisa copy_from_slice can only use for slice, that would not be large(< 2MB) because of stack size limit.
The use of get_unchecked_mut makes the function unsafe, you should use normal indexing instead or mark copy_vec as unsafe.
@jonas-schievink Here https://github.com/iorust/msgp-rust/blob/master/src/lib.rs#L19 is my use case, it work fine.
@zensh slices obtained from a Vec point into its backing storage, which is always on the heap
@zensh
Rust chars are 32-bit long, so you are actually copying 1GB of data. Use String if you want that. Node.js characters are 8-bit long, right? (using UTF-8). Using 8-bit characters is fast as it should.
@jonas-schievink You are right, thank you.
@arielb1 yeah I have taken notice of that, I changed it to b'a' in my test. Github's editor have some thing wrong I can't edit the issue~ thank you.
@zensh
I see a memcpy being emitted in both cases. Vec::append already does the Right Thing:
pub fn append(&mut self, other: &mut Self) {
self.reserve(other.len());
let len = self.len();
unsafe {
ptr::copy_nonoverlapping(other.as_ptr(), self.get_unchecked_mut(len), other.len());
}
self.len += other.len();
unsafe {
other.set_len(0);
}
}
Most helpful comment
The way you do the copy in Node is equivalent to
in Rust, all the uninitialized stuff and unsafety included.
That being said, I still find this bug very valid. We could certainly:
Vecto domemcpywithT: Copyelements (i.e. do what the snippet I just posted above does);Specialize,Vec<T: Copy>::extend_with_elementto amemsetbecause LLVM apparently cannot figure either optimisation in the example provided.