Ubuntu 16.04.1
1GB RAM (Free memory 400MB+)
On VPS machine
Each test was run 3 times:
ruby -v
Ruby 2.3.1p112 (2016-04-26) [x86_64-linux-gnu]
crystal -v
Crystal 0.26.1 [391785249] (2018-08-27)
LLVM: 4.0.0
Default target: x86_64-unknown-linux-gnu
$ crystal run --release 1.rb
00:00:00.033265433
$ ruby 1.rb
0.015399671000011494
1.rb
require "benchmark"
cols = 10
rows = 1000
data = Array.new(rows) { Array.new(cols) { "x"*1000 } }
time = Benchmark.realtime do
csv = data.map { |row| row.join(",") }.join("\n")
end
puts time
I'm surprised: can unoptimized code really make Crystal run slower than Ruby?
Is it because LLVM 4.0 on Ubuntu is too old, whereas LLVM 6.0 from Homebrew on macOS is newer?
I tested the latest Crystal 0.26.1 on macOS Mojave, and it is definitely slower than Ruby 2.5, as others have reported.
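For anyone who wants to reproduce the "run 3 times" measurement in one go, here is a minimal Ruby sketch. The helper name time_runs is hypothetical (it is not part of the original script), and reporting the best run is just one common convention, not necessarily how the numbers above were picked.

```ruby
require "benchmark"

# Hypothetical helper: runs the given block `runs` times and
# returns an array with the wall-clock time of each run.
def time_runs(runs = 3)
  Array.new(runs) { Benchmark.realtime { yield } }
end

cols = 10
rows = 1000
data = Array.new(rows) { Array.new(cols) { "x" * 1000 } }

# Same workload as 1.rb: join each row with commas, then join rows with newlines.
timings = time_runs(3) { data.map { |row| row.join(",") }.join("\n") }

timings.each_with_index { |t, i| puts "run #{i + 1}: #{t}" }
puts "best: #{timings.min}"
```

Looking at all three timings rather than a single one makes it easier to spot warm-up effects or noise on a small VPS.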
Have you tried compiling it and running it separately? I'm not really sure why this would make a difference, but it produces completely different results for me.
With cols = 100 and rows = 10000:
$ ruby wat.cr
22.24634099297691
$ crystal run --release wat.cr
00:00:23.799129445
$ crystal build --release wat.cr
$ ./wat
00:00:07.604624759
@DestyNova Super strange, that shouldn't produce a difference.
@proyb6 You are totally right, Crystal is performing worse than Ruby here. At first I thought it was because maybe Ruby's memory allocator was faster than Crystal's. Then I got curious and checked how join is implemented. I was sure join was defined in Enumerable, but just in case I checked whether it's defined in Array too. And it is! I wondered why...
It turns out there's a possible optimization when the array consists entirely of strings. In the general case, join will use String.build and successively append the elements and the separators, reallocating memory if needed. But if all the elements are strings (and we convert the separator to a string if it's not a string already), we can compute the total memory needed for the final string up front: separator.bytesize * (array.size - 1) + array.sum(&.bytesize).
I will send a PR with this optimization soon.
@proyb6 Thank you so much for reporting this! These are the kind of problems that I enjoy most, and it brings performance improvements to all programs out there :-)
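To illustrate the idea described above (compute the final size first, then fill a buffer that never needs to grow), here is a rough sketch written in Ruby rather than Crystal. It is not the code from the Crystal standard library or from the planned PR; presized_join is a hypothetical name, and the sketch assumes Ruby 2.4 or newer for Array#sum and String.new(capacity:).

```ruby
# Hypothetical helper: joins an array of strings after computing the
# exact byte size of the result, so the buffer is allocated only once.
def presized_join(strings, separator = "")
  separator = separator.to_s # convert the separator if it is not a string already
  return "" if strings.empty?

  # Same formula as in the comment above:
  # separator.bytesize * (array.size - 1) + sum of element byte sizes
  total = separator.bytesize * (strings.size - 1) + strings.sum(&:bytesize)

  result = String.new(capacity: total, encoding: strings.first.encoding)
  strings.each_with_index do |s, i|
    result << separator unless i.zero?
    result << s
  end
  result
end

row = Array.new(10) { "x" * 1000 }
joined = presized_join(row, ",")
puts joined.bytesize
puts joined == row.join(",")
```

The sketch only works when every element is already a string, which is exactly the special case the optimization targets; mixed arrays would still need the general append-and-grow path.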
@DestyNova I compiled it with crystal build and got the same timing.
@asterite No problem! It would be helpful if we could compare benchmarks based on the source code from Ruby Performance Optimization:
https://pragprog.com/book/adrpo/ruby-performance-optimization
Assuming Rubyists and newcomers will evaluate Crystal using that source code, I hope we get better and more reliable results!
I always keep adding ❤️ to Ruby, it's incredible all the tweaks, optimizations and great thoughts that there are in the entire codebase. All of this usually goes un-noticed...
@asterite
Super strange, that shouldn't produce a difference.
Yeah it does seem counterintuitive. It might be because my laptop only has 8 GB of RAM and that's nearly all used by a couple of Elixir programs and Firefox and Chromium. Perhaps the compiler uses enough RAM that the system starts using swap memory, and a few seconds are needed to go back to normal... if I add a sleep 10 to the program before the benchmark starts, it performs better (but still not as fast as running the executable created by crystal build --release).
Interestingly, if I add GC.disable to the beginning of the program, then both crystal build --release wat.cr && ./wat and crystal run --release wat.cr show similar performance.
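A similar experiment can be tried on the Ruby side to see how much of the time is spent in garbage collection. This is only a sketch of the measurement approach: it uses Ruby's GC.disable and GC.enable, not Crystal's, so it says nothing directly about the Crystal results above, and the build_csv lambda is a hypothetical wrapper around the original workload.

```ruby
require "benchmark"

cols = 10
rows = 1000
data = Array.new(rows) { Array.new(cols) { "x" * 1000 } }

# Hypothetical wrapper around the workload from 1.rb.
build_csv = -> { data.map { |row| row.join(",") }.join("\n") }

# Start from a clean heap, then time the workload with the GC running.
GC.start
with_gc = Benchmark.realtime { build_csv.call }

# Time the same workload with the GC turned off, then turn it back on.
GC.disable
without_gc = Benchmark.realtime { build_csv.call }
GC.enable

puts "GC enabled:  #{with_gc}"
puts "GC disabled: #{without_gc}"
```

With the small cols = 10 and rows = 1000 configuration the workload only allocates a few tens of megabytes, so disabling the GC for one run is harmless; with the larger configuration it may exhaust memory on a 1 GB machine.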