I noticed that there was a previously closed issue that brought up performance concerns. I believe the issue was closed due to lack of info/fair comparison.
I realize Kotlin Native is still very new, and it's something I am very excited about. However, I have noticed some performance issues when compared to C and even Java. For benchmarking I wrote a 1-level Karatsuba multiplication to multiply two 576-bit numbers, and the average computation time using Kotlin Native is significantly higher than with C and Kotlin Java.
C avg runtime: 0.006630 milliseconds
Kotlin Java avg runtime: 0.01605 milliseconds
Kotlin Native avg runtime: 0.55328 milliseconds
This is 83 times slower than C and 34 times slower than Kotlin Java.
Is performance still being worked on?
Specs:
Windows 10
Intel i5-4690K
16GB RAM
Kotlin Code (Used for both Kotlin Native and Kotlin Java builds):
import time.*

fun main(arg : Array<String>) {
    val karatsuba : Karatsuba = Karatsuba(1);
    var avgTime : Long = 0;
    val iterations : Int = 100000;
    val a : Array<Long> = arrayOf<Long>(0x7d8d, 0x54ca, 0x1866, 0x3399, 0x26d1, 0x2d0d, 0xcaf9, 0xf169, 0xbbce, 0xd1a8, 0xd51f, 0x4ecd,
        0xd035, 0xd7f8, 0x1cbb, 0xc278, 0xe6dc, 0xbeb3, 0xaa99, 0x3b75, 0xee36, 0x3629, 0x7787, 0xd4e0, 0x5882, 0x6965, 0x733b, 0x4ed5, 0x6c08,
        0x70f3, 0x5614, 0x83dc, 0x016b, 0, 0, 0);
    val b : Array<Long> = arrayOf<Long>(0x577d, 0x11fd, 0x7740, 0xfa10, 0xc40e, 0x54cb, 0xbb90, 0xb69f, 0xb805, 0x0214, 0x211e, 0x666e, 0x5ccc, 0x5be4, 0xd1be, 0x0344,
        0x3e08, 0xb277, 0x3c8a, 0x1d5a, 0x6df7, 0x95a5, 0xb110, 0xe0ae, 0x1e01, 0x2420, 0x0b9d, 0x8f3a, 0x7cae, 0x9d93, 0x616f, 0x8e71, 0x01b7, 0, 0, 0)
    for (i in 0 until iterations) {
        //avgTime += measureTimeMillis {
        avgTime += measureClock {
            val result: Array<Long> = karatsuba.compute(a, b)
        }
    }
    print("Avg Time = " + (avgTime / iterations.toFloat()).toString() + "ms")
}

fun measureClock(func : () -> Unit) : Int {
    val start : Int = clock();
    func()
    return (clock() - start)
}

class Karatsuba(level : Int) {
    private val mLevel : Int = level;

    public fun compute(a : Array<Long>, b : Array<Long>) : Array<Long> {
        val halfSize : Int = a.size / 2;
        val r : Array<Long> = Array<Long>(a.size + b.size, {0});
        val z0 : Array<Long> = productScan(a.sliceArray(0 until halfSize), b.sliceArray(0 until halfSize))
        val z2 : Array<Long> = productScan(a.sliceArray(halfSize until a.size), b.sliceArray(halfSize until b.size))
        val mid : Pair<Array<Long>, Array<Long>> = Pair(Array<Long>(halfSize, {0}), Array<Long>(halfSize, {0}))
        for (i in mid.first.indices) {
            mid.first[i] = a[i] + a[halfSize + i]
            mid.second[i] = b[i] + b[halfSize + i]
        }
        val z1 : Array<Long> = productScan(mid.first, mid.second)
        for (i in z1.indices) {
            z1[i] = z1[i] - z0[i] - z2[i];
        }
        for (i in a.indices) {
            r[i] += z0[i]
            r[halfSize + i] += z1[i]
            r[a.size + i] += z2[i]
        }
        return r
    }

    public fun productScan(scanA : Array<Long>, scanB : Array<Long>) : Array<Long> {
        val r : Array<Long> = Array<Long>(scanA.size + scanB.size, {0});
        for (i in 0 until scanA.size) {
            for (j in 0 until scanB.size) {
                r[i + j] += scanA[i] * scanB[j];
            }
        }
        return r;
    }
}
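For anyone who wants to reproduce the comparison, here is a minimal sketch of a variant that may give steadier numbers. It is not the code that produced the timings above. It makes two changes that are my own assumptions about what could matter: it uses LongArray (primitive longs) instead of Array<Long> (boxed values), and it times the whole loop once with kotlin.system.measureNanoTime instead of calling a clock around every single multiplication, since one call may be shorter than a clock tick. The names karatsuba1 and sink, the warm-up count, and the synthetic inputs are hypothetical.

```kotlin
import kotlin.system.measureNanoTime

// Schoolbook product of two limb arrays. LongArray holds primitive longs,
// so no boxed Long objects are allocated per element.
fun productScan(a: LongArray, b: LongArray): LongArray {
    val r = LongArray(a.size + b.size)
    for (i in a.indices) {
        for (j in b.indices) {
            r[i + j] += a[i] * b[j]
        }
    }
    return r
}

// One level of Karatsuba over unnormalized limbs (no carry propagation,
// same as the code in the post). Assumes equal, even-sized inputs.
fun karatsuba1(a: LongArray, b: LongArray): LongArray {
    require(a.size == b.size && a.size % 2 == 0)
    val half = a.size / 2
    val z0 = productScan(a.copyOfRange(0, half), b.copyOfRange(0, half))
    val z2 = productScan(a.copyOfRange(half, a.size), b.copyOfRange(half, b.size))
    val sumA = LongArray(half) { a[it] + a[half + it] }
    val sumB = LongArray(half) { b[it] + b[half + it] }
    val z1 = productScan(sumA, sumB)
    for (i in z1.indices) z1[i] -= z0[i] + z2[i]
    val r = LongArray(a.size + b.size)
    for (i in z0.indices) {
        r[i] += z0[i]
        r[half + i] += z1[i]
        r[a.size + i] += z2[i]
    }
    return r
}

fun main() {
    // Hypothetical 16-bit limbs, 36 per operand (576 bits), not the values from the post.
    val a = LongArray(36) { (it * 7919L + 13L) and 0xffffL }
    val b = LongArray(36) { (it * 104729L + 7L) and 0xffffL }

    // Sanity check: without carries, Karatsuba must match the schoolbook product limb by limb.
    check(karatsuba1(a, b).contentEquals(productScan(a, b)))

    val iterations = 100_000
    var sink = 0L // consume results so the work cannot be optimized away
    repeat(10_000) { sink += karatsuba1(a, b)[0] } // warm-up, mainly relevant on the JVM
    val nanos = measureNanoTime {
        repeat(iterations) { sink += karatsuba1(a, b)[1] }
    }
    println("Avg Time = ${nanos.toDouble() / iterations / 1e6} ms (sink=$sink)")
}
```

I have not measured how much either change affects the Kotlin Native result, so treat this as a starting point for a fairer comparison rather than an explanation of the gap.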
Thank you for the report, but there are several things you may have missed:
Sure, it's not ready for comparisons right now, but will it someday compete with truly native compilers, like C or Rust?
JVM Version 1.8.0_131
No special flags were used (just clicked compile in IntelliJ)
KotlinC version: kotlinc-native 1.1.4-dev-355 (JRE 1.8.0_101-b13)
No special flags used
C compiler version - I tested with both MSVC (Version 19.00.24213.1) and GCC (gcc version 4.8.4) and did not use any special flags.
While I understand and have previously read that Kotlin/Native isn't ready for performance comparisons, I noticed that @msink also brought up performance concerns and that issue was closed. From what I understand, it was closed because the collaborator didn't feel that the ASM was fair to compare to other native languages (because of things like garbage collection).
The purpose of this benchmark is to bring the issue to your attention and get your feedback, so that users of Kotlin/Native know it is being looked at. I am a big fan of a more modern language being developed for native use. However, with the current performance it is impractical to use Kotlin/Native due to its significant speed disadvantage (it partially defeats the purpose of a native language). I don't expect it to be as fast as or faster than C, which is understandable, but 83 times slower is quite a lot.
It would just be good to know that performance specifically is something that is actively being worked on.
Thanks for this report. As mentioned, Kotlin/Native isn't ready for performance comparisons, and it is not intended to reach C-level performance. Kotlin is an application-level language, and for raw speed we suggest relying on C libraries, invoked from Kotlin/Native via interop. In its current form this report is not directly actionable, so closing it.
@SeanReg, can I see your C code?
I found this benchmark https://github.com/frol/completely-unscientific-benchmarks
The numbers look really bad for Kotlin Native.
Is it going to be improved? What is JetBrains' opinion on performance for Kotlin Native?
@SeanReg, can I see your C code?
@Dougrinch
Sorry, not sure if I have it anymore
Numbers looks really really bad for kotlin native
@RUSshy I came back to Kotlin Native recently and ran another benchmark. I don't remember the exact numbers, but they were a slight improvement over the numbers from my first post. However, it is still much too slow. Personally, I can't bring myself to use Kotlin Native until I see real speed improvements in both runtime and compilation.
My numbers with the unscientific benchmark:
8.026s for 0.7
8.051s for 0.8.2
9.518s with 0.9