I used dart2native to compile a small Dart program that reads through a big word vector file (typically > 1 GB) and prints some statistics. The execution times were a bit disappointing. The natively compiled binary took real 0m37.340s (user 0m35.359s), while the same program run through the VM (not compiled ahead of time) took only real 0m23.449s (user 0m23.894s). This was on a MacBook Air.
My Dart program, vec-test.dart, is listed at the bottom. I run it like this: dart vec-test.dart some_file.vec
A vec file (text) may be found at https://fasttext.cc/docs/en/crawl-vectors.html, e.g. https://dl.fbaipublicfiles.com/fasttext/vectors-crawl/cc.no.300.vec.gz
Dart SDK Version (dart --version)
Dart VM version: 2.6.0 (Thu Oct 24 17:52:22 2019 +0200) on "macos_x64"
Whether you are using Windows, MacOSX, or Linux (if applicable)
macOS 10.14.6, Mojave
import "dart:io";
import "dart:convert";
import "dart:async";
import "dart:math";
RegExp norwRx = new RegExp(r"^[a-zæøåé]+\-?[a-zæøåé]*$");
bool nordic(String word) {
  return true;
}
main(List<String> arguments) {
  final filename = arguments.first;
  final file = new File(filename);
  Stream<List<int>> inputStream = file.openRead();
  final verbose = arguments.length > 1;
  int wordLines;
  int dims;
  int goodCount = 0;
  double maxLen = 0.0;
  String maxWord;
  inputStream
      .transform(utf8.decoder)
      .transform(new LineSplitter())
      .listen((String line) {
        List<String> parts = line.split(" ");
        if (parts.length <= 3) { // may be a trailing space
          wordLines = int.parse(parts[0]);
          dims = int.parse(parts[1]);
          print("$wordLines word lines, $dims dimensions");
        } else if (norwRx.hasMatch(parts[0])) {
          double sumOfSquares = 0.0;
          for (int i = 1; i <= dims; i++) {
            double d = double.parse(parts[i]);
            sumOfSquares += d * d;
          }
          double vLen = sqrt(sumOfSquares);
          goodCount++;
          if (vLen > maxLen) {
            maxLen = vLen;
            maxWord = parts[0];
          }
          if (verbose) {
            print("${parts[0]}: $vLen");
          }
        }
      },
      onDone: () {
        print("\naccepted words: $goodCount");
        print("maxLen=$maxLen, for '$maxWord'");
      },
      onError: (e) { print(e.toString()); }
  );
}
RegExp seems to be much slower with dart2native. Here is another case:
#!/usr/bin/env dart
void main() {
  final re = RegExp(r'^/foo/bar/baz/(.+)$');
  final s = '/foo/bar/baz/the_five_boxing_wizards/jump/quickly';
  final results = List<String>(100000);
  final sw = Stopwatch()..start();
  for (var i = 0; i < results.length; i += 1) {
    results[i] = re.firstMatch(s)?.group(1);
  }
  print('${sw.elapsedMilliseconds}ms');
}
Running this directly (i.e., through dart) on my Linux laptop, it completes in ~30ms.
Using dart2native and running the result, this instead completes in ~500ms(!).
Dart VM version: 2.6.1 (Mon Nov 11 13:12:24 2019 +0100) on "linux_x64"
Edit: looked into this a bit more and this seems to be the same issue as #37774, #39139.
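For a pattern as simple as the one in the benchmark above (a fixed prefix followed by "anything"), one way to sidestep the RegExp engine in AOT builds is plain string operations. The following is only a sketch under that assumption: tailAfterPrefix is a hypothetical helper name, not something from the SDK, and it is not an exact replacement for the regular expression (unlike `.+`, it also accepts a remainder that contains line breaks). No timings are claimed here; the loop just mirrors the benchmark so the two can be compared on the same machine.

```dart
/// Hypothetical helper: returns the part of [s] that follows [prefix],
/// or null if [s] does not start with [prefix] or nothing follows it.
String? tailAfterPrefix(String s, String prefix) {
  if (!s.startsWith(prefix) || s.length == prefix.length) return null;
  return s.substring(prefix.length);
}

void main() {
  const prefix = '/foo/bar/baz/';
  const s = '/foo/bar/baz/the_five_boxing_wizards/jump/quickly';
  // Same iteration count as the RegExp benchmark.
  final results = List<String?>.filled(100000, null);
  final sw = Stopwatch()..start();
  for (var i = 0; i < results.length; i += 1) {
    results[i] = tailAfterPrefix(s, prefix);
  }
  print('${sw.elapsedMilliseconds}ms');
}
```

This only applies when the pattern can be expressed with startsWith, endsWith, indexOf and similar; the Norwegian-word pattern in the original report still needs a real regular expression or a hand-written character loop.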
I'm running into the same issue on Flutter, where in the release build (which is AOT-compiled) RegExp performance is heavily degraded (see the profiling below). Parsing the same content took 1476ms (AOT) vs 84ms (JIT).
From the profiling, the culprit seems to be _ExecuteMatchSticky. In other profiling runs the effect was much more pronounced; in hindsight I should have taken screenshots of those.


My flutter doctor output, for the Dart version:
[✓] Flutter (Channel unknown, v1.15.9, on Mac OS X 10.15.4 19E266, locale en-AU)
• Flutter version 1.15.9 at /Users/fwang/Documents/Personal/flutter
• Framework revision cc52a903a8 (4 weeks ago), 2020-03-04 18:59:18 -0800
• Engine revision 810727bf3f
• Dart version 2.8.0 (build 2.8.0-dev.11.0 57462f9ca5)
Am happy to provide more details if needed.
I have another very simple example, where compiled performance is about 85% worse than in the VM:
void main() {
  final List<int> result = List(3000);
  for (int i = 0; i < 15; i++) {
    final start = DateTime.now();
    for (int j = 0; j < 3000; j++) {
      result[j] = i;
    }
    int sum = 0;
    for (int j = 0; j < 3000; j++) {
      sum += result[j];
    }
    // final int sum = result.reduce((t, v) => t + v);
    final end = DateTime.now();
    print('${end.difference(start).inMicroseconds}, $sum');
  }
}
VM results, in microseconds (you can see the JIT warming up in the first 3 rows, that's fine):
93, 0
154, 3000
85, 6000
7, 9000
6, 12000
6, 15000
7, 18000
6, 21000
6, 24000
6, 27000
6, 30000
6, 33000
6, 36000
7, 39000
6, 42000
dart2native results:
13, 3000
12, 6000
11, 9000
11, 12000
12, 15000
11, 18000
11, 21000
11, 24000
11, 27000
11, 30000
11, 33000
11, 36000
11, 39000
13, 42000
P.S. When using final int sum = result.reduce((t, v) => t + v); to calculate the sum instead of the manual loop, it takes 2-8 times longer in the VM and 2 times longer when compiled.
Edit: Using Uint32List (or Uint8List, which is enough in this case) instead of List<int> makes the compiled performance match the VM. Can't the compiler do this optimisation on its own?
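For reference, the typed-list variant described in the edit above could look roughly like the sketch below. It assumes the only change is swapping List<int> for Uint32List from dart:typed_data; fillAndSum is a hypothetical helper name introduced here, and Stopwatch replaces DateTime.now() purely as a measurement convenience. A likely reason for the difference is that a typed list stores fixed-width unboxed integers, but that is an assumption, not something confirmed in this thread.

```dart
import 'dart:typed_data';

/// Hypothetical helper: writes [value] into every slot of [buffer],
/// then returns the sum of all elements.
int fillAndSum(Uint32List buffer, int value) {
  for (var j = 0; j < buffer.length; j++) {
    buffer[j] = value;
  }
  var sum = 0;
  for (var j = 0; j < buffer.length; j++) {
    sum += buffer[j];
  }
  return sum;
}

void main() {
  // Fixed-length list of 32-bit unsigned integers, zero-initialised.
  final result = Uint32List(3000);
  final sw = Stopwatch();
  for (var i = 0; i < 15; i++) {
    sw
      ..reset()
      ..start();
    final sum = fillAndSum(result, i);
    sw.stop();
    // Same output shape as the original: elapsed microseconds, then the sum.
    print('${sw.elapsedMicroseconds}, $sum');
  }
}
```

Note that Uint32List silently truncates values outside the 0 to 2^32 - 1 range, so it is only a drop-in replacement when the stored values are known to fit.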