Currently the pipeline for generating and running modular kernel files for the vm requires a concat step at the very end, to merge all the files into a single output. This is presenting problems when trying to compile large numbers of tests on CI environments such as Travis (example failure).
The problem is that each test in this example ends up producing a 22MB output file, even though its individual kernel module is quite small (low KBs generally). This results in a build cache that is almost 6GB in size for pub.
If the vm could take a list of dill files instead, and read them all in sequentially as if they were a single file (that's all that is needed), then we could skip this entire build step and get caching working for CI environments (as well as using up a lot less disk space on users' machines).
In theory these could also be read in parallel which may speed up the startup time.
cc @kevmoo
Not sure I understand the problem. If an individual test only produces a kernel file of a few KB, what gets concatenated to it to make it grow to 22 MB?
The small file for the test is the kernel module for the test file itself, but none of its dependencies.
To run in the vm we have to also include all the modules for all dependencies, which totals 22 MB (same size as what you get using the monolithic snapshot tool dart --snapshot=out.dill some_test.dart).
So what you are stating is that if you do not have to concatenate then all the dependent modules are shared across the different tests and so the cache is smaller.
Yes, exactly.
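To make the size difference concrete, here is a rough back-of-the-envelope sketch. The test count and per-test module size are hypothetical numbers chosen to land near the ~6GB cache mentioned above, and it assumes every test pulls in the same 22 MB of dependencies, which will not hold exactly in practice.

```python
def cache_sizes_mb(num_tests, test_module_mb, shared_deps_mb):
    """Return (concatenated, shared) cache sizes in MB."""
    # Concatenated: every test output embeds its own copy of all dependencies.
    concatenated = num_tests * (test_module_mb + shared_deps_mb)
    # Shared: dependencies are stored once, plus one small module per test.
    shared = shared_deps_mb + num_tests * test_module_mb
    return concatenated, shared


# Hypothetical inputs: 270 tests, 0.02 MB per test module, 22 MB of dependencies.
concatenated, shared = cache_sizes_mb(270, 0.02, 22)
print(f"concatenated: {concatenated:.0f} MB, shared: {shared:.1f} MB")
```

Under those assumptions the concatenated cache is roughly 5.9 GB, while the shared layout is under 30 MB.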
We do something somewhat similar for Fuchsia. Dart_LoadLibraryFromKernel() (https://github.com/dart-lang/sdk/blob/master/runtime/include/dart_api.h#L3143) can be called multiple times to load partial programs. See: https://fuchsia.googlesource.com/topaz/+/master/runtime/dart_runner/dart_component_controller.cc#161
Note that we could run into issues with too many command line args, so it might make more sense to pass a file that lists all the required dill files, or something along those lines.
Raising the priority on this one. This is significantly impacting the resource usage for users of the new kernel builder. I think an appropriate stopgap solution is to allow multiple kernel files to be passed on the command line, with the long-term solution being a manifest which contains the list of required kernel files.
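For illustration, a build step could produce such a manifest roughly like this. This is only a sketch: the one-path-per-line format, the dependencies-first ordering, the helper names, and the example module graph are all assumptions, not the format the vm actually expects.

```python
from pathlib import Path


def transitive_modules(entry, deps):
    """Return entry plus every module it depends on, dependencies first."""
    ordered, seen = [], set()

    def visit(module):
        if module in seen:
            return
        seen.add(module)
        for dep in deps.get(module, []):
            visit(dep)
        # Append after the dependencies so they come earlier in the list.
        ordered.append(module)

    visit(entry)
    return ordered


def write_manifest(manifest_path, modules):
    """Write one module path per line (hypothetical manifest format)."""
    Path(manifest_path).write_text("".join(f"{m}\n" for m in modules))


# Hypothetical module graph for a single test entrypoint.
example_deps = {
    "test/a_test.vm.dill": ["lib/app.vm.dill", "packages/test/test.vm.dill"],
    "lib/app.vm.dill": ["packages/path/path.vm.dill"],
    "packages/test/test.vm.dill": ["packages/path/path.vm.dill"],
}
example_modules = transitive_modules("test/a_test.vm.dill", example_deps)
```

Each shared module appears once in the list no matter how many modules import it, which is the property that lets the cache be shared across tests.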
@jakemac53 @grouma , what is the process you use to produce separate kernel files for the test and its dependencies?
We are using the build_runner + build_vm_compilers + build_test (needed for tests only) packages. These use the kernel worker snapshot shipped with the vm, combined with a custom module strategy, to compile packages to kernel.
It is essentially the same as the web stack, but with build_vm_compilers instead of build_web_compilers (which uses ddc).
@jakemac53 wrote:
We are using the build_runner + build_vm_compilers + build_test
Can you folks somehow share instructions on how to run this stack so we can test vm implementation of multiple kernels loading?
You would need to add these packages to your pubspec.yaml:
dev_dependencies:
  build_runner: ^0.10.0
  build_vm_compilers: ^0.1.0
Then you can run pub run build_runner build -o <output-dir> and it will create a merged output directory for you. By default this will only compile scripts under the benchmark, bin, test, tool, and example directories, so you will want to put your app entrypoints in one of those.
In the output directory you will see a bunch of .vm.app.dill files which are the merged kernel files - you will also see .vm.dill files which are the individual modules.
Since we can't do it today we don't have a utility to grab all the modules required for a given entrypoint and pass them to the vm, but if you make a relatively small example it shouldn't be difficult to find the required .vm.dill files (they live right next to the dart files generally, although for package imports it depends on the module structure). Let me know if you need help there.
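In the meantime, a small script can at least enumerate the individual module files in the output directory. This is a sketch that relies only on the file extensions described above; it lists every .vm.dill module it finds rather than just the ones a given entrypoint needs, and the function name is made up for illustration.

```python
from pathlib import Path


def find_module_files(output_dir):
    """List individual .vm.dill modules, relative to output_dir.

    Merged outputs end in .vm.app.dill, so the *.vm.dill pattern skips them.
    """
    root = Path(output_dir)
    return sorted(p.relative_to(root).as_posix() for p in root.rglob("*.vm.dill"))
```

Calling find_module_files on the directory passed to -o gives a sorted list of candidate modules that you can then narrow down by hand.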
https://dart-review.googlesource.com/c/sdk/+/67460 is one way of solving this.
That looks simple/effective enough for our purposes
cc @rmacnak-google
The standalone embedder should not accept any new kinds of input. It's already very brittle and damaged from the CFE architecture, and the cost of supporting a new kind of input should not be borne by existing users. If you require special loading, create your own embedder as Fuchsia has done.
If you require special loading, create your own embedder as Fuchsia has done.
This really isn't anything special relating to the build package, we just happen to be one of the first use cases which compiles large numbers of applications (tests specifically). The current implementation doesn't allow any way to share code across those applications, even though kernel supports it.
Given that precompiled kernel is the only way to get reasonable startup performance, I think we should support this out of the box.
we just happen to be one of the first use cases which compiles large numbers of applications (tests specifically)
Our users have been doing this for years, it's not the build system introducing the use case. In Dart 1 everyone used dart to compile and run a large number of applications and performance was fine. Dart 2 tanked the performance of using dart for this use case (compiling and running from source), so we're trying to mitigate that and we're open to discussion about what that mitigation should look like.
It might also be worth looping in engineers that work _within_ the SDK - how do they run 100s or 1000s of Dart-VM-based tests now? Does test.py have something like the logic above itself?
Support for feeding multi-kernel lists to dart landed in https://dart.googlesource.com/sdk/+/41e720b486daab41257ea512e9df19e6dcc73f45. Please take a look and see if it works as you expected.
Awesome thanks!
Thank you!
Revisiting this, trying to see if we can use it.
Turns out I don't think this solves our problem. We can't use absolute paths because then the output isn't hermetic, and relative paths resolved against the cwd hurt usability. Can we consider resolving relative paths against the manifest file's location?
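What I have in mind is roughly the following. This is a sketch of the resolution rule only, not of the vm's actual loader; it assumes a plain one-path-per-line manifest, and the function name is hypothetical.

```python
from pathlib import Path


def resolve_manifest_entries(manifest_path):
    """Resolve each manifest entry against the manifest's own directory."""
    manifest = Path(manifest_path)
    base = manifest.parent
    resolved = []
    for line in manifest.read_text().splitlines():
        entry = line.strip()
        if not entry:
            continue  # ignore blank lines
        path = Path(entry)
        # Absolute entries are kept as-is; relative ones never consult the cwd.
        resolved.append(path if path.is_absolute() else base / path)
    return resolved
```

With this rule the manifest and the dill files can be moved together as a unit, and the result is the same regardless of which directory the vm is launched from.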
I believe this has been fixed. Can we close the issue?
Support was added but not in a way we can use in any hermetic build system. See my previous comment. I reopened the issue to see if we can get support for relative file paths.
https://dart-review.googlesource.com/c/sdk/+/103901 with relative paths fix.