Two problems:
--minify and --omit-implicit-checks today. Our flags will change, and we want that change to be transparent and not a breaking change for build systems.Ideas
--production flag:
To improve the build system integration and prevent having to change tools downstream on every change to our compiler, we were planning to add a new set of flags that will evolve in time. Because the main goal build systems have is to have a flag to build apps at production level, we could focus on a single flag for that, e.g. --production. The means of a production compile may vary as we implement more features. For example, once we implement our optimizations for Dart2, we can deprecate the behavior of omit-implicit-checks, but may introduce a new option to allow unsafe casts in certain framework's code.
So initially --production would be equivalent to --minify --omit-implicit-checks, but later it will be equivalent to --minify --allow-unsafe-cast-operator or something else.
--unsafe-optimizations flag:
One suggestion to address the second problem is to purposely use a naming convention of flags that makes it easier for users to see how safe or unsafe the flag is. For example: it has been proposed to add --unsafe-optimizations and --safe-optimizations flags that take as arguments an optional list of optimizations.
default, all, none, allow-unsafe-cast-operator, lax-runtime-type, omit-parameter-checks, omit-implicit-downcasts, trust-primitivesdefault, all, none, minify, inlining, type-inference, rti, runtime-type (and in the future some SSA optimizer steps can be listed here).default safe optimizations are all but minifydefault unsafe optimizations may only include allow-unsafe-cast-operator initially.Revised --production=value flag:
Another suggestion: allow values to --production:
--production=safe (like --safe-optimizations=all)--production=almost-safe (unsafe-cast-operator, lax-runtime-type-to-string)--production=unsafe (omit-implicit-checks)--production=really-unsafe (trust-primitives)and have --production (with no args) default to --production=safe or --production=almost-safe.
The idea is that we could also provide a warning if an underlying flag is used without using the production flag. For example:
dart2js --production=unsafe # produces no warnings
dart2js --omit-implicit-checks # produces a warning about how you should test your app.
I'm starting to like this last approach. Thoughts?
cc @rakudrama @johnniwinther @natebosch @vsmenon
I find safe, almost-safe, unsafe, really-unsafe quite confusing. Without proper docs these names are almost meaningless.
Names that really tell what they do are much easier to understand.
If dart2js -h supports them as alias for a set of other options it's fine of course
Similar to
dart -h
...
--observe[=<port>[/<bind-address>]]
The observe flag is a convenience flag used to run a program with a
set of options which are often useful for debugging under Observatory.
These options are currently:
--enable-vm-service[=<port>[/<bind-address>]]
--pause-isolates-on-exit
--pause-isolates-on-unhandled-exceptions
--warn-on-pause-with-no-debugger
The concrete options could also be only shown with dart2js -hv with a hint that they are subject to change and to avoid breaking changes and the more generic ones should be used to be on the safe side.
Thanks for the feedback @zoechi!
If
dart2js -hsupports them as alias for a set of other options it's fine of course
Exactly, that's closer to what I had in mind. The documentation might read something like this:
--production=[safe|almost-safe|unsafe|very-unsafe]
Enables several optimizations to produce smaller and performant JavaScript to
deploy to production. This is equivalent to provide several individual flags to dart2js
(details below).
We call some optimizations "unsafe" because they may operate on
some assumptions that can be invalid on some programs. Without the optimization,
dart2js will validate that those assumptions are true, so the best way to ensure
that an unsafe optimization is OK to use, is to throroughly test your application
without the optimization first.
The intent of this `--production` flag is to provide a stable command-line interface
to users and tools. The individual flags enabled here may change with time, but the
production flag will remain stable on future versions of dart2js. Because the indiviual
flags are less stable, dart2js provides a warning if you enable them individually.
--production
--production=safe
Enable all safe optimizations in dart2js.
Today, this is equivalent to pass all the following options:
--minify
// the next few don't exist today, but maybe we should add them? We only have a way to disable them (e.g. --disable-inlining)
--inlining=true
--global-analysis=true
--rti-optimization=true
--production=almost-safe
Enable all safe optimizations and a couple unsafe optimizations in dart2js. At the "almost-safe"
level, we only turn on unsafe optimizations that are very local and are unlikely to cause issues.
It is still important to use unit tests to ensure these optimizations are OK to use.
Today, this is equivalent to pass all the following options:
--production=safe
--allow-unsafe-cast-operator=true
--lax-runtime-type-to-string
--production=unsafe
Enable all safe optimizations and a several unsafe optimizations in dart2js. A the
"unsafe" level, optimizations remove runtime checks that are part of the language semantics.
These unsafe optimizations have a global effect: code you depend on will be affected, even
if that code was not be written with these optimizations in mind.
Today, this is equivalent to pass all the following options:
--production=almost-safe
--omit-parameter-checks
--omit-implicit-downcasts
--production=very-unsafe
Enable all optimizations in dart2js. The "very-unsafe" is similar to the unsafe level, except
that primitive operations are trusted to be correct. This includes assuming non-null values on
number arithmetic, valid ranges on accesses to arrays, among other things. The reason we
consider this a category of its own is that if an invalid assumption is found, errors can be
even harder to understand. Testing should pay close attention to areas of the program that
use primitives heavily or that use arrays heavily.
Today, this is equivalent to pass all the following options:
--production=unsafe
--trust-primitives
We could use other names for safe/unsafe/etc. Basically the idea with the names is to use something that indicates the risk involved with those flags and the level of testing that is necessary.
Other ideas for the names:
safe, unsafe, very-unsafe, very-very-unsafesafe, unsafe-level-0, unsafe-level-1, unsafe-level-2safe, unsafe-0, unsafe-1, unsafe-2safe, unsafe-low-risk, unsafe-medium-risk, unsafe-high-risksafe, unsafe-low, unsafe-medium, unsafe-highsafe, unsafe-likely-ok, unsafe-be-careful, unsafe-too-dangerousHow about --optimizations=[safe|low-risk|medium-risk|high-risk]
My personal opinion is this sounds too much like the Closure Compiler Compilation Levels, which has been a constant source - both internally and externally - of confusion. I don't even expect Google engineers to be able to parse 2 paragraphs of compiler options.
For example, in the above (emphasis mine):
_The reason we consider this a category of its own is that if an invalid assumption is found, errors can be even harder to understand. Testing should pay close attention to areas of the program that use primitives heavily or that use arrays heavily._
I do not know if it is reasonable for developers to understand this. Every program is going to use primitives and arrays heavily. To me this reads like "be careful using Dart if you use this flag".
_--allow-unsafe-cast-operator=true_
This also sounds too much like the Closure compiler. I'd like to avoid users having to check to see if they can use a specific API or library based on what level of compilation flags they need, otherwise libraries will be written that only work when certain flags or enabled or disabled.
Maybe we can re-wind and talk about the _reasons_ for safe and unsafe optimizations:
Remember that the unsafe optimizations will likely be the default for all Google-developed applications. I'd prefer not having to explain to teams why we are recommending they use flags with the word "unsafe" in them.
Looking at the GNU/GCC configuration page:
-Ofast
Disregard strict standards compliance. -Ofast enables all -O3 optimizations. It also enables optimizations that are not valid for all standard-compliant programs. It turns on -ffast-math and the Fortran-specific -fstack-arrays, unless -fmax-stack-var-size is specified, and -fno-protect-parens.
I like this type of wording more. So, how about:
--production == --production=safe (Safe optimizations on)--production=fast (Disregard strict standards compliance)I really appreciate this discussion, it is really helpful.
One important question here is the number of levels. I believe we need more than 2, and we could possibly just use a numeric scheme like GCC (-O0, -O1, -O2, ...).
I'd like to be able to evolve what the optimizations levels do, so we keep a stable flag that with time does less unsafe optimizations. For example, today we need --omit-implicit-checks, but once we fix #34003 and we have a more local way of omitting checks (like unsafe-casts), our hope is that most programs wont need this flag anymore. With a numeric scheme it might be easier to evolve it. We could say that today the levels are like this:
-O0: no optimizations enabled (equivalent to --disable-inlining --disable-type-inference --disable-rti-optimizations-O1: --minify-O2: --omit-implicit-checks-O3: -O2 --trust-primitivesBut later in time we could remove --omit-implicit-checks or move it up a level:
-O0: no optimizations-O1: --minify-O2: -O1 --allow-unsafe-cast-O3: -O2 --omit-implicit-checks --trust-primitivesI do like @matanlurey your suggestions for documentation, however, I'm not sure now how to describe the difference between O2 and O3:
-O0: no optimizations-O1: optimizations that respect standard language compliance -O2: optimizations that may disregard standard language compliance-O3: optimizations that may disregard standard language compliance even more?On a separate note: @rakudrama also pointed out that --lax-runtime-type-to-string is an odd duck: it is not about making well tested programs work, but it actually changes the semantics of runtimeType.toString, making it inconsistent with what other tools do. To some extend, it is not that different from the effect --minify could have on runtimeType.toString, so we need to think how we want to toggle it in the future. I hope we can have it on by default, to be honest.
Thanks!
@sigmundch:
One important question here is the number of levels. I believe we need more than 2, and we could possibly just use a numeric scheme like GCC
I think having them is fine. I'd prefer that we have a maximum of 2 sensible defaults - i.e. one for users with lax testing, and one for users with extensive testing (i.e. Google apps). But having more than 2 for users that fall in-between that spectrum seems OK.
With a numeric scheme it might be easier to evolve it.
I really like that, and it actually gives me a lot more appreciation for what you were trying to point out with the various safety levels - I was confused until I sort of mentally mapped them to a numerical hierarchy.
O0 = No optimizations, ideal for say, DartPad, or users that don't use DDC (i.e. for Safari?)O1 = Safe optimizationsO2 = ?O3 = Unsafe optimizationsO4 = For testing only (lets your team experiment with more "extreme" optimizations)... how to describe the difference between O2 and O3:
Yeah. That's why I wonder if its important to have that distinction. Will anyone want to run with --omit-implicit-checks but _not_ --trust-primitives, for example? If so, could they just specify the flags themselves and not rely on an optimization group?
On a separate note: @rakudrama also pointed out that --lax-runtime-type-to-string is an odd duck
The only reason I think it might be significant is your work on the deobsfucator. For example, if an error reporting service logs C<X> is not a D<Z>, and you're able to deobsfucate to Iterable<dynamic> is not a List<String>, then its useful, right?
@vsm had an interesting suggestion on wording:
O2/O3 Performs optimizations that assume the program doesn't ever throw any subclass of Error.
O2 for us is about TypeError, O3 includes ArgumentError, NoSuchMethodError on null, and RangeError produced from the corelibs.
... Will anyone want to run with --omit-implicit-checks but not --trust-primitives, for example?
I believe so, yes. Today only a handful of teams use the latter. Externally, we were inclined to enable omit-implicit-checks by default on package:build, but not trust-primitives.
... use primitives heavily or that use arrays heavily.
I do not know if it is reasonable for developers to understand this. Every program is going to use primitives and arrays heavily. To me this reads like "be careful using Dart if you use this flag".
Thinking more about this, the reason I'm so hesitant on grouping implicit-checks and trust-primitives in the same bucket is that the latter is more susceptible to invalid user input. So when I think of testing for omit-implicit-checks, normally DDC unit tests, plus some integration tests for the RPC layer tends to be enough to ensure that the app will be fine. With trust-primitives, I believe developers need to be more cautious about how different user inputs may affect their code and think more about off-by-one errors and such.
At the same time - once we implement #34003, I feel more comfortable grouping them together because it will be less likely for users to need to use omit-implicit-checks altogether.
Trying to put together all suggestions thus far, here is an updated proposal:
-O -O<0,1,2,3>
Controls optimizations that can help reduce code-size and improve
performance of the generated code for deployment.
-O0
Disables all optimizations. Equivalent to calling dart2js with these extra flags:
--disable-inlining
--disable-type-inference
--disable-rti-optimizations
-O
-O1
Enables optimizations that respect the language semantics and are safe for all
programs. It however changes the string representation of types, which will no
longer be consistent with the Dart VM or DDC.
Equivalent to calling dart2js with these extra flags:
--minify
--lax-runtime-type-to-string
-O2
Enables optimizations that respect the language semantics only on programs that
don't ever throw any subtype of `Error`. To use this option, we recommend that
you properly test your application first without it, and ensure that no subtype
of `Error` (such as `TypeError`) is ever thrown.
Equivalent to calling dart2js with these extra flags:
-O1
--allow-unsafe-cast
--omit-implicit-checks // for now
-O3
Enables more aggressive optimizations than -O2, but with the same assumptions.
These optimizations are on a separate group because they are more likely to be
affected by variations in input data. To use this option we recommend to pay
additional attention on testing of edge cases in user input.
Equivalent to calling dart2js with these extra flags:
-O2
--trust-primitives
(my apologies @vsm, I meant to mention @vsmenon)
馃憤 馃憤 馃憤
The above wording looks a lot better, and makes a lot more sense to me.
I'll leave the rest of you to figure out the specifics, though.
Should it be mentioned in the help text that -0x options are considered to be stable, while the individual options are subject to change?
Yes, thanks - I think I was too subtle in how I described it, so I sent a new CL to make it clearer: https://dart-review.googlesource.com/c/sdk/+/68140
I was also planning on changing the individual flags to emit a warning telling users that those individual flags are unstable and that if they want a stable alternative that they should use -O*
Most helpful comment
@vsm had an interesting suggestion on wording:
O2 for us is about TypeError, O3 includes ArgumentError, NoSuchMethodError on null, and RangeError produced from the corelibs.
I believe so, yes. Today only a handful of teams use the latter. Externally, we were inclined to enable omit-implicit-checks by default on package:build, but not trust-primitives.
Thinking more about this, the reason I'm so hesitant on grouping implicit-checks and trust-primitives in the same bucket is that the latter is more susceptible to invalid user input. So when I think of testing for omit-implicit-checks, normally DDC unit tests, plus some integration tests for the RPC layer tends to be enough to ensure that the app will be fine. With trust-primitives, I believe developers need to be more cautious about how different user inputs may affect their code and think more about off-by-one errors and such.
At the same time - once we implement #34003, I feel more comfortable grouping them together because it will be less likely for users to need to use omit-implicit-checks altogether.
Trying to put together all suggestions thus far, here is an updated proposal: