Sdk: dart2js: define flags for a "production-mode"

Created on 1 Aug 2018  路  12Comments  路  Source: dart-lang/sdk

Two problems:

  • Build systems automatically pass flags to dart2js to make it easy for developers to get the best of dart2js production capabilities out of the box without any configuration: this may include compiling with --minify and --omit-implicit-checks today. Our flags will change, and we want that change to be transparent and not a breaking change for build systems.
  • We want users to be aware of how safe/unsafe certain flags are. Documentation is not enough for this.

Ideas

--production flag:
To improve the build system integration and prevent having to change tools downstream on every change to our compiler, we were planning to add a new set of flags that will evolve in time. Because the main goal build systems have is to have a flag to build apps at production level, we could focus on a single flag for that, e.g. --production. The means of a production compile may vary as we implement more features. For example, once we implement our optimizations for Dart2, we can deprecate the behavior of omit-implicit-checks, but may introduce a new option to allow unsafe casts in certain framework's code.
So initially --production would be equivalent to --minify --omit-implicit-checks, but later it will be equivalent to --minify --allow-unsafe-cast-operator or something else.

--unsafe-optimizations flag:

One suggestion to address the second problem is to purposely use a naming convention of flags that makes it easier for users to see how safe or unsafe the flag is. For example: it has been proposed to add --unsafe-optimizations and --safe-optimizations flags that take as arguments an optional list of optimizations.

  • Accepted unsafe optimizations: default, all, none, allow-unsafe-cast-operator, lax-runtime-type, omit-parameter-checks, omit-implicit-downcasts, trust-primitives
  • Accepted safe optimizations: default, all, none, minify, inlining, type-inference, rti, runtime-type (and in the future some SSA optimizer steps can be listed here).
  • The default safe optimizations are all but minify
  • The default unsafe optimizations may only include allow-unsafe-cast-operator initially.

Revised --production=value flag:

Another suggestion: allow values to --production:

  • --production=safe (like --safe-optimizations=all)
  • --production=almost-safe (unsafe-cast-operator, lax-runtime-type-to-string)
  • --production=unsafe (omit-implicit-checks)
  • --production=really-unsafe (trust-primitives)

and have --production (with no args) default to --production=safe or --production=almost-safe.

The idea is that we could also provide a warning if an underlying flag is used without using the production flag. For example:

dart2js --production=unsafe  # produces no warnings
dart2js --omit-implicit-checks  # produces a warning about how you should test your app.

I'm starting to like this last approach. Thoughts?

cc @rakudrama @johnniwinther @natebosch @vsmenon

P1 web-dart2js

Most helpful comment

@vsm had an interesting suggestion on wording:

O2/O3 Performs optimizations that assume the program doesn't ever throw any subclass of Error.

O2 for us is about TypeError, O3 includes ArgumentError, NoSuchMethodError on null, and RangeError produced from the corelibs.

... Will anyone want to run with --omit-implicit-checks but not --trust-primitives, for example?

I believe so, yes. Today only a handful of teams use the latter. Externally, we were inclined to enable omit-implicit-checks by default on package:build, but not trust-primitives.

... use primitives heavily or that use arrays heavily.
I do not know if it is reasonable for developers to understand this. Every program is going to use primitives and arrays heavily. To me this reads like "be careful using Dart if you use this flag".

Thinking more about this, the reason I'm so hesitant on grouping implicit-checks and trust-primitives in the same bucket is that the latter is more susceptible to invalid user input. So when I think of testing for omit-implicit-checks, normally DDC unit tests, plus some integration tests for the RPC layer tends to be enough to ensure that the app will be fine. With trust-primitives, I believe developers need to be more cautious about how different user inputs may affect their code and think more about off-by-one errors and such.

At the same time - once we implement #34003, I feel more comfortable grouping them together because it will be less likely for users to need to use omit-implicit-checks altogether.

Trying to put together all suggestions thus far, here is an updated proposal:

-O -O<0,1,2,3>
   Controls optimizations that can help reduce code-size and improve
   performance of the generated code for deployment. 

  -O0
    Disables all optimizations. Equivalent to calling dart2js with these extra flags:
      --disable-inlining
      --disable-type-inference
      --disable-rti-optimizations

  -O
  -O1
    Enables optimizations that respect the language semantics and are safe for all
    programs. It however changes the string representation of types, which will no
    longer be consistent with the Dart VM or DDC.

    Equivalent to calling dart2js with these extra flags:
      --minify
      --lax-runtime-type-to-string

  -O2
    Enables optimizations that respect the language semantics only on programs that
    don't ever throw any subtype of `Error`. To use this option, we recommend that
    you properly test your application first without it, and ensure that no subtype
    of `Error` (such as `TypeError`) is ever thrown.

    Equivalent to calling dart2js with these extra flags:
      -O1
      --allow-unsafe-cast
      --omit-implicit-checks // for now

  -O3
    Enables more aggressive optimizations than -O2, but with the same assumptions.
    These optimizations are on a separate group because they are more likely to be
    affected by variations in input data. To use this option we recommend to pay
    additional attention on testing of edge cases in user input.

    Equivalent to calling dart2js with these extra flags:
      -O2
      --trust-primitives

All 12 comments

I find safe, almost-safe, unsafe, really-unsafe quite confusing. Without proper docs these names are almost meaningless.
Names that really tell what they do are much easier to understand.

If dart2js -h supports them as alias for a set of other options it's fine of course
Similar to

dart -h
...
--observe[=<port>[/<bind-address>]]
  The observe flag is a convenience flag used to run a program with a
  set of options which are often useful for debugging under Observatory.
  These options are currently:
      --enable-vm-service[=<port>[/<bind-address>]]
      --pause-isolates-on-exit
      --pause-isolates-on-unhandled-exceptions
      --warn-on-pause-with-no-debugger

The concrete options could also be only shown with dart2js -hv with a hint that they are subject to change and to avoid breaking changes and the more generic ones should be used to be on the safe side.

Thanks for the feedback @zoechi!

If dart2js -h supports them as alias for a set of other options it's fine of course

Exactly, that's closer to what I had in mind. The documentation might read something like this:

--production=[safe|almost-safe|unsafe|very-unsafe]

  Enables several optimizations to produce smaller and performant JavaScript to
  deploy to production. This is equivalent to provide several individual flags to dart2js
  (details below).

  We call some optimizations "unsafe" because they may operate on 
  some assumptions that can be invalid on some programs. Without the optimization,
  dart2js will validate that those assumptions are true, so the best way to ensure
  that an unsafe optimization is OK to use, is to throroughly test your application
  without the optimization first.

  The intent of this `--production` flag is to provide a stable command-line interface
  to users and tools. The individual flags enabled here may change with time, but the
  production flag will remain stable on future versions of dart2js. Because the indiviual
  flags are less stable, dart2js provides a warning if you enable them individually.

--production
--production=safe

  Enable all safe optimizations in dart2js.

  Today, this is equivalent to pass all the following options:
      --minify
      // the next few don't exist today, but maybe we should add them? We only have a way to disable them (e.g. --disable-inlining)
      --inlining=true
      --global-analysis=true
      --rti-optimization=true

--production=almost-safe

  Enable all safe optimizations and a couple unsafe optimizations in dart2js. At the "almost-safe"
  level, we only turn on unsafe optimizations that are very local and are unlikely to cause issues.
  It is still important to use unit tests to ensure these optimizations are OK to use.
  Today, this is equivalent to pass all the following options:
      --production=safe
      --allow-unsafe-cast-operator=true
      --lax-runtime-type-to-string

--production=unsafe

  Enable all safe optimizations and a several unsafe optimizations in dart2js. A the
  "unsafe" level, optimizations remove runtime checks that are part of the language semantics.
  These unsafe optimizations have a global effect: code you depend on will be affected, even
  if that code was not be written with these optimizations in mind.

  Today, this is equivalent to pass all the following options:
      --production=almost-safe
      --omit-parameter-checks
      --omit-implicit-downcasts

--production=very-unsafe

  Enable all optimizations in dart2js. The "very-unsafe" is similar to the unsafe level, except
  that primitive operations are trusted to be correct. This includes assuming non-null values on
  number arithmetic, valid ranges on accesses to arrays, among other things. The reason we
  consider this a category of its own is that if an invalid assumption is found, errors can be
  even harder to understand. Testing should pay close attention to areas of the program that
  use primitives heavily or that use arrays heavily.

  Today, this is equivalent to pass all the following options:
      --production=unsafe
      --trust-primitives

We could use other names for safe/unsafe/etc. Basically the idea with the names is to use something that indicates the risk involved with those flags and the level of testing that is necessary.

Other ideas for the names:

  • safe, unsafe, very-unsafe, very-very-unsafe
  • safe, unsafe-level-0, unsafe-level-1, unsafe-level-2
  • safe, unsafe-0, unsafe-1, unsafe-2
  • safe, unsafe-low-risk, unsafe-medium-risk, unsafe-high-risk
  • safe, unsafe-low, unsafe-medium, unsafe-high
  • safe, unsafe-likely-ok, unsafe-be-careful, unsafe-too-dangerous

How about --optimizations=[safe|low-risk|medium-risk|high-risk]

My personal opinion is this sounds too much like the Closure Compiler Compilation Levels, which has been a constant source - both internally and externally - of confusion. I don't even expect Google engineers to be able to parse 2 paragraphs of compiler options.

For example, in the above (emphasis mine):

_The reason we consider this a category of its own is that if an invalid assumption is found, errors can be even harder to understand. Testing should pay close attention to areas of the program that use primitives heavily or that use arrays heavily._

I do not know if it is reasonable for developers to understand this. Every program is going to use primitives and arrays heavily. To me this reads like "be careful using Dart if you use this flag".

_--allow-unsafe-cast-operator=true_

This also sounds too much like the Closure compiler. I'd like to avoid users having to check to see if they can use a specific API or library based on what level of compilation flags they need, otherwise libraries will be written that only work when certain flags or enabled or disabled.


Maybe we can re-wind and talk about the _reasons_ for safe and unsafe optimizations:

  • Safe optimizations will guarantee your program works identically in the Dart VM and Dart2JS (?)
  • Unsafe optimizations will guarantee a _correct_ and _well-tested_ program works identically. For example, it will be expected that user tests the app fully in either DDC or Dart2JS without unsafe optimizations _before_ deploying with unsafe optimizations.

Remember that the unsafe optimizations will likely be the default for all Google-developed applications. I'd prefer not having to explain to teams why we are recommending they use flags with the word "unsafe" in them.


Looking at the GNU/GCC configuration page:

-Ofast
Disregard strict standards compliance. -Ofast enables all -O3 optimizations. It also enables optimizations that are not valid for all standard-compliant programs. It turns on -ffast-math and the Fortran-specific -fstack-arrays, unless -fmax-stack-var-size is specified, and -fno-protect-parens.

I like this type of wording more. So, how about:

  • --production == --production=safe (Safe optimizations on)
  • --production=fast (Disregard strict standards compliance)

I really appreciate this discussion, it is really helpful.

One important question here is the number of levels. I believe we need more than 2, and we could possibly just use a numeric scheme like GCC (-O0, -O1, -O2, ...).

I'd like to be able to evolve what the optimizations levels do, so we keep a stable flag that with time does less unsafe optimizations. For example, today we need --omit-implicit-checks, but once we fix #34003 and we have a more local way of omitting checks (like unsafe-casts), our hope is that most programs wont need this flag anymore. With a numeric scheme it might be easier to evolve it. We could say that today the levels are like this:

  • -O0: no optimizations enabled (equivalent to --disable-inlining --disable-type-inference --disable-rti-optimizations
  • -O1: --minify
  • -O2: --omit-implicit-checks
  • -O3: -O2 --trust-primitives

But later in time we could remove --omit-implicit-checks or move it up a level:

  • -O0: no optimizations
  • -O1: --minify
  • -O2: -O1 --allow-unsafe-cast
  • -O3: -O2 --omit-implicit-checks --trust-primitives

I do like @matanlurey your suggestions for documentation, however, I'm not sure now how to describe the difference between O2 and O3:

  • -O0: no optimizations
  • -O1: optimizations that respect standard language compliance
  • -O2: optimizations that may disregard standard language compliance
  • -O3: optimizations that may disregard standard language compliance even more?

On a separate note: @rakudrama also pointed out that --lax-runtime-type-to-string is an odd duck: it is not about making well tested programs work, but it actually changes the semantics of runtimeType.toString, making it inconsistent with what other tools do. To some extend, it is not that different from the effect --minify could have on runtimeType.toString, so we need to think how we want to toggle it in the future. I hope we can have it on by default, to be honest.

Thanks!

@sigmundch:

One important question here is the number of levels. I believe we need more than 2, and we could possibly just use a numeric scheme like GCC

I think having them is fine. I'd prefer that we have a maximum of 2 sensible defaults - i.e. one for users with lax testing, and one for users with extensive testing (i.e. Google apps). But having more than 2 for users that fall in-between that spectrum seems OK.

With a numeric scheme it might be easier to evolve it.

I really like that, and it actually gives me a lot more appreciation for what you were trying to point out with the various safety levels - I was confused until I sort of mentally mapped them to a numerical hierarchy.

  • O0 = No optimizations, ideal for say, DartPad, or users that don't use DDC (i.e. for Safari?)
  • O1 = Safe optimizations
  • O2 = ?
  • O3 = Unsafe optimizations
  • O4 = For testing only (lets your team experiment with more "extreme" optimizations)

... how to describe the difference between O2 and O3:

Yeah. That's why I wonder if its important to have that distinction. Will anyone want to run with --omit-implicit-checks but _not_ --trust-primitives, for example? If so, could they just specify the flags themselves and not rely on an optimization group?

On a separate note: @rakudrama also pointed out that --lax-runtime-type-to-string is an odd duck

The only reason I think it might be significant is your work on the deobsfucator. For example, if an error reporting service logs C<X> is not a D<Z>, and you're able to deobsfucate to Iterable<dynamic> is not a List<String>, then its useful, right?

@vsm had an interesting suggestion on wording:

O2/O3 Performs optimizations that assume the program doesn't ever throw any subclass of Error.

O2 for us is about TypeError, O3 includes ArgumentError, NoSuchMethodError on null, and RangeError produced from the corelibs.

... Will anyone want to run with --omit-implicit-checks but not --trust-primitives, for example?

I believe so, yes. Today only a handful of teams use the latter. Externally, we were inclined to enable omit-implicit-checks by default on package:build, but not trust-primitives.

... use primitives heavily or that use arrays heavily.
I do not know if it is reasonable for developers to understand this. Every program is going to use primitives and arrays heavily. To me this reads like "be careful using Dart if you use this flag".

Thinking more about this, the reason I'm so hesitant on grouping implicit-checks and trust-primitives in the same bucket is that the latter is more susceptible to invalid user input. So when I think of testing for omit-implicit-checks, normally DDC unit tests, plus some integration tests for the RPC layer tends to be enough to ensure that the app will be fine. With trust-primitives, I believe developers need to be more cautious about how different user inputs may affect their code and think more about off-by-one errors and such.

At the same time - once we implement #34003, I feel more comfortable grouping them together because it will be less likely for users to need to use omit-implicit-checks altogether.

Trying to put together all suggestions thus far, here is an updated proposal:

-O -O<0,1,2,3>
   Controls optimizations that can help reduce code-size and improve
   performance of the generated code for deployment. 

  -O0
    Disables all optimizations. Equivalent to calling dart2js with these extra flags:
      --disable-inlining
      --disable-type-inference
      --disable-rti-optimizations

  -O
  -O1
    Enables optimizations that respect the language semantics and are safe for all
    programs. It however changes the string representation of types, which will no
    longer be consistent with the Dart VM or DDC.

    Equivalent to calling dart2js with these extra flags:
      --minify
      --lax-runtime-type-to-string

  -O2
    Enables optimizations that respect the language semantics only on programs that
    don't ever throw any subtype of `Error`. To use this option, we recommend that
    you properly test your application first without it, and ensure that no subtype
    of `Error` (such as `TypeError`) is ever thrown.

    Equivalent to calling dart2js with these extra flags:
      -O1
      --allow-unsafe-cast
      --omit-implicit-checks // for now

  -O3
    Enables more aggressive optimizations than -O2, but with the same assumptions.
    These optimizations are on a separate group because they are more likely to be
    affected by variations in input data. To use this option we recommend to pay
    additional attention on testing of edge cases in user input.

    Equivalent to calling dart2js with these extra flags:
      -O2
      --trust-primitives

(my apologies @vsm, I meant to mention @vsmenon)

馃憤 馃憤 馃憤

The above wording looks a lot better, and makes a lot more sense to me.

I'll leave the rest of you to figure out the specifics, though.

Should it be mentioned in the help text that -0x options are considered to be stable, while the individual options are subject to change?

Yes, thanks - I think I was too subtle in how I described it, so I sent a new CL to make it clearer: https://dart-review.googlesource.com/c/sdk/+/68140

I was also planning on changing the individual flags to emit a warning telling users that those individual flags are unstable and that if they want a stable alternative that they should use -O*

Was this page helpful?
0 / 5 - 0 ratings