Omr: Rework Options Processing Framework in OMR Compiler Technology

Created on 9 Aug 2018  ·  7 Comments  ·  Source: eclipse/omr

Reworking the options processing in OMR Compiler Technology has been discussed a number of times in the past. Now that OMR is an open source project, it is more important than ever to rework the options processing framework so that new developers find it easier to work with.

This issue will be used to track the progress and discussions surrounding this rework.

Some limitations of the current options processing framework

  1. TR::Options does far more work than it should. It handles command-line parsing, trace and verbose log handling, option sets, regex handling, and more, all of which should be split into more specialized classes.
  2. The compiler uses a variety of paths to answer whether a particular feature is enabled, and each path has its own way of controlling that answer. For example, Options, Optimizer, Compilation, FrontEnd, and CodeGenerator are all asked whether features are enabled or disabled, often delegating to one another to answer the question. This makes it very hard for a developer to track down which component answers the question for a particular feature.
  3. The option names do not follow a consistent pattern. The lack of a consistent naming scheme can confuse developers who want to add new options. For example, a pair of options such as TR_EnableFeatureX and TR_DisableFeatureX could be replaced with a single TR_FeatureX option.
  4. Options are not available very early in the compiler initialization, preventing us from being able to control certain aspects of the compiler initialization due to not being able to query TR::Options.
  5. OMROptions contains a number of OpenJ9-specific options, which should not be the case. Allowing this practice to continue could make the kitchen sink problem in OMROptions worse as more projects make use of the OMR compiler.
  6. Option table items need to be sorted in alphabetical order for the options to work properly.
  7. There is no way for a user to output the list of options that are in effect/turned off.
  8. Lack of an option compatibility checking mechanism.
  9. Adding new options isn't always easy or straightforward because documentation on this topic is lacking. There are different environment variables for setting compiler options, with no clear guidelines on their usage.
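
To illustrate why limitation 6 is fragile, here is a minimal sketch that assumes the option table is searched with a binary search (this issue does not say how the real lookup works, so treat that as an assumption). `OptionEntry`, `findOption`, `isSortedByName`, and the table contents are all hypothetical names, not the OMR implementation.

```cpp
#include <algorithm>
#include <array>
#include <cassert>
#include <cstddef>
#include <cstring>

// Hypothetical table entry; not the real OMR option table layout.
struct OptionEntry {
    const char *name;
    int id;
};

// Binary search by name. The result is only correct if `table` is sorted by name,
// which is the ordering requirement a table like this silently depends on.
template <std::size_t N>
const OptionEntry *findOption(const std::array<OptionEntry, N> &table, const char *name) {
    assert(name != nullptr);
    auto it = std::lower_bound(
        table.begin(), table.end(), name,
        [](const OptionEntry &e, const char *key) { return std::strcmp(e.name, key) < 0; });
    if (it != table.end() && std::strcmp(it->name, name) == 0)
        return &*it;
    return nullptr;
}

// Returns true when the table satisfies the ordering that findOption relies on.
template <std::size_t N>
bool isSortedByName(const std::array<OptionEntry, N> &table) {
    return std::is_sorted(
        table.begin(), table.end(),
        [](const OptionEntry &a, const OptionEntry &b) { return std::strcmp(a.name, b.name) < 0; });
}

// Illustrative table, already in alphabetical order.
const std::array<OptionEntry, 3> sortedTable = {{
    {"disableInlining", 0},
    {"enableTracing", 1},
    {"verbose", 2},
}};
```

Under that assumption, a check like `isSortedByName` at startup (or sorting the table once during initialization) would turn a silent lookup failure into an explicit one.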

Some desired characteristics of the reworked options processing framework

  1. Compiler features/options encapsulated in one place to be queried from.
  2. Allow new options to be added easily and a consistent way of enabling/disabling them. For example, one should not have to create separate “enable” and “disable” options for the same piece of functionality.
  3. Options should be settable from a variety of sources:

    • JIT command-line

    • an options file (for more involved options settings)

    • through environment variables

  4. Command-line option and environment variable settings should be available very early (i.e., JIT startup) and very late (i.e., JIT shutdown) in the compilation process.
  5. Applying options to certain methods (or regexes) or certain optimization levels (or regexes) is a unique and powerful feature with OMR Compiler technology. This functionality should remain, but its performance should improve.
  6. No matter how the new options handling framework is implemented, its performance should be comparable to that of the existing framework.
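
As a rough illustration of characteristic 5, the sketch below resolves the options for one method by starting from the global settings and applying the first option set whose filter matches the method name. Everything here is hypothetical: `OptionSet`, `EffectiveOptions`, and `resolveFor` are made-up names, and `std::regex` stands in for the compiler's own filter syntax, which is different.

```cpp
#include <cassert>
#include <regex>
#include <string>
#include <vector>

// Hypothetical option set: settings that apply only to methods matching a filter.
struct OptionSet {
    std::regex methodFilter;
    bool disableInlining;
};

// Hypothetical resolved options for a single compilation.
struct EffectiveOptions {
    bool disableInlining = false;
};

// Start from the global options, then apply the first option set whose filter
// matches the whole method name. Later sets are ignored once one matches.
EffectiveOptions resolveFor(const std::string &methodName,
                            const EffectiveOptions &global,
                            const std::vector<OptionSet> &sets) {
    assert(!methodName.empty());
    EffectiveOptions result = global;
    for (const OptionSet &set : sets) {
        if (std::regex_match(methodName, set.methodFilter)) {
            result.disableInlining = set.disableInlining;
            break;
        }
    }
    return result;
}
```

Resolving once per compilation like this keeps the regex matching off the path of individual option queries, which is one possible way to meet the performance goal in characteristic 6.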

High level proposals for addressing the limitations

Below are some high-level descriptions of tasks to be completed to address each of the limitations of the current options processing framework. Proposals for addressing limitations 5, 6, and 7 will be made after we see the outcome of addressing limitations 1, 2, and 4, as it is too early to decide on a solution for them.

  1. Break TR::Options into more specialized classes.
  2. Create a central feature querying mechanism that can answer all options-related queries, whether they are global or thread-specific.
  3. Rename (or merge) options such as TR_EnableFeatureX and TR_DisableFeatureX into one TR_FeatureX.
  4. The options string will be provided to the compiler during initialization of CompilerEnv, and processed very early into objects containing options info.
  5. TBD
  6. TBD
  7. TBD
  8. This needs more discussion to decide whether it is worth solving, as maintaining sets of incompatible options is going to be a very tedious task.
  9. Once the new options processing framework is in place, write up a document (or expand existing documentation on options) to make it easier for new developers to understand how options processing works, and how to work with them.

Roadmap

The reworked options processing framework is going to be implemented in multiple phases to make the transition easier. In addition to linking the items below to their respective issues and PRs, this section will be expanded with further steps along the path towards an improved options processing framework after more discussion around the topic has taken place.

The first step

  1. Major refactoring that involves splitting the responsibilities of the current options processing framework into specialized classes.
  2. Implement and make use of a central option querying mechanism throughout the compiler, and verify that the changes did not cause any performance degradation.

Community input

Reworking the options processing framework is going to affect most community members involved directly or indirectly with OMR Compiler Technology. Therefore, your input on the matter is very important. Feel free to talk about what you would or wouldn't want to see in the reworked options processing world. The input received within a week will be used to formulate the initial design. In the meantime, progress can be made with the first steps above to get the code into better shape.

compiler epic

All 7 comments

Thanks for writing this up. Lots of good information here. The following are my initial thoughts:

  • Why and how much do we care about performance? To me options processing is something that happens either once per process or once per compilation and is fairly inexpensive.

    • How would we measure performance regressions in this area?

  • A characteristic I would like to see added to the list is the ability to alias option names

    • The reason I ask for this is that addressing proposal 3 is going to break some downstream projects such as OpenJ9, as they have publicly documented supported options in existing releases and users will expect the same options to work in the next release.

    • Just to clarify, by option names I mean the ones you may be supplying on the command line. TR_EnableFeatureX and TR_DisableFeatureX can still be folded into TR_FeatureX as this is a C++ entity; however, the user should not lose the ability to supply disableFeatureX. It should just map down (alias) to whatever the new system is.

  • We should consider eliminating the TR_ prefix
  • Another characteristic of the new system would be an ability to provide/print help documentation for a particular option so users understand what an option does and how to use it beyond just making assumptions from the option name

@fjeremic

Why and how much do we care about performance?

I think the main concern has to do with the performance of looking up whether an option has been enabled or not, as opposed to the performance of just parsing the options. Currently, this kind of lookup is done by checking whether a specific option bit is set, which is quite efficient. Ideally, the new option handling mechanism should support comparably efficient lookup.
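
For reference, a bit-based lookup of the kind described above can be sketched as follows. This is not the existing OMR code; `Feature` and `FeatureSet` are hypothetical names, and the only point is that a query costs a single bit test no matter how the option was spelled when it was parsed.

```cpp
#include <bitset>
#include <cassert>
#include <cstddef>

// Hypothetical feature identifiers; not the real OMR option enum.
// Count is a sentinel giving the number of features, not a feature itself.
enum class Feature : std::size_t { Inlining, LoopUnrolling, Tracing, Count };

class FeatureSet {
public:
    // Called by the parser once per option while processing the option string.
    void set(Feature f, bool on) { bits_.set(index(f), on); }

    // Called on the hot path: one bit test, no string comparison.
    bool isEnabled(Feature f) const { return bits_.test(index(f)); }

private:
    static std::size_t index(Feature f) {
        assert(f != Feature::Count);
        return static_cast<std::size_t>(f);
    }

    std::bitset<static_cast<std::size_t>(Feature::Count)> bits_;
};
```

Any new design that keeps queries at roughly this cost should satisfy the lookup concern, whatever the parsing side ends up looking like.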

Of course, the performance of actually parsing and processing the options is also important as it can affect start-up. However, as you pointed out, this should be fairly inexpensive and is less likely to be problematic.

@fjeremic Thank you for sharing your thoughts about this.

Why and how much do we care about performance?

While the cost of processing options into the global Options and per-compilation Options may not affect overall performance too much, the cost of looking up options has to remain minimal because of how frequently options are queried at run-time. I believe any performance regression resulting from the rework can be measured with the existing benchmarking tools that are used for perf monitoring across different versions (please correct me if I have the wrong idea about it since I have not done this myself yet).

ability to alias option names

The ability to apply existing options correctly is a very important requirement and something I should have specified above. Thank you for bringing this up. When I talked about merging TR_EnableFeatureX and TR_DisableFeatureX into TR_FeatureX, I meant how they would be stored in the new structure holding options/feature info, not that users would lose the ability to pass in -Xjit:disableFeatureX. The command line and environment options parser would take care of applying that appropriately by setting TR_FeatureX to off.
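
A minimal sketch of that aliasing idea, with made-up names throughout (`Feature`, `Alias`, `aliasTable`, and `applyOption` are hypothetical, and the real parser would also have to handle the -Xjit: prefix, option values, and option sets): the user-facing spellings stay as they are, and each one maps to a single stored feature plus the value to assign.

```cpp
#include <cassert>
#include <map>
#include <string>

// Hypothetical internal features: one entry per feature, not one per spelling.
enum class Feature { FeatureX, FeatureY };

// What a user-facing option name means internally.
struct Alias {
    Feature feature;
    bool value;
};

// Hypothetical alias table: both existing spellings remain accepted.
const std::map<std::string, Alias> &aliasTable() {
    static const std::map<std::string, Alias> table = {
        {"enableFeatureX", {Feature::FeatureX, true}},
        {"disableFeatureX", {Feature::FeatureX, false}},
        {"enableFeatureY", {Feature::FeatureY, true}},
        {"disableFeatureY", {Feature::FeatureY, false}},
    };
    return table;
}

// Applies one already-tokenized option name to the stored state.
// Returns false for names that are not in the alias table.
bool applyOption(const std::string &token, std::map<Feature, bool> &state) {
    assert(!token.empty());
    auto it = aliasTable().find(token);
    if (it == aliasTable().end())
        return false;
    state[it->second.feature] = it->second.value;
    return true;
}
```

With this shape, passing disableFeatureX and then enableFeatureX leaves FeatureX on, because both names write to the same stored entry.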

We should consider eliminating the TR_ prefix

I think eliminating the TR_ prefix for options would make the option names shorter and slightly easier to work with. Some option names are very long, and the prefix just makes the problem worse (e.g., TR_DisableReducedPriorityForCustomMethodHandleThunks). I am curious about why the option names have been prefixed with TR_ in the first place.

ability to provide/print help documentation for a particular option

I think that can be added as another limitation of the current options processing framework, or perhaps limitation 7 could be reworded with this limitation added. This would definitely be very useful. How do you envision such a feature implemented and used? An example of such an implementation I can think of right now: some projects take an option like -h<option-name> in the CLI and that outputs the help text related to the option, as well as the possible values.
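
One possible shape for this, purely as a sketch: keep the help text next to each option's definition so that a -h<option-name> style query can print it. `OptionDoc`, `optionDocs`, `helpFor`, and the sample entries are hypothetical, and the output format is only an example.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical per-option documentation record.
struct OptionDoc {
    std::string name;    // name as typed by the user
    std::string values;  // accepted values, for display only
    std::string help;    // one-line description
};

// Illustrative entries; a real table would be generated from the option definitions.
const std::vector<OptionDoc> &optionDocs() {
    static const std::vector<OptionDoc> docs = {
        {"featureX", "on|off", "Hypothetical feature used for illustration."},
        {"verbose", "on|off", "Print extra diagnostic output."},
    };
    return docs;
}

// Builds the text that a help query for one option would print.
std::string helpFor(const std::string &name) {
    for (const OptionDoc &doc : optionDocs()) {
        if (doc.name == name) {
            // Invariant of the table: every documented option carries help text.
            assert(!doc.help.empty());
            return doc.name + " (" + doc.values + "): " + doc.help;
        }
    }
    return "unknown option: " + name;
}
```

The same records could also back a listing of all options currently in effect, which would cover limitation 7.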

I am curious about why the option names have been prefixed with TR_ in the first place.

Before being contributed to the Eclipse Foundation as part of OMR, the compiler had an internal name of Testarossa.[1] Because historically the compiler had not used namespaces, to prevent symbol identifier collisions, every symbol introduced by the compiler was prefixed with TR_ as a sort-of poor-man's equivalent.

[1] https://www.slideshare.net/MarkStoodley/under-the-hood-of-the-testarossa-jit-compiler

We should use proper namespaces now that they're supported and encouraged. Thanks for the reference @lmaisons! Looking forward to the talk later today.

Is there any reason this has to be a compiler-specific discussion, other than the fact that it's the original compiler options facility that we're looking to replace?

Is there any reason OMR couldn't have a single options processing strategy and implementation? Are there any additional requirements that other components would add into the discussion for the new options framework to handle? @charliegracie @DanHeidinga .

I can understand this kind of broadening activity complicates things, but I would have expected the JIT to have the most complex requirements for options processing so hopefully it won't stretch the goals too much further (he said, naively). I just want to make sure we don't miss something that other components need but wouldn't be surfaced by focusing solely on the compiler component's needs/goals.

I think it would be a shame to spend all this time redesigning such a fundamental part of the compiler only to end up implementing not quite a single options processing framework for OMR.

We have had success separating other parts of the compiler into independent entities that could be consumed by other projects or even other parts of OMR itself (e.g., limitfile processing and debug counters [tragically not contributed, but at least it demonstrates it is possible]). Options processing and feature detection could be architected with that goal in mind as well. They are just another tool in the OMR toolbox. Inputs to this design are welcome from anywhere.

Not limited to Options processing, but we should be striving for greater sharing and a more seamless look-and-feel to all the components of OMR.
