Graal: Support agents at image generation time

Created on 13 Mar 2019 · 17 Comments · Source: oracle/graal

It would be great if agents could be supported at image generation time, so that the instrumented code would be used in the final image.

feature native-image

All 17 comments

How do you envision such a feature? It seems to me that bytecode instrumentation via agents is an orthogonal feature. You can run an agent on your code, instrument it, and then feed it to native-image.

I'm not super familiar with Graal's build options, but what I would be hoping for is a way to reuse the existing javaagent tooling at build time instead of run time.

I think supporting existing Java agents (or even JVMTI agents) in the image build would not be straightforward, because they are often more dynamic than just instrumenting bytecode when it is first loaded, and agents can also affect the code of the image build itself and not just application and library classes. I'm also not sure whether they currently work well with JVMCI. This could be feasible to implement in a limited scope, without support for redefinition or retransformation.

(closed unintentionally)

It would be nice if native-image supported one-time class file transformations.

Preferably with the existing Java Agents API (simulating premain), but if a separate API is required, that's not a blocker for our use cases.

Sorry for not following up. Agreed, limited functionality will go a long way: feeding the classes that will be part of the image through the agent(s) and giving them a chance to transform those classes would cover my use case(s) too.

"I'm also not sure whether they currently work well with JVMCI"

JVMCI and Graal work fine with JVMTI agents (it's a bug if they don't). We also load method substitution bytecode from disk to avoid instrumentation effects.

This would be very relevant for me, too. I have a monitoring application for which I would like to support this. If Graal could mimic the instrumentation API, this would of course be a plus but a different API that only supports registering a ClassFileTransformer for all classes that are added to the image would solve most use cases, I think.

If I may suggest an implementation: I do not think that such an agent should affect the classes of the JVM process that is building the image as this might interfere with the build process. Instead, there should be an API that allows for:

a) Registering a ClassFileTransformer that transforms all classes before they are added to the image. The arguments for Module, ClassLoader, Class and ProtectionDomain could simply be null; you basically only need to present the class name and its bytecode.
b) Defining additional classes and finding classes that are included in the image, for example:

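import java.lang.instrument.ClassFileTransformer;
import java.util.jar.JarFile;

// Proposed build-time API; ReflectionConfig is not defined in this proposal.
interface GraalInstrumentation {
  void addClassFileTransformer(ClassFileTransformer cft); // applied to every class before it is added to the image
  byte[] findClass(String className);                     // class file of a class that is included in the image
  void addClass(byte[] classFile);                        // defines an additional class
  void include(JarFile file);
  void addConfig(ReflectionConfig config);
  void addEntryPoint(String className);                   // premain-like hook, see below
}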

The last method would be used to register classes with some method to execute before the instrumented program that was already added, similar to premain but without the possibility to request an instrumentation instance.
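To make point a) more concrete, here is a minimal sketch of how a build step might run registered transformers over class files before they enter the image. The class `BuildTimeTransformSketch` and its methods are hypothetical and exist only for illustration; they are not part of native-image. Only `ClassFileTransformer` and its `transform` signature come from the JDK.

```java
import java.lang.instrument.ClassFileTransformer;
import java.lang.instrument.IllegalClassFormatException;
import java.security.ProtectionDomain;
import java.util.ArrayList;
import java.util.List;

// Hypothetical build-time pipeline; not an existing native-image API.
class BuildTimeTransformSketch {
    private final List<ClassFileTransformer> transformers = new ArrayList<>();

    // Mirrors addClassFileTransformer from the proposed interface.
    void addClassFileTransformer(ClassFileTransformer cft) {
        transformers.add(cft);
    }

    // Runs every registered transformer over one class file, in registration order.
    byte[] transform(String internalName, byte[] classFile) throws IllegalClassFormatException {
        byte[] current = classFile;
        for (ClassFileTransformer t : transformers) {
            // Loader, redefined class and protection domain are null, as suggested above.
            byte[] result = t.transform(null, internalName, null, null, current);
            if (result != null) {
                current = result; // a null result means the class was left unchanged
            }
        }
        return current;
    }

    public static void main(String[] args) throws Exception {
        BuildTimeTransformSketch pipeline = new BuildTimeTransformSketch();
        List<String> seen = new ArrayList<>();
        pipeline.addClassFileTransformer(new ClassFileTransformer() {
            @Override
            public byte[] transform(ClassLoader loader, String className, Class<?> redefined,
                                    ProtectionDomain domain, byte[] bytes) {
                seen.add(className); // a real agent would rewrite the bytes here
                return null;
            }
        });
        byte[] in = {(byte) 0xCA, (byte) 0xFE, (byte) 0xBA, (byte) 0xBE};
        byte[] out = pipeline.transform("com/example/Foo", in);
        System.out.println(seen + " unchanged=" + (out == in));
    }
}
```

Because the transformers never see the classes of the JVM that builds the image, a pipeline of this shape would keep the build process itself unaffected.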

Given this, one could offer an API for the Graal native-image compiler in which the Instrumentation interface is replaced with the above. The same should then ideally also work for standard AOT compilation.

As a bonus, one could consider supplying a pseudo class loader to the above class file transformer. This class loader could then return class files from its getResourceAsStream method. This would make it a bit easier to reuse existing agent code.
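The pseudo class loader idea could look roughly like the following sketch. `PseudoClassLoaderSketch` and its `register` method are hypothetical names; the assumption is that the image builder would register the bytes of each class it includes, so that agent code calling `getResourceAsStream` can read them without the class ever being defined in the build JVM.

```java
import java.io.ByteArrayInputStream;
import java.io.InputStream;
import java.util.HashMap;
import java.util.Map;

// Hypothetical lookup-only loader: it serves class file bytes but never defines classes itself.
class PseudoClassLoaderSketch extends ClassLoader {
    private final Map<String, byte[]> classFiles = new HashMap<>();

    PseudoClassLoaderSketch() {
        super((ClassLoader) null); // bootstrap loader as the only parent
    }

    // Registers the bytes of a class that goes into the image, e.g. "com.example.Foo".
    void register(String binaryName, byte[] classFile) {
        classFiles.put(binaryName.replace('.', '/') + ".class", classFile.clone());
    }

    // Serves registered class files; anything else falls back to the normal lookup.
    @Override
    public InputStream getResourceAsStream(String name) {
        byte[] bytes = classFiles.get(name);
        return bytes == null ? super.getResourceAsStream(name) : new ByteArrayInputStream(bytes);
    }
}
```

An instance of such a loader could be passed as the ClassLoader argument of the transformer instead of null.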

For my purposes, this would be sufficient.

@thegreystone Would the approach as described by @raphw also work for your use case?

I prototyped a bit and extended my list of requirements by a few things. With this I could make most of my agents Graal compatible in a short time.

A separate API is OK for me. That said, if the tooling could take unmodified JPLIS agents, place them in a little sandbox and pass all classes to the premain (and also intercept calls to Unsafe#defineClass etc. for any helper classes generated), that would probably allow many existing agents to work with Graal without much modification. It would be harder to accomplish and maintain, though.

I am adding an agent to native-image as we speak; I am a bit afraid of what happens when these agents start interacting. Things that worry me are:

  • The order of agents. For example, my agent adds a call at the beginning of every static initializer. What if someone else's agent adds something before it?
  • Agents can modify the code of native-image itself. This includes very sensitive things such as the garbage collector and deoptimization routines. How do we ensure that the agent will not interfere with the native-image internals?
  • Agents can introduce code that is generated after the native-image agents have run. For example, one of the native-image agents will rewrite invokedynamic instructions for lambdas. If a user-space agent then introduces a lambda, all our assumptions are broken.

With all of this said, it would be good to have examples of the agents that you want to use. It would give me a better idea of how to support this. I would like to see quite a few use-cases before exposing this in the API.

I think you could try the agents already: if you pass -J-javaagent:<agent-jar>, it should work out of the box, provided the agents do reasonable things.
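For anyone who wants to try this, a very small agent is enough to check whether transformers are invoked during the image build. The sketch below is a hypothetical example (the class and field names are made up); it only records class names and never changes bytecode, so it should count as an agent that does "reasonable things". The agent jar's manifest has to name the class in its Premain-Class attribute, and the jar is then passed as -J-javaagent:<agent-jar>.

```java
import java.lang.instrument.ClassFileTransformer;
import java.lang.instrument.Instrumentation;
import java.security.ProtectionDomain;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical agent for illustration; the class and field names are made up.
class ClassNameRecordingAgent {
    // Internal names (e.g. "com/example/Foo") of every class offered to the transformer.
    static final Set<String> SEEN = ConcurrentHashMap.newKeySet();

    // Entry point the JVM calls before main; the agent jar's manifest must name
    // this class in its Premain-Class attribute.
    public static void premain(String agentArgs, Instrumentation inst) {
        inst.addTransformer(new Recorder());
    }

    static final class Recorder implements ClassFileTransformer {
        @Override
        public byte[] transform(ClassLoader loader, String className, Class<?> classBeingRedefined,
                                ProtectionDomain protectionDomain, byte[] classfileBuffer) {
            if (className != null) { // guard: the concurrent set rejects null elements
                SEEN.add(className);
            }
            return null; // null tells the JVM the class was not modified
        }
    }
}
```

Whether the recorded classes then end up instrumented in the final image depends on the builder, as discussed in this thread.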

I don't think we will ever be able to guarantee that native image will work with agents that modify classes of the native image itself. Let's see how we can support this elegantly.

Datadog APM is one agent that has issues.

I have written an agent for tracing class initialization, and based on that experience I believe we need to:
1) Let all user agents run before the native-image ones.
2) Restrict the scope of agents to classes loaded by the NativeImageClassLoader. Our native-image code makes various assumptions about the shape of its bytecode (e.g., some parts of the code must not allocate), so arbitrary agents cannot be run on it. However, if agents are restricted to user space, everything should work.
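Point 2 can be sketched as a wrapper around a user transformer that only delegates for classes coming from one specific loader. `LoaderScopedTransformer` is a hypothetical name, and `userLoader` is an assumption standing in for whatever loader the image builder uses for application classes; this is not how native-image implements the restriction.

```java
import java.lang.instrument.ClassFileTransformer;
import java.lang.instrument.IllegalClassFormatException;
import java.security.ProtectionDomain;

// Hypothetical wrapper; userLoader stands in for the loader of application classes.
final class LoaderScopedTransformer implements ClassFileTransformer {
    private final ClassLoader userLoader;
    private final ClassFileTransformer delegate;

    LoaderScopedTransformer(ClassLoader userLoader, ClassFileTransformer delegate) {
        this.userLoader = userLoader;
        this.delegate = delegate;
    }

    @Override
    public byte[] transform(ClassLoader loader, String className, Class<?> classBeingRedefined,
                            ProtectionDomain protectionDomain, byte[] classfileBuffer)
            throws IllegalClassFormatException {
        if (loader != userLoader) {
            return null; // classes from any other loader (builder, JDK) are left untouched
        }
        return delegate.transform(loader, className, classBeingRedefined,
                                  protectionDomain, classfileBuffer);
    }
}
```

The identity check is deliberately strict: classes defined by child loaders of `userLoader` would also be skipped, so a real implementation would need to decide how to treat them.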

Is it possible to restrict the scope of the Datadog APM?

Not sure, but here is the Datadog APM Java agent code: https://github.com/DataDog/dd-trace-java

Most such agents target mainly user space but also some JVM-specific classes, often those responsible for context switching such as Thread or ThreadPoolExecutor.

You are saying that passing an agent with -J-javaagent:<agent-jar> to the native image compiler should basically instrument the targeted image?

Yes, it will, as long as it does not break the image builder. I will soon merge the PR that does the transformations for Java lambdas. This will be a good starting point for adding extra agents.
