Jib: Enable to generate and package a CDS archive

Created on 18 May 2020 · 16 Comments · Source: GoogleContainerTools/jib

Hi,

CDS enables Java applications to boot much faster (in particular when any scanning or reflection is involved), so it would be neat to integrate it with Jib.
It requires running 2-3 commands to generate the archive, then bundling it and modifying the JVM arguments.

https://blog.codefx.org/java/application-class-data-sharing/#Creating-A-JDK-Class-Data-Archive explains it quite well.

Since the classpath must be stable and must not use folders or wildcards, doing it in Jib seems the most reliable and relevant approach, IMHO.

Romain

kinfeature

All 16 comments

Never heard of CDS, but this does seem worth looking into at some point. Thanks for the feedback! Does CDS only work with -jar my.jar, or can it also work with the classpath launch style (-cp ... com.example.MyMain)?

Wrote a quick post about it: https://rmannibucau.metawerx.net/post/java-class-data-sharing-docker-startup

Long story short, CDS is supported by any Java app, but the beginning of the classpath must be stable and match the archive, so jib can't use lib/*.jar for example.

I wrapped Jib in a custom main and the gain is ~30% on Java 11 (I'm using Zulu) on my app (CDI), so definitely worth enabling :).

Thanks for the info. From which Java version can we use this feature?

jib can't use lib/*.jar for example.

@GoogleContainerTools/java-tools-build I remember the discussion that we can't get rid of the classpath wildcard libs/* until Java 9+ because of the max argument length limit or the max command line length (particularly short on Windows). However, looks like modern Linux kernels support up to 2MB and seems like this is not an issue in practice. (We don't have to think about Windows.)

I have lost the confidence. (And who knows if Windows would matter if, e.g., running a Linux container on a Windows dev machine?)

@chanseokoh I think it was around Java 10 (not sure the commands are 100% the same in the first versions). These commands work well on Java 11.
BTW I would expect this feature to be off by default since it requires executing docker commands, so somebody enabling it would do so intentionally, and the classpath would fit on the command line anyway, I guess?

I took Jib for a spin to see how to do this mechanically (w/o automation). The basic premises are:

  1. JDK arch/distribution needs to be the same
  2. We need JARs - no exploded classpath, and no nested JARs. Just JARs listed in classpath.

It turns out that Jib's packaged containerization mode is a great fit for producing just the JARs! The only issue is that AppCDS's -cp can't take a wildcard, so we need to list out the individual JARs, which is discussed in #2733. Starting with Jib 2.7.0, this can be done by setting <container><expandClasspathDependencies> to true.

I was experimenting w/ my Hello World Spring Boot App https://github.com/saturnism/jvm-helloworld-by-example/tree/master/helloworld-springboot-tomcat

  1. Containerize w/ Packaged Mode, to local Docker daemon, and also need to use a debug base image so I can generate the classpath list using shell script.

    mvn package com.google.cloud.tools:jib-maven-plugin:2.7.0:dockerBuild \
      -Dimage=helloworld-experiment \
      -Djib.containerizingMode=packaged \
      -Djib.container.expandClasspathDependencies=true \
      -Djib.from.image=gcr.io/distroless/java-debian10:11-debug

  2. Generate the class list, and the archive in the image

    # The Java CLASSPATH is the third element of the default image `ENTRYPOINT`
    # in the Jib-built image, e.g., "java -cp <...classpath...> com.example.MyMain".
    JIB_CLASSPATH=$( docker inspect helloworld-experiment --format '{{(index .Config.Entrypoint 2)}}' )
    
    docker run --entrypoint=sh --name=helloworld-experiment helloworld-experiment \
      -c "mkdir -p /app/appcds \
          && java -XX:DumpLoadedClassList=/app/appcds/classes.lst \
                  -cp '$JIB_CLASSPATH' \
                  com.example.helloworld.HelloworldApplication --appcds=true \
          && java -Xshare:dump \
                  -XX:SharedClassListFile=/app/appcds/classes.lst \
                  -XX:SharedArchiveFile=/app/appcds/archive.jsa \
                  -cp '$JIB_CLASSPATH'"
    
  3. Commit the changes

    docker commit helloworld-experiment helloworld-experiment
    docker rm helloworld-experiment

  4. Produce the new container image w/ a different entrypoint. Was hoping to use Jib CLI for this, but ran into some issues.

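    # Build the new ENTRYPOINT elements, reusing the classpath captured in step 2.
    # The classpath is double-quoted so the shell neither splits nor globs it.
    JIB_ENTRYPOINT='"/usr/bin/java","-Xshare:on","-XX:SharedArchiveFile=/app/appcds/archive.jsa","-cp","'"${JIB_CLASSPATH}"'","com.example.helloworld.HelloworldApplication"'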
    
    cat << EOF > Dockerfile.appcds
    FROM helloworld-experiment
    
    CMD []
    ENTRYPOINT [ $JIB_ENTRYPOINT ]
    EOF
    
    docker build -f Dockerfile.appcds -t helloworld-experiment:appcds .
    
  5. You can then run the container image w/ AppCDS
    docker run -ti --rm helloworld-experiment:appcds

#2866 added the option jib.container.expandClasspathDependencies, and setting it to true will enumerate the dependency classpath (not yet released).

@saturnism @rmannibucau @koeberlue @holledauer @olivierboudet @bric3 @guillaumeblaquiere @bilak we've released Jib 2.7.0 which added a new configuration option (jib.container.expandClasspathDependencies (Gradle) / <container><expandClasspathDependencies> (Maven)) that enables expanding classpath dependencies in the default java command for an image ENTRYPOINT. Turning on the option (off by default) will enumerate all the dependencies, which will match the dependency loading order in Maven or Gradle builds. For example, the ENTRYPOINT becomes

java ... -cp /app/resources:/app/classes:/app/libs/spring-boot-starter-web-2.0.3.RELEASE.jar:/app/libs/shared-library-0.1.0.jar:/app/libs/spring-boot-starter-json-2.0.3.RELEASE.jar:... com.example.Main

instead of the default

java ... -cp /app/resources:/app/classes:/app/libs/* com.example.Main

Expanding the dependency list will help the AppCDS use case above.
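
As a quick sanity check before generating an archive, one could verify that the classpath baked into the image no longer contains a wildcard. The following is only a minimal sketch: the helper name `classpath_has_wildcard` is made up for illustration, and it treats any `*` in the string as a wildcard entry.

```shell
#!/bin/sh
# Hypothetical helper (not part of Jib): succeeds if the colon-separated
# classpath passed as $1 contains a '*' wildcard anywhere.
classpath_has_wildcard() {
  case "$1" in
    *'*'*) return 0 ;;   # a literal asterisk appears somewhere in the string
    *)     return 1 ;;
  esac
}

# Possible usage against a Jib-built image, assuming the classpath is the
# third ENTRYPOINT element as described earlier in this thread:
#   JIB_CLASSPATH=$(docker inspect helloworld-experiment \
#     --format '{{(index .Config.Entrypoint 2)}}')
#   if classpath_has_wildcard "$JIB_CLASSPATH"; then
#     echo "classpath still contains a wildcard; enable expandClasspathDependencies" >&2
#   fi
```

If the check reports a wildcard, the image was most likely built without the expansion option turned on.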

Note that an expanded dependency list can become very long in practice, and we are not sure if there may be a potential issue due to a long command line ("argument list too long" or "command line is too long").
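
To get a rough feel for whether a given classpath is anywhere near such a limit, its length can be compared against a budget. This is a sketch under assumptions: `classpath_fits` is a hypothetical helper, it counts characters (equal to bytes only for ASCII paths), and the limit to compare against depends on the platform. On Linux, `getconf ARG_MAX` reports the total budget for arguments plus environment, and a single argument is, to my knowledge, additionally capped at 128 KiB on typical configurations.

```shell
#!/bin/sh
# Hypothetical helper: succeeds if the classpath string ($1) is shorter
# than the given limit ($2), measured in characters.
classpath_fits() {
  classpath=$1
  limit=$2
  [ "${#classpath}" -lt "$limit" ]
}

# Possible usage on the machine that will run the container:
#   classpath_fits "$JIB_CLASSPATH" "$(getconf ARG_MAX)" \
#     || echo "classpath may be too long for the command line" >&2
```

For most applications the expanded classpath is a few kilobytes, so the check would mainly matter for projects with thousands of dependencies or very long paths.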

As with other Jib configurations, this option can also be set through the system property (-Djib.container.expandClasspathDependencies=true|false).

Does it work with extra classpath? CDS works with a classpath prefix, which must be expanded, but the end can stay a wildcard, which helps when mounting plugins. It would be great to have that feature without resorting to the programmatic jib-core option.

@rmannibucau no, Jib will just add the list of strings set by extraClasspath as-is, whether it contains a wildcard (*) or not. They are custom classpath, and it's not feasible for Jib to determine or enforce some order of expanding wildcards in custom classpath. (According to https://github.com/GoogleContainerTools/jib/issues/2733#issue-687396881, the loading order of * seems to depend on filesystems (and potentially JVMs)). So it's interesting that AppCDS can safely use a wildcard?

@chanseokoh it is more about ensuring the extra classpath is appended to the libs rather than prepended (I recall it was prepended at some point - https://github.com/GoogleContainerTools/jib/pull/1642/files#diff-a5317ef6dce278f4451fa8e298358067261e7d10b3cca71c093673202ebb6d5cR276 ). If prepended, it breaks CDS; if appended, CDS keeps working well.

Oh, now I understand. For AppCDS to work, it's enough for only the front portion of the entire classpath to be identical, and it's fine to have different classpath entries in the back portion, including a wildcard, right?
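
The prefix rule discussed here can be illustrated with a small check. This is only a sketch of the string-level condition: `is_classpath_prefix` is a hypothetical helper, and the JVM performs further validation of the archive beyond this comparison.

```shell
#!/bin/sh
# Hypothetical helper: succeeds if the classpath used when dumping the
# archive ($1) equals the runtime classpath ($2) or is a prefix of it that
# ends exactly at an entry boundary (':').
is_classpath_prefix() {
  dump_cp=$1
  run_cp=$2
  case "$run_cp" in
    "$dump_cp" | "$dump_cp":*) return 0 ;;  # quoted parts are matched literally
    *) return 1 ;;
  esac
}

# Appending a plugin wildcard keeps the dump-time prefix intact:
#   is_classpath_prefix '/app/classes:/app/libs/a.jar' \
#                       '/app/classes:/app/libs/a.jar:/plugins/*'    # succeeds
# Prepending it does not:
#   is_classpath_prefix '/app/classes:/app/libs/a.jar' \
#                       '/plugins/*:/app/classes:/app/libs/a.jar'    # fails
```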

Hmm... yeah, intentionally we prepend extraClasspath so that resources and classes from there take precedence. I wish it were easy to fix #894, which would have made extraClasspath obsolete. Maybe #894 could be supported with a new Jib extension?

@chanseokoh well, we can do anything with extensions, but it kind of breaks the Jib experience, and using multiple extensions quickly becomes hard, so let's maybe try to keep it "core" unless it is a very specific feature? I see three simple options (in terms of usage and implementation):

  1. use a placeholder with known keywords: ${projectClasspath}:${extraLibs}
  2. (preferred, since it is easier for everybody to use and implement) add an enum PREPEND/APPEND in extraClasspath
  3. (not directly linked to the order, but more to this issue) if expanded, extraClasspath implicitly goes at the end; this is more of a workaround, but it works.
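
To make the ordering difference in option 2 concrete, here is a minimal sketch of what the two modes would produce. It is purely illustrative: `build_classpath` and its arguments are hypothetical and do not correspond to any existing Jib option; only the PREPEND/APPEND names come from the proposal above.

```shell
#!/bin/sh
# Hypothetical illustration of the proposed PREPEND/APPEND modes.
#   $1 = mode (PREPEND or APPEND)
#   $2 = project classpath (already expanded)
#   $3 = extra classpath (may contain a wildcard)
build_classpath() {
  mode=$1
  project_cp=$2
  extra_cp=$3
  case "$mode" in
    PREPEND) printf '%s:%s\n' "$extra_cp" "$project_cp" ;;  # extra entries take precedence
    APPEND)  printf '%s:%s\n' "$project_cp" "$extra_cp" ;;  # project prefix stays stable
    *) echo "unknown mode: $mode" >&2; return 1 ;;
  esac
}

# Example calls:
#   build_classpath APPEND  '/app/classes:/app/libs/a.jar' '/plugins/*'
#   build_classpath PREPEND '/app/classes:/app/libs/a.jar' '/plugins/*'
```

With APPEND, the expanded project classpath remains the leading part of the final classpath, which is the property the archive depends on; with PREPEND, the extra entries come first and change that leading part.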

Having written several mains that manipulate the entrypoint, I'm really unhappy with that solution, and it does not combine well with another extension doing the same, so I hope this lands in jib-core/maven-plugin soon.

Thanks for the input. Perhaps it makes sense to reduce the scope of #894 and enable simple keyword substitution only for <entrypoint> (after which extraClasspath can be deprecated).

Just in case, Jib extensions don't run concurrently but in the order they are defined.

Yes, in order, but combining them is hard: it is easy to break a previous one, and in practice it is easier to use the exec Maven plugin with jib-core :(.

@saturnism I've updated your AppCDS demo using Jib 2.7.0 which has the option <container><expandClasspathDependencies> to expand the wildcard (*) in the classpath.
