We currently have a shared version of http client and its dependencies (eg various apache commons libs). This version must be used across any plugin/module that uses the ES test framework. Note that we have no such dependency in core, yet still the version must alight because the test framework is pulled in everywhere, and the rest client is part of the test framework. This is particularly troublesome as we sometimes force a version of these dependencies which does not align with what the plugin requires from other dependencies (eg see discussion in #24067).
In order to allow each plugin to use their own version of httpclient and its deps, we should shade http client and all other deps of the rest client, and package it in an uber jar. This will make the rest client standalone, so that no conflict will occur when the client is on the test classpath with any plugin also depending on http client.
This came up in some comments on #23331. I think that shading REST client dependencies (both low level and high level) will help a lot users, as they are going to have conflicts with jackson, http-client etc.
To be clear, I am only talking about shading the low level client deps, ie http client/apache commons stuff. We should not try shading ES or lucene, that has all kinds of issues which we should solve by removing those deps altogether at a later time.
I think we should consider shading dependencies for the high level REST client too. That is going to be a huge help for users and will certainly help adoption too. Shading the Elasticsearch dep is not needed, but I would consider doing it for lucene, and of course jackson and friends. What issues are there with shading lucene? I agree that we should remove those deps altogether, but given that it will take time, shading would be an ok temporary solution till we get there.
If technically feasible, I'd fully agree with Luca here.
Not sure if we can test when people are embedding as well another Lucene version in our tests though. (Like what Hibernate search does: embedding elasticsearch client and a Lucene version IIRC).
Shading becomes difficult with SPI, which Lucene uses. It may not matter for how Lucene is found within the high level client, but I think we should start simple. Shading the low level client deps will solve a large portion of the problems (and solve many of our test problems with httpclient as well).
We briefly discussed this in FixItFriday and we said there isn't much to discuss after all. We should start with shading low level clients deps and then see whether we need to do more or not.
I tried to look at this and to implement it (spoiler: with no success)
I looked at the Shadow Gradle plugin which is a port of the well known Apache Maven Shade Plugin for Gradle. The documentation is good and it's easy to configure it on a project and to build a shadow jar (in addition to the normal Jar) that includes all compile dependencies. It adds a "shadow" configuration and a "shadowJar" task to the project and also allows to relocate packages by manipulating the bytecode of the classes (using asm) before writing them in the jar file.
It is also possible to use Transformers during the shadowing to do various things like removing duplicate license/notice files or merge SPI service files (that can help to shade Lucene dependencies). I still think that we'll have to write our own transformers just to not mess up the license files.
In multiprojects build, it is possible to use the 'shadow' configuration in a project that depends on the shadow jar with something like this:
dependencies {
compile project(path: ':client:rest', configuration: 'shadow')
}
Sadly I didn't manage to get it working correctly with the low level REST client and the high level REST client.
Also, I don't really like the fact that classes are modified just before being packaged as a Jar while the tests are executed using the non shaded classes. Same thing for debugging code. Some dependencies might rely on hard package names to load a resource or a factory and I'd be more comfortable if our tests ran with already shaded classes so that we can catch such things more easily.
I ended up writing some code to try to shade the dependencies using a custom plugin (see https://github.com/tlrx/elasticsearch/commit/b552340d11abaaf66f2cd8b98ce0a4b5f20adea5). It basically unzip the jar files of dependencies marked as 'shaded' and modify classes
to relocate packages before the compileJava task is executed. This way project code can rely on already shaded classes, but it brings a lot of complexity to make it work
correctly with IDEs and sub projects too.
That's where my little knowledge of gradle and our build brought me, I'd be happy if someone can have a quick look and give his thoughts?
Also, I don't really like the fact that classes are modified just before being packaged as a Jar while the tests are executed using the non shaded classes.
I would not do this, for that exact reason. I think we could instead do this by modifying the dependencies before the code is even compiled. I'll be happy to take a look.
@rjernst I agree, that's why I tried another solution. I'd be happy to help implement this.
What's the status on this? We'd like to use the high-level REST client, but there are no artifacts available as of yet.
Since most of the high level client code bypasses Lucene (AFAICT) would it be possible to ignore the SPI stuff?
With regards to testing: What we did successfully is to have a pure test project that depends on the shaded jar and just performs black box tests against the API. This would ensure that the shaded jar works as expected.
hi @manuel-woelker the high level REST client hasn't been released yet, hence you can't find artifacts on maven central yet. Most likely we won't have shading yet for 5.6 & 6.0-beta1, then shading will come later.
An unshaded artifact would be fine for now too.
According to https://www.elastic.co/guide/en/elasticsearch/client/java-rest/master/java-rest-high-usage-maven.html I should see the elasticsearch-rest-high-level-client here: http://search.maven.org/#search%7Cga%7C1%7Cg%3A%22org.elasticsearch.client%22 but it is not showing up for me.
I now assume that is due to the current development state, but still had me confused for a while.
@manuel-woelker that is the docs for an upcoming release. The artifact will be there once the release happens. Sorry about that! Some of the confusion comes also from the fact that the current 6.0-alpha2 docs point to master which makes you think that 6.0-alpha2 included the high level client, but it didn't.
You can though get the 5.6.0-SNAPSHOT or 6.0.0-beta1-SNAPSHOT snapshots from our own http://snapshots.elastic.co/maven/ repo.
I added a note to the docs to clarify that the high level client will come with 6.0.0-beta1. Hope that helps: https://www.elastic.co/guide/en/elasticsearch/client/java-rest/master/java-rest-high.html .
@javanna @rjernst : Do we have shading for High level rest client as well? I don't find it on docs.
Sorry for spamming here.
@sabhishekgowda The shading changes were rolled back because they were causing a number of issues. There was a call for feedback, but we did not receive much. There is some guidance on shading the jars on your own if you need it here.
Most helpful comment
@sabhishekgowda The shading changes were rolled back because they were causing a number of issues. There was a call for feedback, but we did not receive much. There is some guidance on shading the jars on your own if you need it here.