The shaded client jar does not play well with SPI and loading alternative logging backends. Chatting with @rjernst and @jasontedor about issue #26318 and possible solutions, we came up with a few possibilities and then some questions to answer.
The most immediate solution, given the state of the shaded jar today, is to make the user shade the jar they want to use. If they shade o.a.logging to o.e.c.c.logging , this should work. This does not make it easy for consumers of this jar who want to use a different backend.
Given that the "solution" is not very user friendly, the next question was if we should even shade commons-logging. commons-logging is probably more widely used than commons-http. This is somewhat speculative, but its likely logging is in more applications already than the http client. So this could be the biggest offender for integrating the rest client into an existing application (which is why we shaded it). This is still an open question as to if we shade this or not.
If we do not shade logging, then should we even shade http?
The need for this was necessitated by the fact that we expose commons-http in the low level rest client. Should we expose this? Why are we using commons-http instead of say, netty? If we decide we might use netty or something besides commons-http in the future, should we just revert the shading? Regardless there will be a breaking change if we decide to implement this change, and given that, does it make sense to even shade currently? Does it make the life of enough people simpler that it is worth punting on the logging issue and telling people they need to shade their backend too?
I don't think that we should use any shading for the REST client libraries that we publicly provide. The reason for shading the dependencies (commons-logging, apache-httpclient) in the first place (AFAICS, looking at #25208) was for interoperability with the ES test framework, as our own ES plugins use other versions of said dependencies. The issue lies with the ES test framework here, which puts the plugin under test into the same shared classpath with the REST client. I don't think that the issues of a test component should influence the way we package and expose our main Java client libraries. The question then is: How can we fix the test framework?
The reason for shading the dependencies (commons-logging, apache-httpclient) in the first place (AFAICS, looking at #25208) was for interoperability with the ES test framework
That was part of the reason, but not the only reason. Fixing the test framework is difficult. Even if the rest runner were separated out, there are still tests in core which want to check http, and those need an http client.
I don't think that we should use any shading for the REST client libraries that we publicly provide
If the client has dependencies, there will always be someone to complain about the version of that dependency colliding with their own. I don't have any problem with shading, as long as it is done in a way that tests are run against the shaded version, and apis of the shaded classes are not leaked (the latter of which I lost in debate).
On my end, the main reason for shading was to help users and increase adoption of the java rest client (including the high-level one which depends on the low-level one) rather than our test framework. I agree that the issues of a test component should not influence the way we package and expose our client, but this is not the case. Users would also like us to shade dependencies of the high-level client as well (e.g. jackson). Many applications use the apache http client so if we depend on it, shading it will address potential conflicts for sure.
As for why we depend on commons-logging in the low-level REST client: apache http client uses commons-logging already, so we just stuck with the same library. I am not sure whether there are alternatives that would help in this matter.
It would be nice to hear directly from our users on what would help them the most.
@bleskes @jasontedor @rjernst I was able to shade an older log4j-jcl (2.0.2), and with some manual hackery in META-INF, got things wired up so commons logging is using log4j2 and the log4j-jcl shaded jar. I verified it uses my org.elasticsearch.client.logging.log4j.jcl.LogFactoryImpl, which was modified in META-INF/services/org.apache.commons.logging.LogFactory.
I did run into a problem with newer log4j versions, as there is a binary file that defines the plugins (META-INF/org/apache/logging/log4j/core/config/plugins/Log4j2Plugins.dat), and I just have not been able to fix that up in the limited timeframe. So newer log4j is puking at runtime, and I cannot say whether its solvable or not at present.
So, the answer to whether we can even use another backend thats SPI'd, we can. But it does require significant effort on the end user, and some things may not be compatible (still unsure about the Plugins.dat).
The question of whether its worth it still stands, and we should solve it asap.
I don't have a clear picture of how many people may be using commons-logging together with SPI. If that is an advanced usecase, and not that common, we could accept these issues caused by shading, especially if there is a work-around. By still shading the jar we would help users who use commons-logging without SPI and may have version conflicts. On the other hand, we could stop shading commons-logging but then any commons-logging user who wants to use our low-level client may have version conflicts. I would be leaning towards keeping things as they are for now, till we understand better from users what they need and react upon it.
Whatever we decide to do above, I think it's important to keep on shading the apache http client dependency.
This is the summary of discussions with @bleskes and @javanna.
We do not know what problems we are trying to solve, we feel there is not enough information to make a decision one way or the other whether or not to shade. We have one user saying not shading is fine for now (#25208), and another using having run into trouble with how we approached shading in an earlier 5.6.0-SNAPSHOT (#26318). (If there is additional existing user feedback on this issue, I would appreciate being pointed to it.) The simplest approach here is to not shade, users that need to shade can do this on their end, and they can control exactly what they need shaded. Until we have a better understanding what problems users are going to encounter, we should hold off on introducing shading.
Conclusion:
Please feel free to add or correct any aspects I have wrong.
I also saw some feedback on version conflicts as part of #23331 but that referred to the lucene dependency, which we decided not to shade for now anyway (it's a transitive dependency of the high-level client).
I created #26366 to address feedback on 6.0.0.
thx for all the iterations @hub-cap
Most helpful comment
thx for all the iterations @hub-cap