Elasticsearch version (bin/elasticsearch --version): 6.2.2
JVM version (java -version): 9.0.4
OS version (uname -a if on a Unix-like system): Ubuntu 17.10
Description of the problem including expected versus actual behavior:
We are migrating our Maven project to Java 9.
In each module of our project, we add a module-info.java.
Everything works fine for every dependencies except elasticsearch.
Requiring it from module-info fails with:
[ERROR] Failed to execute goal org.apache.maven.plugins:maven-compiler-plugin:3.7.0:compile (default-compile) on project foo: Compilation failure
[ERROR] /foo/src/main/java/module-info.java:[10,12] module not found: elasticsearch
Steps to reproduce:
Analysis:
I debugged maven-compiler-plugin and that lead me to jdk.internal.module.ModulePath.
It fails in jdk.internal.module.ModulePath at line 557:
// parse each service configuration file
for (String sn : serviceNames) {
JarEntry entry = jf.getJarEntry(SERVICES_PREFIX + sn);
List<String> providerClasses = new ArrayList<>();
try (InputStream in = jf.getInputStream(entry)) {
BufferedReader reader
= new BufferedReader(new InputStreamReader(in, "UTF-8"));
String cn;
while ((cn = nextLine(reader)) != null) {
if (cn.length() > 0) {
String pn = packageName(cn);
if (!packages.contains(pn)) {
String msg = "Provider class " + cn + " not in module";
throw new InvalidModuleDescriptorException(msg); // <-- FAILING HERE
}
providerClasses.add(cn);
}
}
}
if (!providerClasses.isEmpty())
builder.provides(sn, providerClasses);
}
The issue comes from the fact that no elasticsearch packages match the following declared Java services:
Those service declarations either need to be moved to another project, or a fake class should be added to org.apache.lucene.codecs.
no elasticsearch packages match the following declared Java services
Can you elaborate on this? I think Codec and DocValuesFormat are leftover from long ago when we had custom versions of these in ES. PostingsFormat contains Completion50PostingsFormat which is still used. Is the issue just that Codec and DocValuesFormat are empty (we can remove them)?
I added a Gradle test module in the PR, but I couldn't reproduce my maven project behaviour.
Maybe it is just a Maven issue.
@rjernst,
I didn't see that both of them were empty.
Still, the issue is also caused by org.apache.lucene.codecs.PostingsFormat for which I had to add 2 fake classes:
This is really weird.
I am creating a git repository with a Maven project reproducing the issue.
I created https://github.com/Cosium/jigsaw-elasticsearch which allows to reproduce the issue.
And a ticket in Maven compiler tracker https://issues.apache.org/jira/browse/MCOMPILER-328
https://issues.apache.org/jira/browse/MCOMPILER-328?focusedCommentId=16398365&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-16398365 confirms that Maven get the automodule name directly from the JDK.
That would mean that Gradle computes the module name itself, which is more brittle.
I created a Gradle equivalent:
https://github.com/Cosium/jigsaw-elasticsearch-gradle
I reproduce the same issue as Maven.
Therefore:
Pinging @elastic/es-core-infra
After much debugging !
The modulepath should contain all modular jars. That includes jars containing well defined module-info.java and jar eligible to automodule.
Elasticsearch is not currently eligible to automodule because it violates some Jigsaw encapsulation about Java services.
A jar in the module path can't access a jar in the classpath.
For each jar:
With elasticsearch as unique dependency:
[DEBUG] Command line options:
[DEBUG] -d /srv/git/cosium/jigsaw-elasticsearch/target/classes \
-classpath /srv/git/cosium/jigsaw-elasticsearch/target/classes:/home/rhousni/.m2/repository/org/elasticsearch/elasticsearch/6.2.2/elasticsearch-6.2.2.jar: \
-sourcepath /srv/git/cosium/jigsaw-elasticsearch/src/main/java:/srv/git/cosium/jigsaw-elasticsearch/target/generated-sources/annotations: \
-s /srv/git/cosium/jigsaw-elasticsearch/target/generated-sources/annotations \
-g -verbose -nowarn -target 9 -source 9
With commons-io as unique dependency:
[DEBUG] Command line options:
[DEBUG] -d /srv/git/cosium/jigsaw-elasticsearch/target/classes \
-classpath /srv/git/cosium/jigsaw-elasticsearch/target/classes: \
--module-path /home/rhousni/.m2/repository/commons-io/commons-io/2.4/commons-io-2.4.jar: \
-sourcepath /srv/git/cosium/jigsaw-elasticsearch/src/main/java:/srv/git/cosium/jigsaw-elasticsearch/target/generated-sources/annotations: \
-s /srv/git/cosium/jigsaw-elasticsearch/target/generated-sources/annotations \
-g -verbose -nowarn -target 9 -source 9
Gradle doesn't make any kind of jigsaw module autodetection.
By default, even in Java 9, Gradle will put all dependencies into the classpath.
This is confirmed by the tutorial that expects us to populate the --module-path and override -classpath manually.
The experimental jigsaw plugin, seems to do the same thing:
https://github.com/gradle/gradle-java-modules/blob/master/src/main/java/org/gradle/java/JigsawPlugin.java
The jar built by my project being defined by a module-info.java, it must resides in --module-path. Since elasticsearch fails the JDK automodule selection phase during Maven build, elasticsearch ends in -classpath, making it unreachable by my jar.
IMO, there is no point in trying to reproduce the current issue with a dedicated Gradle module, since it entirely depends on the way the end user will configure Gradle. Besides, it is unlikely that Gradle user is aware of the current issue since he will probably never use the JDK api allowing to select automodules.
So the fix seems to be the good one to me.
The question seems to be what kind of automatic test could detect this kind of defect.
I think it would be easier to have a Maven module, since Maven has the correct behaviour.
What do you think?
Ping @rjernst , @jasontedor
Ping @rjernst, @jasontedor
I don't understand how this is a >non-issue. On the contrary, this is a serious issue for anyone who wants to use Elasticsearch with the module system and I think it should be investigated.
@nicolaiparlog The label was a mislabeling and I have removed it.
@jasontedor the label is still on the linked PR
@jasontedor any idea on when this will be fixed/included in a release? thx!
You can only fix this by making Elasticsearch a real module. Automodules don't work for Elasticsearch. There are several problems: Elasticsearch contains classes from a package it does not own (org.apache.lucene) and also defines SPIs for them. This needs to be exposed via module-info.java. But this won't still not work, as Lucene is not module conformant (for the same reasons).
@uschindler as the discussion in this issue shows, there is a fix available, it only has to be merged and released. So converting to an explicit module is not needed yet. Putting an automatic module name in the manifest does no harm, on the contrary, it makes sure that elasticsearch reserves its module name.
The problem is that it won't still not work at runtime, because the SPI won't load. The good thing is that Elasticsearch does no longer implement SPIs (Codecs). And if you just use the Elasticsearch client, you won't run into trouble, because Lucene is just a dependency, but it's actually not used (only for classloading).
So in short there are multiple problems: The empty/missing META-INF/services are a bug in Elasticsearch. This violates the spec (also for older Java versions, but nothing cares). Once you have fixed this and added an automatic module name, it compiles.
But this does not mean that it works at the end. Lucene unfortunately is defining the same packages in several modules, so at runtime this would cause issues with automatic modules. There is no fix for this at the moment and there won't be one for longer time (this would require a complete restructuring of Lucene). In addition, the SPI infrastructure of Lucene needs an update to be Java 9 compliant, which is also not yet done.
If you use Elasticsearch at the moment just as a client, then it could be fine also at runtime, but that's just because Java does not fully verify the module requirements/corectness at the moment.
@uschindler thx for the clarification. I do think that if this gets fixed for elasticsearch it can be deployed as automatic module and we can keep lucene on the classpath for now. (automatic modules can read from the classpath) That way at least explicit modules can use the elasticsearch automatic module, and lucene remains on the classpath until a redesign has been done.
@jasontedor can someone prioritise this issue so that the elasticsearch jar is at least compatible with jdk specs? i.e. including META-INF/services files for classes and packages that don't exist in the jar is just wrong.
Then the community can start using elasticsearch on the module path of the jdk.
Thx in advance.
I did some tests on the current master. Since we don't support elasticsearch, I think that the us-case that triggers this is being able to write a modularized application that talks to ES via one of the clients.
Since ES doesn't produce modules yet, one can only use automatic modules.
the approach I took was to start with this modules-info.java:
module org.elasticsearch.qa {
requires elasticsearch.rest.high.level.client;
}
And just a class to test the communication:
package org.elasticsearch.qa;
import org.apache.http.HttpHost;
import org.elasticsearch.action.search.SearchRequest;
import org.elasticsearch.client.RequestOptions;
import org.elasticsearch.client.RestClient;
import org.elasticsearch.client.RestHighLevelClient;
import org.elasticsearch.index.query.QueryBuilders;
import org.elasticsearch.search.builder.SearchSourceBuilder;
public class SmokeTestHighLevelRestClient {
public static void main(String [] args) {
var client = new RestHighLevelClient(
RestClient.builder(new HttpHost("localhost", 9200, "http"))
);
var searchRequest = new SearchRequest();
var searchSourceBuilder = new SearchSourceBuilder();
searchSourceBuilder.query(QueryBuilders.matchAllQuery());
searchRequest.source(searchSourceBuilder);
client.search(searchRequest, RequestOptions.DEFAULT);
}
}
Note that this now requires to have more requires in modules-info.java:
module org.elasticsearch.qa {
requires elasticsearch.rest.high.level.client;
requires elasticsearch;
requires httpcore;
}
We have to tell Gradle to put these jars on the module path and make them automatic modules.
configurations {
modulePath
}
dependencies {
modulePath "org.elasticsearch.client:elasticsearch-rest-client:$version"
}
configurations {
modulePath
compile.extendsFrom modulePath
}
dependencies {
modulePath "org.elasticsearch.client:elasticsearch-rest-client:$version" {
exclude group: " org.apache.lucene"
}
}
Like @uschindler mentions, elasticsearch has classes in lucene packages and lucene itself has packages spread across modules.
The lather wouldn't be a problem for ES at least not with automatic modules as we could keep Lucene on the classpath only as automatic modules are allowed to access it and it's used for class-loading.
Elasticsearch itself however has the same problem as lucene, it uses the same package in different modules, and fixing that would require shuffling packages around and breaking changes. E.x. both the elasticsearch and high-level-rest-client jar have org.elasticsearch.client.
Probably the best path to be able to use the high level rest client in a modular application is to remove elasticsearch(server) as a dependency, and reshuffle packages so it doesn't have any in common with the low level rest client. If the dependencies allow it we could potentially make it into a real module.
I was not able to replicate the issues with the empty service files just by requiring elasticsearch from a modules-info. I haven't checked what changed, but the linked PR that adds a reproduction is old. I tried against latest master.
module org.elasticsearch.qa {
requires elasticsearch;
}
With this Gradle setup
onfigurations {
modulePath {
transitive false
}
}
dependencies {
modulePath "org.elasticsearch:elasticsearch:$version"
}
compileJava {
doFirst {
options.compilerArgs += [
'--module-path', configurations.modulePath.asPath,
]
}
}
I'm not saying we shouldn't remove the empty service files. I'll create a PR for that, no need to keep them around, but I don't see how this setup could be useful.
It doesn't pick up transitive dependencies so it can't do anything usefully, and picking up transitive dependencies on the module path breaks it.
To summarize I would like to move forward with the following:
@jasontedor @rjernst thoughts ?
Right now the best bet for modular applications is to use the low level rest client, or keep all the elasticsearch jars on the classpath and have all ES specific code localized in an automated module in their application to be able to access them. This would allow the rest of the application to benefit from modularization without requiring elasticsearch to change.
I was not able to replicate the issues with the empty service files just by requiring elasticsearch from a modules-info
As mentionned in the previous messages, if you did that with Gradle it's normal.
Try with Maven.
Closing in favor of #38299
Most helpful comment
I don't understand how this is a
>non-issue. On the contrary, this is a serious issue for anyone who wants to use Elasticsearch with the module system and I think it should be investigated.