Elasticsearch: Java High Level REST Client plan for first release

Created on 23 Feb 2017 · 31 Comments · Source: elastic/elasticsearch

This is a meta issue to track work that's being done on the Java High Level REST Client.

See https://www.elastic.co/blog/state-of-the-official-elasticsearch-java-clients to know more about the state of the official Java clients and the plan moving forward.

The RestHighLevelClient will allow reusing the same request objects (ActionRequest subclasses) and response objects (ActionResponse subclasses) as the current transport client. The client object itself, though, will not be the same: the new one will not implement the current Client interface (see #9201 to know why).

Progress on the high level REST client can be tracked by following the Java High Level REST Client label. These are the APIs that we want to support in its first release:

  • [x] ping api (HEAD localhost:9200/)
  • [x] info api (GET localhost:9200/)
  • [x] get api
  • [x] index api
  • [x] update api
  • [x] delete api
  • [x] bulk api
  • [x] search api
    • [x] hits
    • [x] suggest
    • [x] profile
    • [x] aggregations
      • [x] single bucket (#24564)
        • [x] filter
        • [x] children
        • [x] nested
        • [x] reverse nested
        • [x] missing
        • [x] global
        • [x] sampler
      • [x] multi bucket
        • [x] terms (#24521)
        • [x] histogram (#24213)
        • [x] date histogram (#24213)
        • [x] adjacency matrix (#24700)
        • [x] filters (#24648)
        • [x] range (#24583)
        • [x] date range (#24583)
        • [x] binary range (#24706)
        • [x] geohash grid (#24589)
        • [x] geodistance (#24583)
        • [x] significant terms (#24682)
      • [x] geo centroid
      • [x] geo bounds
      • [x] top hits (#24717)
      • [x] scripted metric (#24738)
      • [x] matrix stats (#24746)
      • [x] numeric metrics
        • [x] stats
        • [x] extended stats
        • [x] stats bucket
        • [x] extended stats
        • [x] min
        • [x] max
        • [x] avg
        • [x] sum
        • [x] value count
        • [x] simple value
        • [x] derivative
        • [x] bucket metric value
        • [x] cardinality
        • [x] tdigest percentiles
        • [x] hdr percentiles
        • [x] tdigest percentile_ranks
        • [x] hdr percentile_ranks
        • [x] percentiles bucket
  • [x] search scroll api
  • [x] clear scroll api


All 31 comments

I'm not sure if this is the correct place for this... Will the HTTP client have native client features like cluster sniffing and other cluster-related features?

@falsyvalues ideally ask on discuss.elastic.co.
HTTP Client already has sniffing with https://www.elastic.co/guide/en/elasticsearch/client/java-rest/current/sniffer.html

Not related to the high level client.

Could you make it a short term goal to pull out the model objects (e.g. SearchSourceBuilder, QueryBuilders) into their own maven module?

Right now the existing REST client, Jest, suggests adding the transport module as a dependency to use these request objects/builders, which in my case adds 20 MB to my otherwise <10 MB bundle. That is especially important because I'm using AWS Lambda, and a bigger bundle takes longer to get started.

Also, separating out these objects makes it easier to use your own HTTP client if you'd like to. IMO the request and response POJOs are much more useful than the client that abstracts making HTTP requests with those objects and deserialization/serialization, since there are already plenty of Java HTTP clients and JSON serialization/deserialization libraries. Also, in my case I need to configure the HTTP client to sign requests.

EDIT: I am playing with separating out the model classes, and it does seem like they have a lot of dependencies on Lucene and Elasticsearch common classes, which could make that a lot more difficult than I expected.

2nd EDIT: I was able to avoid bringing in the elasticsearch dependency by building my query out of JsonObjects and JsonArrays with a helper class.

3rd EDIT: for anyone who runs into the same scenario as me, here is the helper class that I used to build the query object: https://gist.github.com/moodysalem/585033149bb85f1e6079f1d507f8c72d

hi @moodysalem I agree that ideally the Java client should not depend on Elasticsearch. We would love not to have that dependency, but we gave high priority to getting the high level REST client out there sooner rather than later, so it will still depend on Elasticsearch initially. Once it's out, we may decide to work on moving those request and response classes out to a common library, but as you found out that will require quite some work, which is why we decided not to do it straight away but rather go step by step.

One thing that I can see being low-hanging fruit is moving our core analyzers into a module. This would allow us to move out the analyzers-common JAR that is ~ 11MB which would help a lot with this issue already. I don't think breaking out SearchSourceBuilder is low-hanging fruit at all: it has lots of dependencies. I would also rather make core smaller and keep depending on it than do it the other way round. There are also highlighters that can be moved into a module, or maybe even into the same module for simplicity. Suggesters are a similar thing. I think we should look into this! @rjernst WDYT

I'd be happy to move the analyzers, suggesters, and highlighters into their own modules. I think it'd be faster to exclude their dependencies from the high level client, though a bit less clean.

++ @nik9000

This would allow us to move out the analyzers-common JAR that is ~ 11MB which would help a lot with this issue already.

I opened https://github.com/elastic/elasticsearch/pull/23614 to start on the analyzers. See the PR description for more but, summary:

  1. It looks like 1.4mb instead of 11mb
  2. It isn't that low hanging fruit. Lots of tests have to be modified or moved.

I also took a look at doing suggesters. The phrase and term suggesters would be pretty easy to move. The completion suggester is harder but not super difficult. It'd probably require a wire level breaking change. Which, btw, means that suggesters aren't truly pluggable now. I think. I didn't experiment with it for very long.

My organization is in the middle of a large elasticsearch deployment. We are planning on using the tcp transport client, but would like to know when the high level http client is expected to launch? Do you have an estimated delivery date?

We have a policy of never giving dates so we don't disappoint anyone. Your best bet is to look at the progress around the issues and make an educated guess. I know that is a pain and I'm sorry about it.

Hello, is there any chance that the high-level REST client will not depend on the Lucene libraries? We need to migrate from an old Compass implementation to the Elasticsearch client (with independently running ES instances), and the conflict between Lucene jars that are 10 years apart is unresolvable. However, the two search systems must run alongside each other in the app for a while. I think we are not the only users of old Compass who would love to upgrade to ES gradually. Thanks a lot for any hints and info.

hi @pavhofman there is no chance at least in the first stages, but we are discussing the possibility of shading our dependencies. Would that work for you? I'd say that lucene is not the biggest problem as not many users depend on lucene in their applications, but we are going to have conflicts when it comes to http-client and jackson for sure.

hi @javanna I have tried to upgrade our jars to match ES and basically got stuck mostly with lucene (compass is way too old). I think shading would work OK for us if it included the lucene jars. We would use only the high-level rest client layer. It would be a fantastic help should you decide to do that. Thanks a lot :-)

cool @pavhofman thanks for your feedback we will take this into account.

I very much appreciate your consideration. Thanks.

@pavhofman I have the same issue since I still use Lucene for small ancillary indices. I use the binary transport client for indexing (HTTP has a definite slowdown) to Elasticsearch and the Jest HTTP client for searching: https://github.com/searchbox-io/Jest

Another possibility would be the separation of the client API into Lucene-based actions, such as queries, and non-Lucene-based actions, such as cluster admin, or bulk ingest actions. So everybody could decide if Lucene is included or not by simply declaring the client artifacts in the dependencies.

As I said above, although some people may have problems with the lucene dep, most users will have problems rather with jackson, http-client and so on, which pretty much every application depends on. We are not going to address the lucene problem specifically, we will try to solve it for everybody.

The APIs listed above have all been added to the Rest High Level Client. We are working on documentation and shading deps (#25208). This issue can be closed, nothing left to do here.

The high level rest client looks very nice. However, I could not find it in the Maven repository. Is there a way that I can use it?

hi @yingqiaomxi, the high level client hasn't been released yet, but it will be soon. The artifact will be published once the first release happens. In the meantime, you can get the 5.6.0-SNAPSHOT or 6.0.0-beta1-SNAPSHOT snapshots from our own http://snapshots.elastic.co/maven/ repo.
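
For readers who want to try the snapshots with Maven, a minimal sketch of the repository declaration might look like the following. Only the URL comes from the comment above; the `id` and `name` values are arbitrary labels chosen for illustration, and the `releases`/`snapshots` switches are standard Maven settings rather than anything specific to this repo.

```xml
<repositories>
  <repository>
    <!-- id and name are hypothetical labels picked for this sketch -->
    <id>elastic-snapshots</id>
    <name>Elastic snapshots</name>
    <!-- URL as given in the comment above -->
    <url>http://snapshots.elastic.co/maven/</url>
    <!-- resolve only -SNAPSHOT versions from this repository -->
    <releases>
      <enabled>false</enabled>
    </releases>
    <snapshots>
      <enabled>true</enabled>
    </snapshots>
  </repository>
</repositories>
```

This block goes inside the `<project>` element of the pom.xml; the snapshot dependencies themselves are declared separately under `<dependencies>`.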

I will try to use the snapshot first then.

Many thanks!

The snapshot works quite well!

One question: I could not find the APIs for administration, for example indices administration. So I should do that with the low level client, right?

Yes, only a few APIs are supported for now; see the description of this issue, where they are listed. There is already an issue for the APIs you are looking for, see #25847.

Many thanks!

Adding to the latest comments, in case anyone is looking for the high level REST client: the artifact names changed and are now:

        <dependency>
            <groupId>org.elasticsearch.client</groupId>
            <artifactId>elasticsearch-rest-client</artifactId>
            <version>5.6.0-SNAPSHOT</version>
        </dependency>
        <dependency>
            <groupId>org.elasticsearch.client</groupId>
            <artifactId>elasticsearch-rest-high-level-client</artifactId>
            <version>5.6.0-SNAPSHOT</version>
        </dependency>

@dadoonet when I use the dependency below, I get an error in pom.xml


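        <dependency>
            <groupId>org.elasticsearch.client</groupId>
            <artifactId>elasticsearch-rest-client</artifactId>
            <version>5.6.0-SNAPSHOT</version>
        </dependency>
        <dependency>
            <groupId>org.elasticsearch.client</groupId>
            <artifactId>elasticsearch-rest-high-level-client</artifactId>
            <version>5.6.0-SNAPSHOT</version>
        </dependency>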

error - Missing artifact org.elasticsearch.client:elasticsearch-rest-client:jar:5.6.0-SNAPSHOT

5.6.0 has been released. Remove the SNAPSHOT part

@dadoonet

Below is my pom.xml file. It is not working.

<project xmlns="http://maven.apache.org/POM/4.0.0" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/maven-v4_0_0.xsd">
  <modelVersion>4.0.0</modelVersion>
  <groupId>tools.test</groupId>
  <artifactId>es-demo</artifactId>
  <packaging>jar</packaging>
  <version>1.0-SNAPSHOT</version>
  <name>es-demo</name>
  <repositories>
    <repository>
      <id>publicesrepo</id>
      <name>publicesrepo</name>
      <url>http://maven.elasticsearch.org/public-releases</url>
    </repository>
  </repositories>
  <dependencies>
    <dependency>
      <groupId>junit</groupId>
      <artifactId>junit</artifactId>
      <version>3.8.1</version>
      <scope>test</scope>
    </dependency>
    <dependency>
      <groupId>org.elasticsearch.client</groupId>
      <artifactId>elasticsearch-rest-client</artifactId>
      <version>5.6.0</version>
    </dependency>
    <dependency>
      <groupId>org.elasticsearch.client</groupId>
      <artifactId>elasticsearch-rest-high-level-client</artifactId>
      <version>5.6.0</version>
    </dependency>
    <dependency>
      <groupId>com.fasterxml.jackson.core</groupId>
      <artifactId>jackson-databind</artifactId>
      <version>2.8.5</version>
    </dependency>
  </dependencies>
</project>

Error: Original error: Could not transfer artifact org.elasticsearch:elasticsearch:jar:5.6.0 from/to nexus (http://vm-maslxjavadev01.tools.org:8083/repository/maven-public/): unexpected end of stream

Can you please suggest the solution?

@malhotras Please ask your questions on discuss.elastic.co (with good formatting, please), where we can give better support.

Thanks!

I'd be happy to move the analyzers, suggesters, and highlighters into their own modules. I think it'd be faster to exclude their dependencies from the high level client, though a bit less clean.

can you give an example of what is needed at bare minimum and what to exclude in a maven pom file for the REST client to work?

At the very least you guys can document this properly in examples.
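
Nobody in this thread posted a confirmed minimal set of dependencies, so the following is only a sketch of the Maven mechanism for the exclusion approach quoted above. It assumes that the analyzers-common JAR mentioned earlier corresponds to the `org.apache.lucene:lucene-analyzers-common` artifact, and it is an untested assumption that the high level client still works for your requests once that JAR is excluded.

```xml
<dependency>
  <groupId>org.elasticsearch.client</groupId>
  <artifactId>elasticsearch-rest-high-level-client</artifactId>
  <version>5.6.0</version>
  <exclusions>
    <!-- Assumption: this is the analyzers-common JAR discussed in the thread.
         Whether the client runs without it has not been verified here. -->
    <exclusion>
      <groupId>org.apache.lucene</groupId>
      <artifactId>lucene-analyzers-common</artifactId>
    </exclusion>
  </exclusions>
</dependency>
```

Maven applies an exclusion to the whole transitive tree under the declared dependency, so the excluded artifact is dropped even though it is pulled in indirectly. Running `mvn dependency:tree` before and after shows what was actually removed; any further exclusions would need the same kind of testing.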
