Elasticsearch REST API: JEST upsert

I’ve already written about tips and tricks when using the Elasticsearch Java API. The Elasticsearch REST API has been going from strength to strength, and it seems that going forward the Elasticsearch team will focus more on the REST API than on the native Java client. At the time of writing, however, the official Java REST library doesn’t seem to offer an abstraction over the bulk API, so I followed some advice and looked into the Jest library.

The only snag with the Jest library is that, when it comes to bulk operations, the documentation only gives examples of scripted updates. The Elasticsearch update API also allows updates using partial documents. Jest supports this functionality, but I couldn’t find good documentation for it. Below is an example for anyone looking for this:


import static org.elasticsearch.common.xcontent.XContentFactory.jsonBuilder;

import java.io.IOException;

import org.elasticsearch.common.xcontent.XContentBuilder;

import io.searchbox.client.JestClient;
import io.searchbox.client.JestClientFactory;
import io.searchbox.client.config.HttpClientConfig;
import io.searchbox.core.Bulk;
import io.searchbox.core.Update;

// initialise the objects required to do bulk updates
JestClientFactory factory = new JestClientFactory();
factory.setHttpClientConfig(new HttpClientConfig
        .Builder("http://" + ElasticSearch_IP + ":9200")
        .multiThreaded(true)
        .build());
JestClient client = factory.getObject();
int current_bulkable_action = 0;
Bulk.Builder bulkProcessor = new Bulk.Builder();

// the below assumes "hit" comes from the ES document you'd like to update
String ES_ID = this.hit.get("_id").getAsString();
String ES_INDEX = this.hit.get("_index").getAsString();
String ES_TYPE = this.hit.get("_type").getAsString();

// build the partial document: { "doc": { "foo": "bar", "john": "doe" } }
// (jsonBuilder() and the chained calls can throw IOException)
XContentBuilder jsonUpdateObject = jsonBuilder().startObject().startObject("doc");
jsonUpdateObject
        .field("foo", "bar")
        .field("john", "doe");
jsonUpdateObject.endObject().endObject();

// wrap the partial document in an update action and queue it
Update update_builder = new Update.Builder(jsonUpdateObject.string())
        .index(ES_INDEX)
        .type(ES_TYPE)
        .id(ES_ID)
        .build();
bulkProcessor.addAction(update_builder);
current_bulkable_action += 1;

// flush once enough actions have been queued
if (current_bulkable_action > 1000) {
    try {
        System.out.println("Flushing BULK processor");
        client.execute(bulkProcessor.build());
        // start a fresh builder so the flushed actions are not sent again
        bulkProcessor = new Bulk.Builder();
        current_bulkable_action = 0;
    } catch (IOException e) {
        e.printStackTrace();
    }
}
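To make it clearer what ends up on the wire, here is a rough plain-Java sketch of the two lines that the Elasticsearch `_bulk` endpoint expects for a single partial-document update: an action line followed by the "doc" payload. It does not use Jest at all; the class `BulkBodySketch`, its methods and the index, type and id values are made up for illustration, and the naive JSON serialisation only copes with simple string values.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.StringJoiner;

// Illustration only: hand-builds the body lines for one bulk update action.
final class BulkBodySketch {

    // Serialise a flat map of string fields as a JSON object.
    // Assumes keys and values need no escaping; good enough for this sketch.
    static String toJson(Map<String, String> fields) {
        StringJoiner joiner = new StringJoiner(",", "{", "}");
        for (Map.Entry<String, String> e : fields.entrySet()) {
            joiner.add("\"" + e.getKey() + "\":\"" + e.getValue() + "\"");
        }
        return joiner.toString();
    }

    // One update action = metadata line + partial-document line, each newline-terminated.
    static String updateAction(String index, String type, String id, Map<String, String> doc) {
        Map<String, String> meta = new LinkedHashMap<>();
        meta.put("_index", index);
        meta.put("_type", type);
        meta.put("_id", id);
        String actionLine = "{\"update\":" + toJson(meta) + "}";
        String docLine = "{\"doc\":" + toJson(doc) + "}";
        return actionLine + "\n" + docLine + "\n";
    }

    public static void main(String[] args) {
        Map<String, String> doc = new LinkedHashMap<>();
        doc.put("foo", "bar");
        doc.put("john", "doe");
        // hypothetical index, type and id
        System.out.print(updateAction("my_index", "my_type", "1", doc));
    }
}
```

Running it prints `{"update":{"_index":"my_index","_type":"my_type","_id":"1"}}` followed by `{"doc":{"foo":"bar","john":"doe"}}` on the next line. The second line has the same shape as the JSON object built with jsonBuilder() in the snippet.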

The important points:

  • You can still use the official Java Elasticsearch client’s “XContentFactory.jsonBuilder” helper to more easily build your JSON objects.
  • The trick is in the line that starts building the JSON object:

jsonBuilder().startObject().startObject("doc")

This creates a nested object with “doc” as the inner JSON object, as outlined by the Elasticsearch documentation:

{
    "doc" : {
        "name" : "new_name"
    }
}

The first startObject() creates the outer curly brackets, while the second startObject("doc") creates the inner “doc” object.

  • We add content to the JSON object with the chained field() calls
  • Just like we had to use two startObject() calls, we need to close the object with two endObject() calls, as shown right after the field() calls
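The start/end pairing described in the points above can be seen in isolation with a toy builder. This is only a sketch: `NestedJsonSketch` is a hypothetical stand-in written for this post, not the Elasticsearch XContentBuilder, and it handles nothing beyond string fields and nested objects.

```java
import java.util.ArrayDeque;
import java.util.Deque;

// Hypothetical stand-in for illustration only; NOT the Elasticsearch XContentBuilder.
final class NestedJsonSketch {
    private final StringBuilder out = new StringBuilder();
    // One entry per open object: true while that object has no members yet.
    private final Deque<Boolean> firstMember = new ArrayDeque<>();

    // Write a separating comma unless this is the first member of the current object.
    private void separator() {
        if (!firstMember.isEmpty()) {
            if (!firstMember.pop()) {
                out.append(',');
            }
            firstMember.push(false);
        }
    }

    // Open an anonymous object (the outer curly brackets).
    NestedJsonSketch startObject() {
        separator();
        out.append('{');
        firstMember.push(true);
        return this;
    }

    // Open a named object nested inside the current one.
    NestedJsonSketch startObject(String name) {
        separator();
        out.append('"').append(name).append("\":{");
        firstMember.push(true);
        return this;
    }

    // Add a string field to the innermost open object.
    NestedJsonSketch field(String name, String value) {
        separator();
        out.append('"').append(name).append("\":\"").append(value).append('"');
        return this;
    }

    // Close the most recently opened object.
    NestedJsonSketch endObject() {
        firstMember.pop();
        out.append('}');
        return this;
    }

    // Number of objects still open; must be 0 for the JSON to be complete.
    int openObjects() {
        return firstMember.size();
    }

    String string() {
        return out.toString();
    }

    public static void main(String[] args) {
        NestedJsonSketch json = new NestedJsonSketch().startObject().startObject("doc");
        json.field("foo", "bar").field("john", "doe");
        System.out.println(json.openObjects()); // two objects are still open here
        json.endObject().endObject();
        System.out.println(json.string());
    }
}
```

After the two startObject() calls and the field() calls, two objects are still open, so the output is only valid JSON once both endObject() calls have run.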

The rest of the snippet deals with the actual bulk update. We pass the object we just created into an Update Builder, which gives us a “bulkable” action that we can pass on to the Jest bulk processor. The snippet is taken from a larger program where it resides in a loop, which explains the if block at the end of the snippet; it’s important to “flush” the bulk processor every so often. The native Java client would do this automatically; so far in Jest you need to account for this yourself.
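Since the flushing has to be done by hand, it may help to see the pattern on its own. The following is a minimal sketch in plain Java, not part of Jest: `BatchFlusherSketch` and its sink callback are hypothetical names, and in the real program the sink is where a call like client.execute(bulkProcessor.build()) would go. It flushes whenever the batch is full and once more at the end, so the leftover actions from the last partial batch are not lost.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

// Hypothetical helper, not part of Jest: collects items and hands them to a sink in batches.
final class BatchFlusherSketch<T> {
    private final int batchSize;
    private final Consumer<List<T>> sink;
    private List<T> pending = new ArrayList<>();

    BatchFlusherSketch(int batchSize, Consumer<List<T>> sink) {
        this.batchSize = batchSize;
        this.sink = sink;
    }

    // Queue one item and flush as soon as the batch is full.
    void add(T item) {
        pending.add(item);
        if (pending.size() >= batchSize) {
            flush();
        }
    }

    // Send whatever is queued and start a fresh batch; does nothing when the queue is empty.
    void flush() {
        if (pending.isEmpty()) {
            return;
        }
        List<T> batch = pending;
        pending = new ArrayList<>();
        sink.accept(batch);
    }

    public static void main(String[] args) {
        List<Integer> flushedSizes = new ArrayList<>();
        // The sink only records batch sizes here; a real one would execute the bulk request.
        BatchFlusherSketch<String> flusher =
                new BatchFlusherSketch<>(1000, batch -> flushedSizes.add(batch.size()));
        for (int i = 0; i < 2500; i++) {
            flusher.add("update-" + i);
        }
        flusher.flush(); // final flush for the leftover 500 items
        System.out.println(flushedSizes);
    }
}
```

With 2500 queued items and a batch size of 1000 this prints `[1000, 1000, 500]`. Without the final flush() after the loop, the last 500 updates would never be sent.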

 

 

 

 

 

 

 
