Add msearch api to high level client #27274

martijnvg · 2017-11-06T10:27:19Z

No description provided.

javanna

thanks a lot @martijnvg for working on this. I left some comments, mainly around testing and minor stuff.

javanna · 2017-11-06T13:43:14Z

client/rest-high-level/src/main/java/org/elasticsearch/client/Request.java

@@ -381,6 +383,17 @@ static Request clearScroll(ClearScrollRequest clearScrollRequest) throws IOExcep
        return new Request("DELETE", "/_search/scroll", Collections.emptyMap(), entity);
    }

+    static Request multiSearch(MultiSearchRequest multiSearchRequest) throws IOException {
+        Params params = Params.builder();
+        if (multiSearchRequest.maxConcurrentSearchRequests() != 0) {


Nit: I would rather not duplicate default values in the client. What's the downside of always passing the parameter no matter what its value is?

Not sure why I did this... I'll always pass down this parameter.

I remember now why I needed to add this. 0 is used as "not specified" and the maxConcurrentSearchRequests(...) setter doesn't accept values lower than 1. So checking whether any value other than 0 is specified is required.

I see, yea we can't avoid this then. Maybe share the default value through a constant so at least we don't duplicate it. Odd! :)

this is odd especially because it seems that once you set a value for this field, you can never reset it to its original default value.

agreed, I think this should be fixed and the setter should allow 0 as a valid value, so that the msearch api falls back to the default based on the number of cores.
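The outcome of this thread (send the concurrency limit only when it differs from the "not specified" value, and share that value through a constant) can be sketched in plain Java. This is a minimal illustration, not the client's code: `MsearchParamsSketch`, `DEFAULT_MAX_CONCURRENT` and `toParams` are hypothetical names, and `max_concurrent_searches` is assumed to be the REST parameter name as documented for the msearch API.

```java
import java.util.LinkedHashMap;
import java.util.Map;

final class MsearchParamsSketch {
    // Hypothetical shared constant: 0 means "not specified, let the server pick a default".
    static final int DEFAULT_MAX_CONCURRENT = 0;

    // Builds the query-string parameters; the limit is only sent when it was set explicitly.
    static Map<String, String> toParams(int maxConcurrentSearchRequests) {
        if (maxConcurrentSearchRequests < 0) {
            throw new IllegalArgumentException("maxConcurrentSearchRequests must not be negative");
        }
        Map<String, String> params = new LinkedHashMap<>();
        if (maxConcurrentSearchRequests != DEFAULT_MAX_CONCURRENT) {
            // Assumed REST parameter name for the msearch endpoint.
            params.put("max_concurrent_searches", Integer.toString(maxConcurrentSearchRequests));
        }
        return params;
    }
}
```

With a single constant, the request class and the request converter compare against the same value, so the default is not duplicated in the client.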

javanna · 2017-11-06T13:47:36Z

client/rest-high-level/src/test/java/org/elasticsearch/client/MultiSearchIT.java

+        client().performRequest("PUT", "/index3/doc/5", Collections.emptyMap(), doc5);
+        StringEntity doc6 = new StringEntity("{\"field\":\"value2\"}", ContentType.APPLICATION_JSON);
+        client().performRequest("PUT", "/index3/doc/6", Collections.emptyMap(), doc6);
+        client().performRequest("POST", "/index1,index2,index3/_refresh");


does it make sense to merge this class with the existing SearchIT?

javanna · 2017-11-06T13:49:10Z

client/rest-high-level/src/test/java/org/elasticsearch/client/MultiSearchIT.java

+        SearchIT.assertSearchHeader(multiSearchResponse.getResponses()[2].getResponse());
+        assertThat(multiSearchResponse.getResponses()[2].getResponse().getHits().getTotalHits(), Matchers.equalTo(1L));
+        assertThat(multiSearchResponse.getResponses()[2].getResponse().getHits().getAt(0).getId(), Matchers.equalTo("6"));
+    }


shall we also have a test for some other common feature like aggregations / highlighting etc. performed via _msearch?

also maybe have a test around failures, which can be tricky.

I don't think we need to test all possible search features here? I think this is covered by SearchIT?

I'll try to add tests for failures.

not all features, just a few of the most common ones I'd say.

javanna · 2017-11-06T13:55:34Z

core/src/test/java/org/elasticsearch/action/search/MultiSearchResponseTests.java

+        return new MultiSearchResponse(items, randomNonNegativeLong());
+    }
+
+}


can you add some tests to RequestTests which test the request conversion from MultiSearchRequest to Request (endpoint, method, params etc.)?

javanna · 2017-11-06T20:51:52Z

core/src/main/java/org/elasticsearch/action/search/MultiSearchRequest.java

+    }
+
+    public static BytesRef writeMultiLineFormat(MultiSearchRequest multiSearchRequest, XContent xContent) throws IOException {
+        BytesRefBuilder builder = new BytesRefBuilder();


is the dependency on lucene necessary here? Could we find another way to do the same?

I looked at another api where BytesRef was used, I'll try to convert to BytesReference.

yea I see that also bulk depends on BytesRef which is not great. If it's too much work we can do it as a follow-up.

javanna · 2017-11-07T09:31:25Z

core/src/main/java/org/elasticsearch/action/search/MultiSearchRequest.java

+            try (XContentBuilder xContentBuilder = XContentBuilder.builder(xContent)) {
+                xContentBuilder.startObject();
+                if (request.indices() != null) {
+                    xContentBuilder.field("indices", request.indices());


nit: could we use index rather than indices? That is what we have in our docs, indices is just a synonym but I don't think that other clients use it either.

javanna · 2017-11-07T09:47:14Z

core/src/main/java/org/elasticsearch/action/search/MultiSearchResponse.java

+    }
+
+    static MultiSearchResponse.Item itemFromXContent(XContentParser parser) throws IOException {
+        // TODO: Is there a better way? The msearch item format is very tricky...


can you elaborate on what is tricky? Is that because items can either be errors or proper instance of Item ?

If the only key is error then we need to parse the xcontent as an error, and otherwise parse it as a search response. In order to read the first field we need to parse up until the FIELD_NAME token is found to decide what to do, but then we are already too far to use SearchResponse#fromXContent(...), which requires that the first token is a start object token.

gotcha. shall we have an innerFromXContent that does not require start_object then, and accepts that we are already at the field_name token? That could be called from msearch? Exceptions should be parseable already with this technique.

javanna · 2017-11-07T09:53:58Z

core/src/main/java/org/elasticsearch/action/search/MultiSearchRequest.java

+
+    @Override
+    public int hashCode() {
+        return Objects.hash(maxConcurrentSearchRequests, requests, indicesOptions);


can you add specific unit tests for equals / hashcode ?
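A rough idea of what such an equals / hashCode test checks, written as plain Java without the test framework. `FakeMultiSearchRequest` is a hypothetical stand-in that only carries the three fields hashed in the diff above (with simplified types); it is not the real request class, and `checkContract` is an illustrative helper rather than an existing test utility.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Objects;

final class EqualsHashCodeSketch {
    // Hypothetical stand-in with the three fields that take part in hashCode().
    static final class FakeMultiSearchRequest {
        int maxConcurrentSearchRequests;
        final List<String> requests = new ArrayList<>();
        String indicesOptions = "strict_expand_open";

        FakeMultiSearchRequest copy() {
            FakeMultiSearchRequest copy = new FakeMultiSearchRequest();
            copy.maxConcurrentSearchRequests = maxConcurrentSearchRequests;
            copy.requests.addAll(requests);
            copy.indicesOptions = indicesOptions;
            return copy;
        }

        @Override
        public boolean equals(Object o) {
            if (this == o) return true;
            if (o == null || getClass() != o.getClass()) return false;
            FakeMultiSearchRequest that = (FakeMultiSearchRequest) o;
            return maxConcurrentSearchRequests == that.maxConcurrentSearchRequests
                && requests.equals(that.requests)
                && indicesOptions.equals(that.indicesOptions);
        }

        @Override
        public int hashCode() {
            return Objects.hash(maxConcurrentSearchRequests, requests, indicesOptions);
        }
    }

    // True when a copy is equal (symmetrically, with the same hash) and a mutated copy is not equal.
    static boolean checkContract(FakeMultiSearchRequest original) {
        FakeMultiSearchRequest copy = original.copy();
        boolean copyEqual = original.equals(copy)
            && copy.equals(original)
            && original.hashCode() == copy.hashCode();
        FakeMultiSearchRequest mutated = original.copy();
        mutated.maxConcurrentSearchRequests++;
        return copyEqual && !original.equals(mutated);
    }
}
```

A real test would mutate each field in turn (here only one is mutated) so that a field missing from equals() is caught.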

javanna · 2017-11-07T09:54:25Z

core/src/test/java/org/elasticsearch/action/search/MultiSearchResponseTests.java

+        }
+    }
+
+    private MultiSearchResponse createTestInstance() {


could this be static?

javanna · 2017-11-07T09:57:31Z

core/src/test/java/org/elasticsearch/action/search/MultiSearchResponseTests.java

+            MultiSearchResponse expected = createTestInstance();
+            XContentType xContentType = randomFrom(XContentType.values());
+            BytesReference shuffled = toShuffledXContent(expected, xContentType, ToXContent.EMPTY_PARAMS, false);
+            XContentParser parser = createParser(XContentFactory.xContent(xContentType), shuffled);


can you also add a test that inserts random fields here so that we test forward compatibility? Have a look at SearchResponseTests#testFromXContentWithRandomFields for an example of that.

martijnvg · 2017-11-08T11:20:26Z

@javanna Thanks for reviewing. I've updated the PR.

javanna

I left a couple of comments and questions but LGTM

javanna · 2017-12-01T15:46:00Z

client/rest-high-level/src/main/java/org/elasticsearch/client/Request.java

+    static Request multiSearch(MultiSearchRequest multiSearchRequest) throws IOException {
+        Params params = Params.builder();
+        params.putParam(RestSearchAction.TYPED_KEYS_PARAM, "true");
+        if (multiSearchRequest.maxConcurrentSearchRequests() != 0) {


would you mind adding a constant in MultiSearchRequest for the 0 default value?

javanna · 2017-12-01T15:57:41Z

core/src/main/java/org/elasticsearch/action/search/MultiSearchResponse.java

+        assert token == Token.FIELD_NAME;
+        boolean statusBeenParsed = false;
+        String fieldName = parser.currentName();
+        if ("status".equals(fieldName)) {


I think that this parser relies on order of keys? can we change it so that it doesn't?

we have a test util method that we use to shuffle fields in the response so we make sure that our parsers don't rely on specific keys ordering.

javanna · 2017-12-01T16:00:46Z

core/src/test/java/org/elasticsearch/action/search/MultiSearchResponseTests.java

+public class MultiSearchResponseTests extends ESTestCase {
+
+    public void testFromXContent() throws IOException {
+        for (int runs = 0; runs < 20; runs++) {


nit: I think that we can drop this loop here and in other tests, and instead just rely on multiple runs of the same tests in our CI.

The abstract parsing tests and query builder tests iterate several times, and that is why I added this loop here, since we extend directly from ESTestCase. I prefer to keep it, if you are ok with it.

javanna · 2017-12-01T16:28:06Z

core/src/test/java/org/elasticsearch/action/search/MultiSearchResponseTests.java

+            XContentType xContentType = randomFrom(XContentType.values());
+            BytesReference shuffled = toShuffledXContent(expected, xContentType, ToXContent.EMPTY_PARAMS, false);
+            if (randomBoolean()) {
+                shuffled = insertRandomFields(xContentType, shuffled, s -> s.contains("responses"), random());


I think that it makes little sense to call insertRandomFields if random fields can only be inserted at the root level (the exclude filter identifies the only root element expected). But I am guessing you are doing this because exceptions cause issues if we insert random elements in their json, and since responses is an array, it is not possible to identify only the exceptions in an exclude filter?

True, it doesn't make a lot of sense. Identifying whether something is a search response or an exception is not possible with the current parsing logic.

I'll remove the insertion of random fields.
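To illustrate why an exclude filter of `s -> s.contains("responses")` leaves only the root object eligible, here is a simplified sketch of path-based random field insertion over nested maps. It is an assumption-laden model, not the test framework's insertRandomFields: `RandomFieldsSketch`, its signature, and the dotted path format are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.Random;
import java.util.function.Predicate;

final class RandomFieldsSketch {
    // Adds one random field to every object whose path is not excluded; returns how many were added.
    static int insertRandomFields(Map<String, Object> object, String path,
                                  Predicate<String> exclude, Random random) {
        int inserted = 0;
        // Iterate over a snapshot of the entries so the map can be modified afterwards.
        for (Map.Entry<String, Object> entry : new ArrayList<>(object.entrySet())) {
            String childPath = path.isEmpty() ? entry.getKey() : path + "." + entry.getKey();
            inserted += visit(entry.getValue(), childPath, exclude, random);
        }
        if (!exclude.test(path)) {
            object.put("random_" + random.nextInt(Integer.MAX_VALUE), "value");
            inserted++;
        }
        return inserted;
    }

    // Recurses into nested objects and arrays; scalar values are left alone.
    @SuppressWarnings("unchecked")
    private static int visit(Object value, String path, Predicate<String> exclude, Random random) {
        if (value instanceof Map) {
            return insertRandomFields((Map<String, Object>) value, path, exclude, random);
        }
        if (value instanceof List) {
            int inserted = 0;
            List<Object> list = (List<Object>) value;
            for (int i = 0; i < list.size(); i++) {
                inserted += visit(list.get(i), path + "." + i, exclude, random);
            }
            return inserted;
        }
        return 0;
    }
}
```

Every path below `responses` contains the string "responses", so with that filter the only insertion happens at the root, which is the limitation discussed above.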

javanna · 2017-12-01T16:33:39Z

core/src/main/java/org/elasticsearch/action/search/MultiSearchResponse.java

+            item = new Item(SearchResponse.innerFromXContent(parser), null);
+            assert parser.currentToken() == Token.END_OBJECT; // SearchResponse.innerFromXContent(...) consumes the entire json object
+        }
+        return item;


looking at the code, it seems like this parsing method relies on keys ordering, but when I look at its test it shuffles the keys, which means I am not reading it right. I guess the point is that status is ignored, and it will either be ignored when it appears before any other element, or while parsing the exception or search hits? Maybe that's what your comment above explained but I didn't get it.

It does not really rely on key ordering; instead it checks the fields one after the other, ignoring status if found as the first field, or ignoring it again right after error, or letting it be parsed by SearchResponse.innerFromXContent().

I agree that this parsing logic is not easy to do but I'm wondering if it could be more readable with the "usual" while loop that skips children on the status field, and delegates to the appropriate parsing methods for error and search response?

What @tlrx is saying is correct, the parsing does not rely on ordering.

I think I can rewrite this code into a loop. However because ElasticsearchException.failureFromXContent(...) does not parse until end object and SearchResponse#innerFromXContent(...) does, the loop will have to immediately break in order to not read other search responses.

sounds good to me Martijn thanks.
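The loop-based idea can be sketched as follows. This is a deliberate simplification: the real code works on a streaming XContentParser and delegates to methods that consume tokens, whereas here a map of already-parsed fields stands in for one array element. `ItemParsingSketch`, `Item` and `parseItem` are hypothetical names.

```java
import java.util.LinkedHashMap;
import java.util.Map;

final class ItemParsingSketch {
    // Hypothetical result type: exactly one of response / failure is non-null.
    static final class Item {
        final Map<String, Object> response;
        final String failure;

        Item(Map<String, Object> response, String failure) {
            this.response = response;
            this.failure = failure;
        }
    }

    // Walks the fields in whatever order they arrive, so key ordering does not matter.
    static Item parseItem(Map<String, Object> fields) {
        Map<String, Object> response = new LinkedHashMap<>();
        String failure = null;
        for (Map.Entry<String, Object> field : fields.entrySet()) {
            String name = field.getKey();
            if ("status".equals(name)) {
                continue; // the status value is ignored wherever it appears
            } else if ("error".equals(name)) {
                failure = String.valueOf(field.getValue()); // element is a failure
            } else {
                response.put(name, field.getValue()); // field belongs to the search response
            }
        }
        return failure != null ? new Item(null, failure) : new Item(response, null);
    }
}
```

Because this model has no token stream, it does not show the early break mentioned above; that is only needed when the delegate methods stop at different positions in the stream.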

tlrx

This looks good overall! I left some minor comments and I'd like to know if the parsing logic in MultiSearchResponse.itemFromXContent() could be improved a bit. If that's not feasible or has too much impact on other parsing methods then it can go in like this.

tlrx · 2017-12-04T10:06:13Z

client/rest-high-level/src/test/java/org/elasticsearch/client/RequestTests.java

+            searchRequest.scroll((Scroll) null);
+            // only expand_wildcards, ignore_unavailable and allow_no_indices can be specified from msearch api, so unset other options:
+            IndicesOptions randomlyGenerated = searchRequest.indicesOptions();
+            IndicesOptions msearchDefault = IndicesOptions.strictExpandOpenAndForbidClosed();


nit: could be new MultiSearchRequest().indicesOptions() instead

tlrx · 2017-12-04T10:29:43Z

core/src/main/java/org/elasticsearch/action/search/MultiSearchRequest.java

+    public String toString() {
+        return "MultiSearchRequest{" +
+                "maxConcurrentSearchRequests=" + maxConcurrentSearchRequests +
+                ", requests=" + requests +


I'm a bit reluctant to have all the search requests rendered here. The SearchRequest.toString() method is verbose and spits out the whole content of the request, so the result here could be very large and I'm worried that it fills some logs or slows down the IDE when debugging. I also don't see much benefit to output everything, but maybe I'm wrong. What do you think?

I agree. I'll remove the toString() method.

tlrx · 2017-12-04T10:38:27Z

core/src/main/java/org/elasticsearch/action/search/MultiSearchRequest.java

+    }
+
+    public static byte[] writeMultiLineFormat(MultiSearchRequest multiSearchRequest, XContent xContent) throws IOException {
+        ByteArrayOutputStream output = new ByteArrayOutputStream();


We could use BytesStreamOutput

I don't think there is an added benefit for using BytesStreamOutput? Also getting the raw byte[] which is what gets used, requires extra steps (BytesReference -> BytesRef -> copy BytesRef's byte buffer)

Also getting the raw byte[] which is what gets used, requires extra steps (BytesReference -> BytesRef -> copy BytesRef's byte buffer)

Good point
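For readers unfamiliar with the format being written here: an msearch body is newline-delimited, with one header line and one body line per search, each terminated by a newline. A minimal sketch using ByteArrayOutputStream, assuming the JSON for each line has already been serialized; `MultiLineFormatSketch` and `Entry` are hypothetical and the method signature differs from the one in the diff.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;
import java.util.List;

final class MultiLineFormatSketch {
    // Hypothetical holder for one search: a JSON header line and a JSON body line.
    static final class Entry {
        final String headerJson;
        final String bodyJson;

        Entry(String headerJson, String bodyJson) {
            this.headerJson = headerJson;
            this.bodyJson = bodyJson;
        }
    }

    // Writes header and body for every entry, one per line, and returns the raw bytes.
    static byte[] writeMultiLineFormat(List<Entry> entries) {
        ByteArrayOutputStream output = new ByteArrayOutputStream();
        for (Entry entry : entries) {
            writeLine(output, entry.headerJson);
            writeLine(output, entry.bodyJson);
        }
        return output.toByteArray();
    }

    private static void writeLine(ByteArrayOutputStream output, String json) {
        byte[] bytes = json.getBytes(StandardCharsets.UTF_8);
        output.write(bytes, 0, bytes.length);
        output.write('\n'); // every line, including the last, ends with a newline
    }
}
```

ByteArrayOutputStream hands back the byte[] directly through toByteArray(), which is the point made above about avoiding extra conversion steps.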

tlrx · 2017-12-04T10:39:12Z

core/src/main/java/org/elasticsearch/action/search/MultiSearchResponse.java

+    private static final ConstructingObjectParser<MultiSearchResponse, Void> PARSER = new ConstructingObjectParser<>("multi_search",
+            true, a -> new MultiSearchResponse(((List<Item>)a[0]).toArray(new Item[0]), (long) a[1]));
+    static {
+        PARSER.declareObjectArray(constructorArg(), (p, c) -> itemFromXContent(p),RESPONSES);


nit: space before RESPONSES

tlrx · 2017-12-04T12:43:59Z

core/src/main/java/org/elasticsearch/action/search/MultiSearchResponse.java

+            item = new Item(SearchResponse.innerFromXContent(parser), null);
+            assert parser.currentToken() == Token.END_OBJECT; // SearchResponse.innerFromXContent(...) consumes the entire json object
+        }
+        return item;


It does not really rely on key ordering; instead it checks the fields one after the other, ignoring status if found as the first field, or ignoring it again right after error, or letting it be parsed by SearchResponse.innerFromXContent().

I agree that this parsing logic is not easy to do but I'm wondering if it could be more readable with the "usual" while loop that skips children on the status field, and delegates to the appropriate parsing methods for error and search response?

martijnvg · 2017-12-04T15:56:36Z

Thanks for looking @javanna @tlrx. I've updated the PR and changed the multi search response item parsing to be a loop.

tlrx · 2017-12-04T16:16:39Z

core/src/main/java/org/elasticsearch/action/search/MultiSearchResponse.java

+                    if ("status".equals(fieldName)) {
+                        // Ignore the status value
+                    } else {
+                        throw new IllegalArgumentException("unexpected field name [" + fieldName + "]");


I think it could be more permissive and just ignore the field? We used to be permissive when parsing responses in order to have a better forward compatibility...

right, I've changed it.

it should be permissive... but we are not testing forward compatibility properly because of the structure of these objects. This is quite a bummer, we should probably look into testing this, otherwise permissive or not doesn't make a difference as we don't test it.

@javanna Don't we usually inject random fields to test this?

yes but we discussed as part of this review issues around testing this, because the response contains an array of objects which can be either exceptions or an ordinary search response. Exceptions still can hold metadata fields, hence injecting random fields in there would cause issues and needs to be skipped. It is complicated to distinguish between exceptions and search responses in the array, hence we don't currently insert random fields.

Right, thanks @javanna, I'd forgotten this discussion. As long as it is permissive, even if not tested, it's OK to be merged.

Cool, I'll merge once the pr build is green.

GoldenHans · 2018-01-29T01:06:34Z

Hello! Could you tell me when this method will be released? I really need it! Thank you for your help!

javanna · 2018-01-29T08:45:56Z

@GoldenHans it will be released with 6.2. It should come out soon.

martijnvg added :Java High Level REST Client >enhancement v6.1.0 v7.0.0 labels Nov 6, 2017

martijnvg requested a review from javanna November 6, 2017 10:27

javanna mentioned this pull request Nov 6, 2017

Java high-level REST client completeness #27205

Closed

80 tasks

javanna requested changes Nov 7, 2017

martijnvg force-pushed the hl_client_mseach_api branch from 7fb8c2e to 3672923 Compare November 8, 2017 13:14

javanna self-assigned this Nov 13, 2017

martijnvg added v6.2.0 and removed v6.1.0 labels Nov 22, 2017

javanna approved these changes Dec 1, 2017

tlrx self-requested a review December 4, 2017 08:33

tlrx approved these changes Dec 4, 2017

martijnvg force-pushed the hl_client_mseach_api branch from 3672923 to 96e8e8a Compare December 4, 2017 15:54

tlrx reviewed Dec 4, 2017

martijnvg force-pushed the hl_client_mseach_api branch 2 times, most recently from a4d429f to b826c45 Compare December 5, 2017 07:05

Added msearch api to high level client

4d78e1a

martijnvg force-pushed the hl_client_mseach_api branch from b826c45 to 4d78e1a Compare December 5, 2017 09:18

martijnvg merged commit 4d78e1a into elastic:master Dec 5, 2017

colings86 added v7.0.0-beta1 and removed v7.0.0 labels Feb 7, 2019

pquentin mentioned this pull request Jul 21, 2025

Add missing index query parameter to msearch elastic/elasticsearch-specification#4980

Merged
