-
Notifications
You must be signed in to change notification settings - Fork 264
Map Builders should preserve order #99
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Comments
Java indeed has
We can introduce a workaround in the Java client for this specific API (by adding an annotation on the API spec that will trigger the workaround in the code generator), but we should not assume that all client libraries will be able to implement it. |
@hendrikmuhs after discussing this with the larger Clients Team and careful evaluation, we've decided to not implement the change you suggest. The problem, as I mentioned previously, is that some languages have no builtin support for order-preserving maps, while others have (even as a default, like Python or Ruby). Yet all languages are compliant with the JSON spec, which is the semantic model on which the Elasticsearch API is built. So implementing this in the Java client, even if simple and innocuous from a technical perspective, will not solve the issue for the entire range or client libraries, maintained either by Elastic or by 3rd parties. If this lack of ordering is a blocking issue, then this API should be updated to a format that can capture order. The example of sort criteria comes to mind, which are defined as an array of objects containing a single key. Feel free to reach out to the Clients Team if you want to discuss this further. |
That's correct w.r.t. the JSON spec, I already mentioned that in the beginning. This is fine for clients that take JSON or a dictionary, but not for the java client: A java client user uses a builder, he does not know and does not care whether JSON is used behind the scenes. This is an implementation detail. I agree that other client languages require their own assessment, I can not speak for all, but e.g. the python client has a totally different interface. There I could use an ordered dict if I want to preserve order.
Yes, I agree. I opened this issue for the java client, but not for other languages.
I think that's the long term way to solve this, the transform API should be changed to optionally take an array of objects:
instead of:
I will discuss this with the backend team. Note that this does not change the way a user interacts with the java client. He would still call the builder, the only difference is the way the client calls the backend afterwards.
I will, after some consultation with the backend team. |
Hi @hendrikmuhs Respectfully, I beg to differ on JSON being an implementation detail. From the sole point of view of the Java client user, this may be true, but the full public product surface offered by Elastic includes not only the API offered by the clients, but also the HTTP/JSON interface itself, which can be used directly. It is everyone's general interest to ensure that the user experience is consistent across all parts of the product surface and divergent semantics (X can be considered ordered if used via method 1, but not via method 2) increases both maintenance costs and usage complexity. So ultimately, we cannot look at at the Java client in isolation; it has to be designed and built in line with all the other components within our product suite. In terms of your specific case, my recommendation would be to discuss how to persist the ordering information across all layers of the system for the API you have in mind. That way, we may be able to enhance the semantics there in a way that works for all parts of the product surface equally. |
The problem appears only if the java client is used (maybe for other clients, too, that has to be investigated), the API works fine using the HTTP API. It's the client that breaks the contract as it shuffles what the user puts in. |
What contract is broken? As already mentioned above, JSON objects are defined as unordered, so any JSON parser is free to yield the values it reads in whatever order is convenient for the implementation. The only way to avoid that and retain the order in which the server happened to write them would be to write your own custom JSON parser which guarantees to retain that information. And similarly, while it may normally do so in practice, the server itself does not guarantee that it will output the information in the payload in the same order as the user provided it. Again, this is a result of JSON having been chosen as the output format, rather than a vendor-specific variant of JSON (e.g. But no contracts that exist today have been broken. It's simply that those contracts don't go far enough to cover your requirement. And so I return to my previous point: that we should instead discuss how to expose the additional semantics for the specific API. |
@hendrikmuhs the exact same issue will happen with our Go, C# and Rust clients, and any custom code people will write in these languages. The Python dictionary, Ruby hash and PHP associative array happen to keep insertion order, but this is just a happy coincidence. |
Let's take this offline, it looks like gh is not the right place to discuss this, we seem to discuss different issues. I am very aware that JSON dictionaries do not define an order, I've never claimed so. We will discuss in backend and get in touch afterwards. |
We have discussed the underlying issue in the backend team: The transform API parses/stores/serializes the configuration in question in a way that preserves order, this works if the REST API is used directly. This is out of the specification and inconsistent to other parts of the stack. The reason for this feature is performance. Starting with 7.15 transform has some builtin optimizations for improving performance by optimizing the group by query. That's one of the reasons why we've decided to keep the API the way it is and not change/enhance the syntax (make it order-aware or add sort criteria). We will rather spend more time on improving the optimizations. As a result we will adjust the documentation accordingly. If you are using REST it will still preserve order, but that's considered an implementation detail. That's for the technical aspect of it. From a usability perspective I still think that consistent order helps UX. But that's not my decision. Thanks for your time, I will close the corresponding HLRC issue. |
Object Builders for maps use an
InternalMap
which is a sub-class of a hash map:elasticsearch-java/java-client/src/main/java/co/elastic/clients/util/ObjectBuilderBase.java
Line 91 in 75feae5
Hash maps are unordered, the order is defined by the internal hashing function and can be different for different runtime environments. Although the JSON standard does not define dictionaries to be ordered, it would be good to switch to a linked hash map with insertion order:
Because this does not seems performance-critical, I suggest to switch to a linked hash map with insertion order
The text was updated successfully, but these errors were encountered: