-
Notifications
You must be signed in to change notification settings - Fork 469
Extend polymorphic compare to support objects. #2657
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Extend polymorphic compare to support objects. #2657
Conversation
Extend polymorphic compare to support objects created as e.g. `[%bs.obj {x=1; y=2}]`. TODO: equality and hash. Objects, when viewed as blocks, have size 0 and no tag. Here an object `o` is recognized by checking that `o.constructor === Objects`. Then, the keys and values can be extracted. The compare function keeps the same structure, with one extra while loop, and does not change unless objects are encountered. Presumably `Object.keys(o)` performs some allocation.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
=D
jscomp/runtime/caml_obj.ml
Outdated
@@ -198,6 +208,17 @@ let rec caml_compare (a : Obj.t) (b : Obj.t) : int = | |||
else | |||
let len_a = Bs_obj.length a in | |||
let len_b = Bs_obj.length b in | |||
if len_a = 0 && len_b = 0 && O.is_object a && O.is_object b then | |||
begin | |||
let keys_a = O.keys a in |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Unfortunate that this allocates though. Should we loop instead?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
we did not support for .. in
currently, maybe it's fine to write such function in raw js?
Here is something I imagined:
let unsafe_js_compare : Obj.t -> Obj.t -> int = [%raw{| function (first,second){
}
|}]
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The thing about looping is that we have to traverse both objects, one item at a time, as that's the order in which comparison proceeds. I could not find a way to do that.
One could choose a different order for the comparison, e.g. as in sets sorted by keys, but again I would not know how to do that without allocating.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is object keys order important? This passes the type checker:
https://reasonml.github.io/en/try.html?reason=C4TwDgpgBAZg9nKBeKBvAdFARPOWBcUAlgHbAA02ARgIYBOBUAzsHaQOYC+A3AFC8AbCMCgBbEADEEhXMjRZaDQlgAWEEFko4EjAIw9+QkTRJxgaulLgyEc1NryFdWxY1XqsPIA
So you'd accidentally consider these two objs as non-equal
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's possible to ignore the order of keys, by considering them as a sorted set. Trying that next.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Right. So if the order of keys doesn’t matter, you can do two for-loops; check object1 against object2, then object2 against object1. This avoids the allocations
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Discussed more offline. Seems like there's no way to avoid allocation: some ordering on keys needs to be defined, and elements compared in that order.
Two possible orders are: 1 the order in which keys appear in the objects, 2 alphabetical order.
If going for 1, one needs to compare the first key in a
with the first key in b
, and so on. This can't be done with a for..in.
(With a for..in one can compare the first key in a
with the corresponding key in b
, but this is not a well-defined order, as e.g. it's not symmetric.)
If going for 2, one needs to compare the keys in alphabetical order, and this can't be done with a for..in.
Currently choosing to go for 2, which has the property that {x:1, y:2}
and {y:2, x:1}
are considered equal, so insertion order does not get in the way.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think it makes perfect sense to support objects, some extensive tests needed though.
Equality should be synced up, hash is not that relevant in my opinion
To make the comparison order-independent, sort the keys before comparing them.
@bobzhang there are several tests that exercise a bunch of corner cases. |
jscomp/runtime/caml_obj.ml
Outdated
@@ -223,6 +248,16 @@ and aux_length_b_short (a : Obj.t) (b : Obj.t) i short_length = | |||
let res = caml_compare (Obj.field a i) (Obj.field b i) in | |||
if res <> 0 then res | |||
else aux_length_b_short a b (i+1) short_length | |||
and aux_obj_compare (a: Obj.t) keys_a (b: Obj.t) keys_b i min_len default_res = | |||
if i = min_len then default_res |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
if the length is not the same, should we exit early here?
Note it does not to have to follow the semantics of caml_compare for block which is constrained by existing ocaml specification.
Oh, one thing which reminds of me that caml_compare
semantics will be changed if we change record from array to dict is that the iteration order needs to be the same as declaration order
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Found a way to avoid Object.keys entirely, and only rely on for..in, the code structure will change.
jscomp/runtime/caml_obj.ml
Outdated
let default_res = len_a - len_b in | ||
aux_obj_compare a keys_a b keys_b 0 min_len default_res | ||
end | ||
else | ||
if len_a = len_b then | ||
aux_same_length a b 0 len_a |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
shall we put this branch earlier, this branch seems to be hit more often than the previous one? Is it okay to assume Array.isArray
API exists? (cc @chenglou )
if len_a = len_b && is_array a
In that case, we don't need o.constructor == Object
jscomp/runtime/caml_obj.ml
Outdated
let () = O.sort(keys_b) in | ||
aux_obj_equal a keys_a b keys_b 0 len_a | ||
end | ||
else | ||
if len_a = len_b then | ||
aux_equal_length a b 0 len_a |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
same here
Found a way to avoid calling |
Found a way to avoid calling `Object.keys entirely` and in principle avoid allocation. A key is a witness that `a` is smaller than `b` if it's only in `b` or it's in both and the corresponding value in `a` is smaller. It's possible to traverse `a` and `b` separately to find the smallest (in alphabetical order) key witnessing that `a` is smaller, and that `b` is smaller, then the smaller of the two keys determines the result.
lib/js/caml_obj.js
Outdated
var a$1 = a; | ||
var b$1 = b; | ||
var result = [/* true */1]; | ||
var do_key_a = (function(b$1,result){ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Wondering whether it's possible to avoid creating a function here -- perhaps by writing the entire aux_obj_compare
in js directly.
lib/js/caml_obj.js
Outdated
b$1, | ||
min_key_rhs | ||
]; | ||
var do_key_a = (function(partial_arg){ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Wondering whether it's possible to avoid creating a function here -- perhaps by writing the entire aux_obj_compare in js directly.
@@ -326,4 +424,4 @@ exports.caml_lessthan = caml_lessthan; | |||
exports.caml_lessequal = caml_lessequal; | |||
exports.caml_min = caml_min; | |||
exports.caml_max = caml_max; | |||
/* No side effect */ | |||
/* for_in Not a pure module */ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This may have some cascading effect
I would like to have this functionality in first and iterate later.
|
@bobzhang sure, using |
The algorithm looks correct to me. The api Feel free to merge it when you think it is ready |
@@ -223,6 +237,27 @@ and aux_length_b_short (a : Obj.t) (b : Obj.t) i short_length = | |||
let res = caml_compare (Obj.field a i) (Obj.field b i) in | |||
if res <> 0 then res | |||
else aux_length_b_short a b (i+1) short_length | |||
and aux_obj_compare (a: Obj.t) (b: Obj.t) = | |||
let min_key_lhs = ref None in |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can probably avoid the allocation here if we used Js.Nullable.t?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes that would work, though this runtime library does not have access to Js.Nullable.
I have a straight js version in da928db which does that and simplifies a bunch of other things.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@bobzhang what do you think?
Extend polymorphic compare to support objects created as e.g.
[%bs.obj {x=1; y=2}]
.The comparison of
{x:1, y:2}
and{x:3, y:4}
is defined logically as if the objects were lists of pairs: the first keys are compared, then the first values, then the second keys, then the second values, etc.In particular, the order of keys matters, and
{x:1, y:2}
is different from{y:2, x:1}
.TODO: equality and hash.
Objects, when viewed as blocks, have size 0 and no tag.
Here an object
o
is recognized by checking thato.constructor === Object
.Then, the keys and values can be extracted.
The compare function keeps the same structure, with one extra while loop, and does not change unless objects are encountered. Presumably
Object.keys(o)
performs some allocation.