Skip to content

should get_range return a KeyValue stream? #123

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
pH14 opened this issue May 25, 2019 · 3 comments · Fixed by #143
Closed

should get_range return a KeyValue stream? #123

pH14 opened this issue May 25, 2019 · 3 comments · Fixed by #143

Comments

@pH14
Copy link
Collaborator

pH14 commented May 25, 2019

Transaction#get_ranges is a little tricky to use. It returns a Stream<GetRangeResult>, where GetRangeResult itself contains a KeyValues which derefs into a [KeyValue]. This means an extra layer of iteration to loop through all values in a range.

It'd be nice to offer a stream to the underlying KeyValue entries directly, which is something the other higher-level bindings do (links below!).

My proposal would be to:

  • remove the current flavor of get_range as a public function (it forces you to think about iteration count, which isn't intuitive + is only relevant with certain streaming modes)
  • make the current get_ranges return Stream<KeyValue>
  • rename get_ranges to get_range

I think this would bring the binding into closer alignment with the others for range reads.

[1] : Java returns a Iterable<KeyValue> (https://apple.github.io/foundationdb/javadoc/index.html)
[2] : Go returns a RangeResult which can either be used to pull all values into a slice, or act as an iterator(https://godoc.org/github.com/apple/foundationdb/bindings/go/src/fdb#Transaction.GetRange)
[3] : Python returns a generator over the KVs (https://github.com/apple/foundationdb/blob/master/bindings/python/fdb/impl.py#L327 / https://apple.github.io/foundationdb/api-python.html)

@bluejekyll
Copy link
Collaborator

Glad you’re interested in the project. These sound like decent proposals, but I’d like some time to consider them, I’ve been a bit busy over the last few weeks.

@pH14
Copy link
Collaborator Author

pH14 commented Jun 25, 2019

I took a stab at this, but, as a Rust newbie, wasn't able to figure out the lifetimes. If any spectators have any ideas, I'm all ears 😁

Currently to iterate through all kv pairs in a range you might start off with something like:

transaction
.get_ranges(range_option)
.map_err(|(_, e)| e)
.map(|range| { // iterate over the `GetRangeResult` entries from the stream
    for kv in range.key_values().as_ref() { // now iterate over each KV within a given `GetRangeResult`
        // do something with the key values
    }
}) ...

The underlying FDB C API pulls in a GetRangeResult at a time, so the validity of a given KeyValue is tied to the lifetime of its range result. I'm not sure how to reconcile the lifetimes of these GetRangeResult with the lifetime of the overall stream -- each GetRangeResult doesn't live for the entire Stream, and each KeyValue lives as long as its GetRangeResult.

Ideally we can get something that iterates on the keyvalues directly:

transaction
.get_range(range_option)
.map_err(|(_, e)| e)
.map(|kv| { // iterate over KVs directly from underlying `GetRangeResult`s
        // do something with the key values
}) ...

An easy way to do this would be to just copy the byte arrays of the KV entries as they come in (something the other bindings do), but it'd be nice to package this up so it defaults to zero-copy and allows clients to choose whether they want to hold onto the values outside of transaction.

@bluejekyll
Copy link
Collaborator

Sorry you ran into issues here. Sometimes flat_map ends up being the right tool in a situation like this. Otherwise, you end up needing to wrap the result in some other type to manage the iteration over the data...

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging a pull request may close this issue.

2 participants