Skip to content
This repository was archived by the owner on Jul 3, 2023. It is now read-only.

Add 'orgchart' example similar to Differential Dataflow first example. #286

Closed
wants to merge 3 commits into from

Conversation

blp
Copy link
Contributor

@blp blp commented Feb 3, 2023

This is my first try at using DBSP. I'm sure stuff is wrong.

The most puzzling thing here for me is in print_output(). It seems
like there should be an easier way to iterate through the output
than two levels of key/value advancement. It also seems odd that
cursor.weight() mutates the cursor.

Signed-off-by: Ben Pfaff [email protected]

@codecov
Copy link

codecov bot commented Feb 3, 2023

Codecov Report

Merging #286 (3cb8ea5) into main (3aa3e42) will decrease coverage by 4.44%.
The diff coverage is 8.63%.

Additional details and impacted files

Impacted file tree graph

@@            Coverage Diff             @@
##             main     #286      +/-   ##
==========================================
- Coverage   84.03%   79.60%   -4.44%     
==========================================
  Files         156      162       +6     
  Lines       28387    30139    +1752     
==========================================
+ Hits        23856    23993     +137     
- Misses       4531     6146    +1615     
Impacted Files Coverage Δ
adapters/src/controller/error.rs 8.69% <0.00%> (ø)
adapters/src/test/kafka.rs 96.64% <ø> (ø)
pipeline_manager/src/compiler.rs 0.00% <0.00%> (ø)
pipeline_manager/src/config.rs 0.00% <0.00%> (ø)
pipeline_manager/src/db.rs 0.00% <0.00%> (ø)
pipeline_manager/src/runner.rs 0.00% <0.00%> (ø)
src/circuit/circuit_builder.rs 87.65% <ø> (ø)
src/operator/aggregate/max.rs 79.16% <0.00%> (-3.45%) ⬇️
src/operator/aggregate/min.rs 79.16% <0.00%> (-3.45%) ⬇️
src/operator/aggregate/mod.rs 96.51% <ø> (ø)
... and 17 more

Copy link
Collaborator

@ryzhyk ryzhyk left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks great, thanks! Not sure what happened in CI. Will have a look.

type Weight = i32;
type SkipLevels = OrdIndexedZSet<EmployeeID, SkipLevel, Weight>;

fn print_output(output: &OutputHandle<OrdIndexedZSet<usize, SkipLevel, Weight>>) {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

usize -> EmployeeID?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed with my push just now.

let skiplevels: Stream<_, SkipLevels> = manages_by_employee
.join_index::<(), _, _, _, _, _>(&manages_by_manager, |common, m1, m2| {
Some((
*common,
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure you need to index output by common. Could you just output OrdZSet<SkipLevel, Weight> instead of OrdIndexedZSet<usize, SkipLevel, Weight>?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh, thanks for pointing that out. I thought that smelled funny but I didn't understand the difference between join and join_index. I fixed it with my push just a minute ago.

@ryzhyk
Copy link
Collaborator

ryzhyk commented Feb 3, 2023

Yes, print_output should be factored into a reusable function. You should be able to use .to_string(), to format batches as strings using the Display trait, but the output isn't very nice. We probably want an IndexedZSet<K:Display, V:Display, R:Display>::display() method with default implementation similar to print_output.

Agreed about weight taking mut cursor being odd. It has to do with internal implementation details of some cursor types. I tried to fix it in the past, but don't remember what the showstopper was.

ryzhyk and others added 2 commits February 3, 2023 13:06
Add derive's for `Encode` and `Decode` in the `orgchart` example, so
that it compiles with persistence enabled.
I didn't understand that the difference between 'join' and 'join_index'
was that the latter indexes the output stream.  The example did not use
the index, so we can omit it.  This commit does so.

Signed-off-by: Ben Pfaff <[email protected]>
@mihaibudiu
Copy link

How about filing issues for the problems you discovered with APIs?

@blp
Copy link
Contributor Author

blp commented Feb 3, 2023

How about filing issues for the problems you discovered with APIs?

Good idea. Filed issue #289 and issue #290.

@blp
Copy link
Contributor Author

blp commented Feb 6, 2023

Closing in favor of PR #295.

@blp blp closed this Feb 6, 2023
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants