-
Notifications
You must be signed in to change notification settings - Fork 290
Rework container abstractions #697
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Rework container abstractions #697
Conversation
4c3db2f to
4804939
Compare
Signed-off-by: Moritz Hoffmann <[email protected]>
I validated all call sites to handle non-empty containers gracefully. Container builders already need to handle it, and inputs should do so, too. Signed-off-by: Moritz Hoffmann <[email protected]>
Signed-off-by: Moritz Hoffmann <[email protected]>
Signed-off-by: Moritz Hoffmann <[email protected]>
4804939 to
58d0b70
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pull Request Overview
This PR reworks the container abstractions in Timely by splitting the monolithic Container trait into smaller, more focused traits and renaming key concepts for clarity. The main purpose is to separate concerns and make the container interface more modular.
Key changes include:
- Splitting
Containertrait intoWithProgress,IterContainer, andDrainContainertraits - Renaming
Container::lentoWithProgress::update_countto clarify its purpose for progress tracking - Removing
Container::clearmethod and handling cleanup through container builders - Updating all usage sites throughout the codebase to use the new trait names
Reviewed Changes
Copilot reviewed 35 out of 35 changed files in this pull request and generated 3 comments.
Show a summary per file
| File | Description |
|---|---|
| container/src/lib.rs | Core trait definitions split from monolithic Container to focused traits |
| timely/src/lib.rs | Updated public API to export WithProgress instead of Container |
| timely/src/logging.rs | Updated trait bounds and field names for progress tracking |
| timely/examples/columnar.rs | Updated trait implementations for the new container abstraction |
| timely/src/dataflow/* | Updated trait bounds throughout dataflow operators and channels |
Tip: Customize your code reviews with copilot-instructions.md. Create the file or learn how to get started.
Signed-off-by: Moritz Hoffmann <[email protected]>
Signed-off-by: Moritz Hoffmann <[email protected]>
58d0b70 to
d941d0f
Compare
It's a composite trait requiring WithProgress, Default, Clone, 'static Signed-off-by: Moritz Hoffmann <[email protected]>
c487b3f to
711258b
Compare
Signed-off-by: Moritz Hoffmann <[email protected]>
| } | ||
|
|
||
| /// TODO | ||
| pub trait DrainContainer { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Consider whether DrainContainer should have a clear function. It seems the best place if we'd like to have one.
We could also change the requirement for DrainContainer to be empty after calling drain, but that'd complicate the implementations of DrainContainer as the iterator needs to clear on drop.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe it makes sense (inventing trouble) to have RefIter and OwnIter? Partly to break the "drain" association of "it ends up empty", which is true only for Vec I think.
Alternately, it feels a bit like these might be requirements that &Self: IntoIterator and &mut Self: IntoIterator, without introducing new traits. Though the traits may be easier to state than then for<'a> bounds you'd need to state the above things. I think we can defer that to later though!
* Properly document where containers are left in an undefined state. * Change update count to record count. * Add some missing documentation and remove outdated comments Signed-off-by: Moritz Hoffmann <[email protected]>
frankmcsherry
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Seems good to me. I left some notes that I think can largely be discussed and resolved async, so afaict no reason not to merge.
| } | ||
|
|
||
| /// TODO | ||
| pub trait DrainContainer { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe it makes sense (inventing trouble) to have RefIter and OwnIter? Partly to break the "drain" association of "it ends up empty", which is true only for Vec I think.
Alternately, it feels a bit like these might be requirements that &Self: IntoIterator and &mut Self: IntoIterator, without introducing new traits. Though the traits may be easier to state than then for<'a> bounds you'd need to state the above things. I think we can defer that to later though!
| if !data.is_empty() { | ||
| output.session(&time).give_iterator(data.drain().filter(&mut predicate)); | ||
| } | ||
| output.session(&time).give_iterator(data.drain().filter(&mut predicate)); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I was about to leave a comment about how this is a potential regression, and then concluded that I don't really care. :D
Signed-off-by: Moritz Hoffmann <[email protected]>
d3010f0 to
153f3d8
Compare
This is another attempt at reworking the container abstractions in Timely.
At the moment, we have two types that a lot of the container abstractions hinge on,
ContainerandContainerBuilder. Both do things that are of localized interest, among the things they need to do. Containers can explain their contents, can clear themselves, even though it's not a primary concern of a container. What it absolutely needs to do is to explain to progress tracking how many updates it contains. The container builder can partition itself, but only if the container is special, i.e., it provides internal access.This PR contains individual commits to do the following, roughly ordered by the sequence of commits:
Container::pushbecause it wasn't used and would just defer topush_into.Container::clear. This is somewhat problematic as forgetting to clear can cause resource leaks or repeating data that we already processed earlier. The good news is that we're using container builders in most places, and container builders need to carefully recycle the containers they produced. They're also in a much better position to clear because they know what kind of container they're handling. I looked through all the call sites in Timely, and convinced myself this change is safe.Container,IterContainer,DrainContainer, without depending on each other. The container provides a length for progress tracking, the iter and drain containers allow to iterate or drain the contents. It's a bit strange as the drain container doesn't need to be empty after draining, probably something to look at later.is_emptycalls where we know containers are non-empty.Container::lentoupdate_count. Document it as specific to progress tracking.ContainertoWithProgressbecause that's what it's about.Partitionertrait in exchange that knows how to partition a specific container into pushers. Remove thepartitionfunction from the container builder. This change is probably worth looking at individually, or even as a separate PR.None of this is intended as "this is the only way it has to happen", but rather a sequence of steps that might get us to a clearer definition of the crucial types we use in Timely.
The location of the traits is up to discussion, and maybe we conclude we don't need the container crate anymore and rather localize the traits.