Skip to content

Conversation

@antiguru
Copy link
Member

@antiguru antiguru commented Aug 16, 2025

This is another attempt at reworking the container abstractions in Timely.

At the moment, we have two types that a lot of the container abstractions hinge on, Container and ContainerBuilder. Both do things that are of localized interest, among the things they need to do. Containers can explain their contents, can clear themselves, even though it's not a primary concern of a container. What it absolutely needs to do is to explain to progress tracking how many updates it contains. The container builder can partition itself, but only if the container is special, i.e., it provides internal access.

This PR contains individual commits to do the following, roughly ordered by the sequence of commits:

  • Remove Container::push because it wasn't used and would just defer to push_into.
  • Remove Container::clear. This is somewhat problematic as forgetting to clear can cause resource leaks or repeating data that we already processed earlier. The good news is that we're using container builders in most places, and container builders need to carefully recycle the containers they produced. They're also in a much better position to clear because they know what kind of container they're handling. I looked through all the call sites in Timely, and convinced myself this change is safe.
  • Split the container trait into Container, IterContainer, DrainContainer, without depending on each other. The container provides a length for progress tracking, the iter and drain containers allow to iterate or drain the contents. It's a bit strange as the drain container doesn't need to be empty after draining, probably something to look at later.
  • Remove some unneeded is_empty calls where we know containers are non-empty.
  • Rename (and change type) of Container::len to update_count. Document it as specific to progress tracking.
  • Rename Container to WithProgress because that's what it's about.
  • (Moved to separate PR Distributor trait #700) Introduce a Partitioner trait in exchange that knows how to partition a specific container into pushers. Remove the partition function from the container builder. This change is probably worth looking at individually, or even as a separate PR.

None of this is intended as "this is the only way it has to happen", but rather a sequence of steps that might get us to a clearer definition of the crucial types we use in Timely.

The location of the traits is up to discussion, and maybe we conclude we don't need the container crate anymore and rather localize the traits.

Signed-off-by: Moritz Hoffmann <[email protected]>
I validated all call sites to handle non-empty containers gracefully.
Container builders already need to handle it, and inputs should do so, too.

Signed-off-by: Moritz Hoffmann <[email protected]>
Signed-off-by: Moritz Hoffmann <[email protected]>
@antiguru antiguru requested a review from Copilot August 16, 2025 19:48

This comment was marked as outdated.

@antiguru antiguru requested a review from Copilot August 16, 2025 20:14
Copy link

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull Request Overview

This PR reworks the container abstractions in Timely by splitting the monolithic Container trait into smaller, more focused traits and renaming key concepts for clarity. The main purpose is to separate concerns and make the container interface more modular.

Key changes include:

  • Splitting Container trait into WithProgress, IterContainer, and DrainContainer traits
  • Renaming Container::len to WithProgress::update_count to clarify its purpose for progress tracking
  • Removing Container::clear method and handling cleanup through container builders
  • Updating all usage sites throughout the codebase to use the new trait names

Reviewed Changes

Copilot reviewed 35 out of 35 changed files in this pull request and generated 3 comments.

Show a summary per file
File Description
container/src/lib.rs Core trait definitions split from monolithic Container to focused traits
timely/src/lib.rs Updated public API to export WithProgress instead of Container
timely/src/logging.rs Updated trait bounds and field names for progress tracking
timely/examples/columnar.rs Updated trait implementations for the new container abstraction
timely/src/dataflow/* Updated trait bounds throughout dataflow operators and channels

Tip: Customize your code reviews with copilot-instructions.md. Create the file or learn how to get started.

It's a composite trait requiring WithProgress, Default, Clone, 'static

Signed-off-by: Moritz Hoffmann <[email protected]>
Signed-off-by: Moritz Hoffmann <[email protected]>
}

/// TODO
pub trait DrainContainer {
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Consider whether DrainContainer should have a clear function. It seems the best place if we'd like to have one.

We could also change the requirement for DrainContainer to be empty after calling drain, but that'd complicate the implementations of DrainContainer as the iterator needs to clear on drop.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe it makes sense (inventing trouble) to have RefIter and OwnIter? Partly to break the "drain" association of "it ends up empty", which is true only for Vec I think.

Alternately, it feels a bit like these might be requirements that &Self: IntoIterator and &mut Self: IntoIterator, without introducing new traits. Though the traits may be easier to state than then for<'a> bounds you'd need to state the above things. I think we can defer that to later though!

@antiguru antiguru mentioned this pull request Aug 22, 2025
* Properly document where containers are left in an undefined state.
* Change update count to record count.
* Add some missing documentation and remove outdated comments

Signed-off-by: Moritz Hoffmann <[email protected]>
Copy link
Member

@frankmcsherry frankmcsherry left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Seems good to me. I left some notes that I think can largely be discussed and resolved async, so afaict no reason not to merge.

}

/// TODO
pub trait DrainContainer {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe it makes sense (inventing trouble) to have RefIter and OwnIter? Partly to break the "drain" association of "it ends up empty", which is true only for Vec I think.

Alternately, it feels a bit like these might be requirements that &Self: IntoIterator and &mut Self: IntoIterator, without introducing new traits. Though the traits may be easier to state than then for<'a> bounds you'd need to state the above things. I think we can defer that to later though!

if !data.is_empty() {
output.session(&time).give_iterator(data.drain().filter(&mut predicate));
}
output.session(&time).give_iterator(data.drain().filter(&mut predicate));
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was about to leave a comment about how this is a potential regression, and then concluded that I don't really care. :D

Signed-off-by: Moritz Hoffmann <[email protected]>
@frankmcsherry frankmcsherry merged commit af70245 into TimelyDataflow:master Aug 28, 2025
7 checks passed
@antiguru antiguru deleted the container_rework branch August 28, 2025 19:25
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants