Skip to content

Conversation

@gianm
Copy link
Contributor

@gianm gianm commented Nov 20, 2025

Since localSort was introduced for sort-merge join in #13506, it has used 1 processor at a time with a max of 2 channels per processor. This is inefficient, because at the time the sorter runs, it is the only sorter running. The patch contains a comment describing the situation in more detail.

This patch has two benefits. First, the sorter should run faster if multiple processing threads are available. Second, the sorter, due to having a larger max channels per merger, will make more efficient use of intermediate channels.

Since localSort was introduced for sort-merge join in apache#13506, it has
used 1 processor at a time with a max of 2 channels per processor.
This is inefficient, because at the time the sorter runs, it is the
only sorter running. The patch contains a comment describing the
situation in more detail.

This patch has two benefits. First, the sorter should run faster
if multiple processing threads are available. Second, the sorter,
due to having a larger max channels per merger, will make more
efficient use of intermediate channels.
@github-actions github-actions bot added Area - Batch Ingestion Area - MSQ For multi stage queries - https://github.com/apache/druid/issues/12262 labels Nov 20, 2025
// concurrently with the sorter, but in that case it will only have one output channel so won't
// have many frames buffered.
memoryParameters.getSuperSorterConcurrentProcessors(),
memoryParameters.getSuperSorterMaxChannelsPerMerger(),
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There is an implicit assumption that getChannelsPerMerger >=2.
I am not sure if that assumption is honored in the memory parameters anywhere.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, it'll always be at least 2.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Area - Batch Ingestion Area - MSQ For multi stage queries - https://github.com/apache/druid/issues/12262

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants