Skip to content

Introduce PartitionedChannel #8617

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 6 commits into from
May 10, 2023

Conversation

artembilan
Copy link
Member

  • Implement a PartitionedChannel as an extension of the AbstractExecutorChannel
  • Supply this channel with a PartitionedDispatcher which is an extension of the AbstractDispatcher
  • The target partition is essentially a UnicastingDispatcher with a single thead executor

* Implement a `PartitionedChannel` as an extension of the `AbstractExecutorChannel`
* Supply this channel with a `PartitionedDispatcher` which is an extension of the `AbstractDispatcher`
* The target partition is essentially a `UnicastingDispatcher` with a single thead executor
@artembilan
Copy link
Member Author

If this is OK, I'll add some docs and DSL configuration.
Thanks

Copy link
Contributor

@garyrussell garyrussell left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Some suggestions; otherwise LGTM.

Comment on lines 30 to 42
* An {@link AbstractExecutorChannel} implementation for partitioned messages dispatching.
* Requires a number of partitions where each of them is backed by a dedicated thread.
* The {@code partitionKeyFunction} is used to determine to which partition the message
* has to be dispatched.
* <p>
* The actual dispatching and threading logic in implemented in the {@link PartitionedDispatcher}.
* <p>
* The default {@link ThreadFactory} is based on a bean name of this channel plus {@code -partition-thread-}.
* <p>
* The rest of the logic is similar to the {@link ExecutorChannel}, which includes:
* - load balancing for subscribers;
* - fail-over and error handling;
* - channel operations intercepting.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
* An {@link AbstractExecutorChannel} implementation for partitioned messages dispatching.
* Requires a number of partitions where each of them is backed by a dedicated thread.
* The {@code partitionKeyFunction} is used to determine to which partition the message
* has to be dispatched.
* <p>
* The actual dispatching and threading logic in implemented in the {@link PartitionedDispatcher}.
* <p>
* The default {@link ThreadFactory} is based on a bean name of this channel plus {@code -partition-thread-}.
* <p>
* The rest of the logic is similar to the {@link ExecutorChannel}, which includes:
* - load balancing for subscribers;
* - fail-over and error handling;
* - channel operations intercepting.
* An {@link AbstractExecutorChannel} implementation for partitioned message dispatching.
* Requires a number of partitions where each of them is backed by a dedicated thread.
* The {@code partitionKeyFunction} is used to determine to which partition the message
* has to be dispatched.
* <p>
* The actual dispatching and threading logic in implemented in the {@link PartitionedDispatcher}.
* <p>
* The default {@link ThreadFactory} is based on the bean name of this channel plus {@code -partition-thread-}.
* <p>
* The rest of the logic is similar to the {@link ExecutorChannel}, which includes:
* - load balancing for subscribers;
* - fail-over and error handling;
* - channel operations intercepting.

@Override
public boolean dispatch(Message<?> message) {
int partition = this.partitionKeyFunction.apply(message).hashCode() % this.partitionCount;
UnicastingDispatcher partitionDispatcher = this.partitions.computeIfAbsent(partition, (__) -> newPartition());
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Perhaps we should pre-populate - otherwise the thread names won't necessarily match the partition #.

assertThat(messagesInPartition.get(0).getHeaders().get("partitionKey"))
.isEqualTo(messagesInPartition.get(1).getHeaders().get("partitionKey"));
});

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If we match the thread name to the partition, we could assert that the records went to the right partition.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Well, it is not always true, since we don't use a partition key as is, but rather its hashCode().
So, while all the same keys goes to the same partition, it does not mean that 1 for key will always determine the partition 0.

* Add docs for `PartitionedChannel` into `channel.adoc`
* Pre-populate partitions in the `PartitionedDispatcher`
to ensure a thread number reflection of the partition it is used for (for default `ThreadFactory`)
* Add Javadocs to `public` methods
* Add Java DSL `PartitionedChannelSpec` and respective factory methods into `Channels`
* Use `IntegrationMessageHeaderAccessor.CORRELATION_ID` header as a default partition key
@artembilan artembilan requested a review from garyrussell May 10, 2023 17:33
Copy link
Contributor

@garyrussell garyrussell left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just doc polishing.

return partitionDispatcher.dispatch(message);
}

private void prePopulatedPartitionsIfAny() {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
private void prePopulatedPartitionsIfAny() {
private synchronized void prePopulatedPartitionsIfAny() {

Comment on lines 209 to 215
Starting with version 6.1, a `PartitionedChannel` implementation is provided.
This is an extension of `AbstractExecutorChannel` and represents a point-to-point dispatching logic where an actual consumption happens on a specific thread determined by the partition key evaluated from a message sent to this channel.
This channel is similar to the `ExecutorChannel` mentioned above, but with a difference that messages with the same partition key are always handled in the same thread preserving ordering.
It does not require an external `TaskExecutor`, but can be configured with a custom `ThreadFactory` (e.g. `Thread.ofVirtual().name("partition-", 0).factory()`).
This factory then used to populate single-thread executors into `MessageDispatcher` delegates per partition.
By default, the `IntegrationMessageHeaderAccessor.CORRELATION_ID` message header is used for partition key.
This channel can be configured as a simple bean:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
Starting with version 6.1, a `PartitionedChannel` implementation is provided.
This is an extension of `AbstractExecutorChannel` and represents a point-to-point dispatching logic where an actual consumption happens on a specific thread determined by the partition key evaluated from a message sent to this channel.
This channel is similar to the `ExecutorChannel` mentioned above, but with a difference that messages with the same partition key are always handled in the same thread preserving ordering.
It does not require an external `TaskExecutor`, but can be configured with a custom `ThreadFactory` (e.g. `Thread.ofVirtual().name("partition-", 0).factory()`).
This factory then used to populate single-thread executors into `MessageDispatcher` delegates per partition.
By default, the `IntegrationMessageHeaderAccessor.CORRELATION_ID` message header is used for partition key.
This channel can be configured as a simple bean:
Starting with version 6.1, a `PartitionedChannel` implementation is provided.
This is an extension of `AbstractExecutorChannel` and represents point-to-point dispatching logic where the actual consumption is processed on a specific thread, determined by the partition key evaluated from a message sent to this channel.
This channel is similar to the `ExecutorChannel` mentioned above, but with the difference that messages with the same partition key are always handled in the same thread, preserving ordering.
It does not require an external `TaskExecutor`, but can be configured with a custom `ThreadFactory` (e.g. `Thread.ofVirtual().name("partition-", 0).factory()`).
This factory is used to populate single-thread executors into a `MessageDispatcher` delegate, per partition.
By default, the `IntegrationMessageHeaderAccessor.CORRELATION_ID` message header is used as the partition key.
This channel can be configured as a simple bean:

----
====

The channel will have `3` partitions - dedicated threads; will use a `partitionKey` header to determine in which partition the message must be handled.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
The channel will have `3` partitions - dedicated threads; will use a `partitionKey` header to determine in which partition the message must be handled.
The channel will have `3` partitions - dedicated threads; will use the `partitionKey` header to determine in which partition the message must be handled.

----
====

The channel will have `3` partitions - dedicated threads; will use a `partitionKey` header to determine in which partition the message must be handled.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
The channel will have `3` partitions - dedicated threads; will use a `partitionKey` header to determine in which partition the message must be handled.
The channel will have `3` partitions - dedicated threads; will use the `partitionKey` header to determine in which partition the message will be handled.

* Fix Checkstyle violations
* Mark `PartitionedDispatcher.prePopulatedPartitions()` with `synchronized`
@@ -145,13 +145,15 @@ public void shutdown() {

@Override
public boolean dispatch(Message<?> message) {
prePopulatedPartitionsIfAny();
if (this.partitions.isEmpty()) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This won't work, thread 2 might proceed before thread 1 has populated all partition execs, and NPE on the dispatcher returned by the get().

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How about now?
See the latest commit.

and when a custom `ThreadFactory` is set
public void setThreadFactory(ThreadFactory threadFactory) {
Assert.notNull(threadFactory, "'threadFactory' must not be null");
this.threadFactory = threadFactory;
populatedPartitions();
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Need to shut down and clear existing executors.

Perhaps call shutdown() first?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah... Nothing to shutdown I guess: those executors have nit been used yet.
But we indeed have to this.executors.clear() .

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We don't know that; they could call this after we started processing messages.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

And if not, it's a no-op; so why worry?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Doh... We cannon populate them from the ctor: some other options are required for delegate dispatchers.
So, coming back to the synchronized solution 😄

@garyrussell garyrussell merged commit 5f1e12e into spring-projects:main May 10, 2023
@artembilan artembilan deleted the PartitionedChannel branch May 10, 2023 20:37
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants