
[CORE-15307] better prefix truncation checks #29274

Merged
michael-redpanda merged 2 commits into redpanda-data:dev from michael-redpanda:sl/core-15307-better-prefix-truncation-checks
Jan 16, 2026

@michael-redpanda


Add a capability check to the sink partition interface to determine whether prefix truncation is supported before attempting the operation. This prevents unnecessary work for partitions that don't support prefix truncation, such as cloud topics.

Fixes: CORE-15307

Summary

When cluster link replication attempts to synchronize start offsets between source and sink partitions, it currently proceeds with prefix truncation logic regardless of whether the sink partition actually supports this operation. Cloud topics, for example, do not support local prefix truncation as they manage retention differently.

This PR:

  • Adds a new can_prefix_truncate() method to the sink partition interface that checks if the partition is locally collectable
  • Updates maybe_synchronize_start_offset() to early-return when prefix truncation is not supported, avoiding unnecessary computation
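The two changes above can be sketched roughly as follows. This is a minimal, self-contained illustration, not the code from the PR: only `can_prefix_truncate()`, `maybe_synchronize_start_offset()` and the `data_sink` name come from this page, while `offset_t`, `start_offset()`, `prefix_truncate()`, `fake_sink` and the `bool` return value are hypothetical stand-ins chosen to make the sketch compile on its own.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for kafka::offset, to keep the sketch self-contained.
using offset_t = std::int64_t;

// Simplified sink interface. Only can_prefix_truncate() is named in this PR;
// start_offset() and prefix_truncate() are illustrative assumptions.
class data_sink {
public:
    virtual ~data_sink() = default;
    virtual offset_t start_offset() const = 0;
    // Returns whether or not the sink supports prefix truncation
    virtual bool can_prefix_truncate() const = 0;
    virtual void prefix_truncate(offset_t new_start) = 0;
};

// Toy sink whose capability is fixed at construction time (hypothetical).
class fake_sink final : public data_sink {
public:
    fake_sink(offset_t start, bool truncatable)
      : _start(start)
      , _truncatable(truncatable) {}

    offset_t start_offset() const final { return _start; }
    bool can_prefix_truncate() const final { return _truncatable; }
    void prefix_truncate(offset_t new_start) final {
        _start = new_start;
        ++truncations;
    }

    int truncations = 0; // counts how often truncation actually ran

private:
    offset_t _start;
    bool _truncatable;
};

// Returns true if the sink was truncated. The capability check comes first,
// so unsupported sinks skip all of the remaining work.
bool maybe_synchronize_start_offset(data_sink& sink, offset_t source_start) {
    if (!sink.can_prefix_truncate()) {
        return false; // early return: e.g. a cloud topic
    }
    if (source_start <= sink.start_offset()) {
        return false; // nothing to truncate
    }
    sink.prefix_truncate(source_start);
    return true;
}

// Small self-check exercising both paths.
inline void self_check() {
    fake_sink local(0, true);
    fake_sink cloud(0, false);
    assert(maybe_synchronize_start_offset(local, 10));
    assert(local.start_offset() == 10);
    assert(!maybe_synchronize_start_offset(cloud, 10));
    assert(cloud.truncations == 0);
}
```

Placing the capability check ahead of the offset comparison mirrors the stated goal of avoiding unnecessary computation: a sink that cannot truncate never reaches the offset logic at all.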

Backports Required

  • none - not a bug fix
  • none - this is a backport
  • none - issue does not exist in previous branches
  • none - papercut/not impactful enough to backport
  • v25.3.x
  • v25.2.x
  • v25.1.x

Release Notes

Bug Fixes

  • Fixed cluster link replication attempting prefix truncation on partitions that do not support it (e.g., cloud topics)

Add a new virtual method to check if a sink partition supports prefix
truncation before attempting the operation. The real implementation
checks if the partition is locally collectable.

Signed-off-by: Michael Boquard <michael@redpanda.com>
…itions

Check can_prefix_truncate() before attempting to synchronize start
offsets. This avoids unnecessary work for partitions that don't support
prefix truncation.

Signed-off-by: Michael Boquard <michael@redpanda.com>
@michael-redpanda michael-redpanda self-assigned this Jan 15, 2026
Copilot AI review requested due to automatic review settings January 15, 2026 16:45

Copilot AI left a comment


Pull request overview

This PR adds a capability check to prevent cluster link replication from attempting prefix truncation on partitions that don't support it, such as cloud topics. The check interrogates the partition's configuration to determine local collectability before proceeding with truncation operations.

Changes:

  • Added can_prefix_truncate() method to the data_sink interface to check if prefix truncation is supported
  • Updated maybe_synchronize_start_offset() to early-return when prefix truncation is not supported
  • Implemented the new method across all sink implementations (production and test code)

Reviewed changes

Copilot reviewed 7 out of 7 changed files in this pull request and generated 1 comment.

| File | Description |
| --- | --- |
| src/v/cluster_link/replication/deps.h | Adds interface method declaration for prefix truncation capability check |
| src/v/cluster_link/replication/partition_replicator.cc | Adds early-return logic when sink doesn't support prefix truncation |
| src/v/cluster_link/service.cc | Implements capability check based on partition's local collectability |
| src/v/cluster_link/replication/tests/partition_replicator_tests.cc | Implements capability check in test sink (returns true) |
| src/v/cluster_link/replication/tests/link_replication_mgr_tests.cc | Implements capability check in test sink (returns true) |
| src/v/cluster_link/replication/tests/deps_test_impl.h | Adds method declaration to test accounting sink |
| src/v/cluster_link/replication/tests/deps_test_impl.cc | Implements capability check in test accounting sink (returns true) |

// Returns the HWM of the partition
virtual kafka::offset high_watermark() const = 0;

// Returns whether or not the sink support prefix truncation

Copilot AI Jan 15, 2026


Corrected grammar: 'support' should be 'supports' to match subject-verb agreement.

Suggested change
- // Returns whether or not the sink support prefix truncation
+ // Returns whether or not the sink supports prefix truncation

@vbotbuildovich


CI test results

test results on build#79088
| test_class | test_method | test_arguments | test_kind | job_url | test_status | passed | reason | test_history |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| ControllerForcedReconfiguration_Size5 | test_cluster_recovery | {"scenario": "Simple"} | integration | https://buildkite.com/redpanda/redpanda/builds/79088#019bc29f-d817-4185-9d97-b6831e79accf | FLAKY | 10/11 | Test PASSES after retries. No significant increase in flaky rate (baseline=0.0280, p0=1.0000, reject_threshold=0.0100. adj_baseline=0.1000, p1=0.3487, trust_threshold=0.5000) | https://redpanda.metabaseapp.com/dashboard/87-tests?tab=142-dt-individual-test-history&test_class=ControllerForcedReconfiguration_Size5&test_method=test_cluster_recovery |
| MountUnmountIcebergTest | test_simple_remount | {"cloud_storage_type": 1} | integration | https://buildkite.com/redpanda/redpanda/builds/79088#019bc29f-d81e-4602-bb71-53a27adddfd7 | FLAKY | 10/11 | Test PASSES after retries. No significant increase in flaky rate (baseline=0.1822, p0=1.0000, reject_threshold=0.0100. adj_baseline=0.4530, p1=0.0024, trust_threshold=0.5000) | https://redpanda.metabaseapp.com/dashboard/87-tests?tab=142-dt-individual-test-history&test_class=MountUnmountIcebergTest&test_method=test_simple_remount |
| WriteCachingFailureInjectionE2ETest | test_crash_all | {"use_transactions": false} | integration | https://buildkite.com/redpanda/redpanda/builds/79088#019bc29f-d81c-4487-94a5-3275d6d47558 | FLAKY | 10/11 | Test PASSES after retries. No significant increase in flaky rate (baseline=0.1030, p0=1.0000, reject_threshold=0.0100. adj_baseline=0.2782, p1=0.0384, trust_threshold=0.5000) | https://redpanda.metabaseapp.com/dashboard/87-tests?tab=142-dt-individual-test-history&test_class=WriteCachingFailureInjectionE2ETest&test_method=test_crash_all |


bool can_prefix_truncate() const final {
    _gate.check();
    return _partition->get_ntp_config().is_locally_collectable();

Contributor


Do we need to check this because source and shadow topics can have mismatched retention configurations?

Contributor Author


No. cleanup.policy is always replicated; however, is_locally_collectable also checks whether tiered storage (TS) is enabled on the topic, which could differ between source and shadow.
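To make the reply concrete, here is a toy model of how two topics with the same cleanup policy can still disagree on local collectability. The actual rule inside `is_locally_collectable` is not shown on this page, so everything below is an assumption: the `topic_props` struct, its field names, and in particular the direction of the tiered storage (TS) check are hypothetical and only illustrate the point that the TS flag, unlike `cleanup.policy`, is not guaranteed to match between source and shadow.

```cpp
#include <cassert>

// Hypothetical, simplified view of the topic properties involved.
struct topic_props {
    bool cleanup_policy_delete;  // replicated, so equal on source and shadow
    bool tiered_storage_enabled; // may differ between source and shadow
};

// ASSUMPTION: a topic is treated as locally collectable only when it has a
// delete cleanup policy and tiered storage is disabled. The real predicate
// may combine these inputs differently.
constexpr bool is_locally_collectable(const topic_props& p) {
    return p.cleanup_policy_delete && !p.tiered_storage_enabled;
}

// Same cleanup policy on both sides, different TS setting, different answer.
inline void collectability_example() {
    const topic_props source{true, false};
    const topic_props shadow{true, true};
    assert(source.cleanup_policy_delete == shadow.cleanup_policy_delete);
    assert(is_locally_collectable(source) != is_locally_collectable(shadow));
}
```

Under this model the check has to be made against the sink partition's own configuration, which is consistent with `can_prefix_truncate()` reading the partition's ntp config rather than relying on replicated topic properties.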

@michael-redpanda michael-redpanda merged commit 71db235 into redpanda-data:dev Jan 16, 2026
20 checks passed
@vbotbuildovich


/backport v25.3.x
