[CORE-15247] kafka: add size validation to string & arrays read off the wire #29208

Merged
WillemKauf merged 3 commits into redpanda-data:dev from WillemKauf:kafka_wire_validate_input on Jan 12, 2026
Conversation

@WillemKauf

Contributor

Backports Required

  • none - not a bug fix
  • none - this is a backport
  • none - issue does not exist in previous branches
  • none - papercut/not impactful enough to backport
  • v25.3.x
  • v25.2.x
  • v25.1.x

Release Notes

Bug Fixes

  • Adds better validation to arrays and strings being read off the wire from kafka clients.

@WillemKauf WillemKauf changed the title kafka: add size validation to string & arrays read off the wire [CORE-15247] kafka: add size validation to string & arrays read off the wire Jan 9, 2026

Copilot AI left a comment

Contributor

Pull request overview

This PR enhances input validation for Kafka protocol deserialization by adding bounds checks to prevent reading beyond available buffer space when parsing strings and arrays from client requests.

Changes:

  • Added validation to ensure string and array lengths don't exceed remaining bytes in the parser buffer
  • Enhanced error logging when disconnecting clients to include the exception details

Reviewed changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated 1 comment.

| File | Description |
| --- | --- |
| src/v/kafka/protocol/wire.h | Added bounds validation for string and array deserialization to prevent buffer overruns |
| src/v/kafka/server/connection_context.cc | Enhanced disconnect logging to include exception details for better debugging |

Comment thread src/v/kafka/protocol/wire.h Outdated
Comment on lines +343 to +350
if (static_cast<size_t>(len) > _parser.bytes_left()) {
    throw std::out_of_range(
      fmt::format(
        "Array length {} exceeds remaining bytes {}",
        len,
        _parser.bytes_left()));
}

Copilot AI Jan 9, 2026

The validation compares the array length against remaining bytes, but this check is insufficient. The array length represents the number of elements, not the byte size. Each element will consume a variable number of bytes depending on type T. This validation will fail to catch cases where the total size of all elements exceeds remaining bytes. Consider validating during element-by-element parsing instead, or calculating the minimum required bytes based on element type.

Contributor Author

Yeah, this is fine though
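
The trade-off in this thread can be sketched in isolation. Every encoded element occupies at least one byte, so an element count larger than the remaining bytes can never be satisfied, and rejecting it up front keeps a bogus count from reaching `reserve()`. The check is necessary rather than sufficient: the per-element reads still need their own bounds checks. `byte_reader` and `read_array` below are hypothetical stand-ins written for this illustration; they are not Redpanda's `iobuf_parser` or the code in `wire.h`.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <stdexcept>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for a wire parser (assumption, not iobuf_parser).
class byte_reader {
public:
    explicit byte_reader(std::vector<uint8_t> buf) : _buf(std::move(buf)) {}

    size_t bytes_left() const { return _buf.size() - _pos; }

    uint8_t read_u8() {
        if (_pos >= _buf.size()) {
            throw std::out_of_range("read past end of buffer");
        }
        return _buf[_pos++];
    }

    // Big-endian 32-bit integer, as used for classic Kafka array lengths.
    int32_t read_i32() {
        uint32_t v = 0;
        for (int i = 0; i < 4; ++i) {
            v = (v << 8) | read_u8();
        }
        return static_cast<int32_t>(v);
    }

private:
    std::vector<uint8_t> _buf;
    size_t _pos = 0;
};

// Reads a non-nullable array whose elements are single bytes.
std::vector<uint8_t> read_array(byte_reader& r) {
    const int32_t len = r.read_i32();
    if (len < 0) {
        throw std::out_of_range("negative array length");
    }
    // Each element needs at least one byte, so an impossible count is
    // rejected before reserve() can turn it into a huge allocation.
    if (static_cast<size_t>(len) > r.bytes_left()) {
        throw std::out_of_range(
          "array length " + std::to_string(len) + " exceeds remaining bytes "
          + std::to_string(r.bytes_left()));
    }
    std::vector<uint8_t> out;
    out.reserve(static_cast<size_t>(len));
    for (int32_t i = 0; i < len; ++i) {
        out.push_back(r.read_u8()); // still bounds-checked per element
    }
    return out;
}
```

With wider element types the count check would pass for some inputs that later fail inside the loop, which is the gap the review comment describes; the allocation is still bounded by the request size in that case.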

@dotnwat dotnwat left a comment

Member

is there a jira ticket or a description of how you encountered this?

_parser.bytes_left()));
}

return _parser.read_string(n);

Member

maybe the above validations should be inside the parser read_string method?

Contributor Author

It could be, though iobuf_parser is used a lot internally and performing validation here at the kafka layer seems reasonable as well. WDYT?

Member

> iobuf_parser is used a lot internally and performing validation here at the kafka layer seems reasonable as well. WDYT?

maybe i'm missing something, but isn't this an argument for putting the validation in the parser itself, so that the validation happens for all users not just this one spot?

Contributor Author

Not if two extra if statements is a concern for performance 👍

WDYT about putting these checks into iobuf_parser::read_string_safe() and calling that from wire.h instead?

Member

oh sorry, you're right. i was thinking that the iobuf_parser interfaces were reading the size, but in fact we are providing it as parsed out of the kafka protocol. so yeah, doing the check above the iobuf_parser makes sense!
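
The layering this thread settles on can be sketched roughly as follows: the low-level parser is handed a length and trusts it, while the protocol layer, which decodes that length from client bytes, validates it first. `raw_parser` and `read_kafka_string` are hypothetical names for this illustration and do not mirror Redpanda's actual interfaces. The sketch assumes the classic Kafka STRING encoding (a big-endian int16 length followed by that many bytes).

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <stdexcept>
#include <string>
#include <utility>
#include <vector>

// Hypothetical low-level parser: it is told how many bytes to read and does
// not know where that number came from (assumption, not iobuf_parser).
class raw_parser {
public:
    explicit raw_parser(std::vector<uint8_t> buf) : _buf(std::move(buf)) {}

    size_t bytes_left() const { return _buf.size() - _pos; }

    // Precondition: n <= bytes_left(). In this sketch the parser is shared
    // with trusted internal callers, so it only asserts the precondition.
    std::string read_string(size_t n) {
        assert(n <= bytes_left());
        std::string s(reinterpret_cast<const char*>(_buf.data() + _pos), n);
        _pos += n;
        return s;
    }

    // Big-endian 16-bit integer.
    int16_t read_i16() {
        if (bytes_left() < 2) {
            throw std::out_of_range("truncated int16");
        }
        const auto v = static_cast<uint16_t>(
          (_buf[_pos] << 8) | _buf[_pos + 1]);
        _pos += 2;
        return static_cast<int16_t>(v);
    }

private:
    std::vector<uint8_t> _buf;
    size_t _pos = 0;
};

// Protocol layer: the length is decoded from untrusted client bytes here,
// so this is where it is validated before being passed down.
std::string read_kafka_string(raw_parser& p) {
    const int16_t n = p.read_i16();
    if (n < 0) {
        throw std::out_of_range("negative length for non-nullable string");
    }
    if (static_cast<size_t>(n) > p.bytes_left()) {
        throw std::out_of_range(
          "string length " + std::to_string(n) + " exceeds remaining bytes "
          + std::to_string(p.bytes_left()));
    }
    return p.read_string(static_cast<size_t>(n));
}
```

Keeping the check in the protocol layer means internal callers that pass known-good lengths pay nothing extra, which is the performance argument raised earlier in the thread.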

@WillemKauf

Contributor Author

> is there a jira ticket

https://redpandadata.atlassian.net/browse/CORE-15247

@WillemKauf WillemKauf force-pushed the kafka_wire_validate_input branch from 3b7ad38 to a968756 Compare January 9, 2026 16:26
dotnwat
dotnwat previously approved these changes Jan 9, 2026

@michael-redpanda michael-redpanda left a comment

Contributor

do we need to perform a remaining bytes check in read_tags and consume_unknown_tag as well?

Comment thread src/v/kafka/protocol/wire.h Outdated
throw std::out_of_range("Asked to read a 0 byte flex string");
}

if (static_cast<size_t>(n - 1) > _parser.bytes_left()) {

Contributor

for performance reasons, should these be marked [[unlikely]]?
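
As a rough illustration of this suggestion, the sketch below marks both rejection branches of a flex (compact) string read with the C++20 `[[unlikely]]` attribute. It assumes the Kafka compact-string encoding, where the length is sent as an unsigned varint holding N + 1 and a value of 0 denotes null. `flex_reader` and its methods are hypothetical names for this example, and the standard attribute is used here only for portability; the code quoted later in this PR uses an `unlikely(...)` wrapper instead.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <stdexcept>
#include <string>
#include <utility>
#include <vector>

// Hypothetical reader for illustration only (assumption, not wire.h).
class flex_reader {
public:
    explicit flex_reader(std::vector<uint8_t> buf) : _buf(std::move(buf)) {}

    size_t bytes_left() const { return _buf.size() - _pos; }

    // Unsigned varint: 7 payload bits per byte, least significant group
    // first, high bit set on every byte except the last. At most 5 bytes.
    uint32_t read_unsigned_varint() {
        uint32_t value = 0;
        for (int shift = 0; shift < 35; shift += 7) {
            if (_pos >= _buf.size()) {
                throw std::out_of_range("truncated varint");
            }
            const uint8_t b = _buf[_pos++];
            value |= static_cast<uint32_t>(b & 0x7f) << shift;
            if ((b & 0x80) == 0) {
                return value;
            }
        }
        throw std::out_of_range("unsigned varint longer than 5 bytes");
    }

    // Non-nullable flex string: encoded length is N + 1, so 0 is invalid.
    std::string read_flex_string() {
        const uint32_t n = read_unsigned_varint();
        if (n == 0) [[unlikely]] {
            throw std::out_of_range("Asked to read a 0 byte flex string");
        }
        const size_t len = static_cast<size_t>(n - 1);
        // Well-formed requests never take this branch, hence the hint.
        if (len > bytes_left()) [[unlikely]] {
            throw std::out_of_range(
              "flex string length " + std::to_string(len)
              + " exceeds remaining bytes " + std::to_string(bytes_left()));
        }
        std::string s(reinterpret_cast<const char*>(_buf.data() + _pos), len);
        _pos += len;
        return s;
    }

private:
    std::vector<uint8_t> _buf;
    size_t _pos = 0;
};
```

The attribute is only a hint to the compiler about branch layout; whether it measurably helps would need profiling on the real decode path.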

@WillemKauf

Contributor Author

> do we need to perform a remaining bytes check in read_tags and consume_unknown_tag as well?

I haven't been able to hit any crashes within read_tags() or consume_unknown_tag() with my protocol fuzz testing, but I suppose it's easy enough to add a check there as well.

@WillemKauf WillemKauf force-pushed the kafka_wire_validate_input branch from 943e418 to 61bfa92 Compare January 9, 2026 17:18

@michael-redpanda michael-redpanda left a comment

Contributor

Would it be possible to get a unit test written that triggers this behavior?

Comment thread src/v/kafka/protocol/wire.h Outdated
@WillemKauf

Contributor Author

> Would it be possible to get a unit test written that triggers this behavior?

Added one courtesy of Claude.

Add wire_validation_test.cc with tests that verify bounds checking
and error handling in the Kafka protocol decoder. Tests cover:
- Array length validation (exceeds buffer, negative, max int32)
- Flex array validation (exceeds buffer, zero length)
- String/flex string validation (exceeds buffer, negative/zero length)
- Bytes/flex bytes validation
- Tagged fields validation (count exceeds buffer, duplicate/non-ascending
  IDs, size exceeds buffer)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
@WillemKauf WillemKauf force-pushed the kafka_wire_validate_input branch from 8137870 to a78c238 Compare January 9, 2026 17:40
@WillemKauf

Contributor Author

Alright done with force pushing (just wanted to make what Claude wrote for the unit test a tiny bit nicer). Approve away

Comment on lines +264 to 273
if (unlikely(static_cast<size_t>(num_tags) > _parser.bytes_left())) {
    throw std::out_of_range(
      fmt::format(
        "Number of tags {} exceeds remaining bytes {}",
        num_tags,
        _parser.bytes_left()));
}
int64_t prev_tag_id = -1;
while (num_tags-- > 0) {
    auto id = read_unsigned_varint(); // consume tag id

Member

@michael-redpanda what was the concern for this case? it looks like if num_tags is too large then the loop will hit end-of-buffer exception. the other cases were large allocations because the bogus value was plugged directly into malloc.

Contributor

Sorry my concern isn't here, but below in the iobuf_to_bytes(_parser.share(size)) call. Maybe that one is ok as well

Contributor Author

Pretty sure _parser.share() won't malloc() either, it iterates over fragments.
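
The distinction drawn in this thread, between a count that only drives a loop and a size that feeds an allocation, can be sketched as below. A bogus tag count is self-limiting because every iteration consumes bytes through bounds-checked reads, while each tag's payload size is checked against the remaining bytes before it is consumed. `tag_reader` and `skip_tagged_fields` are hypothetical names for this illustration; the sketch skips payloads rather than sharing them as the real code does, and it assumes the usual tagged-field layout (tag count, then per tag an id, a size, and that many payload bytes, all lengths as unsigned varints).

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <stdexcept>
#include <utility>
#include <vector>

// Hypothetical reader for illustration only (assumption, not wire.h).
class tag_reader {
public:
    explicit tag_reader(std::vector<uint8_t> buf) : _buf(std::move(buf)) {}

    size_t bytes_left() const { return _buf.size() - _pos; }

    // Unsigned varint: 7 payload bits per byte, at most 5 bytes.
    uint32_t read_unsigned_varint() {
        uint32_t value = 0;
        for (int shift = 0; shift < 35; shift += 7) {
            if (_pos >= _buf.size()) {
                throw std::out_of_range("read past end of buffer");
            }
            const uint8_t b = _buf[_pos++];
            value |= static_cast<uint32_t>(b & 0x7f) << shift;
            if ((b & 0x80) == 0) {
                return value;
            }
        }
        throw std::out_of_range("unsigned varint longer than 5 bytes");
    }

    // Advances past n payload bytes after checking they exist.
    void skip(size_t n) {
        if (n > bytes_left()) {
            throw std::out_of_range("tag size exceeds remaining bytes");
        }
        _pos += n;
    }

private:
    std::vector<uint8_t> _buf;
    size_t _pos = 0;
};

// Walks a tagged-field section and returns how many tags were skipped.
uint32_t skip_tagged_fields(tag_reader& r) {
    const uint32_t num_tags = r.read_unsigned_varint();
    int64_t prev_id = -1;
    for (uint32_t i = 0; i < num_tags; ++i) {
        // A bogus num_tags cannot run away: each iteration consumes at
        // least two bytes via bounds-checked reads and allocates nothing.
        const int64_t id = r.read_unsigned_varint();
        if (id <= prev_id) {
            throw std::out_of_range("tag ids must be strictly ascending");
        }
        prev_id = id;
        const uint32_t size = r.read_unsigned_varint();
        r.skip(size); // size is validated before any bytes are consumed
    }
    return num_tags;
}
```

An up-front comparison of the tag count against the remaining bytes, as in the snippet quoted above, mainly produces an earlier and more descriptive error in this sketch; it does not change which inputs are rejected.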

@WillemKauf

Contributor Author

Any outstanding concerns here?

@WillemKauf

Contributor Author

@michael-redpanda I'll wait on your approval.

@WillemKauf WillemKauf merged commit d0f48f0 into redpanda-data:dev Jan 12, 2026
19 checks passed
@vbotbuildovich

Collaborator

/backport v25.3.x

@vbotbuildovich

Collaborator

/backport v25.2.x

@vbotbuildovich

Collaborator

/backport v25.1.x
