Skip to content

Fix a couple error cases in Json parsing #116446

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 3 commits into from
Jun 13, 2025

Conversation

BrennanConroy
Copy link
Member

Found a couple edge cases where ArgumentException would be thrown instead of a cleaner JsonException.

  1. DateOnlyConverter didn't check the input length after doing escaping
  2. Utf8JsonReader.MultiSegment was trying to copy a larger span into a smaller span in a couple edge cases

@Copilot Copilot AI review requested due to automatic review settings June 9, 2025 20:49
Copy link
Contributor

Tagging subscribers to this area: @dotnet/area-system-text-json, @gregsdennis
See info in area-owners.md if you want to be subscribed.

Copy link
Contributor

@Copilot Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull Request Overview

This PR adds edge-case tests and tightens error handling to ensure JsonException is thrown instead of ArgumentException when parsing invalid JSON in multi-segment readers and date parsing.

  • Added two multi-segment literal tests to cover oversized segment errors.
  • Extended DateOnly_Read_Failure with a test for escaped-length validation.
  • Updated DateOnlyConverter to check length bounds before parsing and throw on failure.
  • Refactored Utf8JsonReader.CheckLiteralMultiSegment to use a safe AmountToWrite helper.

Reviewed Changes

Copilot reviewed 5 out of 5 changed files in this pull request and generated no comments.

Show a summary per file
File Description
Utf8JsonReaderTests.MultiSegment.cs Added tests for invalid literals spanning segments
Value.ReadTests.cs Added InlineData for DateOnly escaped-length failure
JsonTestHelper.cs Introduced GetSequence helper and System.Linq import
DateOnlyConverter.cs Added range check before ISO parse and improved exception flow
Utf8JsonReader.MultiSegment.cs Replaced inline Math.Min calls with AmountToWrite helper
Comments suppressed due to low confidence (2)

src/libraries/System.Text.Json/tests/System.Text.Json.Tests/Serialization/Value.ReadTests.cs:653

  • Consider adding a test case where the unescaped DateOnly string exceeds MaxEscapedFormatLength to verify that JsonException is thrown for overly long inputs.
[InlineData("05-1\\u0000", true)] // String length 10 before unescaping, less than 10 after escaping

src/libraries/System.Text.Json/src/System/Text/Json/Reader/Utf8JsonReader.MultiSegment.cs:620

  • [nitpick] The helper method name AmountToWrite is a bit terse; consider renaming it to CalculateAmountToWrite or GetWritableCount for clearer intent.
static int AmountToWrite(ReadOnlySpan<byte> span, long bytePositionInLine, ReadOnlySpan<byte> readSoFar, int written)

@@ -47,12 +47,14 @@ private static DateOnly ReadCore(ref Utf8JsonReader reader)
source = stackSpan.Slice(0, bytesWritten);
}

if (!JsonHelpers.TryParseAsIso(source, out DateOnly value))
if (JsonHelpers.IsInRangeInclusive(source.Length, FormatLength, MaxEscapedFormatLength) &&
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why is this additional check necessary? I assumed the same check in line 33 does take escaping into account.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Line 46:
int bytesWritten = reader.CopyString(stackSpan);
performs unescaping which can change the length of the source.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we move it into the else after the copy?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Line 46: int bytesWritten = reader.CopyString(stackSpan); performs unescaping which can change the length of the source.

I guess I'm confused by the fact that this check keeps using MaxEscapedFormatLength as the upper bound even though source is guaranteed to be unescaped at this point.

Judging by the test you've included a lower bound check would suffice, but I suspect the problem really lies in the fact that the TryParseAsIso method isn't handling short inputs correctly (it's throwing presumably?)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

TryParseAsIso method isn't handling short inputs correctly (it's throwing presumably?)

It has a Debug.Assert
https://source.dot.net/#System.Text.Json/System/Text/Json/JsonHelpers.Date.cs,146
I chose not to change it to a runtime check in order to reduce noise from changing all the callsites that do properly check the length.

It then starts accessing various positions in the Span assuming there are at least 10 digits available, so at runtime you could get ArgumentException.

Judging by the test you've included a lower bound check would suffice

Sure, I can change it to if (bytesWritten < FormatLength)

{
return Math.Min(
readSoFar.Length - written,
Math.Min(span.Length, (int)bytePositionInLine + 1));
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why is it safe to cast the long to an int? If it is over int.Max, won't that cast to a negative value, which means this method would return a negative value?

If bytePositionInLine can never be bigger than int.Max, then why is it a long?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agreed. I thought the whole point of ReadOnlySequence is that it could allow data that is bigger than int.MaxValue. (Therefore, we would need to re-evaluate all locations that cast this and ValueSequence.Length to a regular integer).

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That code was there before this PR. Also, we're looking at a Span not a ReadOnlySequence

If there is some re-evaluation needed, it probably should be done separately.

@BrennanConroy BrennanConroy merged commit e3c6605 into dotnet:main Jun 13, 2025
84 of 86 checks passed
@BrennanConroy BrennanConroy deleted the brecon/edgecases branch June 13, 2025 02:32
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants