feat: partial success in trace ingestion #7892

Merged
shuiyisong merged 11 commits into GreptimeTeam:main from shuiyisong:feat/partial_success
Apr 1, 2026

Conversation

@shuiyisong
Contributor

I hereby agree to the terms of the GreptimeDB CLA.

Refer to a related PR or issue link (optional)

close #7875

What's changed and what's your intention?

This PR changes OTLP trace ingestion to preserve valid spans when part of a request fails, instead of failing the entire batch.

  • add OTLP trace partial success responses with rejected span counts and summarized error messages
  • ingest traces by OTLP resource/scope group, then subchunk each group for bounded writes
  • retry deterministic chunk failures span-by-span to salvage valid spans
  • discard ambiguous chunk failures without retrying and count those spans as rejected
  • split trace row building so main trace rows and auxiliary trace tables are handled independently
  • derive auxiliary trace tables only from spans whose main-table writes succeeded
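
The chunk-then-retry behavior in the list above can be sketched roughly as follows. This is a minimal illustration, not the PR's code: `WriteError`, `Outcome`, and `ingest_group` are hypothetical names, spans are stood in for by plain `u64` ids, and the `write` closure is an assumed placeholder for the real main-table write.

```rust
/// Hypothetical failure classification; not GreptimeDB's real error type.
#[derive(Debug, Clone, PartialEq)]
enum WriteError {
    /// The same input fails the same way (e.g. one bad span), so a
    /// span-by-span retry can salvage the rest of the chunk.
    Deterministic(String),
    /// It is unknown whether the write was applied (e.g. a timeout),
    /// so retrying could duplicate data; the chunk is dropped instead.
    Ambiguous(String),
}

/// Hypothetical per-group result: which spans were written, how many were not.
#[derive(Debug, Default, PartialEq)]
struct Outcome {
    accepted: Vec<u64>,
    rejected: usize,
    errors: Vec<String>,
}

/// Writes one resource/scope group in bounded chunks.
fn ingest_group<F>(spans: &[u64], chunk_size: usize, mut write: F) -> Outcome
where
    F: FnMut(&[u64]) -> Result<(), WriteError>,
{
    let mut out = Outcome::default();
    // `chunks` panics on a size of 0, so clamp to at least 1.
    for chunk in spans.chunks(chunk_size.max(1)) {
        match write(chunk) {
            Ok(()) => out.accepted.extend_from_slice(chunk),
            Err(WriteError::Deterministic(_)) => {
                // Retry each span on its own to keep the valid ones.
                for span in chunk {
                    match write(std::slice::from_ref(span)) {
                        Ok(()) => out.accepted.push(*span),
                        Err(WriteError::Deterministic(msg))
                        | Err(WriteError::Ambiguous(msg)) => {
                            out.rejected += 1;
                            out.errors.push(msg);
                        }
                    }
                }
            }
            Err(WriteError::Ambiguous(msg)) => {
                // No retry: count the whole chunk as rejected.
                out.rejected += chunk.len();
                out.errors.push(msg);
            }
        }
    }
    out
}
```

In this sketch, `accepted` would be the only input used to derive auxiliary-table rows, matching the last bullet above.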

PR Checklist

Please convert it to a draft if some of the following conditions are not met.

  • I have written the necessary rustdoc comments.
  • I have added the necessary unit tests and integration tests.
  • This PR requires documentation updates.
  • API changes are backward compatible.
  • Schema or data changes are backward compatible.

@shuiyisong shuiyisong requested a review from a team as a code owner March 31, 2026 08:38
@github-actions github-actions bot added the docs-not-required This change does not impact docs. label Mar 31, 2026
@fengys1996
Contributor

@codex review

Contributor

@gemini-code-assist gemini-code-assist bot left a comment

Code Review

This pull request implements a chunked ingestion strategy for OTLP trace spans, featuring a fallback mechanism that retries individual spans when deterministic chunk failures occur. It introduces a TraceIngestOutcome struct to support OTLP partial success responses, allowing the server to report accepted and rejected spans to the client. The parsing logic has been refactored to preserve resource and scope groupings, and the system now updates auxiliary tables for services and operations based on successfully ingested spans. Review feedback focuses on refining the error classification logic to avoid wasteful retries on metadata errors, ensuring transient errors are propagated rather than swallowed to allow client-side retries, and optimizing the parsing process by removing redundant span count calculations.
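
As a rough illustration of how an ingest outcome could map onto an OTLP partial success response, consider the sketch below. The OTLP `ExportTracePartialSuccess` message carries a rejected span count and an error message, and is left unset when every span is accepted. Everything else here is an assumption for illustration: the fields of `TraceIngestOutcome`, the `to_partial_success` helper, and the way messages are summarized are hypothetical and may differ from the PR.

```rust
use std::collections::BTreeMap;

/// Mirrors the two fields of OTLP's ExportTracePartialSuccess message.
#[derive(Debug, PartialEq)]
struct PartialSuccess {
    rejected_spans: i64,
    error_message: String,
}

/// Hypothetical shape; the PR's TraceIngestOutcome may carry different fields.
#[derive(Debug, Default)]
struct TraceIngestOutcome {
    accepted_spans: usize,
    rejected_spans: usize,
    error_messages: Vec<String>,
}

/// Returns `None` when nothing was rejected, so a fully successful
/// response leaves the partial success field unset.
fn to_partial_success(
    outcome: &TraceIngestOutcome,
    max_messages: usize,
) -> Option<PartialSuccess> {
    if outcome.rejected_spans == 0 {
        return None;
    }
    // Collapse repeated messages into "message (xN)" entries, in sorted order.
    let mut counts: BTreeMap<&str, usize> = BTreeMap::new();
    for msg in &outcome.error_messages {
        *counts.entry(msg.as_str()).or_insert(0) += 1;
    }
    let summary: Vec<String> = counts
        .iter()
        .take(max_messages) // Bound the size of the summarized message.
        .map(|(msg, n)| format!("{msg} (x{n})"))
        .collect();
    Some(PartialSuccess {
        rejected_spans: outcome.rejected_spans as i64,
        error_message: summary.join("; "),
    })
}
```

Deduplicating and capping the messages keeps the response small even when many spans fail for the same reason.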

@chatgpt-codex-connector chatgpt-codex-connector bot left a comment

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 6a05653073

@shuiyisong shuiyisong force-pushed the feat/partial_success branch from 6a05653 to fe02acc Compare March 31, 2026 09:41
Signed-off-by: shuiyisong <xixing.sys@gmail.com>
Signed-off-by: shuiyisong <xixing.sys@gmail.com>
Signed-off-by: shuiyisong <xixing.sys@gmail.com>
Signed-off-by: shuiyisong <xixing.sys@gmail.com>
Signed-off-by: shuiyisong <xixing.sys@gmail.com>
@shuiyisong shuiyisong force-pushed the feat/partial_success branch from fe02acc to 2811f22 Compare April 1, 2026 06:43
Signed-off-by: shuiyisong <xixing.sys@gmail.com>
@github-actions github-actions bot added size/L and removed size/M labels Apr 1, 2026
Signed-off-by: shuiyisong <xixing.sys@gmail.com>
Signed-off-by: shuiyisong <xixing.sys@gmail.com>
Signed-off-by: shuiyisong <xixing.sys@gmail.com>
@shuiyisong shuiyisong requested a review from killme2008 April 1, 2026 09:27
Member

@killme2008 killme2008 left a comment

LGTM

Recommend adding more tests that simulate span or chunk write failures.

Signed-off-by: shuiyisong <xixing.sys@gmail.com>
@shuiyisong shuiyisong enabled auto-merge April 1, 2026 11:24
Signed-off-by: shuiyisong <xixing.sys@gmail.com>
@shuiyisong shuiyisong added this pull request to the merge queue Apr 1, 2026
Merged via the queue into GreptimeTeam:main with commit 3f3407f Apr 1, 2026
46 checks passed
@shuiyisong shuiyisong deleted the feat/partial_success branch April 1, 2026 12:50
Labels

docs-not-required This change does not impact docs. size/L

Projects

None yet

Development

Successfully merging this pull request may close these issues.

OTLP ingestion: single span error should not reject the entire write batch

4 participants