Skip to content

fix: isolate REST timeout middleware handler context#49982

Merged
sre-ci-robot merged 1 commit into
milvus-io:masterfrom
congqixia:fix/isolate_gincontext_handler
May 22, 2026
Merged

fix: isolate REST timeout middleware handler context#49982
sre-ci-robot merged 1 commit into
milvus-io:masterfrom
congqixia:fix/isolate_gincontext_handler

Conversation

@congqixia

Copy link
Copy Markdown
Contributor

Related to #49981

The REST timeout middleware could return a timeout response while the handler goroutine continued using the original Gin context. Once Gin recycled that context, late handler writes could race with a later request and trigger concurrent map writes.

Run timeout-wrapped handlers with a copied Gin context and a fully buffered response recorder instead. The original context and real response writer now remain owned by the middleware goroutine, while normal completion explicitly propagates selected metadata and commits the recorder. Timeout closes the recorder and writes the 408 response through the real writer, so late handler writes are discarded safely.

Add tests for recorder isolation, normal buffered commits, metadata and abort propagation, late writes after timeout, and race validation of the late-write timeout path.

Related to milvus-io#49981

The REST timeout middleware could return a timeout response while the
handler goroutine continued using the original Gin context. Once Gin
recycled that context, late handler writes could race with a later
request and trigger concurrent map writes.

Run timeout-wrapped handlers with a copied Gin context and a fully
buffered response recorder instead. The original context and real
response writer now remain owned by the middleware goroutine, while
normal completion explicitly propagates selected metadata and commits
the recorder. Timeout closes the recorder and writes the 408 response
through the real writer, so late handler writes are discarded safely.

Add tests for recorder isolation, normal buffered commits, metadata and
abort propagation, late writes after timeout, and race validation of the
late-write timeout path.

Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
@sre-ci-robot

Copy link
Copy Markdown
Contributor

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: congqixia

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Details Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@sre-ci-robot sre-ci-robot added the size/L Denotes a PR that changes 100-499 lines. label May 21, 2026
@mergify mergify Bot added dco-passed DCO check passed. kind/bug Issues or changes related a bug labels May 21, 2026
@sre-ci-robot

Copy link
Copy Markdown
Contributor

[ci-v2-notice]
Notice: New ci-v2 system is enabled for this PR.

To rerun ci-v2 checks, comment with:

  • /ci-rerun-code-check // for ci-v2/code-check
  • /ci-rerun-build // for ci-v2/build
  • /ci-rerun-build-all // for ci-v2/build-all (multi-arch builds)
  • /ci-rerun-buildenv // for ci-v2/build-env (build milvus-env builder images)
  • /ci-rerun-ut-integration // for ci-v2/ut-integration, will rerun ci-v2/build
  • /ci-rerun-ut-go // for ci-v2/ut-go, will rerun ci-v2/build
  • /ci-rerun-ut-cpp // for ci-v2/ut-cpp
  • /ci-rerun-ut // for all ci-v2/ut-integration, ci-v2/ut-go, ci-v2/ut-cpp, will rerun ci-v2/build
  • /ci-rerun-e2e-default // for ci-v2/e2e-default
  • /ci-rerun-e2e-amd // for ci-v2/e2e-amd (e2e pool dispatcher)
  • /ci-rerun-build-ut-cov // for ci-v2/build-ut-cov (build + unit tests in one pipeline)
  • /ci-rerun-gosdk // for ci-v2/go-sdk (Go SDK E2E tests, ARM)

If you have any questions or requests, please contact @zhikunyao.

@sre-ci-robot

Copy link
Copy Markdown
Contributor

✅ CI Loop Results c9ee73e

Stage Result Duration Tests
✅ Build SUCCESS 12.9min -
✅ Code-Check SUCCESS 4.8min -
✅ UT-GO SUCCESS 15.1min 1012 passed
✅ UT-Integration SUCCESS 23.7min 46 passed
✅ UT-CPP-Cov SUCCESS 52.4min 7725 passed

Total: 69min | Pipeline | Artifacts

Overall Coverage: 67.4%
Diff Coverage: Go 72.1% (101 hit, 39 miss, 140 measurable lines)
Total Patch Coverage: 72.1% (101/140 measurable lines)

@tedxu

tedxu commented May 21, 2026

Copy link
Copy Markdown
Contributor

/lgtm

@codecov

codecov Bot commented May 21, 2026

Copy link
Copy Markdown

Codecov Report

❌ Patch coverage is 79.48718% with 24 lines in your changes missing coverage. Please review.
✅ Project coverage is 78.51%. Comparing base (d7f0751) to head (c9ee73e).
⚠️ Report is 5 commits behind head on master.

Files with missing lines Patch % Lines
...distributed/proxy/httpserver/timeout_middleware.go 79.48% 18 Missing and 6 partials ⚠️

❌ Your patch status has failed because the patch coverage (79.48%) is below the target coverage (80.00%). You can increase the patch coverage or adjust the target coverage.

Additional details and impacted files

Impacted file tree graph

@@            Coverage Diff             @@
##           master   #49982      +/-   ##
==========================================
- Coverage   78.55%   78.51%   -0.05%     
==========================================
  Files        2209     2209              
  Lines      383180   383231      +51     
==========================================
- Hits       301012   300891     -121     
- Misses      73039    73188     +149     
- Partials     9129     9152      +23     
Components Coverage Δ
Client 80.59% <ø> (ø)
Core 85.29% <ø> (ø)
Go 76.50% <79.48%> (-0.07%) ⬇️
Files with missing lines Coverage Δ
internal/distributed/proxy/httpserver/utils.go 82.23% <ø> (+0.10%) ⬆️
...distributed/proxy/httpserver/timeout_middleware.go 78.60% <79.48%> (+8.75%) ⬆️

... and 35 files with indirect coverage changes

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

@sre-ci-robot sre-ci-robot added the low-code-coverage add test-label from zhikun, diff coverage > 80% label May 21, 2026
sre-ci-robot pushed a commit that referenced this pull request May 21, 2026
Cherry-pick from master
pr: #49982
Related to #49981

The REST timeout middleware could return a timeout response while the
handler goroutine continued using the original Gin context. Once Gin
recycled that context, late handler writes could race with a later
request and trigger concurrent map writes.

Run timeout-wrapped handlers with a copied Gin context and a fully
buffered response recorder instead. The original context and real
response writer now remain owned by the middleware goroutine, while
normal completion explicitly propagates selected metadata and commits
the recorder. Timeout closes the recorder and writes the 408 response
through the real writer, so late handler writes are discarded safely.

Add tests for recorder isolation, normal buffered commits, metadata and
abort propagation, late writes after timeout, and race validation of the
late-write timeout path.

Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
@congqixia

Copy link
Copy Markdown
Contributor Author

/ci-rerun-e2e-default

1 similar comment
@congqixia

Copy link
Copy Markdown
Contributor Author

/ci-rerun-e2e-default

@mergify mergify Bot added the ci-passed label May 22, 2026
@sre-ci-robot sre-ci-robot merged commit 08b723a into milvus-io:master May 22, 2026
16 of 19 checks passed
congqixia added a commit to congqixia/milvus that referenced this pull request May 25, 2026
Related to milvus-io#49981

The REST timeout middleware could return a timeout response while the
handler goroutine continued using the original Gin context. Once Gin
recycled that context, late handler writes could race with a later
request and trigger concurrent map writes.

Run timeout-wrapped handlers with a copied Gin context and a fully
buffered response recorder instead. The original context and real
response writer now remain owned by the middleware goroutine, while
normal completion explicitly propagates selected metadata and commits
the recorder. Timeout closes the recorder and writes the 408 response
through the real writer, so late handler writes are discarded safely.

Add tests for recorder isolation, normal buffered commits, metadata and
abort propagation, late writes after timeout, and race validation of the
late-write timeout path.

Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
sre-ci-robot pushed a commit that referenced this pull request May 25, 2026
…50006)

Cherry-pick from master
pr: #49982
Related to #49981

The REST timeout middleware could return a timeout response while the
handler goroutine continued using the original Gin context. Once Gin
recycled that context, late handler writes could race with a later
request and trigger concurrent map writes.

Run timeout-wrapped handlers with a copied Gin context and a fully
buffered response recorder instead. The original context and real
response writer now remain owned by the middleware goroutine, while
normal completion explicitly propagates selected metadata and commits
the recorder. Timeout closes the recorder and writes the 408 response
through the real writer, so late handler writes are discarded safely.

Add tests for recorder isolation, normal buffered commits, metadata and
abort propagation, late writes after timeout, and race validation of the
late-write timeout path.

Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
@congqixia congqixia deleted the fix/isolate_gincontext_handler branch May 26, 2026 05:58
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

approved ci-passed dco-passed DCO check passed. kind/bug Issues or changes related a bug lgtm low-code-coverage add test-label from zhikun, diff coverage > 80% size/L Denotes a PR that changes 100-499 lines.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants