@felixweinberger
Contributor

@felixweinberger felixweinberger commented Nov 23, 2025

Summary

Implements SEP-1699 which enables servers to disconnect SSE connections at will by sending priming events and retry fields.

Motivation and Context

SEP-1699 introduces SSE polling behavior that allows servers to control client reconnection timing and close connections gracefully. This enables more efficient resource management on the server side while maintaining resumability.

We implement this on the POST SSE stream as implied by the SEP language linked above. I.e. when a server establishes an SSE stream:

  1. Its first message will be an event with no data, only an event ID.
  2. After that, it may call close_sse_stream to close the stream while still gathering the events.
  3. The client can start "polling" the SSE stream based on the retryInterval supplied by the server before disconnection.
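
As a rough illustration of step 1, here is a minimal sketch of how a priming event could be serialized in the standard SSE wire format. The helper name `format_sse_event` and the event ID `evt-0` are hypothetical and not taken from this PR; the `retry` field is expressed in milliseconds, as in the SSE format itself.

```python
from typing import Optional


def format_sse_event(event_id: str, data: str = "", retry_ms: Optional[int] = None) -> str:
    """Serialize one SSE event (hypothetical helper, not the SDK's API).

    A priming event carries an ID and empty data, optionally with a
    retry hint telling the client how long to wait before reconnecting.
    """
    lines = [f"id: {event_id}"]
    if retry_ms is not None:
        # The SSE retry field is an integer number of milliseconds.
        lines.append(f"retry: {retry_ms}")
    # Multi-line payloads are sent as one data: line per line of text.
    for chunk in data.split("\n"):
        lines.append(f"data: {chunk}")
    # A blank line terminates the event.
    return "\n".join(lines) + "\n\n"


priming = format_sse_event("evt-0", retry_ms=3000)
```

A client that has seen this event can reconnect with `Last-Event-ID: evt-0` even if the server closes the stream before sending any real message.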

How Has This Been Tested?

Example server and client:

[Screenshot: CleanShot 2025-11-24 at 21 33 16]

Breaking Changes

None. Client falls back to exponential backoff if no retry field is provided.

Types of changes

  • Bug fix (non-breaking change which fixes an issue)
  • New feature (non-breaking change which adds functionality)
  • Breaking change (fix or feature that would cause existing functionality to change)
  • Documentation update

Checklist

  • I have read the MCP Documentation
  • My code follows the repository's style guidelines
  • New and existing tests pass locally
  • I have added appropriate error handling
  • I have added or updated documentation as needed

@felixweinberger felixweinberger marked this pull request as draft November 23, 2025 22:51
felixweinberger added a commit that referenced this pull request Nov 23, 2025
Add tests for previously uncovered lines in streamable_http client
and server to achieve full test coverage:

- Client _get_next_reconnection_delay with server retry values
- Client resume_stream early returns and error handling
- Server _create_priming_event with various configurations
- Server close_sse_stream including edge cases

Github-Issue:#1654
felixweinberger added a commit that referenced this pull request Nov 23, 2025
- Add pragma: no cover to defensive code paths that require complex
  mocking (server retry field, resume_stream success path, generic
  exception handlers)
- Fix ruff formatting in test file

Github-Issue:#1654
felixweinberger added a commit that referenced this pull request Nov 23, 2025
- Add type annotations to memory object streams in tests
- Import SessionMessage for type annotations
- Add pragma: no cover to lines 556 and 571 specifically

Github-Issue:#1654
@felixweinberger felixweinberger force-pushed the fweinberger/sep-1699 branch 2 times, most recently from 89d038c to 99d1873 Compare November 24, 2025 21:13
felixweinberger added a commit that referenced this pull request Nov 24, 2025
The SSE polling reconnection code paths run in a subprocess during testing,
making them difficult to cover with the main test process's coverage
instrumentation. Add pragma comments to exclude these from coverage
requirements:

- Client _attempt_sse_reconnection method and call site
- Server StreamableHTTPSessionManager.close_sse_stream method
- Test callback branch for empty data handling

Github-Issue:#1654
@felixweinberger felixweinberger marked this pull request as draft November 26, 2025 15:21
felixweinberger added a commit that referenced this pull request Nov 27, 2025
Implements SEP-1699 which enables servers to disconnect SSE connections
at will by sending priming events and retry fields. This enables more
efficient resource management on the server side while maintaining
resumability.

Key changes:
- Server sends priming event (empty data with event ID) on SSE stream
- Server can call close_sse_stream() to close stream while gathering events
- Client auto-reconnects using server-provided retryInterval or exponential backoff
- Added e2e integration tests and example server/client

Github-Issue:#1654
felixweinberger added a commit that referenced this pull request Nov 27, 2025
Sets up test infrastructure and API surface for SEP-1699 SSE polling.
Tests will fail until implementation is complete.

New APIs (stubbed):
- StreamableHTTPReconnectionOptions dataclass
- Server: _create_priming_event(), close_sse_stream(), retry_interval
- Client: resume_stream(), _get_next_reconnection_delay()
- RequestContext.close_sse_stream callback

Github-Issue:#1654
felixweinberger added a commit that referenced this pull request Nov 27, 2025
Implements the SSE polling behavior defined in SEP-1699:
- Server sends priming event (empty data with event ID) on SSE stream
- Server can call close_sse_stream() to trigger client reconnection
- Client auto-reconnects using server-provided retryInterval or exponential backoff

Github-Issue:#1654
@felixweinberger
Contributor Author

Reworking this from scratch.

This commit adds the API stubs and failing tests for the server-side
disconnect feature that enables SSE polling. When implemented, this
will allow servers to disconnect SSE streams without terminating them,
triggering client reconnection for polling patterns.

API stubs added:
- CloseSSEStreamCallback type in message.py
- close_sse_stream field in ServerMessageMetadata and RequestContext
- close_sse_stream() stub in StreamableHTTPServerTransport
- close_sse_stream() stub in FastMCP Context
- retry_interval parameter in session manager and transport

Tests added (all expected to fail until implementation):
- test_streamablehttp_client_receives_priming_event
- test_server_close_sse_stream_via_context
- test_streamablehttp_client_auto_reconnects
- test_streamablehttp_client_respects_retry_interval
- test_streamablehttp_sse_polling_full_cycle
- test_streamablehttp_events_replayed_after_disconnect

Github-Issue:#1699

Server now sends a priming event (SSE event with ID but empty data) at the
start of POST SSE streams when an EventStore is configured. This enables
clients to reconnect with Last-Event-ID even if the server closes the
connection before sending any actual data.

Changes:
- EventStore.store_event now accepts JSONRPCMessage | None (None for priming)
- Server sends priming event before processing messages in sse_writer
- Client calls resumption callback for empty-data events that have an ID
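
To make the role of the `None` message concrete, here is a simplified, synchronous sketch of an event store that records a priming event and replays only real messages after a given event ID. The class `SketchEventStore` and its `replay_after` method are hypothetical stand-ins for illustration and are not the SDK's `EventStore` interface.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple


@dataclass
class SketchEventStore:
    """Hypothetical in-memory store; not the SDK's EventStore."""

    # Each entry is (event_id, stream_id, message); message is None for priming.
    _events: List[Tuple[str, str, Optional[dict]]] = field(default_factory=list)

    def store_event(self, stream_id: str, message: Optional[dict]) -> str:
        """Record an event and return its ID (None marks a priming event)."""
        event_id = str(len(self._events) + 1)
        self._events.append((event_id, stream_id, message))
        return event_id

    def replay_after(self, last_event_id: str) -> List[Tuple[str, dict]]:
        """Return real messages stored after last_event_id on the same stream."""
        ids = [event_id for event_id, _, _ in self._events]
        if last_event_id not in ids:
            return []
        start = ids.index(last_event_id)
        stream_id = self._events[start][1]
        return [
            (event_id, message)
            for event_id, sid, message in self._events[start + 1:]
            if sid == stream_id and message is not None
        ]


store = SketchEventStore()
priming_id = store.store_event("stream-1", None)  # priming event, no payload
store.store_event("stream-1", {"n": 1})
store.store_event("stream-2", {"n": 99})  # unrelated stream, never replayed here
store.store_event("stream-1", {"n": 2})
replayed = store.replay_after(priming_id)
```

Because the priming event has an ID of its own, a client that disconnects before any message arrives still has a valid resume point.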
Server now supports closing SSE streams mid-operation via close_sse_stream(),
which triggers client reconnection. Client automatically reconnects when the
stream closes after receiving a priming event.

Changes:
- Server transport: Implement close_sse_stream() to close SSE writer
- Server transport: Create callback and pass via ServerMessageMetadata
- Lowlevel server: Thread close_sse_stream callback to RequestContext
- FastMCP Context: Wire close_sse_stream() to call the callback
- Client: Track priming events and auto-reconnect with Last-Event-ID

Server now sends the retry field in SSE priming events when retry_interval
is configured. Client respects this field and waits the specified interval
before reconnecting.

Changes:
- Server: Add retry field to priming event when retry_interval is set
- Server: Extract _send_priming_event() helper method
- Client: Track retry interval from SSE events
- Client: Wait for retry interval before reconnecting

Prevents a potential DDoS from rapid client reconnects when the server doesn't provide a retry interval.

Changes:
- Always wait before reconnecting (server retry value or 1s default)
- Track failed attempts only - successful reconnections reset counter
- Bail after 2 consecutive failures
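
The policy in the list above can be sketched roughly as follows. This is an illustration under stated assumptions, not the client code from this PR: the constant and function names are hypothetical, and "bail after 2 consecutive failures" is interpreted here as giving up once the failure counter reaches 2.

```python
from typing import Iterable, List, Optional, Tuple

DEFAULT_RECONNECT_DELAY_MS = 1000  # the "1s default" from the commit message
MAX_CONSECUTIVE_FAILURES = 2  # assumed reading of "bail after 2 consecutive failures"


def reconnection_delay_seconds(retry_interval_ms: Optional[int]) -> float:
    """Use the server's retry value when present, otherwise the default."""
    if retry_interval_ms is None:
        retry_interval_ms = DEFAULT_RECONNECT_DELAY_MS
    return retry_interval_ms / 1000.0


def simulate_reconnects(
    outcomes: Iterable[bool], retry_interval_ms: Optional[int] = None
) -> Tuple[List[float], bool]:
    """Walk a sequence of reconnect outcomes (True means success).

    Returns the delays waited before each attempt and whether the
    client gave up. Only failures are counted; a success resets the counter.
    """
    failures = 0
    waits: List[float] = []
    for succeeded in outcomes:
        if failures >= MAX_CONSECUTIVE_FAILURES:
            break
        # Always wait before reconnecting, even without a server hint.
        waits.append(reconnection_delay_seconds(retry_interval_ms))
        failures = 0 if succeeded else failures + 1
    return waits, failures >= MAX_CONSECUTIVE_FAILURES


waits, gave_up = simulate_reconnects([False, True, False, False, True])
```

In this run the single early failure is forgiven by the following success, and the client only gives up after the two back-to-back failures.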
@felixweinberger felixweinberger force-pushed the fweinberger/sep-1699 branch 2 times, most recently from a8aa7ea to e3a1c06 Compare November 27, 2025 15:22
@felixweinberger felixweinberger force-pushed the fweinberger/sep-1699 branch 6 times, most recently from 4bf5296 to 6170f02 Compare November 27, 2025 15:51
Demonstrates the SSE polling pattern with close_sse_stream():
- Server: process_batch tool that checkpoints periodically
- Client: auto-reconnects transparently with Last-Event-ID
- Shows priming events, retry interval, and event replay
…EP-1699)

- Register SSE writer in _replay_events() so subsequent close_sse_stream() calls work
- Send priming event on each reconnection
- Handle ClosedResourceError gracefully in both POST and GET SSE writers
- Add disconnect/reconnect logging at INFO level for visibility
- Add test for multiple reconnections during long-running tool calls
- Remove pragma from store_event (now covered by tests)
@felixweinberger felixweinberger force-pushed the fweinberger/sep-1699 branch 3 times, most recently from 21aafe8 to 39675e3 Compare November 27, 2025 17:33
…ver (SEP-1699)

- Add retry_interval parameter to FastMCP for SSE polling control
- Add InMemoryEventStore and test_reconnection tool to everything-server
- Enables SSE polling conformance test to pass (server-sse-polling scenario)
@felixweinberger felixweinberger force-pushed the fweinberger/sep-1699 branch 6 times, most recently from 238d4a2 to 01f5876 Compare November 27, 2025 18:24
@felixweinberger
Contributor Author

Reworked this from the ground up, going tests-first.

The best way to review this is probably to look at the tests in tests/shared/test_streamable_http.py, which describe the expected behavior introduced by SEP-1699.

Then look through server/streamable_http.py, where we introduced close_sse_stream, and the client-side streamable_http.py, where we actually perform the automatic reconnection via a GET stream in _handle_reconnection.

The close_sse_stream is available via the RequestContext so the server doesn't have to reach down into the transport.
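
A rough sketch of what this looks like from a tool handler's point of view, assuming `close_sse_stream` is an optional async callable on the context (as the review snippet further down suggests). `FakeContext` is a hypothetical stand-in used only so the example runs without the SDK; `process_batch` and `checkpoint_every` mirror the names used in the example server.

```python
import asyncio
from dataclasses import dataclass
from typing import Awaitable, Callable, List, Optional, Tuple


@dataclass
class FakeContext:
    """Hypothetical stand-in for the request context; not the SDK class."""

    close_sse_stream: Optional[Callable[[], Awaitable[None]]] = None


async def process_batch(ctx: FakeContext, items: List[str], checkpoint_every: int = 3) -> List[str]:
    """Process items, closing the SSE stream at each checkpoint."""
    processed: List[str] = []
    for index, item in enumerate(items, start=1):
        processed.append(item.upper())
        # The callback may be absent (for example, when resumability is not
        # configured), so check before calling it.
        if index % checkpoint_every == 0 and ctx.close_sse_stream:
            await ctx.close_sse_stream()
    return processed


async def _demo() -> Tuple[List[str], int]:
    closes: List[str] = []

    async def fake_close() -> None:
        closes.append("closed")

    ctx = FakeContext(close_sse_stream=fake_close)
    result = await process_batch(ctx, ["a", "b", "c", "d", "e", "f", "g"])
    return result, len(closes)


demo_result = asyncio.run(_demo())
```

The handler never touches the transport; it only awaits the callback it was handed, and keeps working while the client reconnects.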

@felixweinberger felixweinberger marked this pull request as ready for review November 27, 2025 18:32
@felixweinberger
Contributor Author

I updated the examples as well to clearly demonstrate how a stream can be disconnected by the server, triggering a reconnection & finally completing the request:

[Screenshot: CleanShot 2025-11-27 at 18 33 05]

Comment on lines +64 to +65
checkpoint_every = arguments.get("checkpoint_every", 3)

Contributor

@crondinini-ant crondinini-ant Nov 28, 2025

I ran this example with a proxy in the middle, saved the requests and results, and checked the event IDs being used. I noticed that the priming event ID is not used in this scenario: the connection is closed after messages have been sent, so the client ends up using the event ID from the last message, not from the priming event.
I'm not sure whether the intention was to test the usage of the priming event, but if we want to add a test for this, my local test with Claude indicates you'd have to close the connection before sending the first message:

            if test_priming and ctx.close_sse_stream:
                logger.info("Testing priming: closing SSE stream immediately (before any events)")
                await ctx.close_sse_stream()

it does work successfully btw, this is more of a comment in case we wanted to add a specific flag to test this.
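
The observation above can be illustrated with a deliberately simplified sketch of how a client might track its resume point. The function name `last_event_id_seen` and the event IDs are hypothetical, and the parsing ignores most of the SSE format; it only looks at `id:` lines.

```python
from typing import Optional


def last_event_id_seen(sse_text: str) -> Optional[str]:
    """Return the ID of the most recent event in a raw SSE stream (simplified)."""
    last_id: Optional[str] = None
    for line in sse_text.splitlines():
        if line.startswith("id:"):
            # Every event with an ID overwrites the previous resume point.
            last_id = line[len("id:"):].strip()
    return last_id


# Stream closed right after the priming event: the priming ID is the resume point.
closed_immediately = "id: priming-1\ndata: \n\n"

# Stream closed after a real message: the message ID wins, the priming ID is unused.
closed_after_message = closed_immediately + 'id: msg-7\ndata: {"n": 1}\n\n'

resume_from_priming = last_event_id_seen(closed_immediately)
resume_from_message = last_event_id_seen(closed_after_message)
```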

Comment on lines 538 to +540
async with sse_stream_writer, request_stream_reader:
    # Send priming event for SSE resumability
    await self._send_priming_event(request_id, sse_stream_writer)
Contributor

[future improvement idea] not something for now, but as I understand it, a context manager would be nice here to make the code even cleaner and ensure we don't forget to clean up the writers. i.e.:

  @asynccontextmanager
  async def _managed_sse_stream(self, request_id, sse_writer, request_reader):
      self._sse_stream_writers[request_id] = sse_writer
      try:
          async with sse_writer, request_reader:
              yield
      finally:
          self._sse_stream_writers.pop(request_id, None)
          await self._clean_up_memory_streams(request_id)

  Usage:
  async with self._managed_sse_stream(request_id, sse_stream_writer, request_stream_reader):
      ...

# Then send the message to be processed by the server
metadata = ServerMessageMetadata(request_context=request)
session_message = SessionMessage(message, metadata=metadata)
session_message = self._create_session_message(message, request, request_id)
Contributor

_create_session_message was a nice solution for passing around the close stream

# If stream ID not in mapping, create it
if stream_id and stream_id not in self._request_streams:
    # Register SSE writer so close_sse_stream() can close it
    self._sse_stream_writers[stream_id] = sse_stream_writer
Contributor

[related to other comment] just tying it back to the contextmanager idea, if we had that, then we would not have to remember to close it here


    await sse_stream_writer.send(event_data)
except anyio.ClosedResourceError:
    # Expected when close_sse_stream() is called
Contributor

[q] I'm reading this relatively fast, so this might be an incorrect question:
do we need to call self._sse_stream_writers.pop(request_id, None) here?

await ctx.read_stream_writer.send(e) # pragma: no cover
return # Normal completion, no reconnect needed
except Exception as e: # pragma: no cover
logger.debug(f"SSE stream ended: {e}")
Contributor

I'm a bit confused about the change in wording here. Is this now related to the stream ending? I would expect this to happen only on an error reading the stream.

# Stream ended without response - reconnect if we received an event with ID
if last_event_id is not None:  # pragma: no branch
    logger.info("SSE stream disconnected, reconnecting...")
    await self._handle_reconnection(ctx, last_event_id, retry_interval_ms)
Contributor

[future] I think this is aligned with how we currently do things in the SDK, but I would love to eventually revisit this to make it easier to handle this sort of thing in a distributed manner. It's related to my previous comment here, but essentially I'd like to allow client developers to handle reconnection in different ways. This is 100% out of scope for this feature, especially because we'd need to change so many other things at the parent level, but I'm just leaving a breadcrumb for the future.

headers=headers,
timeout=httpx.Timeout(self.timeout, read=self.sse_read_timeout),
) as event_source:
event_source.response.raise_for_status()
Contributor

[curious] was there a challenge with reading the response before raising?


# Stream ended again without response - reconnect again (reset attempt counter)
logger.info("SSE stream disconnected, reconnecting...")
await self._handle_reconnection(ctx, reconnect_last_event_id, reconnect_retry_ms, 0)
Contributor

[q] why do we need to reset the attempt counter here?

Contributor

@crondinini-ant crondinini-ant left a comment

looks great to me

Development

Successfully merging this pull request may close these issues.

Implement SEP-1699: Support SSE Polling via Server-Side Disconnect

3 participants