
Fix LLMFullResponseEndFrame racing ahead of final TTSTextFrame #4127

Merged
markbackman merged 2 commits into main from mb/tts-text-frame-ordering
Mar 24, 2026


markbackman commented Mar 24, 2026

Context

This PR solves the following issue:

When a TTSService relies on the TTSService base class to push text frames (e.g. push_text_frames=True), the frames it pushes for a given turn arrive in this order:

  • LLMFullResponseStartFrame
  • TTSTextFrame (#1 through #n-1)
  • LLMFullResponseEndFrame
  • TTSTextFrame (#n)

This causes a problem in the LLMAssistantAggregator, but only with text input. The aggregator adds text to the context only when it arrives between the LLMFullResponseStartFrame and the LLMFullResponseEndFrame, so the dangling final TTSTextFrame (#n) was dropped from the context.
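To make the failure mode concrete, here is a minimal sketch of an aggregator that only collects text while a response is open. The frame classes and `ToyAggregator` below are hypothetical stand-ins written for this illustration, not the real pipecat classes, and the real LLMAssistantAggregator is more involved; the toy simply ignores text that arrives after the end frame.

```python
# Toy stand-ins for the frame types named above. These are NOT the real
# pipecat classes; they only mark positions in the frame stream.
class LLMFullResponseStartFrame:
    pass


class LLMFullResponseEndFrame:
    pass


class TTSTextFrame:
    def __init__(self, text):
        self.text = text


class ToyAggregator:
    """Hypothetical aggregator: keeps text only while a response is open."""

    def __init__(self):
        self.context = []
        self._open = False
        self._parts = []

    def process(self, frame):
        if isinstance(frame, LLMFullResponseStartFrame):
            self._open = True
            self._parts = []
        elif isinstance(frame, LLMFullResponseEndFrame):
            # Commit whatever was collected and close the response.
            self.context.append(" ".join(self._parts))
            self._open = False
        elif isinstance(frame, TTSTextFrame) and self._open:
            self._parts.append(frame.text)
        # Text that arrives while no response is open is ignored (assumption).


def run(frames):
    aggregator = ToyAggregator()
    for frame in frames:
        aggregator.process(frame)
    return aggregator.context


# Order described above: the end frame races ahead of the last text frame.
buggy_order = [
    LLMFullResponseStartFrame(),
    TTSTextFrame("First sentence."),
    LLMFullResponseEndFrame(),
    TTSTextFrame("Last sentence."),
]

# Intended order: every text frame precedes the end frame.
fixed_order = [
    LLMFullResponseStartFrame(),
    TTSTextFrame("First sentence."),
    TTSTextFrame("Last sentence."),
    LLMFullResponseEndFrame(),
]

print(run(buggy_order))
print(run(fixed_order))
```

With the racing order, the toy context loses the last sentence; with the intended order, both sentences are committed together.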

The solution is to have the TTSService emit its frames in the correct order. This PR does that by routing LLMFullResponseEndFrame through the serialization_queue.

Summary

  • Route LLMFullResponseEndFrame through the TTS serialization queue (instead of pushing directly downstream) when push_text_frames is enabled
  • Ensures the frame is emitted only after the audio context is fully drained, preserving correct ordering relative to TTSTextFrames
  • Fixes an issue where the final sentence was dropped from the conversation context when using RTVI text input with non-word-timestamp TTS services
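The idea behind the summary can be sketched with a plain `asyncio.Queue` and a single consumer. This is an illustration of the general technique only: the `demo` function, the string markers, and the queue below are assumptions made for the example and do not reflect the actual TTSService internals.

```python
import asyncio


async def demo(route_end_through_queue):
    """Return the order in which a toy TTS stage emits items downstream."""
    emitted = []
    queue = asyncio.Queue()  # hypothetical stand-in for the serialization queue

    async def drain():
        # Single consumer: items leave in the order they were enqueued.
        while True:
            item = await queue.get()
            if item is None:  # sentinel: stop draining
                break
            await asyncio.sleep(0)  # yield to the loop, standing in for synthesis
            emitted.append(item)

    worker = asyncio.create_task(drain())

    for text in ["text-1", "text-2"]:
        await queue.put(text)

    if route_end_through_queue:
        # Fixed behavior: the end marker waits its turn behind pending text.
        await queue.put("end")
    else:
        # Old behavior: the end marker is pushed directly, bypassing the queue.
        emitted.append("end")

    await queue.put(None)
    await worker
    return emitted


direct = asyncio.run(demo(route_end_through_queue=False))
queued = asyncio.run(demo(route_end_through_queue=True))
print(direct)
print(queued)
```

Because the end marker shares the queue with the text items, it cannot overtake them, which is the ordering guarantee the fix relies on.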

Test plan

  • Added test_http_push_text_llm_response_end_after_tts_text that verifies all TTSTextFrames precede LLMFullResponseEndFrame
  • Confirmed the new test fails without the fix and passes with it
  • All existing TTS frame ordering tests pass (4/4)
  • Full non-integration test suite passes (1151 passed)
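The body of the new test is not shown in this conversation. As a rough sketch of the kind of ordering assertion it describes, the helper below checks a list of frame type names; `text_precedes_end` and its parameters are hypothetical and chosen only for illustration.

```python
def text_precedes_end(frame_names, text="TTSTextFrame", end="LLMFullResponseEndFrame"):
    """Return True if every text frame appears before the first end frame."""
    if end not in frame_names:
        return False
    end_index = frame_names.index(end)
    return all(
        index < end_index
        for index, name in enumerate(frame_names)
        if name == text
    )


# Order produced before the fix: the last text frame trails the end frame.
before_fix = [
    "LLMFullResponseStartFrame",
    "TTSTextFrame",
    "LLMFullResponseEndFrame",
    "TTSTextFrame",
]

# Order expected after the fix.
after_fix = [
    "LLMFullResponseStartFrame",
    "TTSTextFrame",
    "TTSTextFrame",
    "LLMFullResponseEndFrame",
]

print(text_precedes_end(before_fix))
print(text_precedes_end(after_fix))
```

A check of this shape fails on the old ordering and passes on the new one, matching the behavior reported in the test plan.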

🤖 Generated with Claude Code

Fixes #4111

Route LLMFullResponseEndFrame through the serialization queue instead
of pushing it directly downstream when push_text_frames is enabled.
This ensures the frame is emitted only after the audio context is
fully drained, preserving correct ordering relative to TTSTextFrames.

Previously, the final sentence TTSTextFrame would arrive at the
LLMAssistantAggregator after LLMFullResponseEndFrame, causing it to
be dropped from the conversation context (especially with RTVI text
input where no subsequent interruption would flush the orphaned text).

codecov Bot commented Mar 24, 2026

Codecov Report

✅ All modified and coverable lines are covered by tests.

| Files with missing lines | Coverage Δ |
|---|---|
| src/pipecat/services/tts_service.py | 69.00% <100.00%> (+3.67%) ⬆️ |

... and 1 file with indirect coverage changes


markbackman requested a review from filipi87 March 24, 2026 19:17

filipi87 left a comment

Yep, nice catch! 🚀

markbackman merged commit b49bf1c into main Mar 24, 2026
6 checks passed
markbackman deleted the mb/tts-text-frame-ordering branch March 24, 2026 19:39

Development

Successfully merging this pull request may close these issues.

Assistant Context Is Committed Too Early After Multi-Sentence TTS
