
TTS services fixes. #3729

Merged
filipi87 merged 7 commits into main from filipi/elevenlabs_issue
Feb 12, 2026


Contributor

@filipi87 filipi87 commented Feb 12, 2026

Summary

  • Fixed a word timestamp interleaving issue in ElevenLabsTTSService when processing multiple sentences
  • Added create_context_id() override to 6 TTS services to properly reuse context IDs across multiple run_tts invocations
  • Fixed initialization logic to only emit TTSStartedFrame and start metrics once per turn, not once per sentence

The Problem

When an LLM generates multiple sentences in a single turn (before LLMFullResponseEndFrame), the base TTSService.create_context_id() creates a new UUID for each sentence. This caused:

  1. Word timestamp interleaving - timestamps tracked in separate contexts instead of one continuous stream
  2. Multiple TTSStartedFrame emissions - incorrect lifecycle signaling
  3. Context tracking issues - base class tracking multiple contexts when only one is actively used

The root cause: frames are only paused when receiving LLMFullResponseEndFrame, but by that point run_tts() has been invoked multiple times (once per sentence). Some services need to reuse the same context_id across all invocations within a turn.
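The failure mode can be reproduced with a toy stand-in. This is only a sketch: `ToyBaseTTS` is a hypothetical class, not pipecat's `TTSService`, and its `run_tts()` is simplified to return the context ID it was given. It assumes nothing beyond what is described above, namely that the base `create_context_id()` returns a new UUID on every call.

```python
import uuid


class ToyBaseTTS:
    """Hypothetical stand-in for the base behavior (not pipecat's real class)."""

    def create_context_id(self) -> str:
        # A fresh UUID on every call, so one per sentence.
        return str(uuid.uuid4())

    def run_tts(self, sentence: str) -> tuple[str, str]:
        # Simplified for illustration: pair the context ID with the text.
        return self.create_context_id(), sentence


tts = ToyBaseTTS()
sentences = ["Hello there.", "How can I help?", "I'm listening."]
contexts = [tts.run_tts(s)[0] for s in sentences]

# Three sentences from one turn land in three unrelated contexts.
print(len(set(contexts)))
```

Any per-context state, such as word timestamps, is then split across three contexts even though the service treats the turn as one stream.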

The Solution

Services that extend AudioContextTTSService or AudioContextWordTTSService and maintain self._context_id for turn tracking now override create_context_id() to:

  1. Return existing self._context_id if one is already in progress
  2. Create a new UUID only when starting a new turn
  3. Reset self._context_id on interruptions or turn completion
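The three steps above can be sketched as follows. This is a minimal illustration under stated assumptions, not the code merged in this PR: `ToyTurnScopedTTS`, `on_interruption()` and `on_turn_end()` are hypothetical names; only `create_context_id()` and `self._context_id` come from the description.

```python
import uuid
from typing import Optional


class ToyTurnScopedTTS:
    """Hypothetical service that keeps one context ID per turn."""

    def __init__(self) -> None:
        self._context_id: Optional[str] = None

    def create_context_id(self) -> str:
        # Steps 1 and 2: reuse the ID in progress, or start a new turn.
        if not self._context_id:
            self._context_id = str(uuid.uuid4())
        return self._context_id

    def on_interruption(self) -> None:
        # Step 3 (hypothetical hook): the next sentence starts a new context.
        self._context_id = None

    def on_turn_end(self) -> None:
        # Step 3 (hypothetical hook): the turn is complete, so drop the ID.
        self._context_id = None


tts = ToyTurnScopedTTS()
first = tts.create_context_id()
second = tts.create_context_id()   # same turn, same ID
tts.on_interruption()
third = tts.create_context_id()    # new turn, new ID
print(first == second, first == third)
```

Keeping the reset in explicit hooks makes it easy to see that the ID lives exactly as long as the turn does.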

Services Fixed

  1. ElevenLabsTTSService - Context reuse + Word timestamp interleaving across sentences
  2. InworldTTSService - Context reuse + Moved initialization inside guard
  3. RimeTTSService - Context reuse
  4. CartesiaTTSService - Context reuse
  5. AsyncAITTSService - Context reuse + Moved initialization inside guard
  6. PlayHTTTSService - Context reuse
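"Moved initialization inside guard" can be pictured roughly as below. This is a sketch under assumptions: `ToyGuardedTTS`, its `events` list and `end_turn()` are hypothetical, and the strings merely stand in for emitting `TTSStartedFrame` and starting metrics. The point is that per-turn setup runs only when no context ID exists yet.

```python
import uuid
from typing import List, Optional


class ToyGuardedTTS:
    """Hypothetical service that runs per-turn setup once per turn."""

    def __init__(self) -> None:
        self._context_id: Optional[str] = None
        self.events: List[str] = []

    def run_tts(self, sentence: str) -> None:
        if not self._context_id:
            # Per-turn setup lives inside the guard, so it is skipped
            # for the second and later sentences of the same turn.
            self._context_id = str(uuid.uuid4())
            self.events.append("TTSStartedFrame")
            self.events.append("start_metrics")
        # Per-sentence work always runs.
        self.events.append(f"synthesize:{sentence}")

    def end_turn(self) -> None:
        # Hypothetical hook: clearing the ID re-arms the guard.
        self._context_id = None


tts = ToyGuardedTTS()
tts.run_tts("Hello there.")
tts.run_tts("How can I help?")
print(tts.events.count("TTSStartedFrame"))
```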

Fixes #3723


codecov Bot commented Feb 12, 2026

Codecov Report

❌ Patch coverage is 0% with 49 lines in your changes missing coverage. Please review.

| Files with missing lines | Patch % | Lines |
| --- | --- | --- |
| src/pipecat/services/elevenlabs/tts.py | 0.00% | 19 Missing ⚠️ |
| src/pipecat/services/asyncai/tts.py | 0.00% | 9 Missing ⚠️ |
| src/pipecat/services/inworld/tts.py | 0.00% | 6 Missing ⚠️ |
| src/pipecat/services/cartesia/tts.py | 0.00% | 5 Missing ⚠️ |
| src/pipecat/services/playht/tts.py | 0.00% | 5 Missing ⚠️ |
| src/pipecat/services/rime/tts.py | 0.00% | 5 Missing ⚠️ |

| Files with missing lines | Coverage Δ |
| --- | --- |
| src/pipecat/services/cartesia/tts.py | 0.00% <0.00%> (ø) |
| src/pipecat/services/playht/tts.py | 0.00% <0.00%> (ø) |
| src/pipecat/services/rime/tts.py | 0.00% <0.00%> (ø) |
| src/pipecat/services/inworld/tts.py | 0.00% <0.00%> (ø) |
| src/pipecat/services/asyncai/tts.py | 0.00% <0.00%> (ø) |
| src/pipecat/services/elevenlabs/tts.py | 0.00% <0.00%> (ø) |

@filipi87 filipi87 requested a review from markbackman February 12, 2026 16:04
# If an ID exists, continue using the current ID.
# User speech results in an interruption, which
# resets the context ID.
if not self._context_id:
Contributor

Could we just move this logic to the base class?

Contributor Author

I don’t think we should for now. Most of the TTS services don’t save the context_id, and we already have a reference to it in each TTS service.

So maybe, in a follow-up refactor, we can check which services are currently saving the context_id and refactor them. But for now, I would keep it like this.

Contributor Author

@markbackman, I’ve added this to our 1.0 wishlist, along with the other TTS improvements, so we can review how the classes are handling self._context_id. 👍

@filipi87 filipi87 changed the title from "Fixing ElevenLabs TTS word timestamp interleaving across sentences." to "TTS services fixes." Feb 12, 2026
Contributor

@markbackman markbackman left a comment


LGTM.

This works as a temporary fix, but I agree that we'll want to make the API easier to work with in the future.

@filipi87 filipi87 merged commit 432870c into main Feb 12, 2026
6 checks passed
@filipi87 filipi87 deleted the filipi/elevenlabs_issue branch February 12, 2026 21:31


Development

Successfully merging this pull request may close these issues.

assistant context sentences are getting jumbled up

2 participants