Add ResembleAITTSService #3134

Merged
markbackman merged 2 commits into main from
mb/resemble-tts-draft
Feb 2, 2026

Conversation

Contributor

@markbackman markbackman commented Nov 26, 2025

Please describe the changes in your PR. If it is addressing an issue, please reference that as well.

cc @krishvadhani19 for review.


codecov Bot commented Nov 26, 2025

Codecov Report

❌ Patch coverage is 0% with 220 lines in your changes missing coverage. Please review.

| Files with missing lines | Patch % | Lines |
|---|---|---|
| src/pipecat/services/resembleai/tts.py | 0.00% | 220 Missing ⚠️ |

| Files with missing lines | Coverage Δ |
|---|---|
| src/pipecat/services/resembleai/tts.py | 0.00% <0.00%> (ø) |

@markbackman markbackman force-pushed the mb/resemble-tts-draft branch 3 times, most recently from a5f014c to 6c1229e Compare January 29, 2026 04:54
@markbackman markbackman marked this pull request as ready for review January 29, 2026 05:07
@markbackman markbackman force-pushed the mb/resemble-tts-draft branch from 85c02ef to 97bca9d Compare January 29, 2026 05:08
Comment on lines +96 to +100
# Jitter buffer: accumulate audio before starting playback to absorb network latency
# ResembleAI sends audio in bursts with 300-450ms gaps between them
# We need to buffer enough to cover these gaps before starting playback
self._jitter_buffer_bytes = 44100 # ~1000ms at 22050Hz to handle 400ms+ network gaps
self._playback_started: dict[str, bool] = {} # Track if we've started playback per request
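
As a quick check of the `44100` figure in the snippet above, the relationship between buffer size and buffered duration can be sketched as follows. This assumes 16-bit mono PCM (2 bytes per sample), which the code comment implies but the diff excerpt does not state; `pcm_bytes_for_duration` is a hypothetical helper name, not part of the PR.

```python
def pcm_bytes_for_duration(sample_rate_hz: int, duration_ms: int,
                           channels: int = 1, bytes_per_sample: int = 2) -> int:
    """Bytes of raw PCM needed to hold duration_ms of audio.

    Assumes uncompressed PCM; 16-bit mono by default (an assumption,
    not something confirmed by the PR excerpt).
    """
    bytes_per_second = sample_rate_hz * channels * bytes_per_sample
    return bytes_per_second * duration_ms // 1000


print(pcm_bytes_for_duration(22050, 1000))  # 44100
print(pcm_bytes_for_duration(22050, 400))   # 17640
```

Under that assumption, 44100 bytes holds about one second of audio at 22050 Hz, while a 400 ms buffer would be 17640 bytes.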
Contributor

I think that for all other TTS services we send the audio as soon as we get it.

And I thought we were already handling this buffering before starting playback somewhere else in the pipeline, but I might be mistaken.

So I’m just double checking if this is really needed for this service.

Contributor

I think this is kind of the same thing with _playback_started.

It feels like this service is slightly more complicated than the others we’ve developed, so I just want to confirm that this is really needed here.

Contributor Author

Yes, we do, but when I do that, the audio is choppy. I think this is an issue with the Resemble TTS service, which would ideally be fixed in the service itself. That is, you can't stream audio from Resemble, you have to stream to the client and then buffer. You can see the version before this without the buffering:
6c1229e

cc @krishvadhani19 who wrote this code.
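
For readers following the thread, the "buffer before starting playback" approach discussed here can be sketched roughly as below. This is a minimal illustration, not the code merged in this PR: `PrebufferGate`, `push`, and `finish` are hypothetical names, and the default threshold simply reuses the `44100` value from the diff excerpt.

```python
from collections import defaultdict


class PrebufferGate:
    """Hold audio per request until a byte threshold is reached, then pass it through.

    Hypothetical helper for illustration only; not the PR's implementation.
    """

    def __init__(self, threshold_bytes: int = 44100):
        self._threshold = threshold_bytes
        self._pending: dict[str, bytearray] = defaultdict(bytearray)
        self._started: dict[str, bool] = {}

    def push(self, request_id: str, chunk: bytes) -> bytes:
        """Return the bytes that may be played now (possibly empty)."""
        if self._started.get(request_id):
            return chunk  # playback already running: forward immediately
        buf = self._pending[request_id]
        buf.extend(chunk)
        if len(buf) < self._threshold:
            return b""  # still filling the initial buffer
        self._started[request_id] = True
        return bytes(self._pending.pop(request_id))

    def finish(self, request_id: str) -> bytes:
        """Flush whatever is left if the request ends before the threshold."""
        self._started.pop(request_id, None)
        return bytes(self._pending.pop(request_id, b""))
```

The gate only delays the first audio of each request; once the threshold is crossed, later chunks are forwarded as they arrive, which is why the cost shows up as added time to first audio.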

Contributor

I see, yeah. The downside is that we would basically always add 400 ms of latency due to the buffer. But if this is a current service limitation, I think it makes sense.

Contributor Author

Agreed. This is a pretty big downside. The best TTS services have a P90+ latency of 200ms.

@krishvadhani19 can we avoid needing to buffer? Without it, I hear very choppy audio.

Contributor

@filipi87 filipi87 left a comment

LGTM!

Hopefully we’ll be able to remove this audio buffer in the future, but for now it probably makes sense to keep it to prevent the audio from being choppy.

@krishvadhani19

> Please describe the changes in your PR. If it is addressing an issue, please reference that as well.
>
> cc @krishvadhani19 for review.

Hey @markbackman, I added the buffer because Resemble streams audio in bursts, sometimes with gaps of more than 500ms. The initial buffer helps cover those gaps. I have raised this problem with the team.
Without the buffer, the audio is extremely choppy.
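
To illustrate why bursty delivery produces choppy audio without an initial buffer, here is a small sketch that checks whether playback would run dry. The function name `has_underrun` and the timing numbers are hypothetical and chosen only for illustration; they are not measurements from the Resemble service.

```python
def has_underrun(chunks: list[tuple[float, float]], prebuffer_ms: float) -> bool:
    """Return True if playback runs out of audio before a later chunk arrives.

    chunks: (arrival_ms, audio_duration_ms) pairs in arrival order.
    Playback starts once at least prebuffer_ms of audio has arrived,
    or at the last chunk if the threshold is never reached.
    """
    if not chunks:
        return False
    buffered_ms = 0.0
    start_index = len(chunks) - 1
    for i, (_, duration_ms) in enumerate(chunks):
        buffered_ms += duration_ms
        if buffered_ms >= prebuffer_ms:
            start_index = i
            break
    # Playback begins when the chunk that filled the buffer arrives.
    play_end_ms = chunks[start_index][0] + buffered_ms
    for arrival_ms, duration_ms in chunks[start_index + 1:]:
        if arrival_ms > play_end_ms:
            return True  # the player was empty before this chunk arrived
        play_end_ms += duration_ms
    return False


# Hypothetical bursts: 300 ms of audio at t=0, then 600 ms at t=400 and t=850.
bursts = [(0, 300), (400, 600), (850, 600)]
print(has_underrun(bursts, prebuffer_ms=0))    # True
print(has_underrun(bursts, prebuffer_ms=500))  # False
```

In this made-up trace, playing immediately leaves a 100 ms hole before the second burst, while waiting for 500 ms of audio avoids it at the cost of starting 400 ms later.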

Once it is fixed, I will remove this buffer from our service. Thank you.

cc: @filipi87

@markbackman
Contributor Author

@krishvadhani19 thanks for the reply. Given that, we'll merge as is and once your service no longer requires this, we can remove the buffering.

@krishvadhani19

> @krishvadhani19 thanks for the reply. Given that, we'll merge as is and once your service no longer requires this, we can remove the buffering.

Thank you, @markbackman.

@markbackman markbackman force-pushed the mb/resemble-tts-draft branch from 97bca9d to a592b7f Compare February 2, 2026 13:55
@markbackman markbackman merged commit 54e62a8 into main Feb 2, 2026
10 checks passed
@markbackman markbackman deleted the mb/resemble-tts-draft branch February 2, 2026 13:59
3 participants