Commit e5632a9

transition Hathora service to use the unified API and apply PR feedback
add Hathora to root files Hathora run linter added hathora changelog
1 parent 1510fb4 commit e5632a9

8 files changed

Lines changed: 152 additions & 205 deletions

README.md

Lines changed: 2 additions & 2 deletions
@@ -73,9 +73,9 @@ Catch new features, interviews, and how-tos on our [Pipecat TV](https://www.yout

| Category | Services |
| ------------------- | ----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
-| Speech-to-Text | [AssemblyAI](https://docs.pipecat.ai/server/services/stt/assemblyai), [AWS](https://docs.pipecat.ai/server/services/stt/aws), [Azure](https://docs.pipecat.ai/server/services/stt/azure), [Cartesia](https://docs.pipecat.ai/server/services/stt/cartesia), [Deepgram](https://docs.pipecat.ai/server/services/stt/deepgram), [ElevenLabs](https://docs.pipecat.ai/server/services/stt/elevenlabs), [Fal Wizper](https://docs.pipecat.ai/server/services/stt/fal), [Gladia](https://docs.pipecat.ai/server/services/stt/gladia), [Google](https://docs.pipecat.ai/server/services/stt/google), [Gradium](https://docs.pipecat.ai/server/services/stt/gradium), [Groq (Whisper)](https://docs.pipecat.ai/server/services/stt/groq), [NVIDIA Riva](https://docs.pipecat.ai/server/services/stt/riva), [OpenAI (Whisper)](https://docs.pipecat.ai/server/services/stt/openai), [SambaNova (Whisper)](https://docs.pipecat.ai/server/services/stt/sambanova), [Sarvam](https://docs.pipecat.ai/server/services/stt/sarvam), [Soniox](https://docs.pipecat.ai/server/services/stt/soniox), [Speechmatics](https://docs.pipecat.ai/server/services/stt/speechmatics), [Whisper](https://docs.pipecat.ai/server/services/stt/whisper) |
+| Speech-to-Text | [AssemblyAI](https://docs.pipecat.ai/server/services/stt/assemblyai), [AWS](https://docs.pipecat.ai/server/services/stt/aws), [Azure](https://docs.pipecat.ai/server/services/stt/azure), [Cartesia](https://docs.pipecat.ai/server/services/stt/cartesia), [Deepgram](https://docs.pipecat.ai/server/services/stt/deepgram), [ElevenLabs](https://docs.pipecat.ai/server/services/stt/elevenlabs), [Fal Wizper](https://docs.pipecat.ai/server/services/stt/fal), [Gladia](https://docs.pipecat.ai/server/services/stt/gladia), [Google](https://docs.pipecat.ai/server/services/stt/google), [Gradium](https://docs.pipecat.ai/server/services/stt/gradium), [Groq (Whisper)](https://docs.pipecat.ai/server/services/stt/groq), [Hathora](https://docs.pipecat.ai/server/services/stt/hathora), [NVIDIA Riva](https://docs.pipecat.ai/server/services/stt/riva), [OpenAI (Whisper)](https://docs.pipecat.ai/server/services/stt/openai), [SambaNova (Whisper)](https://docs.pipecat.ai/server/services/stt/sambanova), [Sarvam](https://docs.pipecat.ai/server/services/stt/sarvam), [Soniox](https://docs.pipecat.ai/server/services/stt/soniox), [Speechmatics](https://docs.pipecat.ai/server/services/stt/speechmatics), [Whisper](https://docs.pipecat.ai/server/services/stt/whisper) |
| LLMs | [Anthropic](https://docs.pipecat.ai/server/services/llm/anthropic), [AWS](https://docs.pipecat.ai/server/services/llm/aws), [Azure](https://docs.pipecat.ai/server/services/llm/azure), [Cerebras](https://docs.pipecat.ai/server/services/llm/cerebras), [DeepSeek](https://docs.pipecat.ai/server/services/llm/deepseek), [Fireworks AI](https://docs.pipecat.ai/server/services/llm/fireworks), [Gemini](https://docs.pipecat.ai/server/services/llm/gemini), [Grok](https://docs.pipecat.ai/server/services/llm/grok), [Groq](https://docs.pipecat.ai/server/services/llm/groq), [Mistral](https://docs.pipecat.ai/server/services/llm/mistral), [NVIDIA NIM](https://docs.pipecat.ai/server/services/llm/nim), [Ollama](https://docs.pipecat.ai/server/services/llm/ollama), [OpenAI](https://docs.pipecat.ai/server/services/llm/openai), [OpenRouter](https://docs.pipecat.ai/server/services/llm/openrouter), [Perplexity](https://docs.pipecat.ai/server/services/llm/perplexity), [Qwen](https://docs.pipecat.ai/server/services/llm/qwen), [SambaNova](https://docs.pipecat.ai/server/services/llm/sambanova) [Together AI](https://docs.pipecat.ai/server/services/llm/together) |
-| Text-to-Speech | [Async](https://docs.pipecat.ai/server/services/tts/asyncai), [AWS](https://docs.pipecat.ai/server/services/tts/aws), [Azure](https://docs.pipecat.ai/server/services/tts/azure), [Camb AI](https://docs.pipecat.ai/server/services/tts/camb), [Cartesia](https://docs.pipecat.ai/server/services/tts/cartesia), [Deepgram](https://docs.pipecat.ai/server/services/tts/deepgram), [ElevenLabs](https://docs.pipecat.ai/server/services/tts/elevenlabs), [Fish](https://docs.pipecat.ai/server/services/tts/fish), [Google](https://docs.pipecat.ai/server/services/tts/google), [Gradium](https://docs.pipecat.ai/server/services/tts/gradium), [Groq](https://docs.pipecat.ai/server/services/tts/groq), [Hume](https://docs.pipecat.ai/server/services/tts/hume), [Inworld](https://docs.pipecat.ai/server/services/tts/inworld), [LMNT](https://docs.pipecat.ai/server/services/tts/lmnt), [MiniMax](https://docs.pipecat.ai/server/services/tts/minimax), [Neuphonic](https://docs.pipecat.ai/server/services/tts/neuphonic), [NVIDIA Riva](https://docs.pipecat.ai/server/services/tts/riva), [OpenAI](https://docs.pipecat.ai/server/services/tts/openai), [Piper](https://docs.pipecat.ai/server/services/tts/piper), [PlayHT](https://docs.pipecat.ai/server/services/tts/playht), [Rime](https://docs.pipecat.ai/server/services/tts/rime), [Sarvam](https://docs.pipecat.ai/server/services/tts/sarvam), [Speechmatics](https://docs.pipecat.ai/server/services/tts/speechmatics), [XTTS](https://docs.pipecat.ai/server/services/tts/xtts) |
+| Text-to-Speech | [Async](https://docs.pipecat.ai/server/services/tts/asyncai), [AWS](https://docs.pipecat.ai/server/services/tts/aws), [Azure](https://docs.pipecat.ai/server/services/tts/azure), [Camb AI](https://docs.pipecat.ai/server/services/tts/camb), [Cartesia](https://docs.pipecat.ai/server/services/tts/cartesia), [Deepgram](https://docs.pipecat.ai/server/services/tts/deepgram), [ElevenLabs](https://docs.pipecat.ai/server/services/tts/elevenlabs), [Fish](https://docs.pipecat.ai/server/services/tts/fish), [Google](https://docs.pipecat.ai/server/services/tts/google), [Gradium](https://docs.pipecat.ai/server/services/tts/gradium), [Groq](https://docs.pipecat.ai/server/services/tts/groq), [Hathora](https://docs.pipecat.ai/server/services/tts/hathora), [Hume](https://docs.pipecat.ai/server/services/tts/hume), [Inworld](https://docs.pipecat.ai/server/services/tts/inworld), [LMNT](https://docs.pipecat.ai/server/services/tts/lmnt), [MiniMax](https://docs.pipecat.ai/server/services/tts/minimax), [Neuphonic](https://docs.pipecat.ai/server/services/tts/neuphonic), [NVIDIA Riva](https://docs.pipecat.ai/server/services/tts/riva), [OpenAI](https://docs.pipecat.ai/server/services/tts/openai), [Piper](https://docs.pipecat.ai/server/services/tts/piper), [PlayHT](https://docs.pipecat.ai/server/services/tts/playht), [Rime](https://docs.pipecat.ai/server/services/tts/rime), [Sarvam](https://docs.pipecat.ai/server/services/tts/sarvam), [Speechmatics](https://docs.pipecat.ai/server/services/tts/speechmatics), [XTTS](https://docs.pipecat.ai/server/services/tts/xtts) |
| Speech-to-Speech | [AWS Nova Sonic](https://docs.pipecat.ai/server/services/s2s/aws), [Gemini Multimodal Live](https://docs.pipecat.ai/server/services/s2s/gemini), [Grok Voice Agent](https://docs.pipecat.ai/server/services/s2s/grok), [OpenAI Realtime](https://docs.pipecat.ai/server/services/s2s/openai), [Ultravox](https://docs.pipecat.ai/server/services/s2s/ultravox), |
| Transport | [Daily (WebRTC)](https://docs.pipecat.ai/server/services/transport/daily), [FastAPI Websocket](https://docs.pipecat.ai/server/services/transport/fastapi-websocket), [SmallWebRTCTransport](https://docs.pipecat.ai/server/services/transport/small-webrtc), [WebSocket Server](https://docs.pipecat.ai/server/services/transport/websocket-server), Local |
| Serializers | [Exotel](https://docs.pipecat.ai/server/utilities/serializers/exotel), [Plivo](https://docs.pipecat.ai/server/utilities/serializers/plivo), [Twilio](https://docs.pipecat.ai/server/utilities/serializers/twilio), [Telnyx](https://docs.pipecat.ai/server/utilities/serializers/telnyx), [Vonage](https://docs.pipecat.ai/server/utilities/serializers/vonage) |

changelog/3169.added.md

Lines changed: 1 addition & 0 deletions
@@ -0,0 +1 @@
+- Added Hathora service to support Hathora-hosted TTS and STT models (only non-streaming)

env.example

Lines changed: 3 additions & 0 deletions
@@ -85,6 +85,9 @@ GROK_API_KEY=...
 # Groq
 GROQ_API_KEY=...

+# Hathora
+HATHORA_API_KEY=...
+
 # Heygen
 HEYGEN_API_KEY=...
 HEYGEN_LIVE_AVATAR_API_KEY=...
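
The removed example code passed `api_key=os.getenv("HATHORA_API_KEY")` explicitly, while the new `HathoraSTTService(model=...)` and `HathoraTTSService(model=...)` calls in this commit pass no key. How the unified services obtain the key is not shown in this diff. As a minimal sketch, assuming the key still comes from the `HATHORA_API_KEY` environment variable, a script could check for it up front. `require_env` is a hypothetical helper for illustration and is not part of pipecat.

```python
import os


def require_env(name: str) -> str:
    """Return the value of an environment variable, or raise a clear error.

    Hypothetical helper for illustration; not part of pipecat.
    """
    value = os.getenv(name)
    if not value:
        # Treat both "unset" and "empty string" as missing.
        raise RuntimeError(f"{name} is not set; add it to your environment or .env file")
    return value
```

Calling `require_env("HATHORA_API_KEY")` before building the pipeline would surface a missing key at startup rather than on the first request.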

examples/foundational/07af-interruptible-hathora.py renamed to examples/foundational/07ag-interruptible-hathora.py

Lines changed: 14 additions & 18 deletions
@@ -21,8 +21,8 @@
 from pipecat.processors.aggregators.llm_response_universal import LLMContextAggregatorPair
 from pipecat.runner.types import RunnerArguments
 from pipecat.runner.utils import create_transport
-from pipecat.services.hathora.stt import ParakeetSTTService
-from pipecat.services.hathora.tts import ChatterboxTTSService, KokoroTTSService
+from pipecat.services.hathora.stt import HathoraSTTService
+from pipecat.services.hathora.tts import HathoraTTSService
 from pipecat.services.openai.llm import OpenAILLMService
 from pipecat.transports.base_transport import BaseTransport, TransportParams
 from pipecat.transports.daily.transport import DailyParams
@@ -38,38 +38,34 @@
         audio_in_enabled=True,
         audio_out_enabled=True,
         vad_analyzer=SileroVADAnalyzer(params=VADParams(stop_secs=0.2)),
-        turn_analyzer=LocalSmartTurnAnalyzerV3(),
+        turn_analyzer=LocalSmartTurnAnalyzerV3(params=SmartTurnParams()),
+    ),
+    "twilio": lambda: FastAPIWebsocketParams(
+        audio_in_enabled=True,
+        audio_out_enabled=True,
+        vad_analyzer=SileroVADAnalyzer(params=VADParams(stop_secs=0.2)),
+        turn_analyzer=LocalSmartTurnAnalyzerV3(params=SmartTurnParams()),
     ),
     "webrtc": lambda: TransportParams(
         audio_in_enabled=True,
         audio_out_enabled=True,
         vad_analyzer=SileroVADAnalyzer(params=VADParams(stop_secs=0.2)),
-        turn_analyzer=LocalSmartTurnAnalyzerV3(),
+        turn_analyzer=LocalSmartTurnAnalyzerV3(params=SmartTurnParams()),
     ),
 }


 async def run_bot(transport: BaseTransport, runner_args: RunnerArguments):
     logger.info(f"Starting bot")

-    # See https://models.hathora.dev/model/nvidia-parakeet-tdt-0.6b-v3
-    stt = ParakeetTDTSTTService(
-        base_url="https://app-1c7bebb9-6977-4101-9619-833b251b86d1.app.hathora.dev/v1/transcribe",
-        api_key=os.getenv("HATHORA_API_KEY")
+    stt = HathoraSTTService(
+        model="nvidia-parakeet-tdt-0.6b-v3",
     )

-    # See https://models.hathora.dev/model/hexgrad-kokoro-82m
-    tts = KokoroTTSService(
-        base_url="https://app-01312daf-6e53-4b9d-a4ad-13039f35adc4.app.hathora.dev/synthesize",
-        api_key=os.getenv("HATHORA_API_KEY"),
+    tts = HathoraTTSService(
+        model="hexgrad-kokoro-82m",
     )

-    # See https://models.hathora.dev/model/resemble-ai-chatterbox
-    # tts = ChatterboxTTSService(
-    # base_url="https://app-efbc8fe2-df55-4f96-bbe3-74f6ea9d986b.app.hathora.dev/v1/generate",
-    # api_key=os.getenv("HATHORA_API_KEY")
-    # )
-
     # See https://models.hathora.dev/model/qwen3-30b-a3b
     llm = OpenAILLMService(
         base_url="https://app-362f7ca1-6975-4e18-a605-ab202bf2c315.app.hathora.dev/v1",
Lines changed: 0 additions & 14 deletions
@@ -1,14 +0,0 @@
-#
-# Copyright (c) 2024–2025, Daily
-#
-# SPDX-License-Identifier: BSD 2-Clause License
-#
-
-import sys
-
-from pipecat.services import DeprecatedModuleProxy
-
-from .stt import *
-from .tts import *
-
-sys.modules[__name__] = DeprecatedModuleProxy(globals(), "hathora", "hathora.[stt,tts]")
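
The deleted file replaced its own entry in `sys.modules` with a proxy object. The diff does not show how pipecat's `DeprecatedModuleProxy` is implemented, so the following is only a generic sketch of that pattern: a module subclass that serves names from the old namespace and emits a `DeprecationWarning` on access. The class name `DeprecatedProxy` and the meaning given to its three arguments are assumptions modeled on the call in the removed line.

```python
import types
import warnings


class DeprecatedProxy(types.ModuleType):
    """Generic illustration only; NOT pipecat's DeprecatedModuleProxy."""

    def __init__(self, namespace: dict, old: str, new: str):
        super().__init__(namespace.get("__name__", old))
        self._namespace = namespace  # names exported by the old module
        self._old = old              # assumed: deprecated import path
        self._new = new              # assumed: recommended import path

    def __getattr__(self, attr: str):
        # Only reached when normal attribute lookup on the proxy fails.
        namespace = self.__dict__.get("_namespace", {})
        if attr.startswith("__") or attr not in namespace:
            raise AttributeError(f"module {self.__name__!r} has no attribute {attr!r}")
        warnings.warn(
            f"Importing from {self._old!r} is deprecated; use {self._new!r} instead.",
            DeprecationWarning,
            stacklevel=2,
        )
        return namespace[attr]
```

A package `__init__.py` using this pattern would end with an assignment like `sys.modules[__name__] = DeprecatedProxy(globals(), "old.path", "new.path")`, mirroring the removed line.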
