Gemini Live: User transcription not returned (only assistant transcription) #3780

@WillPowellUk

Description

pipecat version

pipecat-ai-small-webrtc-prebuilt>=2.2.0
pipecat-ai[cartesia,daily,deepgram,google,silero,tracing,webrtc]>=0.0.102
pipecatcloud>=0.2.20

Python version

3.13

Operating System

macOS Tahoe

Related issue

This is a follow-up to #3350 — the issue persists.

Issue description

Using the Gemini Live model without a separate STT service, only the assistant transcription is returned. User transcription is not surfaced to the client, even though the Gemini model does appear to receive and process user audio.

Apologies if this turns out to be an issue on Gemini's side.

With the transcription option enabled, only the assistant transcription comes back from the Gemini model. Occasionally the user transcription comes through as well, but most of the time it is missing. I also tried to capture the user transcription with a custom frame processor, but no user transcription frames appear to be emitted downstream.

Is there a way to enable a "transcribe audio" option similar to the one in the previous Gemini multimodal library version?
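For context, the frame-processor approach I tried amounts to filtering frames by type and passing everything else through. A minimal sketch of the pattern is below; the `Frame`/`TranscriptionFrame` classes here are simplified stand-ins for pipecat's real frame types (which carry more fields), used only to illustrate what the processor does:

```python
from dataclasses import dataclass

# Simplified stand-ins for pipecat's frame types (illustration only;
# the real classes live in pipecat.frames.frames and carry more fields).
@dataclass
class Frame:
    pass

@dataclass
class TranscriptionFrame(Frame):
    text: str
    role: str  # "user" or "assistant"

class TranscriptCapture:
    """Pass-through processor that records user transcription frames."""

    def __init__(self):
        self.user_lines = []

    def process_frame(self, frame: Frame) -> Frame:
        # Record user transcriptions, then forward every frame unchanged.
        if isinstance(frame, TranscriptionFrame) and frame.role == "user":
            self.user_lines.append(frame.text)
        return frame
```

With the real pipecat `FrameProcessor` base class, `process_frame` is async and the processor must call `push_frame` to forward frames; the filtering logic is the same. In my case `user_lines` stays empty, which is why I suspect the user transcription frames never reach the rest of the pipeline.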

Reproduction steps

Running the foundational example 26a-gemini-live-transcription.py as-is.

Expected behavior

Both user and assistant transcriptions should appear in the client UI:

assistant: Hello! Do you want to hear a joke?
user: Yeah, sure.
assistant: Why don't scientists trust atoms?
user: I'm not sure.
assistant: Because they make everything up!

Actual behavior

Only the assistant transcription is displayed in the client UI. The user messages are missing:


(Screenshot: The client UI shows only "assistant" messages with no user messages visible.)

Logs

Note: the server-side logs do show [Transcription:user] debug lines from Gemini, but these do not appear to propagate to the client.

2026-02-19 23:05:06.077 | INFO     | __main__:run_bot:55 - Starting bot
2026-02-19 23:05:06.093 | DEBUG    | pipecat.audio.vad.silero:__init__:147 - Loading Silero VAD model...
2026-02-19 23:05:06.127 | DEBUG    | pipecat.audio.vad.silero:__init__:169 - Loaded Silero VAD
2026-02-19 23:05:06.128 | DEBUG    | pipecat.audio.turn.smart_turn.local_smart_turn_v3:__init__:74 - Loading Local Smart Turn v3.x model from /Users/will/dev/phil/.venv/lib/python3.13/site-packages/pipecat/audio/turn/smart_turn/data/smart-turn-v3.2-cpu.onnx...
2026-02-19 23:05:06.151 | DEBUG    | pipecat.audio.turn.smart_turn.local_smart_turn_v3:__init__:85 - Loaded Local Smart Turn v3.x
2026-02-19 23:05:06.151 | DEBUG    | pipecat.processors.frame_processor:link:561 - Linking Pipeline#0::Source -> SmallWebRTCInputTransport#0
2026-02-19 23:05:06.151 | DEBUG    | pipecat.processors.frame_processor:link:561 - Linking SmallWebRTCInputTransport#0 -> LLMUserAggregator#0
2026-02-19 23:05:06.151 | DEBUG    | pipecat.processors.frame_processor:link:561 - Linking LLMUserAggregator#0 -> GeminiLiveLLMService#0
2026-02-19 23:05:06.151 | DEBUG    | pipecat.processors.frame_processor:link:561 - Linking GeminiLiveLLMService#0 -> SmallWebRTCOutputTransport#0
2026-02-19 23:05:06.151 | DEBUG    | pipecat.processors.frame_processor:link:561 - Linking SmallWebRTCOutputTransport#0 -> LLMAssistantAggregator#0
2026-02-19 23:05:06.151 | DEBUG    | pipecat.processors.frame_processor:link:561 - Linking LLMAssistantAggregator#0 -> Pipeline#0::Sink
2026-02-19 23:05:06.151 | DEBUG    | pipecat.processors.frame_processor:link:561 - Linking PipelineTask#0::Source -> RTVIProcessor#0
2026-02-19 23:05:06.151 | DEBUG    | pipecat.processors.frame_processor:link:561 - Linking RTVIProcessor#0 -> Pipeline#0
2026-02-19 23:05:06.151 | DEBUG    | pipecat.processors.frame_processor:link:561 - Linking Pipeline#0 -> PipelineTask#0::Sink
2026-02-19 23:05:06.151 | DEBUG    | pipecat.pipeline.runner:run:71 - Runner PipelineRunner#0 started running PipelineTask#0
2026-02-19 23:05:06.152 | DEBUG    | pipecat.pipeline.task:_wait_for_pipeline_start:718 - PipelineTask#0: Starting. Waiting for StartFrame#0 to reach the end of the pipeline...
2026-02-19 23:05:06.153 | INFO     | pipecat.services.google.gemini_live.llm:_connect:1072 - Connecting to Gemini service
2026-02-19 23:05:06.228 | INFO     | pipecat.services.google.gemini_live.llm:_connection_task_handler:1187 - Connected to Gemini service
2026-02-19 23:05:06.228 | DEBUG    | pipecat.services.google.gemini_live.llm:_create_initial_response:1390 - Creating initial response
2026-02-19 23:05:07.088 | DEBUG    | pipecat.processors.metrics.frame_processor_metrics:stop_ttfb_metrics:131 - GeminiLiveLLMService#0 TTFB: 0.8599798679351807
2026-02-19 23:05:07.636 | DEBUG    | pipecat.transports.base_output:_bot_started_speaking:608 - Bot started speaking
2026-02-19 23:05:09.828 | DEBUG    | pipecat.processors.metrics.frame_processor_metrics:start_llm_usage_metrics:173 - GeminiLiveLLMService#0 prompt tokens: 355, completion tokens: 60, reasoning tokens: 35
2026-02-19 23:05:09.871 | INFO     | __main__:on_assistant_turn_stopped:133 - Transcript: [2026-02-19T23:05:07.089+00:00] assistant: Hello! Do you want to hear a joke?
2026-02-19 23:05:10.220 | DEBUG    | pipecat.transports.base_output:_bot_stopped_speaking:630 - Bot stopped speaking
2026-02-19 23:05:10.725 | DEBUG    | pipecat.processors.aggregators.llm_response_universal:_on_user_turn_started:685 - LLMUserAggregator#0: User started speaking
2026-02-19 23:05:11.556 | DEBUG    | pipecat.services.google.gemini_live.llm:_handle_msg_input_transcription:1683 - [Transcription:user] [Yeah, sure.]
2026-02-19 23:05:11.576 | DEBUG    | pipecat.audio.turn.smart_turn.base_smart_turn:analyze_end_of_turn:162 - End of Turn result: EndOfTurnState.INCOMPLETE
2026-02-19 23:05:14.500 | DEBUG    | pipecat.processors.aggregators.llm_response_universal:_on_user_turn_stopped:703 - LLMUserAggregator#0: User stopped speaking
2026-02-19 23:05:14.501 | INFO     | __main__:on_user_turn_stopped:127 - Transcript: [2026-02-19T23:05:10.725+00:00] user: Yeah, sure.
2026-02-19 23:05:15.699 | DEBUG    | pipecat.processors.metrics.frame_processor_metrics:start_llm_usage_metrics:173 - GeminiLiveLLMService#0 prompt tokens: 471, completion tokens: 77, reasoning tokens: 23
2026-02-19 23:05:15.715 | INFO     | __main__:on_assistant_turn_stopped:133 - Transcript: [2026-02-19T23:05:12.284+00:00] assistant: Why don't scientists trust atoms?
2026-02-19 23:05:16.063 | DEBUG    | pipecat.transports.base_output:_bot_stopped_speaking:630 - Bot stopped speaking
2026-02-19 23:05:17.220 | DEBUG    | pipecat.processors.aggregators.llm_response_universal:_on_user_turn_started:685 - LLMUserAggregator#0: User started speaking
2026-02-19 23:05:18.033 | DEBUG    | pipecat.services.google.gemini_live.llm:_handle_msg_input_transcription:1683 - [Transcription:user] [I'm not sure.]
2026-02-19 23:05:18.149 | DEBUG    | pipecat.audio.turn.smart_turn.base_smart_turn:analyze_end_of_turn:162 - End of Turn result: EndOfTurnState.COMPLETE
2026-02-19 23:05:18.149 | DEBUG    | pipecat.processors.aggregators.llm_response_universal:_on_user_turn_stopped:703 - LLMUserAggregator#0: User stopped speaking
2026-02-19 23:05:18.149 | INFO     | __main__:on_user_turn_stopped:127 - Transcript: [2026-02-19T23:05:17.220+00:00] user: I'm not sure.
2026-02-19 23:05:18.775 | DEBUG    | pipecat.processors.metrics.frame_processor_metrics:stop_ttfb_metrics:131 - GeminiLiveLLMService#0 TTFB: 0.6261801719665527
2026-02-19 23:05:19.237 | DEBUG    | pipecat.transports.base_output:_bot_started_speaking:608 - Bot started speaking
2026-02-19 23:05:21.747 | DEBUG    | pipecat.processors.metrics.frame_processor_metrics:start_llm_usage_metrics:173 - GeminiLiveLLMService#0 prompt tokens: 601, completion tokens: 69, reasoning tokens: 28
2026-02-19 23:05:21.752 | INFO     | __main__:on_assistant_turn_stopped:133 - Transcript: [2026-02-19T23:05:18.777+00:00] assistant: Because they make everything up!
2026-02-19 23:05:22.101 | DEBUG    | pipecat.transports.base_output:_bot_stopped_speaking:630 - Bot stopped speaking
2026-02-19 23:05:22.505 | DEBUG    | pipecat.transports.smallwebrtc.connection:_handle_new_connection_state:564 - Connection state changed to: closed
2026-02-19 23:05:22.506 | INFO     | __main__:on_client_disconnected:120 - Client disconnected
2026-02-19 23:05:22.508 | INFO     | pipecat.services.google.gemini_live.llm:_disconnect:1292 - Disconnecting from Gemini service
2026-02-19 23:05:22.523 | DEBUG    | pipecat.pipeline.task:run:616 - Pipeline task PipelineTask#0 has finished

Key observation: The server logs show [Transcription:user] debug lines from Gemini (e.g., [Transcription:user] [Yeah, sure.]), and on_user_turn_stopped fires with the correct text. However, the user transcription does not appear in the client UI — only assistant messages are displayed.
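Since `on_user_turn_stopped` does fire server-side with the correct text, a possible interim workaround is to forward the transcript to the client manually from that handler. A minimal sketch, where `send_to_client` is a hypothetical stand-in for whatever server-to-client channel the app uses (e.g. an RTVI server message or a data channel):

```python
import json
from datetime import datetime, timezone

def build_transcript_message(role: str, text: str) -> str:
    """Package one transcript line as JSON for a server-to-client channel."""
    return json.dumps({
        "type": "transcript",
        "role": role,
        "text": text,
        "timestamp": datetime.now(timezone.utc).isoformat(),
    })

# Sketch of wiring it into the event handler from the example
# (`send_to_client` is hypothetical, not a pipecat API):
#
# @transcript.event_handler("on_user_turn_stopped")
# async def on_user_turn_stopped(processor, text):
#     await send_to_client(build_transcript_message("user", text))
```

This is a workaround, not a fix: the underlying issue is that the user transcription frames Gemini Live emits don't reach the client path on their own.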
