Add latency breakdown to UserBotLatencyObserver#3885

Merged
markbackman merged 8 commits into main from mb/latency-breakdown
Mar 3, 2026
Conversation

@markbackman
Contributor

@markbackman markbackman commented Mar 1, 2026

Context

The goal is to measure the user to bot latency and to display the contributing factors to that latency number. The possible contributors are:

  • VAD stop_secs
  • STT latency
  • Turn detection
  • LLM latency
  • Pre-TTS text aggregation
  • TTS latency
  • Function call execution time

The sequence of events looks something like this:
(sequence diagram: user-bot-events)

We are breaking the measurement down into the following categories:

  • User turn secs: This measures from when the user actually stops speaking (VADStoppedSpeakingFrame - VADParams stop_secs) to when the UserStoppedSpeakingFrame occurs. There are two main contributors to this phase: STT latency and turn detection. When an STT latency is available, we display the corresponding TTFB value. We do not explicitly display the turn detection latency because it is difficult to characterize, but it is generally the main contributor to long user_turn_secs values.
  • TTFB latencies: STT, LLM, TTS. The LLM and TTS portions directly impact the response time. The STT portion is a function of more complex logic, as described above.
  • Text aggregation latency: This is the time to build a sentence from the LLM's streamed tokens. If sentence aggregation is in place, we measure it to help the developer understand its impact.
  • Function call latency: Time spent executing tool/function calls between LLM roundtrips. Measured from FunctionCallInProgressFrame to FunctionCallResultFrame, making previously invisible API call time visible in the waterfall.
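As a rough illustration of the user_turn_secs arithmetic in the first bullet, here is a sketch with hypothetical timestamps. The constant and function names are illustrative, not the observer's internals; only the stop_secs subtraction comes from the description above.

```python
# Sketch of the user_turn_secs measurement: the VAD event fires stop_secs
# after real silence began, so we subtract stop_secs to recover the true
# end of speech. Names and values here are illustrative.
VAD_STOP_SECS = 0.8  # hypothetical VADParams stop_secs


def user_turn_secs(vad_stopped_ts: float, user_stopped_ts: float,
                   stop_secs: float = VAD_STOP_SECS) -> float:
    """Seconds from when the user actually stopped speaking to turn release."""
    actual_silence_start = vad_stopped_ts - stop_secs
    return user_stopped_ts - actual_silence_start


# VAD fired at t=10.8 (real silence began at t=10.0); the
# UserStoppedSpeakingFrame arrived at t=11.3, so the user turn took ~1.3s.
turn = user_turn_secs(10.8, 11.3)
```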

Note: we do not provide a measurement for the turn result because it does not have a direct impact on response time. The result itself is used by the LLMUserAggregator logic to determine how to respond when the user stop strategy includes a turn analyzer. The turn result latency is available in a MetricsFrame for those who want access, perhaps for troubleshooting or debugging.

Summary

  • Added on_latency_breakdown event to UserBotLatencyObserver with a LatencyBreakdown model providing per-service TTFB, text aggregation, user turn duration, and function call latency metrics
  • Added on_first_bot_speech_latency event measuring time from ClientConnectedFrame to first BotStartedSpeakingFrame, with a latency breakdown including per-service metrics. Skipped when user speaks first (only meaningful for greetings).
  • Added FunctionCallMetrics tracking to LatencyBreakdown, measuring execution time of each function call via FunctionCallInProgressFrame to FunctionCallResultFrame
  • Measures user_turn_secs from actual user silence (VAD timestamp minus stop_secs) to turn release, capturing VAD silence detection, STT finalization, and turn analyzer wait time
  • Resets metric accumulators on InterruptionFrame to discard stale metrics from cancelled LLM/TTS cycles
  • All breakdown data models are Pydantic BaseModel with start_time (Unix timestamp) and duration_secs fields, enabling .model_dump() serialization and timeline plotting
  • Updated example 29 with waterfall-style latency breakdown display including function calls
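To make the "waterfall-style" display concrete, here is a minimal sketch of rendering breakdown entries on a shared timeline. The real entries are Pydantic BaseModels with start_time and duration_secs fields as described above; this sketch uses stdlib dataclasses as stand-ins, and the processor names and helper are illustrative, not example 29's actual code.

```python
from dataclasses import dataclass


# Stand-in for a breakdown entry; the real models are Pydantic BaseModels
# with the same start_time (Unix timestamp) and duration_secs fields.
@dataclass
class BreakdownEntry:
    processor: str
    start_time: float
    duration_secs: float


def waterfall_lines(entries: list[BreakdownEntry]) -> list[str]:
    """Render a simple text waterfall: offset from the earliest entry,
    sorted chronologically by start_time."""
    t0 = min(e.start_time for e in entries)
    return [
        f"{e.processor:<16} +{e.start_time - t0:6.3f}s  {e.duration_secs:.3f}s"
        for e in sorted(entries, key=lambda e: e.start_time)
    ]


lines = waterfall_lines([
    BreakdownEntry("TTSService", 100.9, 0.30),
    BreakdownEntry("STTService", 100.0, 0.25),
    BreakdownEntry("LLMService", 100.4, 0.45),
])
```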

Testing

```shell
uv run pytest tests/test_user_bot_latency_observer.py -v
```

🤖 Generated with Claude Code

markbackman added a commit that referenced this pull request Mar 1, 2026
@codecov

codecov Bot commented Mar 1, 2026

Codecov Report

❌ Patch coverage is 98.14815% with 2 lines in your changes missing coverage. Please review.

| Files with missing lines | Patch % | Lines |
| --- | --- | --- |
| src/pipecat/observers/user_bot_latency_observer.py | 98.14% | 2 Missing ⚠️ |

| Files with missing lines | Coverage Δ |
| --- | --- |
| src/pipecat/observers/user_bot_latency_observer.py | 97.67% <98.14%> (+5.67%) ⬆️ |

... and 15 files with indirect coverage changes


@markbackman markbackman force-pushed the mb/latency-breakdown branch 2 times, most recently from cf75eb3 to 776ca5c Compare March 1, 2026 12:55
markbackman added a commit that referenced this pull request Mar 1, 2026
@markbackman markbackman force-pushed the mb/latency-breakdown branch from 776ca5c to 858404e Compare March 1, 2026 14:14
@markbackman markbackman changed the base branch from main to mb/startup-observer March 1, 2026 14:14
@markbackman markbackman force-pushed the mb/latency-breakdown branch from 858404e to f6ed923 Compare March 1, 2026 15:01
@markbackman markbackman force-pushed the mb/startup-observer branch from c2e0e01 to 58d0bfc Compare March 2, 2026 15:53
@markbackman markbackman force-pushed the mb/latency-breakdown branch from de0ac63 to 8b41398 Compare March 2, 2026 16:03
@markbackman markbackman changed the base branch from mb/startup-observer to main March 2, 2026 16:22
@markbackman markbackman changed the base branch from main to mb/startup-observer March 2, 2026 16:22
@markbackman markbackman force-pushed the mb/startup-observer branch from 4270f12 to 75669b1 Compare March 2, 2026 18:10
Base automatically changed from mb/startup-observer to main March 2, 2026 21:07
The ServiceSettings refactor (PR #3714) changed self._settings from
dicts to dataclass subclasses, but tracing code still used .items(),
`in` containment checks, and subscript access, causing AttributeError
on every traced call. Use given_fields() for iteration and attribute
access for named fields.
Add per-service latency breakdown metrics alongside existing user-to-bot
latency measurement. When enable_metrics=True, the observer now emits an
on_latency_breakdown event with TTFB, text aggregation, and user turn
duration metrics collected between VADUserStoppedSpeakingFrame and
BotStartedSpeakingFrame.

- Add LatencyBreakdown dataclass with ttfb, text_aggregation,
  user_turn_secs fields
- Accumulate MetricsFrame data during user→bot cycles
- Reset accumulators on InterruptionFrame to discard stale metrics
- Measure user_turn_secs from actual user silence (VAD timestamp -
  stop_secs) to turn release (UserStoppedSpeakingFrame)
- Filter zero-value TTFB entries from startup metric resets
- Add frame deduplication using bounded deque + set pattern
- Update example 29 with latency breakdown display
Measure time from ClientConnectedFrame to first BotStartedSpeakingFrame,
emitting a one-time on_first_bot_speech_latency event with breakdown.

Enables .model_dump() serialization for Pipecat Cloud collection.

All metrics now include start_time (Unix timestamp) for timeline
plotting alongside duration_secs.
@markbackman markbackman force-pushed the mb/latency-breakdown branch from 8b41398 to ff5b985 Compare March 2, 2026 21:11

```python
if breakdown.text_aggregation:
    ta = breakdown.text_aggregation
    logger.info(f"  {ta.processor}: text aggregation {ta.duration_secs:.3f}s")
```
Contributor


Since this might be common, I'm wondering if we could have a function we could call that returns a chronological list of strings. Like:

```python
events = observer.chronological_events()
```

So people could just iterate and print.
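A minimal sketch of what such a helper could produce. Only the name chronological_events comes from this comment; the event tuples, formatting, and signature are assumed for illustration.

```python
# Sketch of the suggested helper: gather (timestamp, description) pairs
# from the collected metrics and return them as chronological strings
# that callers can simply iterate and print. Everything here is assumed.
def chronological_events(events: list[tuple[float, str]]) -> list[str]:
    return [f"{ts:10.3f}s  {desc}" for ts, desc in sorted(events)]


for line in chronological_events([
    (100.9, "TTSService TTFB 0.300s"),
    (100.0, "STTService TTFB 0.250s"),
]):
    print(line)
```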

Contributor Author


Good idea!

Contributor Author


Ok, pushing a commit with this function.

@aconchillo
Contributor

Approving! This is great. Just added a comment to simplify examples/apps a bit.

@markbackman markbackman merged commit aad1211 into main Mar 3, 2026
5 checks passed
