Add PatternPairAggregator by markbackman · Pull Request #1387 · pipecat-ai/pipecat

markbackman · 2025-03-18T03:20:01Z

Please describe the changes in your PR. If it is addressing an issue, please reference that as well.

Extends the BaseTextAggregator with one that's aimed at:

- Removing text from the LLM output before it is provided to the TTS service
- Providing a handler that receives the content between the pattern pair, so that the content can be used to do things like change voices, which I show in a demo

codecov · 2025-03-18T03:21:39Z

Codecov Report

Attention: Patch coverage is 94.66667% with 4 lines in your changes missing coverage. Please review.

Files with missing lines	Patch %	Lines
src/pipecat/utils/text/pattern_pair_aggregator.py	94.66%	4 Missing ⚠️

Files with missing lines	Coverage Δ
src/pipecat/utils/text/pattern_pair_aggregator.py	`94.66% <94.66%> (ø)`


filipi87 · 2025-03-18T10:22:41Z

+        transport = DailyTransport(
+            room_url,
+            token,
+            "Storytelling Bot",


Nit: should we change this to "Multiple voices Bot" or something like that?

filipi87 · 2025-03-18T10:34:28Z

When I first looked at the example, I thought: why aren't we using function calling for this? 😅

But after reading the description, it made total sense.

That makes me wonder if we could improve the example name, maybe something like 35-multiple-voices-bot or another name that clearly indicates the bot will automatically play multiple characters with different voices.

The benefit of using this approach is that function calls are slow, adding noticeable latency to the response. With this approach, the LLM can output many encoded instructions in a single turn. Also, the applications extend beyond just voice switching. You can now encode any information into the LLM response and just parse it out. Two other common cases I've heard are DTMF codes and thinking tokens.

Yep, I think it makes a lot of sense. 👍
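
To make the point about encoding several instructions in one turn concrete, here is a minimal standalone sketch. It does not use Pipecat's `PatternPairAggregator`; the tag names (`voice`, `dtmf`, `think`) and the `parse_encoded` helper are assumptions invented for illustration, and a real aggregator would also have to cope with tags split across streamed chunks.

```python
import re
from typing import List, Tuple

# Hypothetical tag names, chosen for illustration only.
TAG_RE = re.compile(r"<(voice|dtmf|think)>(.*?)</\1>", re.DOTALL)


def parse_encoded(text: str) -> Tuple[str, List[Tuple[str, str]]]:
    """Strip tagged spans from text and return them as (tag, content) pairs."""
    found: List[Tuple[str, str]] = []

    def collect(m: "re.Match[str]") -> str:
        found.append((m.group(1), m.group(2).strip()))
        return ""  # drop the tag and its content from the spoken text

    spoken = TAG_RE.sub(collect, text)
    # Collapse whitespace left behind by the removed tags.
    spoken = re.sub(r"\s+", " ", spoken).strip()
    return spoken, found


llm_output = (
    "<voice>narrator</voice> Once upon a time. "
    "<think>switch to the hero next</think>"
    "<voice>hero</voice> I will press one. <dtmf>1</dtmf>"
)
spoken, instructions = parse_encoded(llm_output)
```

Here a single response carries four instructions, and only the plain narration would be left for the TTS to speak.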

filipi87 · 2025-03-18T10:36:34Z

+- Added foundational example `35-voice-switching.py` showing how to use the new
+  `PatternPairAggregator`.


Maybe make the description more complete:

Added foundational example 35-voice-switching.py showing how to use the new
PatternPairAggregator to make the bot automatically play multiple characters with different voices.

I'm taking the chance to educate a bit:

- Added foundational example `35-voice-switching.py` showing how to use the new `PatternPairAggregator`. This example shows how to encode information for the LLM to instruct TTS voice changes, but this can be used to encode any information into the LLM response that you want to parse and use in other parts of your application.

I'll make this clear in docs too.

filipi87 · 2025-03-18T10:40:40Z

+            pattern_id: Unique identifier for this pattern pair.
+            start_pattern: Pattern that marks the beginning of content.
+            end_pattern: Pattern that marks the end of content.
+            remove_match: Whether to remove the matched content from the text.


This is a nice one.

filipi87 · 2025-03-18T10:51:14Z

+        pattern_aggregator.on_pattern_match("voice_tag", on_voice_tag)
+
+        # Set the pattern aggregator on the TTS service
+        tts._text_aggregator = pattern_aggregator


I guess we should pass the text_aggregator when creating the CartesiaTTSService, because otherwise, it feels like we are modifying a private variable.

Totally! Clearly the late night was getting to me.

I've updated to:

tts = CartesiaTTSService(
    api_key=os.getenv("CARTESIA_API_KEY"),
    voice_id=VOICE_IDS["narrator"],
    text_aggregator=pattern_aggregator,
)

markbackman · 2025-03-18T11:48:34Z

Thanks for the quick review @filipi87! This should be ready again.

filipi87

Pretty cool. 🚀

aconchillo · 2025-03-18T15:33:01Z

+            voice_name = match.content.strip().lower()
+            if voice_name in VOICE_IDS:
+                voice_id = VOICE_IDS[voice_name]
+                tts.set_voice(voice_id)


I think this is fine since the processor executing this code is actually the TTS. In general, we would want to use frames. Maybe it's worth adding a comment about this.

This needs to execute very quickly, so the method is the best way to go, I think. Even with the method, sometimes it's not fast enough.
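
For readers following along, the handler under discussion can be sketched in isolation. This is a rough illustration only: `FakeMatch` and `FakeTTS` are hypothetical stand-ins, not Pipecat classes, the voice IDs are placeholders, and the handler is written as a plain synchronous function, which is an assumption (the diff does not show whether the real handler is async).

```python
from dataclasses import dataclass

# Placeholder voice IDs; real IDs would come from the TTS provider.
VOICE_IDS = {"narrator": "voice-id-narrator", "hero": "voice-id-hero"}


@dataclass
class FakeMatch:
    """Hypothetical stand-in for the match object passed to the handler."""

    content: str


class FakeTTS:
    """Hypothetical stand-in for a TTS service that exposes set_voice."""

    def __init__(self, voice_id: str) -> None:
        self.voice_id = voice_id

    def set_voice(self, voice_id: str) -> None:
        self.voice_id = voice_id


tts = FakeTTS(VOICE_IDS["narrator"])


def on_voice_tag(match: FakeMatch) -> None:
    # The handler runs inside the TTS's own text processing, so it calls the
    # method directly instead of pushing a frame through the pipeline.
    voice_name = match.content.strip().lower()
    if voice_name in VOICE_IDS:
        tts.set_voice(VOICE_IDS[voice_name])
    # Unknown names are ignored, leaving the current voice unchanged.


on_voice_tag(FakeMatch("  Hero "))
after_known = tts.voice_id
on_voice_tag(FakeMatch("villain"))
after_unknown = tts.voice_id
```

The direct call keeps the switch as close as possible to the text that follows the tag, which is the latency concern raised above.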

aconchillo · 2025-03-18T15:50:33Z

+                    processed_text = processed_text.replace(full_match, "", 1)
+                    modified = True
+
+        return processed_text, modified


Maybe this could have gone to utils.string. The function signature would allow passing the set of pairs (start_tag, end_tag, remove_match).

Happy to move it if you feel strongly. I don't really have a preference.
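
As a rough idea of what the suggested standalone helper could look like, here is a sketch that takes a set of `(start_tag, end_tag, remove_match)` tuples and returns `(processed_text, modified)`, mirroring the return value in the diff. The name `strip_pattern_pairs` and its exact behavior are assumptions; this is not the code that was merged, and it skips handler dispatch entirely.

```python
from typing import Iterable, Tuple


def strip_pattern_pairs(
    text: str, pairs: Iterable[Tuple[str, str, bool]]
) -> Tuple[str, bool]:
    """Remove complete start...end spans for pairs whose remove_match is True."""
    processed = text
    modified = False
    for start, end, remove_match in pairs:
        # Skip pairs that should be kept, and guard against empty delimiters.
        if not remove_match or not start or not end:
            continue
        search_from = 0
        while True:
            begin = processed.find(start, search_from)
            if begin == -1:
                break
            finish = processed.find(end, begin + len(start))
            if finish == -1:
                break  # incomplete pair: leave it untouched for later text
            processed = processed[:begin] + processed[finish + len(end):]
            modified = True
            search_from = begin
    return processed, modified


removed = strip_pattern_pairs("<v>a</v>Hi <v>b</v>there", [("<v>", "</v>", True)])
incomplete = strip_pattern_pairs("keep <v>open", [("<v>", "</v>", True)])
kept = strip_pattern_pairs("<k>x</k>", [("<k>", "</k>", False)])
```

Leaving an unmatched start tag in place matters for streamed text, since the closing tag may arrive in a later chunk.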

markbackman requested review from aconchillo and filipi87 March 18, 2025 03:30

filipi87 reviewed Mar 18, 2025

markbackman added 4 commits March 18, 2025 07:30

Add PairPatternAggregator

e731a0d

Add foundational example 35

ddcc1fb

Add CHANGELOG entries

6ec4052

Add unit tests

2dee882

markbackman force-pushed the mb/pattern-aggregator branch from 8e17917 to 8bbb856 Compare March 18, 2025 11:47

Code review feedback

b282764

markbackman force-pushed the mb/pattern-aggregator branch from 8bbb856 to b282764 Compare March 18, 2025 11:50

markbackman mentioned this pull request Mar 18, 2025

Add PatternPairAggregator, move MarkdownTextFitler docs pipecat-ai/docs#156

Merged

filipi87 approved these changes Mar 18, 2025

markbackman merged commit 4677c34 into main Mar 18, 2025
6 checks passed

markbackman deleted the mb/pattern-aggregator branch March 18, 2025 12:46

aconchillo reviewed Mar 18, 2025

sohampirale mentioned this pull request Dec 11, 2025

🔄 Modernize storytelling-chatbot Example with Latest Pipecat Patterns pipecat-ai/pipecat-examples#126

Open

7 tasks

3 participants
