Fix RTVIObserver missing upstream-only frames#3774
Conversation
RTVIObserver previously filtered out all upstream frames to avoid duplicate messages from broadcasted frames. This caused upstream-only frames to be silently ignored. Instead, add a `broadcasted` field to the Frame base class that is set by broadcast_frame() and broadcast_frame_instance(), and only skip upstream copies of broadcasted frames.
Codecov Report❌ Patch coverage is
... and 7 files with indirect coverage changes 🚀 New features to boost your workflow:
|
| id: int = field(init=False) | ||
| name: str = field(init=False) | ||
| pts: Optional[int] = field(init=False) | ||
| broadcasted: bool = field(init=False) |
There was a problem hiding this comment.
@markbackman, after discussing this with @kompfner and @aconchillo, we agreed that instead we should store broadcasted_sibling_id, which will reference the frame ID of the other frame that has been broadcast. This will be useful for debugging in the future.
So, I will apply this change so we can include this PR in the next release.
|
Related: I had an issue with ParallelPipeline sending multiple RTVI events, because of how observers work. I wonder if it makes sense to fix this behavior in this PR, too. Claude came up with this very hacky workaround that unsets the "extra" RTVI observers: https://github.com/pipecat-ai/gradient-bang/blob/8d3c573259887eb7a6ea78ea223391b9284f5795/src/gradientbang/pipecat_server/bot.py#L361 |
Hi @kwindla, in this case it wasn’t that the RTVI events were duplicated, but that you wanted to filter out RTVI events from a specific branch (source). Is that right ? |
| id: Unique identifier for the frame instance. | ||
| name: Human-readable name combining class name and instance count. | ||
| pts: Presentation timestamp in nanoseconds. | ||
| broadcasted_sibling_id: ID of the paired frame when this frame was |
There was a problem hiding this comment.
Searching for the correct past tense, it seems broadcast is more common that broadcasted.
There was a problem hiding this comment.
Also, wondering if this should be a string instead. 🤔 . I'm just thinking mainly for Whisker. The id is hard to track, but not the name. For example, we could have id 1376434 but frame name UserSpeakingFrame#56. So, it would be broadcast_sibling_name, WDYT? I'm not sure though.
There was a problem hiding this comment.
I think that right now this is mostly for debugging, so if we think that using the name would make things easier to debug, I can do that.
However, the id still feels more natural for referencing something than the name. Maybe Whisker could simply create a link between the frames using the ID and retrieve any other information as needed ?
There was a problem hiding this comment.
I have renamed to broadcast_sibling_id, but let me know in case you think we should use the name instead.
There was a problem hiding this comment.
OK, let's use id for now.
|
LGTM |
|
@kwindla, I will push a follow up PR to filter out RTVI events from a specific branch, so we can include this one in today’s release. |
|
Follow up PR for filtering RTVI events from a specific source: |
Summary
RTVIObserversilently ignoring upstream-only frames. The observer previously filtered out all upstream frames to prevent duplicate RTVI messages from broadcasted frames (frames pushed in both directions). This meant any frame that only traveled upstream was never converted to an RTVI client message.broadcastedboolean field to the baseFrameclass, automatically set toTruebybroadcast_frame()andbroadcast_frame_instance().RTVIObserverto only skip the upstream copy of broadcasted frames, allowing upstream-only frames to be observed normally.Testing
🤖 Generated with Claude Code