feat: handle exceptions for BaseOpenAILLMService #3529

markbackman merged 2 commits into `pipecat-ai:main`.
Conversation
Codecov Report ✅ All modified and coverable lines are covered by tests.
... and 11 files with indirect coverage changes.
Good catch!
It sounds like you want an exception raised (or in Pipecat terms, an ErrorFrame pushed) when an exception is caught as part of a standard completion. That makes sense. I don't think I'd integrate it with the retry system though.
We could just update the process_frame() method to handle the exception:
```python
async def process_frame(self, frame: Frame, direction: FrameDirection):
    """Process frames for LLM completion requests.

    Handles OpenAILLMContextFrame, LLMContextFrame, LLMMessagesFrame,
    and LLMUpdateSettingsFrame to trigger LLM completions and manage
    settings.

    Args:
        frame: The frame to process.
        direction: The direction of frame processing.
    """
    await super().process_frame(frame, direction)

    context = None
    if isinstance(frame, OpenAILLMContextFrame):
        # Handle OpenAI-specific context frames
        context = frame.context
    elif isinstance(frame, LLMContextFrame):
        # Handle universal (LLM-agnostic) LLM context frames
        context = frame.context
    elif isinstance(frame, LLMMessagesFrame):
        # NOTE: LLMMessagesFrame is deprecated, so we don't support the newer
        # universal LLMContext with it
        context = OpenAILLMContext.from_messages(frame.messages)
    elif isinstance(frame, LLMUpdateSettingsFrame):
        await self._update_settings(frame.settings)
    else:
        await self.push_frame(frame, direction)

    if context:
        try:
            await self.push_frame(LLMFullResponseStartFrame())
            await self.start_processing_metrics()
            await self._process_context(context)
        except httpx.TimeoutException:
            await self._call_event_handler("on_completion_timeout")
        except Exception as e:
            await self.push_error(error_msg=f"Error during completion: {e}", exception=e)
        finally:
            await self.stop_processing_metrics()
            await self.push_frame(LLMFullResponseEndFrame())
```
The new code being:

```python
except Exception as e:
    await self.push_error(error_msg=f"Error during completion: {e}", exception=e)
```
That is, an exception would be raised as a result of the completion failure. That would be caught in process_frame, emitting an ErrorFrame which will allow application code to catch the error. GoogleLLMService follows a similar pattern.
WDYT?
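Outside of Pipecat, the propagate-and-convert pattern described above can be sketched with plain asyncio. Everything here (`ErrorFrame`, `EndFrame`, `process_context`, `run_completion`) is a hypothetical stand-in for illustration, not the real Pipecat API:

```python
import asyncio

class ErrorFrame:
    """Hypothetical stand-in for Pipecat's ErrorFrame."""
    def __init__(self, error: str):
        self.error = error

class EndFrame:
    """Hypothetical stand-in for LLMFullResponseEndFrame."""

async def process_context(fail: bool):
    # Simulates _process_context(): raise on failure instead of
    # swallowing the error.
    if fail:
        raise RuntimeError("completion failed")

async def run_completion(fail: bool) -> list:
    """Mimics the try/except/finally shape: convert an exception into
    an ErrorFrame, but always emit the end frame."""
    pushed = []
    try:
        await process_context(fail)
    except Exception as e:
        pushed.append(ErrorFrame(f"Error during completion: {e}"))
    finally:
        pushed.append(EndFrame())
    return pushed

frames = asyncio.run(run_completion(fail=True))
print([type(f).__name__ for f in frames])  # ['ErrorFrame', 'EndFrame']
```

The key property is that the `finally` block guarantees the end frame is emitted whether or not the completion failed, so downstream processors always see a complete response lifecycle.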
lukepayyapilli left a comment
@markbackman Thanks for the review and feedback! Your suggested approach of catching exceptions in process_frame() and pushing an ErrorFrame is a better fix than my original proposal.
I've intentionally kept the scope narrow to address the original issue (#3481) which specifically requested timeout handling for LLMSwitcher failover. The change only catches httpx.TimeoutException and emits an error for that case. Happy to broaden it to handle other exceptions (e.g., API errors, rate limits) if you'd like.
Also added tests to verify:
- `ErrorFrame` is pushed with the correct message on timeout.
- `LLMFullResponseEndFrame` is still pushed in the `finally` block.
- `on_completion_timeout` handler is still called.
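A minimal, self-contained sketch of what tests like these might look like, using a fake service rather than Pipecat's real test harness (`TimeoutException` here is a plain local exception standing in for `httpx.TimeoutException`, and `FakeService` is hypothetical):

```python
import asyncio
import unittest

class TimeoutException(Exception):
    """Stand-in for httpx.TimeoutException."""

class FakeService:
    """Stand-in mimicking the process_frame() error handling."""
    def __init__(self, error=None):
        self._error = error
        self.pushed = []   # frames "pushed" downstream
        self.events = []   # event handlers "called"

    async def _process_context(self):
        if self._error:
            raise self._error

    async def run(self):
        try:
            self.pushed.append("LLMFullResponseStartFrame")
            await self._process_context()
        except TimeoutException:
            self.events.append("on_completion_timeout")
        except Exception as e:
            self.pushed.append(f"ErrorFrame: Error during completion: {e}")
        finally:
            self.pushed.append("LLMFullResponseEndFrame")

class TestErrorHandling(unittest.TestCase):
    def test_timeout_calls_handler_and_ends_response(self):
        svc = FakeService(error=TimeoutException("slow"))
        asyncio.run(svc.run())
        self.assertIn("on_completion_timeout", svc.events)
        self.assertEqual(svc.pushed[-1], "LLMFullResponseEndFrame")

    def test_generic_error_pushes_error_frame(self):
        svc = FakeService(error=RuntimeError("boom"))
        asyncio.run(svc.run())
        self.assertIn("ErrorFrame: Error during completion: boom", svc.pushed)
        self.assertEqual(svc.pushed[-1], "LLMFullResponseEndFrame")

unittest.main(exit=False, argv=["fake_service_tests"])
```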
markbackman left a comment
Very nice! I like your idea about TimeoutException. Just one suggestion about broadening the handling for Exception.
Also, thanks for tests 🙏
@markbackman - done! Added the catch-all Exception handler as suggested. Also added a test to verify the general exception handling works correctly. Please let me know if you'd like me to change anything else and thank you for the suggestions!
Summary

Enables `retry_timeout_secs` to cause failure (not just retry) when `retry_on_timeout=False`, allowing integration with `LLMSwitcher` for slow response handling. Closes #3481.
Changes
- When `retry_timeout_secs` is explicitly set, the timeout now applies regardless of `retry_on_timeout`.
- With `retry_on_timeout=False`, the `TimeoutError` propagates (enabling `LLMSwitcher` failover).

Behavior Matrix
| `retry_timeout_secs` | `retry_on_timeout` | Behavior |
| --- | --- | --- |
| `None` (default) | `False` (default) | No explicit timeout applied |
| `None` | `True` | No explicit timeout applied |
| set | `False` | `TimeoutError` propagates (enables failover) |
| set | `True` | Retries on timeout |

Usage
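The described timeout semantics can be modeled with plain asyncio. This is an illustrative sketch, not the actual `BaseOpenAILLMService` implementation; `complete_with_timeout` and `_slow_completion` are hypothetical names:

```python
import asyncio

async def _slow_completion():
    await asyncio.sleep(10)  # pretend the LLM is very slow
    return "done"

async def complete_with_timeout(retry_timeout_secs, retry_on_timeout, max_retries=1):
    """Hypothetical model: when retry_timeout_secs is set and
    retry_on_timeout is False, the TimeoutError propagates to the
    caller, so an LLMSwitcher-style wrapper can fail over."""
    attempts = 0
    while True:
        try:
            if retry_timeout_secs is None:
                return await _slow_completion()  # no timeout enforced
            return await asyncio.wait_for(_slow_completion(), retry_timeout_secs)
        except asyncio.TimeoutError:
            if retry_on_timeout and attempts < max_retries:
                attempts += 1
                continue  # retry the completion
            raise         # propagate: caller decides what to do next

try:
    asyncio.run(complete_with_timeout(0.01, retry_on_timeout=False))
    outcome = "completed"
except asyncio.TimeoutError:
    outcome = "timed out"
print(outcome)  # timed out
```

With `retry_on_timeout=True` the sketch retries up to `max_retries` times before the final `TimeoutError` escapes, matching the "retry" column of the matrix above.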
Design Considerations
The parameter name `retry_timeout_secs` is admittedly misleading since it now controls non-retry behavior too. I considered renaming to `timeout_secs` but opted for backwards compatibility. A future cleanup could deprecate `retry_timeout_secs` in favor of the clearer `timeout_secs`.
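One common way to stage such a rename is a deprecation window that accepts both names. This is a hypothetical sketch; `timeout_secs` and `LLMServiceConfig` do not exist in Pipecat today:

```python
import warnings

class LLMServiceConfig:
    """Hypothetical sketch: accept the old parameter name with a
    DeprecationWarning while the new name takes precedence."""
    def __init__(self, *, timeout_secs=None, retry_timeout_secs=None):
        if retry_timeout_secs is not None:
            warnings.warn(
                "retry_timeout_secs is deprecated; use timeout_secs instead",
                DeprecationWarning,
                stacklevel=2,
            )
            if timeout_secs is None:
                timeout_secs = retry_timeout_secs
        self.timeout_secs = timeout_secs

with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    cfg = LLMServiceConfig(retry_timeout_secs=5.0)
print(cfg.timeout_secs, len(caught))  # 5.0 1
```

Existing callers keep working (with a warning) for a release or two, after which `retry_timeout_secs` can be removed outright.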