
Fix LiteLLM infinite loop issue with Ollama/Gemma3 models #166


Open · wants to merge 1 commit into base: main

Conversation

LovepreetSinghVerma

Problem
When using ADK with LiteLLM and certain models (particularly Ollama/Gemma3), the system can get stuck in an infinite loop under specific conditions:

  • The model makes a function call with arguments
  • The function executes and returns a result
  • The model tries to make another call to the same function, but with malformed JSON in the arguments
  • Due to parsing failures, the system gets stuck repeating the same function call indefinitely

This creates a poor user experience and wastes resources as the model repeatedly attempts the same operation without making progress.
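To make the failure mode concrete, here is a minimal illustration (hypothetical payload, not taken from this PR) of why strict JSON parsing rejects the Python-style argument strings these models sometimes emit:

```python
import json

# Well-formed JSON arguments parse cleanly:
args = json.loads('{"city": "Paris"}')

# A Python-style payload (single-quoted keys) is rejected by strict
# JSON parsing; without a fallback, the failed call can be retried forever:
try:
    json.loads("{'city': 'Paris'}")
except json.JSONDecodeError:
    args = None  # parsing failed; the model re-attempts the same call
```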

Solution
This PR implements two key fixes to address the issue:

  1. Robust JSON Parsing: Enhanced the response parsing logic to handle malformed JSON in function call arguments, with multiple fallback strategies.
  2. Loop Detection Mechanism: Added functionality to detect and break potential infinite loops when the same function is called consecutively more than a configurable threshold (default: 5 times).
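A minimal sketch of what such fallback parsing could look like; the helper name and the exact strategy order are assumptions for illustration, not the PR's actual code:

```python
import ast
import json


def parse_function_args(raw: str) -> dict:
    """Parse function-call arguments, tolerating malformed JSON."""
    if not raw:
        return {}
    # Strategy 1: standard JSON.
    try:
        parsed = json.loads(raw)
        return parsed if isinstance(parsed, dict) else {}
    except json.JSONDecodeError:
        pass
    # Strategy 2: Python-literal style (e.g. single-quoted keys).
    try:
        parsed = ast.literal_eval(raw)
        return parsed if isinstance(parsed, dict) else {}
    except (ValueError, SyntaxError):
        pass
    # Strategy 3: degrade gracefully to empty arguments instead of crashing.
    return {}
```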

Implementation Details

  1. Improved _model_response_to_generate_content_response with comprehensive validation and error handling
  2. Added loop tracking state to the LiteLlm class
  3. Enhanced generate_content_async to monitor consecutive function calls and provide helpful responses when loops are detected
  4. Maintained backward compatibility with all existing functionality
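The loop-tracking state described in items 2 and 3 above can be sketched as follows; apart from `_loop_threshold`, which the PR names, the class and method names here are illustrative:

```python
from typing import Optional


class LoopDetector:
    """Tracks consecutive calls to the same function and flags loops."""

    def __init__(self, threshold: int = 5):
        self._loop_threshold = threshold
        self._last_function: Optional[str] = None
        self._consecutive_calls = 0

    def record_call(self, function_name: str) -> bool:
        """Record a call; return True once the loop threshold is exceeded."""
        if function_name == self._last_function:
            self._consecutive_calls += 1
        else:
            self._last_function = function_name
            self._consecutive_calls = 1
        return self._consecutive_calls > self._loop_threshold
```

A detector like this would reset whenever a different function is called, so only uninterrupted repetition of one function trips the threshold.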

Testing
The fix includes:

  1. Unit tests for robust JSON parsing (tests/litellm/test_litellm_patch.py)
  2. System tests for loop detection with simulated conversations (tests/litellm/system_test_litellm.py)
  3. Manual testing with various edge cases

Documentation

  1. Added detailed code documentation
  2. Created comprehensive developer guide entry (docs/developer_guide.md)
  3. Added specific documentation for the fix (docs/litellm_loop_fix.md)
  4. Updated the changelog

Impact and Considerations
This change:

  1. Does not modify any external APIs or interfaces
  2. Has minimal performance impact (<5ms per request)
  3. May interrupt legitimate use cases where a function is called more than 5 times consecutively (configurable via _loop_threshold)

@gsarthakdev (Contributor) left a comment


Docstring updates

Comment on lines +381 to +392
This enhanced version:
1. Adds validation for required fields with proper defaults
2. Improves JSON parsing to handle both standard JSON and Python-style strings
3. Implements comprehensive error handling to prevent crashes
4. Maintains compatibility with all model formats

Implementation note:
This function is part of the fix for an infinite loop issue that occurs when using
Ollama/Gemma3 models with LiteLLM. These models sometimes return malformed JSON in
function call arguments, which can cause the system to get stuck in a loop.
The robust parsing ensures that even with malformed JSON, we can still extract
valid arguments and prevent failures.

Given the semantics and style of the rest of the codebase, and the fact that you have already provided a separate file documenting these changes, I believe this portion of the function docstring is not necessary.

Comment on lines +717 to +737
This enhanced version:
1. Tracks consecutive calls to the same function
2. Breaks potential infinite loops after a threshold
3. Provides a helpful response when a loop is detected
4. Maintains compatibility with the original method

Implementation details:
The loop detection mechanism addresses an issue that can occur with certain models
(particularly Ollama/Gemma3), where the model gets stuck repeatedly calling the same
function without making progress. This commonly happens when:

- The model receives malformed JSON responses it cannot parse
- The model gets into a repetitive pattern of behavior
- The model misunderstands function results and keeps trying the same approach

When the same function is called consecutively more than the threshold number of times
(default: 5), the loop detection mechanism interrupts the loop and provides a helpful
response to the user instead of continuing to call the model.

This prevents wasted resources and improves user experience by avoiding situations
where the system would otherwise become unresponsive.

Given the semantics and style of the rest of the codebase, and the fact that you have already provided a separate file documenting these changes, I believe this portion of the function docstring is not necessary.
