Implement rate limit handling in execute_graph_generation with exponential backoff #7

@chigwell

User Story
As a software developer using the eknowledge package,
I want the execute_graph_generation retry logic to handle LLM API rate limits
so that the service avoids temporary bans during periods of heavy usage.


Background
The current retry logic in eknowledge/main.py (lines 84-145) blindly retries failed LLM API calls without accounting for rate limits. This creates significant risk during high-volume graph generation, as repeated rapid retries could trigger API bans from providers like LLM7. The sleep_time parameter (line 144) is applied as a fixed delay between retries and does not adapt to API feedback.

Key risks:

  • Service disruptions from API bans
  • Wasted compute resources during rate-limited states
  • No integration with standard rate-limit headers (e.g., X-RateLimit-Remaining)
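To make the last risk concrete, here is a minimal sketch of what header parsing could look like. The function names (parse_rate_limit_headers, parse_retry_after) are hypothetical and do not exist in eknowledge. It also assumes the provider sends the conventional X-RateLimit-Limit / X-RateLimit-Remaining headers and a Retry-After header, which has not been confirmed for LLM7.

```python
from datetime import datetime, timezone
from email.utils import parsedate_to_datetime
from typing import Mapping, Optional


def _to_int(value: Optional[str]) -> Optional[int]:
    """Return value as an int, or None if it is missing or malformed."""
    try:
        return int(value) if value is not None else None
    except ValueError:
        return None


def parse_retry_after(value: Optional[str],
                      now: Optional[datetime] = None) -> Optional[float]:
    """Convert a Retry-After value (delta-seconds or HTTP-date) to seconds."""
    if value is None:
        return None
    value = value.strip()
    if value.isdigit():
        return float(value)
    try:
        target = parsedate_to_datetime(value)
    except (TypeError, ValueError):
        return None  # unparseable header: let the caller fall back to backoff
    if target.tzinfo is None:
        target = target.replace(tzinfo=timezone.utc)
    now = now or datetime.now(timezone.utc)
    return max(0.0, (target - now).total_seconds())


def parse_rate_limit_headers(headers: Mapping[str, str]) -> dict:
    """Hypothetical helper: extract rate-limit hints from response headers."""
    lowered = {k.lower(): v for k, v in headers.items()}  # header names are case-insensitive
    return {
        "limit": _to_int(lowered.get("x-ratelimit-limit")),
        "remaining": _to_int(lowered.get("x-ratelimit-remaining")),
        "retry_after": parse_retry_after(lowered.get("retry-after")),
    }
```

Missing or malformed headers come back as None rather than raising, so the retry loop can fall back to its own computed delay when the API gives no usable hint.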

Acceptance Criteria

  • Modify execute_graph_generation in eknowledge/main.py to:
    • Implement exponential backoff with jitter for retries
    • Parse standard rate-limit headers from LLM API responses
    • Add 429 Too Many Requests error handling in the exception block (line 129)
  • Add rate limit tracking that:
    • Resets the retry counter when receiving Retry-After headers
    • Calculates dynamic delays using min(sleep_time * 2^retry_count, max_backoff)
  • Validate through tests that:
    • Mocked rate-limited responses trigger appropriate delays
    • Concurrent executions stay below 80% of allowed API limits
    • Warnings log when approaching rate limits (verbose mode)
  • Update USER_PROMPT documentation in prompts.py to mention rate limit awareness
  • Add unit tests in tests/test_eknowledge.py verifying:
    • Header parsing logic
    • Backoff calculation correctness
    • Retry suspension during rate-limited states
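The backoff criteria above could be sketched roughly as follows. This is not the existing execute_graph_generation code: RateLimitError, backoff_delay, and call_with_backoff are hypothetical names, and "full jitter" (a uniform draw between 0 and the capped delay) is one possible jitter strategy, since the issue does not specify one. The sketch reads "resets the retry counter" as resetting only the backoff exponent, so the total number of attempts stays bounded.

```python
import random
import time
from typing import Callable, Optional


class RateLimitError(Exception):
    """Hypothetical exception representing an HTTP 429 response."""

    def __init__(self, retry_after: Optional[float] = None):
        super().__init__("429 Too Many Requests")
        self.retry_after = retry_after  # seconds, if the API sent Retry-After


def backoff_delay(sleep_time: float, retry_count: int, max_backoff: float,
                  rng: Callable[[], float] = random.random) -> float:
    """min(sleep_time * 2^retry_count, max_backoff), scaled by full jitter."""
    ceiling = min(sleep_time * 2 ** retry_count, max_backoff)
    return ceiling * rng()


def call_with_backoff(call: Callable[[], object],
                      max_retries: int = 5,
                      sleep_time: float = 1.0,
                      max_backoff: float = 60.0,
                      sleep: Callable[[float], None] = time.sleep,
                      rng: Callable[[], float] = random.random):
    """Retry `call` on RateLimitError, honouring Retry-After when present."""
    retry_count = 0  # backoff exponent; separate from the attempt budget
    for attempt in range(max_retries + 1):
        try:
            return call()
        except RateLimitError as exc:
            if attempt == max_retries:
                raise  # attempt budget exhausted
            if exc.retry_after is not None:
                delay = exc.retry_after  # trust the server's hint
                retry_count = 0          # and restart the exponential ramp
            else:
                delay = backoff_delay(sleep_time, retry_count, max_backoff, rng)
                retry_count += 1
            sleep(delay)
```

Passing sleep and rng as parameters keeps the function testable: a unit test can substitute a list's append method for sleep and a constant function for rng, then assert on the exact delays without waiting.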
