User Story
As a software developer using the eknowledge package,
I want the execute_graph_generation retry logic to handle LLM API rate limits
so that the service avoids temporary bans during periods of heavy usage.
Background
The current retry logic in eknowledge/main.py (lines 84-145) blindly retries failed LLM API calls without accounting for rate limits. This creates significant risk during high-volume graph generation, as repeated rapid retries could trigger API bans from providers like LLM7. The sleep_time parameter (line 144) uses a fixed delay that doesn't adapt to API feedback.
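Because the fix hinges on reacting to API feedback, it helps to recall that the Retry-After header (RFC 9110) can carry either a number of delay-seconds or an HTTP-date. A minimal, hypothetical parsing helper (not part of eknowledge today) might look like:

```python
# Hypothetical helper: parse a Retry-After header into a delay in seconds.
# Retry-After may be delay-seconds ("120") or an HTTP-date (RFC 9110).
from datetime import datetime, timezone
from email.utils import parsedate_to_datetime


def parse_retry_after(value: str) -> float:
    try:
        # Most providers send plain delay-seconds.
        return float(value)
    except ValueError:
        # Fall back to HTTP-date; clamp past dates to zero delay.
        when = parsedate_to_datetime(value)
        return max(0.0, (when - datetime.now(timezone.utc)).total_seconds())
```

Either form resolves to "how many seconds to wait", which is what the retry loop ultimately needs.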
Key risks:
- Service disruptions from API bans
- Wasted compute resources during rate-limited states
- No integration with standard rate-limit headers (e.g., X-RateLimit-Remaining)
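The shape of the desired behavior can be sketched as a retry loop that prefers a server-suggested delay and otherwise falls back to capped exponential backoff. This is an illustrative sketch only, not the eknowledge implementation; RateLimitError, backoff_delay, and call_with_retry are hypothetical names:

```python
import random
import time


class RateLimitError(Exception):
    """Stand-in for a provider's 429 Too Many Requests error."""

    def __init__(self, retry_after=None):
        super().__init__("429 Too Many Requests")
        self.retry_after = retry_after  # delay in seconds suggested by the server


def backoff_delay(sleep_time, retry_count, max_backoff=60.0):
    """Exponential backoff, capped at max_backoff, with light jitter."""
    delay = min(sleep_time * (2 ** retry_count), max_backoff)
    return delay + random.uniform(0, 0.1 * delay)


def call_with_retry(call, sleep_time=1.0, max_retries=5, max_backoff=60.0):
    """Retry `call` on rate-limit errors, preferring server-suggested delays."""
    for retry_count in range(max_retries):
        try:
            return call()
        except RateLimitError as exc:
            # Honor Retry-After when present; otherwise back off exponentially.
            delay = (exc.retry_after if exc.retry_after is not None
                     else backoff_delay(sleep_time, retry_count, max_backoff))
            time.sleep(delay)
    raise RuntimeError("exhausted retries while rate limited")
```

The jitter is a common refinement to avoid many clients retrying in lockstep; it is optional relative to the criteria below.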
Acceptance Criteria
Modify execute_graph_generation in eknowledge/main.py to:
- Add explicit 429 Too Many Requests error handling in the exception block (line 129)
- Respect Retry-After headers returned by the API
- Replace the fixed delay with exponential backoff: min(sleep_time * 2^retry_count, max_backoff)
Update the USER_PROMPT documentation in prompts.py to mention rate limit awareness.
Add tests in tests/test_eknowledge.py verifying:
- 429 responses trigger backoff rather than an immediate retry
- Retry-After values are honored when present
- The backoff delay never exceeds max_backoff
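A test for the backoff criterion could look like the following pytest-style sketch. The compute_backoff helper is a hypothetical stand-in for the backoff logic the criteria describe, included here so the example is self-contained:

```python
# Hypothetical stand-in for the backoff logic from the acceptance criteria.
def compute_backoff(sleep_time, retry_count, max_backoff=60.0):
    return min(sleep_time * (2 ** retry_count), max_backoff)


def test_backoff_doubles_and_is_capped():
    delays = [compute_backoff(1.0, n) for n in range(10)]
    assert delays[:4] == [1.0, 2.0, 4.0, 8.0]  # doubles on each retry
    assert max(delays) == 60.0                 # never exceeds max_backoff
    assert delays == sorted(delays)            # non-decreasing


test_backoff_doubles_and_is_capped()
```

The tests for 429 handling and Retry-After would follow the same pattern, stubbing the LLM client to raise rate-limit errors and asserting on the chosen delay.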