
Commit 936964f

feat(responder): implement OpenTelemetry tracing integration for comprehensive observability
- Add design document for Responder OTel integration with detailed architecture
- Create new tracing-related private methods in Responder class
- Implement context creation, success, and error tracing methods
- Add methods for extracting usage, calculating costs, and preparing trace metadata
- Prepare foundational structure for non-invasive telemetry in async response handling
- Support multiple OTel clients with flexible tracing context management
- Enhance observability without compromising core API functionality
1 parent 72d8c62 commit 936964f

File tree

15 files changed: +2570 −82 lines changed


.kiro/specs/responder-otel-integration/design.md

Lines changed: 508 additions & 0 deletions
Large diffs are not rendered by default.
Lines changed: 90 additions & 0 deletions
@@ -0,0 +1,90 @@
# Requirements Document

## Introduction

This feature adds OpenTelemetry (OTel) observability integration to the `Responder` class in Agentle. The integration will enable tracing, metrics collection, and cost tracking for API calls made through the Responder, similar to how the `GenerationProvider` class implements observability through the `@observe` decorator. However, this implementation will be done directly within the class methods rather than using a decorator pattern.

## Glossary

- **Responder**: The Agentle class responsible for making API calls to the OpenRouter/OpenAI Responses API
- **OtelClient**: Abstract interface for OpenTelemetry clients that handle tracing and observability
- **Trace Context**: A context object that tracks a complete operation from start to finish
- **Generation Context**: A context object that tracks a specific AI model invocation
- **Responses API**: OpenRouter/OpenAI's Responses API for structured AI interactions
- **Structured Output**: Parsed Pydantic model output from AI responses
- **Streaming Response**: Server-sent events (SSE) based response delivery
- **Non-streaming Response**: Single HTTP response with complete data

## Requirements

### Requirement 1

**User Story:** As a developer using Agentle, I want the Responder class to automatically trace all API calls, so that I can monitor performance, costs, and errors in my observability platform.

#### Acceptance Criteria

1. WHEN a Responder instance is created with otel_clients, THE Responder SHALL store the clients for use in tracing operations
2. WHEN respond_async is called, THE Responder SHALL create trace and generation contexts for each configured OtelClient
3. WHEN an API call completes successfully, THE Responder SHALL update the trace and generation contexts with response data, usage metrics, and cost information
4. WHEN an API call fails, THE Responder SHALL record the error in the trace and generation contexts
5. WHERE multiple OtelClients are configured, THE Responder SHALL send telemetry data to all clients without blocking the main execution

### Requirement 2

**User Story:** As a developer, I want cost and usage metrics automatically calculated and tracked, so that I can monitor the financial impact of my API usage.

#### Acceptance Criteria

1. WHEN a non-streaming response is received, THE Responder SHALL extract token usage from the response
2. WHEN token usage is available, THE Responder SHALL calculate input and output costs based on the model pricing
3. WHEN costs are calculated, THE Responder SHALL include them in the trace metadata with currency information
4. WHEN usage details are available, THE Responder SHALL include token counts in the generation context
5. THE Responder SHALL handle missing usage data gracefully without failing the request
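
Worked example (the prices here are illustrative assumptions, not values from this commit): at $0.15 per million input tokens and $0.60 per million output tokens, a response that used 1,200 input tokens and 350 output tokens would be costed as 1,200 / 1,000,000 × 0.15 + 350 / 1,000,000 × 0.60 ≈ $0.00039, recorded with a per-direction breakdown and an explicit `"USD"` currency field as criteria 2 and 3 require.
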
### Requirement 3

**User Story:** As a developer, I want streaming responses to be traced with accumulated metrics, so that I can observe the complete interaction even when data arrives incrementally.

#### Acceptance Criteria

1. WHEN streaming is enabled, THE Responder SHALL create trace and generation contexts before streaming begins
2. WHILE streaming events are processed, THE Responder SHALL accumulate text content for structured output parsing
3. WHEN a response.completed event is received, THE Responder SHALL update contexts with final metrics and parsed output
4. WHEN streaming fails, THE Responder SHALL record the error with accumulated data up to the failure point
5. THE Responder SHALL ensure contexts are properly closed after streaming completes or fails

### Requirement 4

**User Story:** As a developer, I want the tracing implementation to be non-blocking and resilient, so that observability failures don't impact my application's core functionality.

#### Acceptance Criteria

1. WHEN an OtelClient operation fails, THE Responder SHALL log the error and continue execution
2. WHEN multiple OtelClients are configured, THE Responder SHALL process each client independently
3. IF one OtelClient fails, THEN THE Responder SHALL continue processing remaining clients
4. THE Responder SHALL use fire-and-forget patterns for non-critical telemetry operations
5. THE Responder SHALL ensure all contexts are cleaned up even when errors occur
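
The "fire-and-forget" wording in criterion 4 maps to a helper the implementation plan later imports from `rsb.coroutines`; that helper's exact signature is not part of this diff, so the sketch below uses plain `asyncio` only to illustrate the pattern the requirement describes: schedule the telemetry coroutine, log any failure, and never let it reach the caller.

```python
import asyncio
import logging
from collections.abc import Coroutine
from typing import Any

logger = logging.getLogger(__name__)


def fire_and_forget_telemetry(coro: Coroutine[Any, Any, Any]) -> None:
    """Schedule a telemetry coroutine without awaiting it; failures are logged, never raised."""
    task = asyncio.ensure_future(coro)

    def _log_failure(finished: asyncio.Task[Any]) -> None:
        # Criterion 1: log the error and continue execution.
        if not finished.cancelled() and finished.exception() is not None:
            logger.warning("Telemetry update failed: %s", finished.exception())

    task.add_done_callback(_log_failure)
```
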
### Requirement 5

**User Story:** As a developer, I want detailed metadata captured in traces, so that I can debug issues and understand the context of each API call.

#### Acceptance Criteria

1. WHEN creating a trace context, THE Responder SHALL include model name, provider, and request parameters
2. WHEN a response includes structured output, THE Responder SHALL include the parsed data in trace metadata
3. WHEN reasoning is used, THE Responder SHALL track reasoning tokens separately from output tokens
4. WHEN tool calls are made, THE Responder SHALL include tool usage information in the trace
5. THE Responder SHALL include request and response timestamps for latency calculation

### Requirement 6

**User Story:** As a developer, I want consistent tracing behavior between streaming and non-streaming modes, so that I can analyze both types of requests uniformly.

#### Acceptance Criteria

1. THE Responder SHALL use the same trace structure for streaming and non-streaming requests
2. THE Responder SHALL calculate the same metrics (cost, usage, latency) for both modes
3. THE Responder SHALL handle structured output parsing consistently in both modes
4. THE Responder SHALL apply the same error handling patterns in both modes
5. THE Responder SHALL ensure trace metadata format is identical regardless of streaming mode

Lines changed: 145 additions & 0 deletions
@@ -0,0 +1,145 @@
# Implementation Plan

- [x] 1. Create PricingService infrastructure
  - Create `agentle/responses/pricing/pricing_service.py` with abstract PricingService class
  - Create `agentle/responses/pricing/default_pricing_service.py` with static pricing dictionary
  - Create `agentle/responses/pricing/__init__.py` to export classes
  - _Requirements: 2.2, 2.3_
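
A minimal sketch of the two pricing classes task 1 describes. The module paths and class names come from the task; the `ModelPricing` shape, the price figures, and the per-million-token convention are assumptions for illustration only.

```python
# Sketch of agentle/responses/pricing/pricing_service.py; the real interface may differ.
import abc


class ModelPricing:
    """Per-million-token prices in USD (assumed convention)."""

    def __init__(self, input_price: float, output_price: float) -> None:
        self.input_price = input_price
        self.output_price = output_price


class PricingService(abc.ABC):
    @abc.abstractmethod
    def get_pricing(self, model: str) -> ModelPricing | None:
        """Return pricing for a model, or None when the model is unknown."""


# Sketch of agentle/responses/pricing/default_pricing_service.py; prices are placeholders.
class DefaultPricingService(PricingService):
    _PRICING: dict[str, ModelPricing] = {
        "openai/gpt-4o-mini": ModelPricing(0.15, 0.60),
    }

    def get_pricing(self, model: str) -> ModelPricing | None:
        return self._PRICING.get(model)
```
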
- [x] 2. Implement tracing helper methods in Responder
  - [x] 2.1 Implement `_prepare_trace_input_data` method
    - Extract relevant fields from request_payload (input, model, tools, reasoning, etc.)
    - Return structured dictionary for trace context
    - _Requirements: 5.1, 5.3, 5.4_
  - [x] 2.2 Implement `_prepare_trace_metadata` method
    - Extract model, provider, base_url
    - Merge custom metadata from generation_config
    - Return metadata dictionary
    - _Requirements: 5.1, 5.2_
  - [x] 2.3 Implement `_extract_usage_from_response` method
    - Extract token usage from Response object
    - Handle missing usage data gracefully
    - Return usage dictionary with input/output/total tokens
    - _Requirements: 2.1, 2.5_
  - [x] 2.4 Implement `_calculate_costs` method
    - Use PricingService to get model pricing
    - Calculate input and output costs
    - Return cost dictionary with breakdown
    - Handle unknown models gracefully
    - _Requirements: 2.2, 2.3, 2.4_
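
A sketch of the two helpers from tasks 2.3 and 2.4, reusing the `PricingService`/`ModelPricing` sketch above. The `usage.input_tokens`/`output_tokens`/`total_tokens` attribute names are assumptions about the Response model, not confirmed by this diff; the part that matters is the graceful `None` path when usage or pricing is missing.

```python
from typing import Any


def _extract_usage_from_response(response: Any) -> dict[str, int] | None:
    """Task 2.3: pull token counts off the Response object, returning None when usage is absent."""
    usage = getattr(response, "usage", None)
    if usage is None:
        return None
    return {
        "input_tokens": getattr(usage, "input_tokens", 0) or 0,
        "output_tokens": getattr(usage, "output_tokens", 0) or 0,
        "total_tokens": getattr(usage, "total_tokens", 0) or 0,
    }


def _calculate_costs(
    pricing_service: "PricingService", model: str, usage: dict[str, int]
) -> dict[str, Any] | None:
    """Task 2.4: turn token counts into a cost breakdown; unknown models yield None, not an error."""
    pricing = pricing_service.get_pricing(model)
    if pricing is None:
        return None
    input_cost = usage["input_tokens"] / 1_000_000 * pricing.input_price
    output_cost = usage["output_tokens"] / 1_000_000 * pricing.output_price
    return {
        "input_cost": input_cost,
        "output_cost": output_cost,
        "total_cost": input_cost + output_cost,
        "currency": "USD",
    }
```
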
- [x] 3. Implement tracing context management methods
  - [x] 3.1 Implement `_create_tracing_contexts` method
    - Iterate through all otel_clients
    - Create trace context for each client
    - Create generation context for each client
    - Store contexts in list of dictionaries
    - Handle context creation errors gracefully
    - _Requirements: 1.2, 1.5, 4.1, 4.2, 4.3_
  - [x] 3.2 Implement `_update_tracing_success` method
    - Extract usage from response
    - Calculate costs using PricingService
    - Update generation contexts with output, usage, and costs
    - Update trace contexts with success status and metadata
    - Handle structured output parsing
    - Use fire-and-forget for non-critical operations
    - _Requirements: 1.3, 2.1, 2.2, 2.3, 2.4, 4.4, 5.2, 5.3, 5.5_
  - [x] 3.3 Implement `_update_tracing_error` method
    - Update generation contexts with error information
    - Update trace contexts with failure status
    - Handle errors in error handling gracefully
    - _Requirements: 1.4, 4.1, 4.2, 4.3_
  - [x] 3.4 Implement `_cleanup_tracing_contexts` method
    - Close all generation context generators
    - Close all trace context generators
    - Handle cleanup errors gracefully
    - _Requirements: 3.3, 4.5_
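
A sketch of the per-client bookkeeping in tasks 3.1 and 3.4. It assumes each `OtelClient` exposes `trace(...)` and `generation(...)` factories and that the resulting contexts can be closed with `close()`; none of those names are confirmed by this diff, but the error-isolation shape (one failing client never affects the others) is what the tasks require.

```python
import logging
from typing import Any

logger = logging.getLogger(__name__)


def _create_tracing_contexts(
    otel_clients: list[Any], input_data: dict[str, Any], metadata: dict[str, Any]
) -> list[dict[str, Any]]:
    """Task 3.1: one entry per client; a failing client is logged and skipped."""
    contexts: list[dict[str, Any]] = []
    for client in otel_clients:
        try:
            contexts.append(
                {
                    "client": client,
                    "trace": client.trace(input=input_data, metadata=metadata),  # assumed factory
                    "generation": client.generation(input=input_data, metadata=metadata),  # assumed factory
                }
            )
        except Exception:
            logger.exception("Failed to create tracing context for %r", client)
    return contexts


def _cleanup_tracing_contexts(contexts: list[dict[str, Any]]) -> None:
    """Task 3.4: close generations before traces; cleanup errors never propagate."""
    for ctx in contexts:
        for key in ("generation", "trace"):
            try:
                ctx[key].close()  # assumed close() on the context generator
            except Exception:
                logger.exception("Failed to close %s context", key)
```
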
- [x] 4. Integrate tracing into non-streaming flow
  - [x] 4.1 Modify `_respond_async` method for non-streaming
    - Add start_time tracking
    - Initialize active_contexts list
    - Call `_create_tracing_contexts` if otel_clients present
    - Wrap API call in try-except-finally
    - Call `_update_tracing_success` on success
    - Call `_update_tracing_error` on error
    - Call `_cleanup_tracing_contexts` in finally block
    - _Requirements: 1.1, 1.2, 1.3, 1.4, 4.5, 6.1, 6.2, 6.3, 6.4, 6.5_
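
A stub showing only the control flow task 4.1 describes, not Agentle's actual `Responder`. The private method names follow the plan; the stub bodies and `_call_api` are placeholders so the block stands on its own.

```python
import datetime
from typing import Any


class _ResponderTracingSketch:
    """Control-flow sketch for task 4.1; every helper below is a stub."""

    otel_clients: list[Any] = []

    async def _respond_async(self, request_payload: dict[str, Any]) -> Any:
        start_time = datetime.datetime.now(datetime.timezone.utc)
        active_contexts: list[dict[str, Any]] = []
        if self.otel_clients:
            active_contexts = self._create_tracing_contexts(request_payload)
        try:
            response = await self._call_api(request_payload)  # stands in for the existing HTTP call
            self._update_tracing_success(active_contexts, response, start_time)
            return response
        except Exception as error:
            self._update_tracing_error(active_contexts, error, start_time)
            raise
        finally:
            # Requirement 4.5: contexts are cleaned up on every path.
            self._cleanup_tracing_contexts(active_contexts)

    # No-op stubs; tasks 2 and 3 define the real helpers.
    def _create_tracing_contexts(self, payload: dict[str, Any]) -> list[dict[str, Any]]:
        return []

    def _update_tracing_success(self, contexts: list, response: Any, start: datetime.datetime) -> None: ...

    def _update_tracing_error(self, contexts: list, error: BaseException, start: datetime.datetime) -> None: ...

    def _cleanup_tracing_contexts(self, contexts: list) -> None: ...

    async def _call_api(self, payload: dict[str, Any]) -> Any: ...
```
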
- [x] 5. Integrate tracing into streaming flow
  - [x] 5.1 Create `_stream_events_with_tracing` wrapper method
    - Accept content_lines, text_format, active_contexts, start_time, model
    - Wrap `_stream_events_from_buffer` generator
    - Accumulate text deltas for metrics
    - Track final ResponseCompletedEvent
    - Call `_update_tracing_success` on completion
    - Call `_update_tracing_error` on error
    - _Requirements: 3.1, 3.2, 3.3, 6.1, 6.2, 6.3, 6.4, 6.5_
  - [x] 5.2 Modify `_respond_async` method for streaming
    - Create tracing contexts before streaming
    - Pass contexts to `_stream_events_with_tracing`
    - Ensure cleanup happens after streaming completes
    - _Requirements: 3.1, 3.3, 4.5_
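
A sketch of the generator wrapper task 5.1 describes, written as a standalone async generator rather than a `Responder` method. The `delta` attribute and the `ResponseCompletedEvent` name check are assumptions about the event models; the shape that matters is: re-yield every event, accumulate text, and close the trace on completion or, on failure, with whatever was accumulated.

```python
from collections.abc import AsyncIterator, Callable
from typing import Any


async def stream_events_with_tracing(
    events: AsyncIterator[Any],
    on_success: Callable[[str, Any], None],
    on_error: Callable[[BaseException, str], None],
) -> AsyncIterator[Any]:
    """Re-yield streaming events while accumulating enough state to finalize the trace."""
    accumulated_text: list[str] = []
    completed_event: Any = None
    try:
        async for event in events:
            delta = getattr(event, "delta", None)  # attribute name is an assumption
            if isinstance(delta, str):
                accumulated_text.append(delta)  # task 5.1: accumulate text deltas for metrics
            if type(event).__name__ == "ResponseCompletedEvent":
                completed_event = event  # final event carries the usage needed for metrics
            yield event
        on_success("".join(accumulated_text), completed_event)  # maps to _update_tracing_success
    except Exception as error:
        on_error(error, "".join(accumulated_text))  # maps to _update_tracing_error (Requirement 3.4)
        raise
```
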
- [x] 6. Add PricingService to Responder initialization
  - [x] 6.1 Add `pricing_service` parameter to `__init__`
    - Make it optional with default DefaultPricingService
    - Store as instance attribute
    - _Requirements: 2.2, 2.3_
  - [x] 6.2 Update `from_openrouter` and `from_openai` class methods
    - Pass pricing_service parameter through
    - _Requirements: 2.2, 2.3_
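
Task 6.1 is the usual optional-dependency pattern with a `None` sentinel; a sketch follows (the import paths mirror task 1, but the constructor shape of `DefaultPricingService` and the rest of `__init__` are assumptions).

```python
from typing import Any

from agentle.responses.pricing.default_pricing_service import DefaultPricingService
from agentle.responses.pricing.pricing_service import PricingService


class _ResponderInitSketch:
    """Abridged; only the parameters this plan touches are shown."""

    def __init__(
        self,
        *,
        otel_clients: list[Any] | None = None,
        pricing_service: PricingService | None = None,
    ) -> None:
        self.otel_clients = otel_clients or []
        # Task 6.1: fall back to the built-in static pricing table when none is injected.
        self.pricing_service = pricing_service or DefaultPricingService()
```
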
- [x] 7. Add logging and error handling
  - [x] 7.1 Add debug logging for tracing operations
    - Log context creation
    - Log context updates
    - Log context cleanup
    - _Requirements: 4.1, 4.2, 4.3_
  - [x] 7.2 Add error logging for tracing failures
    - Log context creation failures
    - Log update failures
    - Log cleanup failures
    - Ensure errors don't propagate to caller
    - _Requirements: 4.1, 4.2, 4.3, 4.4_

- [x] 8. Update type hints and imports
  - [x] 8.1 Add necessary imports
    - Import datetime
    - Import fire_and_forget from rsb.coroutines
    - Import OtelClient types
    - Import PricingService
    - _Requirements: All_
  - [x] 8.2 Update type hints
    - Add type hints to all new methods
    - Ensure consistency with existing code style
    - _Requirements: All_

- [x] 9. Documentation and examples
  - [x] 9.1 Add docstrings to all new methods
    - Document parameters
    - Document return values
    - Document error handling behavior
    - _Requirements: All_
  - [x] 9.2 Update Responder class docstring
    - Document otel_clients parameter
    - Document pricing_service parameter
    - Document tracing behavior
    - _Requirements: All_
  - [x] 9.3 Create usage example
    - Create example showing Responder with OtelClient (see the sketch after this list)
    - Show both streaming and non-streaming usage
    - Demonstrate cost tracking
    - _Requirements: All_
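
Task 9.3 asks for a usage example; below is a hedged sketch of what it might look like. Only `Responder.from_openrouter`, `respond_async`, `otel_clients`, and `pricing_service` are names taken from this plan — the import path, the client instance, and every argument shown are placeholders rather than Agentle's confirmed API.

```python
import asyncio

from agentle.responses.responder import Responder  # assumed module path

my_otel_client = ...  # any OtelClient implementation wired to your observability backend


async def main() -> None:
    responder = Responder.from_openrouter(otel_clients=[my_otel_client])
    # pricing_service is optional and defaults to the built-in table (task 6.1).

    # Non-streaming: one trace/generation pair per client, with usage and cost
    # attached when the Response arrives (Requirements 1 and 2).
    response = await responder.respond_async(...)  # placeholder arguments

    # Streaming: the same trace structure; metrics accumulate across events and are
    # finalized on the response.completed event (Requirements 3 and 6).
    stream = await responder.respond_async(...)  # placeholder arguments, streaming enabled
    async for event in stream:
        ...


asyncio.run(main())
```
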

agentle/responses/definitions/image_gen_tool.py

Lines changed: 2 additions & 1 deletion
@@ -44,7 +44,8 @@ class ImageGenTool(BaseModel):
         description="The output format of the generated image. One of `png`, `webp`, or\n`jpeg`. Default: `png`.\n",
     )
     output_compression: Optional[Annotated[int, Field(ge=0, le=100)]] = Field(
-        default=100, description="Compression level for the output image. Default: 100.\n"
+        default=100,
+        description="Compression level for the output image. Default: 100.\n",
     )
     moderation: Optional[Moderation] = Field(
         default=Moderation.auto,

agentle/responses/definitions/input_message.py

Lines changed: 3 additions & 2 deletions
@@ -18,8 +18,9 @@
 
 
 class InputMessage(BaseModel):
-    type: Literal["InputMessage"] = Field(
-        ..., description="The type of the message input. Always set to `message`.\n"
+    type: Literal["input_message"] = Field(
+        ...,
+        description="The type of the message input. Always set to `input_message`.\n",
     )
     role: Role1 = Field(
         ...,

agentle/responses/definitions/message.py

Lines changed: 3 additions & 3 deletions
@@ -27,9 +27,9 @@ class Message(BaseModel):
         description="Text, image, or audio input to the model, used to generate a response.\nCan also contain previous assistant responses.\n",
     )
 
-    type: Literal["message"] = Field(
-        default="message",
-        description="The type of the message input. Always `message`.\n",
+    type: Literal["message_param"] = Field(
+        default="message_param",
+        description="The type of the message input. Always `message_param`.\n",
     )
 
     @classmethod

agentle/responses/definitions/text_response_format_configuration.py

Lines changed: 1 addition & 3 deletions
@@ -18,8 +18,6 @@
 
 
 TextResponseFormatConfiguration = Annotated[
-    Union[
-        ResponseFormatText, TextResponseFormatJsonSchema, ResponseFormatJsonObject
-    ],
+    Union[ResponseFormatText, TextResponseFormatJsonSchema, ResponseFormatJsonObject],
     Field(discriminator="type"),
 ]

agentle/responses/pricing/__init__.py

Whitespace-only changes.
