
Commit 936964f

feat(responder): implement OpenTelemetry tracing integration for comprehensive observability
- Add design document for Responder OTel integration with detailed architecture
- Create new tracing-related private methods in Responder class
- Implement context creation, success, and error tracing methods
- Add methods for extracting usage, calculating costs, and preparing trace metadata
- Prepare foundational structure for non-invasive telemetry in async response handling
- Support multiple OTel clients with flexible tracing context management
- Enhance observability without compromising core API functionality
1 parent 72d8c62 commit 936964f

File tree

15 files changed: +2570 −82 lines changed


.kiro/specs/responder-otel-integration/design.md

Lines changed: 508 additions & 0 deletions
Large diffs are not rendered by default.
Lines changed: 90 additions & 0 deletions
@@ -0,0 +1,90 @@
# Requirements Document

## Introduction

This feature adds OpenTelemetry (OTel) observability integration to the `Responder` class in Agentle. The integration will enable tracing, metrics collection, and cost tracking for API calls made through the Responder, similar to how the `GenerationProvider` class implements observability through the `@observe` decorator. However, this implementation will be done directly within the class methods rather than using a decorator pattern.

## Glossary

- **Responder**: The Agentle class responsible for making API calls to the OpenRouter/OpenAI Responses API
- **OtelClient**: Abstract interface for OpenTelemetry clients that handle tracing and observability
- **Trace Context**: A context object that tracks a complete operation from start to finish
- **Generation Context**: A context object that tracks a specific AI model invocation
- **Responses API**: OpenRouter/OpenAI's Responses API for structured AI interactions
- **Structured Output**: Parsed Pydantic model output from AI responses
- **Streaming Response**: Server-sent events (SSE) based response delivery
- **Non-streaming Response**: Single HTTP response with complete data

## Requirements

### Requirement 1

**User Story:** As a developer using Agentle, I want the Responder class to automatically trace all API calls, so that I can monitor performance, costs, and errors in my observability platform.

#### Acceptance Criteria

1. WHEN a Responder instance is created with otel_clients, THE Responder SHALL store the clients for use in tracing operations
2. WHEN respond_async is called, THE Responder SHALL create trace and generation contexts for each configured OtelClient
3. WHEN an API call completes successfully, THE Responder SHALL update the trace and generation contexts with response data, usage metrics, and cost information
4. WHEN an API call fails, THE Responder SHALL record the error in the trace and generation contexts
5. WHERE multiple OtelClients are configured, THE Responder SHALL send telemetry data to all clients without blocking the main execution

### Requirement 2

**User Story:** As a developer, I want cost and usage metrics automatically calculated and tracked, so that I can monitor the financial impact of my API usage.

#### Acceptance Criteria

1. WHEN a non-streaming response is received, THE Responder SHALL extract token usage from the response
2. WHEN token usage is available, THE Responder SHALL calculate input and output costs based on the model pricing
3. WHEN costs are calculated, THE Responder SHALL include them in the trace metadata with currency information
4. WHEN usage details are available, THE Responder SHALL include token counts in the generation context
5. THE Responder SHALL handle missing usage data gracefully without failing the request
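
Worked example (the prices here are illustrative assumptions, not values from this commit): at $0.15 per million input tokens and $0.60 per million output tokens, a response that used 1,200 input tokens and 350 output tokens would be costed as 1,200 / 1,000,000 × 0.15 + 350 / 1,000,000 × 0.60 ≈ $0.00039, recorded with a per-direction breakdown and an explicit `"USD"` currency field as criteria 2 and 3 require.
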
### Requirement 3

**User Story:** As a developer, I want streaming responses to be traced with accumulated metrics, so that I can observe the complete interaction even when data arrives incrementally.

#### Acceptance Criteria

1. WHEN streaming is enabled, THE Responder SHALL create trace and generation contexts before streaming begins
2. WHILE streaming events are processed, THE Responder SHALL accumulate text content for structured output parsing
3. WHEN a response.completed event is received, THE Responder SHALL update contexts with final metrics and parsed output
4. WHEN streaming fails, THE Responder SHALL record the error with accumulated data up to the failure point
5. THE Responder SHALL ensure contexts are properly closed after streaming completes or fails

### Requirement 4

**User Story:** As a developer, I want the tracing implementation to be non-blocking and resilient, so that observability failures don't impact my application's core functionality.

#### Acceptance Criteria

1. WHEN an OtelClient operation fails, THE Responder SHALL log the error and continue execution
2. WHEN multiple OtelClients are configured, THE Responder SHALL process each client independently
3. IF one OtelClient fails, THEN THE Responder SHALL continue processing remaining clients
4. THE Responder SHALL use fire-and-forget patterns for non-critical telemetry operations
5. THE Responder SHALL ensure all contexts are cleaned up even when errors occur
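
The "fire-and-forget" wording in criterion 4 maps to a helper the implementation plan later imports from `rsb.coroutines`; that helper's exact signature is not part of this diff, so the sketch below uses plain `asyncio` only to illustrate the pattern the requirement describes: schedule the telemetry coroutine, log any failure, and never let it reach the caller.

```python
import asyncio
import logging
from collections.abc import Coroutine
from typing import Any

logger = logging.getLogger(__name__)


def fire_and_forget_telemetry(coro: Coroutine[Any, Any, Any]) -> None:
    """Schedule a telemetry coroutine without awaiting it; failures are logged, never raised."""
    task = asyncio.ensure_future(coro)

    def _log_failure(finished: asyncio.Task[Any]) -> None:
        # Criterion 1: log the error and continue execution.
        if not finished.cancelled() and finished.exception() is not None:
            logger.warning("Telemetry update failed: %s", finished.exception())

    task.add_done_callback(_log_failure)
```
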
### Requirement 5

**User Story:** As a developer, I want detailed metadata captured in traces, so that I can debug issues and understand the context of each API call.

#### Acceptance Criteria

1. WHEN creating a trace context, THE Responder SHALL include model name, provider, and request parameters
2. WHEN a response includes structured output, THE Responder SHALL include the parsed data in trace metadata
3. WHEN reasoning is used, THE Responder SHALL track reasoning tokens separately from output tokens
4. WHEN tool calls are made, THE Responder SHALL include tool usage information in the trace
5. THE Responder SHALL include request and response timestamps for latency calculation

### Requirement 6

**User Story:** As a developer, I want consistent tracing behavior between streaming and non-streaming modes, so that I can analyze both types of requests uniformly.

#### Acceptance Criteria

1. THE Responder SHALL use the same trace structure for streaming and non-streaming requests
2. THE Responder SHALL calculate the same metrics (cost, usage, latency) for both modes
3. THE Responder SHALL handle structured output parsing consistently in both modes
4. THE Responder SHALL apply the same error handling patterns in both modes
5. THE Responder SHALL ensure trace metadata format is identical regardless of streaming mode

Lines changed: 145 additions & 0 deletions
@@ -0,0 +1,145 @@
# Implementation Plan

- [x] 1. Create PricingService infrastructure
  - Create `agentle/responses/pricing/pricing_service.py` with abstract PricingService class
  - Create `agentle/responses/pricing/default_pricing_service.py` with static pricing dictionary
  - Create `agentle/responses/pricing/__init__.py` to export classes
  - _Requirements: 2.2, 2.3_
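
A minimal sketch of the two pricing classes task 1 describes. The module paths and class names come from the task; the `ModelPricing` shape, the price figures, and the per-million-token convention are assumptions for illustration only.

```python
# Sketch of agentle/responses/pricing/pricing_service.py; the real interface may differ.
import abc


class ModelPricing:
    """Per-million-token prices in USD (assumed convention)."""

    def __init__(self, input_price: float, output_price: float) -> None:
        self.input_price = input_price
        self.output_price = output_price


class PricingService(abc.ABC):
    @abc.abstractmethod
    def get_pricing(self, model: str) -> ModelPricing | None:
        """Return pricing for a model, or None when the model is unknown."""


# Sketch of agentle/responses/pricing/default_pricing_service.py; prices are placeholders.
class DefaultPricingService(PricingService):
    _PRICING: dict[str, ModelPricing] = {
        "openai/gpt-4o-mini": ModelPricing(0.15, 0.60),
    }

    def get_pricing(self, model: str) -> ModelPricing | None:
        return self._PRICING.get(model)
```
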
- [x] 2. Implement tracing helper methods in Responder
  - [x] 2.1 Implement `_prepare_trace_input_data` method
    - Extract relevant fields from request_payload (input, model, tools, reasoning, etc.)
    - Return structured dictionary for trace context
    - _Requirements: 5.1, 5.3, 5.4_
  - [x] 2.2 Implement `_prepare_trace_metadata` method
    - Extract model, provider, base_url
    - Merge custom metadata from generation_config
    - Return metadata dictionary
    - _Requirements: 5.1, 5.2_
  - [x] 2.3 Implement `_extract_usage_from_response` method
    - Extract token usage from Response object
    - Handle missing usage data gracefully
    - Return usage dictionary with input/output/total tokens
    - _Requirements: 2.1, 2.5_
  - [x] 2.4 Implement `_calculate_costs` method
    - Use PricingService to get model pricing
    - Calculate input and output costs
    - Return cost dictionary with breakdown
    - Handle unknown models gracefully
    - _Requirements: 2.2, 2.3, 2.4_
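
A sketch of the two helpers from tasks 2.3 and 2.4, reusing the `PricingService`/`ModelPricing` sketch above. The `usage.input_tokens`/`output_tokens`/`total_tokens` attribute names are assumptions about the Response model, not confirmed by this diff; the part that matters is the graceful `None` path when usage or pricing is missing.

```python
from typing import Any


def _extract_usage_from_response(response: Any) -> dict[str, int] | None:
    """Task 2.3: pull token counts off the Response object, returning None when usage is absent."""
    usage = getattr(response, "usage", None)
    if usage is None:
        return None
    return {
        "input_tokens": getattr(usage, "input_tokens", 0) or 0,
        "output_tokens": getattr(usage, "output_tokens", 0) or 0,
        "total_tokens": getattr(usage, "total_tokens", 0) or 0,
    }


def _calculate_costs(
    pricing_service: "PricingService", model: str, usage: dict[str, int]
) -> dict[str, Any] | None:
    """Task 2.4: turn token counts into a cost breakdown; unknown models yield None, not an error."""
    pricing = pricing_service.get_pricing(model)
    if pricing is None:
        return None
    input_cost = usage["input_tokens"] / 1_000_000 * pricing.input_price
    output_cost = usage["output_tokens"] / 1_000_000 * pricing.output_price
    return {
        "input_cost": input_cost,
        "output_cost": output_cost,
        "total_cost": input_cost + output_cost,
        "currency": "USD",
    }
```
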
- [x] 3. Implement tracing context management methods
  - [x] 3.1 Implement `_create_tracing_contexts` method
    - Iterate through all otel_clients
    - Create trace context for each client
    - Create generation context for each client
    - Store contexts in list of dictionaries
    - Handle context creation errors gracefully
    - _Requirements: 1.2, 1.5, 4.1, 4.2, 4.3_
  - [x] 3.2 Implement `_update_tracing_success` method
    - Extract usage from response
    - Calculate costs using PricingService
    - Update generation contexts with output, usage, and costs
    - Update trace contexts with success status and metadata
    - Handle structured output parsing
    - Use fire-and-forget for non-critical operations
    - _Requirements: 1.3, 2.1, 2.2, 2.3, 2.4, 4.4, 5.2, 5.3, 5.5_
  - [x] 3.3 Implement `_update_tracing_error` method
    - Update generation contexts with error information
    - Update trace contexts with failure status
    - Handle errors in error handling gracefully
    - _Requirements: 1.4, 4.1, 4.2, 4.3_
  - [x] 3.4 Implement `_cleanup_tracing_contexts` method
    - Close all generation context generators
    - Close all trace context generators
    - Handle cleanup errors gracefully
    - _Requirements: 3.3, 4.5_
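
A sketch of the per-client bookkeeping in tasks 3.1 and 3.4. It assumes each `OtelClient` exposes `trace(...)` and `generation(...)` factories and that the resulting contexts can be closed with `close()`; none of those names are confirmed by this diff, but the error-isolation shape (one failing client never affects the others) is what the tasks require.

```python
import logging
from typing import Any

logger = logging.getLogger(__name__)


def _create_tracing_contexts(
    otel_clients: list[Any], input_data: dict[str, Any], metadata: dict[str, Any]
) -> list[dict[str, Any]]:
    """Task 3.1: one entry per client; a failing client is logged and skipped."""
    contexts: list[dict[str, Any]] = []
    for client in otel_clients:
        try:
            contexts.append(
                {
                    "client": client,
                    "trace": client.trace(input=input_data, metadata=metadata),  # assumed factory
                    "generation": client.generation(input=input_data, metadata=metadata),  # assumed factory
                }
            )
        except Exception:
            logger.exception("Failed to create tracing context for %r", client)
    return contexts


def _cleanup_tracing_contexts(contexts: list[dict[str, Any]]) -> None:
    """Task 3.4: close generations before traces; cleanup errors never propagate."""
    for ctx in contexts:
        for key in ("generation", "trace"):
            try:
                ctx[key].close()  # assumed close() on the context generator
            except Exception:
                logger.exception("Failed to close %s context", key)
```
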
- [x] 4. Integrate tracing into non-streaming flow
  - [x] 4.1 Modify `_respond_async` method for non-streaming
    - Add start_time tracking
    - Initialize active_contexts list
    - Call `_create_tracing_contexts` if otel_clients present
    - Wrap API call in try-except-finally
    - Call `_update_tracing_success` on success
    - Call `_update_tracing_error` on error
    - Call `_cleanup_tracing_contexts` in finally block
    - _Requirements: 1.1, 1.2, 1.3, 1.4, 4.5, 6.1, 6.2, 6.3, 6.4, 6.5_
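
A stub showing only the control flow task 4.1 describes, not Agentle's actual `Responder`. The private method names follow the plan; the stub bodies and `_call_api` are placeholders so the block stands on its own.

```python
import datetime
from typing import Any


class _ResponderTracingSketch:
    """Control-flow sketch for task 4.1; every helper below is a stub."""

    otel_clients: list[Any] = []

    async def _respond_async(self, request_payload: dict[str, Any]) -> Any:
        start_time = datetime.datetime.now(datetime.timezone.utc)
        active_contexts: list[dict[str, Any]] = []
        if self.otel_clients:
            active_contexts = self._create_tracing_contexts(request_payload)
        try:
            response = await self._call_api(request_payload)  # stands in for the existing HTTP call
            self._update_tracing_success(active_contexts, response, start_time)
            return response
        except Exception as error:
            self._update_tracing_error(active_contexts, error, start_time)
            raise
        finally:
            # Requirement 4.5: contexts are cleaned up on every path.
            self._cleanup_tracing_contexts(active_contexts)

    # No-op stubs; tasks 2 and 3 define the real helpers.
    def _create_tracing_contexts(self, payload: dict[str, Any]) -> list[dict[str, Any]]:
        return []

    def _update_tracing_success(self, contexts: list, response: Any, start: datetime.datetime) -> None: ...

    def _update_tracing_error(self, contexts: list, error: BaseException, start: datetime.datetime) -> None: ...

    def _cleanup_tracing_contexts(self, contexts: list) -> None: ...

    async def _call_api(self, payload: dict[str, Any]) -> Any: ...
```
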
- [x] 5. Integrate tracing into streaming flow
  - [x] 5.1 Create `_stream_events_with_tracing` wrapper method
    - Accept content_lines, text_format, active_contexts, start_time, model
    - Wrap `_stream_events_from_buffer` generator
    - Accumulate text deltas for metrics
    - Track final ResponseCompletedEvent
    - Call `_update_tracing_success` on completion
    - Call `_update_tracing_error` on error
    - _Requirements: 3.1, 3.2, 3.3, 6.1, 6.2, 6.3, 6.4, 6.5_
  - [x] 5.2 Modify `_respond_async` method for streaming
    - Create tracing contexts before streaming
    - Pass contexts to `_stream_events_with_tracing`
    - Ensure cleanup happens after streaming completes
    - _Requirements: 3.1, 3.3, 4.5_
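
A sketch of the generator wrapper task 5.1 describes, written as a standalone async generator rather than a `Responder` method. The `delta` attribute and the `ResponseCompletedEvent` name check are assumptions about the event models; the shape that matters is: re-yield every event, accumulate text, and close the trace on completion or, on failure, with whatever was accumulated.

```python
from collections.abc import AsyncIterator, Callable
from typing import Any


async def stream_events_with_tracing(
    events: AsyncIterator[Any],
    on_success: Callable[[str, Any], None],
    on_error: Callable[[BaseException, str], None],
) -> AsyncIterator[Any]:
    """Re-yield streaming events while accumulating enough state to finalize the trace."""
    accumulated_text: list[str] = []
    completed_event: Any = None
    try:
        async for event in events:
            delta = getattr(event, "delta", None)  # attribute name is an assumption
            if isinstance(delta, str):
                accumulated_text.append(delta)  # task 5.1: accumulate text deltas for metrics
            if type(event).__name__ == "ResponseCompletedEvent":
                completed_event = event  # final event carries the usage needed for metrics
            yield event
        on_success("".join(accumulated_text), completed_event)  # maps to _update_tracing_success
    except Exception as error:
        on_error(error, "".join(accumulated_text))  # maps to _update_tracing_error (Requirement 3.4)
        raise
```
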
- [x] 6. Add PricingService to Responder initialization
  - [x] 6.1 Add `pricing_service` parameter to `__init__`
    - Make it optional with default DefaultPricingService
    - Store as instance attribute
    - _Requirements: 2.2, 2.3_
  - [x] 6.2 Update `from_openrouter` and `from_openai` class methods
    - Pass pricing_service parameter through
    - _Requirements: 2.2, 2.3_
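
Task 6.1 is the usual optional-dependency pattern with a `None` sentinel; a sketch follows (the import paths mirror task 1, but the constructor shape of `DefaultPricingService` and the rest of `__init__` are assumptions).

```python
from typing import Any

from agentle.responses.pricing.default_pricing_service import DefaultPricingService
from agentle.responses.pricing.pricing_service import PricingService


class _ResponderInitSketch:
    """Abridged; only the parameters this plan touches are shown."""

    def __init__(
        self,
        *,
        otel_clients: list[Any] | None = None,
        pricing_service: PricingService | None = None,
    ) -> None:
        self.otel_clients = otel_clients or []
        # Task 6.1: fall back to the built-in static pricing table when none is injected.
        self.pricing_service = pricing_service or DefaultPricingService()
```
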
- [x] 7. Add logging and error handling
  - [x] 7.1 Add debug logging for tracing operations
    - Log context creation
    - Log context updates
    - Log context cleanup
    - _Requirements: 4.1, 4.2, 4.3_
  - [x] 7.2 Add error logging for tracing failures
    - Log context creation failures
    - Log update failures
    - Log cleanup failures
    - Ensure errors don't propagate to caller
    - _Requirements: 4.1, 4.2, 4.3, 4.4_

- [x] 8. Update type hints and imports
  - [x] 8.1 Add necessary imports
    - Import datetime
    - Import fire_and_forget from rsb.coroutines
    - Import OtelClient types
    - Import PricingService
    - _Requirements: All_
  - [x] 8.2 Update type hints
    - Add type hints to all new methods
    - Ensure consistency with existing code style
    - _Requirements: All_

- [x] 9. Documentation and examples
  - [x] 9.1 Add docstrings to all new methods
    - Document parameters
    - Document return values
    - Document error handling behavior
    - _Requirements: All_
  - [x] 9.2 Update Responder class docstring
    - Document otel_clients parameter
    - Document pricing_service parameter
    - Document tracing behavior
    - _Requirements: All_
  - [x] 9.3 Create usage example
    - Create example showing Responder with OtelClient (see the sketch after this list)
    - Show both streaming and non-streaming usage
    - Demonstrate cost tracking
    - _Requirements: All_
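
Task 9.3 asks for a usage example; below is a hedged sketch of what it might look like. Only `Responder.from_openrouter`, `respond_async`, `otel_clients`, and `pricing_service` are names taken from this plan — the import path, the client instance, and every argument shown are placeholders rather than Agentle's confirmed API.

```python
import asyncio

from agentle.responses.responder import Responder  # assumed module path

my_otel_client = ...  # any OtelClient implementation wired to your observability backend


async def main() -> None:
    responder = Responder.from_openrouter(otel_clients=[my_otel_client])
    # pricing_service is optional and defaults to the built-in table (task 6.1).

    # Non-streaming: one trace/generation pair per client, with usage and cost
    # attached when the Response arrives (Requirements 1 and 2).
    response = await responder.respond_async(...)  # placeholder arguments

    # Streaming: the same trace structure; metrics accumulate across events and are
    # finalized on the response.completed event (Requirements 3 and 6).
    stream = await responder.respond_async(...)  # placeholder arguments, streaming enabled
    async for event in stream:
        ...


asyncio.run(main())
```
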

agentle/responses/definitions/image_gen_tool.py

Lines changed: 2 additions & 1 deletion
@@ -44,7 +44,8 @@ class ImageGenTool(BaseModel):
         description="The output format of the generated image. One of `png`, `webp`, or\n`jpeg`. Default: `png`.\n",
     )
     output_compression: Optional[Annotated[int, Field(ge=0, le=100)]] = Field(
-        default=100, description="Compression level for the output image. Default: 100.\n"
+        default=100,
+        description="Compression level for the output image. Default: 100.\n",
     )
     moderation: Optional[Moderation] = Field(
         default=Moderation.auto,

agentle/responses/definitions/input_message.py

Lines changed: 3 additions & 2 deletions
@@ -18,8 +18,9 @@
 
 
 class InputMessage(BaseModel):
-    type: Literal["InputMessage"] = Field(
-        ..., description="The type of the message input. Always set to `message`.\n"
+    type: Literal["input_message"] = Field(
+        ...,
+        description="The type of the message input. Always set to `input_message`.\n",
     )
     role: Role1 = Field(
         ...,

agentle/responses/definitions/message.py

Lines changed: 3 additions & 3 deletions
@@ -27,9 +27,9 @@ class Message(BaseModel):
         description="Text, image, or audio input to the model, used to generate a response.\nCan also contain previous assistant responses.\n",
     )
 
-    type: Literal["message"] = Field(
-        default="message",
-        description="The type of the message input. Always `message`.\n",
+    type: Literal["message_param"] = Field(
+        default="message_param",
+        description="The type of the message input. Always `message_param`.\n",
     )
 
     @classmethod

agentle/responses/definitions/text_response_format_configuration.py

Lines changed: 1 addition & 3 deletions
@@ -18,8 +18,6 @@
 
 
 TextResponseFormatConfiguration = Annotated[
-    Union[
-        ResponseFormatText, TextResponseFormatJsonSchema, ResponseFormatJsonObject
-    ],
+    Union[ResponseFormatText, TextResponseFormatJsonSchema, ResponseFormatJsonObject],
     Field(discriminator="type"),
 ]

agentle/responses/pricing/__init__.py

Whitespace-only changes.
