feat: Add OpenTelemetry integration design proposal #597

Open. JAORMX wants to merge 1 commit into main from feat/otel-integration-proposal

JAORMX (Collaborator) commented Jun 3, 2025

This proposal outlines a comprehensive approach to add OpenTelemetry observability to ToolHive's MCP server proxies through middleware-based instrumentation.

Key Features

  • Leverages existing middleware system for clean integration
  • Supports both SSE and stdio transport modes
  • Provides traces, metrics, and structured logging
  • Includes MCP-specific instrumentation beyond HTTP metrics
  • Supports multiple OTEL backends and Prometheus integration
  • Maintains backward compatibility with zero performance regression

What's Included

  • Detailed technical design with implementation phases
  • Configuration options and CLI integration
  • Data model examples for traces and metrics
  • Concrete example of tools/call instrumentation
  • Prometheus integration pathways
  • Security considerations and production readiness

The design follows KISS and DRY principles by leveraging existing patterns and infrastructure, making it maintainable and extensible for future observability needs.

Related-to: #474

Environment variable support:
```bash
TOOLHIVE_OTEL_ENABLED=true
TOOLHIVE_OTEL_ENDPOINT=https://api.honeycomb.io
```

@JAORMX @lujunsan Is this the variable we can set to send traces to a custom OTel Collector endpoint?

Collaborator Author

Right, that's the idea. A command-line flag for this would be good too.
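
One way the environment variables and a command-line flag could fit together is sketched below. The variable names come from the snippet under review, and `--otel-endpoint` is the flag named in the follow-up commit message on this page. Everything else is an assumption made for illustration: the `OTelConfig` type, the `ResolveOTelConfig` helper, the use of the standard `flag` package, and in particular the precedence rule (flag overrides environment).

```go
package main

import (
	"errors"
	"flag"
	"fmt"
	"os"
	"strconv"
)

// OTelConfig is a hypothetical holder for the two settings discussed here.
type OTelConfig struct {
	Enabled  bool
	Endpoint string
}

// ResolveOTelConfig reads the environment first and then lets the
// --otel-endpoint flag override the endpoint. The precedence is an
// assumption of this sketch, not something the proposal specifies.
func ResolveOTelConfig(args []string, getenv func(string) string) (OTelConfig, error) {
	cfg := OTelConfig{Endpoint: getenv("TOOLHIVE_OTEL_ENDPOINT")}

	if raw := getenv("TOOLHIVE_OTEL_ENABLED"); raw != "" {
		enabled, err := strconv.ParseBool(raw)
		if err != nil {
			return OTelConfig{}, fmt.Errorf("TOOLHIVE_OTEL_ENABLED: %w", err)
		}
		cfg.Enabled = enabled
	}

	fs := flag.NewFlagSet("otel-sketch", flag.ContinueOnError)
	endpoint := fs.String("otel-endpoint", cfg.Endpoint, "collector endpoint (overrides TOOLHIVE_OTEL_ENDPOINT)")
	if err := fs.Parse(args); err != nil {
		return OTelConfig{}, err
	}
	cfg.Endpoint = *endpoint

	if cfg.Enabled && cfg.Endpoint == "" {
		return OTelConfig{}, errors.New("telemetry is enabled but no endpoint is configured")
	}
	return cfg, nil
}

func main() {
	cfg, err := ResolveOTelConfig(os.Args[1:], os.Getenv)
	if err != nil {
		fmt.Fprintln(os.Stderr, "config error:", err)
		os.Exit(1)
	}
	fmt.Printf("enabled=%t endpoint=%q\n", cfg.Enabled, cfg.Endpoint)
}
```

Passing `getenv` as a parameter keeps the resolution logic testable without touching the process environment. Pointing either setting at a self-hosted OTel Collector would then be a matter of supplying that collector's URL.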

This proposal outlines a comprehensive approach to add OpenTelemetry
observability to ToolHive's MCP server proxies through middleware-based
instrumentation.

Key features:
- Leverages existing middleware system for clean integration
- Supports both SSE and stdio transport modes
- Provides traces, metrics, and structured logging
- Includes MCP-specific instrumentation beyond HTTP metrics
- Supports multiple OTEL backends and Prometheus integration
- Maintains backward compatibility with zero performance regression

The design includes detailed examples of traces and metrics for
tools/call operations, showing rich observability into MCP protocol
interactions.

Related-to: #474
Signed-off-by: Juan Antonio Osorio
JAORMX force-pushed the feat/otel-integration-proposal branch from 2ef2ba2 to 19f3ec4 on June 10, 2025 08:12
JAORMX added a commit that referenced this pull request Jun 10, 2025
- Add complete OpenTelemetry instrumentation with OTLP and Prometheus exporters
- Implement HTTP middleware with MCP protocol-aware tracing and metrics
- Add CLI flags for telemetry configuration (--otel-endpoint, --otel-metrics-port, etc.)
- Expose /metrics endpoints on both HTTPSSEProxy and TransparentProxy
- Include comprehensive observability documentation
- Support dual export architecture (OTLP push + Prometheus pull)
- Add tool-specific metrics and argument sanitization for security
- Integrate seamlessly with existing middleware chain

Implements: #597
Fixes: #474