OpenTelemetry Observability Dashboard

A comprehensive observability solution using OpenTelemetry, Grafana, Prometheus, and Elasticsearch to monitor distributed applications with full-stack visibility into metrics, traces, and logs.

📂 Project Submission Links

Course Deliverables - Quick Access

#	Deliverable	Link
1	IT-Capstone-Opentelemetry-WeeklyLog	View Document
2	Milestone 3 Presentation	View Slides
3	OpenTelemetry Observability Dashboard Report (DOCX)	View Report DOCX
4	OpenTelemetry Observability Dashboard Report (PDF)	View Report PDF
5	OpenTelemetryProjectPlan-3	View Plan MPP
6	OpenTelemetryProjectPlan-3	View Plan PDF
7	OpenTelemetryProjectPlan Resource Overview	View Overview
8	Updated Project Plan	View Document
9	Updated Project Plan	View PDF
10	Proof of KSU Writing Center	View Image
11	Demo Video - Milestone 4	Watch Demo

Note: All deliverables meet the project requirements as outlined in the course syllabus.

Overview

This project implements a production-ready OpenTelemetry observability stack that provides complete visibility into distributed applications. Built on Docker and deployed on RHEL9, it demonstrates enterprise-grade monitoring using open-source tools.

What This Project Offers

Full-Stack Observability: Unified view of metrics, logs, and traces
Real-Time Monitoring: Live, auto-refreshing dashboards (metrics are scraped every 15 seconds)
Distributed Tracing: End-to-end request path visualization
Container-Ready: Fully containerized with Docker Compose
Production Patterns: Best practices for observability implementation

Architecture

Our observability stack follows the OpenTelemetry specification with multiple backend integrations:

┌─────────────────────────────────────────────────────────────┐
│                    OTel Demo Application                     │
│              (Microservices + Frontend)                      │
└────────────────────────┬────────────────────────────────────┘
                         │
                         │ OpenTelemetry Protocol (OTLP)
                         │
                         ▼
          ┌──────────────────────────────┐
          │   OpenTelemetry Collector    │
          │   (Receive, Process, Export) │
          └──────────────┬───────────────┘
                         │
            ┌────────────┼────────────┐
            │            │            │
            ▼            ▼            ▼
     ┌──────────┐  ┌──────────┐  ┌──────────┐
     │Prometheus│  │ Elastic  │  │ Elastic  │
     │ Metrics  │  │  Logs    │  │  Traces  │
     └─────┬────┘  └─────┬────┘  └─────┬────┘
           │             │              │
           └─────────────┼──────────────┘
                         │
                         ▼
                  ┌─────────────┐
                  │   Grafana   │
                  │  Dashboards │
                  └─────────────┘

Component Details

Component	Purpose	Port
OpenTelemetry Demo	Generates realistic telemetry data	8082
OTel Collector	Receives and exports telemetry	4317, 4318
Prometheus	Time-series metrics storage	9092
Elasticsearch	Log and trace storage	9202
Grafana	Visualization and dashboards	3002
Jaeger UI	Trace visualization	16686
Load Generator	Synthetic traffic generation	8089

Features

Unified Dashboards

Service Health Overview: Real-time status of all microservices
RED Metrics: Rate, Errors, Duration for each service
Resource Utilization: CPU, memory, and network metrics
Business Metrics: Custom application-level KPIs

Distributed Tracing

End-to-end request flow visualization
Span-level performance analysis
Service dependency mapping
Error propagation tracking

Metrics Collection

Automatic instrumentation metrics
Custom business metrics
Infrastructure metrics
Container metrics

Log Aggregation

Centralized log collection
Structured logging with context
Log correlation with traces
Full-text search capabilities

Prerequisites

Docker: Version 20.10+
Docker Compose: Version 2.0+
Linux Server: RHEL9, Ubuntu 20.04+, or similar
RAM: Minimum 8GB (16GB recommended)
CPU: 4+ cores recommended
Disk: 20GB+ free space

Quick Start

1. Clone the Repository

git clone https://github.com/dekema9924/Otel-demo-captsone.git
cd Otel-demo-captsone

2. Start the Stack

# Make sure the Docker daemon is running (on a desktop machine, start Docker Desktop first)
# Start all services in detached mode
docker compose up -d

# Watch the logs (optional)
docker compose logs -f

3. Verify Deployment

# Check all containers are running
docker compose ps

# Expected output: All services should show "Up" status
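The visual check above can also be scripted. The sketch below is one possible approach, assuming Docker Compose v2 (whose `ps` command accepts `--services` and `--status`). The function names `missing_services` and `check_stack` are illustrative and are not part of the repository.

```shell
# Print every service named in $1 that does not appear in $2.
# Both arguments are newline-separated lists of service names.
missing_services() {
  printf '%s\n' "$1" | while IFS= read -r svc; do
    [ -n "$svc" ] || continue
    # -x matches the whole line, -F treats the name as a fixed string.
    printf '%s\n' "$2" | grep -qxF -- "$svc" || printf '%s\n' "$svc"
  done
}

# Compare the services defined in the Compose file with those now running.
check_stack() {
  missing="$(missing_services \
    "$(docker compose config --services)" \
    "$(docker compose ps --services --status running)")"
  if [ -n "$missing" ]; then
    printf 'Not running:\n%s\n' "$missing" >&2
    return 1
  fi
  echo "All services are running."
}

# Usage, from the repository root: check_stack
```

Services that are designed to exit after finishing a one-time task would be listed as not running, so the result may need a manual second look.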

4. Access the Interfaces

Service	URL	Default Credentials
Grafana	http://localhost:3002	admin / Danielekema#7
Prometheus	http://localhost:9092	-
Elasticsearch	http://localhost:9202	-
OTel Demo	http://localhost:8082	-
Load Generator	http://localhost:8089	-

5. Import Dashboards

Navigate to Grafana at http://localhost:3002
Go to Dashboards → Import
Upload the dashboards from the grafana/dashboards/ directory
Select Prometheus and Elasticsearch as data sources

Dashboard Screenshots

Service Overview Dashboard

Real-time health and performance metrics for all microservices.

Distributed Tracing View

Visualize complete request paths across services.

RED Metrics Dashboard

Monitor Rate, Errors, and Duration for critical services.

Log Analysis

Centralized log exploration with filtering and correlation.

Infrastructure Metrics

Container and host-level resource monitoring.

Configuration

OpenTelemetry Collector

Located at otelcol/otelcol-config.yml:

receivers:
  otlp:
    protocols:
      grpc:
        endpoint: 0.0.0.0:4317
      http:
        endpoint: 0.0.0.0:4318

exporters:
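  prometheus:
    # Address the Collector listens on for Prometheus to scrape, not the
    # Prometheus server's own address. Port 8889 is an assumed value; Prometheus
    # needs a scrape target pointing at it (for example otelcol:8889).
    endpoint: "0.0.0.0:8889"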
  
  elasticsearch:
    endpoints: ["http://elasticsearch:9202"]
    logs_index: otel-logs
    traces_index: otel-traces

service:
  pipelines:
    metrics:
      receivers: [otlp]
      exporters: [prometheus]
    
    traces:
      receivers: [otlp]
      exporters: [elasticsearch]
    
    logs:
      receivers: [otlp]
      exporters: [elasticsearch]

Prometheus Configuration

Located at prometheus/prometheus.yml:

global:
  scrape_interval: 15s
  evaluation_interval: 15s

scrape_configs:
  - job_name: 'otel-collector'
    static_configs:
      - targets: ['otelcol:8888']
  
  - job_name: 'demo-services'
    static_configs:
      - targets: ['frontend:8082']

Docker Compose Networking

All services are connected via a custom bridge network for inter-container communication:

networks:
  otel-network:
    driver: bridge

Trace Testing

We use TraceTest to validate trace completeness and correctness.

Running Trace Tests

# Execute trace tests
docker compose run tracetest test run \
  --definition-file /app/test/tracetesting/tracetest-provision.yaml

What We Test

Span creation and propagation
Trace context preservation
Service-to-service correlation
Attribute completeness
Error propagation
Performance thresholds

Example Test Output

✓ Checkout Flow Test
  ✓ Trace has expected number of spans (12)
  ✓ All spans have parent relationships
  ✓ Critical path under 500ms
  ✓ No error spans detected

Troubleshooting

Common Issues

Containers Won't Start

# Check logs for specific service
docker compose logs elasticsearch

# Restart specific service
docker compose restart elasticsearch

Port Conflicts

# Check if ports are in use
sudo netstat -tulpn | grep -E ':(3002|9092|9202|8082|16686|8089)[[:space:]]'

# Modify ports in docker-compose.yml if needed
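On hosts where netstat is not installed (it ships in the net-tools package, which a minimal RHEL 9 install may omit), `ss` from iproute2 reports the same listening sockets. The following is a minimal sketch that filters `ss -tln` output for the ports used by this stack. The helper name `ports_in_use` is hypothetical, and the sketch assumes the usual `ss -tln` column layout, where the fourth column holds the local address and port.

```shell
# Print each port from the argument list that appears as a listening
# local address in `ss -tln` style output read from stdin.
ports_in_use() {
  listing="$(cat)"
  for port in "$@"; do
    # Column 4 is "Local Address:Port"; match ":<port>" at its end only,
    # so that 3002 does not also match 13002.
    if printf '%s\n' "$listing" |
      awk -v p=":$port" '$4 ~ (p "$") { found = 1 } END { exit !found }'; then
      echo "$port"
    fi
  done
}

# Usage: ss -tln | ports_in_use 3002 9092 9202 8082 16686 8089
```

Any port the function prints is already bound on the host and would need to be remapped before starting the stack.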

Elasticsearch Memory Issues

# Increase vm.max_map_count on Linux
sudo sysctl -w vm.max_map_count=262144

# Make permanent
echo "vm.max_map_count=262144" | sudo tee -a /etc/sysctl.conf
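Before changing the kernel setting, it can help to see whether the current value is already high enough. The sketch below reads the value from /proc and compares it with the 262144 figure used above. The names `REQUIRED_MAP_COUNT` and `map_count_ok` are illustrative, and the fallback to 0 on systems without /proc is an assumption made so that the snippet runs anywhere.

```shell
# Minimum vm.max_map_count quoted in the troubleshooting step above.
REQUIRED_MAP_COUNT=262144

# Succeeds when the value passed as $1 meets the minimum.
map_count_ok() {
  [ "$1" -ge "$REQUIRED_MAP_COUNT" ]
}

# Read the live kernel setting; fall back to 0 where /proc is unavailable.
current="$(cat /proc/sys/vm/max_map_count 2>/dev/null || echo 0)"

if map_count_ok "$current"; then
  echo "vm.max_map_count is $current (ok)"
else
  echo "vm.max_map_count is $current; raise it to at least $REQUIRED_MAP_COUNT"
fi
```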

No Data in Grafana

Verify data sources are configured correctly
Check Prometheus is scraping metrics: http://localhost:9092/targets
Verify Elasticsearch has indices: curl http://localhost:9202/_cat/indices
Generate traffic to the demo app: http://localhost:8082
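The index check in the list above can be narrowed to the two indices named in the Collector configuration. This sketch assumes the indices are called exactly otel-logs and otel-traces (some exporter versions add suffixes or use data streams, in which case the names would differ), and it relies on the `h=index,docs.count` column selector of the `_cat/indices` API. The helper name `report_indices` is hypothetical.

```shell
# stdin: "<index> <docs.count>" lines, as produced by
#   curl -fsS 'http://localhost:9202/_cat/indices?h=index,docs.count'
# args:  names of the indices that are expected to hold data.
# Prints one line for each expected index that is missing or empty.
report_indices() {
  listing="$(cat)"
  for index in "$@"; do
    count="$(printf '%s\n' "$listing" |
      awk -v i="$index" '$1 == i { print $2 }')"
    if [ -z "$count" ]; then
      echo "$index: missing"
    elif [ "$count" -eq 0 ]; then
      echo "$index: empty"
    fi
  done
}

# Usage:
#   curl -fsS 'http://localhost:9202/_cat/indices?h=index,docs.count' |
#     report_indices otel-logs otel-traces
```

No output means both indices exist and contain documents, which points the investigation toward the Grafana data source settings instead.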

Health Checks

# Check all service health
docker compose ps

# Test Prometheus
curl http://localhost:9092/-/healthy

# Test Elasticsearch
curl http://localhost:9202/_cluster/health

# Test Grafana
curl http://localhost:3002/api/health

# Test Jaeger
curl http://localhost:16686/
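Right after `docker compose up -d`, some of these endpoints may refuse connections for a while as the services finish starting. One way to handle that is a small retry wrapper around the same curl checks. The sketch below is only an illustration: the function names `retry` and `wait_for_stack`, the 30 attempts, and the 2 second delay are all assumptions, not values taken from the project.

```shell
# Retry a command until it succeeds or the attempt budget is used up.
# Usage: retry <attempts> <delay-seconds> <command> [args...]
retry() {
  attempts="$1"
  delay="$2"
  shift 2
  n=1
  while ! "$@"; do
    if [ "$n" -ge "$attempts" ]; then
      return 1
    fi
    n=$((n + 1))
    sleep "$delay"
  done
}

# Wait for the three HTTP health endpoints listed above.
# curl -f makes HTTP error statuses count as failures.
wait_for_stack() {
  retry 30 2 curl -fsS -o /dev/null http://localhost:9092/-/healthy &&
    retry 30 2 curl -fsS -o /dev/null http://localhost:9202/_cluster/health &&
    retry 30 2 curl -fsS -o /dev/null http://localhost:3002/api/health
}

# Usage: wait_for_stack && echo "Stack is healthy"
```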

Team Contributions

This project was a collaborative effort with contributions across multiple domains:

Infrastructure Setup: Server provisioning and Docker configuration
OTel Integration: Collector setup and instrumentation
Backend Configuration: Prometheus and Elasticsearch setup
Dashboard Design: Grafana dashboard creation and optimization
Testing Framework: Trace validation and test automation
Documentation: Technical writing and user guides

Future Roadmap

Short Term

Add Jaeger UI as alternative trace viewer
Implement Prometheus alerting rules
Create custom metrics exporter for business KPIs
Add SSL/TLS for production deployment

Medium Term

Migrate to Kubernetes deployment
Implement Grafana Tempo for scalable tracing
Add synthetic monitoring with k6
Create CI/CD pipeline for dashboard updates

Long Term

Multi-cluster observability
Machine learning-based anomaly detection
Cost optimization recommendations
Auto-remediation workflows

Additional Resources

Contributing

Contributions are welcome! Please feel free to submit a Pull Request.

Fork the repository
Create your feature branch (git checkout -b feature/AmazingFeature)
Commit your changes (git commit -m 'Add some AmazingFeature')
Push to the branch (git push origin feature/AmazingFeature)
Open a Pull Request

License

This project is licensed under the MIT License - see the LICENSE file for details.

Acknowledgments

OpenTelemetry Community for the demo application
Grafana Labs for visualization tools
Elastic for search and analytics
Prometheus CNCF project

Built with care for modern observability

For questions or support, please open an issue on GitHub.

