# Microservices Platform on AWS EKS - Lab Environment

| Deliverable | Status | Link |
|---|---|---|
| Technical Document | Complete | Technical Documentation |
| Presentation Deck | Complete | Presentation Slides |
| Source Code | Complete | This Repository |
```
tbyte/
├── .github/        # GitHub Actions CI/CD pipelines
├── apps/           # Helm charts for Kubernetes applications
├── argocd-apps/    # ArgoCD application definitions (GitOps)
├── docs/           # Technical documentation and assessment tasks
├── scripts/        # Automation scripts for setup and deployment
├── src/            # Application source code (frontend + backend)
└── terragrunt/     # Infrastructure as Code (Terraform modules)
```
```bash
# Required tools
aws --version             # AWS CLI v2
kubectl version --client  # Kubernetes CLI
terragrunt --version      # Terragrunt v0.50+
```

Note: Forking the repository allows you to test the solution independently in your own AWS accounts, without any dependencies on the original repository.
For Reviewers: If you prefer to test directly from this repository without forking, please provide your email address or GitHub username and I'll add you as a collaborator to enable GitHub Actions triggers and testing.
```bash
# 1. Fork the repository, then clone your fork
git clone https://github.com/YOUR_USERNAME/tbyte.git
cd tbyte

# 2. Update the account IDs in the environment configuration
#    (on macOS/BSD sed, use `sed -i ''` instead of `sed -i`)
sed -i 's/111111111111/YOUR_DEV_ACCOUNT_ID/g' terragrunt/environments/dev/terragrunt.hcl
sed -i 's/222222222222/YOUR_STAGING_ACCOUNT_ID/g' terragrunt/environments/staging/terragrunt.hcl
sed -i 's/333333333333/YOUR_PROD_ACCOUNT_ID/g' terragrunt/environments/prod/terragrunt.hcl
```
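A quick sanity check before committing (a hypothetical one-liner, not part of the repo scripts) confirms no demo account IDs remain:

```bash
# List any leftover demo account IDs in the environment configs
grep -rn -e '111111111111' -e '222222222222' -e '333333333333' terragrunt/environments/ \
  && echo "Demo account IDs still present" || echo "All account IDs updated"
```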
```bash
# Option 1: Use the multi-account OIDC setup script (recommended)
./scripts/setup-multi-account-oidc.sh
# This script will:
# 1. Create GitHub OIDC providers in all accounts (dev, staging, prod)
# 2. Create GitHubActionsEKSRole in each account
# 3. Attach AdministratorAccess policies
# 4. Create S3 state buckets for Terragrunt
# 5. Output role ARNs for GitHub secrets
# Option 2: Use environment-specific roles script
./scripts/setup-multi-env-roles.sh
# This creates environment-specific role names:
# - TByteDevGitHubActionsRole
# - TByteStagingGitHubActionsRole
# - TByteProdGitHubActionsRole
# Prerequisites: AWS Organizations access via the oth_infra profile
```
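Under the hood, both scripts boil down to the same two IAM objects per account: a GitHub OIDC identity provider, and a role whose trust policy lets workflows from your fork assume it. A minimal sketch with the AWS CLI (YOUR_ACCOUNT_ID and YOUR_USERNAME are placeholders; the scripts themselves may differ in detail):

```bash
# Create the GitHub OIDC provider (one per AWS account).
# The thumbprint is GitHub's published value; recent CLI/IAM versions may not require it.
aws iam create-open-id-connect-provider \
  --url https://token.actions.githubusercontent.com \
  --client-id-list sts.amazonaws.com \
  --thumbprint-list 6938fd4d98bab03faadb97b34396831e3780aea1

# Trust policy: only workflows from your fork may assume the role
cat > trust.json <<'EOF'
{
  "Version": "2012-10-17",
  "Statement": [{
    "Effect": "Allow",
    "Principal": { "Federated": "arn:aws:iam::YOUR_ACCOUNT_ID:oidc-provider/token.actions.githubusercontent.com" },
    "Action": "sts:AssumeRoleWithWebIdentity",
    "Condition": {
      "StringEquals": { "token.actions.githubusercontent.com:aud": "sts.amazonaws.com" },
      "StringLike":   { "token.actions.githubusercontent.com:sub": "repo:YOUR_USERNAME/tbyte:*" }
    }
  }]
}
EOF

aws iam create-role --role-name GitHubActionsEKSRole \
  --assume-role-policy-document file://trust.json
```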
```yaml
# Current GitHub secrets configured (check with: gh secret list)

# Account IDs
AWS_ACCOUNT_ID_DEV: "111111111111"
AWS_ACCOUNT_ID_STAGING: "222222222222"
AWS_ACCOUNT_ID_PRODUCTION: "333333333333"
AWS_ACCOUNT_ID_ROOT: "your-root-account-id"
# Role ARNs (depends on which setup script you used)
# If using setup-multi-account-oidc.sh:
AWS_ROLE_ARN_DEV: "arn:aws:iam::111111111111:role/GitHubActionsEKSRole"
AWS_ROLE_ARN_STAGING: "arn:aws:iam::222222222222:role/GitHubActionsEKSRole"
AWS_ROLE_ARN_PRODUCTION: "arn:aws:iam::333333333333:role/GitHubActionsEKSRole"
# If using setup-multi-env-roles.sh:
# AWS_ROLE_ARN_DEV: "arn:aws:iam::111111111111:role/TByteDevGitHubActionsRole"
# AWS_ROLE_ARN_STAGING: "arn:aws:iam::222222222222:role/TByteStagingGitHubActionsRole"
# AWS_ROLE_ARN_PRODUCTION: "arn:aws:iam::333333333333:role/TByteProdGitHubActionsRole"
# ArgoCD GitHub App credentials
ARGOCD_APP_ID: "github-app-id"
ARGOCD_APP_INSTALLATION_ID: "installation-id"
ARGOCD_APP_PRIVATE_KEY: "private-key-pem"
# Repository configuration
GIT_REPO_URL: "https://github.com/chiju/tbyte.git"
```

```bash
# To add secrets to your forked repository:
gh secret set AWS_ACCOUNT_ID_DEV --body "your-dev-account-id"
gh secret set AWS_ROLE_ARN_DEV --body "arn:aws:iam::YOUR_ACCOUNT:role/GitHubActionsEKSRole"
```
```bash
# 1. Push changes to trigger infrastructure deployment
git add .
git commit -m "Setup multi-account configuration"
git push origin main
# 2. Monitor GitHub Actions:
# - Infrastructure Deployment: .github/workflows/terragrunt.yml
# - Application Deployment: .github/workflows/app-cicd.yml
# 3. Deployments run automatically on:
# - Push to main branch
# - Manual trigger via GitHub Actions UI
```
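Both workflows can also be triggered from the command line with the GitHub CLI (workflow file names as listed above):

```bash
# Manually trigger the infrastructure and application pipelines
gh workflow run terragrunt.yml --ref main
gh workflow run app-cicd.yml --ref main

# Follow the latest run as it executes
gh run watch
```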
```bash
# Check EKS cluster
aws eks describe-cluster --name tbyte-dev --region eu-central-1
# Check RDS instance
aws rds describe-db-instances --region eu-central-1
# Check VPC and subnets
aws ec2 describe-vpcs --filters "Name=tag:Name,Values=tbyte-dev-vpc" --region eu-central-1
```

```bash
# Check pods are running
kubectl get pods -n tbyte
kubectl get pods -n monitoring
kubectl get pods -n opentelemetry
# Check services and ingress
kubectl get svc -n tbyte
kubectl get rollout -n tbyte
```
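If pods are still coming up, you can block until everything is Ready instead of polling by hand (namespace as used above):

```bash
# Wait up to 5 minutes for all pods in the application namespace
kubectl wait --for=condition=Ready pods --all -n tbyte --timeout=300s
```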
```bash
# Get the load balancer URL
LB_URL=$(kubectl get svc -n istio-system istio-gateway -o jsonpath='{.status.loadBalancer.ingress[0].hostname}')
echo "Load Balancer URL: $LB_URL"
# Get IP address and add to hosts file
LB_IP=$(nslookup $LB_URL | grep 'Address:' | tail -1 | awk '{print $2}')
echo "$LB_IP tbyte.local" | sudo tee -a /etc/hosts
# Test backend API health endpoint
curl http://tbyte.local/api/health | jq .
# Expected response:
# {
# "status": "healthy",
# "timestamp": "2025-12-14T21:40:48.701Z",
# "service": "tbyte-backend",
# "version": "1.0.0"
# }
# Test users API (connects to PostgreSQL RDS)
curl http://tbyte.local/api/users | jq .
# Expected response:
# {
# "success": true,
# "data": [
# {
# "id": 1,
# "name": "John Doe",
# "email": "john@tbyte.com",
# "created_at": "2025-12-14T12:10:14.297Z"
# }
# ],
# "count": 3
# }
# Test frontend in a browser (macOS; on Linux use xdg-open)
open http://tbyte.local
# Or test frontend via curl
curl http://tbyte.local | head -10
# Expected: HTML page with "TByte Microservices" title
```

```bash
# Port forward to access UIs
kubectl port-forward svc/monitoring-grafana -n monitoring 3000:80 &
kubectl port-forward svc/monitoring-kube-prometheus-prometheus -n monitoring 9090:9090 &
# Access URLs:
# Grafana: http://localhost:3000 (admin / prom-operator)
# Prometheus: http://localhost:9090
```
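With the port-forward in place, Prometheus can also be queried over its HTTP API, which is handy for scripted checks (the `up` query is illustrative):

```bash
# Ask Prometheus which scrape targets are up (value "1" = healthy)
curl -s 'http://localhost:9090/api/v1/query?query=up' \
  | jq '.data.result[] | {job: .metric.job, up: .value[1]}'
```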
```bash
# Trigger a new deployment
kubectl patch rollout tbyte-microservices-frontend -n tbyte --type merge \
  -p '{"spec":{"restartAt":"'$(date -u +%Y-%m-%dT%H:%M:%SZ)'"}}'
# Watch canary deployment progress
kubectl get rollout tbyte-microservices-frontend -n tbyte -w
# Check analysis runs
kubectl get analysisrun -n tbyte
# Verify no downtime during the deployment (Ctrl-C to stop)
while true; do
  curl -s -H "Host: tbyte.local" http://$LB_URL/api/health | jq -r .status
  sleep 1
done
```
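If you have the Argo Rollouts kubectl plugin installed (optional; nothing in this repo requires it), it offers a friendlier view of the canary and manual promotion:

```bash
# Live view of the canary steps
kubectl argo rollouts get rollout tbyte-microservices-frontend -n tbyte --watch

# Promote the canary past a pause step
kubectl argo rollouts promote tbyte-microservices-frontend -n tbyte
```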
subgraph "GitHub"
REPO[Repository]
ACTIONS[GitHub Actions]
SECRETS[GitHub Secrets]
end
subgraph "AWS Account (111111111111)"
subgraph "IAM"
OIDC[GitHub OIDC Provider]
ROLES[IAM Roles & Policies]
end
subgraph "VPC (10.0.0.0/16)"
subgraph "Public Subnets (10.0.1.0/24, 10.0.2.0/24)"
IGW[Internet Gateway]
ALB[Application Load Balancer]
NAT1[NAT Gateway AZ-1a]
NAT2[NAT Gateway AZ-1b]
end
subgraph "Private Subnets (10.0.3.0/24, 10.0.4.0/24)"
subgraph "EKS Cluster (tbyte-dev)"
subgraph "Control Plane"
API[EKS API Server]
ETCD[etcd]
end
subgraph "Worker Nodes (t3.medium)"
subgraph "System Pods"
ISTIO[Istio Service Mesh]
ARGOCD[ArgoCD]
PROM[Prometheus]
GRAF[Grafana]
OTEL[OpenTelemetry]
ESO[External Secrets]
KARP[Karpenter]
end
subgraph "Application Pods"
FE[Frontend Pods]
BE[Backend Pods]
end
end
end
RDS[(RDS PostgreSQL<br/>db.t3.micro)]
end
end
subgraph "Container Registry"
ECR[ECR Repositories<br/>Frontend & Backend]
end
subgraph "Observability"
CW[CloudWatch Logs]
SM[Secrets Manager]
end
subgraph "Storage"
S3[S3 Terraform State]
end
end
subgraph "External Access"
USER[Users]
DOMAIN[tbyte.local]
end
%% CI/CD Flow
REPO --> ACTIONS
ACTIONS --> OIDC
OIDC --> ROLES
ACTIONS --> ECR
ACTIONS --> S3
%% GitOps Flow
ARGOCD --> REPO
ARGOCD --> FE
ARGOCD --> BE
%% Network Flow
USER --> DOMAIN
DOMAIN --> ALB
ALB --> ISTIO
ISTIO --> FE
ISTIO --> BE
BE --> RDS
%% Infrastructure Dependencies
FE --> ECR
BE --> ECR
BE --> SM
ESO --> SM
GRAF --> CW
%% Monitoring Flow
FE --> OTEL
BE --> OTEL
OTEL --> PROM
PROM --> GRAF
%% Security
ROLES --> EKS
ROLES --> RDS
ROLES --> SM
style EKS fill:#ff9999
style RDS fill:#99ccff
style ECR fill:#99ff99
style ARGOCD fill:#ffcc99
style ISTIO fill:#cc99ff
```
| Component | Technology | Purpose |
|---|---|---|
| Infrastructure | Terragrunt + Terraform | Infrastructure as Code |
| Orchestration | AWS EKS | Managed Kubernetes |
| GitOps | ArgoCD | Continuous Deployment |
| Monitoring | Prometheus + Grafana | Metrics & Dashboards |
| Tracing | OpenTelemetry + Jaeger | Distributed Tracing |
| Deployments | Argo Rollouts | Canary Deployments |
| CI/CD | GitHub Actions | Build & Test Pipeline |
- Infrastructure: VPC, EKS, RDS, ElastiCache deployed
- Applications: All ArgoCD apps Synced and Healthy (see the check after this list)
- Monitoring: Prometheus collecting metrics, Grafana dashboards accessible
- Tracing: OpenTelemetry collector running, Jaeger UI accessible
- Deployments: Canary rollouts working with analysis
- Security: RBAC, Network Policies, Pod Security Standards enabled
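A quick command-line check for the ArgoCD item above (a sketch; application names depend on your argocd-apps/ definitions):

```bash
# Sync and health status of every ArgoCD application
kubectl get applications -n argocd \
  -o custom-columns=NAME:.metadata.name,SYNC:.status.sync.status,HEALTH:.status.health.status
```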
Issue: kubectl cannot connect to the cluster

```bash
# Fix: Update kubeconfig
aws eks update-kubeconfig --region eu-central-1 --name tbyte-dev
```

Issue: ArgoCD applications stuck in "OutOfSync"

```bash
# Fix: Force a sync
kubectl patch application tbyte-microservices -n argocd --type merge -p '{"operation":{"sync":{}}}'
```

Issue: Rollout analysis failing

```bash
# Inspect the most recent analysis run
kubectl describe analysisrun -n tbyte \
  $(kubectl get analysisrun -n tbyte --sort-by=.metadata.creationTimestamp -o name | tail -1)
```

For detailed technical documentation, architecture decisions, and troubleshooting guides, see the docs/ directory.
```bash
# Destroy infrastructure (when testing is complete)
cd terragrunt/environments/dev
terragrunt run-all destroy --terragrunt-non-interactive

# This removes all AWS resources and stops the associated costs
```

AWS Accounts: The AWS account IDs referenced in this repository were used for demonstration purposes only, and the accounts were closed after testing was completed. No active AWS resources or credentials are associated with them.