13 changes: 13 additions & 0 deletions docs/user/ai_model_serving.md
@@ -442,6 +442,19 @@ ovms_requests_success{api="KServe",interface="REST",method="ModelReady",name="ov
ovms_requests_success{api="KServe",interface="REST",method="ModelMetadata",name="ovms-resnet50",version="1"} 1
```

##### Scraping model server's metrics with microshift-observability (OTEL)

To scrape the model server's metrics with microshift-observability:
1. Your OTEL configuration must include the `prometheus` receiver (see the [opentelemetry-collector-large.yaml](/packaging/observability/opentelemetry-collector-large.yaml) preset for an example).
1. Your InferenceService CR must include the following annotation, which is passed through to the Pod:
```yaml
apiVersion: serving.kserve.io/v1beta1
kind: InferenceService
metadata:
  annotations:
    prometheus.io/scrape: "true"
```
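
The collector preset filters on this annotation through a Prometheus service-discovery meta label. Prometheus exposes each Pod annotation as `__meta_kubernetes_pod_annotation_<name>`, replacing characters that are not valid in label names with underscores. A minimal Python sketch of that mapping follows; the helper name `annotation_to_meta_label` is hypothetical and exists only for illustration.

```python
import re


def annotation_to_meta_label(annotation: str) -> str:
    """Approximate the meta label name derived from a Pod annotation key.

    Hypothetical helper: it mirrors the documented sanitization rule,
    it is not code taken from Prometheus or the collector.
    """
    # Characters outside [a-zA-Z0-9_] are replaced with underscores.
    sanitized = re.sub(r"[^a-zA-Z0-9_]", "_", annotation)
    return "__meta_kubernetes_pod_annotation_" + sanitized


print(annotation_to_meta_label("prometheus.io/scrape"))
# prints: __meta_kubernetes_pod_annotation_prometheus_io_scrape
```

This is why the relabel rules in the preset refer to `prometheus_io_scrape`, `prometheus_io_path`, and `prometheus_io_port` rather than to the annotation keys as written in the CR.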

#### Other Inference Protocol endpoints

To learn more about KServe endpoints, see the upstream documentation:
29 changes: 29 additions & 0 deletions packaging/observability/opentelemetry-collector-large.yaml
@@ -3,6 +3,7 @@
# - Kubernetes Events (Warnings only)
# - Host's CPU, Mem, Disk, and Network Metrics
# - System journals for selected MicroShift services and dependencies, priority < Info
# - Individual Pods' metrics if they have "prometheus.io/scrape": "true" annotation

receivers:
  kubeletstats:
@@ -36,6 +37,30 @@ receivers:
      - openvswitch.service
      - ovsdb-server.service
      - ovs-vswitchd.service
  prometheus:
    config:
      scrape_configs:
        - job_name: k8s
          scrape_interval: 10s
          kubernetes_sd_configs:
            - kubeconfig_file: /var/lib/microshift/resources/observability-client/kubeconfig
              role: pod
          relabel_configs:
            # Only scrape Pods with annotation "prometheus.io/scrape": "true"
            - source_labels: [__meta_kubernetes_pod_annotation_prometheus_io_scrape]
              action: keep
              regex: true
            # Use value of "prometheus.io/path" annotation for scraping
            - source_labels: [__meta_kubernetes_pod_annotation_prometheus_io_path]
              action: replace
              target_label: __metrics_path__
              regex: (.+)
            # Use value of "prometheus.io/port" annotation for scraping
            - source_labels: [__address__, __meta_kubernetes_pod_annotation_prometheus_io_port]
              action: replace
              regex: ([^:]+)(?::\d+)?;(\d+)
              replacement: $1:$2
              target_label: __address__
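
The three relabel rules above can be traced by hand. The following Python sketch is an approximation, not the receiver's implementation: it assumes Prometheus relabeling semantics (source label values joined with `;`, regexes anchored to the whole string, and `replace` left as a no-op when the regex does not match). The function name `apply_relabel_rules` and the sample addresses are hypothetical.

```python
import re

SCRAPE = "__meta_kubernetes_pod_annotation_prometheus_io_scrape"
PATH = "__meta_kubernetes_pod_annotation_prometheus_io_path"
PORT = "__meta_kubernetes_pod_annotation_prometheus_io_port"


def apply_relabel_rules(labels):
    """Return the rewritten labels, or None if the target is dropped."""
    labels = dict(labels)  # work on a copy

    # Rule 1 (keep): drop the target unless the scrape annotation is "true".
    if not re.fullmatch(r"true", labels.get(SCRAPE, "")):
        return None

    # Rule 2 (replace): a non-empty path annotation overrides the metrics path.
    match = re.fullmatch(r"(.+)", labels.get(PATH, ""))
    if match:
        labels["__metrics_path__"] = match.group(1)

    # Rule 3 (replace): a port annotation overrides the port in the address.
    joined = ";".join([labels.get("__address__", ""), labels.get(PORT, "")])
    match = re.fullmatch(r"([^:]+)(?::\d+)?;(\d+)", joined)
    if match:
        labels["__address__"] = f"{match.group(1)}:{match.group(2)}"

    return labels


# Hypothetical Pod discovered at 10.42.0.5:8080 and annotated with port 8888.
target = {"__address__": "10.42.0.5:8080", SCRAPE: "true", PORT: "8888"}
print(apply_relabel_rules(target)["__address__"])  # prints: 10.42.0.5:8888
```

A Pod without the `prometheus.io/scrape` annotation is dropped by the first rule, and a Pod without the path or port annotation keeps whatever path and address service discovery assigned.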

processors:
  batch:
@@ -82,6 +107,10 @@ service:
      receivers: [ journald ]
      processors: [ resourcedetection/system ]
      exporters: [ otlp ]
    metrics/pods:
      receivers: [ prometheus ]
      processors: [ batch ]
      exporters: [ otlp ]
  telemetry:
    metrics:
      readers:
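
The new `metrics/pods` pipeline only works if every component it names is also defined in the top-level `receivers`, `processors`, and `exporters` sections. As a rough illustration, the Python sketch below checks that property on a configuration that has already been loaded into a dict. The function name `undefined_pipeline_components` is hypothetical, and the dict is a trimmed stand-in that assumes the usual `service.pipelines` layout rather than a copy of the full preset.

```python
def undefined_pipeline_components(config):
    """List (pipeline, kind, name) for each referenced but undefined component."""
    missing = []
    pipelines = config.get("service", {}).get("pipelines", {})
    for pipeline, spec in pipelines.items():
        for kind in ("receivers", "processors", "exporters"):
            defined = config.get(kind, {})
            for name in spec.get(kind, []):
                if name not in defined:
                    missing.append((pipeline, kind, name))
    return missing


# Trimmed, illustrative configuration (not the full preset).
config = {
    "receivers": {"prometheus": {}},
    "processors": {"batch": {}},
    "exporters": {"otlp": {}},
    "service": {
        "pipelines": {
            "metrics/pods": {
                "receivers": ["prometheus"],
                "processors": ["batch"],
                "exporters": ["otlp"],
            }
        }
    },
}

print(undefined_pipeline_components(config))  # prints: []
```

Removing the `prometheus` entry from `receivers` in this dict would make the function report `("metrics/pods", "receivers", "prometheus")`, which corresponds to the first requirement in the documentation change: the OTEL configuration must include the `prometheus` receiver.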