
Resource Optimization for OpenShift: Architecture Overview

Suraj Patil edited this page Apr 24, 2024 · 9 revisions

Introduction

The Red Hat Insights resource optimization for OpenShift service lets OpenShift users monitor their cluster usage and act on the service's recommendations to optimize that usage.

[Figure: ROS for OpenShift architecture diagram]

Workflow

  1. The cost management operator runs PromQL queries against the Prometheus server to collect container metrics, then packages the query results into a tar archive.
  2. The operator uploads the tar archive to console.redhat.com with the header set to hccm.
  3. When the ingress service receives the archive, it uploads it to an S3 bucket and sends the request metadata and the S3 URL of the archive to the platform.upload.announce Kafka topic.
  4. The Koku listener (cost-management) application reads the messages from the announce topic, downloads the report from the S3 bucket, and extracts the metric files from the report.
  5. Once the metrics needed by ROS-OCP are extracted from the archive, Koku uploads them to another S3 bucket and sends the request metadata and the S3 URL of the metrics file to the hccm.ros.events Kafka topic.
  6. The ROS-OCP processor consumes messages from the hccm.ros.events Kafka topic and downloads the files from the S3 URL given in each message.
  7. The ROS-OCP processor parses the downloaded file and aggregates the metrics needed by the Kruize application. It calls POST /createExperiment to create an experiment in Kruize for a particular workload, then calls POST /updateResults to forward the workload metrics to the Kruize service. Finally, the processor publishes a message containing experiment_name and max_end_timestamp to the rosocp.kruize.recommendations Kafka topic.
  8. The recommendation-poller service consumes these messages from the rosocp.kruize.recommendations topic and calls the POST /updateRecommendation API to generate the recommendation for that experiment. This is repeated for every experiment (k8s deployment).
  9. ros-ocp-backend also contains an API server that serves incoming requests from users.
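
The hand-off between the processor (step 7) and the recommendation poller (step 8) can be sketched as follows. This is a minimal in-memory illustration, not the ros-ocp-backend implementation: the function names, the FakeKruize stand-in, the plain list used in place of the Kafka topic, the row shape, and every payload field other than experiment_name and max_end_timestamp are assumptions made for the example.

```python
import json
from datetime import datetime, timezone

RECOMMENDATION_TOPIC = "rosocp.kruize.recommendations"  # topic name from the workflow


class FakeKruize:
    """Hypothetical stand-in that only records which endpoints were called."""

    def __init__(self):
        self.calls = []

    def post(self, path, data):
        self.calls.append((path, data))


def process_workload(kruize, topic, experiment_name, rows):
    """Step 7: create the experiment, forward metrics, then announce it.

    `rows` is a list of dicts carrying an `interval_end` datetime; this
    shape is an assumption, not the real metrics schema.
    """
    kruize.post("/createExperiment", {"experiment_name": experiment_name})
    kruize.post("/updateResults", {"experiment_name": experiment_name, "metrics": rows})
    # The announcement carries the latest interval end seen for the workload.
    max_end = max(row["interval_end"] for row in rows)
    message = {
        "experiment_name": experiment_name,
        "max_end_timestamp": max_end.isoformat(),
    }
    topic.append(json.dumps(message))  # a list stands in for the Kafka topic
    return message


def poll_recommendations(kruize, topic):
    """Step 8: request a recommendation for every announced experiment."""
    while topic:
        message = json.loads(topic.pop(0))
        kruize.post("/updateRecommendation", message)


if __name__ == "__main__":
    kruize, topic = FakeKruize(), []
    rows = [{"interval_end": datetime(2024, 4, 24, 10, 15, tzinfo=timezone.utc)}]
    process_workload(kruize, topic, "example-experiment", rows)
    poll_recommendations(kruize, topic)
    print([path for path, _ in kruize.calls])
```

Decoupling the two steps through a topic means metrics ingestion is not blocked while recommendations are being generated.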

Cost Management report/archive structure

The cost management operator creates an archive containing two types of files: a manifest.json file and uuid.csv files.

The manifest.json file contains metadata such as cluster_id, cr_status, and the start and end times. A full example of manifest.json can be found here.

The CSV files contain metrics collected from the Prometheus instance. An example CSV file can be found in cost-mgmt.tar.gz.
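
As a rough illustration of this layout, the sketch below reads a gzip-compressed tar archive and separates the manifest from the CSV metric files using only the Python standard library. The function names, the sample file name, the CSV columns, and the sample values are placeholders invented for the example; only manifest.json, cluster_id, and cr_status come from the description above.

```python
import csv
import io
import json
import tarfile


def read_report(archive_bytes):
    """Return (manifest, {csv_name: rows}) from a .tar.gz archive held in memory."""
    manifest = None
    metrics = {}
    with tarfile.open(fileobj=io.BytesIO(archive_bytes), mode="r:gz") as archive:
        for member in archive.getmembers():
            if not member.isfile():
                continue  # skip directories and other non-file entries
            text = archive.extractfile(member).read().decode("utf-8")
            if member.name.endswith("manifest.json"):
                manifest = json.loads(text)
            elif member.name.endswith(".csv"):
                metrics[member.name] = list(csv.DictReader(io.StringIO(text)))
    return manifest, metrics


def build_sample_archive():
    """Build a tiny archive for demonstration; all contents are made up."""
    files = {
        "manifest.json": json.dumps({"cluster_id": "example-cluster", "cr_status": {}}),
        "11111111-2222-3333-4444-555555555555.csv": "container_name,cpu_usage\nweb,0.25\n",
    }
    buffer = io.BytesIO()
    with tarfile.open(fileobj=buffer, mode="w:gz") as archive:
        for name, content in files.items():
            data = content.encode("utf-8")
            info = tarfile.TarInfo(name=name)
            info.size = len(data)
            archive.addfile(info, io.BytesIO(data))
    return buffer.getvalue()


if __name__ == "__main__":
    manifest, metrics = read_report(build_sample_archive())
    print(manifest["cluster_id"], list(metrics))
```

Reading from an in-memory buffer keeps the example self-contained; a real consumer would more likely open the downloaded file from disk or stream it from object storage.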
