Conversation

@dakotabenjamin
Member

What type of PR is this? (check all applicable)

  • πŸ• Feature
  • πŸ› Bug Fix
  • πŸ“ Documentation
  • πŸ§‘β€πŸ’» Refactor
  • βœ… Test
  • πŸ€– Build or CI
  • ❓ Other (please specify)

Describe this PR

The raster renderer has been crashing with just a single user (me) making calls such as:

[https://api.imagery.hotosm.org/raster/searches/867568a0734dd1f665da93a5a5e2a5ec/tiles/WebMercatorQuad/{z}/{x}/{y}@1x.png?assets=visual](https://api.imagery.hotosm.org/raster/searches/867568a0734dd1f665da93a5a5e2a5ec/tiles/WebMercatorQuad/%7Bz%7D/%7Bx%7D/%7By%[email protected]?assets=visual)

Default cpu/mem config for the pods:

    Limits:
      cpu:     768m
      memory:  4Gi
    Requests:
      cpu:     256m
      memory:  3Gi

0.25 of a vCPU is probably way too small; the new config (below) should fix most of the problems.

      requests:
        cpu: "1024m"
        memory: "3Gi"
      limits:
        cpu: "2048m"
        memory: "4Gi"

Additionally, we enable autoscaling for the raster-eoapi pod. The current scaling target is 85% CPU usage; this needs discussion and potentially load testing.
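
For reference, the autoscaling and resource values might sit together in the chart values roughly like the sketch below; the exact key names, minReplicas, and maxReplicas are assumptions pieced together from the snippets quoted in the review comments, not copied from the diff:

    raster:
      autoscaling:
        enabled: true
        minReplicas: 1                        # assumption
        maxReplicas: 10                       # assumption, based on the "scale to 10" note in the review below
        targetCPUUtilizationPercentage: 85    # the 85% CPU target mentioned above
      resources:
        requests:
          cpu: "1024m"
          memory: "3Gi"
        limits:
          cpu: "2048m"
          memory: "4Gi"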

Screenshots

Grafana dashboard showing some of the metrics leading to these changes: [image]

@github-actions

github-actions bot commented Nov 5, 2025

tofu plan -chdir=terraform -var-file=vars/production.tfvars
No changes. Your infrastructure matches the configuration.
By @dakotabenjamin at 2025-11-05T17:23:47Z (view log).

OpenTofu has compared your real infrastructure against your configuration and
found no differences, so no changes are needed.

    raster:
      autoscaling:
        enabled: false

This configures autoscaling for the 'raster' pod instances in eoAPI, but note I don't think we have any autoscaling for the underlying nodes yet.

So it will try to scale the pods, but will likely run out of resources on the single worker node we have attached.

We probably need Karpenter installed to autoscale nodes in AWS πŸ‘
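
If we do go the Karpenter route, the node-provisioning side would look roughly like the sketch below (a minimal NodePool shape for illustration only; the name, requirements, limits, and referenced EC2NodeClass are all placeholders, not a proposal for this cluster):

    apiVersion: karpenter.sh/v1
    kind: NodePool
    metadata:
      name: raster-workers            # placeholder name
    spec:
      template:
        spec:
          requirements:
            - key: kubernetes.io/arch
              operator: In
              values: ["amd64"]
          nodeClassRef:
            group: karpenter.k8s.aws
            kind: EC2NodeClass
            name: default             # placeholder; needs a matching EC2NodeClass
      limits:
        cpu: "16"                     # cap on total provisioned CPU, placeholder
        memory: 64Gi                  # placeholder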

    resources:
      requests:
        cpu: "1024m"
        memory: "3Gi"

Note that each pod will require 3GB available for scaling to work; the only way to scale to 10 replicas is to have 30GB of RAM available across the worker nodes.

It could be OK once we have a lot more nodes and resources, but I would keep the reservations (requests) low and the limits higher.
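
Something like the shape below is what I mean, with the request numbers purely illustrative rather than a concrete proposal:

    resources:
      requests:
        cpu: "512m"        # illustrative: a lower reservation so more replicas can be scheduled per node
        memory: "1Gi"      # illustrative
      limits:
        cpu: "2048m"       # keep the limits higher so individual pods can still burst
        memory: "4Gi"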
