feat: support for creating and managing gpu cluster #1437

Triggered via pull request on August 25, 2025 at 15:22
Status: Failure
Total duration: 35m 43s

build-and-push-images.yaml

on: pull_request
Matrix: Build and Publish Images

Annotations

2 errors
Build and Publish Images (model-initializer, cmd/initializers/model/Dockerfile, linux/amd64,linux...
The self-hosted runner lost communication with the server. Verify the machine is running and has a healthy network connection. Anything in your workflow that terminates the runner process, starves it for CPU/Memory, or blocks its network access can cause this error.
Build and Publish Images (torchtune-trainer, cmd/trainers/torchtune/Dockerfile, linux/amd64,linux...
The self-hosted runner lost communication with the server. Verify the machine is running and has a healthy network connection. Anything in your workflow that terminates the runner process, starves it for CPU/Memory, or blocks its network access can cause this error.