feat: support for creating and managing gpu cluster #1437
build-and-push-images.yaml
on: pull_request
Matrix: Build and Publish Images
Annotations
2 errors
Build and Publish Images (model-initializer, cmd/initializers/model/Dockerfile, linux/amd64,linux...
The self-hosted runner lost communication with the server. Verify the machine is running and has a healthy network connection. Anything in your workflow that terminates the runner process, starves it for CPU/Memory, or blocks its network access can cause this error.
Build and Publish Images (torchtune-trainer, cmd/trainers/torchtune/Dockerfile, linux/amd64,linux...
The self-hosted runner lost communication with the server. Verify the machine is running and has a healthy network connection. Anything in your workflow that terminates the runner process, starves it for CPU/Memory, or blocks its network access can cause this error.