CI

CI #4772

Triggered via schedule August 23, 2025 09:31
Status Failure
Total duration 6h 37m 45s
Artifacts 43

ci.yaml

on: schedule
metadata: 3s
bump-manifest: 12s
Matrix: amd64 / test-distribution
Matrix: arm64 / test-distribution
amd64 / build-base / build-base: 3m 31s
arm64 / build-base / build-base: 4m 31s
amd64 / test-nccl / build-mpi-operator-compatible-base: 1m 39s
amd64 / test-nccl / nccl-test-gke / build-nccl-gke: 2m 15s
arm64 / test-nccl / build-mpi-operator-compatible-base
arm64 / test-nccl / nccl-test-gke / build-nccl-gke
Matrix: amd64 / test-jax-cutlass-h100 / jax-cutlass-test-h100
Matrix: amd64 / test-jax / run-unit-test
Matrix: amd64 / test-te-a100 / run-unit-test
Matrix: amd64 / test-te-h100 / te-test-h100
amd64 / test-jax / runner / launch-slurm-runner: 34m 42s
amd64 / test-nsys-jax-eks: 13m 29s
amd64 / test-te-a100 / runner / launch-slurm-runner: 45m 55s
amd64 / build-upstream-t5x: 6m 41s
amd64 / build-axlearn: 5m 3s
Matrix: amd64 / test-nsys-jax / run-unit-test
amd64 / test-nsys-jax / runner / launch-slurm-runner: 59m 38s
Matrix: amd64 / test-nccl / nccl-test
Matrix: amd64 / test-nccl / nccl-test-gke / nccl-gke
Matrix: arm64 / test-jax-cutlass-h100 / jax-cutlass-test-h100 (waiting for pending jobs)
Matrix: arm64 / test-jax / run-unit-test (waiting for pending jobs)
Matrix: arm64 / test-te-a100 / run-unit-test (waiting for pending jobs)
Matrix: arm64 / test-te-h100 / te-test-h100 (waiting for pending jobs)
arm64 / test-nsys-jax-eks: 0s
arm64 / test-jax / runner / launch-slurm-runner
arm64 / test-te-a100 / runner / launch-slurm-runner
arm64 / build-upstream-t5x: 9m 16s
arm64 / build-axlearn: 7m 5s
Matrix: arm64 / test-nsys-jax / run-unit-test (waiting for pending jobs)
arm64 / test-nsys-jax / runner / launch-slurm-runner
Matrix: arm64 / test-nccl / nccl-test (waiting for pending jobs)
Matrix: arm64 / test-nccl / nccl-test-gke / nccl-gke (waiting for pending jobs)
amd64 / test-maxtext-gke / maxtext-gke-xpk: 1m 9s
Matrix: amd64 / test-maxtext / maxtext-multinode
Matrix: amd64 / test-maxtext / single-process-multi-device
amd64 / build-rosetta-t5x / build-rosetta: 14m 6s
amd64 / test-axlearn-eks: 6h 0m
amd64 / test-axlearn-fuji-models-eks: 3h 7m
Matrix: amd64 / test-nsys-jax-archive
arm64 / test-maxtext-gke / maxtext-gke-xpk
Matrix: arm64 / test-maxtext / maxtext-multinode (waiting for pending jobs)
Matrix: arm64 / test-maxtext / single-process-multi-device (waiting for pending jobs)
arm64 / build-rosetta-t5x / build-rosetta: 15m 19s
arm64 / test-axlearn-eks: 0s
arm64 / test-axlearn-fuji-models-eks: 0s
Matrix: arm64 / test-nsys-jax-archive
amd64 / test-maxtext / test-maxtext-metrics: 20s
amd64 / collect-docker-tags: 2s
Matrix: amd64 / test-rosetta-t5x / vit-multi-gpu-multi-node
arm64 / test-maxtext / test-maxtext-metrics
arm64 / collect-docker-tags: 8s
Matrix: arm64 / test-rosetta-t5x / vit-multi-gpu-multi-node (waiting for pending jobs)
amd64 / test-maxtext / test-maxtext-sitrep / sitrep: 12s
amd64 / test-rosetta-t5x / test-t5x-rosetta-summary: 2s
amd64 / test-rosetta-t5x / test-t5x-rosetta-metrics: 18s
arm64 / test-maxtext / test-maxtext-sitrep / sitrep
arm64 / test-rosetta-t5x / test-t5x-rosetta-summary
arm64 / test-rosetta-t5x / test-t5x-rosetta-metrics
amd64 / test-maxtext / test-maxtext-outcome: 2s
amd64 / test-rosetta-t5x / test-t5x-rosetta-sitrep / sitrep: 6s
arm64 / test-maxtext / test-maxtext-outcome
arm64 / test-rosetta-t5x / test-t5x-rosetta-sitrep / sitrep
amd64 / test-rosetta-t5x / test-t5x-rosetta-outcome: 4s
arm64 / test-rosetta-t5x / test-t5x-rosetta-outcome
make-publish-configs: 5s
merge-new-manifest: 8s
Matrix: publish-containers
finalize / workflow-badge: 9s
finalize / report: 22s
finalize / upload-badge: 10s
finalize / publish-badge: 6s

Annotations

18 errors, 2 warnings, and 1 notice
amd64 / test-maxtext-gke / maxtext-gke-xpk: Process completed with exit code 1.
amd64 / test-nccl / nccl-test-gke / nccl-gke (reduce_scatter_perf_mpi): The strategy configuration was canceled because "amd64.test-nccl.nccl-test-gke.nccl-gke.broadcast_perf_mpi" failed
amd64 / test-nccl / nccl-test-gke / nccl-gke (broadcast_perf_mpi): Process completed with exit code 1.
amd64 / test-nccl / nccl-test-gke / nccl-gke (all_reduce_perf_mpi): The strategy configuration was canceled because "amd64.test-nccl.nccl-test-gke.nccl-gke.broadcast_perf_mpi" failed
amd64 / test-nccl / nccl-test-gke / nccl-gke (all_gather_perf_mpi): The strategy configuration was canceled because "amd64.test-nccl.nccl-test-gke.nccl-gke.broadcast_perf_mpi" failed
amd64 / test-te-h100 / te-test-h100 (unittest, 8): Process completed with exit code 1.
amd64 / test-maxtext / test-maxtext-outcome: Process completed with exit code 1.
amd64 / test-rosetta-t5x / test-t5x-rosetta-metrics: Process completed with exit code 1.
amd64 / test-rosetta-t5x / test-t5x-rosetta-outcome: Process completed with exit code 1.
amd64 / test-nccl / nccl-test (all_reduce_perf_mpi): Process completed with exit code 1.
amd64 / test-nccl / nccl-test (reduce_scatter_perf_mpi): The operation was canceled.
amd64 / test-nccl / nccl-test (reduce_scatter_perf_mpi): The strategy configuration was canceled because "amd64.test-nccl.nccl-test.all_reduce_perf_mpi" failed
amd64 / test-nccl / nccl-test (all_gather_perf_mpi): The operation was canceled.
amd64 / test-nccl / nccl-test (all_gather_perf_mpi): The strategy configuration was canceled because "amd64.test-nccl.nccl-test.all_reduce_perf_mpi" failed
amd64 / test-nccl / nccl-test (broadcast_perf_mpi): The operation was canceled.
amd64 / test-nccl / nccl-test (broadcast_perf_mpi): The strategy configuration was canceled because "amd64.test-nccl.nccl-test.all_reduce_perf_mpi" failed
amd64 / test-axlearn-eks: The operation was canceled.
amd64 / test-axlearn-eks: The job has exceeded the maximum execution time of 6h0m0s
merge-new-manifest: Unexpected input(s) 'owner_and_repo', valid inputs are ['route', 'mediaType']
merge-new-manifest: Unexpected input(s) 'owner_and_repo', 'head', 'base', 'body', 'title', 'draft', valid inputs are ['route', 'mediaType']
amd64 / test-nsys-jax-archive (macOS-latest): The macos-latest label will migrate to macOS 15 beginning August 4, 2025. For more information see https://github.com/actions/runner-images/issues/12520

Artifacts

Produced during runtime
Name  Size  Digest
artifact-axlearn-build-amd64  568 Bytes  sha256:82572e5fa20fb6b519a42e00a96a77c49799bf38ed02f5010d47fbfd042e3a69
artifact-axlearn-build-arm64  566 Bytes  sha256:f059156fdcf053649dbff87deef45d26297d069975f0a5d04bd33270203f0a86
artifact-base-build-amd64  568 Bytes  sha256:f3908e66dd10a562936a462ffb1c8d0534e8872f8dc300178bd125dcbb00bcbb
artifact-base-build-arm64  566 Bytes  sha256:6026d8e29c4d78df398f177204b3ed9c558e14e6fa287ef132f20660b4bb4455
artifact-equinox-build-amd64  571 Bytes  sha256:36a733fb866f2bd76af01a11d36abd9f5510a191bde09446d1865d35c3854d9c
artifact-equinox-build-arm64  568 Bytes  sha256:525fae017d96c0418cae39e7d0f118f80225431351c7831875ba1fa188187255
artifact-final-report  3.64 KB  sha256:5f29467d3fa4eff249c49ce19274b1f15470e4fd130bd5686290f5e67c7c6d93
artifact-jax-build-amd64  554 Bytes  sha256:cc583a5ec0e1cb16cdc6783735ec44906ce0e407a56ff9d4347811b3e243a591
artifact-jax-build-arm64  555 Bytes  sha256:f05e79fcf041bb4e2ab79c4a46d5db50977a7934b1dfa6bae5c5851b9a6ab611
artifact-maxtext-build-amd64  567 Bytes  sha256:590513012c8298562a71c5b014f259faac08f06a5da9bab96894dd2de52efe1b
artifact-maxtext-build-arm64  568 Bytes  sha256:9e478df41e924d43a0e4121544e5e453cba34ccf9d041d6351072d84d5c0206a
artifact-maxtext-test  1.46 KB  sha256:90bad42b0b03d8a35844f8a65e7ed704eb5b196459a845c493129ce470f7d784
artifact-mpi-operator-compatible-base-build-amd64  639 Bytes  sha256:754172d97cb8e2439ae657ae864fd63db1576ab5db59ca6777d14123998f027d
artifact-nccl-gke-build-amd64  571 Bytes  sha256:4a9e745b37eceb78388f4a6e2d5d17146d7b118795c0f302852fff461f1c5bd0
artifact-rosetta-build-t5x-amd64  585 Bytes  sha256:2b026073f5d71d58fb29a67786fc085e01dce7eafd2cf4a28c0d2be1284ee689
artifact-rosetta-build-t5x-arm64  586 Bytes  sha256:8538dbb419e5f7ba8aa2e3dd7631b3006682c2a9cc89244905e978f8b0bc11a8
artifact-rosetta-t5x-mgmn-test  624 Bytes  sha256:7298a6055202915f5fb5b752177a2fa70dbd8dbd35b8811324c60ed2ca5b3265
artifact-t5x-build-amd64  569 Bytes  sha256:8757e4b0aa3f2c0af3172a9da62b1d508f89d4b16ca823091a4e93168f5b3240
artifact-t5x-build-arm64  568 Bytes  sha256:6f37a4874648d507464650f7187d1a4af791b49e3e8c3ea5e7d0111045e2a4e1
artifact-workflow-metadata  278 Bytes  sha256:ab66d176c10b0ac461d16bd1697f9ed5e1d346d3f8cf7d8cd0012206eb0268ba
bumped-manifest  47 KB  sha256:88630ecad2cb244c9e944f5760edd7684e2ae4d08cec4b3e62fc3f604f063296
final-axlearn  258 Bytes  sha256:ff951c8f13e680a452bb53610fabb90f42e413915eca5b4718902aed7e58f2a8
final-base  249 Bytes  sha256:23a83a8bb8dd798fd547bb4f121a8bcc3422eed58776b817c76ed16f968e1e4e
final-equinox  258 Bytes  sha256:0dc72348e1d44e5203e62406c8afd12c4a54a1f1e6509ff76e8b4ffe744a61ab
final-jax  246 Bytes  sha256:d6bc8bcaf0ecf96837a92fbfe7d2b4e9e2ecee4c1f48ab222e88dfbd34420e28
final-maxtext  258 Bytes  sha256:c969109a9ccfd598cdd3918f2ad464d678a3ccaedbdc2f1b86658505bada4a66
final-t5x  246 Bytes  sha256:24a7cd5d668ce87aa7cb2cc5f38e9ec27b68eb230986d4d72f16062a8083e694
final-upstream-t5x  273 Bytes  sha256:e7e357d7089e554b5f6c6c4c75965ce12f30bf3792d52b127972e6cd7b18e7c0
jax-cutlass-test-H100  1.24 KB  sha256:70df000b37fc45c47754ce2bd68300ad6f17b16f390a75aee9d91ae246a109a9
jax-unit-test-A100  22 KB  sha256:af84ae43cd52d6b5f63484a654f7ff8b1814679f0653333c05437339d02b466e
mealkit-axlearn  269 Bytes  sha256:9b5b6d4dcdbb783f9e960aa19d5879cc275eb770df267e4d851c1a3f07d7b3df
mealkit-equinox  269 Bytes  sha256:9465bbde76907c89cb0c857ffe9a4116ad5c7006ad3a043bbe226b73dc577531
mealkit-jax  256 Bytes  sha256:ed2a400173cf29b3e32870cdaab09f953489b0103253b598dc4e415413d0ab83
mealkit-maxtext  269 Bytes  sha256:41d5d3a6fb308da4e4730af143d6e0e3f253cbe22f09c7ede31a3e8f2e171bee
mealkit-t5x  258 Bytes  sha256:d0330e7a036bead514964573bb5d6c76b856387e59c277af2f40124239846a97
mealkit-upstream-t5x  283 Bytes  sha256:ff8cd97210ab8f82fa9ffa4815a73dcbf57661f2d8e20ccddd607a0c1da46bfb
nsys-jax-unit-test-A100  32 MB  sha256:690cbef45052542d7f51e41e060e599a42190f2e8354174e25c403c87c08dd86
rosetta-t5x-vit-17173919654-VIT8G1N  14.7 KB  sha256:f7692029ccb2a1a49bb03b3aed68bfb3a22cc2287238da8df75817981891b5ff
te-unit-test-A100  1.36 MB  sha256:5863ae3324e910a985b98d0c93712646ead32fa2a9f9a1cc7af7b5f5b509a405
te-unit-test-H100  1.74 MB  sha256:abf7ff3babd010b7caa9434824c8dd943c035097c6a03f046056ccf64db4e8e0
upstream-maxtext-17173919654-1DP2FSDP4TP1PP_single_process  21.5 KB  sha256:38875cc938fa475a2172287b682ef6c7897a1e0c95a31f3751100392687f0464
upstream-maxtext-17173919654-2DP2FSDP2TP1PP  28 KB  sha256:d9ba729a50802dcb01d41b0fb29d7f23b5d0c175f2464bd9a1d036090e0eec5d
upstream-maxtext-metrics-test-log  2.51 KB  sha256:2a345a10374376248d287832a0a665b14d8dca45fff75a575c646a6c657998c8
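
A downloaded artifact can be checked against the digests listed above. The following is a minimal Python sketch, assuming the listed digest is the SHA-256 of the downloaded archive file exactly as delivered (not of its extracted contents). The function names `sha256_digest` and `matches_listed_digest` are hypothetical helpers introduced here, not part of the workflow.

```python
import hashlib


def sha256_digest(path, chunk_size=1 << 20):
    """Return 'sha256:<hex>' for the file at `path`, reading it in chunks."""
    hasher = hashlib.sha256()
    with open(path, "rb") as handle:
        # Read fixed-size chunks so large artifacts are not loaded into memory at once.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            hasher.update(chunk)
    return "sha256:" + hasher.hexdigest()


def matches_listed_digest(path, listed_digest):
    """Compare a local file against a digest string copied from the artifact list."""
    # Normalize whitespace and case so a pasted digest compares cleanly.
    return sha256_digest(path) == listed_digest.strip().lower()
```

For example, with a hypothetical local file name, `matches_listed_digest("bumped-manifest.zip", "sha256:88630ecad2cb244c9e944f5760edd7684e2ae4d08cec4b3e62fc3f604f063296")` would return `True` only if the file hashes to the digest listed for `bumped-manifest`.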