The Macs are down again:
https://farmer.golang.org/status/macs
# "macs" status: MacStadium Mac VMs
# Notes: https://github.com/golang/build/tree/master/env/darwin/macstadium
Warn: macstadium_host01a missing, not seen for 46h18m23s
Warn: macstadium_host01b missing, not seen for 54h25m13s
Warn: macstadium_host02a missing, not seen for 54h25m0s
Warn: macstadium_host02b missing, not seen for 48h3m37s
Warn: macstadium_host04b missing, not seen for 47h55m10s
Warn: macstadium_host07a missing, not seen for 46h17m36s
Warn: macstadium_host08a missing, not seen for 48h0m48s
Warn: macstadium_host08b missing, not seen for 46h9m34s
Warn: macstadium_host09a missing, not seen for 46h23m44s
Warn: macstadium_host10a missing, not seen for 112h46m24s
Warn: macstadium_host10b missing, not seen for 112h47m30s
Error: 11 machines missing, 55% of capacity
Looking at the macstadiumd host's logs:
gopher@godns:~$ sudo journalctl -f -u makemac
-- Logs begin at Wed 2019-06-05 07:30:30 PDT. --
Jun 05 08:24:56 godns makemac[2341]: 2019/06/05 08:24:56 served cached buildlet of "97a16ac063b06959ba54c187354b7f12"
Jun 05 08:24:56 godns makemac[2341]: 2019/06/05 08:24:56 getting VMWare state: Reading /MacStadium-ATL/host/MacMini_Cluster: EOF
Jun 05 08:24:57 godns makemac[2341]: 2019/06/05 08:24:57 served cached buildlet of "97a16ac063b06959ba54c187354b7f12"
Jun 05 08:24:57 godns makemac[2341]: 2019/06/05 08:24:57 served cached buildlet of "97a16ac063b06959ba54c187354b7f12"
Jun 05 08:24:58 godns makemac[2341]: 2019/06/05 08:24:58 getting VMWare state: Reading /MacStadium-ATL/host/MacMini_Cluster: EOF
Jun 05 08:24:59 godns makemac[2341]: 2019/06/05 08:24:59 served cached buildlet of "97a16ac063b06959ba54c187354b7f12"
Jun 05 08:24:59 godns makemac[2341]: 2019/06/05 08:24:59 served cached buildlet of "97a16ac063b06959ba54c187354b7f12"
Jun 05 08:25:00 godns makemac[2341]: 2019/06/05 08:25:00 served cached buildlet of "97a16ac063b06959ba54c187354b7f12"
Jun 05 08:25:00 godns makemac[2341]: 2019/06/05 08:25:00 served cached buildlet of "97a16ac063b06959ba54c187354b7f12"
Jun 05 08:25:01 godns makemac[2341]: 2019/06/05 08:25:01 getting VMWare state: Reading /MacStadium-ATL/host/MacMini_Cluster: EOF
Jun 05 08:25:02 godns makemac[2341]: 2019/06/05 08:25:02 served cached buildlet of "97a16ac063b06959ba54c187354b7f12"
Jun 05 08:25:03 godns makemac[2341]: 2019/06/05 08:25:03 served cached buildlet of "97a16ac063b06959ba54c187354b7f12"
Jun 05 08:25:03 godns makemac[2341]: 2019/06/05 08:25:03 getting VMWare state: Reading /MacStadium-ATL/host/MacMini_Cluster: EOF
Jun 05 08:25:03 godns makemac[2341]: 2019/06/05 08:25:03 served cached buildlet of "97a16ac063b06959ba54c187354b7f12"
Jun 05 08:25:04 godns makemac[2341]: 2019/06/05 08:25:04 served cached buildlet of "97a16ac063b06959ba54c187354b7f12"
Jun 05 08:25:05 godns makemac[2341]: 2019/06/05 08:25:05 served cached buildlet of "97a16ac063b06959ba54c187354b7f12"
Jun 05 08:25:05 godns makemac[2341]: 2019/06/05 08:25:05 getting VMWare state: Reading /MacStadium-ATL/host/MacMini_Cluster: EOF
Jun 05 08:25:06 godns makemac[2341]: 2019/06/05 08:25:06 served cached buildlet of "97a16ac063b06959ba54c187354b7f12"
Jun 05 08:25:06 godns makemac[2341]: 2019/06/05 08:25:06 served cached buildlet of "97a16ac063b06959ba54c187354b7f12"
Jun 05 08:25:07 godns makemac[2341]: 2019/06/05 08:25:07 getting VMWare state: Reading /MacStadium-ATL/host/MacMini_Cluster: EOF
Jun 05 08:25:07 godns makemac[2341]: 2019/06/05 08:25:07 served cached buildlet of "97a16ac063b06959ba54c187354b7f12"
Jun 05 08:25:08 godns makemac[2341]: 2019/06/05 08:25:08 served cached buildlet of "97a16ac063b06959ba54c187354b7f12"
Jun 05 08:25:09 godns makemac[2341]: 2019/06/05 08:25:09 served cached buildlet of "97a16ac063b06959ba54c187354b7f12"
Jun 05 08:25:09 godns makemac[2341]: 2019/06/05 08:25:09 getting VMWare state: Reading /MacStadium-ATL/host/MacMini_Cluster: EOF
Jun 05 08:25:10 godns makemac[2341]: 2019/06/05 08:25:10 served cached buildlet of "97a16ac063b06959ba54c187354b7f12"
Jun 05 08:25:10 godns makemac[2341]: 2019/06/05 08:25:10 served cached buildlet of "97a16ac063b06959ba54c187354b7f12"
The repeated EOF errors while reading the VMware state suggest that something is wrong with the cluster.
Related: since the coordinator now polls the makemac JSON status URL (which is currently reporting healthy), we should include errors like "getting VMWare state: Reading /MacStadium-ATL/host/MacMini_Cluster: EOF" in the makemac daemon's status response JSON, so they can be shown in the coordinator health output.