While debugging #23668 for Windows, I ran into some weird UX when the executor process crashes. The Task Event we write is an error that bubbles up from the go-plugin connection to the executor.
Recent Events:
Time                  Type        Description
2024-10-14T18:41:59Z  Restarting  Task restarting in 15.884342135s
2024-10-14T18:41:59Z  Terminated  Exit Code: 0, Exit Message: "executor: error waiting on process: rpc error: code = Unavailable desc = error reading from server: read tcp 127.0.0.1:50900->127.0.0.1:14000: wsarecv: An existing connection was forcibly closed by the remote host."
2024-10-14T18:33:59Z  Started     Task started by client
2024-10-14T18:33:59Z  Task Setup  Building Task Directory
On Linux the error message is similar:
2024-10-15T12:19:18-04:00 Terminated Exit Code: 0, Exit Message: "executor: error waiting on process: rpc error: code = Unavailable desc = error reading from server: EOF"
There are two problems here:
- The exit code should not be displayed as 0.
- We should keep the detailed error message because it's helpful to us as Nomad developers, but the "executor: error waiting on process" prefix should explain what actually happened: the executor process crashed.