Fine-tuning Gemma dies suddenly after about 12 hours. There are no warnings or error messages in the output logs; the process is simply killed.

The script from https://ai.google.dev/gemma/docs/distributed_tuning was being run with nohup.
What debugging steps could I try, or is this a server-side problem?
I experienced the same behaviour on v3-8 devices.
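
One debugging step I could try is logging host memory from inside the training process. This assumes the silent kill comes from the host running out of RAM, which nothing in the logs confirms. Below is a minimal sketch of such a logger. It assumes a Linux host with `/proc/meminfo`; the function names and the `mem_usage.log` path are hypothetical and not part of the tuning guide.

```python
import os
import threading
import time


def parse_meminfo(text):
    """Parse /proc/meminfo-style text into a dict of {field: value in KiB}."""
    info = {}
    for line in text.splitlines():
        name, sep, rest = line.partition(":")
        parts = rest.split()
        # Keep only well-formed "Name:   <number> [kB]" lines.
        if sep and parts and parts[0].isdigit():
            info[name.strip()] = int(parts[0])
    return info


def format_sample(info, now):
    """Build one log line with available memory and the used percentage."""
    total = info.get("MemTotal", 0)
    avail = info.get("MemAvailable", 0)
    used_pct = 100.0 * (total - avail) / total if total else 0.0
    return (
        f"{now:.0f} pid={os.getpid()} "
        f"mem_available_kib={avail} used_pct={used_pct:.1f}"
    )


def start_memory_logger(path="mem_usage.log", interval_s=60.0):
    """Append a host-memory sample to `path` every `interval_s` seconds.

    The file is reopened and closed for every sample so that each line
    reaches the file before the process can be killed.
    """

    def loop():
        while True:
            with open("/proc/meminfo") as f:
                info = parse_meminfo(f.read())
            with open(path, "a") as out:
                out.write(format_sample(info, time.time()) + "\n")
            time.sleep(interval_s)

    # Daemon thread: it never blocks the training script from exiting.
    thread = threading.Thread(target=loop, daemon=True)
    thread.start()
    return thread
```

Calling `start_memory_logger()` once at the top of the tuning script would leave a memory trend in `mem_usage.log`. If `used_pct` climbs steadily until the last sample before the kill, a host out-of-memory kill becomes a plausible cause, and the kernel log (for example via `dmesg`) may contain a matching entry. If memory stays flat, the cause is more likely elsewhere.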