Karabo Dask Util incompatible with Calculon #676
Open
Labels
Calculon: Things related to our HPC cluster.
bug: Something isn't working
Description
Summary
Some code in karabo/util/dask.py seems to assume that the trailing part of a Slurm node name is always a number:
ValueError: invalid literal for int() with base 10: 'g-009'
This error is shown on Calculon. On Daint.Alps you get:
Dashboard link: http://172.28.15.84:8787/status
<Client: 'inproc://172.28.15.84/210171/1' processes=1 threads=288, memory=849.61 GiB>
This is the expected output, though the numbers may vary.
How to reproduce
- Sign in to Calculon
- Make sure you have Karabo installed (using a conda environment)
- Activate the environment
- Save the following script as dask_hello_karabo.py:
from karabo.util.dask import DaskHandler
client = DaskHandler.get_dask_client()
print(client)
- Run it on the performance partition:
srun -p performance python dask_hello_karabo.py
- You should see the following error (the node name may vary):
Traceback (most recent call last):
  File "/mnt/nas05/clusterdata01/home2/andreas/tests/python-hpc-workshop/dask/dask_hello_karabo.py", line 9, in <module>
    client = DaskHandler.get_dask_client()
  File "/home2/andreas/.conda/envs/karabo/lib/python3.10/site-packages/karabo/util/dask.py", line 680, in get_dask_client
    return cls._handler.get_dask_client()
  File "/home2/andreas/.conda/envs/karabo/lib/python3.10/site-packages/karabo/util/dask.py", line 250, in get_dask_client
    if not cls._setup_called and cls.is_first_node():
  File "/home2/andreas/.conda/envs/karabo/lib/python3.10/site-packages/karabo/util/dask.py", line 586, in is_first_node
    return cls.get_node_id() == cls._get_lowest_node_id()
  File "/home2/andreas/.conda/envs/karabo/lib/python3.10/site-packages/karabo/util/dask.py", line 568, in get_node_id
    return int(slurmd_nodename[-len_id:])
ValueError: invalid literal for int() with base 10: 'g-009'
srun: error: calc-g-009: task 0: Exited with exit code 1
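The traceback suggests that a fixed-width slice of the node name is passed straight to int(), which fails as soon as the slice contains a non-digit such as 'g-'. The following is only a minimal sketch of one possible workaround, not Karabo's actual code: parse_node_id is a hypothetical helper, the slice width of 5 is inferred from the 'g-009' in the error message, and 'nid001234' is a made-up node name used for illustration.

```python
import re


def parse_node_id(nodename: str) -> int:
    """Return the trailing run of digits in a Slurm node name as an int.

    Hypothetical helper for illustration; not part of karabo.util.dask.
    """
    match = re.search(r"(\d+)$", nodename)
    if match is None:
        raise ValueError(f"no trailing digits in node name: {nodename!r}")
    return int(match.group(1))


# A fixed-width slice (width 5 is an assumption inferred from the error
# message) picks up non-digit characters and reproduces the failure.
try:
    int("calc-g-009"[-5:])  # the slice is 'g-009'
except ValueError as err:
    print(f"fixed-width slice fails: {err}")

# Matching only the trailing digits handles both naming styles.
print(parse_node_id("calc-g-009"))  # 9
print(parse_node_id("nid001234"))   # 1234 (made-up node name)
```

One caveat with this approach: if a cluster has node names that differ only in the non-numeric part, their numeric suffixes would collide, so comparing full node names might be a safer way to pick the first node.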