Releases · NVIDIA-NeMo/Evaluator

05 Feb 01:37

ko3n1g

nemo-evaluator-v0.1.77

9ce6084

nemo-evaluator-v0.1.77

Merge branch 'deploy-release/bef4b952-c0f3-40fa-b5fd-320d86b86e37'

Assets 2

05 Feb 01:37

ko3n1g

nemo-evaluator-launcher-v0.1.78

9ce6084

NVIDIA NeMo Evaluator Launcher 0.1.78 Latest

nemo-evaluator-launcher-v0.1.78

Merge branch 'deploy-release/bef4b952-c0f3-40fa-b5fd-320d86b86e37'

Assets 2

04 Feb 01:37

ko3n1g

nemo-evaluator-v0.1.76

b4261b2

NVIDIA NeMo Evaluator 0.1.76

nemo-evaluator-v0.1.76

feat(slurm): add launcher_install_cmd option for custom auto-export i…

Assets 2

04 Feb 01:37

ko3n1g

nemo-evaluator-launcher-v0.1.77

b4261b2

NVIDIA NeMo Evaluator Launcher 0.1.77

nemo-evaluator-launcher-v0.1.77

feat(slurm): add launcher_install_cmd option for custom auto-export i…

Assets 2

03 Feb 01:37

ko3n1g

nemo-evaluator-v0.1.75

089cc9f

NVIDIA NeMo Evaluator 0.1.75

chore: Fix max_walltime docs (#685)

Signed-off-by: Wojciech Prazuch <[email protected]>

Assets 2

03 Feb 01:38

ko3n1g

nemo-evaluator-launcher-v0.1.76

089cc9f

NVIDIA NeMo Evaluator Launcher 0.1.76

chore: Fix max_walltime docs (#685)

Signed-off-by: Wojciech Prazuch <[email protected]>

Assets 2

02 Feb 01:38

ko3n1g

nemo-evaluator-v0.1.74

406923b

NVIDIA NeMo Evaluator 0.1.74

nemo-evaluator-v0.1.74

ci: Fix integration test by avoid writing to read-only test directory…

Assets 2

02 Feb 01:38

ko3n1g

nemo-evaluator-launcher-v0.1.75

406923b

NVIDIA NeMo Evaluator Launcher 0.1.75

nemo-evaluator-launcher-v0.1.75

ci: Fix integration test by avoid writing to read-only test directory…

Assets 2

29 Jan 01:36

ko3n1g

nemo-evaluator-v0.1.73

6a9803a

NVIDIA NeMo Evaluator 0.1.73

fix(slurm): node_array undefined (#671)

## Summary

When running the launcher on Slurm with `deployment.type: none`, the
generated sbatch script could fail at runtime with:

- `line N: nodes_array[0]: unbound variable`

This was triggered by `set -u` (nounset) and an unconditional
`--nodelist ${nodes_array[0]}` in the evaluation client `srun`.

## Impact

- **Configs affected**: any Slurm run with `deployment.type=none` (e.g.,
“target-only” evaluation).
- **Failure mode**: sbatch script exits before launching the evaluation
client.
- **Where observed**: Slurm job log (`slurm_script` / `slurm-%A.log`).



## Direct cause

- The sbatch script enables:
  - `set -u` (treat unset variables as an error)
- The evaluation client `srun` was emitted as:
  - `srun ... --nodelist ${nodes_array[0]} ...`
- `nodes_array` was only defined inside the deployment block
  (`if cfg.deployment.type != "none": ...`).
- Therefore, for `deployment.type=none`, `nodes_array` was undefined and
`${nodes_array[0]}` crashed under nounset.
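
The direct cause can be illustrated with a minimal sketch of a script generator. This is not the launcher's actual code: the function names are hypothetical, and the way `nodes_array` is populated is an assumption. Only the `set -u` line, the conditional deployment block, and the unconditional `--nodelist ${nodes_array[0]}` reference follow the description above.

```python
def render_sbatch_buggy(deployment_type: str) -> str:
    """Hypothetical generator reproducing the bug described above."""
    lines = ["#!/bin/bash", "set -u"]
    if deployment_type != "none":
        # Assumption: the array is filled from scontrol inside the deployment block.
        lines.append('nodes_array=($(scontrol show hostnames "$SLURM_JOB_NODELIST"))')
        lines.append("srun ... --nodelist ${nodes_array[0]} ... &")
    # Emitted unconditionally, even when the block above was skipped.
    lines.append("srun ... --nodelist ${nodes_array[0]} ...")
    return "\n".join(lines)


def uses_undefined_nodes_array(script: str) -> bool:
    """True if the script reads nodes_array without ever assigning it."""
    return "${nodes_array[0]}" in script and "nodes_array=(" not in script


print(uses_undefined_nodes_array(render_sbatch_buggy("none")))
print(uses_undefined_nodes_array(render_sbatch_buggy("vllm")))
```

With `deployment_type="none"` the generated script reads `nodes_array` without defining it, which is the condition `set -u` turns into a hard failure. The value `"vllm"` is only a placeholder for any deployment type other than `none`.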

## Secondary risks (also addressed)

Even when deployment is enabled, `${nodes_array[0]}` can still fail if:

- `$SLURM_JOB_NODELIST` is unset/empty (non-standard environment) or
only `$SLURM_NODELIST` is present.
- `scontrol` is unavailable on the node or not in `PATH`.
- `scontrol show hostnames ...` returns an empty list.

Any of these can result in an empty/unset array index under `set -u`.

## Solution

### Approach

Introduce a **single, always-defined** “node pinning” variable for
single-node sruns:

- `PRIMARY_NODE`

This is resolved at runtime in the sbatch script with safe fallbacks:

1. Prefer `SLURM_JOB_NODELIST`
2. Fallback to `SLURM_NODELIST`
3. Fallback to local `hostname`
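
The fallback order above can be modelled in Python as a rough sketch. The real resolution happens at runtime inside the generated sbatch script, so this only illustrates the decision order. The function names, the injected `expand_hostnames` callable (standing in for `scontrol show hostnames`), and the choice to try the next variable when expansion fails are assumptions, not the launcher's implementation.

```python
from typing import Callable, List, Mapping


def resolve_primary_node(
    env: Mapping[str, str],
    expand_hostnames: Callable[[str], List[str]],
    local_hostname: str,
) -> str:
    """Return a node name that is always defined, following the fallback order."""
    for var in ("SLURM_JOB_NODELIST", "SLURM_NODELIST"):
        nodelist = env.get(var, "")
        if not nodelist:
            continue  # variable unset or empty: try the next source
        try:
            hosts = expand_hostnames(nodelist)
        except OSError:
            hosts = []  # e.g. scontrol missing or not in PATH
        if hosts:
            return hosts[0]
    return local_hostname  # last resort: the local hostname


def fake_expand(nodelist: str) -> List[str]:
    """Test double for hostname expansion; knows a single node list."""
    return {"node[01-02]": ["node01", "node02"]}.get(nodelist, [])


def broken_expand(nodelist: str) -> List[str]:
    """Test double simulating an unavailable scontrol binary."""
    raise FileNotFoundError("scontrol")


print(resolve_primary_node({"SLURM_JOB_NODELIST": "node[01-02]"}, fake_expand, "login1"))
print(resolve_primary_node({}, fake_expand, "login1"))
```

Because the function always returns a non-empty string, a caller can pass the result to `--nodelist` without hitting the unset-variable failure, including in the secondary-risk cases listed earlier.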

---------

Signed-off-by: Alex Gronskiy <[email protected]>

Assets 2

28 Jan 12:31

ko3n1g

nemo-evaluator-v0.1.72

193483d

NVIDIA NeMo Evaluator 0.1.72

fix: restore support for running tasks not listed in FDF (#667)

We improved our validation in the spirit of failing early. However,
this led to an unwanted side effect: we lost support for running tasks
not listed in the FDF with the `harness.task` syntax. Calling evaluation with
this syntax resulted in
`nemo_evaluator.core.utils.MisconfigurationError: Unknown evaluation xxx`.

It stopped working because:
* we run validation (everything passes here)
* then we prepare the config, extracting `task` from `harness.task` and
using it as the evaluation `type`
* we run a second validation, and it fails because we no longer use the
`harness.task` syntax and there is no evaluation called
`task` in the FDF

This PR uses `harness.task` as `type` to make sure it is always valid and
adds a test verifying custom task support. It also removes one redundant
validation.
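
The failure sequence can be sketched with a toy model. Everything here is hypothetical: the `KNOWN` table, the validation rule, and the function names are illustrative only, and `MisconfigurationError` is a local stand-in for the real exception class. The sketch only shows why stripping the harness prefix before a second validation rejects an unlisted task, and why keeping `harness.task` as the `type` does not.

```python
class MisconfigurationError(Exception):
    """Local stand-in for nemo_evaluator.core.utils.MisconfigurationError."""


# Hypothetical definitions: harness name -> tasks explicitly listed for it.
KNOWN = {"my_harness": {"listed_task"}}


def validate(eval_type: str) -> None:
    """Accept listed task names, or any task written as harness.task."""
    if "." in eval_type:
        harness, _task = eval_type.split(".", 1)
        if harness in KNOWN:
            return  # harness.task syntax: the task itself may be unlisted
    elif any(eval_type in tasks for tasks in KNOWN.values()):
        return
    raise MisconfigurationError(f"Unknown evaluation {eval_type}")


def prepare_buggy(eval_type: str) -> str:
    """Old behaviour: drop the harness prefix and keep only the task."""
    return eval_type.split(".", 1)[1] if "." in eval_type else eval_type


def prepare_fixed(eval_type: str) -> str:
    """New behaviour: keep harness.task as the evaluation type."""
    return eval_type


def run_pipeline(eval_type: str, prepare) -> str:
    validate(eval_type)           # first validation passes
    prepared = prepare(eval_type)
    validate(prepared)            # second validation sees the prepared type
    return prepared


print(run_pipeline("my_harness.custom_task", prepare_fixed))
```

Calling `run_pipeline("my_harness.custom_task", prepare_buggy)` raises `MisconfigurationError` in this model, because the second validation only sees `custom_task`.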

---------

Signed-off-by: Marta Stepniewska-Dziubinska <[email protected]>

Assets 2
