TerminalBenchGenerator: logprobs + session ID by li-boxuan · Pull Request #448 · NovaSky-AI/SkyRL

li-boxuan · 2025-10-10T03:49:13Z

With laude-institute/harbor#45, Sandboxes now returns logprobs for the terminus agent, so TerminalBenchGenerator can use them when they are available. This PR doesn't enable `trainer.algorithm.use_tis=true`, so it should be a no-op.
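To illustrate why returning logprobs is a no-op unless `use_tis` is enabled, here is a minimal sketch. It assumes `use_tis` refers to truncated importance sampling, where each token's weight is the ratio of trainer to rollout probability, clipped at a cap. The function name `tis_weights` and the `cap` parameter are hypothetical and are not taken from SkyRL's code.

```python
import math
from typing import List, Optional


def tis_weights(
    train_logprobs: List[float],
    rollout_logprobs: Optional[List[float]],
    use_tis: bool = False,
    cap: float = 2.0,  # hypothetical truncation cap, not a SkyRL default
) -> List[float]:
    """Per-token importance weights, truncated at `cap`.

    With use_tis disabled (or no rollout logprobs), every weight is 1.0,
    so collecting logprobs changes nothing downstream.
    """
    if not use_tis or rollout_logprobs is None:
        return [1.0] * len(train_logprobs)
    if len(train_logprobs) != len(rollout_logprobs):
        raise ValueError("train and rollout logprobs must have the same length")
    # ratio = p_train / p_rollout = exp(logp_train - logp_rollout)
    return [
        min(math.exp(t - r), cap)
        for t, r in zip(train_logprobs, rollout_logprobs)
    ]
```

With the flag off, the rollout logprobs are carried along but never affect the weights, which matches the "no-op" claim above.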

With laude-institute/harbor#50, the Terminus agent now accepts a `session_id` parameter, which shows up in the litellm request body. TerminalBenchGenerator can use this for better routing.
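As a rough illustration of what "better routing" could mean, the sketch below maps a session ID to a fixed inference engine index so that all requests from one trajectory land on the same engine (for example, to reuse its prefix cache). This is an assumption about the intent, not the routing SkyRL actually implements. The names `pick_engine` and `num_engines` are hypothetical.

```python
import hashlib


def pick_engine(session_id: str, num_engines: int) -> int:
    """Deterministically map a session ID to an engine index.

    Uses SHA-256 rather than Python's built-in hash(), which is salted
    per process and would not give stable routing across workers.
    """
    if num_engines <= 0:
        raise ValueError("num_engines must be positive")
    digest = hashlib.sha256(session_id.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_engines


# Every request carrying the same session ID goes to the same engine.
engine_a = pick_engine("trajectory-0", num_engines=4)
engine_b = pick_engine("trajectory-0", num_engines=4)
```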

Note: I don't have access to a GPU yet, so this is untested.

gemini-code-assist

Code Review

This pull request adds support for logprobs and session_id in TerminalBenchGenerator. The changes look good overall, but I have identified a few areas for improvement. Specifically, I've suggested clarifying an error message, adding a validation check to prevent data misalignment between tokens and logprobs, and fixing a potential bug related to how logprobs are handled when they are empty. These changes will improve the robustness and debuggability of the new functionality.
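The alignment and empty-logprobs concerns raised in the review can be sketched as a small guard. This is one possible reading of the suggestions, not the code that was merged. The helper name `normalize_rollout_logprobs` and its arguments are hypothetical.

```python
from typing import List, Optional, Sequence


def normalize_rollout_logprobs(
    response_ids: Sequence[int],
    logprobs: Optional[Sequence[float]],
) -> Optional[List[float]]:
    """Return logprobs aligned with response_ids, or None if unavailable.

    An empty or missing logprobs sequence is treated as "not provided"
    instead of being passed downstream as a zero-length list.
    """
    if not logprobs:
        return None
    if len(logprobs) != len(response_ids):
        # Fail loudly: misaligned tokens and logprobs would silently
        # corrupt any per-token computation that consumes them.
        raise ValueError(
            f"got {len(logprobs)} logprobs for {len(response_ids)} response tokens"
        )
    return list(logprobs)
```

Returning `None` for the empty case lets callers distinguish "no logprobs were returned" from "logprobs were returned for zero tokens".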

skyrl-train/examples/terminal_bench/generator/terminal_bench_generator.py

tyler-griggs

Awesome, thanks.

Co-authored-by: Tyler Griggs <131809874+tyler-griggs@users.noreply.github.com>
li-boxuan added 3 commits October 9, 2025 20:30

TerminalBenchGenerator: logprobs + session ID

1bfa233

Fix type

d30721f

prettify

0365ec7

gemini-code-assist bot reviewed Oct 10, 2025


address gemini reviews

08e2d19

tyler-griggs approved these changes Oct 14, 2025


Update terminal_bench_generator.py

a8e75b6

tyler-griggs merged commit fe08892 into NovaSky-AI:main Oct 14, 2025
1 check failed

erictang000 mentioned this pull request Oct 14, 2025

[bug] run linter for t-bench generator #476

Merged

erictang000 added a commit that referenced this pull request Oct 14, 2025

[bug] run linter for t-bench generator (#476)

60ba98a

Fixes ci failure after #448

li-boxuan pushed a commit to li-boxuan/SkyRL that referenced this pull request Nov 23, 2025

[bug] run linter for t-bench generator (NovaSky-AI#476)

1ce13e8

Fixes ci failure after NovaSky-AI#448

dzorlu pushed a commit to fleet-ai/SkyRL that referenced this pull request Feb 4, 2026

[bug] run linter for t-bench generator (NovaSky-AI#476)

2a78454

Fixes ci failure after NovaSky-AI#448

tyler-griggs merged 5 commits into NovaSky-AI:main from li-boxuan:tbench-generator-session-and-logprobs
