Implement forward-only pass and populate metrics #1046
Conversation
```python
for sample_idx in range(request_data.num_samples):
for _ in range(request_data.num_samples):
    all_prompts.append(prompt_tokens)
    # Derive a unique seed per sample so that num_samples > 1 produces
```
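The comment in the diff hints at the idea behind the temporary hack: give each of the `num_samples` requests its own seed so they don't all produce identical completions. A minimal sketch of that idea (the names `base_seed` and `sample_idx` are illustrative, not from the PR):

```python
# Hypothetical sketch: derive a distinct seed per sample so that
# num_samples > 1 yields different completions for the same prompt.
base_seed = 1234
num_samples = 4
seeds = [base_seed + sample_idx for sample_idx in range(num_samples)]
print(seeds)  # [1234, 1235, 1236, 1237]
```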
This is an important fix; don't forget to revert the change before merging the PR.
Oh thank you, I was just hacking around a bit with Claude, it will be reverted
Force-pushed from 7acd89f to 98e7500
forward_backward() now returns total_loss, pg_loss, entropy_loss, and num_tokens from the dispatch worker. optim_step() returns grad_norm and learning_rate. These are consumed by tinker-cookbook scripts via OptimStepResponse.metrics. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Force-pushed from 98e7500 to 16fc973
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Code Review
This pull request implements the forward pass in the SkyRL-Train backend and adds metric reporting to forward_backward and optim_step. The changes are well-structured and align with the goals outlined in the description. I've suggested a minor refactoring in _extract_metrics to improve maintainability by using a dictionary for metric mapping, which will make it easier to add new metrics in the future. Overall, this is a solid contribution.
```python
def _extract_metrics(self, data: dict) -> dict[str, float]:
    """Extract training metrics from dispatch return dict.

    Workers return metrics like 'loss', 'policy_loss', 'policy_entropy', etc.
    We convert to Tinker's colon-suffixed format (e.g. 'total_loss:sum').
    """
    metrics: dict[str, float] = {}

    # SFT path returns 'loss'; RL path returns 'final_loss' / 'policy_loss'
    if "loss" in data:
        metrics["total_loss:sum"] = float(data["loss"])
    elif "final_loss" in data:
        metrics["total_loss:sum"] = float(data["final_loss"])

    if "policy_loss" in data:
        metrics["pg_loss:sum"] = float(data["policy_loss"])
    if "policy_entropy" in data:
        metrics["entropy_loss:sum"] = float(data["policy_entropy"])
    if "response_length" in data:
        metrics["num_tokens:sum"] = float(data["response_length"])

    return metrics
```
The current implementation of _extract_metrics uses a series of if statements to map metric names. While this works, it can become less maintainable as the number of metrics grows. Using a dictionary to define the mapping from source metric names to target metric names would make the code more scalable and easier to read.
```python
def _extract_metrics(self, data: dict) -> dict[str, float]:
    """Extract training metrics from dispatch return dict.

    Workers return metrics like 'loss', 'policy_loss', 'policy_entropy', etc.
    We convert to Tinker's colon-suffixed format (e.g. 'total_loss:sum').
    """
    metrics: dict[str, float] = {}
    metric_mapping = {
        "policy_loss": "pg_loss:sum",
        "policy_entropy": "entropy_loss:sum",
        "response_length": "num_tokens:sum",
    }
    # SFT path returns 'loss'; RL path returns 'final_loss' / 'policy_loss'
    if "loss" in data:
        metrics["total_loss:sum"] = float(data["loss"])
    elif "final_loss" in data:
        metrics["total_loss:sum"] = float(data["final_loss"])
    for source_key, target_key in metric_mapping.items():
        if source_key in data:
            metrics[target_key] = float(data[source_key])
    return metrics
```
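As a quick sanity check of the suggested refactor, here is the same mapping logic as a standalone function (a free function for illustration only; the input dict values are hypothetical), exercised on an RL-style worker return dict:

```python
def extract_metrics(data: dict) -> dict[str, float]:
    """Map dispatch worker metric names to Tinker's colon-suffixed format."""
    metrics: dict[str, float] = {}
    metric_mapping = {
        "policy_loss": "pg_loss:sum",
        "policy_entropy": "entropy_loss:sum",
        "response_length": "num_tokens:sum",
    }
    # SFT path returns 'loss'; RL path returns 'final_loss' / 'policy_loss'
    if "loss" in data:
        metrics["total_loss:sum"] = float(data["loss"])
    elif "final_loss" in data:
        metrics["total_loss:sum"] = float(data["final_loss"])
    for source_key, target_key in metric_mapping.items():
        if source_key in data:
            metrics[target_key] = float(data[source_key])
    return metrics

# RL-style dispatch return dict (values are made up for the example)
rl = extract_metrics({"final_loss": 0.5, "policy_loss": 0.3, "response_length": 128})
print(rl)  # {'total_loss:sum': 0.5, 'pg_loss:sum': 0.3, 'num_tokens:sum': 128.0}
```

Note that missing source keys are simply skipped, so the SFT and RL paths can share one function.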
Summary
- `forward()` implemented in the SkyRL-Train backend (was raising `NotImplementedError`)
- `forward_backward()` now returns training metrics (`total_loss:sum`, `pg_loss:sum`, `entropy_loss:sum`, `num_tokens:sum`) extracted from the dispatch worker return dict
- `optim_step()` now returns `grad_norm` and `learning_rate` in its metrics field
- Added a `metrics` field to the `OptimStepOutput` type
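The summary says these metrics are consumed by tinker-cookbook scripts via `OptimStepResponse.metrics`. A minimal consumer-side sketch of that shape, using a stand-in dataclass (the real `OptimStepResponse` type and its fields are assumptions here, and the metric values are made up):

```python
from dataclasses import dataclass, field


@dataclass
class OptimStepResponse:
    # Stand-in for the real response type; only the metrics field is sketched.
    metrics: dict[str, float] = field(default_factory=dict)


# Values a script might read after an optimizer step (hypothetical numbers)
resp = OptimStepResponse(metrics={"grad_norm": 1.2, "learning_rate": 3e-4})
print(f"grad_norm={resp.metrics['grad_norm']}, lr={resp.metrics['learning_rate']}")
```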