@Darktex (Contributor) commented Dec 5, 2025

Summary

Introduces RFC 004: Rubric System—a composable, nn.Module-inspired abstraction for computing rewards in OpenEnv environments.

Key design decisions:

  • Environment authors implement __init__ and forward(action, observation) -> float
  • Child rubrics auto-register when assigned as attributes
  • Sync forward() + async evaluate() for batch parallelism (no async knowledge required from authors)
  • Hooks for observability without polluting the base class
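The design decisions above can be sketched roughly as follows. This is a minimal illustration, not the RFC's implementation: the `children()` and `register_hook()` names, the hook signature, and the use of `asyncio.to_thread` to back `evaluate()` are all assumptions made for the example, as are the `LengthLimit` and `Composite` rubrics.

```python
import asyncio
from typing import Any, Callable, Iterator


class Rubric:
    """Hypothetical base class; the RFC's actual API may differ."""

    def __init__(self) -> None:
        # Bypass our own __setattr__ while creating the bookkeeping fields.
        object.__setattr__(self, "_children", {})
        object.__setattr__(self, "_hooks", [])

    def __setattr__(self, name: str, value: Any) -> None:
        # Child rubrics auto-register when assigned as attributes.
        if isinstance(value, Rubric):
            self._children[name] = value
        object.__setattr__(self, name, value)

    def children(self) -> Iterator["Rubric"]:
        return iter(self._children.values())

    def register_hook(self, hook: Callable[..., None]) -> None:
        # Assumed hook signature: hook(rubric, action, observation, score).
        self._hooks.append(hook)

    def forward(self, action: Any, observation: Any) -> float:
        raise NotImplementedError

    def __call__(self, action: Any, observation: Any) -> float:
        score = float(self.forward(action, observation))
        for hook in self._hooks:
            hook(self, action, observation, score)
        return score

    async def evaluate(self, action: Any, observation: Any) -> float:
        # Run the synchronous forward() in a worker thread, so authors
        # never have to write async code themselves.
        return await asyncio.to_thread(self, action, observation)


class LengthLimit(Rubric):
    """Illustrative rubric: 1.0 if the action fits in max_chars, else 0.0."""

    def __init__(self, max_chars: int) -> None:
        super().__init__()
        self.max_chars = max_chars

    def forward(self, action: str, observation: Any) -> float:
        return 1.0 if len(action) <= self.max_chars else 0.0


class Composite(Rubric):
    """Illustrative parent rubric that delegates to one child."""

    def __init__(self) -> None:
        super().__init__()
        self.length = LengthLimit(max_chars=10)  # registered as a child

    def forward(self, action: str, observation: Any) -> float:
        return self.length(action, observation)
```

As with `nn.Module`, a subclass in this sketch must call `super().__init__()` before assigning child rubrics, otherwise the registry does not exist yet.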

What's included:

  • Rubric base class with PyTorch-like API
  • Container rubrics: Sequential, Gate, WeightedSum, RubricList, LLMJudge
  • evaluate_batch() helper for parallel evaluation in training loops
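To make the container idea concrete, here is a hedged sketch of how `Gate`, `WeightedSum`, and `evaluate_batch()` might compose. The constructor arguments, the gate's threshold semantics (return 0.0 unless a check rubric passes), and the async signature of `evaluate_batch()` are assumptions for illustration; the `Rubric` stand-in and the three leaf rubrics are hypothetical.

```python
import asyncio
from typing import Any, List, Sequence, Tuple


class Rubric:
    """Minimal stand-in for the RFC's base class (assumed shape)."""

    def forward(self, action: Any, observation: Any) -> float:
        raise NotImplementedError

    def __call__(self, action: Any, observation: Any) -> float:
        return float(self.forward(action, observation))

    async def evaluate(self, action: Any, observation: Any) -> float:
        return await asyncio.to_thread(self, action, observation)


class WeightedSum(Rubric):
    """Assumed semantics: sum of weight_i * rubric_i(action, observation)."""

    def __init__(self, rubrics: Sequence[Rubric], weights: Sequence[float]) -> None:
        if len(rubrics) != len(weights):
            raise ValueError("rubrics and weights must have the same length")
        self.rubrics = list(rubrics)
        self.weights = list(weights)

    def forward(self, action: Any, observation: Any) -> float:
        return sum(w * r(action, observation)
                   for r, w in zip(self.rubrics, self.weights))


class Gate(Rubric):
    """Assumed semantics: score `body` only if `check` reaches `threshold`."""

    def __init__(self, check: Rubric, body: Rubric, threshold: float = 1.0) -> None:
        self.check, self.body, self.threshold = check, body, threshold

    def forward(self, action: Any, observation: Any) -> float:
        if self.check(action, observation) < self.threshold:
            return 0.0  # gate closed: the body is never evaluated
        return self.body(action, observation)


async def evaluate_batch(rubric: Rubric,
                         batch: Sequence[Tuple[Any, Any]]) -> List[float]:
    # Score every (action, observation) pair concurrently.
    return list(await asyncio.gather(*(rubric.evaluate(a, o) for a, o in batch)))


# Hypothetical leaf rubrics used only for this example.
class NonEmpty(Rubric):
    def forward(self, action: str, observation: str) -> float:
        return 1.0 if action.strip() else 0.0


class MentionsTarget(Rubric):
    def forward(self, action: str, observation: str) -> float:
        return 1.0 if observation in action else 0.0


class Brevity(Rubric):
    def forward(self, action: str, observation: str) -> float:
        return 1.0 if len(action) <= 20 else 0.0


rubric = Gate(NonEmpty(), WeightedSum([MentionsTarget(), Brevity()], [0.75, 0.25]))
batch = [
    ("the answer is 42", "42"),
    ("   ", "42"),
    ("42 is the answer, as everyone knows", "42"),
]
scores = asyncio.run(evaluate_batch(rubric, batch))
```

Putting the cheap `NonEmpty` check in front of the weighted body mirrors the gatekeeper pattern: an empty action scores 0.0 without running the remaining rubrics.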

Design informed by:

  • RLTF (hierarchical gating)
  • Rubicon (multi-dimensional rubrics)
  • AdvancedIF (all-or-nothing aggregation)
  • OpenRubrics (gatekeeper mechanism)

Test plan

  • Review RFC for clarity and completeness
  • Gather feedback on API design
  • Validate against existing environment implementations

@meta-cla (bot) added the CLA Signed label Dec 5, 2025
@Darktex Darktex marked this pull request as draft December 5, 2025 21:39
Builds on RFC 003 to standardize reward computation:
- Rubrics are INTERNAL (not exposed to agent)
- Rubrics USE MCP to call external services (LLM judges, DBs)
- Observation.metadata["reward_components"] for per-rubric logging
- POST /config endpoint for dynamic reward shaping
- SDK helpers: RubricComposer, RewardNormalizer

Key insight: MCP tools are for agent actions (RFC 003).
Rubrics use MCP internally for external RPC, but are not tools themselves.
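The per-rubric logging mentioned in the commit message above, `Observation.metadata["reward_components"]`, might look something like the sketch below. The `Observation` dataclass here is a stand-in that only models a `metadata` dict, and `score_with_components()` and its arguments are hypothetical names chosen for the example, not part of the RFC.

```python
from dataclasses import dataclass, field
from typing import Any, Callable, Dict


@dataclass
class Observation:
    """Stand-in for the environment's observation type (assumed fields)."""
    text: str = ""
    metadata: Dict[str, Any] = field(default_factory=dict)


ScoreFn = Callable[[str, Observation], float]


def score_with_components(action: str,
                          observation: Observation,
                          rubrics: Dict[str, ScoreFn],
                          weights: Dict[str, float]) -> float:
    """Compute a weighted reward and record each rubric's raw score."""
    components = {name: float(fn(action, observation))
                  for name, fn in rubrics.items()}
    # Per-rubric values are kept for logging alongside the scalar reward.
    observation.metadata["reward_components"] = components
    return sum(weights[name] * value for name, value in components.items())


obs = Observation(text="2 + 2 = ?")
total = score_with_components(
    "4",
    obs,
    rubrics={
        "correct": lambda a, o: 1.0 if a == "4" else 0.0,
        "concise": lambda a, o: 1.0 if len(a) <= 3 else 0.0,
    },
    weights={"correct": 0.5, "concise": 0.5},
)
```

Storing the raw component scores rather than the weighted ones keeps the log independent of the weights, which matters if the weights can be changed at runtime.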
@Darktex Darktex force-pushed the rfc-004-reward-pipelines branch from 602b743 to 6c2acb0 Compare December 17, 2025 01:24
@Darktex Darktex force-pushed the rfc-004-reward-pipelines branch from 6c2acb0 to d6b15e9 Compare December 17, 2025 01:24
@Darktex Darktex changed the title from "[RFC] Add RFC 004: Reward Pipelines" to "[RFC] Add RFC 004: Rubrics" Dec 17, 2025
@Darktex Darktex marked this pull request as ready for review December 17, 2025 01:25