Background
Directive's skills are authored as prose step sequences with RFC 2119 notation (!=MUST, ~=SHOULD, etc.). Agents read the skill, interpret the steps, and make sequential tool-selection decisions at execution time. The plan is generated upfront (SPEC→PLAN gate, vBRIEF plan.items), but the plan itself is in prose/JSON form, and executing it still involves turn-by-turn interpretation.
Directive has one existing example of a different approach: templates/swarm-greptile-poller-prompt.md. This is a str.format() parameterized template: the agent instantiates the full execution structure in one pass by filling in typed placeholders ({pr_number}, {repo}, {poll_interval_seconds}, etc.), and then executes it. The result is more deterministic and less susceptible to step-by-step drift than prose instructions.
The article "MCP Is Not Enough: Why Code DSLs Will Replace JSON Tool Calling" (https://mukulsingh105.github.io/articles/mcp-code-dsl-tool-calling.html) provides the theoretical grounding for why this matters:
"Code generation is the model's native capability. Tool selection from JSON menus is a bolted-on behavior."
BFCL V4 data: multi-turn tool orchestration drops 30–60 points below single-turn for all major models. SWE-bench: models at 40–55% on complex multi-file code generation — the same models that fail multi-turn tool selection. Models generate coherent multi-step code better than they sequence discrete tool choices.
The question
Should Directive generalize the poller template pattern into a named, theorized alternative to prose step sequences, namely a code-first parameterized skill template format?
The current format (prose steps):
### Step 3: Fix batch
- ! Read the full current Greptile review
- ! Fix all P0 and P1 issues in a single batch commit
- ⊗ Push a fix commit addressing fewer findings than the review surfaces
The alternative (parameterized template, agent fills in one pass):
# Fix batch for PR #{pr_number} on {repo}
HEAD SHA: {head_sha}
Findings to address: {p0_count} P0, {p1_count} P1
Steps:
1. Read review at {review_url}
2. Apply fixes to: {affected_files}
3. Commit: "fix(review): address P0/P1 findings on {head_sha:.7}"
4. Push and confirm CI
The agent instantiates the template in one inference pass (filling in the typed slots from context), then executes the resulting steps in order. There is no per-step decision-making about what to do next.
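The one-pass instantiation described above can be sketched in Python, assuming the template is rendered with plain `str.format()` as the poller template is said to be. The template text mirrors the example above; the helper names `required_slots` and `instantiate` and all sample values are hypothetical, chosen only for illustration. One detail: `str.format()` field names do not support slices such as `[:7]`, so this sketch shortens the SHA with the string precision spec `{head_sha:.7}` instead.

```python
import string

# Template text mirroring the example above (short SHA via a precision spec).
FIX_BATCH_TEMPLATE = """\
# Fix batch for PR #{pr_number} on {repo}
HEAD SHA: {head_sha}
Findings to address: {p0_count} P0, {p1_count} P1
Steps:
1. Read review at {review_url}
2. Apply fixes to: {affected_files}
3. Commit: "fix(review): address P0/P1 findings on {head_sha:.7}"
4. Push and confirm CI
"""


def required_slots(template: str) -> set[str]:
    """Return the placeholder names that appear in a str.format() template."""
    return {
        field_name
        for _, field_name, _, _ in string.Formatter().parse(template)
        if field_name
    }


def instantiate(template: str, slots: dict[str, object]) -> str:
    """Fill every slot in one pass; fail loudly if any slot is missing."""
    missing = required_slots(template) - set(slots)
    if missing:
        raise KeyError(f"unfilled slots: {sorted(missing)}")
    return template.format(**slots)


# Hypothetical slot values, for illustration only.
plan = instantiate(
    FIX_BATCH_TEMPLATE,
    {
        "pr_number": 123,
        "repo": "example-org/example-repo",
        "head_sha": "0123456789abcdef0123456789abcdef01234567",
        "p0_count": 1,
        "p1_count": 2,
        "review_url": "https://example.invalid/review",
        "affected_files": "src/a.py, src/b.py",
    },
)
print(plan)
```

Checking for unfilled slots before rendering is a design choice of this sketch: it turns a half-filled plan into an immediate error rather than something the agent discovers mid-execution.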
Angles to explore
When does the code-first format beat prose? The poller template works well for loops and poll cycles. Does it work for open-ended implementation tasks where the agent needs to discover context? Or is it better suited to structured, known-shape workflows (review cycles, release steps, scope lifecycle operations)?
Hybrid approach: Could Directive skills have a prose "orientation" section (what context to gather, what decisions to make) followed by a parameterized template section (the execution plan the agent fills in once oriented)?
Slot typing: The article emphasizes typed entities as a hallucination prevention mechanism. Should parameterized skill templates encode expected types for each slot ({pr_number: int}, {head_sha: str[40]}, {affected_files: list[str]})? Does this meaningfully reduce agent errors?
Naming and discoverability: If code-first templates become a recognized format, how should they be organized? Alongside skills? As a separate templates/ convention?
Limits of generalization: The prose step format has real advantages: it handles ambiguity, allows agent judgment, and scales to open-ended tasks. When should Directive use prose vs. template? Is there a principled distinction based on workflow shape?
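On the slot-typing angle above, here is one possible sketch. It assumes slot types are declared next to the template rather than inside the placeholder, because `str.format()` treats the text after a colon as a format spec and would not accept `{pr_number: int}` as written. `SlotSpec`, `SLOT_SPECS`, and `validate_slots` are hypothetical names, and the 40-character hex check for `head_sha` follows the `str[40]` hint while assuming full SHA-1 Git hashes.

```python
import re
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class SlotSpec:
    """Hypothetical declaration of one slot: a Python type plus a value check."""
    type_: type
    check: Callable[[object], bool]


def _is_full_sha(value: object) -> bool:
    # Assumes 40-character lowercase hex (SHA-1), per the str[40] hint.
    return isinstance(value, str) and re.fullmatch(r"[0-9a-f]{40}", value) is not None


def _is_str_list(value: object) -> bool:
    return isinstance(value, list) and all(isinstance(v, str) for v in value)


# Slot names taken from the examples in the text; the specs are assumptions.
SLOT_SPECS = {
    "pr_number": SlotSpec(int, lambda v: v > 0),
    "head_sha": SlotSpec(str, _is_full_sha),
    "affected_files": SlotSpec(list, _is_str_list),
}


def validate_slots(slots: dict[str, object]) -> list[str]:
    """Return a list of human-readable problems; an empty list means valid."""
    errors = []
    for name, spec in SLOT_SPECS.items():
        if name not in slots:
            errors.append(f"{name}: missing")
            continue
        value = slots[name]
        # bool is a subclass of int, so reject it explicitly.
        if isinstance(value, bool) or not isinstance(value, spec.type_):
            errors.append(
                f"{name}: expected {spec.type_.__name__}, got {type(value).__name__}"
            )
        elif not spec.check(value):
            errors.append(f"{name}: failed check")
    return errors


# Hypothetical agent-filled values: a short SHA and a string PR number.
print(validate_slots({"pr_number": "42", "head_sha": "abc123", "affected_files": ["a.py"]}))
```

A check like this only catches malformed values (a short SHA, a string where an integer belongs). It cannot tell whether a well-formed value is the right one, so whether it meaningfully reduces agent errors remains the open question.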
Related
templates/swarm-greptile-poller-prompt.md: the one existing example and the pattern to generalize