mazzucci/tool-as-guide

Tool-as-Guide

An architectural pattern for building reliable, auditable AI agents by separating workflow control from execution

License: MIT


🎬 See It In Action

▶️ Watch: Pizza Ordering with Claude (60 sec) • ▶️ Watch: Medical Triage Agent (15 sec)

🍕 Pizza Ordering - Chat Interface

Pizza Ordering Demo

What you just saw:

  • User asks "what do you recommend?" → Claude explains each option with reasoning
  • User says "pick something for me" → Claude intelligently selects and explains why
  • Questions follow a strict order (crust → category → toppings → size)
  • Natural, helpful conversation, but protocol-enforced

The guide controls the workflow. Claude provides the intelligence.

Explore the code →


🏥 Medical Triage - Autonomous Agent

Medical Triage Demo

What you just saw:

  • Agent autonomously screens for emergency symptoms (protocol requirement)
  • Checks medical history, classifies severity
  • Makes escalation decision and saves audit trail
  • All while the guide enforces clinical protocols

The guide ensures compliance. The agent does the work.

Explore the code →


🎯 The Problem

Autonomous AI agents are powerful but unreliable for critical workflows:

  • ❌ Skip important steps
  • ❌ Inconsistent behavior across runs
  • ❌ Hard to audit or debug
  • ❌ Can't guarantee protocol compliance
  • ❌ Unsuitable for regulated domains

Current solutions don't solve this:

  • Prompts: Too fragile ("please follow these steps...")
  • Traditional workflows: LLM is passive, no agency
  • Agent frameworks: Too unpredictable for critical systems

💡 The Solution: Tool-as-Guide Pattern

TL;DR: The Tool-as-Guide pattern is a workflow engine with inversion of control, acting as a protocol-driven supervisor for agentic systems.

🗺️ Think of It Like GPS Navigation

When you're driving through a city:

  • You're the intelligent, capable driver - You navigate traffic, make decisions, handle unexpected situations
  • GPS provides turn-by-turn instructions - Tells you which route to follow, which turn to take next
  • GPS doesn't drive for you - But it ensures you don't miss critical turns or get lost

Tool-as-Guide works the same way:

  • Agent is the intelligent worker - Capable of reasoning, adapting, executing tasks
  • Guide provides step-by-step instructions - What to do next, what protocol to follow
  • Guide doesn't do the work - But it ensures critical steps aren't skipped and protocols are followed

The agent stays intelligent and autonomous. The guide ensures reliability and compliance.


How It Works in Practice

Key insight: The workflow engine is a tool that the LLM actively queries.

sequenceDiagram
    participant User
    participant LLM as LLM/Agent
    participant Guide as Workflow Guide (Tool)
    
    User->>LLM: "I need help with X"
    LLM->>Guide: guide.start_workflow()
    Guide-->>LLM: {instruction: "Do step 1", required_data: [...]}
    LLM->>User: Executes step 1
    User->>LLM: Provides response
    LLM->>Guide: guide.continue(session_id, response)
    Guide-->>LLM: {instruction: "Do step 2", validation: "passed"}
    LLM->>User: Executes step 2
    Note over LLM,Guide: LLM drives execution<br/>Guide enforces protocol

The pattern in three points:

  1. LLM/Agent has autonomy - Drives conversation, executes tasks, makes decisions
  2. Guide provides instructions - What to do next, what data is needed, what validates
  3. Protocol is enforced - Can't skip steps, consistent behavior, auditable trail
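
The three points above can be sketched as a small state machine. This is a minimal illustration and not the repository's implementation: the `Guide` class, the reply fields (`validation`, `done`, `order`) and the option lists are assumptions made for the example, and the continue call is spelled `continue_workflow` because `continue` is a reserved word in Python.

```python
import uuid

# Step order follows the demo (crust, category, toppings, size).
# The option lists are invented for illustration.
STEPS = [
    ("crust", "Ask user for pizza crust preference", ["thin", "regular", "thick"]),
    ("category", "Ask user for pizza category", ["vegetarian", "meat"]),
    ("toppings", "Ask user about toppings", ["mushroom", "pepperoni"]),
    ("size", "Ask user for pizza size", ["small", "medium", "large"]),
]


class Guide:
    """Toy workflow guide that hands out one instruction at a time."""

    def __init__(self):
        self._sessions = {}  # session_id -> {"index": int, "answers": dict}

    def start_workflow(self):
        session_id = str(uuid.uuid4())
        self._sessions[session_id] = {"index": 0, "answers": {}}
        return self._reply(session_id, validation=None)

    def continue_workflow(self, session_id, response):
        session = self._sessions[session_id]
        if session["index"] == len(STEPS):
            # Workflow already finished: nothing left to validate.
            return self._reply(session_id, validation=None)
        name, _, options = STEPS[session["index"]]
        if response not in options:
            # Invalid answer: stay on the same step, so it cannot be skipped.
            return self._reply(session_id, validation="failed")
        session["answers"][name] = response
        session["index"] += 1
        return self._reply(session_id, validation="passed")

    def _reply(self, session_id, validation):
        session = self._sessions[session_id]
        reply = {"session_id": session_id, "validation": validation}
        if session["index"] == len(STEPS):
            reply.update(done=True, order=dict(session["answers"]))
        else:
            _, instruction, options = STEPS[session["index"]]
            reply.update(done=False, instruction=instruction, options=options)
        return reply
```

The agent never sees `STEPS`. It only receives the current instruction, and an answer that fails validation returns the same instruction again rather than advancing.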

🧠 The LLM Isn't Just Following Orders

Important: The LLM/Agent maintains its intelligence and flexibility:

  • Interprets user intent - "I want something spicy" → translates to pepperoni/jalapeños
  • Generates natural responses - Not just relaying template text
  • Handles unexpected input - Clarifies, asks for details, adapts phrasing
  • Executes complex tasks - Queries databases, calls APIs, processes data
  • Reasons within bounds - Makes decisions at each step, guided by protocol

The Guide enforces the protocol. The LLM provides the intelligence.

It's like GPS: You're still the intelligent driver making decisions; GPS just ensures you don't miss critical turns.

Code Example

Here's what the interaction looks like:

# Agent queries the guide
response = guide.start_workflow()
# → {"instruction": "Ask user for pizza crust preference",
#    "options": ["thin", "regular", "thick"],
#    "required": true}

# Agent uses intelligence to interact with user
agent.ask_user("What kind of crust would you like? We have thin, regular, or thick.")
# User replies: "Make it crispy!"

# Agent interprets intent, sends to guide
# (spelled continue_workflow here because `continue` is a reserved word in Python)
response = guide.continue_workflow(session_id, "thin")
# → {"instruction": "Ask about toppings",
#    "category": "vegetarian", ...}

# Guide controls WHAT steps. Agent controls HOW they're executed.
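
To make the WHAT/HOW split concrete, here is a hedged sketch of the agent side of that exchange. Everything in it is a stand-in: `next_question` plays the guide, and `interpret` replaces the LLM's intent interpretation with a hypothetical keyword table (`SYNONYMS`) so the example runs without a model.

```python
# Stand-in for the guide: a fixed list of questions and allowed answers.
QUESTIONS = [
    ("crust", ["thin", "regular", "thick"]),
    ("size", ["small", "medium", "large"]),
]

# Stand-in for LLM intent interpretation: a keyword table instead of a model.
SYNONYMS = {"crispy": "thin", "fluffy": "thick", "huge": "large", "tiny": "small"}


def next_question(answers):
    """Guide side: decide WHAT to ask next from the answers recorded so far."""
    for name, options in QUESTIONS:
        if name not in answers:
            return name, options
    return None


def interpret(utterance, options):
    """Agent side: decide HOW to map free text onto one of the allowed options."""
    words = utterance.lower().replace("!", " ").replace(".", " ").split()
    for word in words:
        choice = SYNONYMS.get(word, word)
        if choice in options:
            return choice
    return None  # a real agent would ask a clarifying question here


def run(user_utterances):
    """Drive the conversation: one user utterance per turn."""
    answers = {}
    for utterance in user_utterances:
        question = next_question(answers)
        if question is None:
            break
        name, options = question
        choice = interpret(utterance, options)
        if choice is not None:
            answers[name] = choice  # unrecognised input: the same question repeats
    return answers


print(run(["Make it crispy!", "umm, not sure", "A large one please."]))
```

The unclear second reply does not advance the workflow, so the size question is asked again on the next turn.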

📊 Example Comparison

| Aspect | Pizza Ordering | Medical Triage |
| --- | --- | --- |
| Interface | Chat (Claude Desktop) | Autonomous Agent (Jupyter) |
| Use Case | Low stakes, conversational | High stakes, critical system |
| AI Role | Relays messages to user | Executes tasks autonomously |
| Guide Role | Returns prompts/questions | Returns tasks & protocol decisions |
| Tools Used | State machine only | State machine + DB + Classifier + Monitor |
| LLM | Claude (cloud) | Gemma (local, open) |
| Domain Knowledge | In guide (menu, options) | Separate classifier tool |
| Purpose | Show pattern basics | Show pattern for autonomous agents |
| Audit Trail | Session states | Full protocol compliance log |

Both demonstrate the same core pattern in different contexts, showing its versatility from simple chatbots to critical autonomous systems.


The Architectural Inversion

graph TB
    subgraph "Traditional Workflow Engine"
        WE[Workflow Engine<br/>Controls Everything] -->|executes| L1[LLM<br/>Passive Task]
        L1 -->|returns| WE
    end
    
    subgraph "Agent Framework"
        A[LLM Agent<br/>Controls Everything] -->|calls| T[Tools<br/>Passive Utilities]
        T -->|returns| A
    end
    
    subgraph "Tool-as-Guide (Novel)"
        L2[LLM/Agent<br/>Active Driver] -->|queries| G[Guide Tool<br/>Workflow Engine]
        G -->|returns instructions| L2
        L2 -->|executes & reports| G
    end
    
    style L2 fill:#90EE90
    style G fill:#87CEEB

Tool-as-Guide combines the best of both:

  • ✅ Agent autonomy (LLM drives interaction)
  • ✅ Workflow reliability (Guide enforces protocol)
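
One way to picture the inversion is that the guide sits in the same tool registry as any other tool, and the agent calls it by name. The sketch below assumes a generic call-by-name convention rather than any specific SDK; `lookup_menu`, `guide_next`, `TOOLS` and `call_tool` are hypothetical names.

```python
def lookup_menu(category):
    """An ordinary passive tool: returns data, gives no directions."""
    menu = {"vegetarian": ["mushroom", "pepper"], "meat": ["pepperoni", "ham"]}
    return {"items": menu.get(category, [])}


def make_guide_tool(steps):
    """A guide tool: each call reports the finished step and gets the next one."""
    state = {"index": 0}

    def guide_next(completed_step=None):
        expected = steps[state["index"]] if state["index"] < len(steps) else None
        if completed_step is not None:
            if completed_step != expected:
                # Out-of-order report: refuse to advance and repeat the instruction.
                return {"error": f"expected step {expected!r}", "instruction": expected}
            state["index"] += 1
        if state["index"] == len(steps):
            return {"instruction": None, "done": True}
        return {"instruction": steps[state["index"]], "done": False}

    return guide_next


# Both tools live in the same registry; only the guide returns instructions.
TOOLS = {
    "lookup_menu": lookup_menu,
    "guide_next": make_guide_tool(["ask_crust", "ask_category", "ask_size"]),
}


def call_tool(name, **arguments):
    """Dispatch a tool call by name, as a tool-calling runtime would."""
    return TOOLS[name](**arguments)
```

From the agent's point of view nothing special happens: it still decides when to call tools. The difference is that one of those tools answers with what to do next.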

🚀 Quick Start

Choose an example to explore:

Pizza Ordering: Chat interface demo with Claude Desktop or Cursor. Perfect for understanding the basics.

Medical Triage: Autonomous agent demo with Jupyter and a local LLM. Shows the pattern for critical systems.

🔧 cursor-extend (Coming Soon)

⚠️ PREVIEW: A tool that implements this pattern to help Cursor IDE generate workflow guides. Stay tuned!

Each example includes complete setup instructions and a usage guide.


🏗️ Pattern Architecture

Key Principles

  1. Separation of Concerns

    • Guide: WHAT to do (workflow logic, validation, protocol)
    • Agent: HOW to do it (execution, reasoning, tool calls)
  2. Inversion of Control

    • Agent queries the guide (not controlled by it)
    • Guide returns instructions (not commands)
  3. Stateful Sessions

    • Each workflow has a unique session
    • State persists across interactions
    • Audit trail is automatic
  4. Deterministic Protocols

    • Workflow logic lives in code
    • Same input → same workflow
    • Testable, verifiable, compliant
  5. Progressive Disclosure

    • Agent receives only current instruction
    • Next steps revealed when needed
    • Reduces context window, improves focus
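
Principles 3 and 5 can be illustrated with a small session store. This is a sketch under stated assumptions: the step names paraphrase the triage demo described earlier, and `SessionStore`, `report` and the audit record fields are hypothetical, not taken from the repository.

```python
import uuid
from datetime import datetime, timezone

# Hypothetical protocol, named after the steps in the triage demo description.
PROTOCOL = [
    "screen_emergency_symptoms",
    "check_medical_history",
    "classify_severity",
    "decide_escalation",
]


class SessionStore:
    """Keeps per-session state and records every event in an audit trail."""

    def __init__(self):
        self._sessions = {}

    def start(self):
        session_id = str(uuid.uuid4())
        self._sessions[session_id] = {"index": 0, "audit": []}
        self._log(session_id, "started", None)
        return session_id

    def _log(self, session_id, event, detail):
        self._sessions[session_id]["audit"].append({
            "at": datetime.now(timezone.utc).isoformat(),
            "event": event,
            "detail": detail,
        })

    def current_instruction(self, session_id):
        """Progressive disclosure: only the current step is ever returned."""
        index = self._sessions[session_id]["index"]
        return PROTOCOL[index] if index < len(PROTOCOL) else None

    def report(self, session_id, step, result):
        """Record a completed step; out-of-order reports are logged and rejected."""
        expected = self.current_instruction(session_id)
        if step != expected:
            self._log(session_id, "rejected", {"step": step, "expected": expected})
            return False
        self._sessions[session_id]["index"] += 1
        self._log(session_id, "completed", {"step": step, "result": result})
        return True

    def audit_trail(self, session_id):
        return list(self._sessions[session_id]["audit"])
```

Because logging happens inside `start` and `report`, the agent cannot advance a session without leaving a record, which is one reading of "audit trail is automatic".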

🌟 Use Cases

This pattern is particularly powerful for:

Critical Systems

  • 🏥 Healthcare - Enforce clinical protocols, emergency escalation
  • 💰 Finance - Regulatory compliance, risk checks
  • ⚖️ Legal - Document review checklists, thoroughness requirements

Operational Workflows

  • 🚀 Deployments - Quality gates, approval workflows
  • 🛡️ Security - Incident response protocols
  • 📞 Customer Support - Troubleshooting procedures

Any Domain Where:

  • Protocol compliance is required
  • Steps can't be skipped
  • Behavior must be auditable
  • Consistency matters more than flexibility

Efficiency Benefits

Beyond reliability, this pattern also improves performance and reduces costs:

  • Smaller context windows: Guide provides only the current instruction, not the entire workflow tree
  • Progressive disclosure: Agent loads only what it needs at each step
  • Faster execution: Deterministic protocol logic runs in the guide (not LLM inference)
  • Lower token costs: Processing logic outside the model reduces token consumption; Anthropic's code execution research reports reductions of 90%+ for a comparable approach

The guide acts as an efficient coordinator: the LLM only sees "what to do next," not "all possible paths."
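
A rough way to see the context saving is to compare the size of a whole workflow definition with the size of a single step's payload. The workflow below is hypothetical, and serialized character count is only a crude proxy for tokens; the sketch makes no claim about the size of the saving in a real system.

```python
import json

# Hypothetical workflow definition held by the guide (never sent whole to the LLM).
WORKFLOW = {
    "crust": {"instruction": "Ask user for pizza crust preference",
              "options": ["thin", "regular", "thick"], "next": "category"},
    "category": {"instruction": "Ask user for pizza category",
                 "options": ["vegetarian", "meat"], "next": "toppings"},
    "toppings": {"instruction": "Ask user about toppings",
                 "options": ["mushroom", "pepper", "pepperoni", "ham"], "next": "size"},
    "size": {"instruction": "Ask user for pizza size",
             "options": ["small", "medium", "large"], "next": None},
}


def current_step_payload(step_name):
    """What the guide returns: the current instruction only, no routing details."""
    step = WORKFLOW[step_name]
    return {"instruction": step["instruction"], "options": step["options"]}


# Character counts of the serialized payloads, used as a rough stand-in for tokens.
full_size = len(json.dumps(WORKFLOW))
step_size = len(json.dumps(current_step_payload("crust")))
print(f"full workflow: {full_size} chars, current step: {step_size} chars")
```

The gap widens as the workflow grows, since the per-step payload stays roughly constant while the full definition does not.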


🆚 How This Differs from Other Approaches

1. Prompt-Based Instructions (Fragile & Opaque)

Most LLM applications today rely on carefully crafted prompts ("think step by step...", "call tools as needed"). This approach is:

  • Brittle: Minor prompt changes, model updates, or LLM randomness can produce different results
  • Opaque: No clear audit trail; workflow logic is implicit in prompts and model reasoning
  • Unpredictable: Can't guarantee every step will be followed in regulated processes

2. Traditional Workflow Engines (No LLM Intelligence)

Conventional workflow engines define steps externally but lack LLM adaptability:

  • Rigid: The engine holds all logic; LLMs just fill in text generation
  • Limited Intelligence: Little opportunity for adaptive reasoning
  • Separated: Workflow and AI capabilities don't integrate well

3. Multi-Agent Supervision (Still Probabilistic)

Some architectures use supervising agents to check other agents' work. While this improves reliability:

  • Still LLM-driven: Error correction is probabilistic, supervisors can miss violations
  • Weakly auditable: Better than single-agent prompts, but no external testable flow
  • Complex: Multiple LLMs increase cost and latency

4. Tool-as-Guide (Hybrid Approach)

The Tool-as-Guide pattern combines the strengths of workflow engines and LLM agents:

  • Deterministic Protocol: Workflow state machine decides every step; the LLM executes each step but doesn't decide which step comes next
  • Context-Rich Guidance: Tool provides detailed instructions, templates, and context for each step
  • True Auditability: Code-based workflow with explicit state transitions
  • Debuggable: Failures isolated to specific logic/data steps, not buried in prompts
  • LLM Intelligence: Agent can still reason and adapt within the bounds of each step

In essence: The workflow is deterministic and auditable (like traditional engines), but the execution is intelligent and adaptive (like LLM agents).
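
Because the workflow lives in code, its transitions can be checked without involving an LLM at all. The sketch below is illustrative only: the step names and branching rules are hypothetical and are not clinical guidance or the repository's protocol.

```python
def next_step(current, result):
    """Pure transition function: the same (current, result) always gives the same step."""
    if current == "screen_emergency_symptoms":
        return "escalate" if result == "emergency" else "check_medical_history"
    if current == "check_medical_history":
        return "classify_severity"
    if current == "classify_severity":
        return "escalate" if result == "high" else "self_care_advice"
    raise ValueError(f"no transition defined from {current!r}")


def replay(results, start="screen_emergency_symptoms"):
    """Replay recorded step results through the protocol and return the path taken."""
    path = [start]
    for result in results:
        path.append(next_step(path[-1], result))
    return path


# The same recorded results always produce the same path.
print(replay(["none", "no history", "high"]))
print(replay(["emergency"]))
```

A recorded session can be replayed this way after the fact, which is what makes a failure traceable to a specific transition rather than to a prompt.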


🤝 Contributing

Feedback is appreciated! Open a PR or issue to discuss your ideas.


📜 License

MIT License
