
feat(demo): Add explore mode for schema discovery#97

Merged
andreasronge merged 5 commits into main from 91-add-explore-mode-demo
Dec 5, 2025


@andreasronge
Owner

Summary

Adds an "explore mode" to the demo app where the LLM discovers data structure via introspection operations before writing queries.

  • Add --explore CLI flag to start in explore mode
  • Add /mode, /mode schema, /mode explore commands for runtime switching
  • In explore mode, system prompt lists dataset names but no field information
  • LLM must use keys and typeof operations to discover schema
  • Update README with explore mode documentation
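The discovery step can be pictured in plain Elixir. This is a minimal sketch, assuming a dataset is a list of maps; the `ExploreSketch` module, its `type_of/1` helper, and the sample rows are hypothetical stand-ins for the PTC `keys` and `typeof` operations, not the library's implementation.

```elixir
defmodule ExploreSketch do
  # Hypothetical stand-in for the `keys` operation: field names of one row.
  def keys(row) when is_map(row), do: row |> Map.keys() |> Enum.sort()

  # Hypothetical stand-in for `typeof`: a coarse type name for a value.
  def type_of(v) when is_binary(v), do: "string"
  def type_of(v) when is_boolean(v), do: "boolean"
  def type_of(v) when is_number(v), do: "number"
  def type_of(v) when is_list(v), do: "list"
  def type_of(v) when is_map(v), do: "map"
  def type_of(nil), do: "null"

  # Infer a field-name => type-name map from the first row of a dataset.
  def discover([first | _rest]) do
    Map.new(first, fn {k, v} -> {k, type_of(v)} end)
  end
end

# Made-up sample data, for illustration only.
orders = [%{"id" => 1, "total" => 99.5, "status" => "paid"}]

IO.inspect(ExploreSketch.keys(List.first(orders)))
IO.inspect(ExploreSketch.discover(orders))
```

In schema mode this information would already be in the system prompt; in explore mode the LLM has to spend a turn producing it.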

Also includes:

  • Docs cleanup removing completed development phase references
  • Refactored result tracking through agent loop (cleaner than context extraction)
  • Character-based result truncation (200 chars) to preserve short lists like keys output
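To illustrate why a character budget preserves `keys` output, here is a minimal sketch of character-based truncation. The `TruncateSketch` module and the `"... (truncated)"` suffix are assumptions for illustration; only the 200-character limit comes from this PR.

```elixir
defmodule TruncateSketch do
  @max_result_chars 200

  # Render the result as text and cut it at a character budget,
  # rather than keeping only the first N items of a list.
  def truncate(result, max \\ @max_result_chars) do
    text = inspect(result, limit: :infinity)

    if String.length(text) > max do
      String.slice(text, 0, max) <> "... (truncated)"
    else
      text
    end
  end
end

# A short list of field names fits in the budget and is returned whole.
IO.puts(TruncateSketch.truncate(["id", "name", "email", "created_at"]))

# A long list is cut off after 200 characters.
IO.puts(TruncateSketch.truncate(Enum.to_list(1..500)))
```

With an item-count limit, a list of ten field names could lose entries even though it is only a few dozen characters long.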

Closes #91

Test plan

  • All 549 tests pass
  • Manual verification: ./ptc_demo --explore starts in explore mode
  • /mode shows current mode
  • /mode schema and /mode explore switch modes
  • /reset returns to schema mode
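The command behaviour in the test plan can be sketched as a pure function over session state. This is an illustrative sketch only: the `ModeSketch` module, the state fields, and the reply strings are hypothetical and are not taken from the demo's CLI code.

```elixir
defmodule ModeSketch do
  # Hypothetical session state; the field names are assumptions.
  def new(data_mode \\ :schema), do: %{data_mode: data_mode, context: []}

  # Each clause returns {reply_text, new_state}.
  def handle(state, "/mode"), do: {"Current mode: #{state.data_mode}", state}
  def handle(_state, "/mode schema"), do: switch(:schema)
  def handle(_state, "/mode explore"), do: switch(:explore)
  def handle(_state, "/reset"), do: {"Reset to schema mode", new(:schema)}
  def handle(state, other), do: {"Unknown command: #{other}", state}

  # Switching modes starts from a fresh context, because the system prompt changes.
  defp switch(mode), do: {"Switched to #{mode} mode", new(mode)}
end

{reply, _state} = ModeSketch.handle(ModeSketch.new(:explore), "/mode")
IO.puts(reply)
```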

🤖 Generated with Claude Code

andreasronge and others added 3 commits December 5, 2025 00:15
All implementation phases (1-5) have been completed, so references to
them are no longer relevant. This removes the Implementation Phases
section from architecture.md and updates phase-related text in the
guidelines documentation.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>

Implements explore mode where LLM discovers data structure via
introspection operations (keys, typeof) instead of receiving full
schema upfront.

Changes:
- Add --explore CLI flag and /mode commands to switch data modes
- Refactor agent to use agentic loop (multi-turn with tool results)
- Add /programs, /result, /context commands for debugging
- Add PtcRunner.format_error/1 for LLM-friendly error messages
- Update README with explore mode documentation

Known issue: In explore mode, LLMs struggle to generate valid PTC
programs due to DSL inconsistencies:
- `get` requires `path` as list, but other ops use `field` string
- No `sort_by` or `max_by` operations for common "find row with max"
- Verbose patterns for simple field extraction

See issue #93 for proposed DSL improvements to address this.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>

- Track last_exec through agent_loop instead of extracting from context
- Add truncation note to system prompt for LLM awareness
- Remove unused extract_last_execution/1 function
- Use character-based truncation (200 chars) instead of item count
  to preserve short lists like keys output

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
andreasronge added the claude-review label ("Add to PRs to trigger Claude automated review") Dec 5, 2025
@github-actions
Contributor

github-actions bot commented Dec 5, 2025

PR Review: feat(demo): Add explore mode for schema discovery

Summary

This PR adds an "explore mode" to the demo app where the LLM discovers data structures via introspection operations (keys, typeof) before writing queries. It also refactors the agent to use an agentic loop pattern and switches from item-count-based to character-based result truncation. The core implementation is well-designed and accomplishes its goals.

What's Good

  • Clean agentic loop design - The agent_loop/6 function cleanly separates the iteration logic from the GenServer callbacks, with proper tracking of last execution via the last_exec tuple
  • Improved result truncation - Character-based truncation (200 chars) is more sensible than item-count-based truncation, especially for preserving keys output which is exactly what explore mode needs
  • Thoughtful system prompts - The explore mode prompt correctly instructs the LLM to output only ONE program per response and to discover structure via load <name> | first | keys
  • Good CLI ergonomics - The /mode command family and runtime switching work well
  • Proper state management - set_data_mode correctly resets context and clears last_program/last_result
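For readers who have not opened the diff, the loop pattern praised above can be sketched roughly as follows. This is not the PR's `agent_loop/6`: the `LoopSketch` module, the arity, and the `llm` and `run` callbacks are hypothetical. Only the idea of threading a last-execution tuple through a bounded recursion, the limit of 5, and the "Max iterations reached" message come from this conversation.

```elixir
defmodule LoopSketch do
  @max_iterations 5

  # `llm` takes the context and returns {:program, source} or {:answer, text}.
  # `run` executes a program and returns its result. Both are hypothetical callbacks.
  def agent_loop(llm, run, context, last_exec \\ nil, remaining \\ @max_iterations)

  # Out of budget: stop and report, keeping the last execution for inspection.
  def agent_loop(_llm, _run, _context, last_exec, 0),
    do: {:error, "Max iterations reached", last_exec}

  def agent_loop(llm, run, context, last_exec, remaining) do
    case llm.(context) do
      {:answer, text} ->
        {:ok, text, last_exec}

      {:program, source} ->
        result = run.(source)
        # Feed the tool result back to the LLM and track it explicitly,
        # instead of digging it out of the context afterwards.
        context = context ++ [{:tool_result, result}]
        agent_loop(llm, run, context, {source, result}, remaining - 1)
    end
  end
end

# Fake LLM: asks for one program, then answers once it has seen a tool result.
llm = fn context ->
  if Enum.any?(context, &match?({:tool_result, _}, &1)),
    do: {:answer, "done"},
    else: {:program, "load orders | first | keys"}
end

run = fn _source -> ["id", "total"] end

IO.inspect(LoopSketch.agent_loop(llm, run, [{:user, "What fields do orders have?"}]))
```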

Issues (Must Fix)

  1. Bug: test_runner.ex uses wrong parameter name - demo/lib/ptc_demo/test_runner.ex:176

    • Problem: The Agent was refactored to use data_mode: instead of mode:, but test_runner.ex:176 still passes mode: mode
    • Impact: Calling Agent.start_link(mode: :structured) will silently ignore the option (Elixir allows extra keys in keyword lists), defaulting to :schema mode
    • Fix: Change Agent.start_link(mode: mode) to Agent.start_link(data_mode: data_mode) and update the variable name
    • Complexity: Mechanical fix
  2. Bug: test_runner.ex documentation and API references outdated - demo/lib/ptc_demo/test_runner.ex:97-101

    • Problem: Documentation still says mode: :structured/:text but this concept was removed. The agent now has :schema/:explore data modes, not generation modes
    • Impact: Confusing API docs and broken test invocations
    • Fix: Update to data_mode: :schema/:explore if that's the intended parameter, or remove the mode option entirely if test_runner shouldn't control data mode
    • Complexity: Requires design decision - should test_runner support explore mode testing?
  3. Incomplete: README.md still documents removed --structured flag - demo/README.md:64-66, 71-80, 185-186, 200-203

    • Problem: The PR removes the :structured/:text generation mode from the Agent (the old 3-phase approach), but README.md still documents --structured CLI flag and the "Generation Modes" section
    • Impact: Users will get errors or unexpected behavior running ./ptc_demo --structured
    • Fix: Either (a) remove all --structured references from README, or (b) if structured mode should be preserved, it needs to be re-implemented in the new agentic loop
    • Complexity: Documentation cleanup if removing; moderate if preserving feature
  4. CLI ignores unrecognized --structured flag silently - demo/lib/ptc_demo/cli.ex:21

    • Problem: The CLI only checks for --explore, so --structured is silently ignored
    • Impact: Users following README documentation get unexpected behavior
    • Fix: Either add an error for unrecognized flags or document that --structured is no longer supported
    • Complexity: Mechanical

Suggestions (Optional)

  1. Consider adding /programs to README Commands table - Nice-to-have
    • The CLI code implements /programs but README Commands table doesn't list it
    • Benefit: Documentation completeness
    • Recommendation: Add row for /programs command

Security

No security concerns. The sandbox execution and timeout limits remain intact.

Documentation

Documentation needs updates to reflect the removal of structured mode:

  • README.md "Generation Modes" section is now stale
  • README.md still shows --structured CLI flag
  • test_runner.ex docstrings reference removed concepts

Verdict

Request Changes - The test_runner.ex bug would cause test runs to fail silently or use the wrong mode. The README documentation inconsistencies would confuse users. These are mechanical fixes following the pattern established by the PR itself.

🤖 Generated with Claude Code

github-actions bot added the auto-triage-pending label ("Auto-triage in progress") Dec 5, 2025
@andreasronge
Owner Author

@claude please fix these issues from PR review:

Issue 1: test_runner.ex uses wrong parameter name

Location: demo/lib/ptc_demo/test_runner.ex:176
Problem: Agent was refactored to use data_mode: instead of mode:, but test_runner.ex still passes mode: mode
Fix:

  1. Change line 176 from Agent.start_link(mode: mode) to Agent.start_link(data_mode: data_mode)
  2. Change variable name at line 172 from mode to data_mode
  3. Update line 106 to use data_mode: data_mode
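The reason this bug was silent is general Elixir behaviour: reading options with `Keyword.get/3` never complains about keys it does not look up. The sketch below contrasts that with `Keyword.validate!/2` (available since Elixir 1.13). The `OptsSketch` module is hypothetical and does not describe how `Agent.start_link/1` is written.

```elixir
defmodule OptsSketch do
  # Lenient: reads :data_mode and never notices a stray :mode key.
  def lenient(opts), do: Keyword.get(opts, :data_mode, :schema)

  # Strict: Keyword.validate!/2 raises ArgumentError on unknown keys
  # and fills in the default for missing ones.
  def strict(opts) do
    opts = Keyword.validate!(opts, data_mode: :schema)
    Keyword.fetch!(opts, :data_mode)
  end
end

# The outdated key is ignored and the default wins.
IO.inspect(OptsSketch.lenient(mode: :explore))

# The correct key is honoured.
IO.inspect(OptsSketch.strict(data_mode: :explore))

# OptsSketch.strict(mode: :explore) would raise ArgumentError instead of defaulting.
```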

Issue 2: test_runner.ex documentation references removed modes

Location: demo/lib/ptc_demo/test_runner.ex:97-103
Problem: Documentation still references :structured/:text modes which no longer exist. The agent now has :schema/:explore data modes.
Fix: Update the @doc comment and function to remove the mode parameter entirely, since test_runner should use the default schema mode. Remove:

  • Line 97: mode: :structured/:text (default :structured) - LLM generation mode
  • Lines 101-103: The mode variable and interpolation
  • Line 106: The mode: mode in ensure_agent_started call

If mode support is needed, update to use data_mode: :schema/:explore instead.

Issue 3: README.md still documents removed --structured flag

Location: demo/README.md:64-66, 71-80
Problem: The PR removes structured mode but README still documents --structured flag and has a "Generation Modes" section
Fix:

  1. Remove lines 64-66 (the --structured example)
  2. Remove lines 71-80 (the entire "Generation Modes" section)
  3. Remove any other references to --structured flag (lines 185-186, 200-203)

Issue 4: CLI silently ignores unrecognized --structured flag

Location: demo/lib/ptc_demo/cli.ex:21
Problem: CLI only checks for --explore, so users following stale docs will get unexpected behavior
Fix: After parsing data_mode (around line 21), add validation:

```elixir
# Validate args - only --explore is supported
unknown_flags =
  Enum.filter(args, fn arg -> String.starts_with?(arg, "--") and arg != "--explore" end)

if unknown_flags != [] do
  IO.puts("Unknown flags: #{Enum.join(unknown_flags, ", ")}. Only --explore is supported.")
  System.halt(1)
end
```
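As a point of comparison, the same check could be expressed with the standard library's `OptionParser` in strict mode, which reports unknown switches in its third return value. This is a sketch of an alternative, not what the PR implements; the `FlagSketch` module and its return shape are hypothetical.

```elixir
defmodule FlagSketch do
  # Strict parsing: any switch other than --explore lands in `invalid`.
  def parse(args) do
    case OptionParser.parse(args, strict: [explore: :boolean]) do
      {opts, _rest, []} ->
        mode = if Keyword.get(opts, :explore, false), do: :explore, else: :schema
        {:ok, mode}

      {_opts, _rest, invalid} ->
        # `invalid` is a list of {flag, value} tuples; keep just the flag names.
        {:error, Enum.map(invalid, fn {flag, _value} -> flag end)}
    end
  end
end

IO.inspect(FlagSketch.parse(["--explore"]))
IO.inspect(FlagSketch.parse(["--structured"]))
```

Returning a tagged tuple leaves the decision to print and halt with the caller, which keeps the parsing step easy to test.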

Please fix ALL issues above in a SINGLE commit. Make minimal changes to resolve each issue.

@andreasronge
Owner Author

Auto-Triage Summary

Decisions Made

| # | Issue | Decision | Action |
|---|-------|----------|--------|
| 1 | test_runner.ex uses wrong parameter name (line 176) | FIX_NOW | Included in fix request above |
| 2 | test_runner.ex documentation references removed modes (lines 97-101) | FIX_NOW | Included in fix request above |
| 3 | README.md still documents removed --structured flag (multiple locations) | FIX_NOW | Included in fix request above |
| 4 | CLI silently ignores --structured flag (cli.ex:21) | FIX_NOW | Included in fix request above |
| 5 | Add /programs to README Commands table | IGNORE | Nice-to-have documentation enhancement unrelated to PR purpose (adding explore mode) |

Reasoning

All four FIX_NOW items are mechanical cleanup tasks directly related to the PR's changes:

  • The PR refactored from mode: to data_mode: and removed structured mode support
  • test_runner.ex wasn't fully updated to reflect these changes
  • README still documents the removed --structured flag
  • CLI doesn't validate flags, causing silent failures

These are in-scope incomplete work - the PR established new patterns but didn't finish applying them everywhere. All fixes are mechanical (no design decisions needed).

Status

  • FIX_NOW items: 4 (batched in single fix comment above)
  • Issues created: 0
  • Items ignored: 1 (out of scope documentation enhancement)

github-actions bot added the auto-triage-cycle-1 ("Auto-triage cycle 1") and auto-triage-complete ("Auto-triage completed") labels and removed the auto-triage-pending ("Auto-triage in progress") label Dec 5, 2025

This commit fixes four related issues from PR review:

1. test_runner.ex: Changed parameter from 'mode:' to 'data_mode:' to match
   the Agent.start_link/1 API. Removed the mode parameter entirely since
   TestRunner should use the default schema mode.

2. test_runner.ex: Removed outdated documentation that referenced
   ':structured/:text' modes which no longer exist. The agent now uses
   ':schema/:explore' data modes.

3. README.md: Removed all references to the '--structured' flag that no
   longer exists:
   - Removed '--structured' example from quick start section
   - Removed entire 'Generation Modes' section
   - Updated test runner examples to remove mode parameter
   - Updated example output

4. cli.ex: Added validation to reject unknown flags and prevent silent
   failures when users follow stale documentation using '--structured'.

Changes are minimal and focused on alignment with the refactored Agent API.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
@andreasronge
Owner Author

Summary

All four issues from the PR review have been fixed in a single commit.

Changes Made

  1. test_runner.ex parameter fix: Changed line 176 from mode: mode to use the correct Agent API. Removed the mode parameter entirely since TestRunner should use the default schema mode.

  2. test_runner.ex documentation: Removed the outdated @doc comment referencing :structured/:text modes which no longer exist. Updated documentation to reflect that TestRunner uses default schema mode.

  3. README.md cleanup: Removed all references to the --structured flag:

    • Removed --structured example from quick start (lines 64-66)
    • Removed entire "Generation Modes" section (lines 71-80)
    • Updated test runner examples
    • Updated example output section
  4. CLI validation: Added validation in cli.ex to detect and reject unknown flags with a helpful error message. This prevents silent failures when users follow stale documentation.

Test Results

  • mix format passed
  • mix compile --warnings-as-errors passed
  • mix credo (372 mods/funs, no issues)
  • ✅ All 549 tests passed (5 doctests, 10 excluded)

Commit

Commit: cdedfbd - "fix: Remove outdated mode parameter and --structured flag references"

All changes have been pushed to the branch.

@github-actions
Contributor

github-actions bot commented Dec 5, 2025

PR Review: feat(demo): Add explore mode for schema discovery

Summary

Well-structured PR that adds explore mode to the demo app, allowing LLMs to discover data schemas via introspection operations. The implementation includes a refactored agentic loop that cleanly tracks execution state. The PR also removes the outdated --structured flag and adds comprehensive error formatting to the core library.

What's Good

  1. Clean agentic loop implementation - The new agent_loop/6 function elegantly handles multi-turn conversations with proper state threading for program/result tracking.

  2. Proper state management - The data_mode field is correctly integrated into the GenServer state with appropriate reset behavior.

  3. Comprehensive error formatting - The new PtcRunner.format_error/1 function provides clear, LLM-friendly error messages with good pattern matching for different error types.

  4. Good test coverage - The format_error/1 tests cover the major error categories effectively.

  5. Documentation updates - Most of the README has been properly updated to reflect the new schema/explore modes.

  6. Docs cleanup - Removal of completed "Implementation Phases" section from architecture.md and phase references from other docs is appropriate.

Issues (Must Fix)

  1. Outdated documentation - demo/README.md:161 and demo/README.md:208-209
    • Problem: The PR removes the --structured flag and changes the generation mode concept, but these lines still reference "text mode" vs "structured mode" which no longer exist.
    • Location: Line 161: "This demonstrates how text mode keeps token usage low compared to structured mode."
      Lines 208-209: "Structured mode: Uses JSON schema..." / "Text mode: Uses retry logic..."
    • Impact: Documentation is now incorrect and confusing since these modes no longer exist.
    • Fix: Remove or update these references. Line 161 could be deleted or changed to describe the current behavior. Lines 208-209 should be removed or replaced to describe the current single generation approach.
    • Complexity: Mechanical (same documentation pattern as rest of PR)

Suggestions (Optional)

  1. Nice-to-have: The @max_iterations 5 constant in agent.ex could potentially be made configurable via init/1 opts for testing flexibility, but the hardcoded value is reasonable for now.

Security

No concerns. The changes are focused on the demo app and don't introduce new attack surfaces. The format_error/1 function properly truncates error messages to prevent potential information leakage from large error payloads.

Documentation

  • Demo README is mostly updated but has the incomplete fix noted above
  • Architecture docs appropriately cleaned of phase references
  • Issue creation and planning guidelines updated consistently

Verdict

Request Changes - The outdated text/structured mode references in demo/README.md should be fixed before merge for documentation consistency. This is a mechanical fix following the same cleanup pattern the PR already establishes for mode terminology.

github-actions bot added the auto-triage-pending label ("Auto-triage in progress") Dec 5, 2025
@andreasronge
Owner Author

@claude please fix these issues from PR review:

Issue 1: Outdated structured/text mode references in README.md

Location: demo/README.md:161, 208-209
Problem: Documentation still references "structured mode" and "text mode" which were removed in this PR. The agent now uses a single generation approach with data modes (schema/explore) instead.
Fix:

  1. Remove line 161: "This demonstrates how text mode keeps token usage low compared to structured mode."
  2. Remove lines 208-209 (the two bullet points about "Structured mode" and "Text mode" under the "Generate" step)
  3. Optionally update line 207 to just say "LLM creates a compact PTC program (~200 bytes)" without the bullet points, or keep the Generate step simple.

Please fix in a SINGLE commit with minimal changes.

@andreasronge
Copy link
Copy Markdown
Owner Author

Auto-Triage Summary (Cycle 2 of 3)

Decisions Made

| # | Issue | Decision | Action |
|---|-------|----------|--------|
| 1 | Outdated structured/text mode references in README.md (lines 161, 208-209) | FIX_NOW | Included in fix request above |

Reasoning

This is a mechanical documentation cleanup that is directly in-scope:

  • The PR removes the structured/text mode concepts and already cleaned up most references
  • These three lines were missed in the initial cleanup
  • No design decisions needed - just remove outdated text
  • Follows the same pattern established in the first round of fixes

Status

  • FIX_NOW items: 1 (batched in single fix comment above)
  • Issues created: 0
  • Items ignored: 0

The review suggested one nice-to-have improvement (making @max_iterations configurable) which I'm ignoring as it's out of scope and truly optional.

github-actions bot added the auto-triage-cycle-2 ("Auto-triage cycle 2") label and removed the auto-triage-cycle-1 ("Auto-triage cycle 1") and auto-triage-pending ("Auto-triage in progress") labels Dec 5, 2025

- Remove outdated text about 'text mode vs structured mode' (line 161)
- Remove bullet points about structured/text mode under Generate step (lines 208-209)
- The agent now uses a single generation approach with data modes (schema/explore)
@andreasronge
Owner Author

Fix Summary

Fixed outdated documentation references in demo/README.md:

Changes made:

  • Removed line 161: "This demonstrates how text mode keeps token usage low compared to structured mode."
  • Removed bullet points about "Structured mode" and "Text mode" under the Generate step (formerly lines 208-209)
  • Simplified the "How It Works" section to reflect the current single generation approach with data modes (schema/explore)

Verification:

  • All quality checks passed: format, compilation, credo, tests (549 tests ✓)
  • Changes committed: 6e4b4d6 - "fix: Remove outdated mode references in demo README"

The documentation now accurately reflects that the agent uses a single generation approach with data modes rather than separate structured/text modes.

@github-actions
Contributor

github-actions bot commented Dec 5, 2025

PR Review: feat(demo): Add explore mode for schema discovery

Summary

This PR adds an "explore mode" to the demo app where LLMs must discover data structures via introspection (keys, typeof) before writing queries. It also refactors the agent to use an agentic loop pattern and adds a well-designed format_error/1 function to the main library. The implementation is complete and meets all acceptance criteria from issue #91.

What's Good

  1. Clean agentic loop design - The new agent_loop/6 function in demo/lib/ptc_demo/agent.ex:227-270 properly handles multi-turn conversations with clear iteration limits and state tracking.

  2. Good separation of concerns - Two distinct system prompts (system_prompt(:schema) and system_prompt(:explore)) make the mode difference clear and maintainable.

  3. Useful format_error/1 addition - The new error formatting in lib/ptc_runner.ex:114-210 is well-designed with clear message types (ParseError, ValidationError, TypeError, KeyError, etc.) that help LLMs understand and fix their programs.

  4. Complete CLI implementation - /mode, /mode schema, /mode explore commands, the --explore flag, and updates to /reset all work as specified.

  5. Result tracking improvement - Tracking last_program and last_result through the agent loop (lib/ptc_demo/agent.ex:155,264) is cleaner than extracting from context.

  6. Character-based truncation - The @max_result_chars 200 approach preserves short lists like keys output, which is important for explore mode.

  7. Thorough docs cleanup - Removed outdated "Implementation Phases" section from architecture.md and cleaned up references to structured/text modes.

  8. Good test coverage - New format_error/1 tests cover all error types including edge cases like truncation.
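The idea behind an LLM-friendly error formatter can be sketched with pattern matching plus a length cap. This is a hedged illustration only: the `ErrorSketch` module, the error tuple shapes, and the 200-character cap are assumptions, and the real `PtcRunner.format_error/1` may match on entirely different terms. Only the message type names (ParseError, ValidationError, TypeError, KeyError) and the fact that messages are truncated come from the review.

```elixir
defmodule ErrorSketch do
  @max_len 200

  # The tuple shapes below are assumptions made for illustration only.
  def format_error({:parse_error, msg}), do: clip("ParseError: #{msg}")
  def format_error({:validation_error, msg}), do: clip("ValidationError: #{msg}")
  def format_error({:type_error, msg}), do: clip("TypeError: #{msg}")
  def format_error({:key_error, key}), do: clip("KeyError: unknown key #{inspect(key)}")

  # Fallback so an unexpected error term still yields a readable message.
  def format_error(other), do: clip("Error: #{inspect(other)}")

  # Cap the length so a large payload does not flood the LLM context.
  defp clip(text) do
    if String.length(text) > @max_len do
      String.slice(text, 0, @max_len) <> "..."
    else
      text
    end
  end
end

IO.puts(ErrorSketch.format_error({:key_error, "price"}))
IO.puts(ErrorSketch.format_error(:timeout))
```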

Issues (Must Fix)

None. The implementation is complete and consistent.

Suggestions (Optional)

  1. OPTIONAL: Consider documenting format_error/1 in architecture.md - This new public API function is useful for integrators beyond the demo. A brief mention in the error handling section could help users discover it.

  2. Nice-to-have: Explicit agent loop termination reasons - Currently when max iterations is hit, the error is just "Max iterations reached". Could include context about what was being attempted, but this is minor since 5 iterations is generous.

Security

No concerns. The demo app only operates on in-memory sample data, and the sandbox limits (timeout, memory) remain in place.

Documentation

  • README.md updated with explore mode documentation ✓
  • /help output updated with new commands ✓
  • architecture.md cleaned up (removed outdated phases) ✓
  • Module docs updated appropriately ✓

Verdict

Approve - This is a well-implemented feature that meets all acceptance criteria. The code is clean, the agentic loop is properly bounded, tests pass, and documentation is updated. The format_error/1 addition to the main library is a nice bonus that improves the library's usability.

github-actions bot added the auto-triage-pending label ("Auto-triage in progress") Dec 5, 2025
@andreasronge
Owner Author

Auto-Triage Summary (Cycle 3 of 3 - Final)

Review Result

The third review shows Approve verdict with no "Must Fix" issues.

Decisions Made

| # | Issue | Decision | Action |
|---|-------|----------|--------|
| 1 | Consider documenting format_error/1 in architecture.md | IGNORE | Optional documentation enhancement out of PR scope |
| 2 | Add explicit agent loop termination reasons | IGNORE | Nice-to-have improvement marked as "minor" by reviewer |

Reasoning

Both suggestions are truly optional improvements:

  • Issue 1: The PR's purpose is adding explore mode, not comprehensive API documentation. The format_error/1 function is working correctly and has good inline documentation.
  • Issue 2: Explicitly marked as "nice-to-have" and "minor" by the reviewer. The current max iterations error is clear and the 5-iteration limit is generous.

The PR successfully implements all acceptance criteria from issue #91:

  • ✅ Schema discovery via introspection operations
  • ✅ Agentic loop with proper state tracking
  • ✅ CLI commands (/mode, --explore flag)
  • ✅ Character-based result truncation
  • ✅ Comprehensive documentation updates
  • ✅ All tests passing (549 tests)

Status

  • FIX_NOW items: 0
  • Issues created: 0
  • Items ignored: 2 (both optional enhancements)

PR is ready to merge. Adding ready-to-merge label.

andreasronge added the ready-to-merge label ("PR is ready to be merged") Dec 5, 2025
github-actions bot added the auto-triage-cycle-3 ("Auto-triage cycle 3 (max)") label and removed the auto-triage-cycle-2 ("Auto-triage cycle 2") and auto-triage-pending ("Auto-triage in progress") labels Dec 5, 2025
andreasronge enabled auto-merge (squash) December 5, 2025 06:24
andreasronge merged commit 69e0d09 into main Dec 5, 2025
4 checks passed
andreasronge deleted the 91-add-explore-mode-demo branch December 5, 2025 06:24

Labels

  • auto-triage-complete: Auto-triage completed
  • auto-triage-cycle-3: Auto-triage cycle 3 (max)
  • claude-review: Add to PRs to trigger Claude automated review
  • ready-to-merge: PR is ready to be merged

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Add explore mode for demo app

1 participant