Closed
🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude <noreply@anthropic.com>
andreasronge
added a commit
that referenced
this pull request
Dec 1, 2025
…x execution (#10)

* feat: implement Phase 1 core interpreter with JSON parsing and sandbox execution

Implements the foundational Phase 1 of PtcRunner including:

- **Parser**: JSON string and map parsing with error handling
- **Validator**: DSL schema validation for all Phase 1 operations
- **Context**: Variable bindings management
- **Operations**: Core data, control flow, collection, comparison, and aggregation operations
  - Data: literal, load, var
  - Control: pipe
  - Collections: filter, map, select
  - Comparison: eq
  - Aggregations: sum, count
- **Interpreter**: AST evaluation with operation dispatch
- **Sandbox**: Isolated BEAM process execution with timeout and resource monitoring

All programs execute in isolated processes with configurable timeouts and memory limits, returning execution metrics (duration_ms, memory_bytes).

Includes comprehensive test coverage for all operations, error cases, and edge cases.

Fixes #7

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>

* fix: resolve PR review issues for validation and sandbox

- Fix validate_list/1: Use Enum.reduce_while instead of Enum.find to properly validate all nested operations (Issue #1)
- Fix sandbox memory limits: Pass max_heap option to Process.spawn to enforce memory limits (Issue #2)
- Add test for nested validation errors: Verify validation catches unknown operations inside pipe steps (Issue #3)

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>

* fix: resolve PR review issues for timeout and memory errors

- Fix timeout error format to match architecture.md: return {:error, {:timeout, timeout_value}} instead of {:error, :timeout}
- Fix memory exceeded error detection: distinguish between memory kills (:killed reason) and timeout kills
- Update sandbox.ex:72-76 to handle :killed reason as memory exceeded with {:error, {:memory_exceeded, max_heap * 8}}
- Update sandbox.ex:82 to return {:error, {:timeout, timeout}} for explicit timeout kills
- Update test case to expect new timeout error format {:timeout, 0} instead of :timeout

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>

* chore: update PM status - PR #10 Phase 1 implementation

---------

Co-authored-by: claude[bot] <41898282+claude[bot]@users.noreply.github.com>
Co-authored-by: Claude <noreply@anthropic.com>
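The nested-validation fix in the commit above (stop at the first error found anywhere inside a pipe's steps) can be sketched as follows. This is a Python analogue for illustration only, not the project's Elixir code: the operation names come from the commit message, while `KNOWN_OPS`, `validate`, the `"op"`/`"steps"` keys, and the result tuples are assumptions.

```python
# Hypothetical analogue of validating nested DSL operations.
# Operation names are taken from the commit message; the node shape
# ({"op": ..., "steps": [...]}) is an assumption for this sketch.
KNOWN_OPS = {
    "literal", "load", "var", "pipe",
    "filter", "map", "select", "eq", "sum", "count",
}


def validate(node):
    """Return ("ok", None) or ("error", message) for one operation node."""
    if not isinstance(node, dict) or "op" not in node:
        return ("error", "missing 'op' field")
    op = node["op"]
    if op not in KNOWN_OPS:
        return ("error", f"unknown operation '{op}'")
    if op == "pipe":
        # Recurse so errors inside pipe steps are reported, not skipped.
        return validate_list(node.get("steps", []))
    return ("ok", None)


def validate_list(nodes):
    """Validate every node in order, halting on the first error."""
    for node in nodes:
        result = validate(node)
        if result[0] == "error":
            return result  # propagate the error itself, like a halting reduce
    return ("ok", None)
```

The loop returns the failing result rather than the failing element, which is roughly the difference between a halting reduce and a plain find.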
This was referenced Dec 1, 2025
andreasronge
added a commit
that referenced
this pull request
Dec 8, 2025
… PM workflow

Restore PM workflow's ability to create issues from specification documents, adapted for multi-phase PTC-Lisp implementation with GitHub Project tracking.

Changes:
- Add spec document registry (api-refactor, parser, analyzer, eval, integration)
- Integrate with GitHub Project #1 (auto-add issues, set Phase field)
- Update project Status to "In Progress" when triggering implementation
- Add duplicate issue prevention check before creating new issues
- Create phase labels (phase:api-refactor, phase:parser, etc.)
- Document project field IDs and option mappings

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
This was referenced Dec 9, 2025
This was referenced Dec 9, 2025
This was referenced Dec 22, 2025
claude bot
pushed a commit
that referenced
this pull request
Dec 27, 2025
Extract shared parenthesis-counting logic into a single `paren_balance/1` helper function that returns the balance state. Wrap it in three semantic predicates for different use cases:

- `paren_balance/1`: Core helper returning balance count or :unbalanced
- `balanced_parens?/1`: Check if parens are balanced (count == 0)
- `has_unbalanced_parens?/1`: Check if unbalanced (count == :unbalanced)
- `has_more_closing_than_opening?/1`: Count without halting early

This consolidates nearly identical logic from:

- has_more_closing_than_opening?/1 (lines 345-358) - new in PR #313
- balanced_parens?/1 (lines 449-460) - new in PR #313
- has_unbalanced_parens?/1 (lines 529-540) - pre-existing

Fixes PR #314 review issue #1: consolidate duplicated parenthesis counting.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Haiku 4.5 <noreply@anthropic.com>
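One way to read the consolidation described in this commit is a single counting helper with thin predicates on top. The sketch below is a Python analogue, not the SpecValidator source: the function names mirror the commit message (without the Elixir `?` suffix), while the `UNBALANCED` sentinel and the exact semantics of each predicate are assumptions. It also ignores parentheses inside strings or comments.

```python
UNBALANCED = "unbalanced"  # stand-in for the :unbalanced atom (assumption)


def paren_balance(text):
    """Return the open-paren depth, or UNBALANCED once a ')' has no match."""
    depth = 0
    for ch in text:
        if ch == "(":
            depth += 1
        elif ch == ")":
            depth -= 1
            if depth < 0:
                return UNBALANCED  # halt early: closing paren without opener
    return depth


def balanced_parens(text):
    """True when every '(' is matched by a later ')'."""
    return paren_balance(text) == 0


def has_unbalanced_parens(text):
    """True when some ')' appears before its matching '('."""
    return paren_balance(text) == UNBALANCED


def has_more_closing_than_opening(text):
    """Compare total counts over the whole text, without halting early."""
    return text.count(")") > text.count("(")
```

Under these assumed semantics, a line such as `(* x y))` is both unbalanced and has more closing than opening parens, whereas `)(` is unbalanced but has equal counts, which is why the last predicate cannot reuse the early-halting result.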
andreasronge
pushed a commit
that referenced
this pull request
Dec 27, 2025
* feat: add multi-line example extraction to SpecValidator (#313)

Implement two-pass extraction approach:

- Pass 1: Extract single-line examples (existing functionality)
- Pass 2: Scan for lines with unbalanced closing parens and assemble backward to find the opening

This enables extraction of multi-line PTC-Lisp code examples from the specification that previously were filtered out as "fragments". Examples like:

    (let [x 10
          y (+ x 5)]
      (* x y))  ; => 150

are now properly assembled and validated.

Also marked 4 examples in the spec as skipped (with `; => ...`) because they require unimplemented features or context that isn't available in the test environment.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Haiku 4.5 <noreply@anthropic.com>

* fix: consolidate duplicated parenthesis counting logic in SpecValidator

Extract shared parenthesis-counting logic into a single `paren_balance/1` helper function that returns the balance state. Wrap it in three semantic predicates for different use cases:

- `paren_balance/1`: Core helper returning balance count or :unbalanced
- `balanced_parens?/1`: Check if parens are balanced (count == 0)
- `has_unbalanced_parens?/1`: Check if unbalanced (count == :unbalanced)
- `has_more_closing_than_opening?/1`: Count without halting early

This consolidates nearly identical logic from:

- has_more_closing_than_opening?/1 (lines 345-358) - new in PR #313
- balanced_parens?/1 (lines 449-460) - new in PR #313
- has_unbalanced_parens?/1 (lines 529-540) - pre-existing

Fixes PR #314 review issue #1: consolidate duplicated parenthesis counting.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Haiku 4.5 <noreply@anthropic.com>

---------

Co-authored-by: claude[bot] <41898282+claude[bot]@users.noreply.github.com>
Co-authored-by: Claude Haiku 4.5 <noreply@anthropic.com>
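The two-pass extraction described in the first commit of this squash can be sketched roughly as below. This is a simplified Python illustration under stated assumptions, not the SpecValidator implementation: `paren_delta` and `extract_examples` are hypothetical names, examples are assumed to start with `(`, and parentheses inside strings or comments are not handled.

```python
def paren_delta(line):
    """Opening minus closing parens on one line (naive character count)."""
    return line.count("(") - line.count(")")


def extract_examples(lines):
    """Pass 1 keeps balanced single lines; pass 2 assembles multi-line forms."""
    examples = []
    used = set()

    # Pass 1: single-line examples that are already balanced.
    for i, line in enumerate(lines):
        stripped = line.strip()
        if stripped.startswith("(") and paren_delta(stripped) == 0:
            examples.append(stripped)
            used.add(i)

    # Pass 2: a line with surplus closing parens ends a multi-line form;
    # walk backward until the running balance returns to zero.
    for i, line in enumerate(lines):
        if i in used or paren_delta(line) >= 0:
            continue
        balance = 0
        for start in range(i, -1, -1):
            if start in used:
                break  # ran into a line already claimed by pass 1
            balance += paren_delta(lines[start])
            if balance == 0:
                block = "\n".join(l.rstrip() for l in lines[start:i + 1])
                examples.append(block)
                break
            if balance > 0:
                break  # more openers than needed: not a clean form
    return examples


spec_lines = [
    "(+ 1 2) ; => 3",
    "(let [x 10",
    "      y (+ x 5)]",
    "  (* x y)) ; => 150",
]
found = extract_examples(spec_lines)
```

With the sample lines above, pass 1 picks up `(+ 1 2)` and pass 2 reassembles the three-line `let` form ending at the line with the extra closing paren.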
This was referenced Dec 27, 2025
This was referenced Dec 31, 2025
This was referenced Jan 9, 2026
This was referenced Jan 24, 2026
andreasronge
added a commit
that referenced
this pull request
Mar 22, 2026
Adds an opt-in auto-return mode where println presence controls the multi-turn loop: println = exploration (continue), no println = answer (return last expression). This eliminates the return/fail interaction rules that were the #1 model failure mode in benchmarks.

Key design decisions:
- completion_mode: :explicit (default) preserves all existing behavior
- completion_mode: :auto uses a simpler prompt (33% smaller) with one rule instead of three
- Auto-return is disabled when a plan is present; plan agents use the regular multi_turn_journal prompt with explicit return/step-done
- Plan always auto-enables journaling for progress checklist rendering
- return/fail still work as escape hatches in auto mode

Benchmark results (gemini-3.1-flash-lite-preview, 3 runs of 25 tests):
- auto_return: 97.3% avg pass rate, 2214 avg tokens
- multi_turn: 94.7% avg pass rate, 2794 avg tokens
- 21% fewer tokens, 15% faster, same or better accuracy

Also includes coordinator + worker delegation scripts for testing Claude Code-style agent composition patterns.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
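The completion rules listed in this commit message can be condensed into one decision function. The following is a hedged Python sketch of that reading, not the project's code: `next_action`, its parameters, and the returned labels are all hypothetical, and the explicit-mode branch assumes the pre-existing behavior is "keep looping until return/fail is called".

```python
def next_action(completion_mode, has_plan, printed_output, called_return_or_fail):
    """Decide what the multi-turn loop does after one turn (illustrative only).

    completion_mode: "explicit" or "auto" (mirrors :explicit / :auto)
    has_plan: True when the agent runs with a plan
    printed_output: True when the turn's program called println
    called_return_or_fail: True when the program called return or fail
    """
    # return/fail end the loop in both modes (escape hatch in auto mode).
    if called_return_or_fail:
        return "finish_explicit"

    # Auto-return is disabled when a plan is present.
    auto_active = completion_mode == "auto" and not has_plan
    if not auto_active:
        return "continue"  # assumed explicit behavior: wait for return/fail

    # Auto mode's single rule: println means exploration, otherwise answer.
    if printed_output:
        return "continue"
    return "return_last_expression"
```

For example, `next_action("auto", False, False, False)` yields `"return_last_expression"`, while the same turn under a plan yields `"continue"`.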
Summary
Test plan
This PR tests the newly added GitHub Actions workflows.
🤖 Generated with Claude Code