Refactor analysis sandbox with document VFS and expanded search by ggozad · Pull Request #342 · ggozad/haiku.rag

ggozad · 2026-04-14T09:00:44Z

Added

- Document virtual filesystem in analysis sandbox: Documents are mounted at /documents/{id}/ with metadata.json (eager), content.txt (lazy), and items.jsonl (lazy). Standard Python pathlib.Path is used to browse and read document content and structure.
- execute_code skill tool: Direct code execution in the sandbox, surfaced as individual AG-UI events in the chat TUI. Items VFS uses a lazy bulk cache (~1s for 1000 documents vs 60s+ with per-document queries).
- cite skill tool: Explicit citation registration with per-turn tracking via the citation_index and citations fields in state.
- --skill flag for chat TUI: Use haiku-rag chat -s rag -s analysis to enable specific skills.
- --model overrides all agents: Chat, QA, research, and analysis agents all use the specified model.
- Collapsible program display in chat TUI: Analysis code execution results are shown as expandable code blocks.
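
As a rough illustration of the VFS layout described above, the sketch below builds a stand-in directory tree in a temporary folder and walks it with pathlib. The document id `doc-1` and every field name inside metadata.json and items.jsonl are assumptions made for the example; the PR does not show the real schemas. Inside the sandbox the root would be /documents rather than a temp directory.

```python
import json
import tempfile
from pathlib import Path


def build_demo_vfs(root: Path) -> None:
    """Create a stand-in for /documents/{id}/ with hypothetical contents."""
    doc_dir = root / "documents" / "doc-1"
    doc_dir.mkdir(parents=True)
    (doc_dir / "metadata.json").write_text(json.dumps({"title": "Demo"}))
    (doc_dir / "content.txt").write_text("First section.\nSecond section.\n")
    # Field names below are assumptions, not the real items.jsonl schema.
    items = [
        {"ref": "#/texts/0", "label": "section_header", "text": "First section."},
        {"ref": "#/texts/1", "label": "text", "text": "Second section."},
    ]
    lines = "\n".join(json.dumps(item) for item in items) + "\n"
    (doc_dir / "items.jsonl").write_text(lines)


def summarize(documents: Path) -> list[dict]:
    """Walk each document directory and collect a small summary."""
    summaries = []
    for doc_dir in sorted(p for p in documents.iterdir() if p.is_dir()):
        metadata = json.loads((doc_dir / "metadata.json").read_text())
        content = (doc_dir / "content.txt").read_text()
        # items.jsonl holds one JSON object per line.
        raw = (doc_dir / "items.jsonl").read_text().splitlines()
        items = [json.loads(line) for line in raw if line.strip()]
        summaries.append(
            {
                "id": doc_dir.name,
                "title": metadata.get("title"),
                "chars": len(content),
                "items": len(items),
            }
        )
    return summaries


with tempfile.TemporaryDirectory() as tmp:
    build_demo_vfs(Path(tmp))
    result = summarize(Path(tmp) / "documents")
    print(result)
```

Reading metadata.json first and touching content.txt or items.jsonl only when needed fits the eager/lazy split noted above, since the lazy files are presumably the expensive ones to materialize.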

Changed

- BREAKING: Flatten skill architecture: Skill sub-agents now call search, execute_code, cite, list_documents, get_document directly, so every tool call surfaces as an AG-UI event. This removes the third agent layer, where ask/analyze/research spawned inner agents whose tool calls were invisible.
- BREAKING: Rename RLM agent to analysis agent throughout:
  - agents/rlm/ → agents/analysis/, all classes renamed (RLMResult → AnalysisResult, etc.)
  - client.rlm() → client.analyze()
  - CLI: haiku-rag rlm → haiku-rag analyze
  - MCP: rlm_question → analyze
  - Config: rlm: → analysis: in YAML, RLMConfig → AnalysisConfig
  - Skill entrypoint: rag-rlm → rag-analysis
- Analysis sandbox search() returns expanded results with doc_item_refs and labels for cross-referencing with items.jsonl.
- list_documents skill tool takes no parameters and returns all documents.
- Per-turn citation tracking: citation_index: dict[str, Citation] (deduplicated) plus citations: list[list[str]] (per-turn chunk IDs) replace the flat citation list.
- Search rate limiting: Skill search tool enforces config.qa.max_searches.
- Context expansion respects section boundaries: Sections within the char budget are returned whole regardless of item count. For sections larger than the budget, expansion is bounded by the section edges. Adjacent sections no longer merge; only overlapping ranges do.
- Visualization shows full expanded section: visualize_chunk expands context before resolving bounding boxes, so all pages the section spans get highlighted.
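
The per-turn citation state described above can be pictured with a small sketch. Only the two field names and their types (citation_index: dict[str, Citation] and citations: list[list[str]]) come from the changelog; the `Citation` fields, the `CitationState` class, and the `start_turn`/`cite` methods are hypothetical stand-ins and not the project's actual API.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Citation:
    """Hypothetical stand-in; the real Citation model is not shown in the PR."""

    chunk_id: str
    document_title: str


@dataclass
class CitationState:
    # Deduplicated lookup: one entry per chunk id across the whole chat.
    citation_index: dict[str, Citation] = field(default_factory=dict)
    # One list of chunk ids per turn, in the order they were cited.
    citations: list[list[str]] = field(default_factory=list)

    def start_turn(self) -> None:
        """Open a new, empty citation list for the next turn."""
        self.citations.append([])

    def cite(self, citation: Citation) -> None:
        """Register a citation for the current turn."""
        if not self.citations:
            self.start_turn()
        # Keep the first Citation seen for a chunk id; later ones are ignored.
        self.citation_index.setdefault(citation.chunk_id, citation)
        turn = self.citations[-1]
        if citation.chunk_id not in turn:
            turn.append(citation.chunk_id)


state = CitationState()
state.start_turn()
state.cite(Citation("c1", "Report A"))
state.cite(Citation("c2", "Report B"))
state.start_turn()
state.cite(Citation("c1", "Report A"))  # cited again in a later turn
print(list(state.citation_index), state.citations)
```

Splitting the state this way lets a UI resolve any chunk id once through citation_index while still knowing which chunks back which turn.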

Removed

- ask skill tool: Replaced by direct search + cite; the skill sub-agent searches and answers directly
- analyze skill tool: Replaced by direct execute_code + search + cite
- research skill tool: Removed from skill layer (still available via CLI haiku-rag research and MCP)
- get_document(), get_docling_document(): Removed from analysis sandbox, replaced by VFS
- get_chunk(): Removed from analysis sandbox; search results include expanded context
- create_analysis_toolset(): Removed unused tools/analysis.py module
- qa_history, reports from skill state: Conversational context handled by the outer chat agent
- combine_filters, build_document_filter: Removed from public API
- max_context_items: Removed from SearchConfig; max_context_chars is the sole expansion constraint
- QAHistoryEntry, tools/qa.py: Removed unused QA history model and relevance threshold
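
With max_context_items gone, the character budget alone bounds context expansion. The sketch below is one possible reading of the rules stated in the changelog (whole section if it fits, otherwise grow around the hit without leaving the section, and merge only overlapping ranges). The function names, the outward-growth strategy, and the use of inclusive item indexes are assumptions; the PR does not show the actual algorithm.

```python
def expand_within_section(
    item_lengths: list[int], hit: int, max_context_chars: int
) -> tuple[int, int]:
    """Return inclusive (start, end) item indexes to include around `hit`.

    `item_lengths` holds the character counts of the items in ONE section,
    so the result can never cross a section edge.
    """
    # A section that fits the budget is returned whole, whatever its item count.
    if sum(item_lengths) <= max_context_chars:
        return 0, len(item_lengths) - 1

    # Otherwise grow outward from the hit while the budget allows (assumed strategy).
    start = end = hit
    used = item_lengths[hit]
    grew = True
    while grew:
        grew = False
        if start > 0 and used + item_lengths[start - 1] <= max_context_chars:
            start -= 1
            used += item_lengths[start]
            grew = True
        if end < len(item_lengths) - 1 and used + item_lengths[end + 1] <= max_context_chars:
            end += 1
            used += item_lengths[end]
            grew = True
    return start, end


def merge_overlapping(ranges: list[tuple[int, int]]) -> list[tuple[int, int]]:
    """Merge inclusive ranges that overlap; merely adjacent ranges stay separate."""
    merged: list[tuple[int, int]] = []
    for start, end in sorted(ranges):
        if merged and start <= merged[-1][1]:
            merged[-1] = (merged[-1][0], max(merged[-1][1], end))
        else:
            merged.append((start, end))
    return merged


small = expand_within_section([100, 200, 50], hit=1, max_context_chars=1000)
large = expand_within_section([400, 300, 400, 300], hit=1, max_context_chars=800)
merged = merge_overlapping([(0, 3), (4, 6), (5, 9)])
print(small, large, merged)
```

In the last call, (0, 3) and (4, 6) touch but do not overlap, so they stay separate, while (4, 6) and (5, 9) share items and collapse into one range.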

codecov · 2026-04-14T09:08:27Z

Codecov Report

❌ Patch coverage is 96.88474% with 10 lines in your changes missing coverage. Please review.
✅ Project coverage is 91.78%. Comparing base (1049586) to head (8e555ab).
⚠️ Report is 25 commits behind head on main.

Files with missing lines	Patch %	Lines
...slim/haiku/rag/store/repositories/document_item.py	73.68%	5 Missing ⚠️
haiku_rag_slim/haiku/rag/mcp.py	33.33%	2 Missing ⚠️
...aiku_rag_slim/haiku/rag/agents/analysis/sandbox.py	99.27%	1 Missing ⚠️
haiku_rag_slim/haiku/rag/client.py	95.65%	1 Missing ⚠️
..._rag_slim/haiku/rag/store/repositories/document.py	83.33%	1 Missing ⚠️

Additional details and impacted files

@@            Coverage Diff             @@
##             main     #342      +/-   ##
==========================================
- Coverage   92.18%   91.78%   -0.41%     
==========================================
  Files          75       73       -2     
  Lines        3954     3942      -12     
==========================================
- Hits         3645     3618      -27     
- Misses        309      324      +15


Replace get_document() and get_docling_document() with a VFS at /documents/{id}/ with metadata.json (eager), content.txt (lazy), and items.jsonl (lazy). Keep search(), list_documents() (now returns all), and llm() as external functions.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>


ggozad force-pushed the feat/analysis branch 3 times, most recently from 12561ed to 0fa7271 · April 16, 2026 11:00

ggozad changed the title ~~Rename RLM agent to analysis and replace get_chunk with get_context~~ Refactor analysis sandbox with document VFS and expanded search Apr 16, 2026

ggozad and others added 12 commits April 17, 2026 18:32

remove unused create_analysis_toolset and AnalysisResult

499a843

rename RLM agent to analysis throughout the codebase

d2b3ba1

add get_context() to analysis sandbox and improve prompt

44fd0c9

fold context expansion into sandbox search and remove get_context

4118533

expose doc_item_refs and labels in sandbox search results

bacc21b

add --skill flag to chat TUI for rag and analysis skills

2a0e89b

remove filter parameter from analyze skill tool and clean up unused filter helpers

aa4ce07

When setting --model, set all subagents as well

cf89ff5

add citation support to analysis agent

7d98d0e

add collapsible program display to chat TUI

20e40d7

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

flatten skill architecture: replace ask/analyze/research with direct tools

d52f453

ggozad force-pushed the feat/analysis branch from 7e61535 to d52f453 · April 17, 2026 15:38

ggozad added 10 commits April 20, 2026 09:48

Remove documents from state, fix tests

68a9f19

Docs & cl

fa87cf7

use MontyRepl for persistent variables across execute_code calls

4d75943

Improve analysis SKILL.md

27c5def

repository methods, module-level executor, read-only VFS

167c837

reset sandbox per skill invocation to prevent state leaks

9d921b1

revert REPL persistence: fresh sandbox per execute_code call

579e609

remove cited_chunks from analysis agent, hoist _deny_write out of loop

cd0c21c

bulk-fetch items.jsonl via lazy cache to avoid per-document query timeout

af7731f

fix context expansion: respect section boundaries, remove max_context_items

ff8ad08

ggozad force-pushed the feat/analysis branch from 3e73a0c to ff8ad08 · April 20, 2026 12:18

remove dead code, add tool descriptions in frontend, update changelog

eb4e172

Improve coverage

8e555ab

ggozad merged commit 2cd4a85 into main Apr 20, 2026
4 of 5 checks passed

ggozad deleted the feat/analysis branch April 20, 2026 13:34

ggozad merged 24 commits into main from feat/analysis
