Skip to content

Commit 88abf70

Browse files
authored
Prevent agent log loss during pier exec (#4)
2 parents 02ec929 + 36f8a0e commit 88abf70

File tree

8 files changed

+902
-69
lines changed

8 files changed

+902
-69
lines changed

DEVELOPER.md

Lines changed: 110 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,110 @@
1+
# Developer Guide
2+
3+
## Setup
4+
5+
```bash
6+
git clone https://github.com/allenai/pier && cd pier
7+
make check # run tests, lint, and typecheck
8+
uv run pre-commit install # optional: auto-lint and format on commit
9+
uv tool install -e . # editable install — use `pier` from any directory
10+
```
11+
12+
## Testing
13+
14+
```bash
15+
uv run --extra dev pytest -rs # fast — skips Docker agent tests
16+
```
17+
18+
Docker-backed agent integration suite:
19+
20+
```bash
21+
PIER_RUN_DOCKER_INTEGRATION=1 uv run --extra dev pytest -rs pier/tests/test_agent_integration.py
22+
PIER_RUN_DOCKER_INTEGRATION=1 PIER_TEST_AGENTS=codex uv run --extra dev pytest -rs pier/tests/test_agent_integration.py
23+
PIER_RUN_DOCKER_INTEGRATION=1 PIER_TEST_IMAGE=ubuntu:24.04 uv run --extra dev pytest -rs pier/tests/test_agent_integration.py
24+
```
25+
26+
## Agent Log Capture
27+
28+
When agents run inside containers via `pier exec`, their output and session
29+
logs need to survive container teardown. Harbor mounts
30+
`workspace/.pier/_harbor/agent/``/logs/agent` in the container, so
31+
anything written there persists on the host.
32+
33+
### Two capture mechanisms
34+
35+
**Session recording**`pier exec` auto-detects agent commands by matching
36+
the command name against Harbor's agent registry (`get_binary_agent_map()`).
37+
When an agent is detected, the command is wrapped with `script -q -c` to
38+
record terminal output to `/logs/agent/exec/<ts>/<agent>.txt` while preserving
39+
full TTY behavior (colors, cursor, interactive prompts).
40+
41+
**Config-dir env vars**`CLAUDE_CONFIG_DIR` and `CODEX_HOME` are set on
42+
every container exec, pointing into the per-session directory. These agents
43+
write structured session logs (JSONL, session dirs) that survive container
44+
teardown. User `-e` overrides win via `setdefault`.
45+
46+
### Per-session isolation
47+
48+
Every `pier exec` creates a timestamped directory directly under
49+
the agent log dir:
50+
51+
```
52+
workspace/.pier/_harbor/agent/
53+
exec/
54+
2026-04-09_18-30-00-a1b2c3/ # pier exec claude ...
55+
claude-code.txt # session recording
56+
sessions/projects/hash/ # Claude JSONL (via CLAUDE_CONFIG_DIR)
57+
2026-04-09_18-35-00-d4e5f6/ # pier exec goose ...
58+
goose.txt # session recording
59+
2026-04-09_18-40-00-g7h8i9/ # pier exec bash
60+
sessions/projects/hash/ # Claude JSONL if run inside bash
61+
setup/ # Harbor agent install logs (not pier-managed)
62+
```
63+
64+
Each session directory mirrors Harbor's flat `/logs/agent/` layout. When
65+
`pier verify` or `pier capture` extracts trajectory,
66+
`_latest_session_dir()` (called by `extract_agent_context()`) scans by
67+
reverse timestamp to find the most recent directory containing the
68+
requested agent's files.
69+
70+
### What pier hardcodes vs derives from Harbor
71+
72+
| Knowledge | Source | Location |
73+
|-----------|--------|----------|
74+
| CLI binary names (claude, codex, goose...) | Harbor's `get_version_command()` | `get_binary_agent_map()` |
75+
| Log-dir env vars (CLAUDE_CONFIG_DIR, CODEX_HOME) | Harbor's `EnvironmentPaths` | `get_log_capture_env()` |
76+
| Post-run artifact commands (gemini, hermes) | Hardcoded from Harbor's `run()` | `get_post_run_commands()` |
77+
| Behavior env vars (IS_SANDBOX, PATH prefixes) | Hardcoded per-agent | `get_agent_exec_env()` |
78+
| Session recording filename | Convention: `<agent-name>.txt` | `_exec_container()` |
79+
80+
Binary names and log-dir values are derived from Harbor at runtime.
81+
The rest is hardcoded in `harbor_bridge.py` — all grouped under the
82+
"Agent log capture" section. When Harbor exposes APIs for
83+
`get_log_env_vars()`, `get_post_run_commands()`, and `get_cli_binary()`,
84+
these can be replaced.
85+
86+
### Post-run artifact collection
87+
88+
Some agents produce structured artifacts beyond raw output. Harbor's
89+
`run()` collects these after the agent exits:
90+
91+
- **gemini-cli**: copies `~/.gemini/tmp/session-*.json``gemini-cli.trajectory.json`
92+
- **hermes**: runs `hermes sessions export``hermes-session.jsonl`
93+
94+
Pier replicates these via `get_post_run_commands()`, run after the agent
95+
exits but while the container is still up. If a new Harbor agent adds
96+
post-run steps, add them there.
97+
98+
### Multiple sessions
99+
100+
`pier verify` and `pier capture` auto-find the most recent session
101+
for the requested agent. Older sessions are preserved on disk and can be
102+
selected with `--session <timestamp>` on verify/capture.
103+
104+
### Limitations
105+
106+
- **`pier exec bash` + manual agent** — only config-dir agents
107+
(Claude, Codex) get structured logs captured. Other agents need the
108+
`script` wrapper, which only applies when pier detects the agent command.
109+
- **Agents with complex entry points** (openhands, swe-agent) aren't
110+
auto-detected — their binary is `python`/`pip`, not a unique name.

README.md

Lines changed: 3 additions & 33 deletions
Original file line numberDiff line numberDiff line change
@@ -160,11 +160,11 @@ Run a command in the workspace context. Sets workspace env vars so task CLIs fin
160160

161161
```bash
162162
pier exec bash
163-
pier exec claude
163+
pier exec claude # agent auto-detected → full output captured
164164
pier exec -d -- quarto preview --port 8888 --host 0.0.0.0 --no-browse
165165
```
166166

167-
- **Container mode**: delegates to `docker exec` in the container's working directory
167+
- **Container mode**: delegates to `docker exec` in the container's working directory. Agent commands are auto-detected and their output is recorded via `script`. Each exec gets its own timestamped directory so sessions don't overwrite each other. `CLAUDE_CONFIG_DIR` and `CODEX_HOME` are always set so structured session logs land in the mounted volume.
168168
- **Host mode**: runs the command directly with `TASK_WORKSPACE` set
169169
- `-d` / `--detach`: runs in the background (useful for servers like quarto preview)
170170

@@ -223,34 +223,4 @@ Install task skills for your coding agent (host mode only). Reads `skills_dir` f
223223

224224
## Development
225225

226-
### Architecture
227-
228-
Pier stays **agent-agnostic** — it should not need changes when Harbor adds a new agent. Agent-specific details (session layouts, env vars, detection heuristics) belong in Harbor, not pier. `pier/harbor_bridge.py` is the thin boundary between the two; when it must contain agent-specific code, it should be treated as temporary scaffolding until Harbor exposes the right API. `pier/cli.py` should never reference individual agent names.
229-
230-
```bash
231-
git clone https://github.com/allenai/pier && cd pier
232-
make check # run tests, lint, and typecheck
233-
uv run pre-commit install # optional: auto-lint and format on commit
234-
uv tool install -e . # editable install — use `pier` from any directory
235-
```
236-
237-
### Testing
238-
239-
Default test runs stay fast and skip Docker-backed agent install smoke tests:
240-
241-
```bash
242-
uv run --extra dev pytest -rs
243-
```
244-
245-
Run the Docker-backed agent integration suite explicitly:
246-
247-
```bash
248-
PIER_RUN_DOCKER_INTEGRATION=1 uv run --extra dev pytest -rs pier/tests/test_agent_integration.py
249-
```
250-
251-
Optional selectors for the integration suite:
252-
253-
```bash
254-
PIER_RUN_DOCKER_INTEGRATION=1 PIER_TEST_AGENTS=codex uv run --extra dev pytest -rs pier/tests/test_agent_integration.py
255-
PIER_RUN_DOCKER_INTEGRATION=1 PIER_TEST_IMAGE=ubuntu:24.04 uv run --extra dev pytest -rs pier/tests/test_agent_integration.py
256-
```
226+
See [DEVELOPER.md](DEVELOPER.md) for setup, testing, and architecture.

0 commit comments

Comments
 (0)