[Bug]: Inbound document host paths not translated to container paths under Docker backend #18787

@pvdb2178

Bug Description

When terminal.backend: docker is enabled, inbound documents uploaded via a messaging platform (Telegram, Slack, Discord, Feishu, Email, etc.) are cached at a host path under ~/.hermes/cache/documents/. The gateway then injects that host path verbatim into the agent's prompt. Since the agent runs inside the Docker sandbox, it cannot resolve the host path and document open attempts fail.

The cache directory is auto-mounted read-only into the container at /root/.hermes/cache/documents by tools/credential_files.py::get_cache_directory_mounts() (introduced in #4846). The file is reachable — just at a different path than the one the prompt advertises.

Steps to Reproduce

  1. Configure the gateway with terminal.backend: docker.
  2. From Telegram (or any supported messaging platform), upload a binary document — e.g. a .docx, .pdf, .xlsx. Text formats (.md, .txt ≤100KB) hit a different code path that injects content directly and aren't affected.
  3. Ask the agent to read the file.

Expected Behavior

The agent receives a path it can open from inside its sandbox — the in-container mount path (/root/.hermes/cache/documents/<file>).

Actual Behavior

The agent's prompt contains the host path (/home/<user>/.hermes/cache/documents/<file>), which doesn't exist inside the container. The agent fails to open the file. Sample agent reply:

The file path the platform handed me (/home/hermes/.hermes/cache/documents/...) isn't reachable from inside my container — that's a host path that's not bind-mounted in. So no, I can't actually open it from here.

Diagnosis

Where the host path is injected into the agent prompt (gateway/run.py:5236-5269):

if event.media_urls and event.message_type == MessageType.DOCUMENT:
    ...
    for i, path in enumerate(event.media_urls):
        ...
        context_note = (
            f"[The user sent a document: '{display_name}'. "
            f"The file is saved at: {path}. "          # ← host path leaks here
            f"Ask the user what they'd like you to do with it.]"
        )

Where the path was originally set (per-platform; example gateway/platforms/telegram.py:3171-3175):

cached_path = cache_document_from_bytes(raw_bytes, original_filename or f"document{ext}")
event.media_urls = [cached_path]   # cached_path is the host path

The same pattern exists in every platform adapter that calls cache_document_from_bytes: gateway/platforms/feishu.py:3273, gateway/platforms/email.py:434, and similarly in discord.py, slack.py, whatsapp.py, matrix.py, dingtalk.py, and qqbot.py.

Where the cache is mounted into the sandbox (tools/environments/docker.py:439-444):

for cache_mount in get_cache_directory_mounts():
    volume_args.extend([
        "-v",
        f"{cache_mount['host_path']}:{cache_mount['container_path']}:ro",
    ])

get_cache_directory_mounts() (tools/credential_files.py:353) defaults container_base="/root/.hermes", so <HERMES_HOME>/cache/documents mounts at /root/.hermes/cache/documents.
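To make the mapping concrete, here is a minimal sketch of how one mount entry turns into a docker -v argument. example_cache_mounts is a hypothetical stand-in for get_cache_directory_mounts(), whose real implementation is not shown in this issue; only the host_path / container_path keys and the /root/.hermes default are taken from the snippets above.

```python
import posixpath


def example_cache_mounts(hermes_home, container_base="/root/.hermes"):
    """Hypothetical stand-in for get_cache_directory_mounts().

    Assumes a single documents cache dir; the real function may return more entries.
    """
    return [
        {
            "host_path": posixpath.join(hermes_home, "cache", "documents"),
            "container_path": posixpath.join(container_base, "cache", "documents"),
        }
    ]


def docker_volume_args(mounts):
    """Build read-only bind-mount arguments, mirroring the loop in docker.py."""
    args = []
    for mount in mounts:
        args.extend(["-v", f"{mount['host_path']}:{mount['container_path']}:ro"])
    return args


if __name__ == "__main__":
    print(docker_volume_args(example_cache_mounts("/home/hermes/.hermes")))
```

The host side of the bind mount is what the prompt currently advertises; the container side is the only path the agent can actually open.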

The auto-mount mechanism added in #4846 was designed for exactly this scenario — the PR's stated intent: "make the host cache directories accessible inside remote containers so the agent can use standard terminal commands on any cached file." The mount is in place; only the prompt-side path translation is missing. Issue #6004 (open) tracks an adjacent gap for clipboard images.

Proposed Fix

A small helper at the agent prompt boundary that translates host cache paths to their in-sandbox equivalents, gated on backend == "docker" for now (other backends — Modal, Daytona, Vercel — have different mount semantics and are out of scope until verified):

# tools/credential_files.py — add inverse helper next to get_cache_directory_mounts()
def to_agent_visible_cache_path(host_path: str, container_base: str = "/root/.hermes") -> str:
    """Translate a host cache path to its mounted path inside the sandbox.
    Returns the input unchanged if it's not under any auto-mounted cache dir."""
    for mount in get_cache_directory_mounts(container_base=container_base):
        if host_path.startswith(mount["host_path"]):
            return mount["container_path"] + host_path[len(mount["host_path"]):]
    return host_path

Call site: gateway/run.py:5253, only inside the document-context-injection block — not for image/audio paths in the same function, which are consumed gateway-side (vision attachment, Whisper STT) and need the host path.

The translation must be scoped narrowly to text the agent reads and uses to open files. This mirrors the inverse problem already being solved in tools/vision_tools.py in PR #14990 (_resolve_sandbox_path_to_host).
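A rough sketch of how the call site could look, assuming the translation is applied only while building the document note. build_document_note, the backend parameter, and the explicit mounts argument are hypothetical names used for illustration; the real code in gateway/run.py would call the helper from tools/credential_files.py instead.

```python
def to_agent_visible_cache_path(host_path, mounts):
    """Same startswith logic as the proposed helper, with mounts passed in
    explicitly so the sketch runs on its own (an assumption, not the real API)."""
    for mount in mounts:
        if host_path.startswith(mount["host_path"]):
            return mount["container_path"] + host_path[len(mount["host_path"]):]
    return host_path


def build_document_note(display_name, host_path, backend, mounts):
    """Build the context note for a document, translating the path only for Docker."""
    path = host_path
    if backend == "docker":
        # Only the text the agent reads is rewritten; the caller keeps the
        # original host path for any gateway-side consumers.
        path = to_agent_visible_cache_path(host_path, mounts)
    return (
        f"[The user sent a document: '{display_name}'. "
        f"The file is saved at: {path}. "
        f"Ask the user what they'd like you to do with it.]"
    )


if __name__ == "__main__":
    mounts = [
        {
            "host_path": "/home/hermes/.hermes/cache/documents",
            "container_path": "/root/.hermes/cache/documents",
        }
    ]
    host = "/home/hermes/.hermes/cache/documents/report.docx"
    print(build_document_note("report.docx", host, "docker", mounts))
    print(build_document_note("report.docx", host, "local", mounts))
```

Because the function returns a new string and never mutates event.media_urls, image and audio handling in the same code path would keep seeing host paths.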

Open questions for triage

  1. Scope. Restrict to backend == "docker" for now (safe), or generalize to all backends that use get_cache_directory_mounts()? Modal performs an initial mount via iter_cache_files() followed by a resync, and I have not verified that this produces the same container path layout. Happy to verify and generalize if preferred.
  2. Helper location. tools/credential_files.py (next to its inverse) vs. gateway/platforms/base.py (next to DOCUMENT_CACHE_DIR)? My preference is the former for cohesion with get_cache_directory_mounts.
  3. Translation site. Rewrite event.media_urls at platform-layer assignment (touches every adapter, but the agent always sees a usable path)? Or rewrite at the single prompt-injection site (narrower blast radius — preferred)? The platform-layer approach also risks breaking gateway-side image/audio consumers that need the host path.
  4. Cross-platform. The startswith comparison is fine on Linux/macOS but may misbehave on Windows hosts (case-insensitive paths, backslash separators). Path.resolve() + relative_to() is more robust but heavier. Preference?
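For question 4, one possible shape of the relative_to variant is sketched below using pathlib's pure path classes, so it performs no filesystem access and does not resolve symlinks (unlike Path.resolve()). translate_cache_path and the host_flavour parameter are hypothetical names; the container side is assumed to always be a POSIX path.

```python
from pathlib import PurePosixPath, PureWindowsPath


def translate_cache_path(host_path, mounts, host_flavour=PurePosixPath):
    """Translate a host cache path to its container path by path components.

    Unlike a raw startswith check, a sibling such as ".../documents2" does not
    match ".../documents", and PureWindowsPath compares case-insensitively.
    Returns the input unchanged if no mount contains it.
    """
    candidate = host_flavour(host_path)
    for mount in mounts:
        root = host_flavour(mount["host_path"])
        try:
            relative = candidate.relative_to(root)
        except ValueError:
            continue  # not under this mount, try the next one
        # Rebuild with forward slashes for the Linux container.
        return str(PurePosixPath(mount["container_path"]).joinpath(*relative.parts))
    return host_path


if __name__ == "__main__":
    mounts = [
        {
            "host_path": "/home/u/.hermes/cache/documents",
            "container_path": "/root/.hermes/cache/documents",
        }
    ]
    print(translate_cache_path("/home/u/.hermes/cache/documents/a.pdf", mounts))
    print(translate_cache_path("/home/u/.hermes/cache/documents2/a.pdf", mounts))
```

The second call illustrates the prefix-collision case: a plain startswith comparison would rewrite the documents2 path, while the component-wise comparison leaves it untouched.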

Affected Component

  • Gateway (any messaging platform calling cache_document_from_bytes)
  • Backend: Docker

Environment

  • Hermes Agent: v0.12.0 (commit 20132435c)
  • OS: Debian 12
  • Python: 3.11.2
  • Terminal backend: Docker, image nikolaik/python-nodejs:python3.11-nodejs20
  • Messaging platform reproduced on: Telegram

Debug Report

Intentionally not attaching hermes debug share output — the bug reproduces from a default Docker-backend setup and the file/line trace above pinpoints the exact code path without needing session data. Happy to provide a debug share if useful.


Happy to submit a PR. Would appreciate triage direction on the open questions above before writing it.

Metadata

Labels

  • P1: High (major feature broken, no workaround)
  • backend/docker: Docker container execution
  • comp/gateway: Gateway runner, session dispatch, delivery
  • type/bug: Something isn't working
