Skip to content

Bug: model hallucination on exploration test #20 — bypasses tools entirely #836

@andreasronge

Description

@andreasronge

Problem

Test #20 (decoy: certification reimbursement) triggers hallucination where the model fabricates a policy description and document ID without ever calling the search or fetch tools.

Observed behavior

Turn 1: No valid code produced (parse failure)
Turn 2: Model returns fabricated text:
  "The policy for professional certification reimbursement, as detailed 
   in document pol-883, states that employees are eligible for full 
   reimbursement of certification fees..."

The document ID "pol-883" does not exist. The model never called tool/search or tool/fetch. It hallucinated the entire answer.

Expected behavior

The model should search for certification/reimbursement documents, fetch candidates, compare content, and return "DOC-021" (Certification Reimbursement document).

Root cause analysis

The system prompt says "You have NO direct data access" (coordinator mode) or shows data inventory (direct mode), but the model sometimes ignores tools and generates plausible-sounding text answers. This is worse in auto-return mode because the fabricated text has no println → auto-returns as the answer.

In explicit multi-turn mode, the model would need to wrap the fabrication in (return ...), which provides a slight friction barrier. But the core issue is model behavior, not prompt mechanics.

Reproduction

cd demo && source .env
mix run -e 'PtcDemo.LispTestRunner.run_one(20, model: "openrouter:google/gemini-3.1-flash-lite-preview", prompt: :auto_return, verbose: true, debug: true)'

Occurs intermittently — approximately 1 in 3 runs.

Potential mitigations

  1. Prompt addition: "You MUST use tools to access data. Never fabricate or guess answers."
  2. Validation: Detect when no tool calls were made but a non-trivial answer was returned
  3. Signature constraint: Force DOC-XXX format pattern in the expected output
  4. Model capability: May only affect the lite model — test on stronger models (Experiment: cross-model validation of auto-return #834)

Affects

Metadata

Metadata

Assignees

No one assigned

    Labels

    auto-returnAuto-return completion mode investigationbugSomething isn't working

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions