Bug Description
When using hermes --tui with a provider in chat_completions API mode (e.g., opencode-go, opencode-zen, kilo, deepseek, openrouter), the exit summary always shows reasoning 0 even when the model produces substantial reasoning tokens. For example, a 53-message session with deepseek-v4-pro shows:
Tokens: 890226 (in 35517, out 6581, cache 848128, reasoning 0)
The cache tokens (848,128) clearly indicate heavy context reuse across a long session — models like deepseek-v4-pro that support extended thinking should have produced non-zero reasoning tokens.
Steps to Reproduce
- Run hermes --tui with a reasoning-capable model via a chat_completions-mode provider (e.g., deepseek-v4-pro via opencode-go, or claude-sonnet-4 via openrouter)
- Send several prompts that trigger reasoning/thinking
- Press Ctrl-C to exit
- Observe reasoning 0 in the exit summary token breakdown
Expected Behavior
The exit summary should show non-zero reasoning_tokens when the model produces reasoning content, matching the actual API usage.
Actual Behavior
reasoning 0 is always displayed for chat_completions mode providers, regardless of actual reasoning token usage.
Root Cause Analysis
In agent/usage_pricing.py:normalize_usage() (lines 575–578):
reasoning_tokens = 0
output_details = getattr(response_usage, "output_tokens_details", None)
if output_details:
    reasoning_tokens = _to_int(getattr(output_details, "reasoning_tokens", 0))
This code checks output_tokens_details.reasoning_tokens — the field name used by the Codex Responses API (codex_responses mode).
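For reference, the corresponding access path in that mode:

# Codex Responses API response usage object:
response.usage.output_tokens_details.reasoning_tokens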
However, for OpenAI Chat Completions (chat_completions mode — used by opencode-go, opencode-zen, kilo, deepseek, openrouter, and most other providers), reasoning tokens are stored in completion_tokens_details.reasoning_tokens:
# OpenAI Chat Completions response usage object:
response.usage.completion_tokens_details.reasoning_tokens
The function never checks completion_tokens_details, so reasoning tokens are always 0 for chat_completions mode.
Note: The mode-specific branches (lines 539–573) correctly handle input/output/cache tokens for each API mode, but the reasoning extraction (lines 575–578) is a single code path that only checks the Codex-format field.
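The failure is easy to reproduce in isolation. A minimal sketch, using a SimpleNamespace to stand in for a Chat Completions usage object (the token count is illustrative, and _to_int is omitted for brevity):

from types import SimpleNamespace

# Chat Completions shape: reasoning tokens live under completion_tokens_details
usage = SimpleNamespace(
    completion_tokens_details=SimpleNamespace(reasoning_tokens=412),
)

# Current extraction path, as in normalize_usage(): only output_tokens_details is checked
reasoning_tokens = 0
output_details = getattr(usage, "output_tokens_details", None)
if output_details:
    reasoning_tokens = getattr(output_details, "reasoning_tokens", 0)

print(reasoning_tokens)  # prints 0; the 412 reasoning tokens are never read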
Proposed Fix
Add a fallback to check completion_tokens_details when output_tokens_details is absent or reports zero:
reasoning_tokens = 0

# Codex Responses format
output_details = getattr(response_usage, "output_tokens_details", None)
if output_details:
    reasoning_tokens = _to_int(getattr(output_details, "reasoning_tokens", 0))

# OpenAI Chat Completions format
if reasoning_tokens == 0:
    completion_details = getattr(response_usage, "completion_tokens_details", None)
    if completion_details:
        reasoning_tokens = _to_int(getattr(completion_details, "reasoning_tokens", 0))
Alternatively, make the extraction mode-aware using the mode parameter already available in the function.
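A minimal sketch of that mode-aware variant. The mapping assumes the mode strings used in this report (codex_responses, chat_completions, anthropic_messages); the actual parameter name and values in normalize_usage() may differ, and the helper name is hypothetical:

# Hypothetical mode-aware extraction; field names per API mode as described above.
_REASONING_DETAILS_FIELD = {
    "codex_responses": "output_tokens_details",
    "chat_completions": "completion_tokens_details",
    # anthropic_messages: no separate reasoning-token breakdown
}

def _extract_reasoning_tokens(response_usage, mode):
    details_field = _REASONING_DETAILS_FIELD.get(mode)
    if details_field is None:
        return 0
    details = getattr(response_usage, details_field, None)
    if details is None:
        return 0
    return _to_int(getattr(details, "reasoning_tokens", 0))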
Affected Files
agent/usage_pricing.py — normalize_usage() (lines 575–578)
Impact
- All chat_completions-mode providers: opencode-go, opencode-zen, kilo, deepseek, openrouter, huggingface, nvidia, xiaomi, gmi, arcee, lmstudio, ollama-cloud, custom providers, and any others with transport: openai_chat
- Unaffected: codex_responses-mode providers (OpenAI Codex, xAI, etc.) — these correctly use output_tokens_details
- Unaffected: anthropic_messages-mode providers — the Anthropic API doesn't break out reasoning tokens separately
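To guard against regressions, the fallback logic can be exercised against both usage shapes. A self-contained sketch with SimpleNamespace stand-ins and a stubbed _to_int; the helper name and token counts are illustrative:

from types import SimpleNamespace

def _to_int(value):
    # Stub of the real helper in agent/usage_pricing.py
    return int(value or 0)

def extract_reasoning(usage):
    # Mirrors the proposed fix: Codex field first, then Chat Completions fallback
    reasoning = 0
    output_details = getattr(usage, "output_tokens_details", None)
    if output_details:
        reasoning = _to_int(getattr(output_details, "reasoning_tokens", 0))
    if reasoning == 0:
        completion_details = getattr(usage, "completion_tokens_details", None)
        if completion_details:
            reasoning = _to_int(getattr(completion_details, "reasoning_tokens", 0))
    return reasoning

# Codex Responses shape
codex = SimpleNamespace(output_tokens_details=SimpleNamespace(reasoning_tokens=128))
assert extract_reasoning(codex) == 128

# OpenAI Chat Completions shape (the previously broken case)
chat = SimpleNamespace(completion_tokens_details=SimpleNamespace(reasoning_tokens=256))
assert extract_reasoning(chat) == 256

# Anthropic Messages shape: neither details field, so reasoning stays 0
anthropic = SimpleNamespace()
assert extract_reasoning(anthropic) == 0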