more full prompt reprocessing in recent update?

### What happened?

I noticed after recent update, it is more frequent that my local LLM is doing more full prompt reprocessing when just continuing a conversation.

Here is what I saw from llama.cpp console:
```
155.00.988.881 W slot update_slots: id  0 | task 37846 | forcing full prompt re-processing due to lack of cache data (likely due to SWA or hybrid/recurrent memory, see https://github.com/ggml-org/llama.cpp/pull/13194#issuecomment-2868343055)
```

I also notice that an auto-complete for next prompt feature is introduced recently. I wonder if this is related. Anyway, I want to know how to disable that feature even if it doesn't help. I don't need it and want to see if issue can be fixed by disabling it.

### What did you expect to happen?

No full prompt reprocessing when just continuing a conversation.

### Client information

<details>
<summary>Client Information</summary>

Run `qwen` to enter the interactive CLI, then run the `/about` command.

```console
$ qwen /about
Qwen Code v0.18.5
Model: qwen3.6-27b
Fast Model: not set
Auth: openai
Platform: win32 x64 (10.0.26200)
Node.js: v22.22.2
Session: 738e8cbf-0d10-42a5-9b54-91310e77bbdf
Git commit: 2937b09cf
LSP: disabled
```

</details>


### Login information

n/a

### Anything else we need to know?

_No response_

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

more full prompt reprocessing in recent update? #5736

What happened?

What did you expect to happen?

Client information

Login information

Anything else we need to know?

Metadata

Assignees

Labels

Type

Fields

Projects

Milestone

Relationships

Development

Uh oh!

more full prompt reprocessing in recent update? #5736

Description

What happened?

What did you expect to happen?

Client information

Login information

Anything else we need to know?

Metadata

Metadata

Assignees

Labels

Type

Fields

Projects

Milestone

Relationships

Development

Issue actions