LM Studio 0.4.15
Windows 11
CPU llama.cpp 0.2.18
LLM: Gemma 4 31B it compressed
Around the time the model should have output its reply, it went back to square one: the log reports "forcing full prompt re-processing due to lack of cache data." The log also mentions a "client" and a "server", although the Server is Stopped and everything is (supposedly) being done offline. Or does every request still go through localhost internally, regardless?
Relevant (I guess) part of the Log below; in short:
Note: This warning was produced by the client and is printed on the server for convenience.
[LLMProcess][ServerPort] Received communication warning from the client (channelUnknown): Received channelSend for unknown channel, channelId = 297
This is usually caused by communication protocol incompatibility. Please make sure you are using the up-to-date versions of the SDK and LM Studio.
Note: This warning was produced by the client and is printed on the server for convenience.
2026-06-03 11:02:46 [DEBUG]
26.26.871.357 I slot launch_slot_: id 0 | task 316 | processing task, is_child = 0
26.26.871.371 W slot update_slots: id 0 | task 316 | cache reuse is not supported - ignoring n_cache_reuse = 256
26.26.871.374 I slot update_slots: id 0 | task 316 | Checking checkpoint with [195, 1474] against 0...
26.26.871.375 W slot update_slots: id 0 | task 316 | forcing full prompt re-processing due to lack of cache data (likely due to SWA or hybrid/recurrent memory, see ggml-org/llama.cpp#13194 (comment))
26.26.871.379 W slot update_slots: id 0 | task 316 | erased invalidated context checkpoint (pos_min = 195, pos_max = 1474, n_tokens = 1475, n_swa = 1024, pos_next = 0, size = 1000.016 MiB)
2026-06-03 11:03:02 [DEBUG]
26.43.069.952 W srv stop: cancel task, id_task = 316
2026-06-03 11:03:02 [DEBUG]
[LLMProcess][ServerPort] Received communication warning from the client (channelUnknown): Received channelSend for unknown channel, channelId = 297
This is usually caused by communication protocol incompatibility. Please make sure you are using the up-to-date versions of the SDK and LM Studio.
Note: This warning was produced by the client and is printed on the server for convenience.
[LLMProcess][ServerPort] Received communication warning from the client (channelUnknown): Received channelSend for unknown channel, channelId = 297
This is usually caused by communication protocol incompatibility. Please make sure you are using the up-to-date versions of the SDK and LM Studio.
Note: This warning was produced by the client and is printed on the server for convenience.
2026-06-03 11:03:19 [DEBUG]
LlamaV4::predict slot selection: session_id= server-selected (LCP/LRU)
2026-06-03 11:12:02 [DEBUG]
35.43.460.415 I slot release: id 0 | task 316 | stop processing: n_tokens = 1471, truncated = 0
35.43.460.440 I slot get_availabl: id 0 | task -1 | selected slot by LCP similarity, sim_best = 0.640 (> 0.100 thold), f_keep = 0.644
35.43.460.472 I slot launch_slot_: id 0 | task 320 | processing task, is_child = 0
35.43.460.481 W slot update_slots: id 0 | task 320 | cache reuse is not supported - ignoring n_cache_reuse = 256
35.43.460.483 W slot update_slots: id 0 | task 320 | forcing full prompt re-processing due to lack of cache data (likely due to SWA or hybrid/recurrent memory, see ggml-org/llama.cpp#13194 (comment))
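For anyone triaging a longer log of this kind, here is a minimal sketch that groups the `W` (warning) slot lines by task id and lists the tasks that hit the full re-processing warning. The line layout is inferred only from the excerpt above, and the names `SLOT_LINE`, `warnings_by_task` and `reprocessed_tasks` are hypothetical helpers for illustration, not part of LM Studio or llama.cpp.

```python
import re
from collections import defaultdict

# Assumed layout, inferred from the excerpt above:
#   "<timestamp> <I|W> slot <function>: id <slot> | task <task> | <message>"
SLOT_LINE = re.compile(
    r"^\S+\s+(?P<level>[IW])\s+slot\s+(?P<func>\w+):\s+"
    r"id\s+(?P<slot>\d+)\s+\|\s+task\s+(?P<task>-?\d+)\s+\|\s+(?P<msg>.*)$"
)

REPROCESS_MARKER = "forcing full prompt re-processing"


def warnings_by_task(log_text):
    """Map task id -> list of warning messages found on slot lines."""
    result = defaultdict(list)
    for line in log_text.splitlines():
        match = SLOT_LINE.match(line.strip())
        if match and match.group("level") == "W":
            result[int(match.group("task"))].append(match.group("msg"))
    return dict(result)


def reprocessed_tasks(log_text):
    """Return the sorted task ids whose prompt was fully re-processed."""
    return sorted(
        task
        for task, messages in warnings_by_task(log_text).items()
        if any(REPROCESS_MARKER in message for message in messages)
    )


# Lines copied from the log above.
SAMPLE_LOG = """\
26.26.871.357 I slot launch_slot_: id 0 | task 316 | processing task, is_child = 0
26.26.871.371 W slot update_slots: id 0 | task 316 | cache reuse is not supported - ignoring n_cache_reuse = 256
26.26.871.375 W slot update_slots: id 0 | task 316 | forcing full prompt re-processing due to lack of cache data (likely due to SWA or hybrid/recurrent memory, see ggml-org/llama.cpp#13194 (comment))
26.43.069.952 W srv stop: cancel task, id_task = 316
35.43.460.472 I slot launch_slot_: id 0 | task 320 | processing task, is_child = 0
35.43.460.483 W slot update_slots: id 0 | task 320 | forcing full prompt re-processing due to lack of cache data (likely due to SWA or hybrid/recurrent memory, see ggml-org/llama.cpp#13194 (comment))
"""

if __name__ == "__main__":
    print(reprocessed_tasks(SAMPLE_LOG))  # [316, 320]
```

On the excerpt, both task 316 and task 320 show the warning, so the re-processing happens on consecutive requests rather than once.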