
_create_completion cache_item.input_ids.tolist() - AttributeError: 'NoneType' object has no attribute 'input_ids' #348

Closed
@snxraven

Description

Prerequisites

Please answer the following questions for yourself before submitting an issue.

  • [x] I am running the latest code. Development is very rapid so there are no tagged versions as of now.
  • [x] I carefully followed the README.md.
  • [x] I searched using keywords relevant to my issue to make sure that I am creating a new issue that is not already open (or closed).
  • [x] I reviewed the Discussions, and have a new bug or useful enhancement to share.

Expected Behavior

The chat completion endpoint should process the chat request and respond with a completion.
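
For reference, a minimal reproduction of the failing request (the host and port come from the Uvicorn log below; the payload is the standard OpenAI-style chat schema, so the exact fields my bot sends may differ):

import requests

# Host and port taken from the Uvicorn log below; adjust for your setup.
URL = "http://llama-python-server:8000/v1/chat/completions"

payload = {
    "messages": [
        {"role": "user", "content": "Hello, who are you?"},
    ],
    "max_tokens": 128,
}

# This request comes back as 500 Internal Server Error instead of a completion.
resp = requests.post(URL, json=payload)
print(resp.status_code, resp.text)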

Current Behavior

I see the following crash:

llama.cpp: loading model from ./models/ggml-vicuna-7b-1.1-q4_0.bin
llama_model_load_internal: format     = ggjt v3 (latest)
llama_model_load_internal: n_vocab    = 32000
llama_model_load_internal: n_ctx      = 6096
llama_model_load_internal: n_embd     = 4096
llama_model_load_internal: n_mult     = 256
llama_model_load_internal: n_head     = 32
llama_model_load_internal: n_layer    = 32
llama_model_load_internal: n_rot      = 128
llama_model_load_internal: ftype      = 2 (mostly Q4_0)
llama_model_load_internal: n_ff       = 11008
llama_model_load_internal: n_parts    = 1
llama_model_load_internal: model size = 7B
llama_model_load_internal: ggml ctx size =    0.07 MB
llama_model_load_internal: mem required  = 5407.71 MB (+ 1026.00 MB per state)
.
llama_init_from_file: kv self size  = 3048.00 MB
AVX = 1 | AVX2 = 1 | AVX512 = 1 | AVX512_VBMI = 1 | AVX512_VNNI = 1 | FMA = 1 | NEON = 0 | ARM_FMA = 0 | F16C = 1 | FP16_VA = 0 | WASM_SIMD = 0 | BLAS = 0 | SSE3 = 1 | VSX = 0 | 
INFO:     Started server process [7]
INFO:     Waiting for application startup.
INFO:     Application startup complete.
INFO:     Uvicorn running on http://llama-python-server:8000 (Press CTRL+C to quit)
INFO:     172.18.0.3:54642 - "POST /v1/chat/completions HTTP/1.1" 500 Internal Server Error
ERROR:    Exception in ASGI application
Traceback (most recent call last):
  File "/usr/local/lib/python3.10/dist-packages/uvicorn/protocols/http/h11_impl.py", line 428, in run_asgi
    result = await app(  # type: ignore[func-returns-value]
  File "/usr/local/lib/python3.10/dist-packages/uvicorn/middleware/proxy_headers.py", line 78, in __call__
    return await self.app(scope, receive, send)
  File "/usr/local/lib/python3.10/dist-packages/fastapi/applications.py", line 276, in __call__
    await super().__call__(scope, receive, send)
  File "/usr/local/lib/python3.10/dist-packages/starlette/applications.py", line 122, in __call__

Environment and Context

This is a docker-compose project; feel free to check how I have the installation configured:

https://git.ssh.surf/snxraven/llama-cpp-python-djs-bot
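
For context, a rough sketch of the shape of the compose service (the service name, port, and model path come from the logs above; the image, command, and volume layout are assumptions, so check the linked repo for the real configuration):

version: "3.8"
services:
  llama-python-server:            # hostname seen in the Uvicorn log above
    image: python:3.10            # assumption: llama-cpp-python[server] installed on top
    command: python3 -m llama_cpp.server
    working_dir: /app             # assumed, so the relative model path resolves
    environment:
      MODEL: ./models/ggml-vicuna-7b-1.1-q4_0.bin   # path from the model load log
    volumes:
      - ./models:/app/models      # assumed mount point
    ports:
      - "8000:8000"               # port from the Uvicorn log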

    Labels

    bug (Something isn't working), server
