server : fix incorrect num_tokens_predicted #3480

Merged: 1 commit into ggml-org:master on Oct 5, 2023

Conversation

@jhen0409 (Collaborator) commented Oct 5, 2023

  • Count num_tokens_predicted only during token generation (see the sketch after this list).
  • Remove the assertion in format_timings; it could fire incorrectly because n_eval is not equal to the number of generated tokens. For example, with n_batch = 1, timings.n_eval also includes the prompt tokens.
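
A minimal sketch of the counting behavior described above, assuming illustrative struct and helper names rather than the actual server code: the predicted-token counter is incremented only when a token is sampled during generation, while prompt evaluation updates a separate counter, so a small n_batch cannot inflate num_tokens_predicted.

```cpp
// Illustrative sketch only; completion_state, on_token_generated and
// on_prompt_batch_evaluated are hypothetical names, not the real server symbols.
#include <cstddef>
#include <cstdint>

struct completion_state {
    size_t num_prompt_tokens    = 0; // tokens consumed while evaluating the prompt
    size_t num_tokens_predicted = 0; // tokens actually generated (sampled)
};

// Called once per sampled token during generation.
void on_token_generated(completion_state & st, int32_t /*token*/) {
    st.num_tokens_predicted++;       // count generation only, never prompt eval
}

// Called for each evaluated prompt batch of size n_eval (<= n_batch).
void on_prompt_batch_evaluated(completion_state & st, size_t n_eval) {
    st.num_prompt_tokens += n_eval;  // with n_batch = 1 this runs once per prompt
                                     // token, which is why timings.n_eval can
                                     // exceed the number of generated tokens
}
```

Keeping the two counters separate also makes the removed assertion unnecessary, since n_eval no longer has to match the number of generated tokens.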

@ggerganov merged commit e8b8d32 into ggml-org:master on Oct 5, 2023
joelkuiper added a commit to vortext/llama.cpp that referenced this pull request Oct 6, 2023
…example

* 'master' of github.com:ggerganov/llama.cpp:
  kv cache slot search improvements (ggml-org#3493)
  prompts : fix editorconfig checks after ggml-org#3416
  parallel : add option to load external prompt file (ggml-org#3416)
  server : reuse llama_sample_token common util (ggml-org#3494)
  llama : correct hparams comparison (ggml-org#3446)
  ci : fix xcodebuild destinations (ggml-org#3491)
  convert : update Falcon script for new HF config (ggml-org#3448)
  build : use std::make_tuple() for compatibility with older GCC versions (ggml-org#3488)
  common : process escape sequences in reverse prompts (ggml-org#3461)
  CLBlast: Fix handling of on-device tensor data
  server : fix incorrect num_tokens_predicted (ggml-org#3480)
  swift : disable ACCELERATE_NEW_LAPACK (ggml-org#3481)
  ci : add swift build via xcodebuild (ggml-org#3482)
yusiwen pushed a commit to yusiwen/llama.cpp that referenced this pull request Oct 7, 2023