Skip to content

Conversation

@nayihz
Copy link
Contributor

@nayihz nayihz commented Mar 12, 2025

What this PR does / why we need it

Which issue(s) this PR fixes

Fixes ##274

Special notes for your reviewer

Does this PR introduce a user-facing change?


@InftyAI-Agent InftyAI-Agent added needs-triage Indicates an issue or PR lacks a label and requires one. needs-priority Indicates a PR lacks a label and requires one. do-not-merge/needs-kind Indicates a PR lacks a label and requires one. labels Mar 12, 2025
@InftyAI-Agent InftyAI-Agent requested a review from kerthcet March 12, 2025 12:15
@nayihz
Copy link
Contributor Author

nayihz commented Mar 12, 2025

/kind test
/kind feature

@nayihz
Copy link
Contributor Author

nayihz commented Mar 12, 2025

Error: test/util/validation/validate_service.go:346:23: Error return value of `resp.Body.Close` is not checked (errcheck)
  	defer resp.Body.Close()
  	                     ^
  Error: test/util/validation/validate_service.go:[37](https://github.com/InftyAI/llmaz/actions/runs/13811392705/job/38633611131?pr=310#step:4:39)7:23: Error return value of `resp.Body.Close` is not checked (errcheck)
  	defer resp.Body.Close()
  	                     ^

Something wrong with golang ci lint?

@nayihz
Copy link
Contributor Author

nayihz commented Mar 12, 2025

/kind feature

@InftyAI-Agent InftyAI-Agent added feature Categorizes issue or PR as related to a new feature. and removed do-not-merge/needs-kind Indicates a PR lacks a label and requires one. labels Mar 12, 2025
@kerthcet
Copy link
Member

Error: test/util/validation/validate_service.go:346:23: Error return value of `resp.Body.Close` is not checked (errcheck)
  	defer resp.Body.Close()
  	                     ^
  Error: test/util/validation/validate_service.go:[37](https://github.com/InftyAI/llmaz/actions/runs/13811392705/job/38633611131?pr=310#step:4:39)7:23: Error return value of `resp.Body.Close` is not checked (errcheck)
  	defer resp.Body.Close()
  	                     ^

Something wrong with golang ci lint?

This is because resp.Body.Close() will return a value, but we don't validate the return. We can silent the lint I think. Or we have to do something like:

defer func() {
   _ = resp.Body.Close()
}()

return nil
}

func CheckLlamacppServeAvaliable(localPort int) error {
Copy link
Member

@kerthcet kerthcet Mar 13, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why should we have two different Check here? I think they're both OpenAI compatible, only one client makes more sense I think. Maybe we can use the https://github.com/openai/openai-go? The benefit is it supports more rich features like constructing the system prompt and chat conversation, or we have to build the structure ourself.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's support OpenAI API compatible chat completions first. It's a standard protocol across most of the inference engines.

@nayihz
Copy link
Contributor Author

nayihz commented Mar 13, 2025

This is because resp.Body.Close() will return a value, but we don't validate the return. We can silent the lint I think.

I see the same code in other project like kubernetes/lws, but no lint error in CI.

@nayihz nayihz changed the title {WIP}feat: add e2e test to verify service is avaliable feat: add e2e test to verify service is avaliable Mar 13, 2025
Copy link
Member

@kerthcet kerthcet left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Generally LGTM. Thanks @nayihz

@nayihz nayihz force-pushed the feat_e2e branch 4 times, most recently from 7c31097 to c972cba Compare March 14, 2025 08:23
@nayihz
Copy link
Contributor Author

nayihz commented Mar 14, 2025

All comments addressed. @kerthcet
e2e-test took a very long time.

playground e2e tests Deploy a huggingface model with customized backendRuntime
/home/runner/work/llmaz/llmaz/test/e2e/playground_test.go:96
Forwarding from 127.0.0.1:8080 -> 8080
Forwarding from [::1]:8080 -> 8080
Handling connection for 8080
• [1075.442 seconds]

Will it be rate-limited by huggingFace if we run ci many times? I suspect that most of the time is spent on pulling the model.

Copy link
Member

@kerthcet kerthcet left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Only one nit.

@kerthcet
Copy link
Member

All comments addressed. @kerthcet e2e-test took a very long time.

playground e2e tests Deploy a huggingface model with customized backendRuntime
/home/runner/work/llmaz/llmaz/test/e2e/playground_test.go:96
Forwarding from 127.0.0.1:8080 -> 8080
Forwarding from [::1]:8080 -> 8080
Handling connection for 8080
• [1075.442 seconds]

Will it be rate-limited by huggingFace if we run ci many times? I suspect that most of the time is spent on pulling the model.

I think it's unrelated to this PR, right? We just check the service ready.

@kerthcet
Copy link
Member

em.. seems we used to run e2e tests with minutes. the last record.

succeeded  in 5m 55s

@kerthcet
Copy link
Member

/retest all

@kerthcet
Copy link
Member

Once model is downloaded, loading the model into memory still takes time.

@kerthcet
Copy link
Member

I rerun the tests, it finished in 6mins. I think the time is acceptable.

succeeded  in 6m 11s

@kerthcet
Copy link
Member

rerun again to see the final result.

@kerthcet
Copy link
Member

Seems stuck here, model is already downloaded.

playground e2e tests Deploy a huggingface model with llama.cpp
/home/runner/work/llmaz/llmaz/test/e2e/playground_test.go:77
Forwarding from 127.0.0.1:8080 -> 8080
Forwarding from [::1]:8080 -> 8080
Handling connection for 8080

@kerthcet
Copy link
Member

Have no idea whether is because of the portforward or the resource contention. May take a look later.

playground e2e tests Deploy a huggingface model with llama.cpp
/home/runner/work/llmaz/llmaz/test/e2e/playground_test.go:77
Forwarding from 127.0.0.1:8080 -> 8080
Forwarding from [::1]:8080 -> 8080
Handling connection for 8080
• [985.885 seconds]
------------------------------
playground e2e tests Deploy a huggingface model with customized backendRuntime
/home/runner/work/llmaz/llmaz/test/e2e/playground_test.go:96
Forwarding from 127.0.0.1:8080 -> 8080
Forwarding from [::1]:8080 -> 8080
Handling connection for 8080
• [964.528 seconds]
------------------------------
playground e2e tests Deploy a huggingface model with llama.cpp, HPA enabled
/home/runner/work/llmaz/llmaz/test/e2e/playground_test.go:122
• [16.392 seconds]

}()
<-readyChan
return check()
}).Should(gomega.Succeed())
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

let's not use eventually here, I don't think we want to forward the port for several time, eventually is used for status check, so we can wrap the check() function with eventually. Generally it looks like:

func ValidateServiceAvaliable() {
  // port forward logic
  select {
  case <-readyChan:
  case <-time.After(TIMEOUT):
    return fmt.Errorf("port forwarding timeout")
  }

  Eventually ({check()}, TIMEOUT, INTERVAL)
}

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done. PTAL

@kerthcet
Copy link
Member

/lgtm
/approve

Let's make it happen now, we can focus on the performance later! Thanks @nayihz

@InftyAI-Agent InftyAI-Agent added lgtm Looks good to me, indicates that a PR is ready to be merged. approved Indicates a PR has been approved by an approver from all required OWNERS files. labels Mar 17, 2025
@InftyAI-Agent InftyAI-Agent merged commit 5adc074 into InftyAI:main Mar 17, 2025
19 checks passed
@nayihz nayihz deleted the feat_e2e branch March 17, 2025 05:52
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

approved Indicates a PR has been approved by an approver from all required OWNERS files. feature Categorizes issue or PR as related to a new feature. lgtm Looks good to me, indicates that a PR is ready to be merged. needs-priority Indicates a PR lacks a label and requires one. needs-triage Indicates an issue or PR lacks a label and requires one.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants