Skip to content

Inference task type endpoints #3545

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 13 commits into from
Mar 6, 2025
Merged

Conversation

jonathan-buttner
Copy link
Contributor

@jonathan-buttner jonathan-buttner commented Jan 16, 2025

This PR makes breaking changes to the client for the inference API. Prior to this PR we had a single endpoint for most task types supported in the inference API: _inference/<optional_task_type>/<inference id>. After discussion with @swallez we decided to make the task type required in the URL. This way we could have separate requests and responses for each task type.

This PR does not include another item of work to make well defined task_settings for each route. Correct me if I'm wrong, but I don't believe that would be a breaking change? If it is not a breaking change, I think we can defer that work until later.

@@ -1,5 +1,5 @@
{
"inference.stream_inference": {
"inference.stream_completion": {
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In the future we might have a streaming endpoint for text embeddings for example.

*/
export class SparseEmbeddingInferenceResult {
// TODO should we make this optional if we ever support multiple encoding types? So we can make it a variant
sparse_embedding: Array<SparseEmbeddingResult>
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I could see us having a variant here for a different type of response (like byte encoding for text embedding). That would be returned using the same URL so it wouldn't be a new response. Should we make this a variant and make sparse_embedding optional?

I suppose changing some from required to optional in the future would be a breaking change right?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For text embeddings, the pattern used in the InferenceResult class (also in this file) is to have a different variant for each type.

  text_embedding_bytes?: Array<TextEmbeddingByteResult>
  text_embedding_bits?: Array<TextEmbeddingByteResult>
  text_embedding?: Array<TextEmbeddingResult>

Sparse would be the same:

  sparse_embedding?: Array<SparseEmbeddingResult>
  sparse_embedding_byte?: Array<SparseEmbeddingByteResult>

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actually when I make the sparse embedding type a variant, I get an error that indicates there must be multiple fields in the type to be able to leverage the variant type. So I think we can make this change when we need to.

* TextEmbeddingInferenceResult is an aggregation of mutually exclusive text_embedding variants
* @variants container
*/
export class TextEmbeddingInferenceResult {
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same thing here, one URL multiple response formats so keeping this as it was.

/**
* Defines the completion result.
*/
export class CompletionInferenceResult {
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm open to other ideas for naming the classes. *Result was already taken for everything for the nested field which is why I went with *InferenceResult.

/**
* Query input.
*/
query: string
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

query is required for the rerank task type.

/**
* Optional task settings
*/
task_settings?: TaskSettings
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Adding this because I think it was missing before.

Copy link
Contributor

Following you can find the validation results for the APIs you have changed.

API Status Request Response
inference.chat_completion_unified Missing test Missing test
inference.completion Missing test Missing test
inference.delete Missing test Missing test
inference.get 🟢 1/1 1/1
inference.put Missing test Missing test
inference.rerank Missing test Missing test
inference.sparse_embedding Missing test Missing test
inference.stream_completion Missing test Missing test
inference.text_embedding Missing test Missing test
inference.update Missing test Missing test

You can validate these APIs yourself by using the make validate target.

"url": {
"paths": [
{
"path": "/_inference/chat_completion/{inference_id}/_unified",
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Shouldn't this be _stream

Suggested change
"path": "/_inference/chat_completion/{inference_id}/_unified",
"path": "/_inference/chat_completion/{inference_id}/_stream",

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No, it does have _unified at the end. I think technically we could remove it since the client code doesn't need the results to be in a specific format for SSE.

@pquentin
Copy link
Member

Just before merging this, please add the _json_spec contents to the rest-api-spec folder in the Elasticsearch repository. We're working on making this repository the source of truth, but we're not there yet and we currently have tooling that syncs rest-api-spec to _json_spec and breaks if they don't match.

Copy link
Contributor

Following you can find the validation results for the APIs you have changed.

API Status Request Response
inference.chat_completion_unified Missing test Missing test
inference.completion Missing test Missing test
inference.delete Missing test Missing test
inference.get 🟢 1/1 1/1
inference.put Missing test Missing test
inference.rerank Missing test Missing test
inference.sparse_embedding Missing test Missing test
inference.stream_completion Missing test Missing test
inference.text_embedding Missing test Missing test
inference.update Missing test Missing test

You can validate these APIs yourself by using the make validate target.

1 similar comment
Copy link
Contributor

Following you can find the validation results for the APIs you have changed.

API Status Request Response
inference.chat_completion_unified Missing test Missing test
inference.completion Missing test Missing test
inference.delete Missing test Missing test
inference.get 🟢 1/1 1/1
inference.put Missing test Missing test
inference.rerank Missing test Missing test
inference.sparse_embedding Missing test Missing test
inference.stream_completion Missing test Missing test
inference.text_embedding Missing test Missing test
inference.update Missing test Missing test

You can validate these APIs yourself by using the make validate target.

@jonathan-buttner
Copy link
Contributor Author

_json_spec

Sorry for the delay, getting back to this now. I created the PR here: elastic/elasticsearch#121078

Waiting to see if I need to fix the formatting 🤔 I just copied the files directly 🤷‍♂️

Copy link
Contributor

Following you can find the validation results for the APIs you have changed.

API Status Request Response
inference.chat_completion_unified Missing test Missing test
inference.completion Missing test Missing test
inference.delete Missing test Missing test
inference.get 🟢 1/1 1/1
inference.put Missing test Missing test
inference.rerank Missing test Missing test
inference.sparse_embedding Missing test Missing test
inference.stream_completion Missing test Missing test
inference.text_embedding Missing test Missing test
inference.update Missing test Missing test

You can validate these APIs yourself by using the make validate target.

Copy link
Member

@davidkyle davidkyle left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

"url": {
"paths": [
{
"path": "/_inference/chat_completion/{inference_id}/_unified",
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
"path": "/_inference/chat_completion/{inference_id}/_unified",
"path": "/_inference/chat_completion/{inference_id}/_stream",

*/
export class SparseEmbeddingInferenceResult {
// TODO should we make this optional if we ever support multiple encoding types? So we can make it a variant
sparse_embedding: Array<SparseEmbeddingResult>
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For text embeddings, the pattern used in the InferenceResult class (also in this file) is to have a different variant for each type.

  text_embedding_bytes?: Array<TextEmbeddingByteResult>
  text_embedding_bits?: Array<TextEmbeddingByteResult>
  text_embedding?: Array<TextEmbeddingResult>

Sparse would be the same:

  sparse_embedding?: Array<SparseEmbeddingResult>
  sparse_embedding_byte?: Array<SparseEmbeddingByteResult>

Copy link
Contributor

github-actions bot commented Feb 5, 2025

Following you can find the validation results for the APIs you have changed.

API Status Request Response
inference.chat_completion_unified Missing test Missing test
inference.completion Missing test Missing test
inference.delete Missing test Missing test
inference.get 🟢 1/1 1/1
inference.put Missing test Missing test
inference.rerank Missing test Missing test
inference.sparse_embedding Missing test Missing test
inference.stream_completion Missing test Missing test
inference.text_embedding Missing test Missing test
inference.update Missing test Missing test

You can validate these APIs yourself by using the make validate target.

Copy link
Contributor

github-actions bot commented Feb 5, 2025

Following you can find the validation results for the APIs you have changed.

API Status Request Response
inference.chat_completion_unified Missing test Missing test
inference.completion Missing test Missing test
inference.delete Missing test Missing test
inference.get 🟢 1/1 1/1
inference.put Missing test Missing test
inference.rerank Missing test Missing test
inference.sparse_embedding Missing test Missing test
inference.stream_completion Missing test Missing test
inference.text_embedding Missing test Missing test
inference.update Missing test Missing test

You can validate these APIs yourself by using the make validate target.

Copy link
Contributor

github-actions bot commented Feb 5, 2025

Following you can find the validation results for the APIs you have changed.

API Status Request Response
inference.chat_completion_unified Missing test Missing test
inference.completion Missing test Missing test
inference.delete Missing test Missing test
inference.get 🟢 1/1 1/1
inference.put Missing test Missing test
inference.rerank Missing test Missing test
inference.sparse_embedding Missing test Missing test
inference.stream_completion Missing test Missing test
inference.text_embedding Missing test Missing test
inference.update Missing test Missing test

You can validate these APIs yourself by using the make validate target.

Copy link
Contributor

github-actions bot commented Feb 5, 2025

Following you can find the validation results for the APIs you have changed.

API Status Request Response
inference.chat_completion_unified Missing test Missing test
inference.completion Missing test Missing test
inference.delete Missing test Missing test
inference.get 🟢 1/1 1/1
inference.put Missing test Missing test
inference.rerank Missing test Missing test
inference.sparse_embedding Missing test Missing test
inference.stream_completion Missing test Missing test
inference.text_embedding Missing test Missing test
inference.update Missing test Missing test

You can validate these APIs yourself by using the make validate target.

@jonathan-buttner
Copy link
Contributor Author

Created a docs request here to address the docs migration failure: elastic/docs-content#339

Copy link
Member

@davidkyle davidkyle left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

Copy link
Contributor

github-actions bot commented Mar 5, 2025

Following you can find the validation results for the APIs you have changed.

API Status Request Response
inference.chat_completion_unified Missing test Missing test
inference.completion Missing test Missing test
inference.delete Missing test Missing test
inference.get 🟢 1/1 1/1
inference.put_watsonx Missing test Missing test
inference.put Missing test Missing test
inference.rerank Missing test Missing test
inference.sparse_embedding Missing test Missing test
inference.stream_completion Missing test Missing test
inference.text_embedding Missing test Missing test
inference.update Missing test Missing test

You can validate these APIs yourself by using the make validate target.

@jonathan-buttner jonathan-buttner merged commit 35551eb into main Mar 6, 2025
8 checks passed
@jonathan-buttner jonathan-buttner deleted the ml-inference-task-type-separation branch March 6, 2025 20:53
Copy link
Contributor

github-actions bot commented Mar 6, 2025

The backport to 8.x failed:

The process '/usr/bin/git' failed with exit code 1

To backport manually, run these commands in your terminal:

# Fetch latest updates from GitHub
git fetch
# Create a new working tree
git worktree add .worktrees/backport-8.x 8.x
# Navigate to the new working tree
cd .worktrees/backport-8.x
# Create a new branch
git switch --create backport-3545-to-8.x
# Cherry-pick the merged commit of this pull request and resolve the conflicts
git cherry-pick -x --mainline 1 35551ebba38abf6dea90317c1993cb0ce7faf44a
# Push it to GitHub
git push --set-upstream origin backport-3545-to-8.x
# Go back to the original working tree
cd ../..
# Delete the working tree
git worktree remove .worktrees/backport-8.x

Then, create a pull request where the base branch is 8.x and the compare/head branch is backport-3545-to-8.x.

Copy link
Contributor

github-actions bot commented Mar 6, 2025

The backport to 8.18 failed:

The process '/usr/bin/git' failed with exit code 1

To backport manually, run these commands in your terminal:

# Fetch latest updates from GitHub
git fetch
# Create a new working tree
git worktree add .worktrees/backport-8.18 8.18
# Navigate to the new working tree
cd .worktrees/backport-8.18
# Create a new branch
git switch --create backport-3545-to-8.18
# Cherry-pick the merged commit of this pull request and resolve the conflicts
git cherry-pick -x --mainline 1 35551ebba38abf6dea90317c1993cb0ce7faf44a
# Push it to GitHub
git push --set-upstream origin backport-3545-to-8.18
# Go back to the original working tree
cd ../..
# Delete the working tree
git worktree remove .worktrees/backport-8.18

Then, create a pull request where the base branch is 8.18 and the compare/head branch is backport-3545-to-8.18.

github-actions bot pushed a commit that referenced this pull request Mar 6, 2025
* Refactoring inference endpoints

* Fixing stream completion url and removing the old url and class

* generating spec

* Adding doc id

* Renaming to match filename

* Switching to stream and regenerating files

* Using variant and adding _stream

* Removing variant

* Adding chat_completion and fixing update api

* Resolving conflicts

(cherry picked from commit 35551eb)
jonathan-buttner added a commit that referenced this pull request Mar 6, 2025
* Refactoring inference endpoints

* Fixing stream completion url and removing the old url and class

* generating spec

* Adding doc id

* Renaming to match filename

* Switching to stream and regenerating files

* Using variant and adding _stream

* Removing variant

* Adding chat_completion and fixing update api

* Resolving conflicts
jonathan-buttner added a commit that referenced this pull request Mar 6, 2025
* Refactoring inference endpoints

* Fixing stream completion url and removing the old url and class

* generating spec

* Adding doc id

* Renaming to match filename

* Switching to stream and regenerating files

* Using variant and adding _stream

* Removing variant

* Adding chat_completion and fixing update api

* Resolving conflicts
jonathan-buttner added a commit that referenced this pull request Mar 7, 2025
* Inference task type endpoints (#3545)

* Refactoring inference endpoints

* Fixing stream completion url and removing the old url and class

* generating spec

* Adding doc id

* Renaming to match filename

* Switching to stream and regenerating files

* Using variant and adding _stream

* Removing variant

* Adding chat_completion and fixing update api

* Resolving conflicts

* Generating correct output
jonathan-buttner added a commit that referenced this pull request Mar 7, 2025
* Refactoring inference endpoints

* Fixing stream completion url and removing the old url and class

* generating spec

* Adding doc id

* Renaming to match filename

* Switching to stream and regenerating files

* Using variant and adding _stream

* Removing variant

* Adding chat_completion and fixing update api

* Resolving conflicts

(cherry picked from commit 35551eb)

Co-authored-by: Jonathan Buttner <[email protected]>
jonathan-buttner added a commit that referenced this pull request Mar 7, 2025
* Inference task type endpoints (#3545)

* Refactoring inference endpoints

* Fixing stream completion url and removing the old url and class

* generating spec

* Adding doc id

* Renaming to match filename

* Switching to stream and regenerating files

* Using variant and adding _stream

* Removing variant

* Adding chat_completion and fixing update api

* Resolving conflicts

* Fixing merge conflicts
@pquentin
Copy link
Member

Sorry for dropping the ball on this and not reviewing it as we should have. This pull request introduces breaking changes we did not anticipate - we'll need to revisit the approach. We'll come back to you with specific recommendations.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants