Better handling of Sagemaker models #11410


Open · wants to merge 3 commits into main

Conversation

@Jacobh2 (Contributor) commented Jun 4, 2025

Dynamic key for request body and handle response being embeddings directly

First stab at fixing #11019 as well as better handling of the response format from Sagemaker.

This makes it possible to dynamically select which key is used in the request body for embedding models. It defaults to the existing value for backwards compatibility.
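
Roughly the idea, as a minimal sketch (not the exact diff; the default key below is only an assumption about what LiteLLM currently sends):

from typing import Any, Dict, List, Optional

# Assumption: "text_inputs" stands in for the key LiteLLM already sends by
# default; the PR keeps whatever the existing value is as the fallback.
DEFAULT_INPUT_KEY = "text_inputs"


def build_sagemaker_embedding_body(
    texts: List[str], sagemaker_input_key: Optional[str] = None
) -> Dict[str, Any]:
    # Use the caller-supplied key when given (e.g. "inputs", as used later in
    # this thread), otherwise fall back to the existing default.
    return {(sagemaker_input_key or DEFAULT_INPUT_KEY): texts}


# e.g. build_sagemaker_embedding_body(["hello"], "inputs") -> {"inputs": ["hello"]}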

Also handles the response a bit better by allowing the response to be the embeddings directly (which it is for some models, e.g. https://huggingface.co/nomic-ai/nomic-embed-text-v1), and not only a dict with an embedding key.

Not included yet: this only handles the LiteLLM CLI part, but I'd like this to be settable on a per-model basis in the proxy as well. Any hints on how to do that would be highly appreciated!

Relevant issues

#11019

Type

🆕 New Feature
🐛 Bug Fix


if isinstance(response, list):
    embeddings = response
elif isinstance(response, dict):
    embeddings = response["embedding"]
Contributor (reviewer): If 'embedding' does not exist, can we raise a helpful error (maybe with the dict keys received)?

@Jacobh2 (Contributor Author):

For sure! How about this?
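
Something along these lines (a sketch only; the exact wording and exception type in the diff may differ):

if isinstance(response, list):
    embeddings = response
elif isinstance(response, dict):
    if "embedding" not in response:
        # Fail loudly with the keys we actually received, so the endpoint's
        # response shape is easy to debug.
        raise ValueError(
            "SageMaker embedding response is missing the 'embedding' key; "
            f"received keys: {list(response.keys())}"
        )
    embeddings = response["embedding"]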

@Jacobh2 (Contributor Author):

There is also still the verbose print just above, which will log the entire response; that can also help during debugging 🎉

@Jacobh2 (Contributor Author) commented Jun 4, 2025

Trying to understand the router logic. If I have understood it correctly, the litellm_params in the LiteLLM Proxy config.yaml are already provided as extra input, meaning I just have to add it there:

model_list:
  - litellm_params:
      sagemaker_input_key: inputs
      ...

Is that correctly understood?
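
If so, a fuller illustrative entry might look like this (the model name and endpoint are placeholders, not from the PR):

model_list:
  - model_name: my-sagemaker-embeddings
    litellm_params:
      model: sagemaker/my-test-endpoint
      sagemaker_input_key: inputs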

@krrishdholakia (Contributor): Yes that's right @Jacobh2

@Jacobh2 (Contributor Author) commented Jun 4, 2025

Tested this with the nomic-ai/nomic-embed-text-v1 model. With the custom key for the input and support for handling the embeddings directly as a list, I now properly get an embedding result 🎉

EmbeddingResponse(
    model="my-test-endpoint",
    data=[{"object": "embedding", "index": 0, "embedding": [-0.005336614, ..., -0.019391568]}],
    object="list",
    usage=Usage(
        completion_tokens=0, prompt_tokens=4, total_tokens=4, completion_tokens_details=None, prompt_tokens_details=None
    ),
)
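
For reference, the call was along these lines (a sketch; this assumes the new key is passed straight through as an extra kwarg to litellm.embedding):

import litellm

response = litellm.embedding(
    model="sagemaker/my-test-endpoint",  # placeholder endpoint name
    input=["hello world"],
    sagemaker_input_key="inputs",        # the new dynamic request-body key
)
print(response)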

Jacobh2 requested a review from krrishdholakia, June 4, 2025 19:52
@Jacobh2 (Contributor Author) commented Jun 7, 2025

@krrishdholakia anything you'd like to change, or could we take this in?

@krrishdholakia (Contributor) commented:

@Jacobh2 Can you please add a unit test under test_litellm/?

This will prevent future regressions
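
For example, something along these lines (a minimal sketch; a real test would import the parsing logic from the LiteLLM SageMaker handler instead of redefining it here):

import pytest


def _extract_embeddings(response):
    # Stand-in for the PR's parsing logic, inlined only to keep the sketch
    # self-contained; the real test should import it from the handler module.
    if isinstance(response, list):
        return response
    if isinstance(response, dict):
        if "embedding" not in response:
            raise ValueError(f"missing 'embedding'; got keys: {list(response.keys())}")
        return response["embedding"]
    raise TypeError(f"unexpected response type: {type(response)}")


def test_list_response_is_used_directly():
    vectors = [[0.1, 0.2], [0.3, 0.4]]
    assert _extract_embeddings(vectors) == vectors


def test_dict_response_uses_embedding_key():
    assert _extract_embeddings({"embedding": [[0.1, 0.2]]}) == [[0.1, 0.2]]


def test_unexpected_dict_raises_helpful_error():
    with pytest.raises(ValueError, match="vectors"):
        _extract_embeddings({"vectors": []})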
