Update model name to gemini-embedding-001 in code snippets #13388

Open
wants to merge 3 commits into base: main

Feiyang1

Description

Fixes #

Note: Before submitting a pull request, please open an issue for discussion if you are not associated with Google.

Checklist

@Feiyang1 Feiyang1 requested review from a team as code owners May 23, 2025 23:40
@product-auto-label product-auto-label bot added the samples label (Issues that are directly related to samples.) May 23, 2025
Contributor

@gemini-code-assist gemini-code-assist bot left a comment

Hello @Feiyang1, I'm Gemini Code Assist[1]! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

Summary of Changes

Hello everyone, gemini-code-assist here to provide a summary of this pull request. The primary intent of this PR, as indicated by the title and changes, is to update the model names used in several embedding code snippets within the generative_ai/embeddings directory. Specifically, it replaces older model identifiers with the newer gemini-embedding-001 model name to ensure the examples are current and use the recommended model.

Highlights

  • Model Name Update: This pull request updates the model name used for text embeddings in several example files from older versions (textembedding-gecko@003 and text-embedding-005) to the gemini-embedding-001 model.

Changelog

  • generative_ai/embeddings/batch_example.py
    • Updated the model name from textembedding-gecko@003 to gemini-embedding-001 when loading the TextEmbeddingModel on line 42.
  • generative_ai/embeddings/code_retrieval_example.py
    • Updated the default MODEL_NAME constant from text-embedding-005 to gemini-embedding-001 on line 20.
    • Updated the default model_name parameter in the embed_text function from text-embedding-005 to gemini-embedding-001 on line 27.
  • generative_ai/embeddings/document_retrieval_example.py
    • Updated the model name from text-embedding-005 to gemini-embedding-001 when loading the TextEmbeddingModel on line 35.

Using Gemini Code Assist

The full guide for Gemini Code Assist can be found on our documentation page; here are some quick tips.

Invoking Gemini

You can request assistance from Gemini at any point in your pull request by creating an issue comment (i.e. a comment on the pull request page) using either /gemini <command> or @gemini-code-assist <command>. Below is a summary of the supported commands.

| Feature | Command | Description |
| --- | --- | --- |
| Code Review | /gemini review | Performs a code review for the current pull request in its current state. |
| Pull Request Summary | /gemini summary | Provides a summary of the current pull request in its current state. |
| Comment | @gemini-code-assist | Responds in comments when explicitly tagged, both in issue comments and review comments. |
| Help | /gemini help | Displays a list of available commands. |

Customization

To customize the Gemini Code Assist for GitHub experience, repository maintainers can create a configuration file and/or provide a custom code review style guide (such as PEP-8 for Python) by adding files to a .gemini/ folder in the base of the repository. Detailed instructions can be found here.

Limitations & Feedback

Gemini Code Assist is currently in preview and may make mistakes. Please leave feedback on any instances where its feedback is incorrect or counterproductive. You can react with 👍 and 👎 on @gemini-code-assist comments to provide feedback.

You can also get AI-powered code generation, chat, as well as code reviews directly in the IDE at no cost with the Gemini Code Assist IDE Extension.


Old model name fades,
New Gemini shines so bright,
Examples updated.

Footnotes

  1. Review the Privacy Notices, Generative AI Prohibited Use Policy, Terms of Service, and learn how to configure Gemini Code Assist in GitHub here. Gemini can make mistakes, so double check it and use code with caution.

Contributor

@gemini-code-assist gemini-code-assist bot left a comment

Code Review

This PR updates the embedding model name to gemini-embedding-001 across several example files. The changes are clear, concise, and directly address the stated goal. Thank you for making these updates!

My main feedback revolves around ensuring that gemini-embedding-001 is fully compatible with the existing parameters and usage patterns in these examples (such as task types and output_dimensionality). I've added specific comments in the files regarding this verification, as this is crucial for the correctness and utility of these samples.

Additionally, I noticed that several items in the PR checklist (like confirming tests pass and linting) are not yet marked as completed. Ensuring these checks are performed would increase confidence in the changes before merging.

Overall, the code modifications themselves are straightforward. Addressing the compatibility verification would be key.

Summary of Findings

  • Model Compatibility Verification: The primary concern raised is the need to verify that the new gemini-embedding-001 model is fully compatible with existing parameters (e.g., task types like CODE_RETRIEVAL_QUERY, RETRIEVAL_DOCUMENT; output_dimensionality=256) and usage patterns (e.g., batch_predict method, behavior of TextEmbeddingModel.from_pretrained) in the updated examples. This verification is crucial to ensure the correctness, effectiveness, and reliability of the code samples.
  • PR Checklist Completion: It was noted that several items in the PR checklist, such as test execution and linting, were not marked as completed. Completing these standard checks would enhance confidence in the proposed changes.

Merge Readiness

While the code changes to update the model name are simple and correctly implemented, I recommend REQUEST_CHANGES at this time. This is primarily to ensure that the compatibility of the new gemini-embedding-001 model with the existing example code (parameters, task types, expected outputs, API usage) is thoroughly verified, as detailed in the review comments. Once this verification is confirmed and other checklist items (like tests and linting) are addressed, the PR should be in good shape for merging. As a reviewer, I am not authorized to approve the PR directly; further review and approval by authorized personnel will be needed.

@@ -39,7 +39,7 @@ def embed_text_batch() -> BatchPredictionJob:
     output_uri = OUTPUT_URI

     textembedding_model = language_models.TextEmbeddingModel.from_pretrained(
-        "textembedding-gecko@003"
+        "gemini-embedding-001"
Contributor

medium

The model is updated to gemini-embedding-001. This is a key change. Could you please confirm if this new model has been verified for full compatibility with the existing usage patterns in this and other updated examples?

Specifically for the examples touched in this PR:

  • Are the task types used in code_retrieval_example.py (CODE_RETRIEVAL_QUERY, RETRIEVAL_DOCUMENT) and document_retrieval_example.py (RETRIEVAL_DOCUMENT) supported and appropriate for gemini-embedding-001?
  • Is the output_dimensionality=256 (used in code_retrieval_example.py and document_retrieval_example.py) a valid, supported, and optimal setting for gemini-embedding-001?
  • Does gemini-embedding-001 (when used via TextEmbeddingModel.from_pretrained) behave as expected with the batch_predict method and the input/output formats used in this specific batch_example.py file?

Ensuring these compatibilities are verified is important for the correctness and reliability of these samples. If the new model has different characteristics or requirements, further code adjustments might be needed.
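
One way to run the check this comment asks for is a small smoke test, sketched below. It assumes the `get_embeddings(inputs, output_dimensionality=...)` call and the `.values` attribute that the Vertex AI SDK samples rely on. `check_embedding_compat` is a hypothetical helper, not part of this PR, and the model object is passed in so the check can be exercised against a stub without credentials.

```python
from typing import Any, Sequence


def check_embedding_compat(
    model: Any,
    inputs: Sequence[Any],
    dimensionality: int,
) -> list[list[float]]:
    """Embed `inputs` and verify that each vector has the requested length.

    `model` is anything exposing get_embeddings(inputs, output_dimensionality=...)
    and returning objects with a `.values` list (assumed to match TextEmbeddingModel).
    """
    embeddings = model.get_embeddings(
        list(inputs), output_dimensionality=dimensionality
    )
    vectors = [list(e.values) for e in embeddings]
    if len(vectors) != len(inputs):
        raise ValueError(f"expected {len(inputs)} embeddings, got {len(vectors)}")
    for i, vec in enumerate(vectors):
        if len(vec) != dimensionality:
            raise ValueError(
                f"embedding {i} has {len(vec)} dimensions, expected {dimensionality}"
            )
    return vectors


# Hypothetical usage against the real service (needs vertexai and credentials):
#   from vertexai.language_models import TextEmbeddingInput, TextEmbeddingModel
#   model = TextEmbeddingModel.from_pretrained("gemini-embedding-001")
#   check_embedding_compat(
#       model, [TextEmbeddingInput("some text", "RETRIEVAL_DOCUMENT")], 256
#   )
```

The helper only checks the shape of the result. An unsupported task type or dimensionality would presumably surface as an exception from the service call itself, which the helper deliberately does not catch.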

@@ -17,14 +17,14 @@
 # [START generativeaionvertexai_embedding_code_retrieval]
 from vertexai.language_models import TextEmbeddingInput, TextEmbeddingModel

-MODEL_NAME = "text-embedding-005"
+MODEL_NAME = "gemini-embedding-001"
Contributor

medium

With MODEL_NAME now set to gemini-embedding-001, it's important to ensure this model works correctly with the parameters used in this example. Specifically, have the task types (CODE_RETRIEVAL_QUERY and RETRIEVAL_DOCUMENT, as used in the if __name__ == "__main__" block) and the default dimensionality=256 been verified for compatibility and optimal performance with gemini-embedding-001?

Different models can have varying support for task types or optimal dimensionalities, so confirming this would help ensure the example remains accurate and effective.

 DIMENSIONALITY = 256


 def embed_text(
     texts: list[str] = ["Retrieve a function that adds two numbers"],
     task: str = "CODE_RETRIEVAL_QUERY",
-    model_name: str = "text-embedding-005",
+    model_name: str = "gemini-embedding-001",
Contributor

medium

The default model_name in the embed_text function signature is also updated to gemini-embedding-001. This is consistent with the MODEL_NAME constant change. The same verification question applies here: has this default been confirmed to work well with the function's other default parameters (like task and dimensionality) when using gemini-embedding-001?
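
Because the same model name appears in both the constant and the signature, a later rename has to touch two places. One possible way to avoid that, shown here only as a sketch and not as the code in this PR, is to have the defaults reference the constants. `describe_request` is a hypothetical stand-in for `embed_text` that returns its resolved arguments so it runs without the Vertex AI SDK.

```python
MODEL_NAME = "gemini-embedding-001"
DIMENSIONALITY = 256


def describe_request(
    texts: list[str],
    task: str = "CODE_RETRIEVAL_QUERY",
    model_name: str = MODEL_NAME,  # default follows the constant
    dimensionality: int = DIMENSIONALITY,
) -> dict:
    """Return the settings an embedding call would use (no network access)."""
    return {
        "model_name": model_name,
        "task": task,
        "dimensionality": dimensionality,
        "num_texts": len(texts),
    }
```

Python evaluates default values once, when the function is defined, so this removes the duplicated literal in the source but does not make the default track runtime reassignments of `MODEL_NAME`.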

@@ -32,7 +32,7 @@ def embed_text() -> list[list[float]]:
     # The task type for embedding. Check the available tasks in the model's documentation.
     task = "RETRIEVAL_DOCUMENT"

-    model = TextEmbeddingModel.from_pretrained("text-embedding-005")
+    model = TextEmbeddingModel.from_pretrained("gemini-embedding-001")
Contributor

medium

Updating the model to gemini-embedding-001 here is consistent with the other changes. As with the other files, could you confirm if the task = "RETRIEVAL_DOCUMENT" and dimensionality = 256 settings have been verified as compatible and appropriate for gemini-embedding-001? This verification is key to ensuring the example functions correctly and demonstrates best practices with the new model.

@glasnt glasnt added the kokoro:force-run label (Add this label to force Kokoro to re-run the tests.) May 26, 2025
@kokoro-team kokoro-team removed the kokoro:force-run label May 26, 2025
@glasnt
Contributor

glasnt commented May 26, 2025

Unable to submit request because batchSize must be 1 but the entered value was 3.

@glasnt glasnt added the waiting-response label (Waiting for the author's response.) May 26, 2025
@Feiyang1
Author

> Unable to submit request because batchSize must be 1 but the entered value was 3.

I'm glad we have tests to catch errors in code snippets. Updated the code and test.
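
The quoted error suggests that, at the time, the endpoint accepted only one input per request for this model. The updated snippet is not shown in this conversation, so the following is only a sketch of one possible workaround: send the inputs in chunks of a configurable size (1 by default) and concatenate the results. `embed_in_chunks` and the `embed_fn` callback are hypothetical names; with the Vertex AI SDK, `embed_fn` could wrap `model.get_embeddings`.

```python
from typing import Callable, Sequence, TypeVar

T = TypeVar("T")
V = TypeVar("V")


def embed_in_chunks(
    inputs: Sequence[T],
    embed_fn: Callable[[list[T]], list[V]],
    batch_size: int = 1,
) -> list[V]:
    """Call `embed_fn` on consecutive chunks of at most `batch_size` inputs."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    results: list[V] = []
    for start in range(0, len(inputs), batch_size):
        chunk = list(inputs[start : start + batch_size])
        # One request per chunk; results keep the order of the inputs.
        results.extend(embed_fn(chunk))
    return results
```

Keeping the batch size as a parameter means the same loop still works if a model or endpoint with a larger limit is used later.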

Labels
samples: Issues that are directly related to samples.
waiting-response: Waiting for the author's response.
4 participants