Update model name to gemini-embedding-001 in code snippets #13388
base: main
Conversation
Hello @Feiyang1, I'm Gemini Code Assist[1]! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!
Summary of Changes
Hello everyone, gemini-code-assist here to provide a summary of this pull request. The primary intent of this PR, as indicated by the title and changes, is to update the model names used in several embedding code snippets within the `generative_ai/embeddings` directory. Specifically, it replaces older model identifiers with the newer `gemini-embedding-001` model name to ensure the examples are current and use the recommended model.
Highlights
- Model Name Update: This pull request updates the model name used for text embeddings in several example files from older versions (`textembedding-gecko@003` and `text-embedding-005`) to the `gemini-embedding-001` model.
Changelog
- generative_ai/embeddings/batch_example.py
  - Updated the model name from `textembedding-gecko@003` to `gemini-embedding-001` when loading the `TextEmbeddingModel` on line 42.
- generative_ai/embeddings/code_retrieval_example.py
  - Updated the default `MODEL_NAME` constant from `text-embedding-005` to `gemini-embedding-001` on line 20.
  - Updated the default `model_name` parameter in the `embed_text` function from `text-embedding-005` to `gemini-embedding-001` on line 27.
- generative_ai/embeddings/document_retrieval_example.py
  - Updated the model name from `text-embedding-005` to `gemini-embedding-001` when loading the `TextEmbeddingModel` on line 35.
Using Gemini Code Assist
The full guide for Gemini Code Assist can be found on our documentation page; here are some quick tips.
Invoking Gemini
You can request assistance from Gemini at any point in your pull request by creating an issue comment (i.e. a comment on the pull request page) using either `/gemini <command>` or `@gemini-code-assist <command>`. Below is a summary of the supported commands.
Feature | Command | Description |
---|---|---|
Code Review | `/gemini review` | Performs a code review for the current pull request in its current state. |
Pull Request Summary | `/gemini summary` | Provides a summary of the current pull request in its current state. |
Comment | `@gemini-code-assist` | Responds in comments when explicitly tagged, both in issue comments and review comments. |
Help | `/gemini help` | Displays a list of available commands. |
Customization
To customize the Gemini Code Assist for GitHub experience, repository maintainers can create a configuration file and/or provide a custom code review style guide (such as PEP-8 for Python) by adding files to a `.gemini/` folder in the base of the repository. Detailed instructions can be found here.
Limitations & Feedback
Gemini Code Assist is currently in preview and may make mistakes. Please leave feedback on any instances where its feedback is incorrect or counterproductive. You can react with 👍 and 👎 on @gemini-code-assist comments to provide feedback.
You can also get AI-powered code generation, chat, as well as code reviews directly in the IDE at no cost with the Gemini Code Assist IDE Extension.
Old model name fades,
New Gemini shines so bright,
Examples updated.
Footnotes
1. Review the Privacy Notices, Generative AI Prohibited Use Policy, Terms of Service, and learn how to configure Gemini Code Assist in GitHub here. Gemini can make mistakes, so double check it and use code with caution.
Code Review
This PR updates the embedding model name to `gemini-embedding-001` across several example files. The changes are clear, concise, and directly address the stated goal. Thank you for making these updates!
My main feedback revolves around ensuring that `gemini-embedding-001` is fully compatible with the existing parameters and usage patterns in these examples (such as `task` types and `output_dimensionality`). I've added specific comments in the files regarding this verification, as this is crucial for the correctness and utility of these samples.
Additionally, I noticed that several items in the PR checklist (like confirming tests pass and linting) are not yet marked as completed. Ensuring these checks are performed would increase confidence in the changes before merging.
Overall, the code modifications themselves are straightforward. Addressing the compatibility verification would be key.
Summary of Findings
- Model Compatibility Verification: The primary concern raised is the need to verify that the new `gemini-embedding-001` model is fully compatible with existing parameters (e.g., `task` types like `CODE_RETRIEVAL_QUERY`, `RETRIEVAL_DOCUMENT`; `output_dimensionality=256`) and usage patterns (e.g., `batch_predict` method, behavior of `TextEmbeddingModel.from_pretrained`) in the updated examples. This verification is crucial to ensure the correctness, effectiveness, and reliability of the code samples.
- PR Checklist Completion: It was noted that several items in the PR checklist, such as test execution and linting, were not marked as completed. Completing these standard checks would enhance confidence in the proposed changes.
Merge Readiness
While the code changes to update the model name are simple and correctly implemented, I recommend `REQUEST_CHANGES` at this time. This is primarily to ensure that the compatibility of the new `gemini-embedding-001` model with the existing example code (parameters, task types, expected outputs, API usage) is thoroughly verified, as detailed in the review comments. Once this verification is confirmed and other checklist items (like tests and linting) are addressed, the PR should be in good shape for merging. As a reviewer, I am not authorized to approve the PR directly; further review and approval by authorized personnel will be needed.
    @@ -39,7 +39,7 @@ def embed_text_batch() -> BatchPredictionJob:
         output_uri = OUTPUT_URI

         textembedding_model = language_models.TextEmbeddingModel.from_pretrained(
    -        "textembedding-gecko@003"
    +        "gemini-embedding-001"
The model is updated to `gemini-embedding-001`. This is a key change. Could you please confirm if this new model has been verified for full compatibility with the existing usage patterns in this and other updated examples?
Specifically for the examples touched in this PR:
- Are the `task` types used in `code_retrieval_example.py` (`CODE_RETRIEVAL_QUERY`, `RETRIEVAL_DOCUMENT`) and `document_retrieval_example.py` (`RETRIEVAL_DOCUMENT`) supported and appropriate for `gemini-embedding-001`?
- Is the `output_dimensionality=256` (used in `code_retrieval_example.py` and `document_retrieval_example.py`) a valid, supported, and optimal setting for `gemini-embedding-001`?
- Does `gemini-embedding-001` (when used via `TextEmbeddingModel.from_pretrained`) behave as expected with the `batch_predict` method and the input/output formats used in this specific `batch_example.py` file?
Ensuring these compatibilities are verified is important for the correctness and reliability of these samples. If the new model has different characteristics or requirements, further code adjustments might be needed.
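One way to answer the questions above empirically is a small smoke check that requests embeddings for each task type and confirms the returned vectors have the requested size. The sketch below is illustrative only: `check_embedding_shape` and `vertex_embed` are hypothetical helper names that do not appear in this PR, and the check says nothing about embedding quality. It assumes `get_embeddings` accepts `output_dimensionality` as the existing samples use it; whether `gemini-embedding-001` accepts several inputs per request or these exact task types has to be confirmed against the model documentation.

```python
from typing import Callable, Sequence

# Hypothetical callable type: (texts, task, dimensionality) -> one vector per text.
EmbedFn = Callable[[Sequence[str], str, int], list[list[float]]]


def check_embedding_shape(
    embed_fn: EmbedFn, texts: Sequence[str], task: str, dimensionality: int
) -> list[str]:
    """Return a list of problems found; an empty list means the call looked fine."""
    try:
        vectors = embed_fn(texts, task, dimensionality)
    except Exception as exc:  # Unsupported task types or sizes usually surface here.
        return [f"{task}: call failed: {exc}"]

    problems = []
    if len(vectors) != len(texts):
        problems.append(f"{task}: expected {len(texts)} vectors, got {len(vectors)}")
    for i, vec in enumerate(vectors):
        if len(vec) != dimensionality:
            problems.append(
                f"{task}: vector {i} has {len(vec)} dims, expected {dimensionality}"
            )
    return problems


def vertex_embed(
    texts: Sequence[str],
    task: str,
    dimensionality: int,
    model_name: str = "gemini-embedding-001",
) -> list[list[float]]:
    """Adapter around the Vertex AI SDK (needs credentials and a configured project)."""
    # Imported lazily so check_embedding_shape can be exercised without the SDK.
    from vertexai.language_models import TextEmbeddingInput, TextEmbeddingModel

    model = TextEmbeddingModel.from_pretrained(model_name)
    inputs = [TextEmbeddingInput(text, task) for text in texts]
    embeddings = model.get_embeddings(inputs, output_dimensionality=dimensionality)
    return [embedding.values for embedding in embeddings]
```

With credentials configured, calling `check_embedding_shape(vertex_embed, ["def add(a, b): return a + b"], "RETRIEVAL_DOCUMENT", 256)` for each task type in the samples would report any rejected parameter as a message rather than a stack trace.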
    @@ -17,14 +17,14 @@
     # [START generativeaionvertexai_embedding_code_retrieval]
     from vertexai.language_models import TextEmbeddingInput, TextEmbeddingModel

    -MODEL_NAME = "text-embedding-005"
    +MODEL_NAME = "gemini-embedding-001"
With `MODEL_NAME` now set to `gemini-embedding-001`, it's important to ensure this model works correctly with the parameters used in this example. Specifically, have the `task` types (`CODE_RETRIEVAL_QUERY` and `RETRIEVAL_DOCUMENT`, as used in the `if __name__ == "__main__"` block) and the default `dimensionality=256` been verified for compatibility and optimal performance with `gemini-embedding-001`?
Different models can have varying support for task types or optimal dimensionalities, so confirming this would help ensure the example remains accurate and effective.
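Beyond checking that the call succeeds, a rough sanity check of retrieval behaviour is to embed a query with `CODE_RETRIEVAL_QUERY`, embed candidate snippets with `RETRIEVAL_DOCUMENT`, and confirm that the expected snippet ranks first. The following is a minimal sketch of the ranking step only, in plain Python; `cosine_similarity` and `rank_documents` are hypothetical helpers, not part of the sample under review, and the vectors would come from the model in practice.

```python
import math
from typing import Sequence


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two equal-length, non-zero vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)


def rank_documents(
    query_vec: Sequence[float], doc_vecs: Sequence[Sequence[float]]
) -> list[int]:
    """Return document indices ordered from most to least similar to the query."""
    scores = [cosine_similarity(query_vec, doc) for doc in doc_vecs]
    return sorted(range(len(doc_vecs)), key=lambda i: scores[i], reverse=True)
```

Cosine similarity is used here rather than a raw dot product so the ranking does not depend on whether the returned vectors are unit-length.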
     DIMENSIONALITY = 256


     def embed_text(
         texts: list[str] = ["Retrieve a function that adds two numbers"],
         task: str = "CODE_RETRIEVAL_QUERY",
    -    model_name: str = "text-embedding-005",
    +    model_name: str = "gemini-embedding-001",
The default `model_name` in the `embed_text` function signature is also updated to `gemini-embedding-001`. This is consistent with the `MODEL_NAME` constant change. The same verification question applies here: has this default been confirmed to work well with the function's other default parameters (like `task` and `dimensionality`) when using `gemini-embedding-001`?
    @@ -32,7 +32,7 @@ def embed_text() -> list[list[float]]:
         # The task type for embedding. Check the available tasks in the model's documentation.
         task = "RETRIEVAL_DOCUMENT"

    -    model = TextEmbeddingModel.from_pretrained("text-embedding-005")
    +    model = TextEmbeddingModel.from_pretrained("gemini-embedding-001")
Updating the model to `gemini-embedding-001` here is consistent with the other changes. As with the other files, could you confirm if the `task = "RETRIEVAL_DOCUMENT"` and `dimensionality = 256` settings have been verified as compatible and appropriate for `gemini-embedding-001`? This verification is key to ensuring the example functions correctly and demonstrates best practices with the new model.
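One concrete thing to look at when judging whether a reduced dimensionality is "appropriate" is whether the returned vectors are still unit-length. Some embedding models return non-normalized vectors when a smaller output size is requested. Whether that applies to `gemini-embedding-001` at 256 dimensions is not stated in this thread and is an assumption to verify against the model documentation. The helpers below (`l2_norm`, `is_unit_length`, `normalize`) are hypothetical names for a minimal sketch of that check.

```python
import math
from typing import Sequence


def l2_norm(vec: Sequence[float]) -> float:
    """Euclidean length of a vector."""
    return math.sqrt(sum(x * x for x in vec))


def is_unit_length(vec: Sequence[float], tol: float = 1e-3) -> bool:
    """True when the vector's length is within `tol` of 1.0."""
    return abs(l2_norm(vec) - 1.0) <= tol


def normalize(vec: Sequence[float]) -> list[float]:
    """Scale a non-zero vector to unit length."""
    norm = l2_norm(vec)
    if norm == 0:
        raise ValueError("cannot normalize a zero vector")
    return [x / norm for x in vec]
```

If `is_unit_length` turns out to be false for the 256-dimensional output, callers that compare embeddings with a dot product would need to apply `normalize` first or switch to cosine similarity.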
I'm glad we have tests to catch errors in code snippets. Updated the code and test.
Description
Fixes #
Note: Before submitting a pull request, please open an issue for discussion if you are not associated with Google.
Checklist
- `nox -s py-3.9` (see Test Environment Setup)
- `nox -s lint` (see Test Environment Setup)