# Embedding Gemma

![logo](https://github.com/docker/model-cards/raw/refs/heads/main/logos/[email protected])

**Embedding Gemma** is a state-of-the-art text embedding model from Google DeepMind, designed to create high-quality vector representations of text. Built on the Gemma architecture, this model converts text into dense vector embeddings that capture semantic meaning, making it ideal for retrieval-augmented generation (RAG), semantic search, and similarity tasks. With open weights and an efficient design, Embedding Gemma provides a powerful foundation for embedding-based applications. The GGUF format version is provided by Unsloth.

## Intended uses

Embedding Gemma is designed for applications requiring high-quality text embeddings:

- **Semantic search and retrieval**: Excellent for building search systems, document retrieval, and RAG applications that need to find semantically relevant content.
- **Text similarity and clustering**: Generate embeddings for measuring text similarity, document clustering, and content deduplication tasks.
- **Classification and downstream tasks**: Use embeddings as input features for various NLP classification tasks and machine learning pipelines.
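The retrieval use case above can be sketched as ranking candidate documents by the cosine similarity of their embeddings to a query embedding. The snippet below uses tiny hypothetical 3-dimensional vectors in place of real 768-dimensional model outputs; in practice each vector would come from the model.

```python
import math

def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine similarity between two embedding vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

def rank_documents(query_vec: list[float], doc_vecs: list[list[float]]) -> list[int]:
    """Return document indices sorted from most to least similar to the query."""
    scores = [cosine_similarity(query_vec, d) for d in doc_vecs]
    return sorted(range(len(doc_vecs)), key=lambda i: scores[i], reverse=True)

# Toy vectors standing in for real embeddings produced by the model.
query = [1.0, 0.0, 0.0]
docs = [[0.0, 1.0, 0.0], [0.9, 0.1, 0.0], [0.5, 0.5, 0.0]]
print(rank_documents(query, docs))  # most similar document index first
```

A vector database or approximate-nearest-neighbor index would replace this brute-force loop at scale, but the scoring logic is the same.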

## Characteristics

| Attribute | Details |
|---------------------- |--------------------------------------------------------------|
| **Provider** | Google DeepMind |
| **Architecture** | Gemma Embedding |
| **Cutoff date** | - |
| **Languages** | English |
| **Tool calling** | ❌ |
| **Input modalities** | Text |
| **Output modalities** | Embedding vectors |
| **License** | [Gemma Terms](https://ai.google.dev/gemma/terms) |

## Available model variants

| Model variant | Parameters | Quantization | Context window | VRAM¹ | Size |
|----------------------------------------------------------------------|------------|--------------|----------------|----------|-----------|
| `ai/embedding-gemma:latest`<br><br>`ai/embedding-gemma:300M-Q8_0` | 300M | Q8_0 | 2K tokens | 0.95 GiB | 761.25 MB |

¹: VRAM estimated based on model characteristics.

> `latest` → `300M-Q8_0`

## Use this AI model with Docker Model Runner

First, pull the model:

```bash
docker model pull ai/embedding-gemma
```

Then run the model:

```bash
docker model run ai/embedding-gemma
```

To generate embeddings using the API:

```bash
curl --location 'http://localhost:12434/engines/llama.cpp/v1/embeddings' \
--header 'Content-Type: application/json' \
--data '{
"model": "ai/embedding-gemma",
"input": "Your text to embed here"
}'
```
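The same request can be made from Python using only the standard library. The sketch below builds the JSON body for the OpenAI-compatible endpoint shown in the curl example; the `embed` helper assumes Docker Model Runner is listening on the URL above and that the response follows the usual `data[0].embedding` shape of OpenAI-style embeddings responses.

```python
import json
import urllib.request

# Endpoint from the curl example above; assumes Docker Model Runner is running locally.
EMBEDDINGS_URL = "http://localhost:12434/engines/llama.cpp/v1/embeddings"

def build_request(text: str) -> urllib.request.Request:
    """Build a POST request for the OpenAI-compatible embeddings endpoint."""
    body = json.dumps({"model": "ai/embedding-gemma", "input": text}).encode("utf-8")
    return urllib.request.Request(
        EMBEDDINGS_URL,
        data=body,
        headers={"Content-Type": "application/json"},
    )

def embed(text: str) -> list[float]:
    """Send the request and return the embedding vector for `text`."""
    with urllib.request.urlopen(build_request(text)) as resp:
        payload = json.load(resp)
    return payload["data"][0]["embedding"]
```

Any OpenAI-compatible client library pointed at the same base URL should work equally well.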

For more information on Docker Model Runner, [explore the documentation](https://docs.docker.com/desktop/features/model-runner/).

## Considerations

- **Context length**: The model supports up to 2K tokens. Longer texts may need to be chunked for optimal performance.
- **Language support**: The model is primarily trained on English text; performance on other languages may vary.
- **Embedding dimension**: The model produces 768-dimensional embeddings suitable for most downstream tasks.
- **Normalization**: Embeddings are normalized by default, making them suitable for cosine similarity calculations.
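Because the embeddings are normalized to unit length, the cosine similarity of two vectors reduces to their plain dot product, which is cheaper to compute. The sketch below illustrates this with hypothetical 2-dimensional vectors.

```python
import math

def normalize(v: list[float]) -> list[float]:
    """Scale a vector to unit length."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]

def dot(a: list[float], b: list[float]) -> float:
    return sum(x * y for x, y in zip(a, b))

# For unit-length vectors, the dot product and cosine similarity coincide,
# since dividing by the norms (both 1.0) changes nothing.
a = normalize([3.0, 4.0])
b = normalize([4.0, 3.0])
cosine = dot(a, b) / (math.sqrt(dot(a, a)) * math.sqrt(dot(b, b)))
print(abs(dot(a, b) - cosine) < 1e-12)  # True
```

This is why normalized embeddings pair well with inner-product indexes in vector stores.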

## Benchmark performance

| Task Category | Embedding Gemma |
|---------------------|----------------|
| Retrieval | 54.87 |
| STS | 78.53 |
| Classification | 73.26 |
| Clustering | 44.72 |
| Pair Classification | 85.94 |
| Reranking | 59.36 |

## Links

- [Embedding Gemma Model Card](https://huggingface.co/google/embeddinggemma-300m)
- [Unsloth GGUF Version](https://huggingface.co/unsloth/embeddinggemma-300m-GGUF)
- [Gemma Model Family](https://ai.google.dev/gemma/docs)
- [Gemma Terms of Use](https://ai.google.dev/gemma/terms)