Merged

32 commits
c3723e3
adds update script
ilopezluna Apr 26, 2025
3e86f07
adds build-model-table.sh script
ilopezluna Apr 26, 2025
86d7da2
Updates all models
ilopezluna Apr 26, 2025
71d9310
force param is not needed anymore
ilopezluna Apr 26, 2025
2610171
Renaming model overviews to match with the model name in Hub (#17)
ilopezluna Apr 28, 2025
b02ce9f
Merge branch 'main' into update-overviews
ilopezluna Apr 28, 2025
c07716e
Use sentence case
ilopezluna Apr 28, 2025
b0a6209
Adds initial go script to update table
ilopezluna Apr 30, 2025
d7d2ceb
- build-all tables script to Go
ilopezluna May 2, 2025
999a550
- Uses authenticated req (to avoid rate limit)
ilopezluna May 2, 2025
d2d7b55
Try to get labels from general.size_label first, if not found fallbac…
ilopezluna May 2, 2025
05d5a0a
Format context length
ilopezluna May 2, 2025
f9a0f26
VRAM estimation
ilopezluna May 2, 2025
422a910
Allow to update only the specified file
ilopezluna May 2, 2025
71ca927
Removes unneeded scripts
ilopezluna May 2, 2025
5003e10
Fix estimated VRAM for embedding model
ilopezluna May 2, 2025
60bfe1b
Adds model inspect command
ilopezluna May 2, 2025
9e2c7d9
Rename to model-cards-cli
ilopezluna May 2, 2025
05f64b0
Updates model-cards
ilopezluna May 2, 2025
0b3f33a
Rename header to VRAM¹
ilopezluna May 2, 2025
b5609cb
Adds parsed gguf file into ModelVariant, and includes method to extra…
ilopezluna May 5, 2025
96446b3
Includes gguf metadata into inspect
ilopezluna May 5, 2025
cb93c29
No need to use interface for registry client for now.
ilopezluna May 5, 2025
aacfa12
A ModelVariant has multiple tags
ilopezluna May 5, 2025
fca5091
Formats VRAM
ilopezluna May 5, 2025
c40c7e3
Formats context length
ilopezluna May 5, 2025
09c0595
Adds --all to include metadata
ilopezluna May 5, 2025
854ef60
Removes formatter
ilopezluna May 5, 2025
1c607e0
Format size
ilopezluna May 5, 2025
35bd4c1
Update models
ilopezluna May 5, 2025
097a121
Script not needed anymore
ilopezluna May 5, 2025
277331b
Updates README.md
ilopezluna May 5, 2025
4 changes: 3 additions & 1 deletion .gitignore
@@ -1,2 +1,4 @@
.idea
.DS_Store
.DS_Store

bin
46 changes: 42 additions & 4 deletions README.md
@@ -24,7 +24,7 @@ Distilled LLaMA by DeepSeek, fast and optimized for real-world tasks.
![Gemma Logo](https://github.com/docker/model-cards/raw/refs/heads/main/logos/[email protected])

📌 **Description:**
Googles latest Gemma, small yet strong for chat and generation
Google's latest Gemma, small yet strong for chat and generation

📂 **Model File:** [`ai/gemma3.md`](ai/gemma3.md)

@@ -37,7 +37,7 @@ Google’s latest Gemma, small yet strong for chat and generation
![Meta Logo](https://github.com/docker/model-cards/raw/refs/heads/main/logos/[email protected])

📌 **Description:**
Metas LLaMA 3.1: Chat-focused, benchmark-strong, multilingual-ready.
Meta's LLaMA 3.1: Chat-focused, benchmark-strong, multilingual-ready.

📂 **Model File:** [`ai/llama3.1.md`](ai/llama3.1.md)

@@ -111,7 +111,7 @@ A state-of-the-art English language embedding model developed by Mixedbread AI.
![Microsoft Logo](https://github.com/docker/model-cards/raw/refs/heads/main/logos/[email protected])

📌 **Description:**
Microsofts compact model, surprisingly capable at reasoning and code.
Microsoft's compact model, surprisingly capable at reasoning and code.

📂 **Model File:** [`ai/phi4.md`](ai/phi4.md)

@@ -152,11 +152,49 @@ Experimental Qwen variant—lean, fast, and a bit mysterious.
📌 **Description:**
A compact language model, designed to run efficiently on-device while performing a wide range of language tasks

📂 **Model File:** [`ai/smolllm2.md`](ai/smollm2.md)
📂 **Model File:** [`ai/smollm2.md`](ai/smollm2.md)

**URLs:**
- https://huggingface.co/HuggingFaceTB/SmolLM2-360M-Instruct
- https://huggingface.co/HuggingFaceTB/SmolLM2-135M-Instruct

---

## 🔧 CLI Usage

The model-cards-cli tool provides commands to inspect and update model information:

### Inspect Command
```bash
# Basic inspection
make inspect REPOSITORY=ai/smollm2

# Inspect specific tag
make inspect REPOSITORY=ai/smollm2 TAG=360M-Q4_K_M

# Show all metadata
make inspect REPOSITORY=ai/smollm2 OPTIONS="--all"
```

### Update Command
```bash
# Update all models
make run

# Update specific model
make run-single MODEL=ai/smollm2.md
```

### Available Options

#### Inspect Command Options
- `REPOSITORY`: (Required) The repository to inspect (e.g., `ai/smollm2`)
- `TAG`: (Optional) Specific tag to inspect (e.g., `360M-Q4_K_M`)
- `OPTIONS`: (Optional) Additional options:
- `--all`: Show all metadata fields
- `--log-level`: Set log level (debug, info, warn, error)

#### Update Command Options
- `MODEL`: (Required for run-single) Specific model file to update (e.g., `ai/smollm2.md`)
- `--log-level`: Set log level (debug, info, warn, error)
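
The commit history for this PR mentions reading parameter labels from the GGUF `general.size_label` key, with a fallback when that key is missing. The Go sketch below illustrates that fallback pattern only; the function and variable names are hypothetical, and the actual model-cards-cli code is not part of this diff.

```go
package main

import "fmt"

// sizeLabel prefers the "general.size_label" GGUF metadata key and
// falls back to deriving a coarse label from the parameter count.
// Hypothetical sketch; not the CLI's actual implementation.
func sizeLabel(metadata map[string]any, paramCount float64) string {
	if v, ok := metadata["general.size_label"].(string); ok && v != "" {
		return v
	}
	if paramCount >= 1e9 {
		return fmt.Sprintf("%dB", int(paramCount/1e9))
	}
	return fmt.Sprintf("%dM", int(paramCount/1e6))
}

func main() {
	md := map[string]any{"general.size_label": "360M"}
	fmt.Println(sizeLabel(md, 361.8e6)) // "360M", taken from metadata
	fmt.Println(sizeLabel(nil, 7.62e9)) // "7B", derived from the count
}
```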

12 changes: 7 additions & 5 deletions ai/deepcoder-preview.md
@@ -32,12 +32,14 @@ DeepCoder-14B is purpose-built for advanced code reasoning, programming task sol

## Available model variants

| Model variant | Parameters | Quantization | Context window | VRAM | Size |
|------------------------------|------------|--------------|----------------|--------|--------|
| `deepcoder-preview:14B-F16` | 14.77B | F16 | 131,072 | 24GB¹ | 29.5GB |
| `deepcoder-preview:14B:latest` <br><br> `deepcoder-preview:14B-Q4_K_M` | 14.77B | Q4_K_M | 131,072 | 8GB¹ | 9GB |
| Model variant | Parameters | Quantization | Context window | VRAM¹ | Size |
|---------------|------------|--------------|----------------|------|-------|
| `ai/deepcoder-preview:latest`<br><br>`ai/deepcoder-preview:14B-Q4_K_M` | 14B | IQ2_XXS/Q4_K_M | 131K tokens | 4.03 GB | 8.37 GB |
| `ai/deepcoder-preview:14B-F16` | 14B | F16 | 131K tokens | 31.29 GB | 27.51 GB |

¹: VRAM estimated based on GGUF model characteristics.
¹: VRAM estimated based on model characteristics.

> `latest` → `14B-Q4_K_M`

## Use this AI model with Docker Model Runner

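For context on the VRAM¹ column these updated tables introduce: the footnote says the value is estimated from model characteristics, which for GGUF models usually means weight bytes at the quantized width plus working memory such as the KV cache. The sketch below is one plausible way to compute such an estimate; the constants and names are assumptions, not the formula the CLI actually uses (its published numbers differ from this naive calculation).

```go
package main

import "fmt"

// bitsPerWeight holds rough effective bits-per-weight figures for the
// quantization schemes that appear in the tables. Approximate values.
var bitsPerWeight = map[string]float64{
	"F16":    16.0,
	"Q8_0":   8.5,
	"Q4_K_M": 4.8,
	"Q4_0":   4.5,
}

// estimateVRAMGB returns a rough VRAM need in GB: weight bytes at the
// quantized width, inflated by a flat overhead factor standing in for
// the KV cache, activations, and runtime buffers.
func estimateVRAMGB(params float64, quant string, overhead float64) float64 {
	weightBytes := params * bitsPerWeight[quant] / 8
	return weightBytes * (1 + overhead) / 1e9
}

func main() {
	// 14.77B parameters at Q4_K_M with 20% headroom, as an illustration.
	fmt.Printf("%.2f GB\n", estimateVRAMGB(14.77e9, "Q4_K_M", 0.2))
}
```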
14 changes: 8 additions & 6 deletions ai/deepseek-r1-distill-llama.md
@@ -33,15 +33,17 @@ i: Estimated

## Available model variants

| Model Variant | Parameters | Quantization | Context Window | VRAM | Size |
|------------------------------------------------------------------------------------|----------- |----------------|---------------- |--------- |-------|
| `ai/deepseek-r1-distill-llama:70B-Q4_K_M` | 70B | IQ2_XXS/Q4_K_M | 128K tokens | 42GB¹ | 42GB |
| `ai/deepseek-r1-distill-llama:8B-F16` | 8B | F16 | 128K tokens | 19.2GB¹ | 16GB |
| `ai/deepseek-r1-distill-llama:latest`<br><br>`ai/deepseek-r1-distill-llama:8B-Q4_K_M` | 8B | IQ2_XXS/Q4_K_M | 128K tokens | 4.5GB¹ | 5GB |
| Model variant | Parameters | Quantization | Context window | VRAM¹ | Size |
|---------------|------------|--------------|----------------|------|-------|
| `ai/deepseek-r1-distill-llama:latest`<br><br>`ai/deepseek-r1-distill-llama:8B-Q4_K_M` | 8B | IQ2_XXS/Q4_K_M | 131K tokens | 2.31 GB | 4.58 GB |
| `ai/deepseek-r1-distill-llama:70B-Q4_0` | 70B | Q4_0 | 131K tokens | 44.00 GB | 37.22 GB |
| `ai/deepseek-r1-distill-llama:70B-Q4_K_M` | 70B | IQ2_XXS/Q4_K_M | 131K tokens | 20.17 GB | 39.59 GB |
| `ai/deepseek-r1-distill-llama:8B-F16` | 8B | F16 | 131K tokens | 17.88 GB | 14.96 GB |
| `ai/deepseek-r1-distill-llama:8B-Q4_0` | 8B | Q4_0 | 131K tokens | 5.03 GB | 4.33 GB |

¹: VRAM estimated based on model characteristics.

> `:latest` → `70B-Q4_K_M`
> `latest` → `8B-Q4_K_M`

## Use this AI model with Docker Model Runner

17 changes: 8 additions & 9 deletions ai/gemma3-qat.md
@@ -36,17 +36,16 @@ Gemma 3 4B model can be used for:

## Available model variants

| Model variant | Parameters | Quantization | Context window | VRAM | Size |
|-------------------------------------------------------- |----------- |----------------|--------------- |---------- |------- |
| `ai/gemma3-qat:1B-Q4_K_M` | 1B | IQ2_XXS/Q4_K_M | 32K tokens | 0.892GB¹ | 0.95GB |
| `ai/gemma3-qat:latest`<br><br>`ai/gemma3-qat:4B-Q4_K_M` | 4B | IQ2_XXS/Q4_K_M | 128K tokens | 3.4GB¹ | 2.93GB |
| `ai/gemma3-qat:12B-Q4_K_M` | 12B | IQ2_XXS/Q4_K_M | 128K tokens | 8.7GB¹ | 7.52GB |
| `ai/gemma3-qat:27B-Q4_K_M` | 27B | IQ2_XXS/Q4_K_M | 128K tokens | 21GB¹ | 16GB |
| Model variant | Parameters | Quantization | Context window | VRAM¹ | Size |
|---------------|------------|--------------|----------------|------|-------|
| `ai/gemma3-qat:latest`<br><br>`ai/gemma3-qat:4B-Q4_K_M` | 3.88 B | Q4_0 | 131K tokens | 5.44 GB | 2.93 GB |
| `ai/gemma3-qat:1B-Q4_K_M` | 999.89 M | Q4_0 | 33K tokens | 5.02 GB | 950.82 MB |
| `ai/gemma3-qat:27B-Q4_K_M` | 27.01 B | Q4_0 | 131K tokens | 20.28 GB | 16.04 GB |
| `ai/gemma3-qat:12B-Q4_K_M` | 11.77 B | Q4_0 | 131K tokens | 9.80 GB | 7.51 GB |

¹: VRAM extracted from Gemma documentation ([link](https://ai.google.dev/gemma/docs/core#128k-context)).
These are rough estimations. QAT models should use much less memory compared to the standard Gemma3 models
¹: VRAM estimated based on model characteristics.

> `:latest` → `4B-Q4_K_M`
> `latest` → `4B-Q4_K_M`

## Use this AI model with Docker Model Runner

17 changes: 9 additions & 8 deletions ai/gemma3.md
@@ -30,16 +30,17 @@ Gemma 3 4B model can be used for:

## Available model variants

| Model Variant | Parameters | Quantization | Context Window | VRAM | Size |
|-------------------------------------------------|----------- |----------------|--------------- |---------- |------- |
| `ai/gemma3:1B-F16` | 1B | F16 | 32K tokens | 1.5GB¹ | 1.86GB |
| `ai/gemma3:1B-Q4_K_M` | 1B | IQ2_XXS/Q4_K_M | 32K tokens | 0.892GB¹ | 0.76GB |
| `ai/gemma3:4B-F16` | 4B | F16 | 128K tokens | 6.4GB¹ | 7.23GB |
| `ai/gemma3:latest`<br><br>`ai/gemma3:4B-Q4_K_M` | 4B | IQ2_XXS/Q4_K_M | 128K tokens | 3.4GB¹ | 2.31GB |
| Model variant | Parameters | Quantization | Context window | VRAM¹ | Size |
|---------------|------------|--------------|----------------|------|-------|
| `ai/gemma3:latest`<br><br>`ai/gemma3:4B-Q4_K_M` | 4B | IQ2_XXS/Q4_K_M | 131K tokens | 4.15 GB | 2.31 GB |
| `ai/gemma3:4B-F16` | 4B | F16 | 131K tokens | 11.94 GB | 7.23 GB |
| `ai/gemma3:4B-Q4_0` | 4B | Q4_0 | 131K tokens | 5.51 GB | 2.19 GB |
| `ai/gemma3:1B-F16` | 1B | F16 | 33K tokens | 6.62 GB | 1.86 GB |
| `ai/gemma3:1B-Q4_K_M` | 1B | IQ2_XXS/Q4_K_M | 33K tokens | 4.68 GB | 762.49 MB |

¹: VRAM extracted from Gemma documentation ([link](https://ai.google.dev/gemma/docs/core#128k-context))
¹: VRAM estimated based on model characteristics.

`:latest`→ `4B-Q4_K_M`
> `latest` → `4B-Q4_K_M`

## Use this AI model with Docker Model Runner

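The "Format context length" and "Format size" commits in this PR account for the readable values in these tables, such as "131K tokens" and "950.82 MB". Below is a minimal sketch of that kind of formatting; the rounding rules are inferred from the table values and may not match the CLI exactly.

```go
package main

import "fmt"

// formatContext renders a context length as "NK tokens", rounding to
// the nearest thousand (131072 -> "131K tokens", 32768 -> "33K tokens").
func formatContext(tokens int) string {
	return fmt.Sprintf("%dK tokens", (tokens+500)/1000)
}

// formatSize renders a byte count with two decimals, using MB below
// one decimal gigabyte and GB otherwise (950820000 -> "950.82 MB").
func formatSize(bytes int64) string {
	const gb = 1_000_000_000
	if bytes < gb {
		return fmt.Sprintf("%.2f MB", float64(bytes)/1e6)
	}
	return fmt.Sprintf("%.2f GB", float64(bytes)/gb)
}

func main() {
	fmt.Println(formatContext(131072))     // "131K tokens"
	fmt.Println(formatSize(950_820_000))   // "950.82 MB"
	fmt.Println(formatSize(2_930_000_000)) // "2.93 GB"
}
```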
12 changes: 6 additions & 6 deletions ai/llama3.1.md
@@ -31,14 +31,14 @@

## Available model variants

| Model variant | Parameters | Quantization | Context window | VRAM | Size |
|----------------------------------------------------- |----------- |--------------- |--------------- |---------- |------- |
| `ai/llama3.1:latest`<br><br>`ai/llama3.1:8B-Q4_K_M` | 8B | Q4_K_M | 128K | 4.8GB¹ | 5GB |
| `ai/llama3.1:8B-F16` | 8B | F16 | 128K | 19.2GB¹ | 16GB |
| Model variant | Parameters | Quantization | Context window | VRAM¹ | Size |
|---------------|------------|--------------|----------------|------|-------|
| `ai/llama3.1:latest`<br><br>`ai/llama3.1:8B-Q4_K_M` | 8B | IQ2_XXS/Q4_K_M | 131K tokens | 2.31 GB | 4.58 GB |
| `ai/llama3.1:8B-F16` | 8B | F16 | 131K tokens | 17.88 GB | 14.96 GB |

¹: VRAM estimates based on model characteristics.
¹: VRAM estimated based on model characteristics.

> `:latest` → `8B-Q4_K_M`
> `latest` → `8B-Q4_K_M`

## Use this AI model with Docker Model Runner

16 changes: 9 additions & 7 deletions ai/llama3.2.md
@@ -29,16 +29,18 @@ Llama 3.2 instruct models are designed for:

## Available model variants

| Model Variant | Parameters | Quantization | Context window | VRAM | Size |
|---------------------------------------------------- |------------|--------------|----------------|--------|-------|
| `ai/llama3.2:3B-F16` | 3B | F16 | 128k tokens | 7.2GB¹ | 6GB |
| `ai/llama3.2:latest`<br><br>`ai/llama3.2:3B-Q4_K_M` | 3B | Q4_K_M | 128K tokens | 1.8GB¹ | 1.8GB |
| `ai/llama3.2:1B-F16` | 1B | F16 | 128K tokens | 2.4GB¹ | 2.3GB |
| `ai/llama3.2:1B-Q8_0` | 1B | Q8_0 | 128K tokens | 1.2GB¹ | 1.2GB |
| Model variant | Parameters | Quantization | Context window | VRAM¹ | Size |
|---------------|------------|--------------|----------------|------|-------|
| `ai/llama3.2:latest`<br><br>`ai/llama3.2:3B-Q4_K_M` | 3B | IQ2_XXS/Q4_K_M | 131K tokens | 3.26 GB | 1.87 GB |
| `ai/llama3.2:1B-Q8_0` | 1B | Q8_0 | 131K tokens | 1.19 GB | 1.22 GB |
| `ai/llama3.2:3B-F16` | 3B | F16 | 131K tokens | 9.11 GB | 5.98 GB |
| `ai/llama3.2:3B-Q4_0` | 3B | Q4_0 | 131K tokens | 4.29 GB | 1.78 GB |
| `ai/llama3.2:1B-F16` | 1B | F16 | 131K tokens | 2.24 GB | 2.30 GB |
| `ai/llama3.2:1B-Q4_0` | 1B | Q4_0 | 131K tokens | 0.63 GB | 727.75 MB |

¹: VRAM estimated based on model characteristics.

> `:latest` → `3B-Q4_K_M`
> `latest` → `3B-Q4_K_M`

## Use this AI model with Docker Model Runner

11 changes: 6 additions & 5 deletions ai/llama3.3.md
@@ -33,13 +33,14 @@ Meta Llama 3.3 is a powerful 70B parameter multilingual language model designed

## Available model variants

| Model variant | Parameters | Quantization | Context window | VRAM | Size |
|----------------------------------------------------- |----------- |--------------- |--------------- |---------- |------- |
| `ai/llama3.3:latest`<br><br>`ai/llama3.3:70B-Q4_K_M` | 70B | Q4_K_M | 128K | 42GB¹ | 42.5GB |
| Model variant | Parameters | Quantization | Context window | VRAM¹ | Size |
|---------------|------------|--------------|----------------|------|-------|
| `ai/llama3.3:latest`<br><br>`ai/llama3.3:70B-Q4_K_M` | 70B | IQ2_XXS/Q4_K_M | 131K tokens | 20.17 GB | 39.59 GB |
| `ai/llama3.3:70B-Q4_0` | 70B | Q4_0 | 131K tokens | 44.00 GB | 37.22 GB |

¹: VRAM estimates based on model characteristics.
¹: VRAM estimated based on model characteristics.

> `:latest` → `70B-Q4_K_M`
> `latest` → `70B-Q4_K_M`

## Use this AI model with Docker Model Runner

8 changes: 4 additions & 4 deletions ai/mistral-nemo.md
@@ -28,13 +28,13 @@ Mistral-Nemo-Instruct-2407 is designed for instruction-following tasks and multi

## Available model variants

| Model Variant | Parameters | Quantization | Context window | VRAM | Size |
|--------------------------------------------------------------|------------|--------------|----------------|--------|-------|
| `ai/mistral-nemo:latest`<br><br>`ai/mistral-nemo:12B-Q4_K_M` | 12B | Q4_K_M | 128k tokens | 7GB¹ | 7.1 GB|
| Model variant | Parameters | Quantization | Context window | VRAM¹ | Size |
|---------------|------------|--------------|----------------|------|-------|
| `ai/mistral-nemo:latest`<br><br>`ai/mistral-nemo:12B-Q4_K_M` | 12B | IQ2_XXS/Q4_K_M | 131K tokens | 3.46 GB | 6.96 GB |

¹: VRAM estimated based on model characteristics.

> `:latest` → `12B-Q4_K_M`
> `latest` → `12B-Q4_K_M`

## Use this AI model with Docker Model Runner

13 changes: 7 additions & 6 deletions ai/mistral.md
@@ -35,14 +35,15 @@ i: Estimated

## Available model variants

| Model variant | Parameters | Quantization | Context window | VRAM | Size |
|----------------------------------------------------|----------- |--------------- |----------------|---------|--------|
| `ai/mistral:latest`<br><br>`ai/mistral:7B-Q4_K_M` | 7B | IQ2_XXS/Q4_K_M | 32K | 4.2B¹ | 4.3GB |
| `ai/mistral:7B-F16` | 7B | F16 | 32K | 16.8¹ | 14.5GB |
| Model variant | Parameters | Quantization | Context window | VRAM¹ | Size |
|---------------|------------|--------------|----------------|------|-------|
| `ai/mistral:latest`<br><br>`ai/mistral:7B-Q4_K_M` | 7B | IQ2_XXS/Q4_K_M | 33K tokens | 2.02 GB | 4.07 GB |
| `ai/mistral:7B-F16` | 7B | F16 | 33K tokens | 15.65 GB | 13.50 GB |
| `ai/mistral:7B-Q4_0` | 7B | Q4_0 | 33K tokens | 4.40 GB | 3.83 GB |

¹: VRAM estimated based on model characteristics and quantization.
¹: VRAM estimated based on model characteristics.

> `:latest` → `7B-Q4_K_M`
> `latest` → `7B-Q4_K_M`

## Use this AI model with Docker Model Runner

10 changes: 5 additions & 5 deletions ai/mxbai-embed-large.md
@@ -27,13 +27,13 @@ mxbai-embed-large-v1 is designed for generating sentence embeddings suitable for

## Available model variants

| Model Variant | Parameters | Quantization | Context window | VRAM | Size |
|-------------------------------------------------------------- |----------- |--------------- |--------------- |---------- |------- |
| `ai/mxbai-embed-large:latest`<br><br>`ai/mxbai-embed-large:335M-F16` | 335M | F16 | 512 tokens | 0.8GB¹ | 670MB |
| Model variant | Parameters | Quantization | Context window | VRAM¹ | Size |
|---------------|------------|--------------|----------------|------|-------|
| `ai/mxbai-embed-large:latest`<br><br>`ai/mxbai-embed-large:335M-F16` | 334.09 M | F16 | 512 tokens | 0.80 GB | 638.85 MB |

¹: VRAM estimates based on model characteristics.
¹: VRAM estimated based on model characteristics.

> `:latest` → `mxbai-embed-large:335M-F16`
> `latest` → `335M-F16`

## Use this AI model with Docker Model Runner

13 changes: 7 additions & 6 deletions ai/phi4.md
@@ -27,14 +27,14 @@ Phi-4 is designed for:

## Available model variants

| Model Variant | Parameters | Quantization | Context window | VRAM | Size |
|----------------------------------------------|----------- |----------------|--------------- |--------- |------- |
| `ai/phi4:14B-F16` | 14B | F16 | 16K tokens | 33.6GB¹ | 29.3GB |
| `ai/phi4:latest`<br><br>`ai/phi4:14B-Q4_K_M` | 14B | IQ2_XXS/Q4_K_M | 16K tokens | 8.4GB¹ | 9.GB |
| Model variant | Parameters | Quantization | Context window | VRAM¹ | Size |
|---------------|------------|--------------|----------------|------|-------|
| `ai/phi4:latest`<br><br>`ai/phi4:14B-Q4_K_M` | 15B | IQ2_XXS/Q4_K_M | 16K tokens | 4.92 GB | 8.43 GB |
| `ai/phi4:14B-F16` | 15B | F16 | 16K tokens | 34.13 GB | 27.31 GB |
| `ai/phi4:14B-Q4_0` | 15B | Q4_0 | 16K tokens | 10.03 GB | 7.80 GB |

¹: VRAM estimates based on model characteristics.
¹: VRAM estimated based on model characteristics.

> `:latest` → `14B-Q4_K_M`
> `latest` → `14B-Q4_K_M`

## Use this AI model with Docker Model Runner

25 changes: 13 additions & 12 deletions ai/qwen2.5.md
@@ -30,18 +30,19 @@ Qwen2.5-7B-Instruct is designed to assist in various natural language processing

## Available model variants

| Model Variant | Parameters | Quantization | Context window | VRAM | Size |
|--------------------------------------------------|------------|------------------|----------------|----------|--------|
| `ai/qwen2.5:0.5B-F16` | 0.5B | F16 | 32K tokens | ~1.2GB¹ | 0.99GB |
| `ai/qwen2.5:1.5B-F16` | 1.5B | F16 | 32K tokens | ~3.5GB¹ | 3.09GB |
| `ai/qwen2.5:3B-F16` | 3.09B | F16 | 32K tokens | ~7GB¹ | 6.18GB |
| `ai/qwen2.5:3B-Q4_K_M` | 3.09B | IQ2_XXS/Q4_K_M | 32K tokens | ~2.2GB¹ | 1.93GB |
| `ai/qwen2.5:7B-F16` | 7.62B | F16 | 32K tokens | ~16GB¹ | 15.24GB|
| `ai/qwen2.5:7B-Q4_K_M`<br><br>`ai/qwen2.5:latest`| 7.62B | IQ2_XXS/Q4_K_M | 32K tokens | ~4.7GB¹ | 4.68GB |

¹: VRAM estimates based on model characteristics.

> `:latest`→ `7B-Q4_K_M`
| Model variant | Parameters | Quantization | Context window | VRAM¹ | Size |
|---------------|------------|--------------|----------------|------|-------|
| `ai/qwen2.5:latest`<br><br>`ai/qwen2.5:7B-Q4_K_M` | 7B | IQ2_XXS/Q4_K_M | 33K tokens | 2.32 GB | 4.36 GB |
| `ai/qwen2.5:0.5B-F16` | 0.5B | F16 | 33K tokens | 4.27 GB | 942.43 MB |
| `ai/qwen2.5:1.5B-F16` | 1.5B | F16 | 33K tokens | 4.85 GB | 2.88 GB |
| `ai/qwen2.5:3B-F16` | 3B | F16 | 33K tokens | 7.91 GB | 5.75 GB |
| `ai/qwen2.5:3B-Q4_K_M` | 3B | IQ2_XXS/Q4_K_M | 33K tokens | 2.06 GB | 1.79 GB |
| `ai/qwen2.5:7B-F16` | 7B | F16 | 33K tokens | 15.95 GB | 14.19 GB |
| `ai/qwen2.5:7B-Q4_0` | 7B | Q4_0 | 33K tokens | 4.70 GB | 4.12 GB |

¹: VRAM estimated based on model characteristics.

> `latest` → `7B-Q4_K_M`

## Use this AI model with Docker Model Runner
