
Commit bfa212c

update-overviews (#20)

* Renaming readme files for each model to the same name used in Hub
* Fix smollm2 urls
* Update overviews (#18)
* Adds update script
* Adds build-model-table.sh script
* Updates all models
* Force param is not needed anymore
* Renaming model overviews to match with the model name in Hub (#17)
* Renaming readme files for each model to the same name used in Hub
* Fix smollm2 urls
* Use sentence case
* Adds initial Go script to update the table
* Moves the build-all-tables script to Go; parses GGUF without downloading it
* Uses authenticated requests (to avoid rate limits); fixes the Markdown update
* Tries `general.size_label` first for labels, falling back to the parameters metadata if not found
* Format context length
* VRAM estimation
* Allows updating only the specified file
* Removes unneeded scripts
* Fixes estimated VRAM for the embedding model
* Adds model inspect command
* Renames to model-cards-cli
* Updates model-cards
* Renames header to VRAM¹
* Adds the parsed GGUF file into ModelVariant, and includes a method to extract all metadata
* Includes GGUF metadata in inspect
* No need to use an interface for the registry client for now
* A ModelVariant has multiple tags
* Formats VRAM
* Formats context length
* Adds --all to include metadata
* Removes formatter
* Formats size
* Updates models
* Script not needed anymore
* Updates README.md

1 parent 5e6d1ae · commit bfa212c
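Several of the commits above describe how the table data is gathered: the CLI parses GGUF files without downloading them and uses authenticated requests to avoid rate limits. The CLI source is not part of this page, so the following Go sketch only illustrates that general idea by fetching the fixed 24-byte GGUF header with an authenticated HTTP range request; the URL and the `REGISTRY_TOKEN` variable are hypothetical placeholders, not names from the repository.

```go
package main

import (
	"encoding/binary"
	"fmt"
	"io"
	"net/http"
	"os"
)

func main() {
	// Hypothetical blob URL; the real CLI resolves this from the registry.
	url := "https://registry.example.com/blobs/smollm2.gguf"

	req, err := http.NewRequest(http.MethodGet, url, nil)
	if err != nil {
		panic(err)
	}
	// An authenticated request avoids anonymous rate limits.
	req.Header.Set("Authorization", "Bearer "+os.Getenv("REGISTRY_TOKEN"))
	// Fetch only the GGUF header: magic (4 bytes), version (uint32),
	// tensor count (uint64), and metadata key/value count (uint64).
	req.Header.Set("Range", "bytes=0-23")

	resp, err := http.DefaultClient.Do(req)
	if err != nil {
		panic(err)
	}
	defer resp.Body.Close()

	header := make([]byte, 24)
	if _, err := io.ReadFull(resp.Body, header); err != nil {
		panic(err)
	}
	if string(header[:4]) != "GGUF" {
		panic("not a GGUF file")
	}
	version := binary.LittleEndian.Uint32(header[4:8])
	tensors := binary.LittleEndian.Uint64(header[8:16])
	kvPairs := binary.LittleEndian.Uint64(header[16:24])
	fmt.Printf("GGUF v%d: %d tensors, %d metadata key/value pairs\n", version, tensors, kvPairs)
}
```

Reading the metadata key/value pairs that follow the header (for example `general.size_label`, mentioned above) works the same way, with larger range reads.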

31 files changed (+1613 −96 lines)

.gitignore

Lines changed: 3 additions & 1 deletion

```diff
@@ -1,2 +1,4 @@
 .idea
-.DS_Store
+.DS_Store
+
+bin
```

README.md

Lines changed: 42 additions & 4 deletions

````diff
@@ -24,7 +24,7 @@ Distilled LLaMA by DeepSeek, fast and optimized for real-world tasks.
 ![Gemma Logo](https://github.com/docker/model-cards/raw/refs/heads/main/logos/[email protected])
 
 📌 **Description:**
-Googles latest Gemma, small yet strong for chat and generation
+Google's latest Gemma, small yet strong for chat and generation
 
 📂 **Model File:** [`ai/gemma3.md`](ai/gemma3.md)
 
@@ -37,7 +37,7 @@ Google’s latest Gemma, small yet strong for chat and generation
 ![Meta Logo](https://github.com/docker/model-cards/raw/refs/heads/main/logos/[email protected])
 
 📌 **Description:**
-Metas LLaMA 3.1: Chat-focused, benchmark-strong, multilingual-ready.
+Meta's LLaMA 3.1: Chat-focused, benchmark-strong, multilingual-ready.
 
 📂 **Model File:** [`ai/llama3.1.md`](ai/llama3.1.md)
 
@@ -111,7 +111,7 @@ A state-of-the-art English language embedding model developed by Mixedbread AI.
 ![Microsoft Logo](https://github.com/docker/model-cards/raw/refs/heads/main/logos/[email protected])
 
 📌 **Description:**
-Microsofts compact model, surprisingly capable at reasoning and code.
+Microsoft's compact model, surprisingly capable at reasoning and code.
 
 📂 **Model File:** [`ai/phi4.md`](ai/phi4.md)
 
@@ -152,11 +152,49 @@ Experimental Qwen variant—lean, fast, and a bit mysterious.
 📌 **Description:**
 A compact language model, designed to run efficiently on-device while performing a wide range of language tasks
 
-📂 **Model File:** [`ai/smolllm2.md`](ai/smollm2.md)
+📂 **Model File:** [`ai/smollm2.md`](ai/smollm2.md)
 
 **URLs:**
 - https://huggingface.co/HuggingFaceTB/SmolLM2-360M-Instruct
 - https://huggingface.co/HuggingFaceTB/SmolLM2-135M-Instruct
 
 ---
 
+## 🔧 CLI Usage
+
+The model-cards-cli tool provides commands to inspect and update model information:
+
+### Inspect Command
+```bash
+# Basic inspection
+make inspect REPOSITORY=ai/smollm2
+
+# Inspect specific tag
+make inspect REPOSITORY=ai/smollm2 TAG=360M-Q4_K_M
+
+# Show all metadata
+make inspect REPOSITORY=ai/smollm2 OPTIONS="--all"
+```
+
+### Update Command
+```bash
+# Update all models
+make run
+
+# Update specific model
+make run-single MODEL=ai/smollm2.md
+```
+
+### Available Options
+
+#### Inspect Command Options
+- `REPOSITORY`: (Required) The repository to inspect (e.g., `ai/smollm2`)
+- `TAG`: (Optional) Specific tag to inspect (e.g., `360M-Q4_K_M`)
+- `OPTIONS`: (Optional) Additional options:
+  - `--all`: Show all metadata fields
+  - `--log-level`: Set log level (debug, info, warn, error)
+
+#### Update Command Options
+- `MODEL`: (Required for run-single) Specific model file to update (e.g., `ai/smollm2.md`)
+- `--log-level`: Set log level (debug, info, warn, error)
+
````

ai/deepcoder-preview.md

Lines changed: 7 additions & 5 deletions

```diff
@@ -32,12 +32,14 @@ DeepCoder-14B is purpose-built for advanced code reasoning, programming task sol
 
 ## Available model variants
 
-| Model variant | Parameters | Quantization | Context window | VRAM | Size |
-|------------------------------|------------|--------------|----------------|--------|--------|
-| `deepcoder-preview:14B-F16` | 14.77B | F16 | 131,072 | 24GB¹ | 29.5GB |
-| `deepcoder-preview:14B:latest` <br><br> `deepcoder-preview:14B-Q4_K_M` | 14.77B | Q4_K_M | 131,072 | 8GB¹ | 9GB |
+| Model variant | Parameters | Quantization | Context window | VRAM¹ | Size |
+|---------------|------------|--------------|----------------|------|-------|
+| `ai/deepcoder-preview:latest`<br><br>`ai/deepcoder-preview:14B-Q4_K_M` | 14B | IQ2_XXS/Q4_K_M | 131K tokens | 4.03 GB | 8.37 GB |
+| `ai/deepcoder-preview:14B-F16` | 14B | F16 | 131K tokens | 31.29 GB | 27.51 GB |
 
-¹: VRAM estimated based on GGUF model characteristics.
+¹: VRAM estimated based on model characteristics.
+
+> `latest` → `14B-Q4_K_M`
 
 ## Use this AI model with Docker Model Runner
 
```
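Every table in this commit now shares the footnote "VRAM estimated based on model characteristics," but the estimator itself is not included in the diff. The Go sketch below shows one plausible heuristic (quantized weight size plus an F16 KV cache sized for the full context window); the function and parameter names are assumptions for illustration, not the CLI's actual API.

```go
package main

import "fmt"

// estimateVRAMBytes is a hypothetical heuristic, not the CLI's formula:
// the quantized weights must fit in memory, plus a KV cache holding two
// F16 tensors (K and V) per layer for the whole context window.
func estimateVRAMBytes(weightBytes, layers, kvHeads, headDim, contextLen uint64) uint64 {
	kvCache := 2 * layers * kvHeads * headDim * contextLen * 2 // 2 bytes per F16
	return weightBytes + kvCache
}

func main() {
	// Illustrative numbers only, loosely shaped like an 8B Q4_K_M model.
	vram := estimateVRAMBytes(4_580_000_000, 32, 8, 128, 131_072)
	fmt.Printf("estimated VRAM: %.2f GB\n", float64(vram)/1e9)
}
```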

ai/deepseek-r1-distill-llama.md

Lines changed: 8 additions & 6 deletions

```diff
@@ -33,15 +33,17 @@ i: Estimated
 
 ## Available model variants
 
-| Model Variant | Parameters | Quantization | Context Window | VRAM | Size |
-|------------------------------------------------------------------------------------|----------- |----------------|---------------- |--------- |-------|
-| `ai/deepseek-r1-distill-llama:70B-Q4_K_M` | 70B | IQ2_XXS/Q4_K_M | 128K tokens | 42GB¹ | 42GB |
-| `ai/deepseek-r1-distill-llama:8B-F16` | 8B | F16 | 128K tokens | 19.2GB¹ | 16GB |
-| `ai/deepseek-r1-distill-llama:latest`<br><br>`ai/deepseek-r1-distill-llama:8B-Q4_K_M` | 8B | IQ2_XXS/Q4_K_M | 128K tokens | 4.5GB¹ | 5GB |
+| Model variant | Parameters | Quantization | Context window | VRAM¹ | Size |
+|---------------|------------|--------------|----------------|------|-------|
+| `ai/deepseek-r1-distill-llama:latest`<br><br>`ai/deepseek-r1-distill-llama:8B-Q4_K_M` | 8B | IQ2_XXS/Q4_K_M | 131K tokens | 2.31 GB | 4.58 GB |
+| `ai/deepseek-r1-distill-llama:70B-Q4_0` | 70B | Q4_0 | 131K tokens | 44.00 GB | 37.22 GB |
+| `ai/deepseek-r1-distill-llama:70B-Q4_K_M` | 70B | IQ2_XXS/Q4_K_M | 131K tokens | 20.17 GB | 39.59 GB |
+| `ai/deepseek-r1-distill-llama:8B-F16` | 8B | F16 | 131K tokens | 17.88 GB | 14.96 GB |
+| `ai/deepseek-r1-distill-llama:8B-Q4_0` | 8B | Q4_0 | 131K tokens | 5.03 GB | 4.33 GB |
 
 ¹: VRAM estimated based on model characteristics.
 
-> `:latest` → `70B-Q4_K_M`
+> `latest` → `8B-Q4_K_M`
 
 ## Use this AI model with Docker Model Runner
 
```

ai/gemma3-qat.md

Lines changed: 8 additions & 9 deletions

```diff
@@ -36,17 +36,16 @@ Gemma 3 4B model can be used for:
 
 ## Available model variants
 
-| Model variant | Parameters | Quantization | Context window | VRAM | Size |
-|-------------------------------------------------------- |----------- |----------------|--------------- |---------- |------- |
-| `ai/gemma3-qat:1B-Q4_K_M` | 1B | IQ2_XXS/Q4_K_M | 32K tokens | 0.892GB¹ | 0.95GB |
-| `ai/gemma3-qat:latest`<br><br>`ai/gemma3-qat:4B-Q4_K_M` | 4B | IQ2_XXS/Q4_K_M | 128K tokens | 3.4GB¹ | 2.93GB |
-| `ai/gemma3-qat:12B-Q4_K_M` | 12B | IQ2_XXS/Q4_K_M | 128K tokens | 8.7GB¹ | 7.52GB |
-| `ai/gemma3-qat:27B-Q4_K_M` | 27B | IQ2_XXS/Q4_K_M | 128K tokens | 21GB¹ | 16GB |
+| Model variant | Parameters | Quantization | Context window | VRAM¹ | Size |
+|---------------|------------|--------------|----------------|------|-------|
+| `ai/gemma3-qat:latest`<br><br>`ai/gemma3-qat:4B-Q4_K_M` | 3.88 B | Q4_0 | 131K tokens | 5.44 GB | 2.93 GB |
+| `ai/gemma3-qat:1B-Q4_K_M` | 999.89 M | Q4_0 | 33K tokens | 5.02 GB | 950.82 MB |
+| `ai/gemma3-qat:27B-Q4_K_M` | 27.01 B | Q4_0 | 131K tokens | 20.28 GB | 16.04 GB |
+| `ai/gemma3-qat:12B-Q4_K_M` | 11.77 B | Q4_0 | 131K tokens | 9.80 GB | 7.51 GB |
 
-¹: VRAM extracted from Gemma documentation ([link](https://ai.google.dev/gemma/docs/core#128k-context)).
-These are rough estimations. QAT models should use much less memory compared to the standard Gemma3 models
+¹: VRAM estimated based on model characteristics.
 
-> `:latest` → `4B-Q4_K_M`
+> `latest` → `4B-Q4_K_M`
 
 ## Use this AI model with Docker Model Runner
 
```

ai/gemma3.md

Lines changed: 9 additions & 8 deletions

```diff
@@ -30,16 +30,17 @@ Gemma 3 4B model can be used for:
 
 ## Available model variants
 
-| Model Variant | Parameters | Quantization | Context Window | VRAM | Size |
-|-------------------------------------------------|----------- |----------------|--------------- |---------- |------- |
-| `ai/gemma3:1B-F16` | 1B | F16 | 32K tokens | 1.5GB¹ | 1.86GB |
-| `ai/gemma3:1B-Q4_K_M` | 1B | IQ2_XXS/Q4_K_M | 32K tokens | 0.892GB¹ | 0.76GB |
-| `ai/gemma3:4B-F16` | 4B | F16 | 128K tokens | 6.4GB¹ | 7.23GB |
-| `ai/gemma3:latest`<br><br>`ai/gemma3:4B-Q4_K_M` | 4B | IQ2_XXS/Q4_K_M | 128K tokens | 3.4GB¹ | 2.31GB |
+| Model variant | Parameters | Quantization | Context window | VRAM¹ | Size |
+|---------------|------------|--------------|----------------|------|-------|
+| `ai/gemma3:latest`<br><br>`ai/gemma3:4B-Q4_K_M` | 4B | IQ2_XXS/Q4_K_M | 131K tokens | 4.15 GB | 2.31 GB |
+| `ai/gemma3:4B-F16` | 4B | F16 | 131K tokens | 11.94 GB | 7.23 GB |
+| `ai/gemma3:4B-Q4_0` | 4B | Q4_0 | 131K tokens | 5.51 GB | 2.19 GB |
+| `ai/gemma3:1B-F16` | 1B | F16 | 33K tokens | 6.62 GB | 1.86 GB |
+| `ai/gemma3:1B-Q4_K_M` | 1B | IQ2_XXS/Q4_K_M | 33K tokens | 4.68 GB | 762.49 MB |
 
-¹: VRAM extracted from Gemma documentation ([link](https://ai.google.dev/gemma/docs/core#128k-context))
+¹: VRAM estimated based on model characteristics.
 
-`:latest` → `4B-Q4_K_M`
+> `latest` → `4B-Q4_K_M`
 
 ## Use this AI model with Docker Model Runner
 
```
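The reformatted cells in these tables ("131K tokens", "33K tokens", "762.49 MB") line up with the "Format context length" and "Format size" commits in the message above. Those helpers live in the CLI source rather than in this diff; the sketch below merely reproduces the visible behavior, assuming round-to-nearest-thousand token counts and decimal byte units.

```go
package main

import "fmt"

// formatContextLength rounds a raw token count to the nearest thousand,
// matching values like 131072 -> "131K tokens" and 32768 -> "33K tokens".
func formatContextLength(tokens uint64) string {
	return fmt.Sprintf("%dK tokens", (tokens+500)/1000)
}

// formatSize prints byte counts the way the Size column reads, assuming
// decimal units: MB below one GB, otherwise GB with two decimals.
func formatSize(bytes uint64) string {
	if bytes < 1_000_000_000 {
		return fmt.Sprintf("%.2f MB", float64(bytes)/1e6)
	}
	return fmt.Sprintf("%.2f GB", float64(bytes)/1e9)
}

func main() {
	fmt.Println(formatContextLength(131072)) // 131K tokens
	fmt.Println(formatContextLength(32768))  // 33K tokens
	fmt.Println(formatSize(950_820_000))     // 950.82 MB
}
```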

ai/llama3.1.md

Lines changed: 6 additions & 6 deletions

```diff
@@ -31,14 +31,14 @@
 
 ## Available model variants
 
-| Model variant | Parameters | Quantization | Context window | VRAM | Size |
-|----------------------------------------------------- |----------- |--------------- |--------------- |---------- |------- |
-| `ai/llama3.1:latest`<br><br>`ai/llama3.1:8B-Q4_K_M` | 8B | Q4_K_M | 128K | 4.8GB¹ | 5GB |
-| `ai/llama3.1:8B-F16` | 8B | F16 | 128K | 19.2GB¹ | 16GB |
+| Model variant | Parameters | Quantization | Context window | VRAM¹ | Size |
+|---------------|------------|--------------|----------------|------|-------|
+| `ai/llama3.1:latest`<br><br>`ai/llama3.1:8B-Q4_K_M` | 8B | IQ2_XXS/Q4_K_M | 131K tokens | 2.31 GB | 4.58 GB |
+| `ai/llama3.1:8B-F16` | 8B | F16 | 131K tokens | 17.88 GB | 14.96 GB |
 
-¹: VRAM estimates based on model characteristics.
+¹: VRAM estimated based on model characteristics.
 
-> `:latest` → `8B-Q4_K_M`
+> `latest` → `8B-Q4_K_M`
 
 ## Use this AI model with Docker Model Runner
 
```

ai/llama3.2.md

Lines changed: 9 additions & 7 deletions

```diff
@@ -29,16 +29,18 @@ Llama 3.2 instruct models are designed for:
 
 ## Available model variants
 
-| Model Variant | Parameters | Quantization | Context window | VRAM | Size |
-|---------------------------------------------------- |------------|--------------|----------------|--------|-------|
-| `ai/llama3.2:3B-F16` | 3B | F16 | 128k tokens | 7.2GB¹ | 6GB |
-| `ai/llama3.2:latest`<br><br>`ai/llama3.2:3B-Q4_K_M` | 3B | Q4_K_M | 128K tokens | 1.8GB¹ | 1.8GB |
-| `ai/llama3.2:1B-F16` | 1B | F16 | 128K tokens | 2.4GB¹ | 2.3GB |
-| `ai/llama3.2:1B-Q8_0` | 1B | Q8_0 | 128K tokens | 1.2GB¹ | 1.2GB |
+| Model variant | Parameters | Quantization | Context window | VRAM¹ | Size |
+|---------------|------------|--------------|----------------|------|-------|
+| `ai/llama3.2:latest`<br><br>`ai/llama3.2:3B-Q4_K_M` | 3B | IQ2_XXS/Q4_K_M | 131K tokens | 3.26 GB | 1.87 GB |
+| `ai/llama3.2:1B-Q8_0` | 1B | Q8_0 | 131K tokens | 1.19 GB | 1.22 GB |
+| `ai/llama3.2:3B-F16` | 3B | F16 | 131K tokens | 9.11 GB | 5.98 GB |
+| `ai/llama3.2:3B-Q4_0` | 3B | Q4_0 | 131K tokens | 4.29 GB | 1.78 GB |
+| `ai/llama3.2:1B-F16` | 1B | F16 | 131K tokens | 2.24 GB | 2.30 GB |
+| `ai/llama3.2:1B-Q4_0` | 1B | Q4_0 | 131K tokens | 0.63 GB | 727.75 MB |
 
 ¹: VRAM estimated based on model characteristics.
 
-> `:latest` → `3B-Q4_K_M`
+> `latest` → `3B-Q4_K_M`
 
 ## Use this AI model with Docker Model Runner
 
```

ai/llama3.3.md

Lines changed: 6 additions & 5 deletions

```diff
@@ -33,13 +33,14 @@ Meta Llama 3.3 is a powerful 70B parameter multilingual language model designed
 
 ## Available model variants
 
-| Model variant | Parameters | Quantization | Context window | VRAM | Size |
-|----------------------------------------------------- |----------- |--------------- |--------------- |---------- |------- |
-| `ai/llama3.3:latest`<br><br>`ai/llama3.3:70B-Q4_K_M` | 70B | Q4_K_M | 128K | 42GB¹ | 42.5GB |
+| Model variant | Parameters | Quantization | Context window | VRAM¹ | Size |
+|---------------|------------|--------------|----------------|------|-------|
+| `ai/llama3.3:latest`<br><br>`ai/llama3.3:70B-Q4_K_M` | 70B | IQ2_XXS/Q4_K_M | 131K tokens | 20.17 GB | 39.59 GB |
+| `ai/llama3.3:70B-Q4_0` | 70B | Q4_0 | 131K tokens | 44.00 GB | 37.22 GB |
 
-¹: VRAM estimates based on model characteristics.
+¹: VRAM estimated based on model characteristics.
 
-> `:latest` → `70B-Q4_K_M`
+> `latest` → `70B-Q4_K_M`
 
 ## Use this AI model with Docker Model Runner
 
```

ai/mistral-nemo.md

Lines changed: 4 additions & 4 deletions

```diff
@@ -28,13 +28,13 @@ Mistral-Nemo-Instruct-2407 is designed for instruction-following tasks and multi
 
 ## Available model variants
 
-| Model Variant | Parameters | Quantization | Context window | VRAM | Size |
-|--------------------------------------------------------------|------------|--------------|----------------|--------|-------|
-| `ai/mistral-nemo:latest`<br><br>`ai/mistral-nemo:12B-Q4_K_M` | 12B | Q4_K_M | 128k tokens | 7GB¹ | 7.1 GB|
+| Model variant | Parameters | Quantization | Context window | VRAM¹ | Size |
+|---------------|------------|--------------|----------------|------|-------|
+| `ai/mistral-nemo:latest`<br><br>`ai/mistral-nemo:12B-Q4_K_M` | 12B | IQ2_XXS/Q4_K_M | 131K tokens | 3.46 GB | 6.96 GB |
 
 ¹: VRAM estimated based on model characteristics.
 
-> `:latest` → `12B-Q4_K_M`
+> `latest` → `12B-Q4_K_M`
 
 ## Use this AI model with Docker Model Runner
 
```
