Update overviews #18

ilopezluna · 2025-04-26T20:49:01Z

Introducing Model Cards CLI Tool and Model Documentation Updates

This PR introduces a new Model Cards CLI tool and updates model documentation across the repository. Key changes include:

New Model Cards CLI Tool:
- Command-line interface for updating model card markdown files
- Model repository inspection capabilities
- OCI registry integration for model metadata
- GGUF file metadata extraction
- Markdown file processing utilities

Model Documentation Updates:
- Updated model variant information across multiple model cards
- Improved accuracy of parameters, quantization options, and VRAM estimates
- Added new model variants and options
- Enhanced clarity in model documentation

model-cards-cli % make help
Available targets:
  all              - Clean, build, and test
  build            - Build the binary
  clean            - Clean build artifacts
  lint             - Run linters
  run              - Run the binary to update all model files
  run-single       - Run the binary to update a single model file (Usage: make run-single MODEL=<model-file.md>)
  inspect          - Inspect a model repository (Usage: make inspect REPO=<repository> [TAG=<tag>] [OPTIONS=<options>])
                     Example: make inspect REPO=ai/smollm2
                     Example: make inspect REPO=ai/smollm2 TAG=360M-Q4_K_M
                     Example: make inspect REPO=ai/smollm2 OPTIONS="--parameters --vram --json"
  help             - Show this help message

make inspect REPO=ai/llama3.2 TAG=latest
Inspecting model: ai/llama3.2:latest
INFO[2025-05-02 17:18:01] Starting model inspector                     
INFO[2025-05-02 17:18:01] Inspecting ai/llama3.2:latest                
🔍 Model: ai/llama3.2:latest
   • Parameters   : 3B
   • Architecture : llama
   • Quantization : IQ2_XXS/Q4_K_M
   • Size         : 1.87 GiB
   • Context      : 131072 tokens
   • VRAM         : 4.08 GB
INFO[2025-05-02 17:18:04] Inspection completed successfully

ilopezluna · 2025-04-27T18:34:14Z

Context window (context length) seems to be part of the GGUF metadata: https://github.com/ggml-org/ggml/blob/master/docs/gguf.md#llm
I'm going to check whether it's contained in the GGUF files we have on Hub and, if so, I will include it as metadata in the config file.

aevesdocker · 2025-04-28T09:37:33Z

ai/deepcoder-preview.md

-|------------------------------|------------|--------------|----------------|--------|--------|
-| `deepcoder-preview:14B-F16`    | 14.77B     | F16          | 131,072        | 24GB¹  | 29.5GB |
-| `deepcoder-preview:14B:latest` <br><br> `deepcoder-preview:14B-Q4_K_M` | 14.77B     | Q4_K_M       | 131,072        | 8GB¹   | 9GB    |
+| Model Variant | Parameters | Quantization | Context window | VRAM | Size |



krissetto · 2025-04-28T10:32:05Z

> I couldn't find how we add context window and VRAM. How can we automate that?

@ilopezluna Context length is generally model-specific and should be provided by the model creators. I'm not sure there's an easy way to automate that if the metadata isn't consistently included in the HF repo. Also, we should be aware of the context length limitations we currently have in DMR (I'm not sure if any progress has been made there). Maybe we should specify that instead of just removing all values? If not, we could remove the column from the table instead of leaving it empty.

ilopezluna · 2025-04-28T10:44:22Z

> > I couldn't find how we add context window and VRAM. How can we automate that?
>
> @ilopezluna Context length is generally model-specific and should be provided by the model creators. I'm not sure there's an easy way to automate that if the metadata isn't consistently included in the HF repo. Also, we should be aware of the context length limitations we currently have in DMR (I'm not sure if any progress has been made there). Maybe we should specify that instead of just removing all values? If not, we could remove the column from the table instead of leaving it empty.

@krissetto I've just verified (thanks @jalonsogo for the hint) that it's included in the GGUF metadata as [llm].context_length.
I'm going to discuss with the team whether to include it in the config file.
I will also update the current script to read this metadata and include it in the table.

> I'm not sure if any progress has been made there

Unfortunately, there is no progress here yet.
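
As a rough illustration of the `[llm].context_length` lookup discussed above, here is a minimal Python sketch that reads the key/value section of a GGUF header and looks up the context length under the architecture name. It assumes a little-endian GGUF v2 or v3 file, as described in the GGUF spec linked earlier in the thread. It is not the Go code from this PR, and the names `read_gguf_metadata`, `context_length`, and `build_demo_header` are made up for the example.

```python
import io
import struct

# GGUF scalar value types mapped to little-endian struct format codes.
_SCALARS = {0: "B", 1: "b", 2: "H", 3: "h", 4: "I", 5: "i",
            6: "f", 7: "?", 10: "Q", 11: "q", 12: "d"}
_STRING, _ARRAY = 8, 9


def _read(stream, code):
    """Read one little-endian scalar described by a struct format code."""
    fmt = "<" + code
    return struct.unpack(fmt, stream.read(struct.calcsize(fmt)))[0]


def _read_string(stream):
    """GGUF strings are a uint64 byte length followed by UTF-8 bytes."""
    length = _read(stream, "Q")
    return stream.read(length).decode("utf-8")


def _read_value(stream, value_type):
    if value_type in _SCALARS:
        return _read(stream, _SCALARS[value_type])
    if value_type == _STRING:
        return _read_string(stream)
    if value_type == _ARRAY:
        item_type = _read(stream, "I")
        count = _read(stream, "Q")
        return [_read_value(stream, item_type) for _ in range(count)]
    raise ValueError(f"unknown GGUF value type {value_type}")


def read_gguf_metadata(stream):
    """Return the metadata key/value pairs stored in the GGUF header."""
    if stream.read(4) != b"GGUF":
        raise ValueError("not a GGUF file")
    _version = _read(stream, "I")
    _tensor_count = _read(stream, "Q")
    kv_count = _read(stream, "Q")
    metadata = {}
    for _ in range(kv_count):
        key = _read_string(stream)
        value_type = _read(stream, "I")
        metadata[key] = _read_value(stream, value_type)
    return metadata


def context_length(metadata):
    """Look up `<architecture>.context_length`, or None if it is absent."""
    arch = metadata.get("general.architecture")
    if arch is None:
        return None
    return metadata.get(f"{arch}.context_length")


def build_demo_header():
    """Build a tiny synthetic GGUF header (illustration only, no tensors)."""
    def gguf_string(text):
        raw = text.encode("utf-8")
        return struct.pack("<Q", len(raw)) + raw

    header = b"GGUF" + struct.pack("<IQQ", 3, 0, 2)  # version, tensors, kv pairs
    arch = gguf_string("general.architecture") + struct.pack("<I", _STRING) + gguf_string("llama")
    ctx = gguf_string("llama.context_length") + struct.pack("<II", 4, 131072)
    return header + arch + ctx


demo = read_gguf_metadata(io.BytesIO(build_demo_header()))
print(context_length(demo))
```

Because the metadata sits at the start of the file, a reader like this only needs the leading bytes of a model file, which is presumably what makes reading it without a full download feasible.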


ilopezluna · 2025-05-02T15:20:16Z

> > > I couldn't find how we add context window and VRAM. How can we automate that?
> >
> > @ilopezluna Context length is generally model-specific and should be provided by the model creators. I'm not sure there's an easy way to automate that if the metadata isn't consistently included in the HF repo. Also, we should be aware of the context length limitations we currently have in DMR (I'm not sure if any progress has been made there). Maybe we should specify that instead of just removing all values? If not, we could remove the column from the table instead of leaving it empty.
>
> @krissetto I've just verified (thanks @jalonsogo for the hint) that it's included in the GGUF metadata as [llm].context_length. I'm going to discuss with the team whether to include it in the config file. I will also update the current script to read this metadata and include it in the table.
>
> > I'm not sure if any progress has been made there
>
> Unfortunately, there is no progress here yet.

@krissetto @jalonsogo I'm using this formula now: https://github.com/docker/model-cards/pull/18/files#diff-3ddaf77e1aeb6813c77ff54404fc4be8e4aa5bbff4bd6227bbea8d04155d4468R216

ilopezluna · 2025-05-02T15:20:56Z

@jalonsogo I kept the previous scripts, but I think it would be better to remove them once we confirm that the current Go approach works as expected.

krissetto · 2025-05-02T15:24:58Z

@ilopezluna noice 🫶

nit: don't forget the footnote notation (the little "1") in the VRAM calc parts of the tables when we generate them

krissetto · 2025-05-02T15:25:56Z

> @ilopezluna noice 🫶
>
> nit: don't forget the footnote notation (the little "1") in the VRAM calc parts of the tables when we generate them

or maybe let's put it in the table header itself? 🤔

ilopezluna · 2025-05-02T17:32:09Z

> @ilopezluna noice 🫶
>
> nit: don't forget the footnote notation (the little "1") in the VRAM calc parts of the tables when we generate them

good catch, thanks! (added)
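
The header-level footnote suggested above could be generated along these lines. This is a minimal Python sketch, not the Go code from the PR: `render_variant_table` and `HEADER` are hypothetical names, the column titles follow the sentence-case header from the review together with the `VRAM¹` rename, and the sample row reuses values from the `make inspect` output in the PR description. The exact cell formatting the tool emits is an assumption.

```python
# Column titles; the footnote marker lives in the header, not in each cell.
HEADER = ["Model variant", "Parameters", "Quantization",
          "Context window", "VRAM¹", "Size"]


def render_variant_table(rows):
    """Render rows (lists of cell strings) as a markdown table."""
    lines = [
        "| " + " | ".join(HEADER) + " |",
        "|" + "|".join("---" for _ in HEADER) + "|",
    ]
    for row in rows:
        if len(row) != len(HEADER):
            raise ValueError(f"expected {len(HEADER)} cells, got {len(row)}")
        lines.append("| " + " | ".join(row) + " |")
    return "\n".join(lines)


table = render_variant_table([
    ["`ai/llama3.2:latest`", "3B", "IQ2_XXS/Q4_K_M",
     "131072 tokens", "4.08 GB", "1.87 GiB"],
])
print(table)
```

Putting the marker in the header keeps every VRAM cell a plain value, so the generator cannot forget the footnote on an individual row.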


* Renaming readme files for each model to the same name used in Hub
* Fix smollm2 urls
* Update overviews (#18)
* adds update script
* adds build-model-table.sh script
* Updates all models
* force param is not needed anymore
* Renaming model overviews to match with the model name in Hub (#17)
* Renaming readme files for each model to the same name used in Hub
* Fix smollm2 urls
* Use sentence case
* Adds initial go script to update table
* - build-all tables script to Go
  - Parse gguf without downloading it
* - Uses authenticated req (to avoid rate limit)
  - Fixes update of the markdown
* Try to get labels from general.size_label first, if not found fallback parameters metadata
* Format context length
* VRAM estimation
* Allow to update only the specified file
* Removes unneeded scripts
* Fix estimated VRAM for embedding model
* Adds model inspect command
* Rename to model-cards-cli
* Updates model-cards
* Rename header to VRAM¹
* Adds parsed gguf file into ModelVariant, and includes method to extract all metadata
* Includes gguf metadata into inspect
* No need to use interface for registry client for now.
* A ModelVariant has multiple tags
* Formats VRAM
* Formats context length
* Adds --all to include metadata
* Removes formatter
* Format size
* Update models
* Script not needed anymore
* Updates README.md

ilopezluna added 3 commits April 26, 2025 21:57

adds update script

c3723e3

adds build-model-table.sh script

3e86f07

Updates all models

86d7da2

ilopezluna requested review from krissetto, jalonsogo and aevesdocker April 26, 2025 20:49

force param is not needed anymore

71d9310

jalonsogo approved these changes Apr 28, 2025

aevesdocker reviewed Apr 28, 2025

ilopezluna added 2 commits April 28, 2025 11:44

Renaming model overviews to match with the model name in Hub (#17)

2610171

* Renaming readme files for each model to the same name used in Hub * Fix smollm2 urls

Merge branch 'main' into update-overviews

b02ce9f

Use sentence case

c07716e

ilopezluna added 12 commits April 30, 2025 12:59

Adds initial go script to update table

b0a6209

- build-all tables script to Go

d7d2ceb

- Parse gguf without downloading it

- Uses authenticated req (to avoid rate limit)

999a550

- Fixes update of the markdown

Try to get labels from general.size_label first, if not found fallback parameters metadata

d2d7b55

Format context length

05d5a0a

VRAM estimation

f9a0f26

Allow to update only the specified file

422a910

Removes unneeded scripts

71ca927

Fix estimated VRAM for embedding model

5003e10

Adds model inspect command

60bfe1b

Rename to model-cards-cli

9e2c7d9

Updates model-cards

05f64b0

Rename header to VRAM¹

0b3f33a

ilopezluna added 12 commits May 5, 2025 14:50

Adds parsed gguf file into ModelVariant, and includes method to extract all metadata

b5609cb

Includes gguf metadata into inspect

96446b3

No need to use interface for registry client for now.

cb93c29

A ModelVariant has multiple tags

aacfa12

Formats VRAM

fca5091

Formats context length

c40c7e3

Adds --all to include metadata

09c0595

Removes formatter

854ef60

Format size

1c607e0

Update models

35bd4c1

Script not needed anymore

097a121

Updates README.md

277331b

jalonsogo approved these changes May 5, 2025

ilopezluna merged commit e775016 into rename May 6, 2025

ilopezluna mentioned this pull request May 6, 2025

update-overviews #20

Merged

-| Model Variant | Parameters | Quantization | Context window | VRAM | Size |
+| Model variant | Parameters | Quantization | Context window | VRAM | Size |
