
dots.llm1 support #51

@louiehelm

Description


New architecture dots.llm1

Might be fairly easy to support, given that Qwen3 and DeepSeek are already working, based on this:

It looks like it's a mixture of DeepSeek-V3 MoE modules and Qwen3 attention modules:

https://github.com/huggingface/transformers/blob/ffe12627b4e84489d2ab91dd0ec00614855edc79/src/transformers/models/dots1/modular_dots1.py
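
For reference, that modular file mostly just composes the existing building blocks. A rough paraphrase (from memory, not the verbatim transformers source), assuming the class names follow the usual modular-transformers pattern:

```python
# Rough paraphrase of transformers' modular_dots1.py: attention is inherited from
# Qwen3, the dense MLP and MoE blocks from DeepSeek-V3. Not the verbatim source;
# see the link above for the real file.
from transformers.models.qwen3.modeling_qwen3 import Qwen3Attention
from transformers.models.deepseek_v3.modeling_deepseek_v3 import DeepseekV3MLP, DeepseekV3MoE


class Dots1Attention(Qwen3Attention):  # Qwen3-style attention, incl. per-head q/k RMSNorm
    pass


class Dots1MLP(DeepseekV3MLP):  # dense MLP used for the first few (non-MoE) layers
    pass


class Dots1MoE(DeepseekV3MoE):  # routed experts plus shared expert(s), DeepSeek-V3 style
    pass
```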

It also uses the Qwen2 tokenizer:

https://huggingface.co/rednote-hilab/dots.llm1.inst/blob/main/tokenizer_config.json
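
A quick way to sanity-check the tokenizer side (assuming a recent transformers install; `trust_remote_code` may or may not be needed depending on the version):

```python
from transformers import AutoTokenizer

# Should resolve to the stock Qwen2 fast tokenizer rather than anything model-specific,
# per the tokenizer_config.json linked above.
tok = AutoTokenizer.from_pretrained("rednote-hilab/dots.llm1.inst")
print(type(tok).__name__)           # expected: Qwen2TokenizerFast
print(tok.encode("hello world"))    # plain BPE ids, no custom preprocessing
```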

So it's probably just a case of gluing all this together?
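
For whoever ends up writing the EXL3 definition, the config should expose both halves directly. A minimal inspection sketch; the key names are assumptions based on the DeepSeek-V3 and Qwen3 config schemas, so check them against the actual config.json:

```python
from transformers import AutoConfig

cfg = AutoConfig.from_pretrained("rednote-hilab/dots.llm1.inst", trust_remote_code=True)

# DeepSeek-V3-style MoE fields the definition would need to map (names assumed):
for key in ("n_routed_experts", "n_shared_experts", "num_experts_per_tok",
            "moe_intermediate_size", "first_k_dense_replace", "routed_scaling_factor"):
    print(key, getattr(cfg, key, "<missing>"))

# Qwen3-style attention fields (names assumed):
for key in ("num_attention_heads", "num_key_value_heads", "head_dim", "rope_theta"):
    print(key, getattr(cfg, key, "<missing>"))
```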

There are working HF (transformers) and llama.cpp patches in testing.

People are planning DeepSeek distills and other tunes with this model, since several base checkpoints have been released and it has more world knowledge than anything outside of models 5x+ its size. At 143B, dots.llm1 tunes could plausibly become the premier EXL3 models for people with dual GPUs who can't run any DeepSeek quant but want something better than Qwen3 32B.

And even if others don't, I will quantize and release EXL3 quants of this model if a working definition file gets created.
