# Qwen3

Qwen3 is the latest generation in the Qwen LLM family, designed for top-tier performance in coding, math, reasoning, and language tasks. It includes both dense and Mixture-of-Experts (MoE) models, offering flexible deployment from lightweight apps to large-scale research.

Qwen3 introduces dual reasoning modes, "thinking" for complex tasks and "non-thinking" for fast responses, giving users dynamic control over performance. It outperforms prior models in reasoning, instruction following, and code generation, while excelling in creative writing and dialogue.

With strong agentic and tool-use capabilities and support for over 100 languages, Qwen3 is optimized for multilingual, multi-domain applications.

---

## 📌 Characteristics

| Attribute             | Value             |
|-----------------------|-------------------|
| **Provider**          | Alibaba Cloud     |
| **Architecture**      | qwen3             |
| **Cutoff date**       | April 2025 (est.) |
| **Languages**         | 119 languages from multiple families (Indo-European, Sino-Tibetan, Afro-Asiatic, Austronesian, Dravidian, Turkic, Tai-Kadai, Uralic, Austroasiatic), including languages such as Japanese, Basque, and Haitian |
| **Tool calling**      | ✅                |
| **Input modalities**  | Text              |
| **Output modalities** | Text              |
| **License**           | Apache 2.0        |

---

## 📦 Available Model Variants

| Model Variant                               | Parameters | Quantization     | Context Length | VRAM    | Size    |
|---------------------------------------------|------------|------------------|----------------|---------|---------|
| `ai/qwen3:8B-F16`                           | 8.19B      | F16              | 40,960 tokens  | ~16GB¹  | 15.26GB |
| `ai/qwen3:8B-Q4_0`                          | 8.19B      | Q4_0             | 40,960 tokens  | ~4.5GB¹ | 4.44GB  |
| `ai/qwen3:8B-Q4_K_M` <br> `ai/qwen3:latest` | 8.19B      | IQ2_XXS / Q4_K_M | 40,960 tokens  | ~4.7GB¹ | 4.68GB  |

¹ Estimated VRAM requirements; actual usage may vary depending on system configuration and inference backend.

> `:latest` → `8B-Q4_K_M`
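To work with a specific quantization rather than the default, pull and run it by tag. The commands below are a short sketch using the tags from the table above; see the Docker Model Runner section further down for setup details.

```bash
# Pull the full-precision build (roughly 16 GB of VRAM estimated)
docker model pull ai/qwen3:8B-F16

# Or pull the lighter 4-bit build for constrained hardware
docker model pull ai/qwen3:8B-Q4_0

# Run whichever variant you pulled by referencing its tag
docker model run ai/qwen3:8B-Q4_0
```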
|
---

## 🧠 Intended uses

Qwen3-8B is designed for a wide range of advanced natural language processing tasks:

- Part of a family spanning both **dense and Mixture-of-Experts (MoE)** architectures, with dense sizes including 0.6B, 1.7B, 4B, 8B, 14B, and 32B, plus larger MoE variants such as 30B-A3B and 235B-A22B.
- Enables **seamless switching between thinking and non-thinking modes**:
  - *Thinking mode*: optimized for complex logical reasoning, math, and code generation.
  - *Non-thinking mode*: tuned for efficient, general-purpose dialogue and chat.
- Offers **significant improvements in reasoning performance**, outperforming the previous QwQ (in thinking mode) and Qwen2.5-Instruct (in non-thinking mode) models on mathematics, code generation, and commonsense reasoning benchmarks.
- Delivers **superior human alignment**, excelling at creative writing, role-playing, multi-turn dialogue, and instruction following for immersive conversations.
- Provides strong **agent capabilities**, including integration with external tools and best-in-class performance in complex agent-based workflows across both thinking and non-thinking modes.
- Supports **100+ languages and dialects**, with robust multilingual instruction following and translation abilities.

---

## Considerations

- **Thinking Mode Switching**
  Qwen3 supports a soft-switch mechanism via `/think` and `/no_think` prompts (when `enable_thinking=True`), allowing dynamic control over the model's reasoning depth during multi-turn conversations; see the request sketch after this list.
- **Tool Calling with Qwen-Agent**
  For agentic tasks, use **Qwen-Agent**, which simplifies integration of external tools through built-in templates and parsers, minimizing the need for manual tool-call handling.

> **Note:** Qwen3 models use a new naming convention: post-trained models no longer carry the `-Instruct` suffix (e.g., `Qwen3-32B` replaces `Qwen2.5-32B-Instruct`), and base models now end with `-Base`.
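As a minimal sketch of the soft switch, the request below appends `/no_think` to the user message so the model answers directly without a long reasoning trace. It assumes Docker Model Runner's OpenAI-compatible API is reachable on the host at its default TCP port 12434; the exact base URL depends on your setup (see the Docker Model Runner section below).

```bash
# Assumes host-side TCP access to Docker Model Runner is enabled on port 12434;
# adjust the base URL for your environment.
curl -s http://localhost:12434/engines/llama.cpp/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
    "model": "ai/qwen3",
    "messages": [
      {"role": "user", "content": "Summarize the benefits of container isolation in two sentences. /no_think"}
    ]
  }'
```

Swap in `/think` (or omit the tag, since thinking is enabled by default) when a task benefits from the full reasoning trace.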

---

## 🐳 Using this model with Docker Model Runner

First, pull the model:

```bash
docker model pull ai/qwen3
```

Then run the model:

```bash
docker model run ai/qwen3
```

For more information, check out the [Docker Model Runner docs](https://docs.docker.com/desktop/features/model-runner/).
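Beyond the interactive CLI, the model can also be called programmatically through the Model Runner's OpenAI-compatible API, including Qwen3's tool calling. The sketch below is an illustration only: `get_weather` is a hypothetical function defined just for this example, and the endpoint again assumes host-side TCP access on port 12434.

```bash
# Hypothetical tool-calling request. get_weather is an illustrative function,
# not part of any real API; the endpoint assumes localhost:12434 as above.
curl -s http://localhost:12434/engines/llama.cpp/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
    "model": "ai/qwen3",
    "messages": [
      {"role": "user", "content": "What is the weather like in Paris today?"}
    ],
    "tools": [
      {
        "type": "function",
        "function": {
          "name": "get_weather",
          "description": "Get the current weather for a city",
          "parameters": {
            "type": "object",
            "properties": {
              "city": {"type": "string", "description": "City name"}
            },
            "required": ["city"]
          }
        }
      }
    ]
  }'
```

If the model decides to use the tool, the response carries a `tool_calls` entry with the function name and JSON arguments; your application runs the function and returns the result as a `tool`-role message. Qwen-Agent, mentioned in the Considerations above, automates this loop.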

---

## Benchmarks

| Category                    | Benchmark  | Qwen3 |
|-----------------------------|------------|-------|
| General Tasks               | MMLU       | 87.81 |
|                             | MMLU-Redux | 87.40 |
|                             | MMLU-Pro   | 68.18 |
|                             | SuperGPQA  | 44.06 |
|                             | BBH        | 88.87 |
| Mathematics & Science Tasks | GPQA       | 47.47 |
|                             | GSM8K      | 94.39 |
|                             | MATH       | 71.84 |
| Multilingual Tasks          | MGSM       | 83.53 |
|                             | MMMLU      | 86.70 |
|                             | INCLUDE    | 73.46 |
| Code Tasks                  | EvalPlus   | 77.60 |
|                             | MultiPL-E  | 65.94 |
|                             | MBPP       | 81.40 |
|                             | CRUX-O     | 79.00 |

---

## 🔗 Links

- [Qwen3: Think Deeper, Act Faster](https://qwenlm.github.io/blog/qwen3/)