
 ### Multi-Model AI Orchestration Platform

-[![Version](https://img.shields.io/badge/version-2.14.7-blue.svg)](https://www.npmjs.com/package/tachibot-mcp)
-[![Tools](https://img.shields.io/badge/tools-48_active-brightgreen.svg)](#-tool-ecosystem-48-tools)
+[![Version](https://img.shields.io/badge/version-2.15.0-blue.svg)](https://www.npmjs.com/package/tachibot-mcp)
+[![Tools](https://img.shields.io/badge/tools-51_active-brightgreen.svg)](#-tool-ecosystem-51-tools)
 [![License](https://img.shields.io/badge/license-AGPL--3.0-green.svg)](LICENSE)
 [![Node](https://img.shields.io/badge/node-%3E%3D18.0.0-brightgreen.svg)](https://nodejs.org)
 [![MCP](https://img.shields.io/badge/MCP-Compatible-purple.svg)](https://modelcontextprotocol.io)

-**48 AI tools. 7 providers. One protocol.**
+**51 AI tools. 7 providers. One protocol.**

-Orchestrate Perplexity, Grok, GPT-5, Gemini, Qwen, Kimi K2.5, and MiniMax M2.1
+Orchestrate Perplexity, Grok, GPT-5, Gemini, Qwen, Kimi K2.5, and MiniMax M2.5
 from Claude Code, Claude Desktop, Cursor, or any MCP client.

-[Get Started](#-quick-start) &middot; [View Tools](#-tool-ecosystem-48-tools) &middot; [Documentation](https://tachibot.com/docs)
+[Get Started](#-quick-start) &middot; [View Tools](#-tool-ecosystem-51-tools) &middot; [Documentation](https://tachibot.com/docs)

 <br>

@@ -28,57 +28,59 @@ from Claude Code, Claude Desktop, Cursor, or any MCP client.

 ---

-## What's New in v2.14.7
+## What's New in v2.15.0

-### Gemini Judge &amp; Jury System
-- **`gemini_judge`** &mdash; Science-backed LLM-as-a-Judge (arXiv:2411.15594). 4 modes: synthesize, evaluate, rank, resolve
-- **`jury`** &mdash; Multi-model jury panel. Configurable jurors (grok, openai, qwen, kimi, perplexity, minimax) run in parallel, Gemini synthesizes the verdict. Based on "Replacing Judges with Juries" (Cohere, arXiv:2404.18796)
-
-### Perplexity Model Fixes
-- Fixed `sonar-pro` model ID (was accidentally using lightweight `sonar`)
-- `perplexity_research` now uses **`sonar-deep-research`** &mdash; exhaustive multi-source reports in a single call
+### `/blueprint` Skill &mdash; Multi-Model Implementation Planning
+New skill that creates bite-sized TDD implementation plans using a 7-step multi-model council:
+```
+/blueprint add OAuth with refresh tokens
+```
+Pipeline: Grok search → Qwen+Kimi analysis → Kimi decompose → GPT pre-mortem critique → Gemini final judgment → **bite-sized TDD output** (exact files, test-first steps, commit points).

-### Qwen3-Coder-Next
-`qwen_coder` now runs on **Qwen3-Coder-Next** (Feb 2026) &mdash; purpose-built for agentic coding:
+Bridges `planner_maker`'s multi-model intelligence with the `writing-plans` execution format.

-| | Before (qwen3-coder) | After (qwen3-coder-next) |
-|---|---|---|
-| **Params** | 480B / ~35B active | 80B / 3B active |
-| **Context** | 131K | 262K |
-| **SWE-Bench** | 69.6% | >70% |
-| **Pricing** | $0.22/$0.88 per M | $0.07/$0.30 per M |
+### 31 Prompt Engineering Techniques (was 22)
+Added 9 research-backed techniques for coding and decision-making:

-3x cheaper, 2x context, better benchmarks. Falls back to legacy 480B on provider failure.
+| Technique | Source | Category |
+|-----------|--------|----------|
+| `reflexion` | Shinn et al. 2023 | Engineering |
+| `react` (ReAct) | Yao et al. 2022 | Engineering |
+| `rubber_duck` | Hunt & Thomas 2008 | Engineering |
+| `test_driven` | Beck 2003 | Engineering |
+| `scot` (Structured CoT) | Li et al. 2025 (+13.79% HumanEval) | Structured Coding |
+| `pre_post` (Contracts) | Empirical SE 2025 | Structured Coding |
+| `bdd_spec` (Given/When/Then) | BDD 2025 | Structured Coding |
+| `least_to_most` | Zhou et al. 2022 | Research |
+| `pre_mortem` | Klein 2007 | Decision |

-### Kimi K2.5 Suite (4 tools)
-| Tool | Capability | Highlight |
-|------|-----------|-----------|
-| `kimi_thinking` | Step-by-step reasoning | Agent Swarm architecture |
-| `kimi_code` | Code generation & fixing | SWE-Bench 76.8% |
-| `kimi_decompose` | Task decomposition | Dependency graphs, parallel subtasks |
-| `kimi_long_context` | Document analysis | 256K context window |
+Techniques are embedded directly in tool system prompts for automatic application.

-### MiniMax M2.1 (2 tools)
-- `minimax_code` &mdash; SWE tasks at very low cost (72.5% SWE-Bench)
-- `minimax_agent` &mdash; Agentic workflows (77.2% &tau;&sup2;-Bench)
+### MiniMax M2.5 Upgrade
+- `minimax_code` &mdash; SWE-Bench **80.2%**, per-task TECHNIQUE tags (SCoT, reflexion, rubber_duck), per-task temperatures
+- `minimax_agent` &mdash; ReAct + least-to-most decomposition protocol, HALT criteria

-### Qwen Reasoning
-- `qwen_reason` &mdash; Heavy reasoning with Qwen3-Max-Thinking (>1T params, 98% HMMT math)
+### Enhanced Skills
+- `/breakdown` &mdash; now uses `least_to_most` ordering + `pre_mortem` failure analysis
+- `/judge` &mdash; first judge now runs pre-mortem ("assume this FAILED")
+- `/decompose` &mdash; deep-dives include pre/post contracts per sub-problem
+- `/prompt` &mdash; auto-recommend flow with 30-intent matching guide, 13 categories

 ---

 ## Skills (Claude Code)

-TachiBot ships with 8 slash commands for Claude Code. These orchestrate the tools into powerful workflows:
+TachiBot ships with 9 slash commands for Claude Code. These orchestrate the tools into powerful workflows:

 | Skill | What it does | Example |
 |-------|-------------|---------|
+| `/blueprint` | Multi-model planning → bite-sized TDD steps | `/blueprint add OAuth with refresh tokens` |
 | `/judge` | Multi-model council - parallel analysis with synthesis | `/judge how to implement rate limiting` |
 | `/think` | Sequential reasoning chain with any model | `/think grok,gemini design a cache layer` |
 | `/focus` | Mode-based reasoning (debate, research, analyze) | `/focus architecture-debate Redis vs Pg` |
-| `/breakdown` | Strategic decomposition with feasibility check | `/breakdown add OAuth with refresh tokens` |
+| `/breakdown` | Strategic decomposition with pre-mortem | `/breakdown refactor payment module` |
 | `/decompose` | Split into sub-problems, deep-dive each one | `/decompose implement collaborative editor` |
-| `/prompt` | Pick the right thinking technique for your problem | `/prompt why do users churn` |
+| `/prompt` | Recommend the right thinking technique (31 available) | `/prompt why do users churn` |
 | `/algo` | Algorithm analysis with 3 specialized models | `/algo optimize LRU cache O(1)` |
 | `/tachi` | Help - see available skills, tools, key status | `/tachi` |

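The `/blueprint` pipeline introduced in this hunk (search, parallel analysis, decomposition, pre-mortem critique, final judgment) could be sketched roughly as below. This is an illustration only: `call_model`, `blueprint`, and the prompt strings are hypothetical stand-ins rather than TachiBot's actual API, and the stub returns canned text instead of contacting a provider.

```python
import asyncio


async def call_model(model: str, prompt: str) -> str:
    """Hypothetical stand-in for a provider call; returns a canned string."""
    await asyncio.sleep(0)  # yield control, as a real network call would
    return f"[{model}] {prompt}"


async def blueprint(task: str) -> dict:
    """Run the stages in the order the README lists them."""
    # Stage 1: search for background material.
    research = await call_model("grok", f"search: {task}")

    # Stage 2: two analyses run concurrently (the "Qwen+Kimi" step).
    qwen_view, kimi_view = await asyncio.gather(
        call_model("qwen", f"analyze: {research}"),
        call_model("kimi", f"analyze: {research}"),
    )

    # Stage 3: decompose the task using both analyses.
    steps = await call_model("kimi", f"decompose: {qwen_view} | {kimi_view}")

    # Stage 4: pre-mortem critique of the decomposed plan.
    critique = await call_model("gpt", f"pre-mortem: {steps}")

    # Stage 5: final judgment over the plan and its critique.
    verdict = await call_model("gemini", f"judge: {steps} || {critique}")

    return {
        "task": task,
        "stages": ["grok", "qwen+kimi", "kimi", "gpt", "gemini"],
        "plan": verdict,
    }


if __name__ == "__main__":
    result = asyncio.run(blueprint("add OAuth with refresh tokens"))
    print(result["plan"])
```

Only the second stage fans out; every other stage waits for the previous result, which matches the arrow notation in the README text. How the real skill formats its TDD output is not shown in this diff, so the sketch stops at the final judgment.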
@@ -91,26 +93,26 @@ Skills automatically adapt to your configured API keys. Even with just 1-2 provi
 ## Key Features

 ### Multi-Model Intelligence
-- **48 AI Tools** across 7 providers &mdash; Perplexity, Grok, GPT-5, Gemini, Qwen, Kimi, MiniMax
-- **Multi-Model Council** &mdash; planner_maker synthesizes plans from 5+ models
+- **51 AI Tools** across 7 providers &mdash; Perplexity, Grok, GPT-5, Gemini, Qwen, Kimi, MiniMax
+- **Multi-Model Council** &mdash; planner_maker synthesizes plans from 5+ models into bite-sized TDD steps
 - **Smart Routing** &mdash; Automatic model selection for optimal results
 - **OpenRouter Gateway** &mdash; Optional single API key for all providers

 ### Advanced Workflows
 - **YAML-Based Workflows** &mdash; Multi-step AI processes with dependency graphs
-- **Prompt Engineering** &mdash; 14 research-backed techniques built-in
+- **Prompt Engineering** &mdash; 31 research-backed techniques (including SCoT, ReAct, Reflexion)
 - **Verification Checkpoints** &mdash; 50% / 80% / 100% with automated quality scoring
 - **Parallel Execution** &mdash; Run multiple models simultaneously

 ### Tool Profiles
 | Profile | Tools | Best For |
 |---------|-------|----------|
 | **Minimal** | 12 | Quick tasks, low token budget |
-| **Research Power** | 30 | Deep investigation, multi-source |
-| **Code Focus** | 28 | Software development, SWE tasks |
-| **Balanced** | 38 | General-purpose, mixed workflows |
-| **Heavy Coding** (default) | 44 | Max code tools + agentic workflows |
-| **Full** | 50 | Everything enabled |
+| **Research Power** | 31 | Deep investigation, multi-source |
+| **Code Focus** | 29 | Software development, SWE tasks |
+| **Balanced** | 39 | General-purpose, mixed workflows |
+| **Heavy Coding** (default) | 45 | Max code tools + agentic workflows |
+| **Full** | 51 | Everything enabled |

 ### Developer Experience
 - **Claude Code** &mdash; First-class support
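The "YAML-Based Workflows" and "Parallel Execution" bullets in this hunk describe multi-step processes with dependency graphs. As a hedged illustration of that idea (the actual workflow schema is not shown in this diff), the sketch below takes a hypothetical mapping of step names to prerequisites and uses Python's standard `graphlib` module to group steps into batches whose members could run at the same time. The step names are invented for the example.

```python
from graphlib import TopologicalSorter

# Hypothetical workflow: each step maps to the steps it depends on.
workflow = {
    "search": set(),
    "analyze_a": {"search"},
    "analyze_b": {"search"},
    "decompose": {"analyze_a", "analyze_b"},
    "judge": {"decompose"},
}


def parallel_batches(graph: dict[str, set[str]]) -> list[list[str]]:
    """Group steps into batches; each step's prerequisites are in earlier batches."""
    sorter = TopologicalSorter(graph)
    sorter.prepare()  # raises graphlib.CycleError if the graph has a cycle
    batches = []
    while sorter.is_active():
        ready = sorted(sorter.get_ready())  # sorted for a stable order
        batches.append(ready)
        sorter.done(*ready)  # mark the whole batch as finished
    return batches


if __name__ == "__main__":
    for number, batch in enumerate(parallel_batches(workflow), start=1):
        print(number, batch)
```

Batching by readiness means the two analysis steps land in the same batch, while `decompose` waits until both are done. A real runner would dispatch each batch concurrently instead of printing it.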
@@ -172,19 +174,19 @@ See [Installation Guide](docs/INSTALLATION_BOTH.md) for detailed instructions.

 ---

-## Tool Ecosystem (48 Tools)
+## Tool Ecosystem (51 Tools)

 ### Research & Search (6)
 `perplexity_ask` &middot; `perplexity_research` &middot; `perplexity_reason` &middot; `grok_search` &middot; `openai_search` &middot; `gemini_search`

-### Reasoning & Planning (8)
-`grok_reason` &middot; `openai_reason` &middot; `qwen_reason` &middot; `kimi_thinking` &middot; `kimi_decompose` &middot; `planner_maker` &middot; `planner_runner` &middot; `list_plans`
+### Reasoning & Planning (9)
+`grok_reason` &middot; `openai_reason` &middot; `qwen_reason` &middot; `qwq_reason` &#183; `kimi_thinking` &middot; `kimi_decompose` &middot; `planner_maker` &middot; `planner_runner` &middot; `list_plans`

 ### Code Intelligence (8)
 `kimi_code` &middot; `grok_code` &middot; `grok_debug` &middot; `qwen_coder` &middot; `qwen_algo` &middot; `qwen_competitive` &middot; `minimax_code` &middot; `minimax_agent`

-### Analysis & Brainstorming (9)
-`gemini_analyze_text` &middot; `gemini_analyze_code` &middot; `gemini_brainstorm` &middot; `openai_brainstorm` &middot; `openai_code_review` &middot; `openai_explain` &middot; `grok_brainstorm` &middot; `grok_architect` &middot; `kimi_long_context`
+### Analysis & Judgment (11)
+`gemini_analyze_text` &middot; `gemini_analyze_code` &middot; `gemini_judge` &#183; `jury` &#183; `gemini_brainstorm` &middot; `openai_brainstorm` &middot; `openai_code_review` &middot; `openai_explain` &middot; `grok_brainstorm` &middot; `grok_architect` &middot; `kimi_long_context`

 ### Meta & Orchestration (5)
 `think` &middot; `nextThought` &middot; `focus` &middot; `tachi` &middot; `usage_stats`
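The category headings in this hunk carry tool counts (for example `(9)` and `(11)`) that have to match the tool names listed beneath them. A small check along these lines could catch a mismatch before a release. It is a sketch that assumes each category is a `### Name (N)` heading followed by one line of backticked tool names, which is how the sections above are laid out; the function name `check_counts` is made up for this example.

```python
import re

HEADING = re.compile(r"^### (?P<name>.+?) \((?P<count>\d+)\)\s*$")
TOOL = re.compile(r"`([^`]+)`")


def check_counts(markdown: str) -> dict[str, tuple[int, int]]:
    """Return {category: (declared, listed)} for every mismatching category."""
    mismatches = {}
    lines = markdown.splitlines()
    for index, line in enumerate(lines):
        match = HEADING.match(line)
        if not match:
            continue
        declared = int(match["count"])
        # Assumption: the tool list is the first non-empty line after the heading.
        listing = next((l for l in lines[index + 1:] if l.strip()), "")
        listed = len(TOOL.findall(listing))
        if declared != listed:
            mismatches[match["name"]] = (declared, listed)
    return mismatches


sample = """\
### Reasoning & Planning (9)
`grok_reason` · `openai_reason` · `qwen_reason` · `qwq_reason` · `kimi_thinking` · `kimi_decompose` · `planner_maker` · `planner_runner` · `list_plans`

### Meta & Orchestration (5)
`think` · `nextThought` · `focus` · `tachi` · `usage_stats`
"""

if __name__ == "__main__":
    print(check_counts(sample) or "all counts match")
```

Headings without a trailing `(N)`, such as `### 31 Prompt Engineering Techniques (was 22)`, do not match the pattern and are skipped.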