Native macOS LLM server with MCP support. Run local and remote language models on Apple Silicon with OpenAI-compatible APIs, tool calling, and a built-in plugin ecosystem.
Created by Dinoki Labs (dinoki.ai)
Documentation · Discord · Plugin Registry · Contributing
```shell
brew install --cask osaurus
```

Or download from Releases.
After installing, launch from Spotlight (⌘ Space → "osaurus") or run osaurus ui from the terminal.
Osaurus is an all-in-one LLM server for macOS. It combines:
- MLX Runtime — Optimized local inference for Apple Silicon using MLX
- Remote Providers — Connect to OpenAI, OpenRouter, Ollama, LM Studio, or any OpenAI-compatible API
- OpenAI, Anthropic & Ollama APIs — Drop-in compatible endpoints for existing tools
- MCP Server — Expose tools to AI agents via Model Context Protocol
- Remote MCP Providers — Connect to external MCP servers and aggregate their tools
- Plugin System — Extend functionality with community and custom tools
- Developer Tools — Built-in insights and server explorer for debugging
- Apple Foundation Models — Use the system model on macOS 26+ (Tahoe)
| Feature | Description |
|---|---|
| Local LLM Server | Run Llama, Qwen, Gemma, Mistral, and more locally |
| Remote Providers | OpenAI, OpenRouter, Ollama, LM Studio, or custom endpoints |
| OpenAI Compatible | /v1/chat/completions with streaming and tool calling |
| Anthropic Compatible | /messages endpoint for Claude Code and Anthropic SDK clients |
| MCP Server | Connect to Cursor, Claude Desktop, and other MCP clients |
| Remote MCP Providers | Aggregate tools from external MCP servers |
| Tools & Plugins | Browser automation, file system, git, web search, and more |
| Custom Themes | Create, import, and export themes with full color customization |
| Developer Tools | Request insights, API explorer, and live endpoint testing |
| Menu Bar Chat | Chat overlay with session history, context tracking (⌘;) |
| Model Manager | Download and manage models from Hugging Face |
Launch Osaurus from Spotlight or run:
```shell
osaurus serve
```

The server starts on port 1337 by default.
Add to your MCP client configuration (e.g., Cursor, Claude Desktop):
```json
{
  "mcpServers": {
    "osaurus": {
      "command": "osaurus",
      "args": ["mcp"]
    }
  }
}
```

Open the Management window (⌘ Shift M) → Providers → Add Provider.
Choose from presets (OpenAI, Ollama, LM Studio, OpenRouter) or configure a custom endpoint.
Run models locally with optimized Apple Silicon inference:
```shell
# Download a model
osaurus run llama-3.2-3b-instruct-4bit

# Use via API
curl http://127.0.0.1:1337/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{"model": "llama-3.2-3b-instruct-4bit", "messages": [{"role": "user", "content": "Hello!"}]}'
```

Connect to any OpenAI-compatible API to access cloud models alongside local ones.
Supported presets:
- OpenAI — GPT-4o, o1, and other OpenAI models
- OpenRouter — Access multiple providers through one API
- Ollama — Connect to a local or remote Ollama instance
- LM Studio — Use LM Studio as a backend
- Custom — Any OpenAI-compatible endpoint
Features:
- Secure API key storage (macOS Keychain)
- Custom headers for authentication
- Auto-connect on launch
- Connection health monitoring
See Remote Providers Guide for details.
Osaurus is a full MCP (Model Context Protocol) server. Connect it to any MCP client to give AI agents access to your installed tools.
| Endpoint | Description |
|---|---|
| `GET /mcp/health` | Check MCP availability |
| `GET /mcp/tools` | List active tools |
| `POST /mcp/call` | Execute a tool |
Connect to external MCP servers and aggregate their tools into Osaurus:
- Discover and register tools from remote MCP endpoints
- Configurable timeouts and streaming
- Tools are namespaced by provider (e.g., `provider_toolname`)
- Secure token storage
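The namespacing rule above amounts to a prefix join; a minimal sketch (the provider and tool names here are hypothetical):

```python
def namespaced(provider: str, tool: str) -> str:
    # Remote tools are exposed as provider_toolname to avoid name collisions
    return f"{provider}_{tool}"

print(namespaced("github", "search_issues"))  # → github_search_issues
```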
See Remote MCP Providers Guide for details.
Install tools from the central registry or create your own.
Official System Tools:
| Plugin | Tools |
|---|---|
| `osaurus.filesystem` | read_file, write_file, list_directory, search_files, and more |
| `osaurus.browser` | browser_navigate, browser_click, browser_type, browser_screenshot |
| `osaurus.git` | git_status, git_log, git_diff, git_branch |
| `osaurus.search` | search, search_news, search_images (DuckDuckGo) |
| `osaurus.fetch` | fetch, fetch_json, fetch_html, download |
| `osaurus.time` | current_time, format_date |
```shell
# Install from registry
osaurus tools install osaurus.browser

# List installed tools
osaurus tools list

# Create your own plugin
osaurus tools create MyPlugin --language swift
```

See the Plugin Authoring Guide for details.
Built-in tools for debugging and development:
Insights — Monitor all API requests in real-time:
- Request/response logging with full payloads
- Filter by method (GET/POST) and source (Chat UI/HTTP API)
- Performance stats: success rate, average latency, errors
- Inference metrics: tokens, speed (tok/s), model used
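The speed figure in the inference metrics is throughput over wall-clock time; a trivial sketch with illustrative numbers:

```python
def tokens_per_second(generated_tokens: int, elapsed_seconds: float) -> float:
    """Throughput as displayed in Insights (tok/s)."""
    return generated_tokens / elapsed_seconds

print(tokens_per_second(128, 4.0))  # → 32.0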
Server Explorer — Interactive API reference:
- Live server status and health
- Browse all available endpoints
- Test endpoints directly with editable payloads
- View formatted responses
Access via Management window (⌘ Shift M) → Insights or Server.
See Developer Tools Guide for details.
| Command | Description |
|---|---|
| `osaurus serve` | Start the server (default port 1337) |
| `osaurus serve --expose` | Start exposed on LAN |
| `osaurus stop` | Stop the server |
| `osaurus status` | Check server status |
| `osaurus ui` | Open the menu bar UI |
| `osaurus list` | List downloaded models |
| `osaurus run <model>` | Interactive chat with a model |
| `osaurus mcp` | Start MCP stdio transport |
| `osaurus tools <cmd>` | Manage plugins (install, list, search, etc.) |
Tip: Set OSU_PORT to override the default port.
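A client-side helper that honors the same override is a reasonable pattern; this sketch assumes you want your scripts to follow `OSU_PORT` too (the helper itself is hypothetical, not part of Osaurus):

```python
import os

def base_url() -> str:
    # Default Osaurus port is 1337; OSU_PORT overrides it
    port = os.environ.get("OSU_PORT", "1337")
    return f"http://127.0.0.1:{port}"

print(base_url())
```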
Base URL: http://127.0.0.1:1337 (or your configured port)
| Endpoint | Description |
|---|---|
| `GET /health` | Server health |
| `GET /v1/models` | List models (OpenAI format) |
| `GET /v1/tags` | List models (Ollama format) |
| `POST /v1/chat/completions` | Chat completions (OpenAI format) |
| `POST /messages` | Chat completions (Anthropic format) |
| `POST /chat` | Chat (Ollama format, NDJSON) |
All endpoints support /v1, /api, and /v1/api prefixes.
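Because the three prefixes route to the same handlers, these URLs are interchangeable:

```python
base = "http://127.0.0.1:1337"
# /v1, /api, and /v1/api all reach the same chat-completions handler
urls = [f"{base}/{prefix}/chat/completions" for prefix in ("v1", "api", "v1/api")]
for url in urls:
    print(url)
```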
See the OpenAI API Guide for tool calling, streaming, and SDK examples.
Point any OpenAI-compatible client at Osaurus:
```python
from openai import OpenAI

client = OpenAI(base_url="http://127.0.0.1:1337/v1", api_key="osaurus")

response = client.chat.completions.create(
    model="llama-3.2-3b-instruct-4bit",
    messages=[{"role": "user", "content": "Hello!"}]
)
print(response.choices[0].message.content)
```

- macOS 15.5+ (Apple Foundation Models require macOS 26)
- Apple Silicon (M1 or newer)
- Xcode 16.4+ (to build from source)
Models are stored at ~/MLXModels by default. Override with OSU_MODELS_DIR.
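The resolution order described above (default `~/MLXModels`, `OSU_MODELS_DIR` wins when set) can be sketched as:

```python
import os
from pathlib import Path

def models_dir() -> Path:
    # OSU_MODELS_DIR overrides the default ~/MLXModels location
    override = os.environ.get("OSU_MODELS_DIR")
    return Path(override) if override else Path.home() / "MLXModels"

print(models_dir())
```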
```shell
git clone https://github.com/dinoki-ai/osaurus.git
cd osaurus
open osaurus.xcworkspace
# Build and run the "osaurus" target
```

We're looking for contributors! Osaurus is actively developed and we welcome help in many areas:
- Bug fixes and performance improvements
- New plugins and tool integrations
- Documentation and tutorials
- UI/UX enhancements
- Testing and issue triage
- Check out Good First Issues
- Read the Contributing Guide
- Join our Discord to connect with the team
See docs/FEATURES.md for a complete feature inventory and architecture overview.
- Documentation — Guides and tutorials
- Discord — Chat with the community
- Plugin Registry — Browse and contribute tools
- Contributing Guide — How to contribute
If you find Osaurus useful, please star the repo and share it!
