How to connect SubAgent to an LLM provider — from the built-in adapter to custom integrations.
Add the dependencies:
```elixir
def deps do
  [
    {:ptc_runner, "~> 0.10.0"},
    {:req_llm, "~> 1.8"} # enables the built-in adapter
  ]
end
```

Set your API key and run:
```bash
export OPENROUTER_API_KEY=sk-or-...
```

```elixir
# Pass model alias directly - no callback needed!
{:ok, step} = PtcRunner.SubAgent.run("What is 2 + 2?", llm: "haiku")
step.return #=> 4
```

That's it. The built-in adapter handles text generation, structured output, tool calling, and prompt caching across providers.
SubAgent accepts model strings directly via the `llm:` option. Aliases are resolved
automatically through `PtcRunner.LLM.Registry`:
```elixir
# Using aliases (resolved to default provider)
{:ok, step} = SubAgent.run(agent, llm: "haiku")
{:ok, step} = SubAgent.run(agent, llm: "sonnet")

# Using provider:alias format
{:ok, step} = SubAgent.run(agent, llm: "bedrock:haiku")
{:ok, step} = SubAgent.run(agent, llm: "openrouter:sonnet")

# Using full model IDs (passthrough)
{:ok, step} = SubAgent.run(agent, llm: "openrouter:anthropic/claude-haiku-4.5")
```

| Alias | Description | Providers |
|---|---|---|
| `haiku` | Claude Haiku 4.5 - Fast, cost-effective | openrouter, bedrock, anthropic |
| `sonnet` | Claude Sonnet 4.5 - Balanced performance | openrouter, bedrock, anthropic |
| `gemini` | Gemini 2.5 Flash - Google's fast model | openrouter, google |
| `deepseek` | DeepSeek Chat V3 - Cost-effective reasoning | openrouter |
| `gpt` | GPT-4.1 Mini - OpenAI's efficient model | openrouter, openai |
| `qwen-local` | Qwen 2.5 Coder 7B - Local via Ollama | ollama |
Configure the default provider:

```elixir
# In config.exs
config :ptc_runner, :default_provider, :bedrock
```

```bash
# Or via environment variable
export LLM_DEFAULT_PROVIDER=bedrock
```

When you use an alias like `"haiku"`, it resolves using the default provider.
With bedrock as default, `"haiku"` becomes `"amazon_bedrock:anthropic.claude-haiku-4-5-20251001-v1:0"`.
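To see what an alias resolves to, you can call the default registry directly. A quick sketch; the result shown assumes `:bedrock` is the configured default provider, per the example above:

```elixir
# Resolve an alias through the default registry (with :bedrock as default provider)
PtcRunner.LLM.DefaultRegistry.resolve("haiku")
#=> {:ok, "amazon_bedrock:anthropic.claude-haiku-4-5-20251001-v1:0"}
```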
To add custom aliases or override the default registry, implement the
`PtcRunner.LLM.Registry` behaviour:
```elixir
defmodule MyApp.ModelRegistry do
  @behaviour PtcRunner.LLM.Registry

  @impl true
  def resolve("fast"), do: {:ok, "anthropic:claude-haiku-4-5-20251001"}
  def resolve("smart"), do: {:ok, "anthropic:claude-sonnet-4-5-20250929"}
  def resolve(name), do: PtcRunner.LLM.DefaultRegistry.resolve(name)

  @impl true
  def resolve!(name) do
    case resolve(name) do
      {:ok, model_id} -> model_id
      {:error, reason} -> raise ArgumentError, reason
    end
  end

  @impl true
  def validate(model_string) do
    case resolve(model_string) do
      {:ok, _} -> :ok
      {:error, reason} -> {:error, reason}
    end
  end

  # Delegate remaining callbacks to DefaultRegistry
  @impl true
  defdelegate default_model(), to: PtcRunner.LLM.DefaultRegistry
  @impl true
  defdelegate default_provider(), to: PtcRunner.LLM.DefaultRegistry
  @impl true
  defdelegate aliases(), to: PtcRunner.LLM.DefaultRegistry
  @impl true
  defdelegate list_models(), to: PtcRunner.LLM.DefaultRegistry
  @impl true
  defdelegate preset_models(provider), to: PtcRunner.LLM.DefaultRegistry
  @impl true
  defdelegate available_providers(), to: PtcRunner.LLM.DefaultRegistry
  @impl true
  defdelegate provider_from_model(model), to: PtcRunner.LLM.DefaultRegistry
end
```

Register it in your config:
```elixir
config :ptc_runner, :model_registry, MyApp.ModelRegistry
```

Now you can use your custom aliases:

```elixir
{:ok, step} = SubAgent.run(agent, llm: "fast")   # Your custom alias
{:ok, step} = SubAgent.run(agent, llm: "haiku")  # Still works via delegation
```

`PtcRunner.LLM.callback/2` creates a SubAgent-compatible callback using the built-in
`PtcRunner.LLM.ReqLLMAdapter`. It resolves aliases via `PtcRunner.LLM.Registry`
(e.g., `"haiku"` → `"openrouter:anthropic/claude-haiku-4.5"`), so you can pass
aliases directly. Already-resolved `provider:model` strings pass through unchanged.
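For example, with openrouter as the default provider, the alias form and the fully qualified form below resolve to the same model:

```elixir
# Alias, resolved via PtcRunner.LLM.Registry
llm = PtcRunner.LLM.callback("haiku")

# Fully qualified provider:model string, passed through unchanged
llm = PtcRunner.LLM.callback("openrouter:anthropic/claude-haiku-4.5")

{:ok, step} = PtcRunner.SubAgent.run("Hello", llm: llm)
```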
Supported provider prefixes:

| Prefix | Provider | API Key Env Var |
|---|---|---|
| `openrouter:` | OpenRouter | `OPENROUTER_API_KEY` |
| `anthropic:` | Anthropic direct | `ANTHROPIC_API_KEY` |
| `bedrock:` | AWS Bedrock | `AWS_ACCESS_KEY_ID` |
| `google:` | Google Gemini | `GOOGLE_API_KEY` |
| `openai:` | OpenAI | `OPENAI_API_KEY` |
| `groq:` | Groq | `GROQ_API_KEY` |
| `ollama:` | Local Ollama | (none) |
| `openai-compat:` | Any OpenAI-compatible | (varies) |
```elixir
# Cloud providers (use provider:model format)
PtcRunner.LLM.callback("openrouter:anthropic/claude-sonnet-4")
PtcRunner.LLM.callback("anthropic:claude-haiku-4-5-20251001")
PtcRunner.LLM.callback("amazon_bedrock:anthropic.claude-haiku-4-5-20251001-v1:0", cache: true)
PtcRunner.LLM.callback("google:gemini-2.5-flash")

# Local providers
PtcRunner.LLM.callback("ollama:deepseek-coder:6.7b")
PtcRunner.LLM.callback("openai-compat:http://localhost:1234/v1|my-model")
```

Pass `cache: true` to enable prompt caching on supported providers (Anthropic, Bedrock Claude, OpenRouter with Anthropic models):

```elixir
llm = PtcRunner.LLM.callback("anthropic:claude-haiku-4-5-20251001", cache: true)
```

For AWS Bedrock, the region is resolved in this order:

1. `AWS_REGION` environment variable
2. `config :ptc_runner, :bedrock_region, "us-east-1"`
3. Default: `"eu-north-1"`
Pass `on_chunk` to receive text chunks in real time:

```elixir
llm = PtcRunner.LLM.callback("openrouter:anthropic/claude-haiku-4.5")
on_chunk = fn %{delta: text} -> IO.write(text) end

{:ok, step} = PtcRunner.SubAgent.run(agent, llm: llm, on_chunk: on_chunk)
```

When the adapter supports `stream/2`, chunks arrive incrementally. Otherwise `on_chunk`
fires once with the full content (graceful degradation). For agents with tools, `on_chunk`
fires on the final text answer only — tool-calling turns are not streamed.
See `PtcRunner.LLM.callback/2` for details.
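To capture the streamed text instead of printing it, one minimal sketch accumulates deltas in an `Agent` (the catch-all clause is defensive, in case messages other than deltas are ever delivered):

```elixir
# Collect streamed deltas, then join them once the run completes
{:ok, collector} = Agent.start_link(fn -> [] end)

on_chunk = fn
  %{delta: text} -> Agent.update(collector, &[text | &1])
  _other -> :ok
end

{:ok, step} = PtcRunner.SubAgent.run(agent, llm: llm, on_chunk: on_chunk)
streamed = collector |> Agent.get(&Enum.reverse/1) |> Enum.join()
```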
SubAgent is provider-agnostic. Any function that accepts a request map and returns
`{:ok, content}` or `{:ok, %{content: ..., tokens: ...}}` works:

```elixir
llm = fn %{system: system, messages: messages} ->
  # Call your provider here
  {:ok, "response text"}
end

{:ok, step} = PtcRunner.SubAgent.run("Hello", llm: llm)
```

The request map contains:
| Key | Type | Description |
|---|---|---|
| `system` | `String.t()` | System prompt (include in messages sent to LLM) |
| `messages` | `[map()]` | Conversation history |
| `schema` | `map() \| nil` | JSON Schema for structured output |
| `tools` | `[map()] \| nil` | Tool definitions for tool calling |
| `cache` | `boolean()` | Prompt caching hint |
| `turn` | `integer()` | Current turn number |
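Concretely, a request map delivered to your callback might look like this (a representative sketch built from the fields above, not captured output):

```elixir
%{
  system: "You are a helpful assistant.",
  messages: [%{role: :user, content: "What is 2 + 2?"}],
  schema: nil,   # set when the agent expects structured output
  tools: nil,    # set when the agent has tools
  cache: false,
  turn: 1
}
```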
The return value shape depends on what the agent needs:

```elixir
# Minimal — text only
{:ok, "response text"}

# With token tracking
{:ok, %{content: "response text", tokens: %{input: 100, output: 50}}}

# With tool calls (when tools are in the request)
{:ok, %{tool_calls: [%{name: "search", args: %{"q" => "test"}}], content: nil, tokens: %{}}}
```

For reuse across your application, implement the `PtcRunner.LLM` behaviour:
```elixir
defmodule MyApp.LLMAdapter do
  @behaviour PtcRunner.LLM

  @impl true
  def call(model, request) do
    messages = [%{role: :system, content: request.system} | request.messages]
    # ... call your provider, return {:ok, %{content: ..., tokens: ...}}
  end

  # Optional — enables streaming via on_chunk
  @impl true
  def stream(model, request) do
    # Return {:ok, stream} where stream emits %{delta: text} and %{done: true, tokens: map()}
    # Or {:error, :streaming_not_supported} to fall back to call/2
  end
end
```

Register it globally:

```elixir
# config/config.exs
config :ptc_runner, :llm_adapter, MyApp.LLMAdapter
```

Then use `PtcRunner.LLM.callback/2` as normal — it delegates to your adapter:

```elixir
llm = PtcRunner.LLM.callback("my-model-name", cache: true)
```

The callback interface makes it straightforward to wrap any LLM library.
Call any OpenAI-compatible API with Req:
```elixir
llm = fn %{system: system, messages: messages} ->
  body = %{
    model: "gpt-4.1-mini",
    messages: [%{role: "system", content: system} | messages]
  }

  case Req.post!("https://api.openai.com/v1/chat/completions",
         json: body,
         headers: [{"authorization", "Bearer #{System.get_env("OPENAI_API_KEY")}"}]
       ) do
    %{status: 200, body: %{"choices" => [%{"message" => %{"content" => text}} | _]}} ->
      {:ok, text}

    %{body: body} ->
      {:error, body}
  end
end
```

Wrap LangChain chains:
```elixir
llm = fn %{system: system, messages: messages} ->
  {:ok, chain} =
    LangChain.Chains.LLMChain.new(%{
      llm: LangChain.ChatModels.ChatOpenAI.new!(%{model: "gpt-4.1-mini"})
    })

  all_messages =
    [LangChain.Message.new_system!(system)] ++
      Enum.map(messages, fn
        %{role: :user, content: c} -> LangChain.Message.new_user!(c)
        %{role: :assistant, content: c} -> LangChain.Message.new_assistant!(c)
      end)

  case LangChain.Chains.LLMChain.run(chain, %{messages: all_messages}) do
    {:ok, _chain, %LangChain.Message{content: content}} ->
      {:ok, content}

    {:error, reason} ->
      {:error, reason}
  end
end
```

Run models locally with Bumblebee:
```elixir
# Build the serving and start it under your application supervisor, e.g.
# children = [{Nx.Serving, serving: serving, name: MyApp.LLMServing}]
serving = Bumblebee.Text.generation(model_info, tokenizer, generation_config)

llm = fn %{system: system, messages: messages} ->
  # format_chat_prompt/2 is your own prompt-template helper
  prompt = format_chat_prompt(system, messages)

  case Nx.Serving.batched_run(MyApp.LLMServing, prompt) do
    %{results: [%{text: text}]} -> {:ok, text}
    error -> {:error, error}
  end
end
```

Instructor specializes in structured output, which pairs well with text-mode SubAgents:
```elixir
defmodule MyApp.InstructorAdapter do
  @behaviour PtcRunner.LLM

  @impl true
  def call(model, %{schema: schema} = req) when is_map(schema) do
    messages = [%{role: "system", content: req.system} | req.messages]

    case Instructor.chat_completion(model: model, messages: messages, response_model: schema) do
      {:ok, result} ->
        {:ok, %{content: Jason.encode!(result), tokens: %{}}}

      {:error, reason} ->
        {:error, reason}
    end
  end

  def call(model, req) do
    # Fall back to plain text generation for non-schema requests
    # ...
  end
end
```

When you call `PtcRunner.LLM.callback/2` or `PtcRunner.LLM.call/2`, the adapter is
resolved in this order:

1. `config :ptc_runner, :llm_adapter, MyApp.LLMAdapter` — explicit config
2. `PtcRunner.LLM.ReqLLMAdapter` — auto-discovered when `req_llm` is in deps
3. Raises with setup instructions if neither is available
This means adding `{:req_llm, "~> 1.8"}` to your deps is all you need — no config
required.
- Getting Started — First SubAgent walkthrough
- Structured Output Callbacks — Schema handling, tool calling, and provider-specific patterns
- `PtcRunner.LLM` — API reference
- `PtcRunner.LLM.ReqLLMAdapter` — Built-in adapter reference