"Self-awareness: the hardest problem isn't solving within limits, it's discovering one's own limitations"
中文版 | Installation | Environments | Agent | Experience | Training | Architecture | Evolution | Contributing |
AWorld (Agent World) builds intelligent agents and rich environments where they operate, pushing the frontiers of AI capabilities and enabling continuous evolution. This project provides the fundamental recipe for agentic learning: Environment Access, Agent Construction, Experience Retrieval, and Model Training. What makes AWorld powerful is that agents can use these same components to automatically improve themselves.
💡 Visit our homepage for more details, or try our online environments and agents.
> [!TIP]
> Python >= 3.11 is required.
git clone https://github.com/inclusionAI/AWorld && cd AWorld
pip install -e .

Provisioning rich environments is hard: packages conflict, APIs need keys, and concurrency must scale. We make it painless with three access modes:
- Use our default hosted setup (tools that incur usage costs include a limited free tier).
- Bring your own API keys for unrestricted access (coming soon).
- Pull our Docker images and run everything on your own infrastructure (coming soon).
import asyncio
import os

from aworld.sandbox import Sandbox

INVITATION_CODE = os.environ.get("INVITATION_CODE", "")

mcp_config = {
    "mcpServers": {
        "gaia_server": {
            "type": "streamable-http",
            "url": "https://playground.aworldagents.com/environments/mcp",
            "timeout": 600,
            "sse_read_timeout": 600,
            "headers": {
                "ENV_CODE": "gaia",
                "Authorization": f"Bearer {INVITATION_CODE}",
            },
        }
    }
}

async def _list_tools():
    # Connect the sandbox to the hosted GAIA environment and enumerate its tools.
    sand_box = Sandbox(mcp_config=mcp_config, mcp_servers=["gaia_server"])
    return await sand_box.mcpservers.list_tools()

if __name__ == "__main__":
    tools = asyncio.run(_list_tools())
    print(tools)

In AWorld, an agent is simply a model enhanced with tools. To spin one up, you only need:
- a model endpoint (for training, a vLLM service works great)
- an online environment to call (use our hosted options or plug in your own MCP toolchain)

That's it: no heavyweight scaffolding required.
from aworld.agents.llm_agent import Agent
from aworld.runner import Runners

# refer to the section above for details
mcp_config = {...}

searcher = Agent(
    name="Search Agent",
    system_prompt="You specialize in searching.",
    mcp_config=mcp_config
)

if __name__ == "__main__":
    result = Runners.sync_run(
        input="Use the google search tool to answer the question: what is the news about AI today?",
        agent=searcher
    )
    print(f"answer: {result.answer}")

Remember to plug in your LLM credentials first.
# Set LLM credentials
export LLM_MODEL_NAME="gpt-4"
export LLM_API_KEY="your-api-key-here"
export LLM_BASE_URL="https://api.openai.com/v1"

Real-world problems often need more than a single agent. AWorld gives you flexible build paths:
- design automated workflows end to end Docs
- spin up MCP-enabled agents Docs
- orchestrate multi-agent systems (MAS) Docs
Want to see it live? Load a pre-built DeepResearch team in the AWorld Playground, inspect the source, and run it end to end.
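For a taste of what multi-agent orchestration looks like in code, here is a minimal two-agent sketch. The `Swarm` import path and the `swarm` argument to `Runners.sync_run` are assumptions based on the swarm concept in the Architecture section below; consult the MAS docs above for the exact API.

```python
from aworld.agents.llm_agent import Agent
from aworld.core.agent.swarm import Swarm  # assumption: see the MAS docs for the real path
from aworld.runner import Runners

# refer to the section above for details
mcp_config = {...}

# Two role-specific agents: one gathers evidence, one writes the answer.
researcher = Agent(
    name="Researcher",
    system_prompt="You search the web and collect relevant facts.",
    mcp_config=mcp_config,
)
reporter = Agent(
    name="Reporter",
    system_prompt="You turn collected facts into a concise, sourced answer.",
)

if __name__ == "__main__":
    # A Swarm wires the agents into a simple hand-off topology.
    team = Swarm(researcher, reporter)
    result = Runners.sync_run(
        input="Summarize today's AI news with sources.",
        swarm=team,
    )
    print(result.answer)
```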

Our runtime captures every step across offline and online runs. Each task yields a complete trajectory—every LLM call, action, and reward—so you can synthesize training samples, audit performance, and iterate with confidence.
A single task typically unfolds over many LLM calls; the snippet below runs a task to completion and retrieves its full trajectory.
import asyncio
import json

from aworld.agents.llm_agent import Agent
from aworld.core.task import Task
from aworld.logs.util import logger
from aworld.runner import Runners

# refer to the section above for agent construction
searcher = Agent(...)

if __name__ == "__main__":
    async def test_complete_trajectory():
        task = Task(
            input="Use the google search tool to answer the question: what is the news about AI today?",
            agent=searcher
        )
        responses = await Runners.run_task(task)
        resp = responses[task.id]
        logger.info(f"task answer: {resp.answer}")
        logger.info(f"task trajectory: {json.dumps(resp.trajectory, ensure_ascii=False)}")

    asyncio.run(test_complete_trajectory())

Need finer control? Call step() to inspect one action/response pair at a time. This lets you inject intermediate rewards during training, enabling richer, more flexible learning signals.
import asyncio
import json
import os

from aworld.agents.llm_agent import Agent
from aworld.config import TaskConfig, TaskRunMode
from aworld.core.task import Task
from aworld.runner import Runners

# refer to the section above for agent construction
searcher = Agent(...)

if __name__ == "__main__":
    async def test_single_step_introspection():
        task = Task(
            input="Use the google search tool to answer the question: what is the news about AI today?",
            agent=searcher,
            conf=TaskConfig(
                resp_carry_context=True,
                run_mode=TaskRunMode.INTERACTIVE
            )
        )
        trajectory_log = os.path.join(os.path.dirname(__file__), "trajectory_log.txt")
        is_finished = False
        step = 1
        with open(trajectory_log, "a", encoding="utf-8") as traj_file:
            while not is_finished:
                # Advance the task by exactly one action/response pair.
                is_finished, observation, response = await Runners.step(task)
                # Inspect (or score) the intermediate step here, e.g. to attach
                # an intermediate reward during training.
                traj_file.write(f"Step {step}\n")
                traj_file.write(json.dumps(response.trajectory, ensure_ascii=False, indent=2))
                traj_file.write("\n\n")
                step += 1

    asyncio.run(test_single_step_introspection())

Once agents can roam across environments, AWorld closes the loop with two complementary training modes that drive continuous improvement.
Plug any mainstream LLM trainer—AReal, Swift, Verl, Slime, etc.—into the runtime to update model parameters directly. Adapters are lightweight, so you can reuse the same environment and agent code across trainers.
from datasets import load_dataset

from aworld.agents.llm_agent import Agent
from aworld.config import AgentConfig
from train.trainer.agent_trainer import AgentTrainer
from train.examples.train_gaia_with_aworld_verl.metrics.gaia_reward_function import gaia_reward_func

# refer to the section above for details
mcp_config = {...}

# Configure the agent to use Verl as the model service (adapts the inference format automatically)
agent_config = AgentConfig(
    llm_provider="verl"
)
searcher = Agent(
    name="Search Agent",
    system_prompt="You specialize in searching.",
    mcp_config=mcp_config,
    conf=agent_config
)

# fill in your own dataset paths
train_dataset = load_dataset("", split="train")
test_dataset = load_dataset("", split="test")

trainer = AgentTrainer(
    agent=searcher,
    config=custom_train_config,  # your training config; see the linked example below
    reward_func=gaia_reward_func,
    train_dataset=train_dataset,
    test_dataset=test_dataset
)
trainer.train()

💡 Check out the real-world case, which includes the full training config needed to run agentic training.
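If you want to score rollouts yourself, the reward function is just a callable the trainer applies to each sample. Below is a minimal exact-match sketch written against a verl-style signature; the signature is an assumption, so check `gaia_reward_function` in the linked example for the actual contract `AgentTrainer` expects.

```python
# Hypothetical exact-match reward, sketched against a verl-style signature.
# Assumption: the trainer calls reward_func(data_source, solution_str, ground_truth, extra_info)
# and expects a float; verify against gaia_reward_function in the linked example.
def exact_match_reward(data_source, solution_str, ground_truth, extra_info=None) -> float:
    # Normalize whitespace and case before comparing prediction to gold answer.
    pred = " ".join(solution_str.strip().lower().split())
    gold = " ".join(str(ground_truth).strip().lower().split())
    return 1.0 if pred == gold else 0.0
```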
Beyond weights, you can meta-learn whole agent systems. Spin up role-specific agents that critique, rewrite prompts, refine workflows, or adjust strategies for a target agent, then iterate the team (e.g., our Gaia demo).
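To make that concrete, here is a minimal, hypothetical prompt-refinement loop built only from the `Agent` and `Runners` primitives shown above. Reassigning `system_prompt` between runs is illustrative, not a documented AWorld API; a real setup would persist and evaluate each candidate prompt.

```python
from aworld.agents.llm_agent import Agent
from aworld.runner import Runners

# Hypothetical sketch: a critic agent iteratively rewrites the target agent's prompt.
target = Agent(name="Solver", system_prompt="You answer research questions.")
critic = Agent(
    name="Prompt Critic",
    system_prompt="Given a task, an answer, and the current system prompt, "
                  "rewrite the system prompt to fix the weaknesses you observe. "
                  "Reply with the new system prompt only.",
)

task = "What are the open problems in multi-turn function calling?"
for round_idx in range(3):
    # Let the target attempt the task with its current prompt.
    attempt = Runners.sync_run(input=task, agent=target)
    # Ask the critic for an improved prompt based on the attempt.
    review = Runners.sync_run(
        input=f"Task: {task}\nAnswer: {attempt.answer}\nCurrent prompt: {target.system_prompt}",
        agent=critic,
    )
    # Illustrative only: assumes system_prompt can be reassigned between runs.
    target.system_prompt = review.answer
```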
This framework is engineered to be highly adaptable, enabling researchers and developers to explore and innovate across multiple domains, thereby advancing the capabilities and applications of multi-agent systems.
| Concepts | Description |
|---|---|
| `agent` | Defines the foundational classes, descriptions, output parsing, and multi-agent collaboration (swarm) logic for defining, managing, and orchestrating agents in the AWorld system. |
| `runner` | Contains runner classes that manage the execution loop for agents in environments, handling episode rollouts and parallel training/evaluation workflows. |
| `task` | Defines the base `Task` class that encapsulates environment objectives, necessary tools, and termination conditions for agent interactions. |
| `swarm` | Implements the `SwarmAgent` class, which manages multi-agent coordination and emergent group behaviors through decentralized policies. |
| `sandbox` | Provides a controlled runtime with configurable scenarios for rapid prototyping and validation of agent behaviors. |
| `tools` | Offers a flexible framework for defining, adapting, and executing tools for agent-environment interaction in the AWorld system. |
| `context` | Features a comprehensive context management system for AWorld agents, enabling complete state tracking, configuration management, prompt optimization, multi-task state handling, and dynamic prompt templating throughout the agent lifecycle. |
| `memory` | Implements an extensible memory system for agents, supporting short-term and long-term memory, summarization, retrieval, embeddings, and integration. |
| `trace` | Features an observable tracing framework for AWorld, enabling distributed tracing, context propagation, span management, and integration with popular frameworks and protocols to monitor and analyze agent, tool, and task execution. |
| Agent Construction | Topology Orchestration | Environment |
|---|---|---|
| ✅ Integrated MCP services | ✅ Encapsulated runtime | ✅ Runtime state management |
| ✅ Multi-model providers | ✅ Flexible MAS patterns | ✅ High-concurrency support |
| ✅ Customization options | ✅ Clear state tracing | ✅ Distributed training |
| ✅ Agent Skills support 🚀 | | |
Our mission: AWorld handles the complexity, you focus on innovation. This section showcases cutting-edge multi-agent systems built with AWorld, advancing toward AGI.
- FunReason-MT Technical Report: Overcoming the Complexity Barrier in Multi-Turn Function Calling. arXiv, 2025. paper, code, model, dataset
  Zengzhuang Xu, Bingguang Hao, Zechuan Wang, Yuntao Wen, Maolin Wang, et al.
- AWorld: Orchestrating the Training Recipe for Agentic AI. arXiv, 2025. paper, code, model
  Chengyue Yu, Siyuan Lu, Chenyi Zhuang, Dong Wang, Qintong Wu, et al.
- FunReason: Enhancing Large Language Models' Function Calling via Self-Refinement Multiscale Loss and Automated Data Refinement. arXiv, 2025. paper, model
  Bingguang Hao, Maolin Wang, Zengzhuang Xu, Cunyin Peng, et al.
- Exploring Superior Function Calls via Reinforcement Learning. arXiv, 2025. paper, code
  Bingguang Hao, Maolin Wang, Zengzhuang Xu, Yicheng Chen, et al.
- RAG-R1: Incentivize the Search and Reasoning Capabilities of LLMs through Multi-query Parallelism. arXiv, 2025. paper, code, model
  Zhiwen Tan, Jiaming Huang, Qintong Wu, Hongxuan Zhang, Chenyi Zhuang, Jinjie Gu
- V2P: From Background Suppression to Center Peaking for Robust GUI Grounding Task. arXiv, 2025. paper, code
  Jikai Chen, Long Chen, Dong Wang, Leilei Gan, Chenyi Zhuang, Jinjie Gu
- Don't Just Fine-tune the Agent, Tune the Environment. arXiv, 2025. paper
  Siyuan Lu, Zechuan Wang, Hongxuan Zhang, Qintong Wu, Leilei Gan, Chenyi Zhuang, et al.
- Profile-Aware Maneuvering: A Dynamic Multi-Agent System for Robust GAIA Problem Solving by AWorld. arXiv, 2025. paper, code
  Zhitian Xie, Qintong Wu, Chengyue Yu, Chenyi Zhuang, Jinjie Gu
- Recon-Act: A Self-Evolving Multi-Agent Browser-Use System via Web Reconnaissance, Tool Generation, and Task Execution. arXiv, 2025. paper, code
  Kaiwen He, Zhiwei Wang, Chenyi Zhuang, Jinjie Gu
We warmly welcome developers to join us in building and improving AWorld! Whether you're interested in enhancing the framework, fixing bugs, or adding new features, your contributions are valuable to us.
For academic citations, or if you wish to contact us, please use the following BibTeX entry:
@misc{yu2025aworldorchestratingtrainingrecipe,
title={AWorld: Orchestrating the Training Recipe for Agentic AI},
author={Chengyue Yu and Siyuan Lu and Chenyi Zhuang and Dong Wang and Qintong Wu and Zongyue Li and Runsheng Gan and Chunfeng Wang and Siqi Hou and Gaochi Huang and Wenlong Yan and Lifeng Hong and Aohui Xue and Yanfeng Wang and Jinjie Gu and David Tsai and Tao Lin},
year={2025},
eprint={2508.20404},
archivePrefix={arXiv},
primaryClass={cs.AI},
url={https://arxiv.org/abs/2508.20404},
}
