Introduction to ROMA (Recursive Open Meta-Agent)
ROMA (Recursive Open Meta-Agent) is an open-source meta-agent framework designed for building high-performance multi-agent systems. It coordinates multiple simple agents and tools to collaboratively solve complex problems.
At the core of ROMA is a structure designed for multi-agent systems: a hierarchical recursive task tree.
In this system, the main node breaks a complex goal down into multiple sub-tasks and passes the relevant context to sub-nodes for execution; when the sub-tasks are completed, their results are aggregated back up to the main node. Through this context-flow mechanism, ROMA makes it simpler and more reliable to build agents capable of handling medium- to long-range, multi-step tasks.
Example Explanation
For instance, suppose you want an agent to help you write a report on the climate differences between Los Angeles and New York.
In ROMA:
- The top-level main node splits the task into multiple sub-tasks:
  - Sub-task 1: Research the climate of Los Angeles.
  - Sub-task 2: Research the climate of New York.
- Each sub-task can call specialized agents and tools, such as AI search models or weather APIs.
- Once both research sub-tasks are completed, the main node generates a "comparative analysis" task to compile the results into a complete report.
This structure makes the task decomposition and result aggregation of the system clear and straightforward.
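To make the decomposition concrete, here is a minimal sketch in plain Python of how this climate-report task tree might be represented; the field names are illustrative assumptions, not ROMA's actual data model.

```python
# A minimal, hypothetical representation of the task tree described above.
# Field names ("goal", "subtasks", "aggregate") are illustrative, not ROMA's schema.
climate_report_task = {
    "goal": "Write a report comparing the climates of Los Angeles and New York",
    "subtasks": [
        {"goal": "Research the climate of Los Angeles", "tool": "web_search"},
        {"goal": "Research the climate of New York", "tool": "web_search"},
    ],
    "aggregate": "Compare both results and compile the final report",
}

# The two research sub-tasks are independent, so a framework like ROMA
# can run them in parallel before the aggregation step.
for subtask in climate_report_task["subtasks"]:
    print(subtask["goal"])
```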
Advantages of ROMA
ROMA makes the construction of multi-agent systems more direct and transparent.
- Uses Pydantic for structured input and output, ensuring a clear and traceable context flow (see the sketch after this list);
- Developers can observe the reasoning process precisely, making it easier to debug, optimize prompts, and swap out agents;
- This transparency enables rapid iteration on "context engineering" rather than black-box operation;
- The modular design lets you insert agents, tools, or models at any node, including LLM-based specialist agents or "human review" stages;
- The tree structure naturally supports parallelization, balancing flexibility and high performance for large, complex tasks.
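As a rough illustration of what Pydantic-typed node inputs and outputs could look like, here is a small sketch; the field names are assumptions for this example and not ROMA's actual schema.

```python
from pydantic import BaseModel

# Hypothetical node I/O schemas; ROMA's real models may differ.
class TaskInput(BaseModel):
    goal: str                # what this node is asked to accomplish
    context: str = ""        # context passed down from the parent node

class TaskOutput(BaseModel):
    result: str              # the node's answer or intermediate artifact
    sources: list[str] = []  # provenance, useful for tracing and verification

# Malformed data raises a validation error at the node boundary,
# which keeps the context flow between nodes easy to audit.
inp = TaskInput(goal="Research the climate of Los Angeles")
out = TaskOutput(result="Los Angeles has mild, dry summers", sources=["weather_api"])
print(inp.model_dump(), out.model_dump())
```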
Performance Validation: ROMA Search
To validate the framework's effectiveness, Sentient built ROMA Search, a web search agent based on the ROMA architecture (without domain-specific optimization).
In Seal-0, the most challenging subset of the SEALQA benchmark (which tests complex multi-source reasoning), ROMA Search achieved 45.6% accuracy, setting a new record:
- Ahead of the previous best, Kimi Researcher (36%);
- More than double Gemini 2.5 Pro (19.8%);
- Among open-source systems, far ahead of Sentient's own Open Deep Search (8.9%).

Additionally, ROMA Search achieved industry-best performance in FRAMES (multi-step reasoning) and approached top-tier levels in SimpleQA (fact retrieval), demonstrating its strong versatility across tasks.


Openness and Extensibility of ROMA
ROMA is fully open-source and highly extensible.
Search is just the beginning. Anyone can:
- Insert new agents;
- Extend the framework with custom tools (a rough sketch follows this list);
- Apply ROMA to fields such as financial analysis, scientific reporting, and creative content generation.
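As a rough sketch of what extending a node with a custom tool might look like (the wiring below is hypothetical and not ROMA's actual extension API):

```python
# Hypothetical example: wrap a domain-specific function as a tool that an
# executor node could call. The registry shown here is illustrative only.
def fetch_quarterly_revenue(ticker: str) -> str:
    """Custom tool for a financial-analysis agent (placeholder implementation)."""
    return f"Revenue data for {ticker} (placeholder)"

CUSTOM_TOOLS = {"quarterly_revenue": fetch_quarterly_revenue}

# A node handling a financial sub-task could then look up the tool by name.
print(CUSTOM_TOOLS["quarterly_revenue"]("ACME"))
```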
ROMA provides a solid central framework, and the real breakthroughs will come from the community building an ecosystem on top of it.
Why "Long-range Tasks" Challenge Agents
AI has made significant progress on single-step tasks (such as summarizing articles, writing emails, or performing arithmetic), but it remains fragile when facing "long-range tasks": goals that require multi-step reasoning and sustained action.
The key issue is error accumulation.
Success rates compound across steps: a model with a 99% per-step success rate completes ten consecutive steps without error only about 90% of the time (0.99¹⁰ ≈ 0.90), and at 95% per step the figure falls to roughly 60% over ten steps and 36% over twenty. A single hallucination, misinterpretation, or loss of context can lead to complete failure.
Therefore, building systems that can stably handle multiple sub-tasks and cross-information source reasoning is exceptionally challenging.
To solve such problems, two major challenges must be overcome:
- Architectural layer (meta-challenge): How do you design a system that can reason reliably over long horizons despite error accumulation?
- Task layer (task-specific challenge): For a given goal, how do you choose the best task decomposition, tools, models, prompts, and validation steps?
Search tasks are an ideal case:
They are inherently multi-step (retrieve → read → extract → cross-validate → synthesize) and rely on real-time, complex external knowledge.
For example, the question: "How many movies with a budget of $350 million or more were not the highest-grossing films of the year?"
To answer this question, the agent needs to:
- Decompose the question (find high-budget films → find the highest-grossing film of each year);
- Retrieve up-to-date data from multiple sources;
- Reason over the results (a small sketch of this step follows the list);
- Synthesize the final answer.
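To illustrate just the reasoning step, here is a small standalone sketch with made-up placeholder data; the titles and years are not real figures, and real values would come from the retrieval sub-tasks.

```python
# Placeholder data standing in for what the retrieval sub-tasks would return.
# Titles and years below are illustrative, not real box-office facts.
high_budget_films = [
    {"title": "Film A", "year": 2019},
    {"title": "Film B", "year": 2022},
    {"title": "Film C", "year": 2022},
]
top_grossing_by_year = {2019: "Film A", 2022: "Film B"}

# Reasoning step: keep high-budget films that were NOT their year's top grosser.
not_top_grossing = [
    film["title"]
    for film in high_budget_films
    if top_grossing_by_year.get(film["year"]) != film["title"]
]
print(len(not_top_grossing), not_top_grossing)  # -> 1 ['Film C']
```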

In this process, hallucinations, misalignments, and inefficient loops can lead to failure. Traditional agent architectures often hide internal reasoning paths, making tuning and improvement very difficult.
ROMA's Solution
ROMA addresses the challenges of long-range tasks by providing a recursive, hierarchical system structure.
Each task is a "node". It can:
- Be executed directly;
- Be decomposed into sub-tasks;
- Or aggregate sub-results.
The tree structure allows for transparent and traceable context flow, making it easier to optimize layer by layer.

On this framework, developers only need to choose appropriate tools, prompts, or validation mechanisms for each node to build a robust multi-agent system.
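The recursive pattern can be sketched in a few lines of plain Python. This is a simplified illustration of the idea, not ROMA's implementation; the four helper functions are stand-ins for whatever agents, tools, or prompts a developer plugs into each node, and they map onto the four stages described in the next section.

```python
# A toy sketch of the recursive node pattern. The four helpers below are
# placeholders for pluggable agents, tools, or prompts at each node.

def is_atomic(goal: str) -> bool:
    """Atomizer stand-in: decide whether one agent can handle the goal directly."""
    return len(goal.split()) < 6  # toy heuristic for illustration only

def plan(goal: str) -> list[str]:
    """Planner stand-in: break the goal into smaller sub-goals."""
    return [f"sub-step A of: {goal}", f"sub-step B of: {goal}"]

def execute(goal: str) -> str:
    """Executor stand-in: call a tool or model and return its output."""
    return f"result({goal})"

def aggregate(goal: str, results: list[str]) -> str:
    """Aggregator stand-in: combine child results into one answer."""
    return f"answer to '{goal}' built from {len(results)} sub-results"

def solve(goal: str, depth: int = 0) -> str:
    """Each task is a node: execute it directly, or plan, recurse, and aggregate."""
    if depth >= 2 or is_atomic(goal):  # depth cap keeps this toy example finite
        return execute(goal)
    subresults = [solve(subgoal, depth + 1) for subgoal in plan(goal)]
    return aggregate(goal, subresults)

print(solve("Which big-budget films were not their year's top grosser?"))
```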
Execution Process of ROMA (Using ROMA Search as an Example)
1️⃣ Atomizer (Analyzer) — Assessing Task Complexity
The system starts with the main task, first determining whether it can be completed by a single agent or needs further decomposition.
2️⃣ Planner — Decomposing Sub-Tasks
If the task is complex, the node becomes a planner, breaking the goal down into smaller tasks, such as:
- Searching for movies with a budget of ≥ $350 million;
- Searching for the highest-grossing film of each corresponding year;
- Analyzing the results and generating a list of qualifying films.
Each sub-task generates a sub-node, which can be executed sequentially (when dependent on another) or in parallel.
3️⃣ Executor — Executing Sub-Tasks
When a sub-task is simple enough, the node becomes an executor, calling the appropriate tools or models (such as search APIs, information extraction models), and passing the output to subsequent nodes.
4️⃣ Aggregator — Integrating Results
Once all executors are complete, the main node becomes an aggregator, compiling results, verifying consistency, and generating the final answer.
Human-in-the-Loop and Stage Tracing
At any node, humans can intervene to verify facts or supplement context.
ROMA can also request user confirmation of sub-tasks during the planning phase to avoid early misunderstandings.
Even without human intervention, the stage tracing system can fully record the input and output of each node, helping developers quickly locate errors and optimize logic.
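A stage trace can be as simple as recording every node's input and output as it runs. The snippet below is a hedged sketch of that idea, not ROMA's actual tracing API.

```python
# Illustrative stage tracing: append a record for every node invocation so a
# developer can later inspect exactly what each node received and produced.
trace: list[dict] = []

def traced(stage: str, task_input: str, output: str) -> str:
    trace.append({"stage": stage, "input": task_input, "output": output})
    return output

traced("planner", "Compare LA and NY climates",
       "sub-tasks: research LA climate; research NY climate")
traced("executor", "Research the climate of Los Angeles",
       "LA has mild, dry summers; most rain falls in winter")

for record in trace:
    print(f'{record["stage"]}: {record["input"]} -> {record["output"]}')
```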
Scalability of ROMA
The above example only demonstrates single-layer task decomposition.
In practical applications, ROMA can recursively handle multiple layers, forming deep task trees.
When sub-tasks are independent, the system automatically executes them in parallel, achieving efficient computation with hundreds or even thousands of nodes.
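A minimal sketch of running independent sub-tasks concurrently with asyncio (illustrative only; it is not ROMA's scheduler, which handles this automatically):

```python
import asyncio

# When sub-tasks do not depend on each other, they can be dispatched concurrently.
async def run_subtask(goal: str) -> str:
    await asyncio.sleep(0.1)  # stands in for a real tool or API call
    return f"result({goal})"

async def main() -> None:
    goals = [
        "Research the climate of Los Angeles",
        "Research the climate of New York",
    ]
    results = await asyncio.gather(*(run_subtask(g) for g in goals))
    print(results)

asyncio.run(main())
```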
Are you ready to participate in the future of AI agents?
ROMA Search is just the starting point.
We have fully open-sourced ROMA and invite developers worldwide to explore together.
Developers (Builders): Try building agents in ROMA, swap in different models, test multimodal capabilities, or tackle generative content (such as comics and podcasts) and analytical tasks (such as research reports).
Researchers: Advance meta-agent architecture research based on ROMA. Its transparent stage tracing mechanism can provide unique insights into agent interactions and context flow.
The progress of proprietary systems relies on a single company; the evolution of ROMA comes from the collective wisdom of the entire open-source community.
Join ROMA now:
GitHub Repository:
https://github.com/sentient-agi/ROMA
Video Introduction:
https://youtu.be/ghoYOq1bSE4?feature=shared
References:
¹ https://arxiv.org/pdf/2506.01062
² https://moonshotai.github.io/Kimi-Researcher/
³ https://arxiv.org/pdf/2409.12941
⁴ https://openai.com/index/introducing-simpleqa/