Great question! Having worked with multi-agent architectures, here are the key architectural differences that likely explain the performance gap:

**1. Code-First vs. Conversation-Based Execution**
Smolagents uses a "code agent" approach where the LLM writes Python code that gets executed directly (see the minimal sketch after this comment). This is more token-efficient than AutoGen's conversation-based approach, where agents "talk" to each other in natural language.

**2. Simpler Agent Coordination**
AutoGen's Magentic-One uses multiple specialized agents communicating through an orchestrator, so every agent-to-agent message costs API calls and tokens on both sides. Smolagents uses a flatter architecture with much less inter-agent communication overhead.

**3. Tool Call Efficiency**
Smolagents executes tools directly in generated code rather than through a separate tool-calling protocol. This cuts the back-and-forth that adds latency and token usage (the second sketch below contrasts the two).

**4. Deep Research Pattern**

**The Key Insight:** From my experience building production multi-agent systems, the biggest cost/performance driver is often indirect coordination (the stigmergy pattern) versus direct agent messaging. Systems that minimize agent-to-agent conversation and use shared state instead tend to be more efficient (the last sketch below shows the idea). I documented some patterns on agent coordination here: https://github.com/KeepALifeUS/autonomous-agents

Would be curious to see benchmark comparisons on token usage, not just accuracy!
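To make point 1 concrete, here is a minimal smolagents setup in the style of the project's README: the agent's "actions" are Python snippets it writes, which are then executed directly. The class names (`CodeAgent`, `DuckDuckGoSearchTool`, `HfApiModel`) match early smolagents releases and the model ID is just an example, so check the docs for your installed version:

```python
# Minimal smolagents CodeAgent: the LLM's actions are Python snippets that
# are executed directly, rather than natural-language messages to other agents.
# Names follow early smolagents releases; verify against your installed version.
from smolagents import CodeAgent, DuckDuckGoSearchTool, HfApiModel

model = HfApiModel(model_id="Qwen/Qwen2.5-Coder-32B-Instruct")  # example model
agent = CodeAgent(tools=[DuckDuckGoSearchTool()], model=model)

# One run; each step is generated code, so several tool uses can fit in one step.
agent.run("Which open-source agent framework currently leads the GAIA benchmark?")
```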
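And for point 3, here is a self-contained sketch of why code-as-action needs fewer round-trips than a JSON tool-calling protocol. The tool names (`web_search`, `visit_webpage`, `final_answer`) echo smolagents defaults but are stubbed out here purely for illustration:

```python
# Sketch: code-as-action vs. JSON tool calling.
# A JSON tool-calling agent needs one model round-trip per tool call;
# a code agent can emit all of the calls below in a single completion.
# The tool functions are stubs standing in for real tools.

def web_search(query: str) -> list[dict]:
    """Stub standing in for a real search tool."""
    return [{"title": "GAIA leaderboard", "url": "https://example.com/gaia"}]

def visit_webpage(url: str) -> str:
    """Stub standing in for a real page-fetch tool."""
    return "GAIA validation accuracy: ...\nunrelated line\n"

def final_answer(answer) -> None:
    """Stub standing in for the agent's answer-returning tool."""
    print("FINAL ANSWER:", answer)

# Everything below is what the model would generate as ONE action:
results = web_search("GAIA benchmark smolagents")   # tool call 1
page = visit_webpage(results[0]["url"])             # tool call 2, chained on call 1
relevant = [line for line in page.splitlines() if "GAIA" in line]
final_answer(relevant[:3])                          # answer, still the same step
```

With JSON tool calling, each of those three tool uses would be a separate completion plus a separate result message, which is exactly where the extra tokens and latency come from.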
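Finally, a toy sketch of the stigmergy idea from the key insight: agents never message each other directly; they coordinate by reading and writing shared state (a blackboard). All names here are illustrative, not from any particular framework:

```python
# Indirect (stigmergic) coordination: agents coordinate through a shared
# blackboard instead of sending messages to each other. Illustrative only.
from dataclasses import dataclass, field

@dataclass
class Blackboard:
    """Shared state the agents coordinate through."""
    facts: dict[str, str] = field(default_factory=dict)
    open_tasks: list[str] = field(default_factory=list)

def searcher(board: Blackboard) -> None:
    # Claims a task and deposits its result as shared state,
    # instead of addressing a specific peer agent.
    if board.open_tasks:
        task = board.open_tasks.pop(0)
        board.facts[task] = f"search results for {task!r}"

def writer(board: Blackboard) -> None:
    # Reacts to whatever state is present; no direct agent-to-agent call.
    if board.facts:
        print("\n".join(f"- {k}: {v}" for k, v in board.facts.items()))

board = Blackboard(open_tasks=["GAIA benchmark scores"])
searcher(board)  # writes to shared state
writer(board)    # reads from shared state; zero inter-agent messages
```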
I am researching agent architectures across these open-source frameworks. I see that smolagents' open_deep_research surpasses AutoGen's Magentic-One on the GAIA benchmark. Smolagents and AutoGen can both use web and coding tools, so why is smolagents so far ahead of AutoGen?