Releases: letta-ai/letta
v0.12.1
Major Features
New Agent Architecture: letta_v1_agent
The new recommended agent architecture with significant improvements over the legacy agent system.
What's Different:
- No `send_message` tool required: works with any chat model, including non-tool-calling models
- No heartbeat system
- Simpler base system prompt - agentic control loop understanding is baked into modern LLMs
- Follows standard tool calling patterns (auto mode) for broader compatibility
Provider Support:
- Compatible with all inference providers (OpenRouter, Azure, Together, Ollama, etc.).
- Works with non-tool-calling models.
- Supports OpenAI's Responses API for drastically improved performance with GPT-5 models.
Trade-offs:
- No heartbeats: Agents won't independently trigger repeated execution. If you need sleep-time compute or periodic processing, implement this through your own prompting or scheduling.
- Tool rules on messaging: Cannot apply tool rules to agent messaging, i.e. you cannot require a particular tool to be followed by an assistant message.
- Reasoning visibility: Non-reasoning models (GPT-4.1, GPT-4o-mini) will no longer generate explicit reasoning output.
- Reasoning control: Less control over reasoning tokens, which are typically encrypted by providers and cannot be passed between different providers.
Recommendation: Create all new agents as `letta_v1_agent`. The expanded provider compatibility and simpler architecture make this the best choice for most use cases. Use legacy agents only if you specifically need heartbeats or tool rules on messaging.
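A minimal sketch of opting into the new architecture at creation time. The `agent_type` value follows this release; the model handle and memory block are illustrative placeholders, and the commented SDK call is an assumption about the Python client rather than a verified signature:

```python
# Request payload for creating an agent on the new architecture.
# Model handle and memory block contents are illustrative.
agent_request = {
    "agent_type": "letta_v1_agent",  # opt into the new architecture
    "model": "openai/gpt-5",
    "memory_blocks": [
        {"label": "persona", "value": "A concise, helpful assistant."},
    ],
}

# With the Python SDK this would be roughly:
#     client.agents.create(**agent_request)
print(agent_request["agent_type"])
```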
Human-in-the-Loop (HITL)
Tools can now require human approval before execution. Set approval requirements via API or in the ADE for greater control over agent actions.
📖 Documentation
Parallel Tool Calling
Agents now execute multiple tool calls simultaneously when supported by the inference provider. Each tool runs in its own sandbox for true parallel execution. See the Claude documentation for examples on parallel tool calling.
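Conceptually, parallel tool calling behaves like concurrent task execution: all tool calls emitted in one LLM step run at the same time instead of sequentially. A minimal sketch (tool names and bodies are stand-ins, not Letta internals):

```python
import asyncio

async def run_tool(name: str, delay: float) -> str:
    """Stand-in for one tool call executing in its own sandbox."""
    await asyncio.sleep(delay)
    return f"{name}: done"

async def handle_step() -> list[str]:
    # Two tool calls from a single LLM step, executed simultaneously
    # rather than one after the other.
    return await asyncio.gather(
        run_tool("web_search", 0.01),
        run_tool("run_code", 0.01),
    )

print(asyncio.run(handle_step()))
```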
Runs API
New tracking system providing substantially improved observability and debugging capabilities. Documentation coming soon.
Enhanced Archival Memory (Letta Cloud only)
- Hybrid search: Combines full-text and semantic search
- DateTime filtering: Query memories by time range
- Search API endpoint: Documentation
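The time-range filter can be pictured with a local sketch; the record layout and field names below are illustrative, not the API schema:

```python
from datetime import datetime, timezone

# Illustrative archival records; the real store returns passages with
# creation timestamps that the search endpoint can filter on.
memories = [
    {"text": "met Alice at the meetup",
     "created_at": datetime(2025, 1, 5, tzinfo=timezone.utc)},
    {"text": "Bob prefers email",
     "created_at": datetime(2025, 3, 9, tzinfo=timezone.utc)},
]

# Keep only memories created on or after the cutoff.
start = datetime(2025, 2, 1, tzinfo=timezone.utc)
recent = [m["text"] for m in memories if m["created_at"] >= start]
print(recent)
```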
Improved Pagination
Cursor-based pagination now available across many endpoints for handling large result sets.
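The cursor pattern works like the sketch below, where `list_page` is a stub standing in for any paginated list endpoint; the `after`/`limit` parameter names mirror common cursor APIs and should be treated as assumptions for any specific route:

```python
def list_page(after=None, limit=2):
    """Stub for a paginated list endpoint: returns up to `limit` items
    starting just past the `after` cursor."""
    items = ["agent-1", "agent-2", "agent-3", "agent-4", "agent-5"]
    start = 0 if after is None else items.index(after) + 1
    return items[start:start + limit]

def list_all(limit=2):
    results, cursor = [], None
    while True:
        page = list_page(after=cursor, limit=limit)
        results.extend(page)
        if len(page) < limit:   # a short page means we reached the end
            break
        cursor = page[-1]       # last item's ID becomes the next cursor
    return results

print(list_all())
```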
New Tools
Memory Omni-Tool
Unified memory interface for more intuitive agent memory management.
fetch_webpage Tool
Utility tool for retrieving LLM-friendly webpage content.
Agent Configuration
Templates & Agentfiles
- Template updates: Templates can now be updated via agentfiles
- Agentfile v2 schema: Now supports groups, folders, etc.
Breaking Changes
Deprecated APIs
- `get_folder_by_name`: Use `client.folders.list(name=...)` instead
- `sources` routes: All routes renamed to `folders`
What's Changed
- fix: summarization_agent unknown attribute bug by @carenthomas in #3028
- fix: open router invalid model id bug by @carenthomas in #3028
- chore: bump version 0.12.1 by @carenthomas in #3029
Full Changelog: 0.12.0...0.12.1
v0.12.0
chore: add various fixes (#3026)
v0.11.7
🧑 Human-in-the-Loop (HITL) Support
This release introduces human-in-the-loop functionality for tool execution, allowing users to configure certain tools as requiring approval, or to specify per-agent approval requirements. This feature introduces two new `LettaMessage` types:
- `ApprovalRequestMessage` (sent by the agent to request approval)
- `ApprovalResponseMessage` (sent by the client to grant or deny approval)
Example of approving a tool call:
response = client.agents.messages.create(
    agent_id=agent.id,
    messages=[{
        "type": "approval",
        "approve": True,
        "approval_request_id": "message-abc123",
    }]
)
See the full documentation here.
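The denial side mirrors the approval example; a small sketch that builds the response payload (the message shape follows this release's example, while the helper function itself is hypothetical):

```python
def build_approval_response(request_message_id: str, approve: bool) -> dict:
    """Construct the client's reply to an ApprovalRequestMessage."""
    return {
        "type": "approval",
        "approve": approve,
        "approval_request_id": request_message_id,
    }

# Denying the pending tool call instead of approving it:
denial = build_approval_response("message-abc123", approve=False)
print(denial)
```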
📁 Agent File (.af) v2
The Agent File schema has been migrated to a v2 version, and now supports groups (multi-agent) and files (#4249).
🔎 Archival memory search
Improvements to archival memory search with support for tags, timestamps, and hybrid search:
- Tag-Based Search and Insert: Agents can now insert and search archival memories with arbitrary string tags (#4300, #4285)
- Temporal Filtering: Support for timestamp-based filtering of archival memories (#4330, #4398)
- Hybrid Search: New archival search endpoint with hybrid search functionality (#4390)
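Tag matching can be pictured locally; the records and tag names below are illustrative, and only the idea that each memory carries arbitrary string tags comes from the release notes:

```python
memories = [
    {"text": "prefers dark mode",  "tags": ["preference", "ui"]},
    {"text": "birthday is in May", "tags": ["personal"]},
]

def search_by_tags(records: list[dict], required: list[str]) -> list[str]:
    """Return texts of records carrying every required tag."""
    return [r["text"] for r in records if set(required) <= set(r["tags"])]

print(search_by_tags(memories, ["preference"]))
```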
🧠 Model and provider support
- GPT-5 Optimization: Improved GPT-5 support with proper context window handling and reasoning effort configuration (#4344, #4379, #4380)
- DeepSeek Support: Migration of DeepSeek to new agent loop architecture (#4266)
- Enhanced Anthropic Support: Better native reasoning support and tool schema formatting (#4331, #4378)
- Extended Thinking: Fixed various issues with extended thinking mode for Anthropic (#4341)
- MCP Tool Schema: Fixed MCP tool schema formatting for Anthropic streaming (#4378)
- Gemini Improvements: Enhanced error handling and retry logic (#4323, #4397)
🧩 Misc improvements
- Refactored Streaming Logic: Improved streaming route architecture (#4369)
- Better Error Handling: Enhanced error propagation in streaming responses (#4253)
- Tool Return Limits: Reduced default tool return size to prevent token overflow (#4383)
- Embedding Support: Enable overriding embedding config on Agent File import (#4224)
- Tool Type Filtering: Ability to list and filter tools by type (#4036)
- Tool De-duplication: Automatic de-duplication of tool rules (#4282)
v0.11.6
chore: release 0.11.6 2 (#2779)
v0.11.5
What's Changed
- feat: add background mode for message streaming by @carenthomas in #2777
Full Changelog: 0.11.4...0.11.5
v0.11.4
What's Changed
- feat: deprecate legacy paths for azure and together by @carenthomas in https://github.com/letta-ai/letta/pull/3987
- feat: introduce asyncio shield to prevent stream timeouts by @carenthomas in https://github.com/letta-ai/letta/pull/3992
- feat: record step metrics to table by @jnjpng in https://github.com/letta-ai/letta/pull/3887
Full Changelog: 0.11.3...0.11.4
v0.11.3
What's Changed
- mv dictconfig out of getlogger by @andrewrfitz in #2759
- chore: bump v0.11.3 by @carenthomas in #2760
Full Changelog: 0.11.2...0.11.3
v0.11.2
What's Changed
- fix: incorrect URL for Ollama embeddings endpoint by @antondevson in #2750
- fix: all model types returned from ollama provider by @antondevson in #2744
- feat: Add max_steps parameter to agent export by @mattzh72 in https://github.com/letta-ai/letta/pull/3828
Full Changelog: 0.11.1...0.11.2
v0.11.1
This release adds support for the latest model releases, and makes improvements to base memory and file tools.
🧠 Improved LLM model support
- Added support for Claude Opus 4.1 and GPT-5 models (#3806)
- Added a `minimal` option for the `reasoning_effort` parameter in `LLMConfig` (#3816)
🔨 Built-in tool improvements
v0.11.0
⚠️ Legacy LocalClient and RestClient fully deprecated
- Legacy clients are fully removed and replaced by the new Letta SDK clients (Python and TypeScript supported)
⛩️ Jinja Templating optimizations
- Jinja template engine is now offloaded to the thread pool to minimize CPU-bound operations blocking the async event loop
- Removed redundant rendering operations in critical paths
📈 Add Signoz integration for traces exporting
- You can configure exporting OTel traces to Signoz by passing the required environment variables:
docker run \
-v ~/.letta/.persist/pgdata:/var/lib/postgresql/data \
-p 8283:8283 \
...
-e SIGNOZ_ENDPOINT=${SIGNOZ_ENDPOINT} \
-e SIGNOZ_INGESTION_KEY=${SIGNOZ_INGESTION_KEY} \
-e LETTA_OTEL_EXPORTER_OTLP_ENDPOINT=http://localhost:4317 \
letta/letta:latest
Other Improvements
- chore: bumped min Python version for letta package to 3.11
- fix: incorrect context_window or embedding_dim using ollama by @antondevson in #2743
- feat: add filesystem demo with file upload and streaming by @cpfiffer in #2736
- chore: remove python 3.10 support and testing by @carenthomas in #2754
New Contributors
- @antondevson made their first contribution in #2743
- @cpfiffer made their first contribution in #2736
Full Changelog: 0.10.1...0.11.0