feat(@mastra/core, @mastra/memory): add working memory vnext #5924
🦋 Changeset detected
Latest commit: 44b240d
The changes in this PR will be included in the next version bump. This PR includes changesets to release 11 packages.
Greptile Summary
This PR introduces an experimental next-generation working memory implementation ('vnext') that raises the working memory score on longmemeval benchmarks by about 20%. Key changes include:
- Added concurrent memory operation handling using async-mutex
- Implemented more granular memory updates with search/replace functionality
- Added safeguards against common model mistakes in memory updates
- Introduced a feature flag system to maintain backward compatibility
The implementation remains experimental due to:
- Some agents failing to properly utilize the new tools
- Issues with duplicate memory content in certain scenarios
PR Description Notes:
- Documentation updates are intentionally omitted, pending a future memory benchmarks blog post
Confidence score: 3/5
- Safe to merge with experimental flag, but needs careful monitoring in production
- Score reflects the explicitly experimental nature and known issues with tool usage
- Files needing attention:
  - packages/memory/src/tools/working-memory.ts: Contains core logic for preventing duplicate content
  - packages/core/src/memory/memory.ts: New abstract method implementation needs thorough testing
5 files reviewed, 4 comments
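The concurrent-update handling mentioned in the summary can be illustrated with a small sketch. The summary says the PR uses async-mutex; to stay dependency-free, the example below swaps in a hand-rolled promise-chain lock instead. `SimpleMutex`, `workingMemoryStore`, and `appendWorkingMemory` are hypothetical names for illustration and are not taken from the PR.

```typescript
// Minimal promise-chain mutex: a stand-in for a library mutex such as async-mutex.
class SimpleMutex {
  private tail: Promise<void> = Promise.resolve();

  // Runs `task` only after every previously queued task has settled.
  runExclusive<T>(task: () => Promise<T> | T): Promise<T> {
    const result = this.tail.then(task);
    // Keep the queue moving even if `task` rejects.
    this.tail = result.then(() => undefined, () => undefined);
    return result;
  }
}

// Hypothetical in-memory store keyed by resource id (assumption, not the PR's storage).
const workingMemoryStore = new Map<string, string>();
const mutexes = new Map<string, SimpleMutex>();

function mutexFor(resourceId: string): SimpleMutex {
  let mutex = mutexes.get(resourceId);
  if (!mutex) {
    mutex = new SimpleMutex();
    mutexes.set(resourceId, mutex);
  }
  return mutex;
}

// Read-modify-write under the lock so concurrent appends cannot clobber each other.
async function appendWorkingMemory(resourceId: string, line: string): Promise<string> {
  return mutexFor(resourceId).runExclusive(async () => {
    const current = workingMemoryStore.get(resourceId) ?? '';
    await Promise.resolve(); // yield, simulating an async storage round trip
    const next = current ? `${current}\n${line}` : line;
    workingMemoryStore.set(resourceId, next);
    return next;
  });
}
```

Without the lock, several overlapping calls would all read the same stale value before any of them writes, so only the last write would survive. With one lock per resource, updates to different resources still run independently.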
    }
  }

  async saveThread({ thread }: { thread: StorageThreadType; memoryConfig?: MemoryConfig }): Promise<StorageThreadType> {
I changed this for vnext+stable because it's a bug that we were storing the WM template as WM data on the thread/resource.
<working_memory_template>
${template.content}
</working_memory_template>

${hasEmptyWorkingMemoryTemplateObject ? 'When working with json data, the object format below represents the template:' : ''}
${hasEmptyWorkingMemoryTemplateObject ? JSON.stringify(emptyWorkingMemoryTemplateObject) : ''}

WORKING MEMORY DATA:
<working_memory_data>
${data}
</working_memory_data>
Updated this for vnext+stable to make it clearer to the LLM where the data starts and stops, since LLMs are typically trained to understand XML-like tags as meaningful delimiters.
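As a rough sketch of the idea (not the PR's actual code), the prompt section could be assembled as below. `renderWorkingMemorySection` and the `WorkingMemoryTemplate` interface are hypothetical names, and rendering missing data as an empty block is an assumption.

```typescript
// Hypothetical shape; only `content` is visible in the diff above.
interface WorkingMemoryTemplate {
  content: string;
}

// Builds the working memory part of the system prompt, wrapping the template
// and the stored data in separate XML-like tags so their boundaries are explicit.
function renderWorkingMemorySection(template: WorkingMemoryTemplate, data: string | null): string {
  return [
    '<working_memory_template>',
    template.content,
    '</working_memory_template>',
    '',
    'WORKING MEMORY DATA:',
    '<working_memory_data>',
    // Assumption: when nothing has been saved yet, the data block stays empty
    // instead of repeating the template.
    data ?? '',
    '</working_memory_data>',
  ].join('\n');
}
```

Keeping the data block empty for a new thread or resource means the template text appears only once in the prompt.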
This is also related to my comment above. Because we were storing the template as data when initializing a new thread/resource, the template and data would initially both contain the template, which would confuse the LLM.
…n unfilled template
For very long user interactions across multiple threads, the agent would sometimes get confused and completely erase working memory, thinking the data was no longer relevant.
To improve this, the vnext working memory tool takes a couple args now, including an `updateReason` (one of `append-new-memory`, `clarify-existing-memory`, and `replace-irrelevant-memory`).

… `updateReason` is not `append-new-memory`

When working memory `scope` is `resource`, if the agent does a find/replace where the reason is `replace-irrelevant-memory`, it will append the string to working memory instead of replacing.

While running longmemeval benchmarks I saw that working memory alone scored quite poorly. The reason was that the agent would frequently delete data from another thread if the data wasn't relevant to the current thread.
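The resource-scope safeguard described above can be sketched as follows. This is an illustration under assumptions, not the PR's implementation: `applyUpdate`, `newMemory`, and `searchString` are hypothetical names, and falling back to an append when the search string is missing or not found is also an assumption. Only the `updateReason` values and the append-instead-of-replace rule come from the description.

```typescript
type UpdateReason = 'append-new-memory' | 'clarify-existing-memory' | 'replace-irrelevant-memory';
type Scope = 'thread' | 'resource';

interface UpdateArgs {
  newMemory: string; // hypothetical argument name
  searchString?: string; // hypothetical argument name
  updateReason: UpdateReason;
}

function applyUpdate(existing: string, args: UpdateArgs, scope: Scope): string {
  const append = (): string => (existing ? `${existing}\n${args.newMemory}` : args.newMemory);

  // Plain appends never touch existing content.
  if (args.updateReason === 'append-new-memory' || !args.searchString) {
    return append();
  }

  // Safeguard from the description: with resource scope, memory judged
  // "irrelevant" in this thread may still matter in another thread, so the
  // new text is appended rather than replacing the old text.
  if (scope === 'resource' && args.updateReason === 'replace-irrelevant-memory') {
    return append();
  }

  // Assumption: if the text to replace is not present, append rather than drop the update.
  if (!existing.includes(args.searchString)) {
    return append();
  }

  // Replace the first occurrence; the function replacer avoids `$` being
  // interpreted as a special replacement pattern.
  return existing.replace(args.searchString, () => args.newMemory);
}
```

For example, with `scope` set to `'resource'`, a `replace-irrelevant-memory` update that targets `likes tea` leaves that line in place and adds the new line after it, whereas the same call with `'thread'` scope replaces it.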
The changes here increased the score by about 20% for WM. I was going to PR these changes and replace the existing WM implementation, but I noticed the agent sometimes fails to properly use the new tools and adds duplicate memory content in some cases.

So for now I want to ship it as a `vnext` flag on working memory. The improvement in memory performance is significant, but it still requires some more testing/fixing, so it's experimental for now.

I've intentionally left documentation changes out; I'll add docs in another PR when we make our memory blog post for benchmarks.