Memory MCP Integration LoRA #1

@codelion

🚀 New Recipe Proposal: Memory MCP Integration LoRA

Problem Statement

Local models, especially smaller ones such as Qwen2.5-Coder-0.5B and Gemma-3-1B, struggle to interact reliably with the Memory MCP Server, which provides persistent memory backed by a knowledge graph. As a result, they cannot maintain context across conversations or build user-specific knowledge over time.

Proposed Solution

Create a specialized LoRA adapter that teaches models to:

  • Interact with the Memory MCP Server's knowledge graph structure
  • Generate appropriate tool calls for entity/relation/observation management
  • Maintain persistent memory across chat sessions
  • Query and update the knowledge graph effectively

Key Objectives

  1. MCP Protocol Mastery: Train models to use Memory MCP tools correctly (create_entities, create_relations, add_observations, read_graph, etc.)
  2. Knowledge Graph Understanding: Teach structured thinking about entities, relations, and observations
  3. Memory Strategy: Implement effective patterns for when to store, update, or retrieve memories
  4. Context Persistence: Enable cross-session memory recall and updates
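
To make the entity/relation/observation vocabulary concrete, here is a minimal sketch of a toy in-memory graph that exposes the four tool names listed above. It is a hypothetical stand-in for illustration only, not the Memory MCP Server's implementation: the class name `ToyMemoryGraph`, the `Entity` dataclass, and the simplified tuple arguments are all assumptions.

```python
from dataclasses import asdict, dataclass, field


@dataclass
class Entity:
    name: str
    entity_type: str
    observations: list[str] = field(default_factory=list)


class ToyMemoryGraph:
    """Hypothetical in-memory stand-in for a knowledge-graph memory store."""

    def __init__(self) -> None:
        self.entities: dict[str, Entity] = {}
        self.relations: set[tuple[str, str, str]] = set()

    def create_entities(self, entities: list[tuple[str, str]]) -> None:
        # Each item is (name, entity_type); existing names are left untouched.
        for name, entity_type in entities:
            self.entities.setdefault(name, Entity(name, entity_type))

    def create_relations(self, relations: list[tuple[str, str, str]]) -> None:
        # Each item is (from_name, to_name, relation_type).
        self.relations.update(relations)

    def add_observations(self, name: str, contents: list[str]) -> None:
        # Raises KeyError if the entity was never created.
        self.entities[name].observations.extend(contents)

    def read_graph(self) -> dict:
        return {
            "entities": [asdict(e) for e in self.entities.values()],
            "relations": sorted(self.relations),
        }


graph = ToyMemoryGraph()
graph.create_entities([("Alice", "person"), ("Anthropic", "organization")])
graph.create_relations([("Alice", "Anthropic", "works_at")])
graph.add_observations("Alice", ["Introduced herself in the first session"])
snapshot = graph.read_graph()
```

A store like this could also serve as a cheap test double when checking generated tool-call sequences, before pointing the model at a real server.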

Training Methodology

Following Ellora's philosophy of self-supervised data generation:

  1. Synthetic Conversation Generation: Use a Magpie-style approach to generate diverse conversational scenarios that require memory
  2. Memory Operation Sequences: Create training data with proper MCP tool calling patterns
  3. Graph State Tracking: Include examples of building and querying knowledge graphs
  4. GRPO Training: Use reward-based reinforcement learning (Group Relative Policy Optimization) to optimize memory usage patterns
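
The GRPO step can be sketched in two parts: a reward that scores each sampled completion, and the group-relative normalization that turns those rewards into advantages. Everything here is an assumption made for illustration: the reward shape, the `KNOWN_TOOLS` set, and the expectation that a completion is a JSON list of `{"tool", "params"}` objects. A real training run would plug a reward like this into a GRPO trainer rather than compute advantages by hand.

```python
import json
from statistics import mean, pstdev

# Tool names taken from the objectives above; the set is illustrative, not exhaustive.
KNOWN_TOOLS = {"create_entities", "create_relations", "add_observations", "read_graph"}


def tool_call_reward(completion: str) -> float:
    """Hypothetical reward: fraction of well-formed calls, 0.0 if unparseable."""
    try:
        calls = json.loads(completion)
    except json.JSONDecodeError:
        return 0.0
    if not isinstance(calls, list) or not calls:
        return 0.0
    valid = sum(
        1
        for call in calls
        if isinstance(call, dict)
        and isinstance(call.get("tool"), str)
        and call["tool"] in KNOWN_TOOLS
        and isinstance(call.get("params"), dict)
    )
    return valid / len(calls)


def group_relative_advantages(rewards: list[float]) -> list[float]:
    """Normalize rewards within one sampled group: (r - mean) / std."""
    mu, sigma = mean(rewards), pstdev(rewards)
    if sigma == 0:
        # All completions scored the same, so none is preferred over another.
        return [0.0 for _ in rewards]
    return [(r - mu) / sigma for r in rewards]


# One prompt, three sampled completions (made-up strings for the sketch).
group = [
    '[{"tool": "create_entities", "params": {"entities": []}}]',
    '[{"tool": "delete_everything", "params": {}}]',
    "not json",
]
rewards = [tool_call_reward(c) for c in group]
advantages = group_relative_advantages(rewards)
```

Only the first completion uses a known tool with a parameter object, so it receives a positive advantage and the other two receive negative ones.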

Expected Dataset Structure

{
  "conversations": [
    {
      "user": "My name is Alice and I work at Anthropic",
      "assistant_thinking": "Need to create entities for user and organization, then relate them",
      "mcp_calls": [
        {"tool": "create_entities", "params": {"entities": [{"name": "Alice", "entityType": "person", "observations": []}]}},
        {"tool": "create_entities", "params": {"entities": [{"name": "Anthropic", "entityType": "organization", "observations": []}]}},
        {"tool": "create_relations", "params": {"relations": [{"from": "Alice", "to": "Anthropic", "relationType": "works_at"}]}}
      ]
    }
  ]
}
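
One sanity check that synthetic records like the one above could go through is verifying that every relation refers to entities created earlier in the same call sequence. The sketch below assumes that ordering constraint is desirable, which the proposal does not state, and the helper name `dangling_relations` is hypothetical. It only reads the `tool`, `params`, `name`, `from`, and `to` keys, and it would also flag relations to entities created in an earlier session, so treat it as a per-record check only.

```python
def dangling_relations(mcp_calls: list[dict]) -> list[tuple[str, str]]:
    """Return (from, to) pairs that mention an entity no earlier call created."""
    known: set[str] = set()
    dangling: list[tuple[str, str]] = []
    for call in mcp_calls:
        params = call.get("params", {})
        if call.get("tool") == "create_entities":
            known.update(entity["name"] for entity in params.get("entities", []))
        elif call.get("tool") == "create_relations":
            for rel in params.get("relations", []):
                if rel["from"] not in known or rel["to"] not in known:
                    dangling.append((rel["from"], rel["to"]))
    return dangling


# Trimmed-down version of the example record: only the keys the check reads.
calls = [
    {"tool": "create_entities", "params": {"entities": [{"name": "Alice"}]}},
    {"tool": "create_entities", "params": {"entities": [{"name": "Anthropic"}]}},
    {"tool": "create_relations",
     "params": {"relations": [{"from": "Alice", "to": "Anthropic"}]}},
]
in_order = dangling_relations(calls)
reversed_order = dangling_relations(list(reversed(calls)))
```

With the calls in their original order nothing is flagged; reversing them makes the relation arrive before either entity exists, so it is reported.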

Technical Approach

  1. Base Models: Start with Qwen2.5-Coder-0.5B-Instruct and Gemma-3-1B-IT
  2. LoRA Configuration: r=16, alpha=32, targeting attention layers
  3. Training Framework: Unsloth (for memory efficiency) together with PEFT
  4. Evaluation: Test on memory recall accuracy, graph construction quality, and cross-session persistence
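
To give a feel for what r=16 and alpha=32 imply, the following sketch works through the arithmetic of a standard LoRA adapter: each adapted weight of shape (d_out, d_in) gains r * (d_in + d_out) trainable parameters, and the low-rank update is scaled by alpha / r. The hidden size, layer count, and projection names below are hypothetical placeholders, not values read from the Qwen or Gemma configs.

```python
def lora_param_count(d_in: int, d_out: int, r: int) -> int:
    """Parameters added by one LoRA pair: A is (r x d_in), B is (d_out x r)."""
    return r * (d_in + d_out)


R, ALPHA = 16, 32      # values from the configuration above
SCALING = ALPHA / R    # factor applied to the low-rank update B @ A

# Hypothetical attention shapes as (d_in, d_out); NOT taken from a real model config.
HIDDEN = 1024
NUM_LAYERS = 24
projections = {
    "q_proj": (HIDDEN, HIDDEN),
    "k_proj": (HIDDEN, HIDDEN),
    "v_proj": (HIDDEN, HIDDEN),
    "o_proj": (HIDDEN, HIDDEN),
}

per_layer = sum(lora_param_count(d_in, d_out, R) for d_in, d_out in projections.values())
total = per_layer * NUM_LAYERS
print(f"scaling={SCALING}, per_layer={per_layer:,}, total={total:,}")
```

Under these placeholder shapes the adapter adds roughly 3.1 million trainable parameters; real models with grouped-query attention have smaller key and value projections, so the actual count would differ.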

Success Metrics

  • Memory Recall Rate: >85% accurate retrieval of stored information
  • Graph Construction: Proper entity/relation/observation structure in >90% of cases
  • Tool Usage Accuracy: >95% correct MCP tool call formatting
  • Cross-Session Persistence: Successfully maintains context across 5+ conversation sessions
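
The three percentage targets above could be checked with a small helper like the one below. This is a sketch under assumptions: the metric keys, the per-example pass/fail representation, and the function names are all hypothetical, and the sample outcomes are made up purely to exercise the code.

```python
# Thresholds copied from the list above; the key names are hypothetical.
TARGETS = {"memory_recall": 0.85, "graph_construction": 0.90, "tool_usage": 0.95}


def rate(outcomes: list[bool]) -> float:
    """Fraction of passing examples; 0.0 when there is nothing to score."""
    return sum(outcomes) / len(outcomes) if outcomes else 0.0


def check_targets(results: dict[str, list[bool]]) -> dict[str, bool]:
    """True where the observed rate strictly exceeds its target (the list uses '>')."""
    return {
        name: rate(results.get(name, [])) > target
        for name, target in TARGETS.items()
    }


# Made-up outcomes for illustration; these are not evaluation results.
results = {
    "memory_recall": [True] * 9 + [False],       # 9 of 10 pass
    "graph_construction": [True] * 9 + [False],  # 9 of 10 pass, not strictly above 0.90
    "tool_usage": [True] * 20,                   # all pass
}
report = check_targets(results)
```

Because the targets are written with a strict ">", a graph construction rate of exactly 90% does not count as meeting the bar in this sketch.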

Deliverables

  1. Training notebook: Ellora_Recipe_6_Memory_MCP_Integration_LoRA.ipynb
  2. Trained adapters on HuggingFace Hub
  3. Evaluation benchmarks for memory operations
  4. Documentation and usage examples

Why This Recipe?

  • Ecosystem Integration: Bridges local models with MCP infrastructure
  • Practical Value: Enables persistent memory for chatbots and assistants
  • No External Dependencies: Uses self-generated training data
  • Wide Applicability: The same recipe can be applied to other base models that support LoRA adapters

Related Work

Next Steps

  1. Design the synthetic data generation pipeline
  2. Create memory operation scenarios
  3. Implement GRPO training with memory-specific rewards
  4. Benchmark against Claude's native MCP integration

This recipe would extend Ellora's collection with a crucial capability for building stateful AI applications with local models. The Memory MCP integration would allow even small models to maintain long-term context and build personalized knowledge graphs.

cc: @codelion
