A production-ready chatbot system with persistent memory, user profiles, and document ingestion capabilities.
## Features

- Persistent Memory: Stores facts about users using a weighted fact system
- User Profiles: Automatically builds and updates user profiles from conversations
- Document Ingestion: Processes and learns from PDF documents (50-200 pages)
- Vector Search: Semantic search across conversation history and documents
- Fact Connections: Discovers relationships between different pieces of information
- Fast Retrieval: Uses PostgreSQL with pgvector for efficient similarity search
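The vector search above can be illustrated with plain cosine similarity. This is a toy sketch only: the real system uses OpenAI embeddings and pgvector, and the three-dimensional vectors and stored facts below are invented for the example.

```python
import math

def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm

def search(query_vec, store, top_k=2):
    """Rank stored (text, vector) pairs by similarity to the query vector."""
    ranked = sorted(store, key=lambda item: cosine_similarity(query_vec, item[1]),
                    reverse=True)
    return [text for text, _ in ranked[:top_k]]

# Toy "memory" with made-up embeddings
store = [
    ("likes Python", [0.9, 0.1, 0.0]),
    ("owns a cat", [0.0, 0.2, 0.9]),
    ("studies data science", [0.8, 0.3, 0.1]),
]
print(search([1.0, 0.2, 0.0], store, top_k=2))
# → ['likes Python', 'studies data science']
```

In production the same ranking is done inside PostgreSQL by pgvector, which is far faster than scanning vectors in Python.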
## Architecture

- Backend: FastAPI + PostgreSQL with pgvector
- Embeddings: OpenAI text-embedding-3-large (3072 dimensions)
- LLM: OpenAI GPT-4o for reasoning and responses
- Short-term Memory: Redis for recent conversation context
- Document Processing: PyMuPDF and Unstructured for PDF extraction
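Before a 50-200 page PDF can be embedded, its extracted text has to be split into chunks. A minimal sliding-window chunker might look like the sketch below; the chunk size and overlap values are illustrative assumptions, not the project's actual settings.

```python
def chunk_text(text, chunk_size=500, overlap=50):
    """Split text into fixed-size character chunks with overlap,
    so content straddling a boundary appears in both neighbors."""
    if chunk_size <= overlap:
        raise ValueError("chunk_size must exceed overlap")
    chunks = []
    step = chunk_size - overlap
    for start in range(0, len(text), step):
        chunk = text[start:start + chunk_size]
        if chunk:
            chunks.append(chunk)
    return chunks

pages = "word " * 300  # stand-in for text extracted from a PDF
chunks = chunk_text(pages, chunk_size=500, overlap=50)
```

The overlap means a sentence cut at a chunk boundary is still retrievable from at least one chunk during similarity search.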
## Prerequisites

- PostgreSQL with the pgvector extension
- Redis
- Python 3.9+
- OpenAI API key
## Setup

- Ensure you have PostgreSQL installed with the pgvector extension:

  ```sql
  CREATE EXTENSION IF NOT EXISTS vector;
  ```

- Create a virtual environment:

  ```bash
  python -m venv venv
  source venv/bin/activate  # On Windows: venv\Scripts\activate
  ```

- Install dependencies:

  ```bash
  pip install -r requirements.txt
  ```

- Create a `.env` file based on `.env.example`:

  ```bash
  cp .env.example .env
  # Edit .env with your settings
  ```

- Initialize the database:

  ```bash
  python setup_database.py
  ```

## Running the Server

```bash
python -m api.main
```

The API will be available at http://localhost:8000.
## API Endpoints

- `POST /users` - Create a new user
- `GET /users/{user_id}` - Get user details
- `POST /conversations` - Create a new conversation
- `GET /conversations/{user_id}` - List a user's conversations
- `POST /chat` - Send a message and get a response
- `POST /documents/upload` - Upload and process a PDF
- `GET /documents/{user_id}` - List a user's documents
- `GET /memory/{user_id}/facts` - Get a user's memory facts
## Example Usage

```python
import requests

# Create a user
response = requests.post("http://localhost:8000/users",
                         json={"username": "alice"})
user_id = response.json()["user_id"]

# Create a conversation
response = requests.post(f"http://localhost:8000/conversations?user_id={user_id}",
                         json={"title": "First Chat"})
conversation_id = response.json()["conversation_id"]

# Send a message
response = requests.post("http://localhost:8000/chat",
                         json={
                             "conversation_id": conversation_id,
                             "message": "Hi! I'm interested in learning Python for data science.",
                         })
print(response.json()["response"])

# Upload a document
with open("python_tutorial.pdf", "rb") as f:
    response = requests.post(
        f"http://localhost:8000/documents/upload?user_id={user_id}",
        files={"file": f},
    )
```

## Memory System

- Facts are stored as subject-predicate-object triples
- Each fact has a weight (importance) and confidence score
- Facts decay over time if not reinforced
- Similar facts are merged to avoid duplication
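The decay behavior above can be sketched as a simple function of time since a fact was last reinforced. The grace period mirrors the `MEMORY_DECAY_DAYS` default of 90 days; the exponential half-life after that point is an illustrative assumption, not the project's actual decay curve.

```python
DECAY_DAYS = 90       # mirrors the MEMORY_DECAY_DAYS default
HALF_LIFE_DAYS = 30   # assumed half-life once decay begins (illustrative)

def decayed_weight(weight, days_since_reinforced):
    """Leave recently reinforced facts at full weight; after the
    grace period, halve the weight every HALF_LIFE_DAYS."""
    if days_since_reinforced <= DECAY_DAYS:
        return weight
    overdue = days_since_reinforced - DECAY_DAYS
    return weight * 0.5 ** (overdue / HALF_LIFE_DAYS)
```

Reinforcing a fact resets its clock, so information the user keeps mentioning stays near full weight while stale facts fade and are eventually pruned.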
## User Profile

Profiles are built and updated automatically from conversations. An example profile:

```json
{
  "name": "Alice",
  "pronouns": "she/her",
  "interests": ["Python", "data science", "machine learning"],
  "skills": ["programming", "statistics"],
  "goals": ["build ML models", "analyze data"],
  "tone_preference": "friendly and encouraging",
  "reading_level": "technical",
  "conversation_style": "detailed explanations",
  "important_contacts": [{"name": "Bob", "relationship": "colleague"}],
  "constraints": ["prefers visual examples"]
}
```

## Configuration

Key environment variables:
- `OPENAI_API_KEY`: Your OpenAI API key
- `DATABASE_URL`: PostgreSQL connection string
- `REDIS_URL`: Redis connection string
- `MAX_MEMORY_FACTS`: Maximum facts to store per user (default: 1000)
- `MEMORY_DECAY_DAYS`: Days before memories start decaying (default: 90)
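One way these settings might surface in code is a small loader with the documented defaults; the variable names come from the list above, while `load_settings` itself is an illustrative helper, not the project's actual configuration code.

```python
import os

def load_settings(env=os.environ):
    """Read configuration from the environment, falling back to the
    documented defaults for the optional knobs."""
    return {
        "openai_api_key": env.get("OPENAI_API_KEY", ""),
        "database_url": env.get("DATABASE_URL", ""),
        "redis_url": env.get("REDIS_URL", ""),
        "max_memory_facts": int(env.get("MAX_MEMORY_FACTS", 1000)),
        "memory_decay_days": int(env.get("MEMORY_DECAY_DAYS", 90)),
    }

settings = load_settings(env={})  # empty env -> documented defaults
```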
## Performance Notes

- Document chunks are embedded in batches for efficiency
- Embeddings are cached in memory to avoid recomputation
- Vector similarity search is optimized with pgvector indexes
- Facts are weighted and pruned to maintain relevance
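The in-memory embedding cache mentioned above amounts to memoizing the embedding call on the input text. A minimal sketch, using a stand-in for the OpenAI call:

```python
class CachingEmbedder:
    """Wrap an embedding function with an in-memory cache so repeated
    texts are only embedded (and billed) once."""

    def __init__(self, embed_fn):
        self.embed_fn = embed_fn
        self.cache = {}
        self.calls = 0  # counts actual embedding computations

    def embed(self, text):
        if text not in self.cache:
            self.calls += 1
            self.cache[text] = self.embed_fn(text)
        return self.cache[text]

def fake_embed(text):
    """Stand-in for the real OpenAI embedding call (illustrative only)."""
    return [float(len(text)), float(text.count(" "))]

embedder = CachingEmbedder(fake_embed)
v1 = embedder.embed("hello world")
v2 = embedder.embed("hello world")  # served from cache, no second computation
```

Since identical chunks and repeated queries are common, this avoids recomputing (and re-paying for) embeddings within a process's lifetime.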
## Roadmap

- Support for more document formats (Word, HTML, Markdown)
- Graph database integration for complex relationships
- Multi-modal memory (images, audio)
- Export/import memory snapshots
- Fine-tuned models for better fact extraction