A production-ready chatbot system with persistent memory, user profiles, and document ingestion capabilities.
## Features

- Persistent Memory: Stores facts about users using a weighted fact system
- User Profiles: Automatically builds and updates user profiles from conversations
- Document Ingestion: Processes and learns from PDF documents (50-200 pages)
- Vector Search: Semantic search across conversation history and documents
- Fact Connections: Discovers relationships between different pieces of information
- Fast Retrieval: Uses PostgreSQL with pgvector for efficient similarity search
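The vector search above can be illustrated with plain cosine similarity. This is a toy sketch only: the real system uses OpenAI embeddings and pgvector, and the three-dimensional vectors and stored facts below are invented for the example.

```python
import math

def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm

def search(query_vec, store, top_k=2):
    """Rank stored (text, vector) pairs by similarity to the query vector."""
    ranked = sorted(store, key=lambda item: cosine_similarity(query_vec, item[1]),
                    reverse=True)
    return [text for text, _ in ranked[:top_k]]

# Toy "memory" with made-up embeddings
store = [
    ("likes Python", [0.9, 0.1, 0.0]),
    ("owns a cat", [0.0, 0.2, 0.9]),
    ("studies data science", [0.8, 0.3, 0.1]),
]
print(search([1.0, 0.2, 0.0], store, top_k=2))
# → ['likes Python', 'studies data science']
```

In production the same ranking is done inside PostgreSQL by pgvector, which is far faster than scanning vectors in Python.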
## Architecture

- Backend: FastAPI + PostgreSQL with pgvector
- Embeddings: OpenAI text-embedding-3-large (3072 dimensions)
- LLM: OpenAI GPT-4o for reasoning and responses
- Short-term Memory: Redis for recent conversation context
- Document Processing: PyMuPDF and Unstructured for PDF extraction
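Before a 50-200 page PDF can be embedded, its extracted text has to be split into chunks. A minimal sliding-window chunker might look like the sketch below; the chunk size and overlap values are illustrative assumptions, not the project's actual settings.

```python
def chunk_text(text, chunk_size=500, overlap=50):
    """Split text into fixed-size character chunks with overlap,
    so content straddling a boundary appears in both neighbors."""
    if chunk_size <= overlap:
        raise ValueError("chunk_size must exceed overlap")
    chunks = []
    step = chunk_size - overlap
    for start in range(0, len(text), step):
        chunk = text[start:start + chunk_size]
        if chunk:
            chunks.append(chunk)
    return chunks

pages = "word " * 300  # stand-in for text extracted from a PDF
chunks = chunk_text(pages, chunk_size=500, overlap=50)
```

The overlap means a sentence cut at a chunk boundary is still retrievable from at least one chunk during similarity search.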
## Prerequisites

- PostgreSQL with the pgvector extension
- Redis
- Python 3.9+
- OpenAI API key
## Setup

- Ensure you have PostgreSQL installed with the pgvector extension:

  ```sql
  CREATE EXTENSION IF NOT EXISTS vector;
  ```

- Create a virtual environment:

  ```bash
  python -m venv venv
  source venv/bin/activate  # On Windows: venv\Scripts\activate
  ```

- Install dependencies:

  ```bash
  pip install -r requirements.txt
  ```

- Create a `.env` file based on `.env.example`:

  ```bash
  cp .env.example .env
  # Edit .env with your settings
  ```

- Initialize the database:

  ```bash
  python setup_database.py
  ```

## Running the Server

```bash
python -m api.main
```

The API will be available at http://localhost:8000.
## API Endpoints

- `POST /users` - Create a new user
- `GET /users/{user_id}` - Get user details
- `POST /conversations` - Create a new conversation
- `GET /conversations/{user_id}` - List a user's conversations
- `POST /chat` - Send a message and get a response
- `POST /documents/upload` - Upload and process a PDF
- `GET /documents/{user_id}` - List a user's documents
- `GET /memory/{user_id}/facts` - Get a user's memory facts
## Example Usage

```python
import requests

# Create a user
response = requests.post("http://localhost:8000/users",
                         json={"username": "alice"})
user_id = response.json()["user_id"]

# Create a conversation
response = requests.post(f"http://localhost:8000/conversations?user_id={user_id}",
                         json={"title": "First Chat"})
conversation_id = response.json()["conversation_id"]

# Send a message
response = requests.post("http://localhost:8000/chat",
                         json={
                             "conversation_id": conversation_id,
                             "message": "Hi! I'm interested in learning Python for data science.",
                         })
print(response.json()["response"])

# Upload a document
with open("python_tutorial.pdf", "rb") as f:
    response = requests.post(
        f"http://localhost:8000/documents/upload?user_id={user_id}",
        files={"file": f},
    )
```

## Memory System

- Facts are stored as subject-predicate-object triples
- Each fact has a weight (importance) and confidence score
- Facts decay over time if not reinforced
- Similar facts are merged to avoid duplication
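The decay behavior above can be sketched as a simple function of time since a fact was last reinforced. The grace period mirrors the `MEMORY_DECAY_DAYS` default of 90 days; the exponential half-life after that point is an illustrative assumption, not the project's actual decay curve.

```python
DECAY_DAYS = 90       # mirrors the MEMORY_DECAY_DAYS default
HALF_LIFE_DAYS = 30   # assumed half-life once decay begins (illustrative)

def decayed_weight(weight, days_since_reinforced):
    """Leave recently reinforced facts at full weight; after the
    grace period, halve the weight every HALF_LIFE_DAYS."""
    if days_since_reinforced <= DECAY_DAYS:
        return weight
    overdue = days_since_reinforced - DECAY_DAYS
    return weight * 0.5 ** (overdue / HALF_LIFE_DAYS)
```

Reinforcing a fact resets its clock, so information the user keeps mentioning stays near full weight while stale facts fade and are eventually pruned.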
## User Profile

Profiles are built and updated automatically from conversations. An example profile:

```json
{
  "name": "Alice",
  "pronouns": "she/her",
  "interests": ["Python", "data science", "machine learning"],
  "skills": ["programming", "statistics"],
  "goals": ["build ML models", "analyze data"],
  "tone_preference": "friendly and encouraging",
  "reading_level": "technical",
  "conversation_style": "detailed explanations",
  "important_contacts": [{"name": "Bob", "relationship": "colleague"}],
  "constraints": ["prefers visual examples"]
}
```

## Configuration

Key environment variables:
- `OPENAI_API_KEY`: Your OpenAI API key
- `DATABASE_URL`: PostgreSQL connection string
- `REDIS_URL`: Redis connection string
- `MAX_MEMORY_FACTS`: Maximum facts to store per user (default: 1000)
- `MEMORY_DECAY_DAYS`: Days before memories start decaying (default: 90)
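One way these settings might surface in code is a small loader with the documented defaults; the variable names come from the list above, while `load_settings` itself is an illustrative helper, not the project's actual configuration code.

```python
import os

def load_settings(env=os.environ):
    """Read configuration from the environment, falling back to the
    documented defaults for the optional knobs."""
    return {
        "openai_api_key": env.get("OPENAI_API_KEY", ""),
        "database_url": env.get("DATABASE_URL", ""),
        "redis_url": env.get("REDIS_URL", ""),
        "max_memory_facts": int(env.get("MAX_MEMORY_FACTS", 1000)),
        "memory_decay_days": int(env.get("MEMORY_DECAY_DAYS", 90)),
    }

settings = load_settings(env={})  # empty env -> documented defaults
```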
## Performance Notes

- Document chunks are embedded in batches for efficiency
- Embeddings are cached in memory to avoid recomputation
- Vector similarity search is optimized with pgvector indexes
- Facts are weighted and pruned to maintain relevance
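The in-memory embedding cache mentioned above amounts to memoizing the embedding call on the input text. A minimal sketch, using a stand-in for the OpenAI call:

```python
class CachingEmbedder:
    """Wrap an embedding function with an in-memory cache so repeated
    texts are only embedded (and billed) once."""

    def __init__(self, embed_fn):
        self.embed_fn = embed_fn
        self.cache = {}
        self.calls = 0  # counts actual embedding computations

    def embed(self, text):
        if text not in self.cache:
            self.calls += 1
            self.cache[text] = self.embed_fn(text)
        return self.cache[text]

def fake_embed(text):
    """Stand-in for the real OpenAI embedding call (illustrative only)."""
    return [float(len(text)), float(text.count(" "))]

embedder = CachingEmbedder(fake_embed)
v1 = embedder.embed("hello world")
v2 = embedder.embed("hello world")  # served from cache, no second computation
```

Since identical chunks and repeated queries are common, this avoids recomputing (and re-paying for) embeddings within a process's lifetime.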
## Roadmap

- Support for more document formats (Word, HTML, Markdown)
- Graph database integration for complex relationships
- Multi-modal memory (images, audio)
- Export/import memory snapshots
- Fine-tuned models for better fact extraction