Agatha is an NL2SQL-driven expense tracker API that uses Llama 3.3 (served by Groq) for fast data extraction and text-to-SQL conversion. It lets users either add transaction data to a SQL database or ask questions about their expenses, all in natural language. Agatha automatically determines whether an input is intended for ingestion (adding a transaction) or QnA (querying transactions), so users never have to specify the intent themselves.
- Overview
- Features
- API Workflow
- Privacy & Security Considerations
- Endpoints
- Installation
- Example Usage with Postman
- Dependencies
- License
## Overview

Agatha uses three core pipelines to handle user inputs:

- **Ingestion Pipeline:**
  Extracts transaction details (date, amount, vendor, description) from natural language input and saves them into an SQLite database.
- **QnA Pipeline:**
  Converts natural language questions into SQL queries (via a text-to-SQL chain), executes these queries against the transactions database, and returns answers based on the retrieved data.
- **Classification Pipeline:**
  Automatically classifies a natural language input as either "ingestion" or "chat". Based on the classification, the API routes the input to the appropriate pipeline without requiring the user to specify the intent explicitly.
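The classify-then-route step can be sketched as follows. This is a minimal illustration, not Agatha's implementation: in Agatha the classifier and both pipelines are LLM calls (Llama 3.3 via Groq), whereas the function bodies below are hypothetical keyword heuristics and placeholders.

```python
def classify(text: str) -> str:
    """Stand-in classifier: returns "chat" for questions, else "ingestion".
    The real pipeline asks the LLM; this heuristic is illustrative only."""
    question_starts = ("what", "how", "when", "where", "which", "why", "did")
    lowered = text.lower()
    return "chat" if "?" in lowered or lowered.startswith(question_starts) else "ingestion"

def extract_transaction(text: str) -> dict:
    """Placeholder for the LLM extraction prompt."""
    return {"raw_text": text}

def answer_question(question: str) -> str:
    """Placeholder for the text-to-SQL chain."""
    return "(answer produced from a generated SQL query)"

def process(text: str) -> dict:
    """Route the input to the appropriate pipeline, as /process does."""
    if classify(text) == "ingestion":
        return {"input_type": "ingestion", "transaction": extract_transaction(text)}
    return {"input_type": "chat", "answer": answer_question(text)}
```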
## Features

- **Natural Language Ingestion:**
  Easily add expenses by describing them in plain language.
- **Natural Language Querying:**
  Ask questions about your spending (e.g., "What is my total spending?") and get answers generated by converting your question into a SQL query.
- **Automatic Query Classification:**
  The API intelligently distinguishes between ingestion and query requests to provide a seamless experience.
- **SQLite Database Integration:**
  All transactions are stored in an SQLite database that is automatically created and maintained.
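A plausible shape for that table, sketched with the standard-library `sqlite3` module. The column set mirrors the four fields the ingestion pipeline extracts; the exact schema Agatha creates is defined in the project source and may differ.

```python
import sqlite3

# Illustrative schema only; Agatha persists to transactions.db on disk,
# while this sketch uses an in-memory database.
conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE IF NOT EXISTS transactions (
        id          INTEGER PRIMARY KEY AUTOINCREMENT,
        date        TEXT NOT NULL,
        amount      REAL NOT NULL,
        vendor      TEXT,
        description TEXT
    )
    """
)
conn.execute(
    "INSERT INTO transactions (date, amount, vendor, description) VALUES (?, ?, ?, ?)",
    ("2024-01-12", 50.0, "Walmart", "groceries"),
)
total = conn.execute("SELECT SUM(amount) FROM transactions").fetchone()[0]
```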
## API Workflow

1. **User Input:**
   A natural language input is sent to the API.
2. **Classification:**
   The input is first classified as either ingestion or QnA using the classification pipeline. This decision is based on the semantics of the input.
3. **Routing:**
   - If the input is for ingestion, Agatha extracts the transaction details (using an extraction prompt) and saves them into the SQLite database.
   - If the input is a question, Agatha converts it to a SQL query (using a text-to-SQL chain), executes the query on the database, and rephrases the result before returning it.
4. **Response:**
   Agatha returns a JSON response containing either the ingested transaction details or the answer to the query.
## Privacy & Security Considerations

- **Local Data Storage:**
  Agatha is designed as a self-hosted solution. When deployed, all sensitive transaction data is stored locally in an SQLite database, so your personal financial information remains private and under your control.
- **User-Controlled Deployment:**
  Since you deploy Agatha on your own server or machine (via Docker or a local setup), you are fully responsible for securing your environment. No data is sent to external servers by default.
- **Secure Handling of API Keys:**
  API keys and other sensitive environment variables should be managed securely. We recommend using environment variables, `.env` files (which should not be committed to source control), or Docker secrets for production deployments.
- **Best Practices for Public Deployment:**
  If you choose to deploy Agatha publicly, implement proper authentication, use HTTPS to secure communications, and follow best practices for securing your database (such as using parameterized queries to prevent SQL injection).
- **Data Privacy:**
  Ensure that only authorized users have access to Agatha, and consider additional layers of encryption or access control if handling highly sensitive financial data.
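To make the parameterized-query point concrete, here is a small standalone sketch (not Agatha's code) showing why placeholder binding defeats injection: the malicious string is bound as a plain value, while string-formatting it into the SQL text would widen the `WHERE` clause to match every row.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE transactions (vendor TEXT, amount REAL)")
conn.execute("INSERT INTO transactions VALUES ('Walmart', 50.0)")

# A classic injection payload supplied as a "vendor name":
vendor = "x' OR '1'='1"

# Unsafe: interpolating user input into the SQL text lets the payload
# rewrite the query, so every row matches.
unsafe = conn.execute(
    f"SELECT amount FROM transactions WHERE vendor = '{vendor}'"
).fetchall()

# Safe: the ? placeholder binds the value as data, never as SQL, so the
# injection string matches no row.
safe = conn.execute(
    "SELECT amount FROM transactions WHERE vendor = ?", (vendor,)
).fetchall()
```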
## Endpoints

### Health Check

- **URL:** `/health`
- **Method:** `GET`
- **Description:** Returns the health status of the API.
- **Example Response:**

  ```json
  { "status": "healthy", "version": "1.0.0" }
  ```

### Database Status

- **URL:** `/db_status`
- **Method:** `GET`
- **Description:** Provides the current status of the SQLite database, including the names of tables and the table count.
- **Example Response:**

  ```json
  { "database": "transactions.db", "tables": ["transactions"], "table_count": 1 }
  ```

### Classify

- **URL:** `/classify`
- **Method:** `POST`
- **Description:** Classifies a given natural language input as either "ingestion" or "chat".
- **Request Body:**

  ```json
  { "text": "I spent 500 INR at Amazon on electronics." }
  ```

- **Example Response:**

  ```json
  { "input_type": "ingestion" }
  ```

### Ingest

- **URL:** `/ingest`
- **Method:** `POST`
- **Description:** Extracts transaction details from natural language input and stores the transaction in the database.
- **Request Body:**

  ```json
  { "text": "I bought groceries from Walmart for $50 on January 12th." }
  ```

- **Example Response:**

  ```json
  { "status": "success", "transaction": { "date": "2024-01-12", "amount": 50.0, "vendor": "Walmart", "description": "groceries" } }
  ```

### Ask

- **URL:** `/ask`
- **Method:** `POST`
- **Description:** Converts a natural language question into a SQL query, executes it, and returns the answer.
- **Request Body:**

  ```json
  { "question": "What is my total spending?" }
  ```

- **Example Response:**

  ```json
  { "question": "What is my total spending?", "answer": "Your total spending is $500." }
  ```

### Process

- **URL:** `/process`
- **Method:** `POST`
- **Description:** Automatically classifies the natural language input as either ingestion or QnA and routes it to the appropriate pipeline.
- **Request Body:**

  ```json
  { "text": "How much did I spend at Starbucks last month?" }
  ```

- **Example Response (if classified as chat):**

  ```json
  { "input_type": "chat", "question": "How much did I spend at Starbucks last month?", "answer": "$30 at Starbucks last month." }
  ```

- **Example Response (if classified as ingestion):**

  ```json
  { "input_type": "ingestion", "transaction": { "date": "2024-02-15", "amount": 30.0, "vendor": "Starbucks", "description": "coffee and snacks" } }
  ```
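These endpoints can be called from any HTTP client. As a standard-library-only sketch (assuming the default local deployment at `http://localhost:8000`), a JSON POST to `/process` can be built like this:

```python
import json
from urllib import request

def build_request(path: str, payload: dict,
                  base_url: str = "http://localhost:8000") -> request.Request:
    """Build a JSON POST request for an Agatha endpoint."""
    return request.Request(
        base_url + path,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )

req = build_request("/process", {"text": "How much did I spend at Starbucks last month?"})
# With the API running: response = json.load(request.urlopen(req))
```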
## Installation

### Local Setup

1. **Clone the Repository:**

   ```bash
   git clone <repository_url>
   cd agatha
   ```

2. **Create a Virtual Environment:**

   ```bash
   python3 -m venv venv
   source venv/bin/activate  # On Windows use `venv\Scripts\activate`
   ```

3. **Install Dependencies:**

   ```bash
   pip install -r requirements.txt
   ```

4. **Environment Variables:**
   Create a `.env` file in the project root and set your environment variables:

   ```
   GROQ_API_KEY=your_groq_api_key
   ```

5. **Run the API:**

   ```bash
   uvicorn main:app --host 0.0.0.0 --port 8000 --reload
   ```

6. **Access the API:** Open your browser or API client at http://localhost:8000.

### Docker Setup

1. **Build the Docker Image:**

   ```bash
   docker build -t agatha .
   ```

2. **Run the Docker Container:**

   ```bash
   docker run -p 8000:8000 agatha
   ```

3. **Access the API:** The API will be available at http://localhost:8000.

Note: For secure deployment, ensure you pass environment variables (or secrets) to your container. For example:

```bash
docker run -p 8000:8000 --env-file .env agatha
```
## Example Usage with Postman

1. **Start the API** (locally or via Docker).
2. **Open Postman** and create a new request.
3. **Set Request Type & URL:**
   For example, to test the health check:
   - Method: `GET`
   - URL: `http://localhost:8000/health`
4. **For POST Endpoints:**
   - Select the **Body** tab, choose **raw**, and set the type to **JSON**.
   - Input the JSON payload as described in the endpoint documentation above.
5. **Send Request:**
   - Click **Send** and verify the JSON response matches the expected output.
## Dependencies

The project requires the following Python packages (as listed in `requirements.txt`):

```
fastapi
uvicorn
pydantic
python-dotenv
langchain
langchain-groq
langchain-community
langchain-core
```

Note: Replace `<version>` with the appropriate version numbers for your `langchain`-related packages.
## License

This project is licensed under the MIT License.
Agatha is a fully automated, natural language-based expense tracker API. It seamlessly handles both transaction ingestion and querying through intelligent classification, data extraction, and a text-to-SQL pipeline. Whether you're adding your expenses or querying your spending habits, Agatha streamlines the process to provide quick, accurate responses while keeping your sensitive financial data private.
**Privacy Reminder:** Agatha is intended for self-hosted use. When deployed on your own secure server or local machine, you remain in control of your data. For additional security, always use proper authentication, HTTPS, and environment management practices when handling sensitive information.
For any questions or issues, please open an issue on the repository.