Agatha is an NL2SQL-driven expense tracker API that uses Llama 3.3 (served by Groq) for fast data extraction and text-to-SQL conversion. It lets users either add transaction data to a SQL database or ask questions about their expenses, all in natural language. Agatha automatically determines whether an input is intended for ingestion (adding a transaction) or QnA (querying transactions), so users never have to specify the intent themselves.
- Overview
- Features
- API Workflow
- Privacy & Security Considerations
- Endpoints
- Installation
- Example Usage with Postman
- Dependencies
- License
## Overview

Agatha uses three core pipelines to handle user inputs:

- **Ingestion Pipeline:**
  Extracts transaction details (date, amount, vendor, description) from natural language input and saves them into an SQLite database.
- **QnA Pipeline:**
  Converts natural language questions into SQL queries (via a text-to-SQL chain), executes these queries against the transactions database, and returns answers based on the retrieved data.
- **Classification Pipeline:**
  Automatically classifies a natural language input as either "ingestion" or "chat". Based on the classification, the API routes the input to the appropriate pipeline without requiring the user to specify the intent explicitly.
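The classify-then-route step can be sketched as follows. This is a minimal illustration, not Agatha's implementation: in Agatha the classifier and both pipelines are LLM calls (Llama 3.3 via Groq), whereas the function bodies below are hypothetical keyword heuristics and placeholders.

```python
def classify(text: str) -> str:
    """Stand-in classifier: returns "chat" for questions, else "ingestion".
    The real pipeline asks the LLM; this heuristic is illustrative only."""
    question_starts = ("what", "how", "when", "where", "which", "why", "did")
    lowered = text.lower()
    return "chat" if "?" in lowered or lowered.startswith(question_starts) else "ingestion"

def extract_transaction(text: str) -> dict:
    """Placeholder for the LLM extraction prompt."""
    return {"raw_text": text}

def answer_question(question: str) -> str:
    """Placeholder for the text-to-SQL chain."""
    return "(answer produced from a generated SQL query)"

def process(text: str) -> dict:
    """Route the input to the appropriate pipeline, as /process does."""
    if classify(text) == "ingestion":
        return {"input_type": "ingestion", "transaction": extract_transaction(text)}
    return {"input_type": "chat", "answer": answer_question(text)}
```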
## Features

- **Natural Language Ingestion:**
  Easily add expenses by describing them in plain language.
- **Natural Language Querying:**
  Ask questions about your spending (e.g., "What is my total spending?") and get answers generated by converting your question into a SQL query.
- **Automatic Query Classification:**
  The API intelligently distinguishes between ingestion and query requests to provide a seamless experience.
- **SQLite Database Integration:**
  All transactions are stored in an SQLite database that is automatically created and maintained.
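A plausible shape for that table, sketched with the standard-library `sqlite3` module. The column set mirrors the four fields the ingestion pipeline extracts; the exact schema Agatha creates is defined in the project source and may differ.

```python
import sqlite3

# Illustrative schema only; Agatha persists to transactions.db on disk,
# while this sketch uses an in-memory database.
conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE IF NOT EXISTS transactions (
        id          INTEGER PRIMARY KEY AUTOINCREMENT,
        date        TEXT NOT NULL,
        amount      REAL NOT NULL,
        vendor      TEXT,
        description TEXT
    )
    """
)
conn.execute(
    "INSERT INTO transactions (date, amount, vendor, description) VALUES (?, ?, ?, ?)",
    ("2024-01-12", 50.0, "Walmart", "groceries"),
)
total = conn.execute("SELECT SUM(amount) FROM transactions").fetchone()[0]
```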
## API Workflow

1. **User Input:**
   A natural language input is sent to the API.
2. **Classification:**
   The input is first classified as either ingestion or QnA using the classification pipeline. This decision is based on the semantics of the input.
3. **Routing:**
   - If the input is for ingestion, Agatha extracts the transaction details (using an extraction prompt) and saves them into the SQLite database.
   - If the input is a question, Agatha converts it to a SQL query (using a text-to-SQL chain), executes the query on the database, and rephrases the result before returning it.
4. **Response:**
   Agatha returns a JSON response containing either the ingested transaction details or the answer to the query.
## Privacy & Security Considerations

- **Local Data Storage:**
  Agatha is designed as a self-hosted solution. When deployed, all sensitive transaction data is stored locally in an SQLite database, so your personal financial information remains private and under your control.
- **User-Controlled Deployment:**
  Since you deploy Agatha on your own server or machine (via Docker or a local setup), you are fully responsible for securing your environment. No data is sent to external servers by default.
- **Secure Handling of API Keys:**
  API keys and other sensitive environment variables should be managed securely. We recommend using environment variables, `.env` files (which should not be committed to source control), or Docker secrets for production deployments.
- **Best Practices for Public Deployment:**
  If you choose to deploy Agatha publicly, implement proper authentication, use HTTPS to secure communications, and follow best practices for securing your database (such as using parameterized queries to prevent SQL injection).
- **Data Privacy:**
  Ensure that only authorized users have access to Agatha, and consider additional layers of encryption or access control if handling highly sensitive financial data.
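To make the parameterized-query point concrete, here is a small standalone sketch (not Agatha's code) showing why placeholder binding defeats injection: the malicious string is bound as a plain value, while string-formatting it into the SQL text would widen the `WHERE` clause to match every row.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE transactions (vendor TEXT, amount REAL)")
conn.execute("INSERT INTO transactions VALUES ('Walmart', 50.0)")

# A classic injection payload supplied as a "vendor name":
vendor = "x' OR '1'='1"

# Unsafe: interpolating user input into the SQL text lets the payload
# rewrite the query, so every row matches.
unsafe = conn.execute(
    f"SELECT amount FROM transactions WHERE vendor = '{vendor}'"
).fetchall()

# Safe: the ? placeholder binds the value as data, never as SQL, so the
# injection string matches no row.
safe = conn.execute(
    "SELECT amount FROM transactions WHERE vendor = ?", (vendor,)
).fetchall()
```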
## Endpoints

### Health Check

- **URL:** `/health`
- **Method:** `GET`
- **Description:** Returns the health status of the API.
- **Example Response:**

  ```json
  { "status": "healthy", "version": "1.0.0" }
  ```

### Database Status

- **URL:** `/db_status`
- **Method:** `GET`
- **Description:** Provides the current status of the SQLite database, including the names of tables and the table count.
- **Example Response:**

  ```json
  { "database": "transactions.db", "tables": ["transactions"], "table_count": 1 }
  ```

### Classify

- **URL:** `/classify`
- **Method:** `POST`
- **Description:** Classifies a given natural language input as either "ingestion" or "chat".
- **Request Body:**

  ```json
  { "text": "I spent 500 INR at Amazon on electronics." }
  ```

- **Example Response:**

  ```json
  { "input_type": "ingestion" }
  ```

### Ingest

- **URL:** `/ingest`
- **Method:** `POST`
- **Description:** Extracts transaction details from natural language input and stores the transaction in the database.
- **Request Body:**

  ```json
  { "text": "I bought groceries from Walmart for $50 on January 12th." }
  ```

- **Example Response:**

  ```json
  { "status": "success", "transaction": { "date": "2024-01-12", "amount": 50.0, "vendor": "Walmart", "description": "groceries" } }
  ```

### Ask

- **URL:** `/ask`
- **Method:** `POST`
- **Description:** Converts a natural language question into a SQL query, executes it, and returns the answer.
- **Request Body:**

  ```json
  { "question": "What is my total spending?" }
  ```

- **Example Response:**

  ```json
  { "question": "What is my total spending?", "answer": "Your total spending is $500." }
  ```

### Process

- **URL:** `/process`
- **Method:** `POST`
- **Description:** Automatically classifies the natural language input as either ingestion or QnA and routes it to the appropriate pipeline.
- **Request Body:**

  ```json
  { "text": "How much did I spend at Starbucks last month?" }
  ```

- **Example Response (if classified as chat):**

  ```json
  { "input_type": "chat", "question": "How much did I spend at Starbucks last month?", "answer": "$30 at Starbucks last month." }
  ```

- **Example Response (if classified as ingestion):**

  ```json
  { "input_type": "ingestion", "transaction": { "date": "2024-02-15", "amount": 30.0, "vendor": "Starbucks", "description": "coffee and snacks" } }
  ```
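These endpoints can be called from any HTTP client. As a standard-library-only sketch (assuming the default local deployment at `http://localhost:8000`), a JSON POST to `/process` can be built like this:

```python
import json
from urllib import request

def build_request(path: str, payload: dict,
                  base_url: str = "http://localhost:8000") -> request.Request:
    """Build a JSON POST request for an Agatha endpoint."""
    return request.Request(
        base_url + path,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )

req = build_request("/process", {"text": "How much did I spend at Starbucks last month?"})
# With the API running: response = json.load(request.urlopen(req))
```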
## Installation

### Local Setup

1. **Clone the Repository:**

   ```bash
   git clone <repository_url>
   cd agatha
   ```

2. **Create a Virtual Environment:**

   ```bash
   python3 -m venv venv
   source venv/bin/activate  # On Windows use `venv\Scripts\activate`
   ```

3. **Install Dependencies:**

   ```bash
   pip install -r requirements.txt
   ```

4. **Environment Variables:**
   Create a `.env` file in the project root and set your environment variables:

   ```
   GROQ_API_KEY=your_groq_api_key
   ```

5. **Run the API:**

   ```bash
   uvicorn main:app --host 0.0.0.0 --port 8000 --reload
   ```

6. **Access the API:** Open your browser or API client at http://localhost:8000.

### Docker Setup

1. **Build the Docker Image:**

   ```bash
   docker build -t agatha .
   ```

2. **Run the Docker Container:**

   ```bash
   docker run -p 8000:8000 agatha
   ```

3. **Access the API:** The API will be available at http://localhost:8000.

Note: For secure deployment, ensure you pass environment variables (or secrets) to your container. For example:

```bash
docker run -p 8000:8000 --env-file .env agatha
```
## Example Usage with Postman

1. **Start the API** (locally or via Docker).
2. **Open Postman** and create a new request.
3. **Set Request Type & URL:**
   For example, to test the health check:
   - Method: `GET`
   - URL: `http://localhost:8000/health`
4. **For POST Endpoints:**
   - Select the **Body** tab, choose **raw**, and set the type to **JSON**.
   - Input the JSON payload as described in the endpoint documentation above.
5. **Send Request:**
   - Click **Send** and verify the JSON response matches the expected output.
## Dependencies

The project requires the following Python packages (as listed in `requirements.txt`):

```
fastapi
uvicorn
pydantic
python-dotenv
langchain
langchain-groq
langchain-community
langchain-core
```

Note: Replace `<version>` with the appropriate version numbers for your `langchain`-related packages.
## License

This project is licensed under the MIT License.
Agatha is a fully automated, natural language-based expense tracker API. It seamlessly handles both transaction ingestion and querying through intelligent classification, data extraction, and a text-to-SQL pipeline. Whether you're adding your expenses or querying your spending habits, Agatha streamlines the process to provide quick, accurate responses while keeping your sensitive financial data private.
**Privacy Reminder:** Agatha is intended for self-hosted use. When deployed on your own secure server or local machine, you remain in control of your data. For additional security, always use proper authentication, HTTPS, and environment management practices when handling sensitive information.
For any questions or issues, please open an issue on the repository.