A Streamlit-based chatbot that allows users to query financial data using natural language. The chatbot leverages Google Gemini Flash 2.5 to generate pandas code from user queries and then executes that code to retrieve and analyze financial data.
- Natural Language Queries: Ask financial questions in plain English
- Data Visualization: Automatic chart generation for numerical data
- Code Transparency: View the generated pandas code for each query
- Export Functionality: Download chat history for future reference
- Example Queries: Get started quickly with sample questions
- Data Overview: Sidebar with information about available companies, countries, and accounts
graph TD
A[User Input] --> B[Schema Injection]
B --> C[Gemini Query Generation]
C --> D[Safe Code Execution]
D --> E[Result Processing]
E --> F[Gemini Response Formatting]
F --> G[User Response]
- User Input: "What is the revenue growth of Q2 vs LY for Falabella retail Chile?"
- Schema Injection: Load relevant schema docs into prompt
- Query Generation: Gemini generates pandas code
- Code Execution: Execute generated code safely
- Result Processing: Send raw results back to Gemini
- Response Generation: Gemini creates natural language response
financial-chatbot/
├── app.py # Main Streamlit app
├── requirements.txt # Dependencies
├── .env.example # Template for API key
├── .gitignore # Exclude .env
├── README.md # Project documentation
├── data/ # Excel files
│ └── P&L_ChatBot.xlsx # Financial data
├── schema/ # Schema documentation
│ └── P&L.md # Profit & Loss schema
└── utils/
├── data_loader.py # Excel loading logic
├── query_generator.py # LLM query generation
└── response_formatter.py # Response formatting
- Python 3.8 or higher
- Google Gemini API key
-
Clone the repository:
git clone https://github.com/felipemedlev/Financial-Data-AI-Chatbot cd Financial-Data-AI-Chatbot -
Install dependencies locally (required only for local runs):
pip install -r requirements.txt
-
Set up environment variables:
cp .env.example .env
Then edit
.envand add your Google API key:GOOGLE_API_KEY=your_actual_api_key_here
-
Clone the repository:
git clone https://github.com/felipemedlev/Financial-Data-AI-Chatbot cd Financial-Data-AI-Chatbot -
Set up environment variables:
cp .env.example .env
Then edit
.envand add your Google API key:GOOGLE_API_KEY=your_actual_api_key_here
Note: You do NOT need to run
pip installif you are using Docker. All dependencies are installed inside the container during the build process.
streamlit run app.pyThe app will open in your default browser at http://localhost:8501.
-
Build the Docker image:
docker build -t financial-chatbot . -
Run the Docker container:
docker run -p 8501:8501 --env-file .env financial-chatbot
- This maps port 8501 in the container to your local machine.
- The
--env-file .envflag loads your environment variables (API keys, etc).
-
Access the app: Open your browser and go to http://localhost:8501
- Load all Excel files at startup with
@st.cache_data - Create a unified DataFrame with month/quarter columns
- Index by company, business unit, geography for fast filtering
You are a financial data analyst. Given this schema and user question, generate pandas code.
SCHEMA:
{schema_docs}
AVAILABLE DATA:
- Companies: {company_list}
- Date Range: {date_range}
- Business Units: {business_units}
USER QUESTION: {user_question}
Generate only valid pandas code that answers the question. Return code between ```python and ```.
- Use
astmodule to validate generated code - Whitelist allowed operations (no file I/O, imports, etc.)
- Catch and handle execution errors gracefully
- Chat-like interface with message history
- Show generated pandas code (collapsible)
- Display results as tables and charts
- Export functionality for results
- Data overview (available companies, date ranges)
- Schema browser
- Query examples
- Settings (temperature, model parameters)
from google import genai
client = genai.Client(api_key=api_key)
response = client.model.generate_content(
model='gemini-2.5-flash',
contents=prompt,
generation_config={'temperature': temperature}
)- Query Generation: Generate pandas code
- Response Formatting: Convert results to natural language
- Push to GitHub (with
.envin.gitignore) - Go to share.streamlit.io
- Connect repository
- Add secrets in dashboard
- Deploy
- "What is the total revenue for Falabella Retail in 2023?"
- "Compare expenses between Chile and Peru for Sodimac"
- "Show me the profit trend for Tottus over the last 3 years"
- "What is the revenue growth of Q2 vs LY for Falabella retail Chile?"
- "Show me the operating margin for all companies in USD"
- API keys are stored in environment variables
- Generated code is validated using the
astmodule - Only safe pandas operations are allowed
- No file I/O or system commands are permitted
