A powerful tool that leverages Language Models to interact with Oracle Database metadata through natural language queries. Extract schema information, build vector embeddings, and query your database using plain English.
- Schema Extraction: Automatically extract and analyze Oracle database metadata
- Vector Embeddings: Convert database schema into searchable vector representations
- Natural Language Queries: Ask questions about your database in plain English
- SQL Generation: Automatically generate SQL queries based on natural language
- Context-Aware Responses: Get explanations and insights about your database structure
- Python 3.8+
- Oracle Database (18c or later)
- Oracle Instant Client (if not using thin mode)
-
Clone the repository:
git clone https://github.com/yourusername/oracle-db-metadata-llm.git cd oracle-db-metadata-llm -
Create and activate a virtual environment:
python -m venv venv source venv/bin/activate # On Windows: venv\Scripts\activate
-
Install dependencies:
pip install -r requirements.txt
-
Copy the example environment file and update with your credentials:
cp .env.example .env
-
Edit the
.envfile with your configuration:# Database Configuration DB_USER=your_username DB_PASSWORD=your_password DB_HOST=your_database_host DB_PORT=1521 DB_SERVICE=XE # LLM Configuration (at least one API key is required) # For OpenAI (GPT models) OPENAI_API_KEY=your_openai_api_key_here # For Google Gemini GEMINI_API_KEY=your_gemini_api_key_here # Optional settings # VECTOR_STORE_DIR=./vector_store # EMBEDDING_MODEL=sentence-transformers/all-MiniLM-L6-v2 # LOG_LEVEL=INFO
- OpenAI API Key: Get it from OpenAI API Keys
- Gemini API Key: Get it from Google AI Studio
python -m oracle_rag.main extract --schema YOUR_SCHEMApython -m oracle_rag.main build --schema YOUR_SCHEMApython -m oracle_rag.main query "Show me all tables with their column counts" --schema YOUR_SCHEMAoracle_rag/
├── config/ # Configuration settings
├── database/ # Database connection and metadata extraction
├── models/ # Data models and schemas
├── services/ # Core application services
└── main.py # Main application entry point
| Variable | Description | Default |
|---|---|---|
DB_USER |
Database username | - |
DB_PASSWORD |
Database password | - |
DB_HOST |
Database host | localhost |
DB_PORT |
Database port | 1521 |
DB_SERVICE |
Database service name | XE |
| Variable | Description | Default |
|---|---|---|
OPENAI_API_KEY |
OpenAI API key for GPT models | - |
GEMINI_API_KEY |
Google Gemini API key | - |
| Variable | Description | Default |
|---|---|---|
VECTOR_STORE_DIR |
Directory for vector store data | ./vector_store |
EMBEDDING_MODEL |
Model for text embeddings | sentence-transformers/all-MiniLM-L6-v2 |
LOG_LEVEL |
Logging level (DEBUG, INFO, WARNING, ERROR, CRITICAL) | INFO |
LLM_PROVIDER |
Default LLM provider (openai or gemini) | gemini (can be overridden in code) |
# List all tables in the schema
python -m oracle_rag.main query "List all tables in the schema" --schema HR
# Show table structure
python -m oracle_rag.main query "Describe the EMPLOYEES table" --schema HR
# Find tables with specific columns
python -m oracle_rag.main query "Which tables contain salary information?" --schema HR
# Get table relationships
python -m oracle_rag.main query "How are EMPLOYEES and DEPARTMENTS related?" --schema HR- Error: ORA-12541: TNS:no listener
- Ensure the database is running and accessible
- Verify the host and port in your
.envfile - Check if your Oracle client is properly configured
- Authentication errors
- Verify your API key is correctly set in the
.envfile - Check if you have sufficient credits/quota with the LLM provider
- Verify your API key is correctly set in the
- If you encounter memory errors with large schemas:
- Try increasing the chunk size in the configuration
- Filter the schema to include only necessary tables
- For large databases, consider extracting metadata during off-peak hours
- The vector store is persistent between sessions - you only need to rebuild it when the schema changes
- Use more specific queries to get faster and more accurate results
- Fork the repository
- Create a new branch (
git checkout -b feature/AmazingFeature) - Commit your changes (
git commit -m 'Add some AmazingFeature') - Push to the branch (
git push origin feature/AmazingFeature) - Open a Pull Request
- Built with LangChain
- Uses sentence-transformers for embeddings
- Inspired by the need for better database interaction tools