Financial data warehouse API built with FastAPI and Cassandra for ingesting, storing, and serving Nasdaq market data with temporal data management.

patrueduard03/nasdaq-cassandra-dw-fin-api

🏦 Financial Data Warehouse

A modern financial data warehouse built with FastAPI and Cassandra, featuring temporal database capabilities for complete audit trails and historical data tracking.

Academic Project - West University of Timisoara
Big Data Data Warehouse Course | Author: Patru Gheorghe Eduard | 2025

🚀 Features

  • 🕒 Temporal Database: Complete audit trail with versioned records (no data loss)
  • 📈 Multi-Asset Support: Stocks, bonds, currencies, derivatives
  • ⚡ Real-time Data: Automated Nasdaq data ingestion with coverage tracking
  • 🎨 Modern Web UI: Responsive interface with interactive charts
  • 🚀 High Performance: FastAPI + Cassandra scaling
  • 🔒 Type Safety: Full Pydantic validation and auto-generated docs
  • 📊 Smart Ingestion: Automatic data source filtering and coverage extension
  • 🔄 Data Refresh: Temporal versioning for data updates without loss

📁 Project Structure

├── src/
│   ├── api/          # FastAPI routes and models
│   ├── models/       # Data models and repositories
│   ├── services/     # Business logic
│   ├── utils/        # Database utilities
│   └── resources/    # Connection files
├── web/              # Frontend interface
├── requirements.txt  # Dependencies
└── setup_and_run.sh  # Automated setup

⚑ Quick Start

Prerequisites

🚀 Automated Setup

git clone https://github.com/patrueduard03/nasdaq-cassandra-dw-fin-api
cd nasdaq-cassandra-dw-fin-api
chmod +x setup_and_run.sh
./setup_and_run.sh

🔧 Manual Setup

git clone https://github.com/patrueduard03/nasdaq-cassandra-dw-fin-api
cd nasdaq-cassandra-dw-fin-api
python -m venv venv
source venv/bin/activate  # Windows: venv\Scripts\activate
pip install -r requirements.txt

⚙️ Configuration

1. Get Nasdaq API Key

2. Setup Cassandra Database

  • Create an account at DataStax Astra
  • Create a database with a keyspace named lectures
  • Download the secure connect bundle (ZIP) and the token file (JSON)
  • Place both files in src/resources/

3. Environment Variables

Create a .env file in the project root:

NASDAQ_DATA_LINK_API_KEY=your_api_key_here
SECURE_CONNECT_BUNDLE=src/resources/secure-connect-your-db.zip
SECURE_TOKEN=src/resources/your-db-token.json
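
Before initializing the database, it can help to confirm that all three variables are set and that the two file paths exist. The sketch below is one way to do that with only the standard library. The `parse_env` and `missing_settings` helpers are hypothetical and not part of the project, which may load its configuration differently.

```python
from pathlib import Path

# Variable names taken from the .env example above.
REQUIRED_KEYS = (
    "NASDAQ_DATA_LINK_API_KEY",
    "SECURE_CONNECT_BUNDLE",
    "SECURE_TOKEN",
)
# Keys whose values are expected to be paths to files on disk.
FILE_KEYS = ("SECURE_CONNECT_BUNDLE", "SECURE_TOKEN")


def parse_env(text: str) -> dict:
    """Parse simple KEY=VALUE lines, skipping blanks and # comments."""
    settings = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        settings[key.strip()] = value.strip().strip('"').strip("'")
    return settings


def missing_settings(settings: dict, check_files: bool = True) -> list:
    """Return human-readable problems; an empty list means the config looks usable."""
    problems = [f"{key} is not set" for key in REQUIRED_KEYS if not settings.get(key)]
    if check_files:
        for key in FILE_KEYS:
            path = settings.get(key)
            if path and not Path(path).is_file():
                problems.append(f"{key} points to a missing file: {path}")
    return problems


if __name__ == "__main__":
    env_path = Path(".env")
    text = env_path.read_text() if env_path.exists() else ""
    for problem in missing_settings(parse_env(text)):
        print(problem)
```

Run from the project root, the script prints nothing when the configuration looks complete.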

4. Initialize & Start

python src/utils/create_tables.py  # Create database tables
python src/main.py                 # Start application

🌐 Access Points

πŸ§ͺ Testing

API Testing

Database Utilities

python src/utils/create_tables.py      # Initialize tables
python src/utils/test_nasdaq_datalink.py  # Test API connection  
python src/utils/truncate_tables.py    # Clear data
python src/utils/drop_tables.py        # Drop all tables

Monitoring

tail -f logs/app.log        # Application logs
tail -f logs/ingestion.log  # Data ingestion logs

📋 API Endpoints

🏦 Assets

  • GET /assets - List active assets
  • POST /assets - Create asset
  • PUT /assets/{id} - Update asset (creates version)
  • DELETE /assets/{id} - Soft delete
  • GET /admin/assets/all - All versions + deleted

📊 Data Sources

  • GET /data-sources - List active sources
  • POST /data-sources - Create source
  • PUT /data-sources/{id} - Update source
  • DELETE /data-sources/{id} - Soft delete
  • POST /data-sources/{id}/resurrect - Restore deleted source
  • GET /admin/data-sources/all - All versions + deleted
  • GET /data-sources/provider/{provider} - Get by provider

📈 Time Series

  • GET /time-series/{asset_id}/{source_id} - Get data
  • GET /time-series/{asset_id}/{source_id}?start_date=2024-01-01&end_date=2024-12-31 - Date range
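
As an illustration of the date-range query, the following sketch builds the request URL with the standard library. The host `http://localhost:8000` and the IDs are placeholders, not values confirmed by the project, and `time_series_url` is a hypothetical helper.

```python
from datetime import date
from typing import Optional
from urllib.parse import urlencode


def time_series_url(base_url: str, asset_id, source_id,
                    start_date: Optional[date] = None,
                    end_date: Optional[date] = None) -> str:
    """Build a GET /time-series URL, adding the date filters only when given."""
    url = f"{base_url.rstrip('/')}/time-series/{asset_id}/{source_id}"
    params = {}
    if start_date is not None:
        params["start_date"] = start_date.isoformat()  # YYYY-MM-DD
    if end_date is not None:
        params["end_date"] = end_date.isoformat()
    return f"{url}?{urlencode(params)}" if params else url


# Placeholder host and IDs; adjust to your deployment.
print(time_series_url("http://localhost:8000", 1, 1,
                      date(2024, 1, 1), date(2024, 12, 31)))
```

The resulting URL can be passed to any HTTP client, or pasted into a browser while the application is running.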

📥 Data Ingestion

  • POST /ingest/nasdaq - Import Nasdaq data
  • POST /ingest/nasdaq/refresh - Refresh existing data with temporal versioning
  • GET /ingest/status - Get ingestion status for assets
  • GET /ingest/availability/{asset_id}/{data_source_id} - Check data availability
  • GET /ingest/compatible-data-sources/{asset_id} - Get compatible data sources
  • GET /ingest/progress/{session_id} - Get ingestion progress by session
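
How the service tracks coverage internally is not described in this README. As a rough illustration of the idea of extending coverage, the sketch below works out which date ranges would still need fetching when a requested range only partly overlaps the data already stored. It assumes coverage is a single contiguous, inclusive date range per asset and data source; `missing_ranges` is a hypothetical helper, not the project's code.

```python
from datetime import date, timedelta
from typing import List, Optional, Tuple

DateRange = Tuple[date, date]  # inclusive (start, end)


def missing_ranges(requested: DateRange,
                   covered: Optional[DateRange]) -> List[DateRange]:
    """Return the parts of `requested` that lie outside `covered`."""
    req_start, req_end = requested
    if req_start > req_end:
        raise ValueError("requested range is empty")
    if covered is None:
        return [requested]            # nothing stored yet: fetch everything
    cov_start, cov_end = covered
    if req_end < cov_start or req_start > cov_end:
        # No overlap: fetch the request as-is. A real system would also have
        # to decide what to do about the hole left between the two ranges.
        return [requested]
    one_day = timedelta(days=1)
    gaps = []
    if req_start < cov_start:         # extend coverage backwards
        gaps.append((req_start, cov_start - one_day))
    if req_end > cov_end:             # extend coverage forwards
        gaps.append((cov_end + one_day, req_end))
    return gaps


# Stored: March to June 2024. Requested: all of 2024.
for start, end in missing_ranges((date(2024, 1, 1), date(2024, 12, 31)),
                                 (date(2024, 3, 1), date(2024, 6, 30))):
    print(start.isoformat(), "to", end.isoformat())
```

Fetching only the gaps avoids re-downloading rows that are already stored.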

🛠️ Utilities

  • GET / - Health check
  • GET /health - Detailed health check with database connectivity
  • GET /docs - API documentation
  • GET /web/ - Web interface
  • WS /ws - WebSocket endpoint for real-time progress updates

🏛️ Temporal Database

This system implements a temporal database, characterized by the principles and fields below.

Core Principles

  • ❌ No Data Loss: Changes create new versions, never overwrite
  • πŸ“… Complete History: Track who, what, when, and why
  • πŸ•’ Point-in-Time: Query data as it existed at any date
  • πŸ—‘οΈ Soft Deletion: Mark deleted but preserve records
  • πŸ”„ Version Control: Multiple versions of each entity

Key Fields

  • valid_from/valid_to - Business time (real-world validity)
  • system_date - System time (record creation)
  • is_deleted - Soft deletion flag
  • 9999-12-31 - Far-future date for current records
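
To make these fields concrete, here is a minimal in-memory sketch of how versioned updates, soft deletion, and point-in-time lookups could fit together. It is an illustration only: the `AssetVersion` class, the helper functions, and the sample names are hypothetical, dates are assumed to be plain calendar dates, and the project's actual Cassandra schema and repository code may differ.

```python
from dataclasses import dataclass, replace
from datetime import date, datetime, timezone
from typing import List, Optional

FAR_FUTURE = date(9999, 12, 31)  # marks the current (open-ended) version


@dataclass(frozen=True)
class AssetVersion:
    asset_id: int
    name: str
    valid_from: date        # business time: start of validity
    valid_to: date          # business time: end of validity (exclusive here)
    system_date: datetime   # system time: when this row was written
    is_deleted: bool = False


def current(history: List[AssetVersion], asset_id: int) -> Optional[AssetVersion]:
    """Return the open-ended version of an asset, if any."""
    for v in history:
        if v.asset_id == asset_id and v.valid_to == FAR_FUTURE:
            return v
    return None


def update(history: List[AssetVersion], asset_id: int, name: str,
           effective: date, is_deleted: bool = False) -> List[AssetVersion]:
    """Close the open version at `effective` and append a new open-ended one."""
    result = []
    for v in history:
        if v.asset_id == asset_id and v.valid_to == FAR_FUTURE:
            v = replace(v, valid_to=effective)  # old row is kept, only closed
        result.append(v)
    result.append(AssetVersion(asset_id, name, effective, FAR_FUTURE,
                               datetime.now(timezone.utc), is_deleted))
    return result


def soft_delete(history: List[AssetVersion], asset_id: int,
                effective: date) -> List[AssetVersion]:
    """Record a deletion as one more version instead of removing rows."""
    live = current(history, asset_id)
    if live is None:
        raise KeyError(asset_id)
    return update(history, asset_id, live.name, effective, is_deleted=True)


def as_of(history: List[AssetVersion], asset_id: int,
          when: date) -> Optional[AssetVersion]:
    """Point-in-time lookup: the version with valid_from <= when < valid_to."""
    for v in history:
        if v.asset_id == asset_id and v.valid_from <= when < v.valid_to:
            return None if v.is_deleted else v
    return None


# Sample data for illustration only.
history = update([], 1, "ACME", date(2020, 1, 1))
history = update(history, 1, "ACME Corp", date(2023, 6, 1))
history = soft_delete(history, 1, date(2025, 1, 1))

print(as_of(history, 1, date(2021, 5, 5)).name)  # version valid in 2021
print(as_of(history, 1, date(2024, 1, 1)).name)  # version valid in 2024
print(as_of(history, 1, date(2025, 2, 1)))       # None: soft-deleted by then
print(len(history))                              # every version is still stored
```

The deletion is stored as a third version with `is_deleted` set, so earlier states remain queryable.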

🛠️ Tech Stack

  • Backend: FastAPI, Python 3.9+
  • Database: Apache Cassandra (DataStax Astra)
  • Frontend: HTML5, CSS3, JavaScript, Bootstrap 5, Chart.js
  • Data Source: Nasdaq Data Link API

Key Libraries

  • fastapi - Modern web framework
  • cassandra-driver - Database connectivity
  • pydantic - Data validation
  • nasdaq-data-link - Financial data API
  • uvicorn - ASGI server
