- Java: Ensure you have Java 8 or Java 11 installed.
- Scala: Install Scala any version.
- SBT (Scala Build Tool): Install SBT for building and running the project. This tool come with scala install by default. If not, please install it
A python virtual environment is recommended, please run
python -m venv .venv
source .venv/bin/activate
Install requirements
pip install -r requirements.txt
- Start Services
make up
- Extract – Simulate BTC Price Stream
make extract
make consume # Check data on topic: btc-price
- Transform 1 – Moving Average & Std. Deviation
make transform1
make consume TOPIC="btc-price-moving"
- Transform 2 – Compute Z-Score
make transform2
make consume TOPIC="btc-price-zscore"
- Load
make load
- Bonus
make bonus
- Stop Services (without removing data)
make stop
- Resume Services Start services again after stopping:
make start
- Tear Down & Clean Up
make down
After running the load step, check data in MongoDB:
docker exec -it mongodb mongosh
> use crypto
> show collections
> db["btc-price-zscore-5m"].find().limit(5)
- If you have problem running the Extract script (Kafka not found) please wait patiently for the Kafka to startup
- If you have problem "could not find checkpoint..." while running Spark, please run
make down
to clean up,make up
to start the containers and run the script again