This project demonstrates real-world big data engineering practices using Apache Spark (PySpark). It covers the entire data pipeline, from ingestion, transformation, and validation through exploration and reporting. It is aimed at data engineers and analysts who want practical experience with Spark, Airflow, and data lake design.

kshi-glitch/BigDataOps_Lab
