GitHub - kshi-glitch/BigDataOps_Lab: This project demonstrates real-world big data engineering practices using Apache Spark (PySpark). It covers the entire data pipeline — from ingestion, transformation, and validation to exploration and reporting. Ideal for data engineers and analysts looking to gain practical experience with Spark, Airflow, and data lake design.

kshi-glitch / BigDataOps_Lab Public

Name
data
hive_scripts
introduction_to_hadoop
nyse_data_loader
pyspark_examples
shell_scripts
logging_basics.py