ManoharVit/MoviETL-Data-Pipeline

MoviETL-Data-Pipeline

An ETL pipeline that processes 26 million user ratings covering about 45,000 movies. The pipeline can ingest data into AWS Redshift at up to 10,000 records per minute. It applies a standardized data model and runs automated data quality checks in Airflow, contributing to a 97% success rate for regular ETL cycles.
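
The repository's actual check code is not shown here, so the following is only a minimal sketch of what an automated data quality check of this kind might look like. The table name `ratings`, its columns, the specific checks, and all function names are assumptions made for illustration. An in-memory SQLite database stands in for Redshift so the example runs on its own.

```python
import sqlite3

# Hypothetical checks: (SQL returning a single value, predicate on that value, label).
QUALITY_CHECKS = [
    ("SELECT COUNT(*) FROM ratings", lambda n: n > 0, "ratings is not empty"),
    (
        "SELECT COUNT(*) FROM ratings WHERE user_id IS NULL",
        lambda n: n == 0,
        "ratings.user_id has no NULLs",
    ),
]


def build_sample_db(rows):
    """Create an in-memory stand-in for the warehouse with a hypothetical ratings table."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE ratings (user_id INTEGER, movie_id INTEGER, rating REAL)")
    conn.executemany("INSERT INTO ratings VALUES (?, ?, ?)", rows)
    conn.commit()
    return conn


def run_quality_checks(conn, checks):
    """Run every check and return a list of human-readable failure messages."""
    failures = []
    for sql, is_ok, label in checks:
        value = conn.execute(sql).fetchone()[0]
        if not is_ok(value):
            failures.append(f"{label} (observed {value})")
    return failures


def estimated_load_minutes(record_count, records_per_minute=10_000):
    """Estimate load time in minutes at a fixed ingestion rate."""
    return record_count / records_per_minute


def demo():
    conn = build_sample_db([(1, 10, 4.0), (2, 11, 3.5)])
    failures = run_quality_checks(conn, QUALITY_CHECKS)
    if failures:
        # Raising is what would make a scheduler task fail and surface the problem.
        raise ValueError("Data quality checks failed: " + "; ".join(failures))
    return failures
```

In an Airflow DAG, a function like `run_quality_checks` would typically be called from a task placed after the load step, with a raised exception marking the run as failed. As a rough scale check, `estimated_load_minutes(26_000_000)` gives 2,600 minutes, or about 43 hours, for the full ratings set at the stated 10,000 records per minute.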
