Reference Architectures for Datalakes on AWS
Updated May 13, 2020 - HTML
⛳️ PASS: Amazon Web Services Certified (AWS Certified) Machine Learning Specialty (MLS-C01) by studying with our Questions & Answers (Q&A) practice test exams.
Jupyter notebooks and AWS CloudFormation template to show how Hudi, Iceberg, and Delta Lake work
A command-line interface for packaging, deploying, and running your EMR Serverless Spark jobs
Project files for the post: Running PySpark Applications on Amazon EMR using Apache Airflow: Using the new Amazon Managed Workflows for Apache Airflow (MWAA) on AWS.
Project files for the post: Running PySpark Applications on Amazon EMR: Methods for Interacting with PySpark on Amazon Elastic MapReduce.
A VS Code Extension to make it easier to manage and develop Spark jobs on EMR
Bits of code I use during live demos
Run templatable playbooks of Hadoop, Spark, and other jobs on Amazon EMR
Amazon EMR Notebook to show how to read from and write to Delta tables with Amazon EMR
3NF-normalize Yelp data on S3 with Spark and load it into Redshift - automate the whole thing with Apache Airflow
⛳️ PASS: Amazon Web Services Certified (AWS Certified) Data Analytics Specialty (DAS-C01) by studying with our Questions & Answers (Q&A) practice test exams.
Sample CI/CD pipeline for using GitHub Actions with Amazon EMR Serverless Spark.
This repo provides cross-account integration code samples using Amazon S3 Access Points
Orchestrate an Amazon EMR on Amazon EKS Spark job with AWS Step Functions
📓 Repository/Tutorial for initializing Jupyter Notebook and Spark cluster on Amazon EMR
Configure Hadoop YARN CapacityScheduler on Amazon EMR on Amazon EC2 for multi-tenant heterogeneous workloads
Samples related to data engineering, e.g. Spark, Embulk, Airflow, etc.