Material about software engineering, DevOps, SQL and ML.
You need a working knowledge of the Python programming language and its ML stack. If you do not have it yet, go to https://github.com/djib2011/python_ml_tutorial first.
The objective of this repository is to bridge the gap between local, casual use of data science and the technologies that are used in industry (Docker, Kubernetes, SQL, Hadoop, Spark, Kafka), giving a glimpse of each. The repository is organised in Missions. Each Mission assumes some knowledge of the previous Missions.
-
Mission 1: You need to do a fresh start
- Learn about virtual envs, Docker and pytest
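
As a rough illustration of what a pytest test looks like, here is a minimal sketch. The function `normalize` and the file name `test_normalize.py` are hypothetical and not taken from the Mission material; the only thing assumed about pytest is its default behaviour of collecting functions whose names start with `test_` and reporting any failing `assert`.

```python
# test_normalize.py (hypothetical file name, for illustration only)


def normalize(values):
    """Scale a list of numbers linearly into the range [0, 1]."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # All values are equal, so avoid dividing by zero.
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


def test_normalize_maps_extremes_to_bounds():
    result = normalize([2, 4, 6])
    assert result[0] == 0.0
    assert result[-1] == 1.0


def test_normalize_handles_constant_input():
    assert normalize([3, 3, 3]) == [0.0, 0.0, 0.0]
```

Running `pytest` inside an activated virtual environment, from the directory that contains this file, should collect and run both tests.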
-
Mission 2: Eat your queries
- Learn the basic syntax of SQL
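
To give a flavour of the basic syntax, the sketch below runs `CREATE TABLE`, `INSERT` and `SELECT ... WHERE ... ORDER BY` against an in-memory SQLite database through Python's standard `sqlite3` module. The `employees` table and its rows are made up for illustration, and the Mission itself may use a different database engine.

```python
import sqlite3

# In-memory database, so nothing is written to disk.
conn = sqlite3.connect(":memory:")

# Hypothetical table, used only to demonstrate the syntax.
conn.execute(
    "CREATE TABLE employees (id INTEGER PRIMARY KEY, name TEXT, department TEXT)"
)
conn.executemany(
    "INSERT INTO employees (id, name, department) VALUES (?, ?, ?)",
    [
        (1, "Ada", "engineering"),
        (2, "Grace", "engineering"),
        (3, "Alan", "research"),
    ],
)

# Filter with WHERE and sort with ORDER BY; "?" is a bound parameter.
rows = conn.execute(
    "SELECT id, name FROM employees WHERE department = ? ORDER BY name",
    ("engineering",),
).fetchall()
print(rows)

conn.close()
```

Passing the filter value as a bound parameter rather than formatting it into the query string is a habit worth adopting early, since it avoids SQL injection.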
-
Mission 3: Eat your queries (part 2)
- Learn advanced SQL concepts
-
Mission 4: Serve it right
- Take a trained ML model and serve it through a Flask app
-
Mission 5: Big in data
- Create a Docker cluster, install HDFS and run a MapReduce job
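
The MapReduce idea can be sketched locally before touching a cluster. The example below is a pure-Python simulation of the map, shuffle and reduce phases for a word count; it does not use Hadoop or HDFS, and the function names (`map_phase`, `shuffle_phase`, `reduce_phase`, `word_count`) are illustrative choices rather than part of any framework API.

```python
from collections import defaultdict
from itertools import chain


def map_phase(line):
    """Emit a (word, 1) pair for every word in one line of input."""
    return [(word.lower(), 1) for word in line.split()]


def shuffle_phase(pairs):
    """Group emitted values by key, as a framework would between map and reduce."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(key, values):
    """Combine all counts for one word into a single total."""
    return key, sum(values)


def word_count(lines):
    """Run the three phases in sequence on an iterable of text lines."""
    mapped = chain.from_iterable(map_phase(line) for line in lines)
    grouped = shuffle_phase(mapped)
    return dict(reduce_phase(key, values) for key, values in grouped.items())


counts = word_count(["big data is big", "data is everywhere"])
print(counts)
```

On a real cluster the map and reduce functions run on different machines and the shuffle moves data across the network; here everything happens in one process, which is only meant to make the data flow visible.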
-
Mission 6: A little spark kindles a great fire
-
Mission 7: Kafka is a writer, not a fucking platform
-
Mission 8: Que te la mongo
There are also some secret Missions, meant to help you better understand some ML and data science concepts:
- Secret Mission 1: The firm tree does not fear the storm
- Secret Mission 2: In need for support