Springboard Data Science Career Track

Hi!

My name is Mikiko Bazeley and this is my repo for the Springboard Data Science Track.

From October 2018 to April 2019, I completed a number of projects, including two capstones, as part of the Data Science track.

All of the documentation, code, and notes can be found here, as well as links to other resources I found helpful for successfully completing the program.

For questions or comments, please feel free to reach out on LinkedIn.

If you find my repo useful, let me know, or consider buying me a coffee ☕: https://www.buymeacoffee.com/mmbazel

Regards, Mikiko


Project List by Unit of Study

For a comprehensive list of the projects and the corresponding skills, see the list below.

1. The Python Data Science Stack

Topics covered:

  • Python
  • Matplotlib and Seaborn: visualization tools in Python
  • Writing clear, elegant, readable code in Python following the PEP 8 style guide

2. Data Wrangling

Topics covered:

  • Deep dive into Pandas for data wrangling
  • Data in files: Work with a variety of file formats, from plain text (.txt) to more structured and nested formats like CSV and JSON
  • Data in databases: Get an overview of relational and NoSQL databases and practice data querying with SQL
  • APIs: Collect data from the internet using Application Programming Interfaces (APIs)

Projects:

3. Data Story

4. Statistical Inference

Topics covered:

  • Theory of inferential statistics
  • Statistical significance
  • Parameter estimation
  • Hypothesis testing
  • Correlation and regression
  • Exploratory data analysis
  • A/B testing

5. Machine Learning

Topics covered:

  • Scikit-learn
  • Supervised and unsupervised learning
  • Top machine learning techniques:
    • Linear and logistic regression
    • Naive Bayes
    • Support vector machines
    • Decision trees
    • Clustering
  • Ensemble learning with random forests and gradient boosting
  • Best practices
  • Evaluating and tuning machine learning systems

6. Capstone Project 1: Building a Data Product

7. The Natural Language Processing (NLP) Track

Topics covered:

  • How to work with text and natural language data
  • NLP in Python, using common libraries such as NLTK and spaCy
  • Basics of Deep Learning in NLP using word2vec and TensorFlow
  • Data Science at Scale using Spark
  • Software Engineering for Data Scientists

8. Capstone Project 2: NLP