Projects and notebooks from IBM’s SQL for Data Science course, demonstrating SQL concepts and real-world data analysis using MySQL, SQLite, and Python

mfaria-p/SQL_DataScience

SQL for Data Science - Course Projects

This project showcases the work I completed during the SQL for Data Science course offered by IBM on edX. It includes various exercises and projects demonstrating my learning journey with SQL and Python for database management and analysis.

Project Overview

The project follows my progression through the course, which began with MySQL and later moved to SQLite and Python. The early, theory-focused modules cover MySQL syntax; for these I used phpMyAdmin to create and manage databases with SQL scripts. The later modules use Python and Jupyter Notebooks for data analysis with SQLite, which produced the notebooks listed under Practical Implementation. Although the course itself uses Jupyter Notebook, I completed all Python-related projects in Google Colab because I am more familiar with that environment.

Key Learning Areas:

  • MySQL Database Management: Creating, modifying, and deleting tables using SQL scripts in phpMyAdmin.
  • SQL Query Techniques: Refining query results using string patterns, sorting, grouping, and built-in database functions.
  • Working with Multiple Tables: Using subqueries, joins, and nested selects to retrieve meaningful data.
  • Metadata Retrieval: Extracting database schema information from MySQL and SQLite.
  • Python & SQLite Integration: Managing SQLite databases using Python with DB-API, SQL Magic, and Pandas.
  • Real-World Data Analysis: Applying SQL and Python to analyze datasets from Chicago Public Schools and socioeconomic indicators.
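As a rough illustration of how the query techniques and the Python DB-API fit together, here is a minimal sketch using only the standard-library `sqlite3` module. The `schools` table, its columns, and the sample rows are hypothetical and are not taken from the course datasets.

```python
import sqlite3


def average_score_by_area(rows):
    """Load hypothetical (name, area, score) rows into an in-memory SQLite
    database and return the average score per area, highest first."""
    conn = sqlite3.connect(":memory:")
    try:
        conn.execute("CREATE TABLE schools (name TEXT, area TEXT, score INTEGER)")
        # Parameterized inserts keep the data separate from the SQL text.
        conn.executemany("INSERT INTO schools VALUES (?, ?, ?)", rows)
        cur = conn.execute(
            """
            SELECT area, AVG(score) AS avg_score
            FROM schools
            WHERE name LIKE ?          -- string pattern
            GROUP BY area              -- grouping with an aggregate function
            ORDER BY avg_score DESC    -- sorting
            """,
            ("%School%",),
        )
        return cur.fetchall()
    finally:
        conn.close()


# Invented sample rows, for illustration only.
SAMPLE_ROWS = [
    ("North School", "North", 80),
    ("Lake School", "North", 60),
    ("South School", "South", 90),
    ("West Academy", "West", 75),
]

result = average_score_by_area(SAMPLE_ROWS)
print(result)
```

The same `SELECT` could be run from SQL Magic or handed to Pandas; the DB-API version is shown because it needs no extra packages.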

Project Structure

Markdown Files (SQL Concepts & Implementation)

  • 01_Relational_Databases_DDL.md – Introduction to relational databases and Data Definition Language (DDL) (MySQL).
  • 02_Refining_Results.md – Techniques for refining SQL query results (MySQL).
  • 03_Built_in_DataBase_Functions.md – Overview of aggregate, scalar, string, and date/time functions (MySQL).
  • 04_SubQueries_Nested_Selects.md – Explanation and examples of subqueries and nested selects.
  • 05_Multiple_Tables.md – Working with multiple tables using joins and subqueries.
  • 06_SQLite_Python.md – Accessing SQLite databases using Python (DB-API, SQL Magic, and Pandas).
  • 07_Metadata_SQL.md – Retrieving database metadata from SQLite and MySQL.
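For the metadata topic, the following sketch shows one way to list tables and columns in SQLite from Python, using the `sqlite_master` catalog table and `PRAGMA table_info`. The helper name `describe_schema` and the demo tables are assumptions made for this example. In MySQL, comparable information is available through `SHOW TABLES`, `DESCRIBE`, or the `INFORMATION_SCHEMA` views.

```python
import sqlite3


def describe_schema(conn):
    """Return {table_name: [column names]} for every table in the database."""
    tables = [
        row[0]
        for row in conn.execute(
            "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
        )
    ]
    schema = {}
    for table in tables:
        # PRAGMA table_info yields (cid, name, type, notnull, dflt_value, pk).
        # The table name comes from the catalog itself, not from user input.
        columns = conn.execute(f'PRAGMA table_info("{table}")').fetchall()
        schema[table] = [column[1] for column in columns]
    return schema


# Hypothetical demo tables, for illustration only.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE departments (dept_id INTEGER PRIMARY KEY, dept_name TEXT)")
conn.execute("CREATE TABLE employees (emp_id INTEGER PRIMARY KEY, name TEXT, dept_id INTEGER)")
schema = describe_schema(conn)
conn.close()
print(schema)
```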

Jupyter Notebooks (Practical Implementation)

  • 00_SQLite_with_Python.ipynb – Creating and accessing SQLite databases using Python.
  • 01_RealWorldDataset.ipynb – Chicago Public Schools performance data analysis.
  • 02_RealWorldDataset.ipynb – Socioeconomic indicators analysis for Chicago.
  • 03_RealWorldDataset.ipynb – Further analysis of real-world datasets using SQL and Python.
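The notebooks follow a general pattern of loading a CSV dataset into SQLite and then querying it. A minimal sketch of that pattern, using a subquery, is shown below. It is not the notebooks' actual code: the column names and the three sample rows are invented placeholders, not values from the Chicago datasets.

```python
import csv
import io
import sqlite3

# Invented placeholder data standing in for a downloaded CSV file.
SAMPLE_CSV = """community_area,hardship_index
Area A,20
Area B,50
Area C,80
"""


def areas_above_average(csv_text):
    """Load CSV text into SQLite and return the areas whose hardship index
    is above the overall average, using a subquery in the WHERE clause."""
    reader = csv.DictReader(io.StringIO(csv_text))
    rows = [(r["community_area"], int(r["hardship_index"])) for r in reader]

    conn = sqlite3.connect(":memory:")
    try:
        conn.execute(
            "CREATE TABLE indicators (community_area TEXT, hardship_index INTEGER)"
        )
        conn.executemany("INSERT INTO indicators VALUES (?, ?)", rows)
        cur = conn.execute(
            """
            SELECT community_area
            FROM indicators
            WHERE hardship_index > (SELECT AVG(hardship_index) FROM indicators)
            ORDER BY community_area
            """
        )
        return [row[0] for row in cur]
    finally:
        conn.close()


above_average = areas_above_average(SAMPLE_CSV)
print(above_average)
```

With a real dataset, the CSV text would come from a file or URL, and Pandas could replace the manual `csv` loading step.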

Learning Outcomes

By working through this project, I gained:

  • Hands-on experience with MySQL, SQLite, and Python for database management.
  • A strong understanding of SQL queries, data extraction, and data analysis.
  • Practical skills in analyzing real-world datasets using SQL and Python.

This project serves as a portfolio of my SQL and Python database management skills developed through the course.
