Add "Advanced Python DS Ecosystem" course materials #3

ccauet · 2023-10-09T10:23:47Z

No description provided.

ccauet · 2023-10-30T18:10:29Z

@clstaudt we created some new content for a customer, covering OOP, Poetry, databases, Polars, and dashboards.

Would you be interested in looking over the material and giving some feedback before we merge?

clstaudt · 2023-10-30T18:58:52Z

@ccauet Certainly. There might even be some thematic overlap with new material I am building.

clstaudt · 2023-10-30T19:17:47Z

Polars material

Not quite in the familiar form of notebooks from Data Science Learning Paths yet:

- This would include a notebook title followed by a bit of introductory text (e.g. when and why should I use Polars instead of pandas?).
- Also probably more instruction / explanation for the code blocks (e.g. splitting up bigger code blocks and explaining them step by step).

Technical:

- format all code with black
- stick to the notebook file name scheme
- write temporary data not to the notebook folder but to a separate, gitignored data folder
- index notebook ape-advanced-python-ds-ecosystem-2day.ipynb is duplicated
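A minimal sketch of the "separate data folder" point, assuming a hypothetical layout with `notebooks/` and `data/` side by side and `data/` listed in `.gitignore`. The helper name `get_data_dir` and the file name are made up for illustration; a temporary directory stands in for the repository root so the snippet runs anywhere.

```python
import tempfile
from pathlib import Path


def get_data_dir(repo_root: Path) -> Path:
    """Return <repo_root>/data, creating it if necessary.

    Assumption: "data/" is listed in .gitignore, so nothing written here
    ends up in version control.
    """
    data_dir = repo_root / "data"
    data_dir.mkdir(parents=True, exist_ok=True)
    return data_dir


# In a notebook living in <repo_root>/notebooks, the root would typically be
# Path.cwd().parent; a temporary directory is used here as a stand-in.
repo_root = Path(tempfile.mkdtemp())
data_dir = get_data_dir(repo_root)

# Write temporary output next to, not inside, the notebook folder
out_file = data_dir / "example.csv"
out_file.write_text("a,b\n1,2\n")
```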

Nice to have: Comparison of PySpark and Polars API - since they look very similar.

clstaudt · 2023-10-30T19:48:23Z

Object-Oriented Programming material

"allows us to organize code around real-world entities."

Really? A bit vague and misleading. Start with the idea of grouping data and logic together.

"Python uses access modifiers to define the visibility of attributes and methods, helping to encapsulate data and ensure that unwanted changes cannot be made from outside the class."

C++ and Java have access modifiers that enforce visibility rules. The naming conventions explained here are not usually called access modifiers. Suggestion:

"In Python, naming conventions are used to indicate the intended visibility and accessibility of attributes and methods, rather than strict access modifiers."

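Also, this is not strictly correct:

"Can only be accessed within the defining class, denoted by a prefix of double underscore"

A double-underscore prefix only triggers name mangling (the attribute is stored as _ClassName__name), so it can still be reached from outside the class.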
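A small sketch of the three naming conventions, using a hypothetical `Account` class that is not taken from the notebook:

```python
class Account:
    def __init__(self, owner: str, balance: int):
        self.owner = owner        # public attribute
        self._notes = []          # single underscore: "internal use" by convention only
        self.__balance = balance  # double underscore: name-mangled to _Account__balance

    def balance(self) -> int:
        # Inside the class body, the unmangled spelling works as usual
        return self.__balance


account = Account("Alice", 100)

# The plain name does not exist on the instance from outside the class ...
try:
    account.__balance
    plain_name_hidden = False
except AttributeError:
    plain_name_hidden = True

# ... but the mangled name is still reachable, so nothing is enforced
value_via_mangled_name = account._Account__balance
```

Name mangling mainly exists to avoid accidental name clashes in subclasses, not to protect data.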

Instead of:

"Polymorphism is the ability of interacting with different objects, from different classes, through a common interface (methods)."

... consider:

"Polymorphism is the ability of objects from different classes to be treated as instances of the same class through a common interface (methods)."

- Mention the term "duck typing" here.
- Abstract base classes: The main use case in my view goes unmentioned, namely ensuring that any subclass of the abstract class implements certain methods.
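Both points could be illustrated roughly like this. The class names (`Shape`, `Square`, `Plot`, `Incomplete`) are hypothetical and only serve the example.

```python
from abc import ABC, abstractmethod


class Shape(ABC):
    """Abstract base class: every concrete subclass must implement area()."""

    @abstractmethod
    def area(self) -> float: ...


class Square(Shape):
    def __init__(self, side: float):
        self.side = side

    def area(self) -> float:
        return self.side**2


class Incomplete(Shape):
    """Forgets to implement area(), so it cannot be instantiated."""


class Plot:
    """Not a Shape subclass at all, but it has an area() method (duck typing)."""

    def __init__(self, width: float, length: float):
        self.width = width
        self.length = length

    def area(self) -> float:
        return self.width * self.length


def total_area(shapes) -> float:
    # Only cares that each object has an area() method, not about its class
    return sum(shape.area() for shape in shapes)


# The ABC enforces the interface at instantiation time
try:
    Incomplete()
    instantiation_failed = False
except TypeError:
    instantiation_failed = True

combined = total_area([Square(2), Plot(3, 4)])
```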

Nice to have: A more elaborate example where an OOP design really makes code elegant and easy to manage. For example the state machine design pattern -> check out https://github.com/clstaudt/cpp-patterns/blob/main/State/music.py
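To give an idea of what such an example might look like, here is a compact sketch of the state pattern for a hypothetical music player. It is written from scratch for this comment and is not a copy of the linked file; all class and method names are assumptions.

```python
class PlayerState:
    """Base state: by default, every action leaves the state unchanged."""

    name = "abstract"

    def play(self) -> "PlayerState":
        return self

    def pause(self) -> "PlayerState":
        return self

    def stop(self) -> "PlayerState":
        return self


class Stopped(PlayerState):
    name = "stopped"

    def play(self) -> PlayerState:
        return Playing()


class Playing(PlayerState):
    name = "playing"

    def pause(self) -> PlayerState:
        return Paused()

    def stop(self) -> PlayerState:
        return Stopped()


class Paused(PlayerState):
    name = "paused"

    def play(self) -> PlayerState:
        return Playing()

    def stop(self) -> PlayerState:
        return Stopped()


class Player:
    """Delegates each action to the current state, which returns the next one."""

    def __init__(self):
        self.state: PlayerState = Stopped()

    def play(self):
        self.state = self.state.play()

    def pause(self):
        self.state = self.state.pause()

    def stop(self):
        self.state = self.state.stop()


player = Player()
history = []
for action in ("play", "pause", "pause", "play", "stop"):
    getattr(player, action)()
    history.append(player.state.name)
```

The transition rules live in the state classes, so `Player` needs no `if`/`elif` chain, and adding a state means adding a class.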

Nice to have: A practical example for how OOP is used in a data science library. For example, scikit-learn Estimators and Transformers. Exercise: Build your own Estimator...
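As a starting point for such an exercise, here is a dependency-free sketch of a toy estimator that follows the scikit-learn conventions (hyperparameters stored in `__init__`, `fit` returns `self`, learned attributes end with an underscore). `MeanRegressor` is a made-up name, and the class deliberately does not inherit from scikit-learn's base classes; a real exercise would probably build on `BaseEstimator`.

```python
class MeanRegressor:
    """Toy estimator: always predicts the mean of the training targets."""

    def __init__(self, offset: float = 0.0):
        # Hyperparameters are only stored here, nothing is computed
        self.offset = offset

    def fit(self, X, y):
        # Learned state gets a trailing underscore, as in scikit-learn
        self.mean_ = sum(y) / len(y)
        return self  # returning self allows chaining: Model().fit(X, y).predict(X)

    def predict(self, X):
        # One prediction per input row, ignoring the features
        return [self.mean_ + self.offset for _ in X]


X_train = [[1], [2], [3]]
y_train = [2.0, 4.0, 6.0]

model = MeanRegressor().fit(X_train, y_train)
predictions = model.predict([[10], [20]])
```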

clstaudt · 2023-10-30T20:04:06Z

1. Development of Python Packages with Poetry

Is the material missing, or just not linked in the TOC?

clstaudt · 2023-10-30T20:32:49Z

Working with Databases

ORM: SQLAlchemy

- missing a notebook title and introductory text (e.g. What is an ORM?...)
- terminology: it is a "pyramid scheme" but a "database schema"
- nice to have: an Entity Relationship Diagram for the database schema (one should probably start a DB design by sketching one)
- query output: are the log outputs meant to be displayed?

NoSQL databases with PyMongo

- start with notebook title and introduction
- What does NoSQL mean and why do I need that?

Pandas + SQL(Alchemy)

This explains pandas + SQL. If we are already using SQLAlchemy to interact with the DB, should we write raw SQL queries to read data into pandas, or rather do something like this?

# Using the session in a with statement
with Session() as session:
    # Inserting data
    sample_users = [
        User(name="Alice", age=30),
        User(name="Bob", age=25),
        User(name="Charlie", age=35),
    ]
    session.add_all(sample_users)
    session.commit()

    # Querying data
    users_query = session.query(User).all()

# Convert the query result to a pandas DataFrame
df = pd.DataFrame(
    [(user.id, user.name, user.age) for user in users_query],
    columns=["ID", "Name", "Age"],
)

clstaudt · 2023-10-30T20:50:58Z

Streamlit

Would be great to have a Streamlit example here, but this particular demo may be too German for this repo...

Nice to have: Demo that shows off a lot of the interactive stuff you can do with Streamlit.

Add course overview

50e8272

ccauet self-assigned this Oct 9, 2023

ccauet and others added 28 commits October 9, 2023 20:23

Add DB notebooks

e1dae85

Fix data path

bfa40ee

add polars notebooks

bd76b0d

Update dependencies

792d457

Add object-oriented programming notebook

af812b0

Fix some typos and add links to db notebooks in index.

da2a63a

streamlit example of stromnetz

426e9d3

Update scipy and statsmodels

d4057fe

Upgrade scikit-learn

d059226

Update project dependencies

37612d3

Update copyright notice

db71a8d

Add copyright notice

c981687

Fix relative path

7345877

Linting

9cee91a

Update requirement.txt

63833b9

Remove an incompatible package by hand

3b9387d

Create requirements file by hand

4664ba3

Remove python from req file

e659751

No versions constraints

a93828c

Add compose file to setup aux services

dc5475e

Update port mapping

538043f

Update MongoDB client to include port and authentication.

048c534

Add polars notebook links to index.

4ddb681

Minor changes in polars notebooks.

ce3d77c

Update dependencies

839e152

update streamlit example

18f0c25

add streamlit config

92f778a

add oop and streamlit to index

18d2140

sdungs and others added 7 commits October 24, 2023 23:41

add exercise streamlit

59e0f1b

fix path to data

38ab2b7

re-add pathlib

8c6077e

add licences

869b536

Add 'timeit' to polars notebook.

22572c6

Add the current course index to the first level and update paths accordingly.

177334c

Add type hints to OOD notebook.

f99fb8d

clstaudt marked this pull request as ready for review October 30, 2023 18:58

sdungs and others added 11 commits November 13, 2023 08:04

add output of db schema information

8451101

Fix some typos and remove unnecessary lines of code from polars notebooks.

f35f3ff

Remove duplicated index, update paths in ape-index, update polars intro notebook to fit the usual format.

befd5b2

Add APE to official data-science-learning-paths index.

8347d06

Rework polars notebooks. Split in exercise and solution and add a 'benchmark' notebook. Update index accordingly.

5040f00

Delete old polars notebook.

ae78aaf

Try to use black format in polars notebooks.

591b4ae

Add more information for the DB-API sqlite notebook and update the index.

8e47f5b

Add introductory text for ORM notebook.

dd10538

Add introductory text for NoSQL notebook with MongoDB.

55ba215

Minor update in pandas-sql notebook.

8a525ad

clstaudt added the enhancement New feature or request label Jan 31, 2024

clstaudt marked this pull request as draft August 30, 2024 11:24

