Add "Advanced Python DS Ecosystem" course materials #3
@clstaudt we created some new content for a customer covering OOP, Poetry, databases, Polars, and dashboards. Would you be interested in looking over the material and giving some feedback before we merge?
@ccauet Certainly. There might even be some thematic overlap with new material I am building.
Polars material
Not quite in the familiar form of notebooks from Data Science Learning Paths yet:
Technical:
Nice to have: A comparison of the PySpark and Polars APIs, since they look very similar.
Object Oriented Programming material
Really? A bit vague and misleading. Start with the idea of grouping data and logic together.
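One way to open with the idea of grouping data and logic, sketched with a hypothetical `RunningMean` class (the name and its methods are illustrative assumptions, not taken from the course material):

```python
class RunningMean:
    """Keeps the data (count, total) together with the logic that uses it."""

    def __init__(self):
        # Data: the state that belongs to one running-mean tracker.
        self.count = 0
        self.total = 0.0

    def add(self, value):
        # Logic: the only place where the state is updated.
        self.count += 1
        self.total += value

    def mean(self):
        # Logic: derived from the state; guards against an empty tracker.
        if self.count == 0:
            raise ValueError("no values added yet")
        return self.total / self.count


tracker = RunningMean()
tracker.add(2.0)
tracker.add(4.0)
current_mean = tracker.mean()
```

The point of the sketch is that callers never touch `count` or `total` directly; the object carries both the numbers and the rules for changing them.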
C++ and Java have access modifiers that enforce visibility rules. The conventions explained here are usually not called that. Suggestion:
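To illustrate the distinction between enforced modifiers and naming conventions (a sketch only, not the suggested wording; the `Account` class is a hypothetical example): in Python a leading underscore merely signals intent, and a double leading underscore triggers name mangling rather than access control.

```python
class Account:
    def __init__(self, owner, balance):
        self.owner = owner        # public attribute
        self._balance = balance   # single underscore: "internal" by convention only
        self.__pin = "0000"       # double underscore: stored as _Account__pin


account = Account("Alice", 100)

# Nothing stops outside code from reading the "internal" attribute.
internal_value = account._balance

# The name-mangled attribute is still reachable under its mangled name.
mangled_value = account._Account__pin

# Only the unmangled name fails, because no attribute with that name exists.
direct_access_failed = False
try:
    account.__pin
except AttributeError:
    direct_access_failed = True
```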
Also, this is not strictly correct:
Instead of:
... consider:
Nice to have: A more elaborate example where an OOP design really makes code elegant and easy to manage, for example the state machine design pattern (see https://github.com/clstaudt/cpp-patterns/blob/main/State/music.py).
Nice to have: A practical example of how OOP is used in a data science library, for example scikit-learn Estimators and Transformers.
Exercise: Build your own Estimator...
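As a rough starting point for the estimator idea, the sketch below mimics the `fit` / `transform` convention in plain Python. It deliberately does not import or subclass anything from scikit-learn; the `MeanScaler` class and its `with_std` parameter are assumptions made up for illustration.

```python
from statistics import fmean, pstdev


class MeanScaler:
    """Toy transformer following the fit/transform convention."""

    def __init__(self, with_std=True):
        # Hyperparameters are stored unchanged in __init__.
        self.with_std = with_std

    def fit(self, values):
        # Learned state gets a trailing underscore, as is customary in scikit-learn.
        self.mean_ = fmean(values)
        self.scale_ = pstdev(values) if self.with_std else 1.0
        if self.scale_ == 0:
            self.scale_ = 1.0  # avoid division by zero for constant input
        return self  # returning self allows method chaining

    def transform(self, values):
        # Apply the parameters learned in fit to new data.
        return [(v - self.mean_) / self.scale_ for v in values]

    def fit_transform(self, values):
        return self.fit(values).transform(values)


scaled = MeanScaler().fit_transform([1.0, 2.0, 3.0])
```

Because every such object exposes the same small interface, calling code can swap one transformer for another without changing how it is used, which is the OOP benefit the exercise is meant to surface.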
1. Development of Python Packages with Poetry
Is the material missing, or just not linked in the TOC?
Working with Databases
ORM: SQLAlchemy
- "pyramid scheme" but "database schema"
NoSQL databases with PyMongo
Pandas + SQL(Alchemy)
This explains pandas + SQL. If we are already using SQLAlchemy to interact with the DB, should I write raw SQL queries to read data into pandas, or rather something like this?

    import pandas as pd

    # Session and User are assumed to be defined as in the SQLAlchemy material.
    # Using the session in a with statement
    with Session() as session:
        # Inserting data
        sample_users = [User(name="Alice", age=30), User(name="Bob", age=25), User(name="Charlie", age=35)]
        session.add_all(sample_users)
        session.commit()

        # Querying data
        users_query = session.query(User).all()

        # Convert the query result to a pandas DataFrame
        df = pd.DataFrame(
            [(user.id, user.name, user.age) for user in users_query],
            columns=["ID", "Name", "Age"],
        )
Streamlit
It would be great to have a Streamlit example here, but this particular demo may be too German for this repo...
Nice to have: A demo that shows off a lot of the interactive things you can do with Streamlit.
…ro notebook to fit the usual format.
…nchmark' notebook. Update index accordingly.