This repository houses Jupyter notebooks for various courses. Each notebook references external datasets and Python scripts stored in a separate data repository. Follow the instructions below to correctly add and integrate your notebooks.
Before adding your notebook, ensure that all required data files and scripts are uploaded to the data repository:
📌 Follow the instructions here: Data Repository README
- Submit a pull request in the data repo: Data PRs
- Link your data pull request in your notebook pull request to provide visibility.
Modify your notebook so it properly accesses the uploaded data files via URLs instead of local paths.
Before:
pd.read_csv("iris.csv")
After:
pd.read_csv("https://cal-icor.github.io/textbook.data/<YOUR_COURSE_NAME>/iris.csv")
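To avoid repeating the base URL in every cell, the path can be built once by a small helper. The following is a minimal sketch, not part of this repository: the function name `data_url` and the course folder `espm` are hypothetical, and only the base URL comes from the example above.

```python
from urllib.parse import quote

# Base URL taken from the examples in this README.
DATA_BASE_URL = "https://cal-icor.github.io/textbook.data"


def data_url(course: str, filename: str) -> str:
    """Build the URL of a file in the data repository (hypothetical helper)."""
    # Percent-encode both parts so spaces or other unsafe characters do not
    # break the URL; "/" inside filename is kept so subfolders still work.
    return f"{DATA_BASE_URL}/{quote(course, safe='')}/{quote(filename)}"


# "espm" is a placeholder course folder used only for illustration.
print(data_url("espm", "iris.csv"))
# prints https://cal-icor.github.io/textbook.data/espm/iris.csv
```

The result can then be passed straight to the loader, for example `pd.read_csv(data_url("<YOUR_COURSE_NAME>", "iris.csv"))`.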
If your notebook imports a local Python file, update it to load from the data repository.
Before:
import data101grader # Assuming data101grader.py exists locally
After:
import httpimport
with httpimport.remote_repo("https://cal-icor.github.io/textbook.data/<YOUR_COURSE_NAME>"):
    import data101grader
- Execute the entire notebook and confirm that all cells run successfully.
- Fix any broken links, incorrect imports, or missing dependencies.
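Leftover local paths are easy to miss when a notebook still runs on your own machine. One possible way to catch them is to scan the notebook's code cells for loader calls whose argument is not a URL. This is a rough sketch under stated assumptions: the function name `find_local_paths` and the regular expression are illustrative, and the pattern only covers `read_*(...)` and `open(...)` calls with a literal string argument.

```python
import json
import re

# Matches the first string argument of calls such as pd.read_csv("...") or
# open('...'). The pattern is an illustrative assumption; extend as needed.
PATH_CALL = re.compile(r"""\b(?:read_\w+|open)\(\s*["']([^"']+)["']""")


def find_local_paths(notebook_json: str) -> list[str]:
    """Return loader-call string arguments that are not http(s) URLs."""
    notebook = json.loads(notebook_json)
    local_paths = []
    for cell in notebook.get("cells", []):
        if cell.get("cell_type") != "code":
            continue
        source = cell.get("source", "")
        # The notebook format stores source as a list of lines or one string.
        if isinstance(source, list):
            source = "".join(source)
        for path in PATH_CALL.findall(source):
            if not path.startswith(("http://", "https://")):
                local_paths.append(path)
    return local_paths


# A tiny in-memory notebook with one local path and one URL.
example = json.dumps({"cells": [{"cell_type": "code", "source": [
    "import pandas as pd\n",
    'df = pd.read_csv("iris.csv")\n',
    'ok = pd.read_csv("https://cal-icor.github.io/textbook.data/demo/iris.csv")\n',
]}]})
print(find_local_paths(example))  # prints ['iris.csv']
```

For a real notebook, read the `.ipynb` file as text and pass its contents to the function. An empty result does not replace running the notebook end to end.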
If your notebook requires additional Python packages, add them to requirements.txt.
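To see which packages a notebook depends on, you can collect the top-level modules its code imports and compare them with requirements.txt by hand. The sketch below is an assumption-laden helper, not a tool shipped with this repository: `imported_modules` is a hypothetical name, and the import name it reports can differ from the package name on PyPI (for example, `sklearn` is installed as `scikit-learn`).

```python
import ast


def imported_modules(code: str) -> set[str]:
    """Return the top-level module names imported by a piece of Python code."""
    # Drop IPython magics (%...) and shell escapes (!...), which are not
    # valid Python and would make ast.parse fail.
    lines = [
        line for line in code.splitlines()
        if not line.lstrip().startswith(("%", "!"))
    ]
    modules = set()
    for node in ast.walk(ast.parse("\n".join(lines))):
        if isinstance(node, ast.Import):
            # "import os.path" counts as the top-level module "os".
            modules.update(alias.name.split(".")[0] for alias in node.names)
        elif isinstance(node, ast.ImportFrom) and node.module and node.level == 0:
            # Skip relative imports (level > 0); they refer to local files.
            modules.add(node.module.split(".")[0])
    return modules


cell = """
%matplotlib inline
import pandas as pd
from sklearn.linear_model import LinearRegression
"""
print(sorted(imported_modules(cell)))  # prints ['pandas', 'sklearn']
```

Standard-library modules such as `os` or `json` will also appear in the result and do not belong in requirements.txt.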
The _toc.yml file manages the structure of the notebook collection. Update it to include your new notebook.
- Find the subject section in _toc.yml.
- Add your notebook file name (- file: <YOUR NOTEBOOK NAME>) at the same indentation level as the existing file entries.
Example: Adding a new ESPM notebook
- caption: Environmental Science, Policy, and Management
  numbered: true
  chapters:
  - file: espm-regression
  - file: <your file name>  # add your notebook here
- Create a new chapter section at the end of _toc.yml.
- Add a file entry for each notebook.
Example: Adding a new subject (e.g., Data Ethics)
- caption: Data Ethics
  numbered: true
  chapters:
  - file: data-ethics-intro
  - file: fairness-in-ml
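Before opening a pull request, it can help to confirm that your notebook actually appears in _toc.yml. The following is a minimal standard-library sketch that assumes entries follow the `- file: <name>` form shown in the examples above; `toc_files` is a hypothetical helper name, and a real YAML parser would be more robust than this regular expression.

```python
import re


def toc_files(toc_text: str) -> list[str]:
    """Return every name listed in a `- file:` entry of _toc.yml text."""
    # Capture the value after "- file:", stopping at whitespace or at a
    # trailing "#" comment.
    return re.findall(r"^\s*-\s*file:\s*([^\s#]+)", toc_text, flags=re.MULTILINE)


# Sample text copied from the Data Ethics example above.
toc = """\
- caption: Data Ethics
  numbered: true
  chapters:
  - file: data-ethics-intro
  - file: fairness-in-ml
"""
print(toc_files(toc))  # prints ['data-ethics-intro', 'fairness-in-ml']
print("fairness-in-ml" in toc_files(toc))  # prints True
```

In the examples above, entries are written without the `.ipynb` extension, so compare against the notebook name without its suffix.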
Want to add a new notebook? Follow these steps:
- Fork this repository.
- Clone the forked repo to your local machine.
- Create a new branch named after your course (git checkout -b <YOUR_COURSE_NAME>).
- Make your changes (update your notebook, fix errors, update _toc.yml, etc.).
- Commit & Push your changes (git push origin <YOUR_BRANCH_NAME>).
- Open a Pull Request (PR) in this repository.
- Link your data PR from the data repo.
- Wait for review and approval before merging. 🚀
When submitting a PR, include the following details:
- Added a new notebook: <NOTEBOOK_NAME>.ipynb
- Updated _toc.yml to include the notebook
- Updated requirements.txt to include new dependencies
- Linked data PR: Data PR #123
- My notebook runs without errors
- All external file references use URLs
- Updated _toc.yml
- Updated requirements.txt (if necessary)
- Linked the corresponding data PR
🔹 Issue: Notebook fails to load data
✅ Solution: Ensure that the file URL is correct and formatted as:
pd.read_csv("https://cal-icor.github.io/textbook.data/<YOUR_COURSE_NAME>/your_file.csv")
🔹 Issue: ImportError for a local Python file
✅ Solution: Use httpimport to import files from the data repo. Example:
import httpimport
with httpimport.remote_repo("https://cal-icor.github.io/textbook.data/<YOUR_COURSE_NAME>"):
    import my_script
🔹 Issue: Missing Packages When Running the Notebook
✅ Solution: Add required packages to requirements.txt and ensure it's up to date.
🔹 Issue: Notebook Not Listed in Table of Contents (_toc.yml)
✅ Solution: Double-check _toc.yml and make sure your notebook is listed under the correct subject, at the same indentation level as the other file entries.
This repository is licensed under the MIT License. See LICENSE for details.
✅ Linked your data PR in your notebook PR
✅ Updated all data references to use URLs
✅ Ensured the notebook runs without errors
✅ Updated requirements.txt if needed
✅ Updated _toc.yml to include the new notebook(s)
If you encounter any issues, feel free to ask for help by opening an issue or reaching out to jonathanferrari AT berkeley.edu.
📌 Reminder: PRs that do not follow these guidelines may be rejected.