|
2 | 2 |
|
3 | 3 | Total: 20 points (of 24 available -- this allows multiple paths through the module) |
4 | 4 |
|
5 | | -## 1. Data Reading and Preprocessing (**4 points**) |
| 5 | +## 1. Data Reading and Preprocessing (**5 points**) |
6 | 6 |
|
7 | | -**4 points**: The notebook correctly reads in the data, handles missing or incorrect data appropriately, and clearly documents these steps. |
| 7 | +**5 points**: The notebook correctly reads in the data, handles missing or incorrect data appropriately, and clearly documents these steps. |
8 | 8 |
|
9 | | -## 2. Data Analysis and Plotting (**4 points**) |
| 9 | +## 2. Data Visualization (**5 points**) |
10 | 10 |
|
11 | | -**4 points**: The data analysis is thorough, appropriate for the data, and well-executed. Plots are clear, well-labeled, and enhance understanding of the data. Figures are clearly labeled with legends, titles and axes labels. |
| 11 | +**5 points**: The data analysis is thorough, appropriate for the data, and well-executed. Plots are clear, well-labeled, and enhance understanding of the data. Figures are clearly labeled with legends, titles, and axis labels. |
12 | 12 |
|
| 13 | +## 3. Code Quality and Documentation (**5 points**) |
13 | 14 |
|
| 15 | +**5 points**: The code should be concise and semantically meaningful. The code does not introduce additional libraries or methods not necessary for the task or beyond the scope of basic data analysis methods covered in the course so far. |
14 | 16 |
|
15 | | -## 3. Code Quality and Documentation (**4 points**) |
16 | | - |
17 | | -**4 points**: The code should be concise, semantically meaningful. The code does not introduce additional libraries or methods not necessary for the task or beyond the scope of basic data analysis methods covered in the course so far. |
18 | | - |
19 | | -Avoid common LLM-based code-junk that does not follow best practices: |
| 17 | +Avoid common LLM-based code-junk that does not follow best practices in data science notebooks: |
20 | 18 | - Do not use `print` statements in code cells. Cells should do one clear isolated task. The final value on a cell is auto-printed by Jupyter without needing a `print` statement -- this should be used appropriately (e.g. to display small tables or plots). |
21 | 19 | - Code should not include unnecessary error handling, such as `try` statements. Use concise, working code for the data. |
22 | 20 | - Code should not create function definitions unless they serve a clear use in making the code more concise and readable. |
23 | | - - Do not include code comments in most cases. Codes should be self-documenting, with clear variable names and structure. Comments should be brief and only to clarify technical details. Use markdown cells to explain the overall logic and flow of the notebook. |
| 21 | + - Do not include code comments in most cases. Code should be *self-documenting*, with clear variable names and structure. Comments should be brief and only to clarify technical details. Use markdown cells to explain the overall logic and flow of the notebook. |
24 | 22 | - Avoid long chunks of code that are not broken up into smaller, logical steps. Each code cell should do one thing, and be clearly labeled with a markdown cell above it to explain what it does. |
25 | 23 |
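As a rough illustration of the bullets above, a single notebook cell in this style might look like the sketch below. It is only an assumption about what such a cell could contain: the column names `year` and `anomaly` are hypothetical, and the inline text is a placeholder standing in for a real data file so that the sketch runs on its own. A graded notebook would read the authoritative source instead. It uses only the Python standard library; the course's own tooling may differ.

```python
import csv
import io
import statistics

# PLACEHOLDER text standing in for a real data file (hypothetical values,
# illustration only; a real notebook reads the authoritative source).
raw_text = """year,anomaly
2001,0.10
2002,0.20
2003,
2004,0.40
"""

rows = list(csv.DictReader(io.StringIO(raw_text)))

# Rows with an empty measurement are dropped rather than filled in.
measured = [row for row in rows if row["anomaly"] != ""]

mean_anomaly = statistics.mean(float(row["anomaly"]) for row in measured)

# In Jupyter, the final expression of a cell is displayed without `print`.
mean_anomaly
```

The cell does one task, has no `try` block or helper function, and relies on variable names rather than comments for most of its meaning. If the read fails, the error surfaces immediately instead of being hidden.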
|
26 | | -## 4. Presentation and Clarity (**4 points**) |
| 24 | +AVOID the failure condition of fabricated data. LLMs will sometimes fail to read data in from authoritative sources because, as we see in this assignment, 'parsing' data is hard! LLMs can get desperate and will then make up data that roughly matches the pattern they expect. Sometimes they are transparent about this ("I cannot read the file, but if you imagine the file gave you these numbers then this is what you would do...") and sometimes less so. Regardless, plotting made-up or fictitious generated data instead of numbers from authoritative data sources could be considered fraud and get you fired in real-world employment. For us, it is a "failure condition", **-10 points**. |
| 25 | + |
| 26 | +## 4. Narrative (**5 points**) |
27 | 27 |
|
28 | | -**4 points**: The notebook is well-formatted and visually appealing, with a logical flow. Text and figures are clear, well-integrated, and free of spelling or grammatical errors. |
| 28 | +**5 points**: The notebook makes good use of markdown chunks to tell a clear story. Instructor-provided headings like "exercise II" are replaced by topical headings like "Global Temperature". The data used and the plots generated are clearly described ("the plot shows...") to help a reader understand what they are seeing, where the data comes from, and the key take-aways. <https://climate.nasa.gov/vital-signs> is a great example of supporting narrative describing the data and results, but tell the story in your own words. |
29 | 29 |
|
30 | 30 |
|
31 | 31 | ## 5. Use of GitHub (**4 points**) |
|