Commit c785d0b

add book refs, bit more info
1 parent 132ff9c commit c785d0b

1 file changed

+12
-12
lines changed

README.md

Lines changed: 12 additions & 12 deletions
@@ -11,38 +11,38 @@ To get a copy: [Inspection copy for instructors](https://www.cambridge.org/highe
1111

1212
## Acknowledgments
1313

14-
We'd like to say thanks to [Ágoston Reguly](https://github.com/regulyagoston), who created the template for the initial coding supplement in R to the Data Analysis handbook. We followed his steps in writing the Python version of the teaching material.
14+
We'd like to say thanks to [Ágoston Reguly](https://github.com/regulyagoston), who created the template for the Coding for Data Analysis series. We followed his steps in writing the Python version of the teaching material.
1515

1616

1717
## Status
1818

19-
This is version 0.1, as of August 29, 2022.
19+
This is version 0.1, as of 13 September, 2022.
2020

21-
Comments are really welcome in email or as a GitHub issue.
21+
Comments are really welcome -- just add a GitHub issue.
2222

2323

2424
## Overview
2525

26-
The course is an introduction to the Python programming language, its software environment, and also to data exploration, data transformation, visualization, and more advanced data analysis.
26+
The course is an introduction to the Python programming language, its software environment, and also to data exploration, data transformation, visualization, and more advanced data analysis. The idea is that people will learn to work with Python while learning to carry out data analysis.
2727

2828
The material primarily consists of `Jupyter notebooks`, and is sometimes supplemented with additional data. In most cases, however, we used the [textbook's datasets](https://gabors-data-analysis.com/datasets/) to bring the course as close to the original textbook as possible.
2929

30-
Lectures 0 to 6 are general introductions to Python and its concepts. These notebooks focus on coding principles, Python's main building blocks, and introduce the data analyst's most important data structure: Pandas dataframes.
30+
Lectures 0 to 9 mostly complement [Part I: Data Exploration (Chapter 1-6)](https://gabors-data-analysis.com/chapters/#part-i-data-exploration). Lectures 0 to 6 are general introductions to Python and its concepts. These notebooks focus on coding principles, Python's main building blocks, and introduce the data analyst's most important data structure: Pandas dataframes. Lecture 7 gives insight into how to use Python for data exploration. Lectures 8 and 9 expand the toolkit for advanced data analytics techniques.
3131

32-
Lecture 7 gives insight into how to use Python for data exploration. Lectures 8 and 9 expand the toolkit for advanced data analytics techniques.
33-
34-
Lectures 10 to 16 cover everything you need to know about linear regression in Python on an introductory level. We start with simple linear regression on cross-sectional data, then we explore binary models, and multiple linear regression. Finally we discuss the basic time-series regression model and its intricacies.
32+
Lectures 10 to 16 complement [PART II: Regression Analysis (Chapter 7-12)](https://gabors-data-analysis.com/chapters/#part-ii-regression-analysis) and cover everything you need to know about linear regression in Python on an introductory level. We start with simple linear regression on cross-sectional data, then we explore binary models, and multiple linear regression. Finally we discuss the basic time-series regression model and its intricacies.
3533

3634

3735
## Philosophy and how to use
3836

3937
We tried to put together a benchmark course to supplement the Data Analysis textbook and to help anyone, students and instructors alike, follow the book's material. Anyone is free to use the notebooks in their current or in any modified form, with proper reference to the original material.
4038

41-
While we try to teach the basics of Python, this is not classical coding course material. The notebooks take the reader through the data analysis workflow of the first 12 chapters of the textbook, providing assistance in Python along the way. It is possible to learn the very basics of Python using these notebooks, but simply completing the exercises won't make anyone a programmer. Using the codebase _and_ the textbook together, however, does help in understanding statistical and data analytics concepts and seeing the theory in practice.
39+
While we teach the basics of Python, this is not classical coding course material. The notebooks take the reader through the data analysis workflow of the first 12 chapters of the textbook, providing assistance in Python along the way. You will gradually learn what is needed to carry out analytical steps, from loading data to running regressions. We will suggest additional resources to learn more coding tools and enhance your skills.
40+
41+
It is possible to learn the very basics of Python using these notebooks, but simply completing the exercises won't make anyone a programmer. Using the codebase _and_ the textbook together, however, does help in understanding statistical and data analytics concepts and seeing the theory in practice.
4242

4343
The lectures are pre-written so that an educated reader can follow and understand them. Nevertheless, instructors may want to modify and tailor the code to their own teaching habits and philosophy. Homework is not part of the codebase, which leaves instructors another task for the practical coding sessions of their data analytics courses.
4444

45-
The material's main focus is the manipulation and analysis of tabular data. Pandas dataframes provide most of the tools for these manipulation exercises, and we use the `statsmodels` package for running linear regressions. We added a basic matplotlib intro but we use `plotnine`, the Python implementation of _ggplot_, for visualization and graphical representation.
45+
The material's main focus is the manipulation and analysis of tabular data. `Pandas` dataframes provide most of the tools for these manipulation exercises, and we use the `statsmodels` package for running linear regressions. As for data visualization, we added a basic intro to the most popular `matplotlib` package, but rely heavily on a new favorite: `plotnine`, the Python implementation of R's _ggplot_, for visualization and graphical representation.
4646

4747

4848
## Course content
@@ -70,6 +70,6 @@ The material's main focus is the manipulation and analysis of tabular data. Pand
7070

7171

7272

73-
## Note
73+
## Technical Note: environment
7474

75-
Most data science courses use the Anaconda environment for Python. We, however, use `pip` and `pipenv`, and run Jupyter notebooks from the course's environment. Anaconda is a great tool for data analysis and data science, but once someone goes beyond ad-hoc data analysis and needs to develop and deploy advanced data solutions in a production environment in Python, `pip` is going to be the way to go.
75+
Most data science courses use the Anaconda environment for Python. We, however, use `pip` and `pipenv`, and run Jupyter notebooks from the course's environment. Anaconda is a great tool for data analysis and data science, but once someone goes beyond ad-hoc data analysis and needs to develop and deploy advanced data solutions in a production environment in Python, `pip` is going to be the way to go.