Skip to content

contains ALL code generated for Principles of Urban Informatics Class at NYU CUSP Fall 2015,16,17,18

Notifications You must be signed in to change notification settings

fedhere/UInotebooks

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 

Repository files navigation

UInotebooks

Diversity is considered a resource that enriches us culturally and intellectually in this class. No instances of harassment or attempts to marginalize students will be tolerated in my class. If you have concerns please come talk to me

This is a repo contains ALL code generated for Principles of Urban Informatics Class at NYU CUSP Fall 2016-18 for all the lectures taught by FBB.

(previous years: 2015 is here )

For the current homework sets and labs go to this Github repo.

This repo includes notebooks that will be explored in class, notebooks to be used as templates for lab sessions, and notebooks that generate plots used in the class, and class slides.

To run the notebooks, make sure that you have a directory for the plots, and outputs to be stored in, identified by environmental variables PUI2015, PUI15PLOTS, PUI2016 etc, as needed.

Numerous python packages are needed: numpy, scipy, sklearn, statsmodules, pandas. Mostly I try to be PEP8 compliant, but I am sloppy sometimes (while you should never be!).

If you are a PUI student you may want to make a copy of this repository, and you definitely want to check for updates regularly.

If you just want to have your own copy of the repository on your machine what you want to do it clone the repository on your machine as

git clone https://github.com/fedhere/UInotebooks.git

That way if I make changes you can update them locally on your machine with

git pull

but you will not be able to make changes to my repository (as it should be).

Otherwise, if you want the repository on your account, with the ability to make changes to your own copy, you can fork the repository and have your own copy in your github account (it will be at http://github.com//PUI2018_fb55). In which case you want to keep it synchronized following these instructions to coniguting a remote for a fork and syncing a fork

If you see a mistake (there are many of course) and you want to propose a change you can submit a pull request, but be very clear about the change and it motivation in the description of your pull request, otherwise I will likely dismiss it as something you did accidentally.

The material here is for a class based on the following syllabus:

Lecture: philosophy and good practices of data science: the flow chart of a data-driven project from idea to divulgation,  the concepts of falsifiability, reproducibility, open science, the importance of version control, iPython Notebooks
Lab: command line tools. github repositories, setting up your environment, Python vs iPython, and iPython notebooks
Lecture: acquiring and preparing data (CSV, TSV, downloadable ascii files, basic SQL) merging data from different files, plotting histograms and scatter plots, data types incl ordinal, continuous, categorical data
Lab: read and clean data, Citibikes, Pluto, Census, introduction to data structures (dictionaries, lists, arrays), style guides
Lecture: Introduction to the statistics, why everything is gaussian (...or not), bias, basic distributions, moments, Hypothesis testing (chi-square, z-test, p-value)
Lab: basic statistics on Pluto, Census, Citibikes data, moment extraction, deviations from gaussianity/poissonity, histograms, proper binning.
Lecture: PDF/CDF, data dredging, sample of 1, correlation vs causality, error analysis, testing models (KS, anderson darling, KL divergence), basic plotting.
Lab: hypothesis testing
Lecture: Likelihood, OLS, WLS, basic bayesian concepts (maybe), missing data, small data, Unstructured data, API.
Lab: goodness of fit, geopandas
Lecture: (time)-series  techniques: smoothing, detrending, stationary, non-stationary,  homeo- & hetero-scedastic noise, vectorization
Lab: trash time series
Lecture: Visualizations. Communication through visualizations, history, significance, good and bad visualization examples, what have we learnt since the 1800s?
Lab: create a viz with pylab, small-multiples with pylab
Guest Lecture: Data Hygiene
Lab: Data Hygiene
Lecture:  Spatial+Temporal Urban data (multidimensional data), clustering. 
Lab:  ZBP data
Lecture: GeoPandas
Lab: Geopandas
Lecture: Spatial interpolation - Kriging
Lecture: Trees
Lecture: Monte Carlo methods and MCMC

Missing Topics that could or should be included:

optimization techniques (gradient descent, annealing...)
designing experiments 
web/mobile coding
Audio vs Video data
multiprocessing/multithreading
Visualizations w D3
NPL
Graph Theory
GIS 
D3+GIS visualizations 

PUI2018_fb55

About

contains ALL code generated for Principles of Urban Informatics Class at NYU CUSP Fall 2015,16,17,18

Resources

Stars

Watchers

Forks

Packages

No packages published