2015 CS109A: Harvard's Data Science

Hubway Clustering

This course is about learning from data in order to gain useful predictions and insights. This third iteration of the course builds on the same ideas as the previous two, using methods from the five key facets of an investigation: data wrangling, cleaning, and sampling to get a suitable data set; data management to be able to access big data quickly and reliably; exploratory data analysis to generate hypotheses and intuition; prediction based on statistical methods such as regression and classification; and communication of results through visualization, stories, and interpretable summaries.

Python was used for all programming assignments and projects. All lectures are posted here.

  • Joe Blitzstein, Statistics
  • Hanspeter Pfister, Computer Science
  • Verena Kaynig-Fittkau, Computer Science
  • Rahul Dave, Head TF