This class aims to provide you with a basic set of tools for data literacy, as well as a general view of the impact of data on society and the elements of common-sense data analysis and reasoning. A significant part of the class will be learning the foundations of R. R is a statistical software environment and programming language that we’ll use to analyze and visualize datasets. Learning even simple R will take some work; however, if you master the basics covered in this class, you’ll gain a concrete, marketable skill that may well prove extremely useful in your academic and professional life.
On the statistical side, we’ll cover basic topics from statistics and probability that are required to argue persuasively using data (a list of some of the topics to be covered can be found below). This is not a “typical” Statistics 101 class; instead of covering an exhaustive list of topics and asking you to memorize many formulas, our goal is to focus only on the topics most important for analyzing data convincingly, and to put them to use right away by solving “hands-on” weekly data puzzles.
This class is taught in a unique manner: students have to solve “data puzzles” (one or more each week) and defend their solutions in class in the so-called “Court of Data”. Students compete in a semester-long competition for the titles of Data Masters, based on the aggregated score for all data puzzles and the project.
One of the objectives of the class is to show the danger of drawing false, random conclusions from data and to teach the right methodology of “healthy skepticism”.
We will also discuss how not to be fooled by data and show examples of rushed, ad hoc conclusions drawn from so-called “big data” in the news and on the web. In addition, we will examine both the upside and the downside of big data on the web. We will talk about privacy, anonymity versus personalization, and data ownership as we increasingly rely on online services.
In the final project, your data findings should have real consequences: preferably they should be “actionable” and matter in the real world.