This class is related to 541 (graduate databases) and to 539 (physical databases), but it is much more systems-oriented and focuses on the practical, hands-on aspects of storing and managing massive amounts of structured and unstructured data.
The main course requirement will be a semester-long project involving Apache Spark and/or deep learning. See the Course Project Slides for details. There will also be homework assignments related to the project during the semester (note: currently just the intermediate report).
Additionally, there will be small in-class quizzes.
The objective is to teach the skills needed to manage (query, analyze, and learn from) very large amounts of structured and unstructured data.