• Course Number: 16:198:526
  • Course Type: Graduate
  • Semester 1: Spring
  • Credits: 3
  • Description:

    This is a foundational Data Science class within the Computer Science Department.

    The goal is for students to become proficient in the current major techniques and systems for algorithmic data analysis, exploration, visual interaction, and summarization.

    Students will complete a competitive group project that will incorporate all the facets of software product development, namely Conceptualization, Data collection, Algorithm identification and Implementation, User Interface, and Evaluation.

  • Instructor Profile: Abello Monedero, James
  • M.S. Course Category: Visualization
  • Category: B (M.S.), B (Ph.D.)
  • Prerequisite Information:

     

     

  • Topics:

    Module Number   Module Name
    0               Introduction to Data Interaction & Visual Analytics
    1               Visual Analytics Introduction, Plotly, Tree Maps, Barycentric Representations & Sample Project Presentations
    2               Backend to Frontend Design & Tools, Libraries, Plots, and Interactions, Linear Regression
    3               Introduction to Massive Computing and MapReduce
    3               Hadoop Distributed File System (HDFS) and Sample Project Presentation
    4               Streaming Data and Introduction to Apache Spark and CAP Theorem
    4.5             Decision Trees and MapReduce
    5               External Memory Algorithms, I/O Model
    5               Graph Visualization
                    Stage 1 Project Presentations
    5               Force Directed Layout Algorithms
    6               Graph Visualization Literature Review
                      Graph Cities: Giga Graph Visualization
                      Giga Graph Visualization (supplement)
                    Spring Break
    7               Data Analytics (Part I: Data Reduction and Dictionaries)
                      Jaccard Similarity, Conditional Probability
                      Data Analytics (Part II: Scalable Algorithms and Associative Statistics)
                      Pearson Coefficients, Estimators, Histograms
                    Stage 2 Project Presentations
    7               ANOVA Supplement
                      Null Hypothesis, Sum of Squares, F Statistic, P Value, Two Sample t-test, Bonferroni Correction
    7               Data Analytics (Part III: Multi-variate Regression)
                      Linear Regression (Mean Square Error, Gradient Descent)
                      Logistic Regression (Model Formulation, Update Rule, Newton-Raphson Algorithm, Generalized Least Squares Estimator)
    8               Data Management & Data Mining
    9               Time and Space & Infrastructure
    10              Perception and Cognitive Aspects, Evaluation & Recommendations
    Review          Final Review
                    Final Project Presentations and Best Class Project Announcements

  • Expected Work: Bi-Weekly Homeworks, Midterm Exam(s), Mid Project, Final Project Demo, Final Exam. Students in this class will become proficient with the major techniques and systems for algorithmic data analysis, exploration, visualization, interaction, and summarization. Each group will complete a competitive project that incorporates all the major facets of computer-human interface development.
  • Notes:

    Objective: Expose and train students in all the facets of computer-user interface development.

    The guiding evaluation principles will be the “value” of the information extracted from a variety of data sets at different scales, the methods and models used, the interactivity of the interface, and the usefulness of the final application.

    Reference Textbooks: The following (optional) books are useful:
    • VisMaster-book
    • Interactive Visual Data Analysis, Christian Tominski and Heidrun Schumann, AK Peters Visualization Series, CRC Press, 2021 (IVDA book)
    • Unsupervised Learning for Dimensionality Reduction and Data Visualization, B.K. Tripathy, A. Sundareswaran, and S. Ghela, CRC Press, 2022
    • Semiology of Graphics, Jacques Bertin, translated by William J. Berg, ESRI Press, 2010
    • Deep Learning, Volume 1 and Volume 2, Andrew Glassner, The Imaginary Institute, Seattle, WA, 2020
    • Computational Social Science, edited by R. Michael Alvarez, Cambridge University Press, 2016
    • Selected papers from the literature on Algorithmic Analytics, Visualization, and Computer Human Interaction

    Grading:

    1. Bi-Weekly Homeworks 
    2. Midterm Exam 
    3. Mid Project Draft 
    4. Final Project Demo (evaluated by a Faculty-Industry Panel)