PhD Defense
6/5/2014 02:00 pm
CoRE A (Room 301)

Model Based Validation for Operator Mistakes

Andrew Tjang, Rutgers University

Defense Committee: Thu Nguyen (Chair), Richard Martin, Vinod Ganapathy, and Raj Rajagopalan (HP Labs)

Abstract

In studies separated by decades, operator mistakes have been identified as a significant source of unavailability in Internet services. They can be caused by anything from static misconfiguration to the physical placement of wires and machines. Detecting and repairing these mistakes is often time-consuming, and for many services, unavailability results in loss of revenue and/or clients.

In this defense, we present a series of tools that help those who design and build the infrastructure of Internet services mitigate the effects of operator mistakes. We propose an assertion-based language, A, that provides a formalized specification of correct behavior; it can be used to bolster system understanding and to help flag operator mistakes in a distributed system. We look at examples of these mistakes, their effects, and their manifestations in both an academic environment and real-world applications. We then explore methods to simplify the assertion-writing process using machine learning techniques. Because the attribute space is large and the data sets are small, we investigate a variety of optimizations, including a refinement loop and various filtering algorithms.