Machine learning (ML) systems are increasingly deployed in safety- and security-critical domains such as self-driving cars and malware detection, where correct behavior on corner-case inputs is crucial. Existing approaches to testing ML system correctness depend heavily on manually labeled data and therefore often fail to expose erroneous behaviors on rare inputs.
In this talk, I will present the first framework for testing and repairing ML systems, especially in adversarial environments. In the first part, I will introduce DeepXplore, a whitebox testing framework for real-world deep learning (DL) systems. Our evaluation shows that DeepXplore successfully finds thousands of erroneous corner-case behaviors, e.g., self-driving cars crashing into guard rails and malware masquerading as benign software. In the second part, I will introduce machine unlearning, a general, efficient approach to repairing an ML system that exhibits erroneous behaviors. Our evaluation on four diverse learning systems and real-world workloads shows that machine unlearning is general, effective, fast, and easy to use.