Despite decades of investigation, software systems remain vulnerable to code defects. Even fewer assurance guarantees can be made about the reliability of new domains such as autonomous systems that are controlled by neural networks. The structure of a neural network poses significant challenges in reasoning about its lifetime operational behavior in a complex real-world environment.
In this talk, inspired by recent successful applications of data-driven techniques to real-world problems (e.g., AlphaGo), I will describe how to address the aforementioned conundrum by exploring data-driven techniques to discover high-level abstract software specifications and demonstrate how such techniques can enable scalable automated formal verification of programming systems.
Viewing a software verifier as a source of data, providing examples, counterexamples, and in-put-output examples to a hypothesis program specification, I will first present SynthHorn, a verification framework that allows machine learning algorithms to interact with a verification engine to synthesize meaningful specifications for real-world programming systems.
Second, as a continued exploration of the data-driven verification theme found in SynthHorn, I will show how to leverage data-driven formal verification to realize trustworthy machine-learning systems. Specifically, I will present SynthML, a general program synthesis framework that synthesizes programmatic specifications of complex reinforcement learning models (e.g., neural networks). It samples environment states that can be encountered by a cyber-physical system controlled by a deep neural network, synthesizing a deterministic program that closely approximates the behavior of the neural controller on these states and that additionally satisfies desired safety constraints. Executing a neural policy in tandem with a verified program distilled from it can retain performance, provided by the neural policy, while maintaining safety, provided by the program.
I will conclude the talk by discussing my two-fold vision: 1) applying machine-learning driven verification techniques to build trustworthy distributed systems, operating systems, and emerging AI-based systems, as well as 2) enabling transparent, robust and safe machine learning by means of language-based verification techniques.