NJPLS at Rutgers University

September 30, 2016

This instance of NJPLS will take place at Rutgers University. The talks will be in the Core Lecture Hall (Room 101), and lunch will be in Core 301.

Registration for this event is now closed.

Travel instructions may be found here. If you are driving, you can park in Lots 64, 60A, and 60B without a permit (there will be Event Parking signage). Maps of the various parking lots are available here.

Agenda

9:00am   Breakfast
9:30am   Talks
10:30am  Coffee
11:00am  Talks
12:15pm  Lunch
1:45pm   Talks
3:15pm   Coffee
3:30pm   Talks

This meeting is organized by Santosh Nagarakatte.

Abstracts

Checked C

David Tarditi, Microsoft Research

This talk will present the Checked C research project, which is investigating how to extend the C programming language so that programmers can write more secure and reliable C programs. The project is developing an extension to C called Checked C that adds checking to C to detect or prevent common programming errors such as buffer overruns, out-of-bounds memory accesses, and incorrect type casts. The extension is designed to be used for existing system software written in C.
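
For a flavor of the extension (a minimal sketch based on the publicly documented Checked C syntax, not an excerpt from the talk), a bounds-annotated pointer lets the compiler insert or statically discharge bounds checks:

    /* Sketch of Checked C bounds annotations; see the Checked C
       specification for the exact syntax and semantics. */

    /* _Array_ptr<int> is a pointer whose valid extent is declared by a
       bounds expression; accesses outside count(n) are caught. */
    int sum(_Array_ptr<int> a : count(n), int n) {
        int total = 0;
        for (int i = 0; i < n; i++) {
            total += a[i];  /* checked access */
        }
        return total;
    }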

Top-to-bottom verification of well synchronized C programs on weakly consistent machines

Santiago Cuellar, Princeton University

Well synchronized programs should not have data races, and programs whose sequentially consistent executions have no data races should run on weakly consistent multiprocessors as if they were sequentially consistent. The former claim comes from Dijkstra and Hoare through formalizations by O’Hearn and Brookes; the latter was stated by Saraswat et al. as a desideratum for compilers and multiprocessors.

We prove these claims for Concurrent Separation Logic, the CompCert C compiler, and the Intel x86 (IA-32) computer architecture. C programs provable in Concurrent Separation Logic, when compiled with this optimizing C compiler and run on that machine, will have the (sequentially consistent) behavior proved at the source level. We prove it in a modular way: a theorem about the soundness of CSL with respect to the operational semantics of C (and our 5 synchronization functions); a theorem about the correctness of the C compiler; a theorem about the Intel x86-TSO architecture.
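
As a minimal illustration of the kind of program these theorems cover (not code from the talk), a C program in which every access to shared state is guarded by a lock is well synchronized; data-race freedom of this sort is exactly what a Concurrent Separation Logic proof certifies:

    #include <pthread.h>

    /* Shared state is touched only inside the critical section, so the
       program is data-race-free; by the claims above, its behavior on a
       weakly consistent machine matches its sequentially consistent
       behavior. */
    static int counter = 0;
    static pthread_mutex_t lock = PTHREAD_MUTEX_INITIALIZER;

    void *increment(void *arg) {
        (void)arg;
        pthread_mutex_lock(&lock);    /* acquire before the access */
        counter++;
        pthread_mutex_unlock(&lock);  /* release after the access */
        return NULL;
    }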

ApproxHadoop: Approximation with MapReduce Programs

Thu Nguyen, Rutgers University

Verifying Constant Time Implementations

Michael Emmi, Nokia Bell Labs

The constant-time programming discipline is an effective countermeasure against timing attacks, which can lead to complete breaks of otherwise secure systems. However, adhering to constant-time programming is hard on its own, and extremely hard under additional efficiency and legacy constraints. This makes automated verification of constant-time code an essential component for building secure software.
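
To illustrate the discipline itself (a standard textbook example, not code from the talk): branching on a secret leaks information through timing, while a bit-masked selection executes the same instructions regardless of the secret:

    #include <stdint.h>

    /* Not constant time: which path executes depends on the secret. */
    uint32_t select_branchy(uint32_t secret_bit, uint32_t a, uint32_t b) {
        return secret_bit ? a : b;
    }

    /* Constant time: mask is all ones when secret_bit is 1 and all
       zeros when it is 0, so the same instructions run either way. */
    uint32_t select_ct(uint32_t secret_bit, uint32_t a, uint32_t b) {
        uint32_t mask = (uint32_t)(-(int32_t)secret_bit);
        return (a & mask) | (b & ~mask);
    }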

We propose a novel approach for verifying constant-time security of real-world code. Our approach is able to validate implementations that locally and intentionally violate the constant-time policy, when such violations are benign and leak no more information than the public outputs of the computation. Such implementations, which are used in cryptographic libraries to obtain important speedups or to comply with legacy APIs, would be declared insecure by all prior solutions.

We implement our approach in a publicly available, cross-platform, and fully automated prototype, ct-verif, that leverages the SMACK and Boogie tools and verifies optimized LLVM implementations. We present verification results obtained over a wide range of constant-time components from the NaCl, OpenSSL, FourQ and other off-the-shelf libraries. The diversity and scale of our examples, as well as the fact that we deal with top-level APIs rather than being limited to low-level leaf functions, distinguish ct-verif from prior tools.

Vertical Composition of Reversible Atomic Objects

Eric Koskinen, Yale University

The classic Herlihy/Wing notion of concurrent objects has had great success in theories and implementations (e.g. java.util.concurrent), providing programmers with the simple abstraction of an atomic object. Since then, software transactions have appeared, also touting the goal of providing an atomicity abstraction. However, despite some vertical composition strategies within particular STM implementations, a fundamental concept of vertical composition has remained elusive.

In this talk, I will present our recent results in which we distill the essence of vertical composition, with the notion of reversible atomic objects. By restricting occurrences of transactions to the method boundary and requiring that every object method construct its own inverse, we obtain a cleaner semantics that supports vertical composition. In fact, we do not even require that one layer use the same implementation (e.g. pessimism versus optimism) as another, nor that the object be transactional at all. Formally, we begin with a semantics in which abstract-level operations are composed from constituent base operations, accounting for conflict and inverses. These transactional implementations are put in the context of an environment that includes a novel deadlock-mitigating contention manager that ensures progress. The contention manager may, at any point, apply inverses on behalf of a currently executing transaction. Our work provides the first proof that programs composed with implementations in this framework are a contextual refinement of the same programs composed instead with atomic specifications, and that layers can be composed vertically.
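
A minimal sketch of the "every method constructs its own inverse" idea (a hypothetical C rendering, not the authors' formalism): each operation performs its effect and records the undo action that a contention manager could later apply:

    /* Hypothetical sketch: an operation returns its own inverse, so a
       contention manager can roll a transaction back by applying the
       recorded inverses in reverse order. */
    typedef struct counter { long value; } counter_t;

    typedef struct inverse {
        void (*undo)(counter_t *obj, long saved);
        long saved;  /* state needed to invert the operation */
    } inverse_t;

    static void undo_add(counter_t *c, long saved) {
        c->value = saved;  /* restore the value before the add */
    }

    /* The method performs its effect and constructs its inverse. */
    inverse_t counter_add(counter_t *c, long delta) {
        inverse_t inv = { undo_add, c->value };
        c->value += delta;
        return inv;
    }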

Our compositional treatment in terms of a single shared log gives rise to a novel transactional variant of the universal construction. I will discuss our library of reversible atomic objects, demonstrating that it is easy and intuitive to build complex concurrent implementations (e.g. a multi-threaded layered filesystem) by vertical composition of atomic components. This is joint work with Paul Gazzillo, Timos Antonopoulos, and Zhong Shao.

A User-Guided Approach to Program Analysis

Xin Zhang, Georgia Tech

Program analyses often produce undesirable output due to various approximations. We present a novel approach that allows user feedback to guide such approximations towards producing the desired output. We formulate the problem of user-guided program analysis in terms of solving a combination of hard rules and soft rules: hard rules capture soundness while soft rules capture degrees of approximations and preferences of users. We solve the rules in a manner that is sound (satisfies all hard rules), optimal (maximally satisfies soft rules), and scales to sophisticated analyses and complex programs. We demonstrate the effectiveness of our approach on diverse analyses expressed in Datalog on large, widely-used Java programs.

Analyzing Build Systems with Partial Makefile Evaluation

Paul Gazzillo, Yale University

Collections of Makefiles are frequently used to specify the build-time configuration of software projects. The Linux kernel, for example, has over 14,000 configuration variables and 20,000 lines of Makefile code that specify the kernel build units. These Makefiles are written in a macro-like language and must be evaluated to obtain a given build configuration specification. Given this complexity, it is not surprising that bugs lurk in these Makefile collections, resulting in invalid configurations once expanded, and that evaluating Makefiles over all configurations is intractable.

In this talk we introduce a technique for automatically analyzing configuration Makefiles. This is the first analysis capable of obtaining the complete enumeration of all possible Linux kernel compilation units and their configuration properties, critical information for project-wide software engineering tools such as bug-finders, refactorings, and code browsers. The key challenge is that Makefile programs rely heavily on meta string manipulation: Even variable names can be manipulated as strings at runtime, making static analysis difficult. Our solution is a form of partial evaluation, combining symbolic and concrete execution. The algorithm evaluates statements symbolically but unfolds an expression to its many concrete values, balancing tractability with meaningful results. Variables are treated symbolically when used in predicates, enabling efficient handling of configuration properties, but treated concretely when used as string values, producing useful program state.

We have implemented our approach as a new tool called Kmax and evaluated it on the Linux kernel build system. Kmax is the first tool that is able to find all kernel compilation units and their configuration properties. Our results show that Kmax is precise (previous tools miss over a thousand files that Kmax finds) and performant, taking about a minute and a half for the x86 version of the Linux kernel, only twice as slow as the fastest of the tools compared.

Loopy: Programmable and Formally Verified Loop Transformations

Nimit Singhania, University of Pennsylvania

This talk presents a system, Loopy, for programming loop transformations. Manual loop transformation can be tedious and error prone, while fully automated methods do not guarantee improvements. Loopy takes a middle path: a programmer specifies a loop transformation at a high level, which is then carried out automatically by Loopy, and formally verified to guard against specification and implementation mistakes. Loopy’s notation offers considerable flexibility with assembling transformations, while automation and checking prevent errors. Loopy is implemented for the LLVM framework, building on a polyhedral compilation library. Experiments show substantial improvements over fully automated loop transformations, using simple and direct specifications.
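
For a sense of the transformations involved (a generic before/after of loop interchange, not Loopy's actual specification notation), the transformed loop must compute the same result as the original, which is what the verification step must establish:

    #define N 1024

    /* Before: the inner loop strides across rows, which is
       cache-unfriendly for a row-major array. */
    void scale(double a[N][N], double s) {
        for (int j = 0; j < N; j++)
            for (int i = 0; i < N; i++)
                a[i][j] *= s;
    }

    /* After interchange: the inner loop walks contiguous memory.
       The transformation is valid here because iterations are
       independent, a fact the formal check must prove. */
    void scale_interchanged(double a[N][N], double s) {
        for (int i = 0; i < N; i++)
            for (int j = 0; j < N; j++)
                a[i][j] *= s;
    }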

Executable Categorical Models of Type Theory

Gershom Bazerman, S&P Global Market Intelligence

We present work in progress on an approach to the problem of developing strongly typed embedded languages in Haskell in conjunction with type-safe interpreters and normalizers. Our approach is to follow the logic of the Curry-Howard-Lambek correspondence, proceeding from categorical (presheaf) models of typing judgments. We then render these models directly into Haskell via an "arrows-first" encoding using Generalized Algebraic Data Types. This yields succinct, type-safe specifications with extremely compact interpreters and normalizers. Crucial use is made of the representation of exponentials in presheaf categories as a way to capture higher-order behavior. With regard to variable binding, this approach "covers" both nested abstract syntax (type-safe de Bruijn notation) and higher-order abstract syntax, yielding some insight into their origin and relationship.

User Interactions and Permission Use on Android

Kristopher Micinski, University of Maryland

Android and other mobile operating systems ask users for authorization before allowing apps to access sensitive resources such as contacts and location. We hypothesize that such authorization systems could be improved by becoming more integrated with the app’s user interface. In this paper, we conduct two studies to test our hypothesis. First, we use AppTracer, a dynamic analysis tool we developed, to measure to what extent user interactions and sensitive resource use are related in existing apps. Second, we conduct an online survey to examine how different interactions with the UI affect users’ expectations about whether an app accesses sensitive resources. The results of our studies suggest that user interactions such as button clicks can be interpreted as authorization, reducing the need for separate requests; but that accesses not directly tied to user interactions should be separately authorized, possibly when apps are first launched.