Pre-Defense
1/23/2020 12:00 pm
CoRE 305

Programming and Managing Data-Driven Applications Between the Edge and the Cloud

Eduard Gibert Renart, Department of Computer Science

Defense Committee: Manish Parashar (Advisor), Ulrich Kremer, Srinivas Narayana

Abstract

Due to the proliferation of the Internet of Things (IoT), the number of devices connected to the Internet is growing. These devices are generating unprecedented amounts of data at the edge of the infrastructure. Although the generated data holds great potential, identifying and processing the relevant data points hidden in streams of unimportant data, and doing so in near real time, remains a significant challenge. The prevalent model of moving data from the edge of the network to the cloud is quickly becoming unsustainable: it adversely affects latency, network congestion, storage cost, and privacy, limiting the potential impact of IoT.

To address these challenges, this dissertation presents an IoT edge framework, called R-Pulsar, that extends cloud capabilities to local devices and provides a programming model for deciding what, when, where, and how data gets collected and processed. The dissertation makes the following contributions: (1) A content- and location-based programming abstraction for specifying what data gets collected and where the data gets analyzed. (2) A rule-based programming abstraction for specifying when to trigger data-processing tasks based on data observations. (3) A programming abstraction for specifying how to split a given dataflow and place operators across edge and cloud resources. (4) An operator placement strategy that aims to minimize an aggregate cost covering the end-to-end latency (the time for an event to traverse the entire dataflow), the data transfer rate (the amount of data transferred between the edge and the cloud), and the messaging cost (the number of messages transferred between the edge and the cloud). (5) Performance optimizations of the data-processing pipeline to achieve real-time performance on constrained devices.
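
As a rough illustration of the kind of trade-off described in contribution (4), the sketch below scores every edge/cloud placement of a small linear dataflow and picks the cheapest one. It is not R-Pulsar's actual strategy or API: the weighted-sum form of the cost, the weights, the fixed link latency, the operator figures, and all names (`Operator`, `aggregate_cost`, `best_placement`) are assumptions made for the example, and exhaustive search stands in for whatever placement algorithm the dissertation uses.

```python
from dataclasses import dataclass
from itertools import product


@dataclass(frozen=True)
class Operator:
    """Hypothetical per-operator profile (all numbers are made up)."""
    name: str
    edge_ms: float    # processing time per event on an edge device
    cloud_ms: float   # processing time per event in the cloud
    out_bytes: float  # bytes emitted per source event
    out_msgs: float   # messages emitted per source event


def aggregate_cost(ops, placement, src_bytes, src_msgs,
                   link_ms=50.0, w_latency=1.0, w_bytes=0.01, w_msgs=1.0):
    """Weighted sum of latency, bytes, and messages crossing edge<->cloud.

    Assumption: the data source lives at the edge, and every edge/cloud
    boundary crossing adds a fixed link latency.
    """
    latency = bytes_crossed = msgs_crossed = 0.0
    prev_site, prev_bytes, prev_msgs = "edge", src_bytes, src_msgs
    for op, site in zip(ops, placement):
        if site != prev_site:  # the stream crosses the edge/cloud boundary
            latency += link_ms
            bytes_crossed += prev_bytes
            msgs_crossed += prev_msgs
        latency += op.edge_ms if site == "edge" else op.cloud_ms
        prev_site, prev_bytes, prev_msgs = site, op.out_bytes, op.out_msgs
    return (w_latency * latency
            + w_bytes * bytes_crossed
            + w_msgs * msgs_crossed)


def best_placement(ops, **cost_args):
    """Exhaustively try all 2^n placements; only viable for small dataflows."""
    candidates = product(("edge", "cloud"), repeat=len(ops))
    return min(candidates, key=lambda p: aggregate_cost(ops, p, **cost_args))


# Illustrative dataflow: filter -> aggregate -> model inference.
ops = [
    Operator("filter", edge_ms=5, cloud_ms=1, out_bytes=100, out_msgs=0.1),
    Operator("aggregate", edge_ms=20, cloud_ms=2, out_bytes=10, out_msgs=0.01),
    Operator("model", edge_ms=500, cloud_ms=10, out_bytes=10, out_msgs=0.01),
]
placement = best_placement(ops, src_bytes=1000, src_msgs=1)
print(dict(zip((op.name for op in ops), placement)))
```

With these made-up figures, the cheap, data-reducing filter stays at the edge and the heavier operators move to the cloud; changing the weights shifts the split point, which is the tension an aggregate cost of this kind is meant to capture.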