System and Compiler Support for Component-Based Construction of Scalable Internet Services

NSF Information

PI: Thu D. Nguyen, co-PIs: Richard P. Martin and Barbara Ryder

Abstract. We propose to investigate system support for an emerging class of continuously available applications. This support is motivated by the emergence of the Internet as the global, ubiquitous networking infrastructure, and its accompanying computing model where much of the computing takes place on servers rather than local machines. Continuously available applications typically offer remote services to a large number of users who are potentially located far apart geographically; examples include auction systems, stock exchange systems, web servers, and email servers.

Continuously available applications can be as computationally intensive as traditional large-scale computing problems such as grand challenge or massive data processing problems: for example, data from Geocities, a popular web-hosting service, indicate that its servers must process an average of 2000 requests per second. (Note that the peak processing rate is likely much higher, since usage of such a site is not uniform throughout a typical day.) In addition, these applications pose new challenges in terms of both availability and soft real-time performance. Consider a provider of calendaring services over the Internet. For such a service to be useful, a user's calendar must be available wherever the user is, whenever the user wishes to access it. That is, the service must be location independent as well as continuously available at all hours of the day. Users in Los Angeles will not want to miss their meetings because a snowstorm causes a power outage in New York, where the calendaring service may be based. Furthermore, the service must perform similarly whether the user is in Seattle or in Boston; that is, operations on the calendar should complete in comparable time.

We believe that, together, these requirements differ significantly from traditional fault-tolerance requirements. Coupled with the opportunity to leverage the weaker consistency requirements of many continuously available applications (e.g., a web search engine may return stale data), these differences fundamentally require revisiting how we design and program highly available systems. In particular, we pose the challenge of how to build scalable, high-performance applications that are continuously available 24 hours a day, 7 days a week, 365.25 days a year, across a span of 10 years. Further, by available we do not mean that a request eventually gets serviced, but rather that all requests must be serviced within some time bound.
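The distinction between eventual service and time-bounded service can be made concrete with a small sketch. The code below is illustrative only and is not part of the proposed system: the `Request` record, the function names, and the 0.5-second bound are all assumptions chosen for the example. It compares the fraction of requests that complete at all with the fraction that complete within a given bound.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass(frozen=True)
class Request:
    """Hypothetical record of one request; latency_s is None if it never completed."""
    latency_s: Optional[float]


def eventual_availability(requests: Sequence[Request]) -> float:
    """Fraction of requests that were serviced at all, however late."""
    if not requests:
        return 1.0  # assumption: no requests means nothing was missed
    served = sum(1 for r in requests if r.latency_s is not None)
    return served / len(requests)


def bounded_availability(requests: Sequence[Request], bound_s: float) -> float:
    """Fraction of requests serviced within bound_s seconds."""
    if not requests:
        return 1.0  # assumption: same convention as above
    on_time = sum(
        1 for r in requests
        if r.latency_s is not None and r.latency_s <= bound_s
    )
    return on_time / len(requests)


# Four sample requests: two fast, one slow, one never serviced.
sample = [Request(0.1), Request(0.4), Request(2.5), Request(None)]
eventual = eventual_availability(sample)
bounded = bounded_availability(sample, bound_s=0.5)
```

Under the time-bounded definition, the slow request counts against availability just as the failed one does, so a service can look healthy by the eventual measure while falling short of the stricter one.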

While meeting the above challenge in its entirety is our eventual goal, in this proposal, we limit our attention to exploring the following building blocks:

The above research agenda represents concrete steps towards meeting the availability requirements of continuously available applications, which we believe will represent the core of Internet computing. In particular, our highly available data structures should be useful to researchers exploring the construction of future continuously available applications.