Efficient and Scalable Network Servers

Ricardo Bianchini and Enrique V. Carrera

Department of Computer Science
Rutgers University


The Network Servers Project

Overview

The number of Internet users has increased rapidly over the last few years. This large user base is placing significant stress on the computing resources of popular services available on the Internet. Clusters of commodity computers interconnected by high-performance networks are currently being used to provide the scalability required by several of these services. The commercial HotBot search engine is an interesting example of a popular Internet service that performs several million queries per day using a very large cluster-based server. Other examples of services based on clustered servers abound.

The cluster-based servers (such as the CISCO Local Director and the IBM Network Dispatcher) usually distribute the clients' requests across the cluster nodes according to a simple load balancing mechanism or according to the type of file requested (image, text, etc). In these servers, each node is responsible for serving the requests assigned to it independently of other nodes. Thus, the main performance problems with these servers are that the memories of the cluster are utilized as a set of independent caches of disk data and the request distribution is oblivious to the contents of the different caches. Although this caching strategy performs well when a small set of files accounts for a large percentage of requests and the average size of the files is small, performance suffers when these conditions do not hold. Furthermore, the files serviced are becoming larger and more numerous, especially with the proliferation of multimedia files and content hosting servers. For instance, a WWW hosting service, such as that provided by most ISPs and dozens of other providers, is especially demanding, as WWW pages from a large number of renters (individuals and small corporations) are managed by the same set of nodes.

Under such adverse scenarios, these locality-oblivious servers may suffer from high cache miss rates (when the server's working set size exceeds the size of each local memory) among other problems. As a result, using the set of memories of the cluster as a single large cache and distributing requests among the nodes according to cache locality prove extremely profitable. In more detail, a locality-conscious server can schedule a request on a node that is likely to be caching the requested file locally. A strict implementation of such a server does not allow replication of cached files, which can increase cache hit rates significantly, but can also produce severe load imbalance due to excessively popular files. A looser implementation, in which some amount of replication is allowed, can promote cache locality and load balancing at the same time and, as a result, achieve superior performance. Pai et al. [ASPLOS98], Aron et al. [USENIX99,USENIX00], and ourselves [HPDC00,WWW00] have proposed and evaluated such servers.

Even though these servers achieve performance that far surpasses that of the locality-oblivious servers for large working set/memory size ratios, they rely on fairly complex TCP connection hand-off mechanisms, in which the connection initially accepted by some cluster node is transferred to the node that will actually service the request, which in turn responds directly to the requesting client. Unfortunately, the software implementing these mechanisms is not portable across different operating systems or even across different versions of the same operating system, since it involves modifications or extensions to the underlying operating system kernel.

To achieve portability and retain the performance properties of locality-conscious servers, we also proposed a server (called PRESS) that avoids connection hand-offs by transferring a requested file back to the client through the cluster node that initially accepted the connection. Furthermore, intra-cluster communication in PRESS is implemented with the Virtual Interface Architecture standard for user-level communication, which has implementations for virtually all modern system-area networks, such as Myrinet and Giganet.

PRESS has so far been evaluated as a cluster-based Web server running on Linux boxes connected by a Giganet network. The paper [PPoPP01] describing the portable scheduling of requests for static Web content in PRESS and the corresponding source code are now available. Also available is our evaluation of the benefit of user-level communication features for content-aware network servers such as PRESS [HPCA02]. Our journal paper [TPDS05] describes the entire design and evaluation of PRESS.

Using PRESS, we have also done work on power and energy conservation and Internet service availability.

Papers

People

Source Code

The source code for PRESS is now available here. Note that we would like you to cite our PPoPP '01 paper (full reference above) in case you actually use our server in experiments to be reported in the computing literature.