Members of the department have garnered two best paper awards at FAST’16, the 14th USENIX Conference on File and Storage Technologies. FAST’16 recognized two of this year's published papers with best paper awards, and we are delighted to announce that both have Rutgers authors. FAST, the leading conference for research on file systems and storage technologies, is being held February 22-25 in Santa Clara, California. The conference received 115 submissions, of which 27 were selected for publication and two were selected as best papers.
Prof. Martin Farach-Colton is a co-author, together with collaborators from Stony Brook University, Facebook, Two Sigma and MIT, on the paper “Optimizing Every Operation in a Write-optimized File System.”
File systems that employ write-optimized dictionaries (WODs) can perform random writes, metadata updates, and recursive directory traversals orders of magnitude faster than conventional file systems. However, previous WOD-based file systems have obtained these performance gains only by sacrificing performance on other operations, such as file deletion, file or directory renaming, or sequential writes. Using three techniques (late-binding journaling, zoning, and range deletion), this paper shows that there is no fundamental trade-off in write-optimization: these dramatic improvements can be retained while matching conventional file systems on all other operations.
BetrFS 0.2, the file system described in the paper, delivers order-of-magnitude better performance than conventional file systems on directory scans and small random writes and matches the performance of conventional file systems on rename, delete, and sequential I/O. For example, BetrFS 0.2 performs directory scans 2.2x faster, and small random writes over two orders of magnitude faster, than the fastest conventional file system. But unlike BetrFS 0.1, it renames and deletes files at speeds commensurate with conventional file systems and performs large sequential I/O at nearly disk bandwidth. The performance benefits of these techniques extend to applications as well. BetrFS 0.2 continues to outperform conventional file systems on many applications, such as rsync, git-diff, and tar, and improves git-clone performance by 35% over BetrFS 0.1, yielding performance comparable to other file systems.
A copy of this paper, posted on the FAST website, appears here: https://www.usenix.org/conference/fast16/technical-sessions/presentation/yuan
Ph.D. student Ioannis Manousakis, Prof. Thu Nguyen, and former faculty member Prof. Ricardo Bianchini (now Chief Efficiency Strategist at Microsoft) are co-authors, together with collaborators from GoDaddy and Microsoft, on the paper “Environmental Conditions and Disk Reliability in Free Cooled Data Centers.”
Free cooling lowers datacenter costs significantly, but may also expose servers to higher and more variable temperatures and relative humidities. It is currently unclear whether these environmental conditions have a significant impact on hardware component reliability. The authors of this paper use data from nine hyperscale datacenters to study the impact of environmental conditions on the reliability of server hardware, with a particular focus on disk drives and free cooling. Based on this study, they derive and validate a new model of disk lifetime as a function of environmental conditions. The paper also quantifies the tradeoffs between energy consumption, environmental conditions, component reliability, and datacenter costs. Finally, based on the analyses and model, the paper derives server and datacenter design lessons.
The paper makes many interesting observations, including that (1) relative humidity seems to have a dominant impact on component failures; (2) disk failures increase significantly when operating at high relative humidity, due to controller/adaptor malfunction; and (3) though higher relative humidity increases component failures, software availability techniques can mask them and enable free-cooled operation, resulting in significantly lower infrastructure and energy costs that far outweigh the cost of the extra component failures.
A copy of this paper, posted on the FAST website, appears here: https://www.usenix.org/conference/fast16/technical-sessions/presentation/manousakis