A Model for Concurrent Priority Queue on Manycore Architectures

Chaozhang Huang, Rutgers

Defense Committee: Professor Zheng Zhang (chair) Professor Desheng Zhang Professor Yongfeng Zhang

Abstract

The concurrent priority queue is one of the shared memory data structures that can be dynamically maintained and updated for communication among co-running threads, which allows data with high priority to be served before others in applications that require complex asynchronous communication patterns. For instance, the following algorithms can take advantages of a concurrent priority queue: Dijkstra’s shortest path algorithm, Prim’s minimal spanning tree algorithm, data compression and A* search in artificial intelligence, which are building blocks for complex system software including operating system scheduler, interruption handler, and garbage collection. However, it is challenging to exploit the performance of heap-based priority queue on many-core architectures like GPUs due to control divergence and memory irregularity. This thesis focuses on the development and evaluation of a model for concurrent priority queue on GPU. We present an efficient model for heap-based concurrent priority queue, called the generalized heap, which exploits both inter-node and intra-node parallelism by incorporating with multi-state locking mechanism and batch processing on data nodes. The performance of the concurrent priority queue based on the generalized heap model is thoroughly evaluated against previous serial and GPU implementations. We show a maximum of 19.49X and 2.11X speedup compared to previous serial and GPU implementation, respectively. We also apply our generalized heap model in real-world applications such as single-source shortest path and 0/1 knapsack problem with up to 1.23X and 12.19X speedup, respectively.

Go Back to Colloquiua Listing