PhD Defense

Performance Profilers and Debugging Tools for OpenMP Applications


Wednesday, December 16, 2020, 11:00am - 01:00pm


Speaker: Nader Boushehrinejad Moradi

Location: Remote via Zoom


Committee:

Prof. Santosh Nagarakatte (Chair)

Prof. Badri Nath

Prof. Srinivas Narayana

Prof. Martha Kim (Columbia University)

Event Type: PhD Defense

Abstract: OpenMP is a popular application programming interface (API) for writing shared-memory parallel programs. It supports a wide range of parallel constructs that express different types of parallelism, including fork-join and task-based parallelism. Using OpenMP, developers can parallelize a program incrementally, adding parallelism until their performance goals are met. In this dissertation, we address the problem of assisting developers in meeting the two primary goals of writing parallel programs in OpenMP: performance and correctness.
First, writing OpenMP programs that achieve scalable performance is challenging. An OpenMP program that achieves reasonable speedup on a system with a low core count may not achieve scalable speedup when run on a system with more cores. Traditional profilers report program regions where significant serial work is performed. In a parallel program, optimizing such regions may not improve performance, since doing so may not increase the program's parallelism. To address this problem, we introduce OMP-Adviser, a parallelism-centric performance analysis tool for OpenMP programs with what-if analyses. We propose a novel OpenMP series-parallel graph (OSPG) that precisely captures the series-parallel relations between different fragments of the program's execution. The OSPG, along with fine-grained measurements, constitutes OMP-Adviser's performance model. OMP-Adviser identifies serialization bottlenecks by measuring inherent parallelism in the program and its OpenMP constructs. OMP-Adviser's what-if analysis technique enables developers to estimate the increase in parallelism in user-specified code regions before designing concrete optimizations. It also helps developers identify the regions that must be optimized first for the program to achieve scalable speedup.
While a lack of inherent parallelism is a sufficient condition for a program not to have scalable speedup, too much parallelism can lead to excessive runtime and scheduling overheads, resulting in lower performance. We address this issue by extending OMP-Adviser to measure tasking overheads. By attributing this additional information to different OpenMP constructs in the program, OMP-Adviser can identify tasking cut-offs that achieve the right balance between a program's parallelism and its scheduling overheads for a given input. Further, we design a differential analysis technique that enables OMP-Adviser to identify program regions experiencing scalability bottlenecks caused by secondary effects of execution.
Second, writing correct parallel programs is challenging because of bugs such as deadlocks, livelocks, and data races, which do not arise in serial programs. A data race occurs when two parallel fragments of the program access the same memory location and at least one of the accesses is a write. Data races are a common cause of bugs in OpenMP applications. Manually identifying and reproducing data races is challenging due to the exponential number of possible instruction and thread interleavings in parallel applications. We introduce OMP-Racer, a dynamic data race detector for OpenMP. To detect apparent data races, OMP-Racer constructs the program's OSPG, which encodes the logical series-parallel relations between different fragments of the program. By capturing these logical relations, OMP-Racer can detect data races that would occur in other thread interleavings for a given input. Compared to state-of-the-art OpenMP data race detectors, OMP-Racer correctly identifies races in a larger subset of OpenMP programs that use task dependencies and locks.
Our results from testing the OMP-Adviser and OMP-Racer prototypes on 40 OpenMP applications and benchmarks indicate their effectiveness in performance analysis and data race detection, respectively. Furthermore, they demonstrate the usefulness of our proposed OSPG data structure in enabling different types of analysis tools for OpenMP applications.
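
As a rough illustration of the parallelism measurement and what-if analysis described in the abstract, the sketch below computes work, critical-path length (span), and their ratio over a small series-parallel tree. It is not the dissertation's OSPG or OMP-Adviser's implementation: the Node class, the region labels, the costs, and the way a what-if factor shortens a region's contribution to the span are all assumptions made for this example.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class Node:
    """Hypothetical series-parallel tree node (an assumption, not the real OSPG)."""
    kind: str                      # "leaf", "series", or "parallel"
    cost: float = 0.0              # measured cost; only meaningful for leaves
    label: str = ""                # region name; only meaningful for leaves
    children: List["Node"] = field(default_factory=list)


def work(node: Node) -> float:
    """Total cost of all leaves: the time a one-core execution would take."""
    if node.kind == "leaf":
        return node.cost
    return sum(work(child) for child in node.children)


def span(node: Node, what_if: Optional[Dict[str, float]] = None) -> float:
    """Critical-path length. Series children add up; parallel children overlap.

    what_if maps a region label to an assumed parallelization factor, which
    divides that region's contribution to the critical path.
    """
    what_if = what_if or {}
    if node.kind == "leaf":
        return node.cost / what_if.get(node.label, 1.0)
    spans = [span(child, what_if) for child in node.children]
    return sum(spans) if node.kind == "series" else max(spans)


def parallelism(node: Node, what_if: Optional[Dict[str, float]] = None) -> float:
    """Inherent parallelism, taken here as work divided by span."""
    return work(node) / span(node, what_if)


# A serial "init" region followed by two tasks that may run in parallel.
program = Node("series", children=[
    Node("leaf", cost=40.0, label="init"),
    Node("parallel", children=[
        Node("leaf", cost=30.0, label="task_a"),
        Node("leaf", cost=30.0, label="task_b"),
    ]),
])

baseline = parallelism(program)                  # work 100, span 70
estimate = parallelism(program, {"init": 4.0})   # work 100, span 40
```

In this toy model, hypothetically parallelizing the serial init region by a factor of 4 shortens the critical path from 70 to 40 and raises the estimated parallelism from about 1.43 to 2.5, which is the kind of estimate a what-if analysis offers before any concrete optimization is written.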


Join Zoom Meeting

Meeting ID: 969 4175 1519
Password: 797192

One tap mobile
+13017158592,,96941751519# US (Washington D.C)
+13126266799,,96941751519# US (Chicago)
Join By Phone
+1 301 715 8592 US (Washington D.C)
+1 312 626 6799 US (Chicago)
+1 646 558 8656 US (New York)
+1 253 215 8782 US (Tacoma)
+1 346 248 7799 US (Houston)
+1 669 900 9128 US (San Jose)

Find your local number: https://rutgers.zoom.us/u/acTCYJdQsI
