Adjusting Thread Parallelism Dynamically to Accelerate Dynamic Programming with Irregular Workload Distribution on GPGPUs
Author(s): Chao-Chin Wu (National Changhua University of Education, Changhua, Taiwan), Jenn-Yang Ke (Tatung University, Taipei, Taiwan), Heshan Lin (Virginia Tech, Blacksburg, VA, USA), and Syun-Sheng Jhan (Ling Tung University, Taichung, Taiwan)
Copyright: 2014
Volume: 6
Issue: 1
Pages: 20
Source title:
International Journal of Grid and High Performance Computing (IJGHPC)
Editor(s)-in-Chief: Emmanuel Udoh (Sullivan University, USA) and Ching-Hsien Hsu (Asia University, Taiwan)
DOI: 10.4018/ijghpc.2014010101
Abstract
Dynamic Programming (DP) is an important and popular method for solving a wide variety of discrete optimization problems, such as scheduling, string editing, packaging, and inventory management. DP breaks a problem into simpler subproblems and combines their solutions into a solution to the original problem. This paper focuses on one type of dynamic programming called Nonserial Polyadic Dynamic Programming (NPDP). To run NPDP applications efficiently on an emerging General-Purpose Graphics Processing Unit (GPGPU), the authors have to exploit more parallelism to fully utilize the computing power of the hundreds of processing units in it. However, the degree of parallelism varies significantly across the different phases of an NPDP application. To address this problem, the authors propose a method that adjusts thread-level parallelism to provide a sufficient and steadier degree of parallelism across phases: if a phase has insufficient parallelism, threads are split into subthreads; conversely, the total number of threads in a phase can be limited by merging threads. The authors also examine the difference between the conventional problem of finding a minimum on a GPU and the NPDP-specific problem of finding the minimums of many independent sets on a GPU. Finally, the authors examine how to design a data structure that allows the memory-coalescing optimization technique to be applied. The experimental results demonstrate that the method achieves a speedup of up to 13.40 over a previously published algorithm.
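The irregular workload the abstract describes can be seen in any NPDP recurrence evaluated phase by phase. The sketch below (not the paper's implementation) uses matrix-chain multiplication, a classic NPDP example: all cells on one diagonal are mutually independent and could each be assigned to a GPU thread, but the number of such cells shrinks from n − 1 down to 1 as the phases advance, which is exactly the shrinking parallelism that motivates splitting and merging threads.

```python
# Hedged sketch: matrix-chain multiplication as a representative NPDP
# recurrence, evaluated diagonal by diagonal ("phase by phase").
# p is the dimension list: matrix A_i has dimensions p[i] x p[i+1].
def matrix_chain_phases(p):
    n = len(p) - 1                       # number of matrices
    m = [[0] * n for _ in range(n)]      # m[i][j]: min scalar multiplications
    phase_widths = []                    # independent cells per phase
    for d in range(1, n):                # phase d = diagonal offset
        phase_widths.append(n - d)       # cells on this diagonal are independent
        for i in range(n - d):           # each cell could be one GPU thread
            j = i + d
            m[i][j] = min(
                m[i][k] + m[k + 1][j] + p[i] * p[k + 1] * p[j + 1]
                for k in range(i, j)
            )
    return m[0][n - 1], phase_widths
```

For four matrices, for instance, the phases offer 3, 2, and then only 1 independent cell, so a fixed thread count either starves early phases or idles in late ones.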