IRMA-International.org: Creator of Knowledge
Information Resources Management Association
Advancing the Concepts & Practices of Information Resources Management in Modern Organizations

Adjusting Thread Parallelism Dynamically to Accelerate Dynamic Programming with Irregular Workload Distribution on GPGPUs

Adjusting Thread Parallelism Dynamically to Accelerate Dynamic Programming with Irregular Workload Distribution on GPGPUs
View Sample PDF
Author(s): Chao-Chin Wu (National Changhua University of Education, Changhua, Taiwan), Jenn-Yang Ke (Tatung University, Taipei, Taiwan), Heshan Lin (Virginia Tech, Blacksburg, VA, USA)and Syun-Sheng Jhan (Ling Tung University, Taichung, Taiwan)
Copyright: 2014
Volume: 6
Issue: 1
Pages: 20
Source title: International Journal of Grid and High Performance Computing (IJGHPC)
Editor(s)-in-Chief: Emmanuel Udoh (Sullivan University, USA)and Ching-Hsien Hsu (Asia University, Taiwan)
DOI: 10.4018/ijghpc.2014010101

Purchase


Abstract

Dynamic Programming (DP) is an important and popular method for solving a wide variety of discrete optimization problems such as scheduling, string-editing, packaging, and inventory management. DP breaks problems into simpler subproblems and combines their solutions into solutions to original ones. This paper focuses on one type of dynamic programming called Nonserial Polyadic Dynamic Programming (NPDP). To run NPDP applications efficiently on an emerging General-Purpose Graphic Processing Unit (GPGPU), the authors have to exploit more parallelism to fully utilize the computing power of the hundreds of processing units in it. However, the parallelism degree varies significantly in different phases of the NPDP applications. To address the problem, the authors propose a method that can adjust the thread-level parallelism to provide a sufficient and steadier parallelism degree for different phases. If a phase has insufficient parallelism, the authors split threads into subthreads. On the other hand, the authors can limit the total number of threads in a phase by merging threads. The authors also examine the difference between the conventional problem of finding the minimum on a GPU and the NPDP-featured problem of finding the minimums of many independent sets on a GPU. Finally, the authors examine how to design an appropriate data structure to apply the memory coalescing optimization technique. The experimental results demonstrate our method can obtain the best speedup of 13.40 over the algorithm published previously.

Related Content

Honglong Xu, Zhonghao Liang, Kaide Huang, Guoshun Huang, Yan He. © 2024. 17 pages.
Sherin Eliyas, P. Ranjana. © 2024. 10 pages.
Shuang Li, Xiaoguo Yao. © 2024. 16 pages.
Jialan Sun. © 2024. 21 pages.
Mei Gong, Bingli Mo. © 2024. 15 pages.
Qian He, Ke Wang. © 2024. 19 pages.
Sunil Kumar, Rashmi Mishra, Tanvi Jain, Achyut Shankar. © 2024. 12 pages.
Body Bottom