# Chapter 3 High-Performance Customizable Computing

**Domingo Benitez** 

University of Las Palmas de Gran Canaria, Spain

### ABSTRACT

Many accelerator-based computers have demonstrated that they can be faster and more energy-efficient than traditional high-performance multi-core computers. Two types of programmable accelerators are available in high-performance computing: general-purpose accelerators such as GPUs, and customizable accelerators such as FPGAs, although general-purpose accelerators have received more attention. This chapter reviews the state of the art and current trends of high-performance customizable computers (HPCC) and their use in Computational Science and Engineering (CSE). A top-down approach is used to make the material more accessible to non-specialists. The "top view" is provided by a taxonomy of customizable computers. This abstract view is accompanied by a performance comparison of common CSE applications on HPCC systems and high-performance microprocessor-based computers. The "down view" examines software development, describing how CSE applications are programmed on HPCC computers. Additionally, a cost analysis and an example illustrate the origin of the benefits. Finally, the future of high-performance customizable computing is analyzed.

### INTRODUCTION

Frequently, automated solutions to Computational Science and Engineering (CSE) problems require that billions to trillions of complex operations be applied to input data acquired from the real world. In many cases, these solutions are time-critical and must be reported without delay, and frequently they must also be of the highest precision. Both the availability and the precision of information are key elements in solving CSE problems, and so in making life longer and more comfortable. To reach this performance goal, high-performance computing is a research and development domain that aids the solution of CSE problems with a combination of high-performance computers and parallel programs. For many years, the fastest computers integrated central processing units (CPUs) or microprocessors that were specialized in performing the greatest number of operations per second. However, nowadays, the architectures of the fastest high-performance computers are dominated by a large population of multi-core programmable processors, many of which can also be integrated into desktop or server computers.

DOI: 10.4018/978-1-61350-116-0.ch003

In this time of transition, new high-performance processors can provide higher levels of performance than their predecessors, due mainly to an increase in the number of processing cores that are integrated on-chip. Increasing the number of processing cores on a single chip offers increased computer performance at somewhat lower power dissipation than a complex single-core microprocessor with an equivalent number of transistors on-chip.

Nevertheless, the multi-core approach does not address three basic problems. Firstly, the available computing power on-chip is not efficiently utilized by programs. Secondly, the connection from the processor to the external memory becomes more heavily loaded as the number of cores increases. This load, together with the difference in operating frequency between the multi-core processor and external memory, can become a bottleneck for parallel processing and stall some or all cores. Thirdly, effective programming of multi-core systems is difficult, and in many cases software is ultimately responsible for the lack of performance scalability as the number of cores increases (Mackin & Woods, 2006).

An alternative approach has arisen: *High-Performance Customizable Computing (HPCC)* is a different paradigm of high-performance computing. Instead of having only programmable processors, customizable computers also integrate hardware coprocessors with non-fixed architectures. These high-performance computing elements can be customized for a portion of a specific program and so accelerate the execution of key steps in the application software.

Customizable hardware devices offer the advantage of speeding up several software applications because their hardware flexibility allows the same chip to be specialized and reused. This is the main property exploited by High-Performance Customizable Computing, and it is very useful for exploiting the inherent parallelism of many CSE problems. Customizable devices have shown great potential for use in high-performance computing, with much better power efficiency than programmable processors. New customizable devices are providing ever higher performance because their clock frequency and the number of transistors dedicated to specialized processing are both increasing. Additionally, customizable devices have other advantages that are exploited in embedded hardware engineering, such as reducing both the non-recurring engineering costs (Dehon, 2008) and the development time of a product (Guccione, 2008).

Two types of computing systems that integrate customizable devices are common nowadays: configurable and reconfigurable systems. *Configurable Systems* are built from baseline chip designs that are partially specialized at design time, before fabrication (Leibson, 2006). After chip fabrication, these systems can be software-programmed but cannot be specialized any further. On the other hand, *Reconfigurable Systems* are based on field-programmable devices that can be completely customized after fabrication (Chang, 2008).

The main goal of this chapter is to help readers understand how customizable hardware systems can be exploited to provide high performance, i.e., how to get 10X, 100X or 1000X the performance of the equivalent number of transistors in a microprocessor-based computer with much better power efficiency. The reader will gain insight into the ...