A Machine Learning Based Meta-Scheduler for Multi-Core Processors

Jitendra Kumar Rai, University of Hyderabad and ANURAG, India
Atul Negi, University of Hyderabad, India
Rajeev Wankar, University of Hyderabad, India
K. D. Nayak, ANURAG, India

ABSTRACT

Sharing resources such as caches and memory buses between the cores of multi-core processors may cause performance bottlenecks for running programs. In this paper, the authors describe a meta-scheduler that adapts process scheduling decisions to reduce contention for shared L2 caches on multi-core processors. The meta-scheduler takes into account the multi-core topology as well as the L2 cache related characteristics of the processes. Using a model generated by machine learning, it predicts the L2 cache behavior, i.e., the solo-run-L2-cache-stress, of the programs. It runs in user mode and guides the underlying operating system process scheduler in scheduling processes intelligently to reduce contention for shared L2 caches. In their experiments, the authors observed up to 12 percent speedup in individual as well as overall performance when using the meta-scheduler, compared to the default process scheduler (Completely Fair Scheduler) of the Linux kernel.

Keywords: L2 Caches, Machine Learning, Multi-Core Processors, Process Scheduling, Shared Resources

INTRODUCTION

Multi-core processors have multiple execution cores in each processor package. The topology of multi-core processors involves sharing various resources, such as caches and the memory bus, between the cores. The sharing of these resources brings new opportunities and challenges for system software.

For example, programs running on cores that share an L2 cache may compete for space in that cache and thereby evict cache lines allocated to each other. Because of this poor performance isolation, the performance of applications running on multi-core processors degrades (Fedorova et al., 2005; Fedorova et al., 2007; Zhao et al., 2007).

DOI: 10.4018/jaras.2010100104
Different programs have different requirements for space in the L2 cache, and these requirements change as the programs progress at runtime. Hence, achieving optimal performance requires optimizations to be made in a dynamic and adaptive manner.

Process scheduling plays an important role in allocating CPU resources to running programs. An ideal scheduler should consider both the requirements and the contention for resources when making scheduling decisions. Thus, system software components such as the process scheduler should be made aware of multi-core topologies and the characteristics of the processes (Siddha et al., 2007) to reach the next level of performance.

In this paper, we describe a meta-scheduler for reducing contention for shared L2 caches on multi-core processors. The meta-scheduler takes into account the multi-core topology as well as the L2 cache related characteristics of the processes. It uses a model generated by machine learning to predict the L2 cache behavior (solo-run-L2-cache-stress) of the programs. The meta-scheduler runs in user mode and guides the underlying operating system process scheduler in scheduling processes intelligently to reduce contention for shared L2 caches.

OVERVIEW OF THE META-SCHEDULER

The meta-scheduler works by reducing contention for level-2 (L2) caches between programs that run on cores sharing an L2 cache. In turn, it also reduces contention for the front side bus shared by the cores, as on Intel multi-core processors. The meta-scheduler uses hardware performance counters and an offline-built regression model to determine the L2 cache behavior of running programs. Hardware performance counters are special-purpose registers provided by the Performance Monitoring Unit (PMU) of modern processors. They can be configured to measure various performance events of interest, such as the number of instructions retired or the number of L2 cache misses. Details regarding the number of performance counters and the way to configure them are described in the processor manuals (Intel Manuals).

The meta-scheduler periodically samples data from the hardware performance counters for the programs running on a multi-core processor, and guides the scheduling decisions of the underlying operating system process scheduler to alleviate contention for the shared last-level cache (the L2 cache). The scheduling decisions are made dynamically, as described in the section on the policy framework of the meta-scheduler.
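To make the idea of a topology-aware placement concrete, the following is a minimal sketch and not the policy framework of the meta-scheduler itself. It assumes that a per-process stress estimate is already available and that the cores are grouped by the L2 cache they share. The function name `plan_placement`, the greedy least-loaded rule, and the example process IDs and stress values are all illustrative assumptions.

```python
from typing import Dict, List, Tuple


def plan_placement(
    stress_by_pid: Dict[int, float],
    l2_groups: List[Tuple[int, ...]],
) -> Dict[int, int]:
    """Return a hypothetical pid -> core mapping.

    Processes are handled from highest to lowest stress estimate. Each one
    goes to the L2 group that still has a free core and currently carries
    the smallest total stress, so the heaviest cache users end up on
    different shared caches whenever possible.
    """
    free_cores = [list(cores) for cores in l2_groups]
    if len(stress_by_pid) > sum(len(cores) for cores in free_cores):
        raise ValueError("more processes than available cores")

    group_load = [0.0] * len(free_cores)
    placement: Dict[int, int] = {}

    # Heaviest processes first.
    ranked = sorted(stress_by_pid, key=stress_by_pid.get, reverse=True)
    for pid in ranked:
        candidates = [g for g, cores in enumerate(free_cores) if cores]
        group = min(candidates, key=lambda g: group_load[g])
        placement[pid] = free_cores[group].pop(0)
        group_load[group] += stress_by_pid[pid]
    return placement


# Illustrative values only: four processes, two L2 caches with two cores each.
example = plan_placement(
    {101: 30.0, 102: 25.0, 103: 2.0, 104: 1.0},
    [(0, 1), (2, 3)],
)
```

In this example the two cache-hungry processes (101 and 102) land on different L2 groups, each sharing its cache with a light process. A user-mode tool on Linux could then apply such a mapping through CPU affinity, for instance with `os.sched_setaffinity(pid, {core})`; whether the meta-scheduler uses this mechanism is an assumption here, not something stated in the text.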

METRIC TO CHARACTERIZE L2 CACHE BEHAVIOUR

The meta-scheduler uses the solo-run-L2-cache-stress of programs to characterize their L2 cache behavior. The solo-run-L2-cache-stress of a program is the number of L2 cache misses and pre-fetches per kilo (10^3) instructions experienced by that program while not sharing the L2 cache with programs running on other cores.
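The definition above amounts to a simple ratio over three counter values. The sketch below only illustrates that arithmetic; the function and parameter names are assumptions, and obtaining the counts from the PMU is outside its scope.

```python
def l2_cache_stress(
    l2_misses: int,
    l2_prefetches: int,
    instructions_retired: int,
) -> float:
    """L2 cache misses plus pre-fetches per 1000 retired instructions."""
    if instructions_retired <= 0:
        raise ValueError("instructions_retired must be positive")
    return (l2_misses + l2_prefetches) * 1000.0 / instructions_retired
```

For instance, 1,500 misses and 500 pre-fetches over one million retired instructions give a stress of 2.0 per kilo-instruction. The value is the solo-run-L2-cache-stress only if the counts were collected while the program had the L2 cache to itself.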

The L2 cache stress observed (using the hardware performance counters) for a program running on a core differs from its solo-run-L2-cache-stress if another program is running on another core that shares the same L2 cache. This is because the observed L2 cache stress results from the interaction of the two programs' individual cache access patterns and working sets, based on their locality. The meta-scheduler uses a regression model to deduce the solo-run-L2-cache-stress of running programs. The regression model was built offline by training a machine learning algorithm.
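As a rough illustration of what an offline-built regression model does, the sketch below fits a one-variable least-squares line and uses it for prediction. It is a stand-in only: the learning algorithm and the input attributes actually used for the meta-scheduler are not given in this section, so the single hypothetical feature (for example, an observed co-run stress value) and the names `fit_line` and `predict` are assumptions.

```python
from typing import Sequence, Tuple


def fit_line(xs: Sequence[float], ys: Sequence[float]) -> Tuple[float, float]:
    """Ordinary least squares for y = slope * x + intercept."""
    n = len(xs)
    if n < 2 or n != len(ys):
        raise ValueError("need at least two paired samples")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("all x values are identical")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def predict(model: Tuple[float, float], x: float) -> float:
    """Apply a fitted (slope, intercept) pair to a new observation."""
    slope, intercept = model
    return slope * x + intercept
```

Training happens once, offline, on labeled instances; at run time only `predict` would be needed, which keeps the per-sample cost of the prediction step small.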

MODEL BUILDING

We generated the training instances from hardware counter data collected while running the workload on the experimental platform. The class variable (dependent variable) solo-run-L2-cache-stress (i.e., to be predicted later...