Addressing Noise and Class Imbalance Problems in Heterogeneous Cross-Project Defect Prediction: An Empirical Study

Author(s): Rohit Vashisht (Jamia Millia Islamia, New Delhi, India & KIET Group of Institutions, Delhi-NCR, Ghaziabad, India) and Syed Afzal Murtaza Rizvi (Jamia Millia Islamia, India)
Copyright: 2023
Volume: 19
Issue: 1
Pages: 27
Source title: International Journal of e-Collaboration (IJeC)
Editor(s)-in-Chief: Jingyuan Zhao (University of Toronto, Canada)
DOI: 10.4018/IJeC.315777

Abstract

When a software project lacks adequate historical data to build a defect prediction (DP) model, or is still in the initial phases of development, a DP model trained on a related source project's defect data may be used instead. This kind of software defect prediction (SDP) is categorized as heterogeneous cross-project defect prediction (HCPDP). According to a comprehensive literature review, no prior research in cross-project defect prediction (CPDP) has addressed noise and the class imbalance problem (CIP) at the same time. In this paper, the impact of noise and imbalanced data on the efficiency of HCPDP and within-project defect prediction (WPDP) models is examined empirically and conceptually using four different classification algorithms. In addition, CIP is handled using a novel technique known as the chunk balancing algorithm (CBA). Ten prediction combinations from three open-source projects are used in the experimental investigation. The findings show that noise in an imbalanced dataset has a significant impact on defect prediction accuracy.
