
REPROPREP: Reproducible Preprocessing Validation Framework - A Systematic Framework for Preprocessing Validation in Business Analytics

Author(s): Miguel Angel Jimenez Garcia (Universidad Americana de Europa, Mexico) and Richard De Jesus Gil Herrera (Universidad Internacional de la Rioja, Spain)
Copyright: 2026
Volume: 13
Issue: 1
Pages: 26
Source title: International Journal of Business Analytics (IJBAN)
Editor(s)-in-Chief: John Wang (Montclair State University, USA)
DOI: 10.4018/IJBAN.406288

Abstract

Preprocessing strategy selection in business analytics typically relies on convention rather than systematic evidence, even though preprocessing consumes 60–80% of project effort. This study introduces REPROPREP (v1.0), a methodological framework for validating assumptions about preprocessing effectiveness through statistical analysis and cost-benefit assessment. The framework applies Benjamini-Hochberg false discovery rate correction, quality degradation protocols, and cost-effectiveness evaluation. A demonstration across 10 UCI datasets, three preprocessing strategies, and gradient boosting classifiers with 5-fold stratified cross-validation yielded no statistically significant performance differences after correction for multiple comparisons (mean effect size: 0.001 AUC), while implementation cost differences ranged from $150 to $800. Focused on numeric preprocessing, REPROPREP gives organizations a rigorous, context-specific methodology for evaluating preprocessing assumptions. Generalizability beyond the tested conditions requires further validation. Reproducible code is publicly available.
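
For readers unfamiliar with the Benjamini-Hochberg correction named in the abstract, the following is a minimal sketch of the standard step-up procedure in plain Python. It is not the authors' published code. The function name `benjamini_hochberg`, the default `alpha=0.05`, and the p-values in the usage lines are illustrative assumptions, not values from the study.

```python
def benjamini_hochberg(p_values, alpha=0.05):
    """Standard Benjamini-Hochberg step-up procedure (illustrative sketch).

    Returns (reject, adjusted), both in the original input order.
    `alpha` is an assumed FDR level, not a value taken from the article.
    """
    m = len(p_values)
    if m == 0:
        return [], []
    # Indices of the p-values, ordered from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so adjusted values stay monotone.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * m / rank)
        adjusted[idx] = running_min
    # A hypothesis is rejected when its adjusted p-value is at most alpha.
    reject = [p <= alpha for p in adjusted]
    return reject, adjusted


# Hypothetical p-values, e.g. one per dataset/strategy comparison.
example_p = [0.001, 0.8, 0.04, 0.3]
flags, adj = benjamini_hochberg(example_p)
print(flags)
print(adj)
```

With many comparisons (datasets × strategies), a raw p-value below 0.05 can stop being significant once adjusted, which is the kind of outcome the abstract reports after correction.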
