Robustness and Predictive Performance of Homogeneous Ensemble Feature Selection in Text Classification

Author(s): Poornima Mehta (Jaypee Institute of Information Technology, Noida, India) and Satish Chandra (Jaypee Institute of Information Technology, Noida, India)
Copyright: 2024
Pages: 17
Source title: Research Anthology on Bioinformatics, Genomics, and Computational Biology
Source Author(s)/Editor(s): Information Resources Management Association (USA)
DOI: 10.4018/979-8-3693-3026-5.ch061

Abstract

The ensemble paradigm, which combines the outputs of several classifiers, is a proven approach in classification. It has recently been extended to feature selection methods in order to find the most relevant features. Ensemble feature selection has so far been applied mainly to high-dimensional, low-sample-size datasets such as those in bioinformatics. To the authors' knowledge, it has not yet been applied in the text classification domain. This work applies ensemble feature selection based on data perturbation to text classification, with the aim of improving both predictive performance and stability. The approach applies the same feature selector to different perturbed versions of the training data, which yields a different rank for each feature in each version; these ranks are then combined by a rank aggregator. Previous works focus on only one of the two metrics, stability or accuracy. This work adopts a combined framework that assesses both the predictive performance and the stability of the feature selection method when it is used in an ensemble. The approach is explored with univariate and multivariate feature selectors, using two rank aggregators.
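
The data-perturbation scheme described in the abstract can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration and is not taken from the chapter: the perturbation is bootstrap resampling, the univariate scorer `score_features` is a hypothetical stand-in (absolute difference of class-wise mean term counts), the aggregator is a plain mean of ranks, and stability is measured as the mean pairwise Jaccard similarity of the top-k feature sets. The abstract does not say which selectors, aggregators, or stability measure the authors use.

```python
import random
from statistics import mean


def score_features(X, y):
    """Hypothetical univariate scorer (an assumption, not the chapter's method):
    absolute difference between the class-wise mean values of each feature."""
    scores = []
    for j in range(len(X[0])):
        pos = [row[j] for row, label in zip(X, y) if label == 1]
        neg = [row[j] for row, label in zip(X, y) if label == 0]
        # A bootstrap sample may miss a class entirely; score 0 in that case.
        scores.append(abs(mean(pos) - mean(neg)) if pos and neg else 0.0)
    return scores


def ranks_from_scores(scores):
    """Convert scores to ranks (1 = best); ties are broken by feature index."""
    order = sorted(range(len(scores)), key=lambda j: (-scores[j], j))
    ranks = [0] * len(scores)
    for rank, j in enumerate(order, start=1):
        ranks[j] = rank
    return ranks


def ensemble_ranks(X, y, n_bags=20, seed=0):
    """Apply the SAME selector to n_bags bootstrap samples of the training data."""
    rng = random.Random(seed)
    n = len(X)
    all_ranks = []
    for _ in range(n_bags):
        idx = [rng.randrange(n) for _ in range(n)]  # sample with replacement
        Xb = [X[i] for i in idx]
        yb = [y[i] for i in idx]
        all_ranks.append(ranks_from_scores(score_features(Xb, yb)))
    return all_ranks


def aggregate_mean_rank(all_ranks):
    """Mean-rank aggregation: return feature indices ordered best first."""
    n_features = len(all_ranks[0])
    mean_ranks = [mean(r[j] for r in all_ranks) for j in range(n_features)]
    return sorted(range(n_features), key=lambda j: (mean_ranks[j], j))


def jaccard_stability(all_ranks, k):
    """Mean pairwise Jaccard similarity of the top-k sets (needs >= 2 rankings)."""
    subsets = [{j for j, r in enumerate(ranks) if r <= k} for ranks in all_ranks]
    pairs = [(a, b) for i, a in enumerate(subsets) for b in subsets[i + 1:]]
    return mean(len(a & b) / len(a | b) for a, b in pairs)


# Toy term-count matrix (rows = documents, columns = terms); made-up data.
X = [[3, 0, 1, 1], [4, 1, 0, 1], [3, 1, 1, 0], [5, 0, 0, 1],
     [0, 1, 1, 1], [0, 0, 0, 1], [1, 1, 1, 0], [0, 1, 0, 1]]
y = [1, 1, 1, 1, 0, 0, 0, 0]

all_ranks = ensemble_ranks(X, y, n_bags=20, seed=0)
final_order = aggregate_mean_rank(all_ranks)
print("aggregated ranking:", final_order)
print("top-2 stability:", jaccard_stability(all_ranks, k=2))
```

The aggregated ranking would then feed a classifier trained on the top-ranked terms, so that predictive performance and the stability score can be reported side by side. Swapping in a different selector or aggregator only requires replacing `score_features` or `aggregate_mean_rank`.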
