An Optimized Semi-Supervised Learning Approach for High Dimensional Datasets

Author(s): Nesma Settouti (Tlemcen University, Algeria), Mostafa El Habib Daho (Tlemcen University, Algeria), Mohammed El Amine Bechar (Tlemcen University, Algeria) and Mohammed Amine Chikh (Tlemcen University, Algeria)
Copyright: 2018
Pages: 28
Source title: Applying Big Data Analytics in Bioinformatics and Medicine
Source Author(s)/Editor(s): Miltiadis D. Lytras (Deree - The American College of Greece, Greece) and Paraskevi Papadopoulou (Deree - The American College of Greece, Greece)
DOI: 10.4018/978-1-5225-2607-0.ch012

Keywords: Bioinformatics / Biotechnology and Bioinformatics / Engineering Science Reference / Medicine & Healthcare

Abstract

Semi-supervised learning is one of the most active research areas in machine learning, extending beyond purely supervised learning from labeled data. Medical diagnosis is mostly performed in supervised mode, but in practice we face a large number of unlabeled samples and only a small set of labeled examples, each described by thousands of features. This high-dimensional setting is known as “the curse of dimensionality”. In this study, we propose a new semi-supervised learning approach that we call Optim Co-forest. The Optim Co-forest algorithm combines data resampling (Bagging; Breiman, 1996) with two feature selection strategies. The first selects a random subset of features to construct the ensemble of classifiers, following the principle of Co-forest (Li & Zhou, 2007). The second is an extension of the importance measure of Random Forest (RF; Breiman, 2001). Experiments on high-dimensional datasets confirm the effectiveness of the adopted selection strategies for the scalability of our method.
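The ingredients named in the abstract (bootstrap resampling, a random feature subset per ensemble member, and Co-forest-style labeling of unlabeled samples by the other members) can be illustrated with a minimal sketch. This is not the authors' Optim Co-forest implementation. It assumes a toy nearest-centroid base learner in place of decision trees, runs a single refinement round, omits the error-rate checks of the original Co-forest, and leaves out the importance-based selection, which the abstract does not specify. All function names, the threshold `theta`, and the toy data are hypothetical.

```python
import random
from collections import Counter


def bootstrap(rows, rng):
    """Bagging step: draw len(rows) labeled examples with replacement."""
    return [rng.choice(rows) for _ in rows]


def fit_centroids(rows, feats):
    """Toy base learner (assumption: stands in for a decision tree).
    Stores one mean vector per class over the chosen feature subset."""
    sums, counts = {}, Counter()
    for x, y in rows:
        acc = sums.setdefault(y, [0.0] * len(feats))
        for j, f in enumerate(feats):
            acc[j] += x[f]
        counts[y] += 1
    return {y: [s / counts[y] for s in acc] for y, acc in sums.items()}


def predict(member, x):
    """Return the class whose centroid is closest on the member's features."""
    feats, centroids = member

    def sq_dist(c):
        return sum((x[f] - c[j]) ** 2 for j, f in enumerate(feats))

    return min(centroids, key=lambda y: sq_dist(centroids[y]))


def co_forest_round(labeled, unlabeled, n_members=7, k=2, theta=0.75, seed=0):
    """One simplified Co-forest-style round (hypothetical helper)."""
    rng = random.Random(seed)
    n_features = len(labeled[0][0])

    # Initial ensemble: each member gets a random feature subset
    # and its own bootstrap sample of the labeled data.
    members = []
    for _ in range(n_members):
        feats = rng.sample(range(n_features), k)
        members.append((feats, fit_centroids(bootstrap(labeled, rng), feats)))

    refined = []
    for i, (feats, _) in enumerate(members):
        # Concomitant ensemble: every member except member i.
        others = members[:i] + members[i + 1:]
        extra = []
        for x in unlabeled:
            votes = Counter(predict(m, x) for m in others)
            label, count = votes.most_common(1)[0]
            # Keep only pseudo-labels the other members agree on strongly.
            if count / len(others) >= theta:
                extra.append((x, label))
        # Retrain member i on labeled data plus confident pseudo-labels.
        refined.append((feats, fit_centroids(labeled + extra, feats)))
    return refined


def vote(members, x):
    """Majority vote of the ensemble."""
    return Counter(predict(m, x) for m in members).most_common(1)[0][0]


# Toy data (hypothetical): two well-separated classes, three features.
labeled = [([0.0, 0.1, 0.0], "a"), ([0.2, 0.0, 0.1], "a"),
           ([5.0, 5.1, 4.9], "b"), ([5.2, 4.8, 5.0], "b")]
unlabeled = [[0.1, 0.1, 0.2], [4.9, 5.0, 5.1],
             [0.3, 0.2, 0.0], [5.1, 5.3, 4.8]]

ensemble = co_forest_round(labeled, unlabeled)
print(vote(ensemble, [0.0, 0.0, 0.0]), vote(ensemble, [5.0, 5.0, 5.0]))
```

Excluding a member from the vote that produces its own pseudo-labels is the core Co-forest idea: a member never reinforces its own mistakes. Restricting each member to `k` features is what keeps the per-member cost low when the data has thousands of features.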

