Speech-Based Speaker Gender Identification Using Deep Learning

Author(s): G. Aishwarya Laxmi (Avinashilingam Institute for Home Science and Higher Education for Women, India), Judith Justin (Avinashilingam Institute for Home Science and Higher Education for Women, India), R. Vanithamani (Avinashilingam Institute for Home Science and Higher Education for Women, India), and Pavithra Suchindran (Avinashilingam Institute for Home Science and Higher Education for Women, India)
Copyright: 2026
Pages: 30
Source title: Advancements in Speech Processing for Human-Computer Interaction
Source Author(s)/Editor(s): Abeer Saber (Damietta University, Egypt), Tamer Z. Emara (Damietta University, Egypt), Mohammad Sultan Mahmud (Shenzhen University, China), Muhammad Azhar (Hong Kong Shue Yan University, Hong Kong), and Esraa Hassan (Kafrelsheikh University, Egypt)
DOI: 10.4018/979-8-3373-3048-8.ch006

Abstract

Speech recognition is a key area of Artificial Intelligence that involves extracting meaningful information from speech signals, such as the speaker's gender, age, or emotion. While humans can easily identify a speaker's gender in conversation, computers require advanced models to perform this task effectively. This study proposes a deep learning-based approach to gender classification using features extracted from speech samples. Two feature extraction methods, Mel Frequency Cepstral Coefficients (MFCC) and Linear Prediction Cepstral Coefficients (LPCC), were evaluated with Convolutional Neural Network (CNN) and Long Short-Term Memory (LSTM) models. MFCC features achieved 89.1% accuracy with the CNN and 90.5% with the LSTM. In contrast, LPCC features produced lower accuracies of 84% with the CNN and 85.2% with the LSTM. The results show that MFCC features outperform LPCC features by about 5 percentage points with both models when classifying speaker gender.
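
The chapter's own feature pipeline is not reproduced on this page. As a rough illustration of what MFCC extraction involves, the sketch below computes MFCCs for a single frame using only the Python standard library: Hamming window, power spectrum, triangular mel filterbank, log energies, then a DCT-II. The frame length (256 samples), sample rate (8000 Hz), filter count (20), coefficient count (13), and the two synthetic tones are all assumptions chosen for the example, not the authors' settings or data.

```python
import cmath
import math


def hz_to_mel(hz):
    """Convert a frequency in Hz to the mel scale (common 2595*log10 form)."""
    return 2595.0 * math.log10(1.0 + hz / 700.0)


def mel_to_hz(mel):
    """Inverse of hz_to_mel."""
    return 700.0 * (10.0 ** (mel / 2595.0) - 1.0)


def power_spectrum(frame):
    """Power spectrum of one Hamming-windowed frame via a naive DFT."""
    n = len(frame)
    windowed = [s * (0.54 - 0.46 * math.cos(2 * math.pi * i / (n - 1)))
                for i, s in enumerate(frame)]
    spectrum = []
    for k in range(n // 2 + 1):
        acc = sum(x * cmath.exp(-2j * math.pi * k * i / n)
                  for i, x in enumerate(windowed))
        spectrum.append(abs(acc) ** 2 / n)
    return spectrum


def mel_filterbank(n_filters, n_fft, sample_rate):
    """Triangular filters whose centres are evenly spaced on the mel scale."""
    top = hz_to_mel(sample_rate / 2)
    mel_points = [top * i / (n_filters + 1) for i in range(n_filters + 2)]
    bins = [int(math.floor((n_fft + 1) * mel_to_hz(m) / sample_rate))
            for m in mel_points]
    bank = []
    for m in range(1, n_filters + 1):
        lo, mid, hi = bins[m - 1], bins[m], bins[m + 1]
        row = [0.0] * (n_fft // 2 + 1)
        for k in range(lo, mid):          # rising edge of the triangle
            row[k] = (k - lo) / (mid - lo)
        for k in range(mid, hi):          # falling edge of the triangle
            row[k] = (hi - k) / (hi - mid)
        bank.append(row)
    return bank


def mfcc(frame, sample_rate, n_filters=20, n_coeffs=13):
    """MFCCs for one frame: log mel-filterbank energies followed by a DCT-II."""
    spec = power_spectrum(frame)
    bank = mel_filterbank(n_filters, len(frame), sample_rate)
    # Small constant avoids log(0) for filters that capture no energy.
    log_energies = [math.log(sum(w * p for w, p in zip(row, spec)) + 1e-10)
                    for row in bank]
    return [sum(e * math.cos(math.pi * c * (j + 0.5) / n_filters)
                for j, e in enumerate(log_energies))
            for c in range(n_coeffs)]


def tone(freq_hz, sample_rate=8000, n_samples=256):
    """Synthetic sine tone, a stand-in for a voiced speech frame (assumption)."""
    return [math.sin(2 * math.pi * freq_hz * i / sample_rate)
            for i in range(n_samples)]


# Two illustrative pitches, one lower and one higher (hypothetical inputs).
low_pitch = mfcc(tone(120.0), 8000)
high_pitch = mfcc(tone(220.0), 8000)
print([round(c, 2) for c in low_pitch[:4]])
print([round(c, 2) for c in high_pitch[:4]])
```

In a setup like the one the abstract describes, vectors of this kind would be computed for every frame of an utterance and stacked into a sequence or 2-D array before being passed to a CNN or LSTM classifier. A production pipeline would normally use an FFT-based library routine rather than the naive DFT shown here.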