Comparison of Several Acoustic Modeling Techniques for Speech Emotion Recognition
Abstract
Automatic Speech Emotion Recognition (SER) is an active research topic in the field of Human-Computer Interaction (HCI) with a wide range of applications. The purpose of a speech emotion recognition system is to automatically classify a speaker's utterances into emotional states such as disgust, boredom, sadness, neutral, and happiness. The speech samples in this paper are taken from the Berlin emotional database. Mel Frequency Cepstral Coefficient (MFCC), Linear Prediction Coefficient (LPC), Linear Prediction Cepstral Coefficient (LPCC), Perceptual Linear Prediction (PLP), and Relative Spectral Perceptual Linear Prediction (Rasta-PLP) features are used to characterize the emotional utterances, using a combination of Gaussian Mixture Models (GMM) and Support Vector Machines (SVM) based on the Kullback-Leibler divergence kernel. In this study, the effects of feature type and feature dimension are comparatively investigated. The best results are obtained with 12-coefficient MFCC features: a recognition rate of 84%, which is close to human performance on this database.
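The GMM/SVM combination described above can be sketched as follows: fit one GMM per utterance on its frame-level features, estimate a symmetrized Kullback-Leibler divergence between every pair of GMMs, and feed an exponentiated-divergence kernel matrix to an SVM. This is a minimal illustration, not the authors' implementation: the synthetic 12-dimensional "MFCC-like" data, the Monte Carlo KL estimator, and all parameter values (`n_components`, `gamma`, sample counts) are assumptions; real MFCCs would come from an audio front end, and note that an exponentiated KL kernel is not guaranteed to be positive semi-definite.

```python
import numpy as np
from sklearn.mixture import GaussianMixture
from sklearn.svm import SVC

rng = np.random.default_rng(0)

# Hypothetical stand-in data: each "utterance" is a (frames x 12) matrix of
# MFCC-like features drawn from a Gaussian; two classes with shifted means.
def make_utterance(mean, n_frames=200):
    return rng.normal(loc=mean, scale=1.0, size=(n_frames, 12))

utterances = ([make_utterance(0.0) for _ in range(10)]
              + [make_utterance(2.0) for _ in range(10)])
labels = np.array([0] * 10 + [1] * 10)

# Fit one GMM per utterance (diagonal covariances keep the sketch simple).
gmms = [GaussianMixture(n_components=2, covariance_type="diag",
                        random_state=0).fit(u) for u in utterances]

def mc_kl(p, q, n=500):
    """Monte Carlo estimate of KL(p || q) = E_p[log p(x) - log q(x)]."""
    x, _ = p.sample(n)
    return float(np.mean(p.score_samples(x) - q.score_samples(x)))

def kl_kernel(gmms_a, gmms_b, gamma=0.1):
    """Kernel matrix K_ij = exp(-gamma * symmetric KL divergence)."""
    K = np.zeros((len(gmms_a), len(gmms_b)))
    for i, p in enumerate(gmms_a):
        for j, q in enumerate(gmms_b):
            d = mc_kl(p, q) + mc_kl(q, p)  # symmetrize KL
            K[i, j] = np.exp(-gamma * d)
    return K

# Train an SVM directly on the precomputed utterance-level kernel.
K = kl_kernel(gmms, gmms)
clf = SVC(kernel="precomputed").fit(K, labels)
train_acc = clf.score(K, labels)
print(f"training accuracy: {train_acc:.2f}")
```

Classifying a new utterance would mean fitting a GMM to its frames and computing its kernel row against the training GMMs before calling `clf.predict`; the same pipeline applies unchanged when the features are swapped for real MFCC, LPC, LPCC, PLP, or Rasta-PLP frames.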