Open Access Article

Improving Speech Emotion Recognition using Signal Processing and Feature Extraction Techniques

Divyansh Kumar, Vatsal Kumar Sharma, Avni Chauhan, Gungun Singh, Gurwinder Singh

Section: Research Paper, Product Type: Journal Paper
Volume-11, Issue-01, Page no. 177-183, Nov-2023

Online published on Nov 30, 2023

Copyright © Divyansh Kumar, Vatsal Kumar Sharma, Avni Chauhan, Gungun Singh, Gurwinder Singh. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.


How to Cite this Paper


IEEE Style Citation: Divyansh Kumar, Vatsal Kumar Sharma, Avni Chauhan, Gungun Singh, Gurwinder Singh, “Improving Speech Emotion Recognition using Signal Processing and Feature Extraction Techniques,” International Journal of Computer Sciences and Engineering, Vol.11, Issue.01, pp.177-183, 2023.

MLA Style Citation: Divyansh Kumar, Vatsal Kumar Sharma, Avni Chauhan, Gungun Singh, Gurwinder Singh "Improving Speech Emotion Recognition using Signal Processing and Feature Extraction Techniques." International Journal of Computer Sciences and Engineering 11.01 (2023): 177-183.

APA Style Citation: Divyansh Kumar, Vatsal Kumar Sharma, Avni Chauhan, Gungun Singh, Gurwinder Singh, (2023). Improving Speech Emotion Recognition using Signal Processing and Feature Extraction Techniques. International Journal of Computer Sciences and Engineering, 11(01), 177-183.

BibTex Style Citation:
@article{Kumar_2023,
author = {Divyansh Kumar and Vatsal Kumar Sharma and Avni Chauhan and Gungun Singh and Gurwinder Singh},
title = {Improving Speech Emotion Recognition using Signal Processing and Feature Extraction Techniques},
journal = {International Journal of Computer Sciences and Engineering},
issue_date = {11 2023},
volume = {11},
number = {01},
month = nov,
year = {2023},
issn = {2347-2693},
pages = {177-183},
url = {https://www.ijcseonline.org/full_spl_paper_view.php?paper_id=1430},
publisher = {IJCSE, Indore, INDIA},
}

RIS Style Citation:
TY - JOUR
UR - https://www.ijcseonline.org/full_spl_paper_view.php?paper_id=1430
TI - Improving Speech Emotion Recognition using Signal Processing and Feature Extraction Techniques
T2 - International Journal of Computer Sciences and Engineering
AU - Divyansh Kumar
AU - Vatsal Kumar Sharma
AU - Avni Chauhan
AU - Gungun Singh
AU - Gurwinder Singh
PY - 2023
DA - 2023/11/30
PB - IJCSE, Indore, INDIA
SP - 177
EP - 183
IS - 01
VL - 11
SN - 2347-2693
ER -

           

Abstract

Emotional responses play a crucial role in daily social interactions, enabling us to perceive and understand others’ moods and feelings. The field of emotion detection and recognition is evolving rapidly, with Speech Emotion Recognition (SER) emerging as a prominent research area. SER analyses speech patterns to identify human emotions, with significant potential applications in human-computer interaction, healthcare, and education. Current systems for emotion recognition from speech signals employ a variety of techniques, including natural language processing, signal processing, and machine learning; these techniques extract relevant features from the speech signal and classify them into discrete emotional categories. Because speech carries rich acoustic and linguistic information, it is an excellent resource for computational analysis. Although previous studies have proposed various methods for speech emotion classification, the effectiveness of voice-based emotion identification still needs to be improved, largely because the fundamental temporal structure of the speech waveform remains poorly understood. This paper aims to advance speech emotion recognition by applying signal processing and feature extraction techniques to uncover valuable insights from the speech signal.
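To make the pipeline described in the abstract concrete, the following minimal Python sketch (not taken from the paper; the file names, sampling rate, MFCC features, and SVM classifier are illustrative assumptions) shows how frame-level features can be extracted from a speech signal and passed to a conventional classifier:

import numpy as np
import librosa                                   # audio loading and feature extraction
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC                      # support vector classifier

def extract_features(path, sr=16000, n_mfcc=13):
    """Load a speech file and summarise its frame-level MFCCs as one fixed-length vector."""
    signal, sr = librosa.load(path, sr=sr)                        # resample to a common rate
    mfcc = librosa.feature.mfcc(y=signal, sr=sr, n_mfcc=n_mfcc)   # shape: (n_mfcc, n_frames)
    # Mean and standard deviation over time yield one feature vector per utterance.
    return np.concatenate([mfcc.mean(axis=1), mfcc.std(axis=1)])

# Hypothetical labelled utterances; a real study would use a corpus such as IEMOCAP [1].
dataset = [("happy_01.wav", "happy"), ("angry_01.wav", "angry"), ("sad_01.wav", "sad")]

X = np.array([extract_features(path) for path, _ in dataset])
y = [label for _, label in dataset]

# Standardise the features, then fit a simple SVM classifier.
model = make_pipeline(StandardScaler(), SVC(kernel="rbf"))
model.fit(X, y)

print(model.predict([extract_features("happy_01.wav")]))         # predicted emotion label

In practice the placeholder file list would be replaced by a labelled emotional speech corpus, and the MFCC summary statistics could be augmented with additional prosodic and spectral features before classification.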

Key-Words / Index Term

Emotional responses, Speech Emotion Recognition (SER), Human-computer interaction, Feature extraction, Natural language processing, Machine learning.

References

[1] C. Busso, M. Bulut, C. C. Lee, A. Kazemzadeh, E. Mower, S. Kim, J. N. Chang, and S. Narayanan, "IEMOCAP: Interactive emotional dyadic motion capture database," Journal of Language Resources and Evaluation, Vol.42, No.4, pp.335-359, 2008.
[2] F. Eyben, M. Wöllmer, and B. Schuller, "Opensmile: the Munich versatile and fast open-source audio feature extractor," in Proceedings of the International Conference on Multimedia, pp.1459-1462, 2010.
[3] T. Ganchev, N. Fakotakis, and G. Kokkinakis, "Comparative evaluation of various MFCC implementations on the speaker verification task," in Proceedings of the International Conference on Speech and Computer, pp.191-194, 2005.
[4] K. Han, Y. Yun, and H. C. Rim, "Speech emotion recognition using convolutional and recurrent neural networks," in Proceedings of the International Conference on Human-Computer Interaction, pp.595-602, 2014.
[5] A. Nogueiras, A. Moreno, A. Bonafonte, and J. B. Marino, "Speech emotion recognition using hidden Markov model," in Eurospeech, 2001.
[6] P. Shen, Z. Changjun, and X. Chen, "Automatic Speech Emotion Recognition Using Support Vector Machine," in International Conference on Electronic and Mechanical Engineering and Information Technology, 2011.
[7] J. E. Kim and E. André, "Emotion recognition based on physiological changes in music listening," IEEE Transactions on Affective Computing, Vol.4, No.4, pp.366-379, 2013.
[8] J. Lee and I. Tashev, "High-level feature representation using recurrent neural network for speech emotion recognition," in Proceedings of the International Conference on Acoustics, Speech and Signal Processing, pp.4910-4914, 2015.
[9] V. Chernykh, G. Sterling, and P. Prihodko, "Emotion recognition from speech with recurrent neural networks," arXiv preprint arXiv:1701.08071, 2017.
[10] B. Schuller, G. Rigoll, and M. Lang, "Hidden Markov models for speech emotion recognition," IEEE Transactions on Affective Computing, Vol.1, No.2, pp.109-117, 2010.
[11] J. Deng, J. Guo, and Z. Wu, "Emotion recognition using speech features and support vector machines," in Proceedings of the International Conference on Machine Learning and Cybernetics, pp.3933-3938, 2007.
[12] S. Kim, E. M. Provost, and I. A. Essa, "Audio-based context recognition," in Proceedings of the International Conference on Multimedia, pp.1281-1284, 2013.
[13] H. Li, L. Zhang, and X. He, "Speech emotion recognition using a novel deep neural network," Neurocomputing, Vol.333, pp.154-160, 2019.
[14] L. Wang and Y. Huang, "Speech emotion recognition based on transfer learning and deep neural network," in Proceedings of the 4th International Conference on Robotics, Control and Automation, pp.105-108, 2019.
[15] S. Zhang, M. Lan, and C. Yang, "Speech emotion recognition based on multi-view fusion convolutional neural network," IEEE Access, Vol.9, pp.36762-36773, 2021.
[16] X. Zhang, C. Huang, and Y. Wang, "Speech emotion recognition based on convolutional neural network and softmax regression," in Proceedings of the 14th IEEE Conference on Industrial Electronics and Applications, pp.1804-1808, 2019.
[17] S. G. Koolagudi and K. S. Rao, "Speech emotion recognition using wavelet transform and support vector machines," Journal of Computing, Vol.4, No.4, pp.147-152, 2012.
[18] K. Han, D. Yu, and I. Tashev, "Speech emotion recognition using deep neural network and extreme learning machine," in Proceedings of the Annual Conference of the International Speech Communication Association, 2014.
[19] A. Koduru, H. B. Valiveti, and A. K. Budati, "Feature extraction algorithms to improve the speech emotion recognition rate," International Journal of Speech Technology, Vol.23, No.1, pp.45-55, 2020.
[20] J. Ancilin and A. Milton, "Improved speech emotion recognition with Mel frequency magnitude coefficient," Applied Acoustics, Vol.179, 108046, 2021.
[21] M. El Ayadi, M. S. Kamel, and F. Karray, "Survey on speech emotion recognition: Features, classification schemes, and databases," Pattern Recognition, Vol.44, No.3, pp.572-587, 2011.
[22] Y. B. Singh and S. Goel, "A systematic literature review of speech emotion recognition approaches," Neurocomputing, 2022.