Deep Learning Approach towards Emotion Recognition Based on Speech

Authors

  • Padmanabh Butala School of Computer Engineering and Technology, Dr. Vishwanath Karad MIT World Peace University, Pune, India
  • Dr. Rajendra Pawar School of Computer Engineering and Technology, Dr. Vishwanath Karad MIT World Peace University, Pune, India.
  • Dr. Nagesh Jadhav Department of Computer Science and Engineering, MIT ADT University, Pune, India.
  • Manas Kalangan School of Computer Engineering and Technology, Dr. Vishwanath Karad MIT World Peace University, Pune, India.
  • Aniket Dhumal School of Computer Engineering and Technology, Dr. Vishwanath Karad MIT World Peace University, Pune, India.
  • Sahil Kakad School of Computer Engineering and Technology, Dr. Vishwanath Karad MIT World Peace University, Pune, India.

Keywords:

MFCC, CNN, Region, SVM, SER

Abstract

Emotions are vitally important in the mental life of humans. They are a means of communicating one's point of view or emotional state to others [5]. The extraction of a speaker's emotional state from his or her speech signal is referred to as Speech Emotion Recognition (SER) [2]. There are a few universal emotions that any intelligent system with finite processing resources can be trained to recognize or synthesize as needed, including Neutral, Anger, Happiness, and Sadness. Because both spectral and prosodic features carry emotional information, both are used in this study for speech emotion identification. One of the spectral features is the Mel-frequency cepstral coefficients (MFCC). Prosodic variables such as fundamental frequency, loudness, pitch, and speech intensity, as well as glottal parameters, are used to model the various emotions. For the computational mapping between emotions and speech patterns, candidate features are extracted from each utterance. The selected features can be used to estimate pitch, which in turn can be used to classify gender. In this study, gender is classified using a Support Vector Machine (SVM) on the RAVDESS dataset. A Radial Basis Function network and a Back Propagation network are used to recognize emotions based on the specified features, and it is shown that the radial basis function produces more accurate emotion recognition results than the back propagation network.
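The feature-to-classifier pipeline described above (MFCC-style feature vectors fed to an RBF-kernel SVM) can be illustrated with a minimal sketch. This is not the paper's actual code: the synthetic 13-dimensional "MFCC" vectors, the class separation, and all variable names are illustrative assumptions standing in for features that would really be extracted from RAVDESS audio with a library such as librosa.

```python
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC

rng = np.random.default_rng(0)

# Synthetic stand-in for MFCC feature vectors (13 coefficients per
# utterance). Real features would come from audio; here the two classes
# are separated by a mean shift so the pipeline runs end to end.
n = 200
X_class_a = rng.normal(loc=0.0, scale=1.0, size=(n, 13))
X_class_b = rng.normal(loc=3.0, scale=1.0, size=(n, 13))
X = np.vstack([X_class_a, X_class_b])
y = np.array([0] * n + [1] * n)

X_tr, X_te, y_tr, y_te = train_test_split(
    X, y, test_size=0.25, random_state=0, stratify=y
)

# Radial basis function kernel, as used in the study.
clf = SVC(kernel="rbf")
clf.fit(X_tr, y_tr)
acc = clf.score(X_te, y_te)
print(f"held-out accuracy: {acc:.2f}")
```

The same `SVC` object serves both roles sketched in the abstract: trained on pitch-related features it would separate gender, and trained on the full spectral/prosodic feature set it would separate emotion classes.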

References

Murugan, Harini. Speech Emotion Recognition Using CNN. International Journal of Psychosocial Rehabilitation (2020).

Dissanayake, V.; Zhang, H.; Billinghurst, M.; Nanayakkara, S. Speech Emotion Recognition 'in the Wild' Using an Autoencoder. Interspeech 2020.

Li, H.; Ding, W.; Wu, Z.; Liu, Z. Learning Fine-Grained Cross Modality Excitement for Speech Emotion Recognition. 2020

Alli, Madhavi & Valentina, Albert & Karakavalasa, Mounika & Boddeda, Rohit & Nagma, Sheripally. (2021). Comparative Analysis of Different Classifiers for Speech Emotion Recognition.

Wategaonkar, Dhanashree, Rajendra Pawar, Prathamesh Jadhav, Tanaya Patole, Rohit R. Jadhav, and Saisrijan Gupta. "Sign Gesture Interpreter for Better Communication between a Normal and Deaf Person." Journal of Pharmaceutical Negative Results, 5990–6000. https://doi.org/10.47750/pnr.2022.13.S07.731, Sept 2022.

Pawar, R., Ghumbre, S., & Deshmukh, R. (2019). Visual Similarity Using Convolution Neural Network over Textual Similarity in Content-Based Recommender System. International Journal of Advanced Science and Technology, 27, 137–147.

Pawar, R., Ghumbre, S. and Deshmukh, R., 2018. Developing an Improvised E-Menu Recommendation System for Customer. Recent Findings in Intelligent Computing Techniques: Proceedings of the 5th ICACNI 2017, Volume 2, 708, p.333.

Garg, A., Anekar, N., Pawar, R.G., Ramekar, H., Tiwari, V., Padval, A., Marne, N. and Jadhav, P., 2022. Slider-Crank Four-Bar Mechanism-Based Ornithopter: Design and Simulation. ICT Systems and Sustainability: Proceedings of ICT4SD 2022, 516, p.267.

S.N. Roopa, M. Prabhakaran, P. Betty, Speech emotion recognition using deep learning. Int. J. Recent Technol. Eng. (2018).

R. G. Pawar, Dr. S. U. Ghumbre, Dr. R. R. Deshmukh, Dr. K. R. Kolhe, "A Hybrid Approach towards Improving Performance of Recommender System Using Matrix Factorization Techniques," International Journal of Future Generation Communication and Networking, Vol. 13, No. 4, (2020), pp. 467–477.

M. Ragot, N. Martin, S. Em, N. Pallamin, J.M. Diverrez, Emotion recognition using physiological signals: Laboratory vs. wearable sensors, in International Conference on Applied Human Factors and Ergonomics. Springer, pp. 15–22 (2017).

Vitthal Gutte, Dr. Kamatchi Iyer, "Cost and Communication Efficient Framework for Privacy Assured Data Management Cloud," International Journal of Engineering and Advanced Technology (IJEAT), ISSN: 2249-8958, Volume-8, Issue-4, April 2019.

Published

2024-05-30

How to Cite

Padmanabh Butala, Dr. Rajendra Pawar, Dr. Nagesh Jadhav, Manas Kalangan, Aniket Dhumal, & Sahil Kakad. (2024). Deep Learning Approach towards Emotion Recognition Based on Speech. JOURNAL OF ADVANCED APPLIED SCIENTIFIC RESEARCH, 6(3). Retrieved from http://mail.joaasr.com/index.php/joaasr/article/view/948