A review of the methods of recognition multimodal emotions in sound, image and text

Hosseini, S.; Yamaghani, M. R.; Poorzaker Arabani, S.

doi:10.71885/ijorlu-2024-1-657


Volume 12, Issue 1 (1-2024)

2024, 12(1): 29-41


S. Hosseini, M. R. Yamaghani, S. Poorzaker Arabani

Department of Computer Engineering and Information Technology, Lahijan Branch, Islamic Azad University, Lahijan, Iran

Abstract:

The study of recognizing multifaceted emotions through auditory, visual, and textual cues is a rapidly growing interdisciplinary field, encompassing the domains of psychology, computer science, and artificial intelligence. This paper investigates the spectrum of methodologies used to isolate and identify complex emotional states across these modalities, with the objective of delineating advancements and identifying areas for future investigation. Within the realm of sound, we explore progress in signal processing and machine learning techniques that facilitate the extraction of nuanced emotional indicators from vocal inflections and musical arrangements. Visual emotion recognition is evaluated in terms of the effectiveness of facial recognition algorithms, the analysis of body language, and the integration of contextual environmental information. Text-based emotion recognition is examined using natural language processing techniques that detect sentiment and emotional connotations in written language. Moreover, the paper considers the amalgamation of these distinct sources of emotional data, contemplating the challenges in constructing coherent models capable of interpreting multimodal inputs. Our methodology encompasses a meta-analysis of recent studies, evaluating the effectiveness and precision of diverse approaches and identifying commonly employed metrics for their assessment. The results suggest a preference for deep learning and hybrid models that harness the strengths of multiple analytical techniques to enhance recognition rates. However, challenges such as the subjective nature of emotion, cultural disparities in expression, and the need for extensive annotated datasets persist as significant hurdles. In conclusion, this review advocates for more nuanced datasets, enhanced interdisciplinary cooperation, and an ethical framework to govern the implementation of emotion recognition technologies. The potential applications for these technologies are expansive, ranging from healthcare to entertainment, and necessitate a concerted endeavor to refine and ethically integrate emotion recognition into our digital interactions.

Keywords: Multimodal Emotions, Fusion, Machine Learning, Deep Learning, Regression, CNN, RNN.


Type of Study: Applicable | Subject: General
Received: 2023/09/18 | Accepted: 2023/12/25 | Published: 2024/01/21


Hosseini S, Yamaghani M R, Poorzaker Arabani S. A review of the methods of recognition multimodal emotions in sound, image and text. International Journal of Applied Operational Research 2024; 12(1): 29-41.
URL: http://ijorlu.liau.ac.ir/article-1-657-en.html

Rights and permissions
	This work is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License.


International Journal of Applied Operational Research - An Open Access Journal
