A Hybrid Latent Space Data Fusion Method for Multimodal Emotion Recognition

Journal contribution, posted on 2019-01-01, authored by S. Nemati, R. Rohani, M. E. Basiri, Moloud Abdar, N. Y. Yen, and V. Makarenkov
© 2019 IEEE. Multimodal emotion recognition is an emerging interdisciplinary field of research in the area of affective computing and sentiment analysis. It aims at exploiting the information carried by signals of different natures to make emotion recognition systems more accurate, which requires a powerful multimodal fusion method. In this study, a hybrid multimodal data fusion method is proposed in which the audio and visual modalities are fused using a latent-space linear map; their features, projected into the cross-modal space, are then fused with the textual modality using a Dempster-Shafer (DS) theory-based evidential fusion method. Evaluation of the proposed method on videos from the DEAP dataset shows its superiority over both decision-level and non-latent-space fusion methods. Furthermore, the results reveal that employing Marginal Fisher Analysis (MFA) for feature-level audio-visual fusion yields a greater improvement than cross-modal factor analysis (CFA) and canonical correlation analysis (CCA). The implementation results also show that exploiting users' textual comments together with the audio-visual content of movies improves the performance of the system.
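
The abstract describes a two-stage pipeline: a latent-space linear map fuses audio and visual features, and DS-theory evidential fusion then combines the result with the textual modality. The Python sketch below illustrates the general shape of such a pipeline; it is not the authors' implementation. CCA stands in for the latent map because it is readily available in scikit-learn (the paper reports MFA as the strongest of the three maps compared), Dempster's rule is reduced to singleton emotion hypotheses, and all dimensions, class counts, and data are hypothetical placeholders.

    # Minimal sketch of the two fusion stages, assuming pre-extracted
    # per-clip feature matrices. All dimensions, class counts, and data
    # below are hypothetical placeholders, not values from the paper.
    import numpy as np
    from sklearn.cross_decomposition import CCA

    # Stage 1: latent-space (feature-level) audio-visual fusion via CCA.
    rng = np.random.default_rng(0)
    audio = rng.normal(size=(200, 40))   # e.g. 40-dim acoustic features per clip
    visual = rng.normal(size=(200, 60))  # e.g. 60-dim visual features per clip

    cca = CCA(n_components=10)
    cca.fit(audio, visual)
    audio_c, visual_c = cca.transform(audio, visual)  # projections into the shared space
    audiovisual = np.hstack([audio_c, visual_c])      # fused audio-visual feature vector

    # Stage 2: evidential (decision-level) fusion with the textual modality,
    # using Dempster's rule of combination restricted to singleton hypotheses
    # (one mass per emotion class, no composite sets).
    def dempster_combine(m1, m2):
        """Combine two mass functions defined over the same singleton classes."""
        agreement = m1 * m2                 # mass where both sources agree
        conflict = 1.0 - agreement.sum()    # mass assigned to conflicting pairs
        if np.isclose(conflict, 1.0):
            raise ValueError("total conflict: masses cannot be combined")
        return agreement / agreement.sum()  # renormalise by 1 - conflict

    # Hypothetical classifier outputs over four emotion classes.
    m_audiovisual = np.array([0.6, 0.2, 0.1, 0.1])  # from the fused audio-visual model
    m_text = np.array([0.5, 0.3, 0.1, 0.1])         # from the textual-comments model
    print(dempster_combine(m_audiovisual, m_text))  # combined belief per class

Restricting the frame of discernment to singletons keeps the combination rule to a single renormalised product; a full DS treatment would also assign mass to composite sets of emotions.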

History

Journal

IEEE Access

Volume

7

Pagination

172948-172964

Publisher

IEEE

Location

Piscataway, N.J.

eISSN

2169-3536

Language

eng

Publication classification

C1 Refereed article in a scholarly journal