Enhancing Cross-Language Multimodal Emotion Recognition With Dual Attention Transformers

Muhammad Zaidi, Syed Aun; Latif, Siddique; Qadir, Junaid

Author	Muhammad Zaidi, Syed Aun
Author	Latif, Siddique
Author	Qadir, Junaid
Date available	2025-07-08T03:58:10Z
Publication date	2024
Publication name	IEEE Open Journal of the Computer Society
Source	Scopus
Identifier	http://dx.doi.org/10.1109/OJCS.2024.3486904
ISSN	26441268
URI	http://hdl.handle.net/10576/66082
Abstract	Despite the recent progress in emotion recognition, state-of-the-art systems are unable to achieve improved performance in cross-language settings. In this article we propose a Multimodal Dual Attention Transformer (MDAT) model to improve cross-language multimodal emotion recognition. Our model utilises pre-trained models for multimodal feature extraction and is equipped with dual attention mechanisms including graph attention and co-attention to capture complex dependencies across different modalities and languages to achieve improved cross-language multimodal emotion recognition. In addition, our model also exploits a transformer encoder layer for high-level feature representation to improve emotion classification accuracy. This novel construct preserves modality-specific emotional information while enhancing cross-modality and cross-language feature generalisation, resulting in improved performance with minimal target language data. We assess our model's performance on four publicly available emotion recognition datasets and establish its superior effectiveness compared to recent approaches and baseline models.
Sponsor	Funding text 1: This work was supported in part by Qatar University High Impact Internal under Grant QUHI-CENG23/24-127, and in part by Qatar National Library. The statements made herein are solely the responsibility of the authors. Open Access publication supported by Qatar National Library.; Funding text 2: The authors would like to acknowledge support from Qatar University High Impact Internal Grant (QUHI-CENG23/24-127). Open access funding is provided by Qatar National Library. The statements made herein are solely the responsibility of the authors.
Language	en
Publisher	IEEE
Subject	Co-attention networks; graph attention networks; multi-modal learning; multimodal emotion recognition
Title	Enhancing Cross-Language Multimodal Emotion Recognition With Dual Attention Transformers
Type	Article
Pages	684-693
Volume	5
dc.accessType	Open Access

Files in this record

Name: Enhancing_Cross-Language_Multi ...
Size: 2.007Mb
Format: PDF


This record appears in the following collections

Computer Science and Engineering [2484 items]

