Does the inclusion of other modalities enhance the performance of speech emotion recognition systems?
Reference
Nineteenth Australasian International Conference on Speech Science and Technology, Melbourne, Australia, 03 Dec 2024 - 05 Dec 2024. Proceedings of the Nineteenth Australasian International Conference on Speech Science and Technology. Australasian Speech Science and Technology Association. 32-36. 01 Dec 2024
Abstract
The pursuit of natural human-computer interaction has driven the advancement of emotion recognition technology. Speech emotion recognition (SER) has gained widespread attention owing to its broad applicability. Recently, researchers have become interested in developing multi-modal emotion recognition (MER) systems that integrate speech with text and video modalities to enhance robustness and accuracy. We analyse the performance of these systems on the IEMOCAP and RAVDESS datasets, highlighting the impact of different modality combinations on emotion recognition accuracy. This paper aims to guide future research in optimising MER by leveraging the complementary advantages of the various modalities.
Rights
Copyright: ASSTA