BACKGROUND AND OBJECTIVE: Emotion recognition in conversations using artificial intelligence (AI) has gained significant attention due to its potential to provide insights into human social behavior. This study extends AI-based emotion recognition to the recognition of emotional climate (EC), which reflects the joint emotional atmosphere dynamically created and perceived by peers during conversations. The objective is to propose and evaluate a novel approach, MLBispec, for EC recognition using speech signals.

METHODS: The MLBispec approach involves time-windowed bispectral analysis of conversational speech signals to extract features related to nonlinear harmonic interactions. These features are combined with peers' affect dynamics, derived from emotion labeling for the same time windows, to form an extended feature set. The combined feature set is then fed into machine learning (ML) classifiers. MLBispec was evaluated on the IEMOCAP, K-EmoCon, and SEWA open-access datasets, which provide 2D emotion annotations (arousal and valence) divided into low/high classes. Additionally, cross-lingual experiments were conducted to test the framework's generalization across languages.

RESULTS: Experimental results demonstrated that MLBispec outperformed previous deep learning-based state-of-the-art approaches in speech emotion recognition, achieving accuracies of 82.6% for arousal and 75.4% for valence. The framework's incorporation of both qualitative and quantitative EC measurements enhanced its ability to characterize the dynamic speech representations of conversational affective structures. Cross-lingual experiments further validated the robustness of MLBispec.

CONCLUSIONS: The findings highlight the effectiveness of MLBispec in objectively recognizing peers' EC during conversations, setting a new standard for practical emotionally aware applications. These include point-of-care healthcare, human-computer interfaces (HCI), and large language models (LLMs). By enabling dynamic and reliable EC recognition, MLBispec paves the way for advancements in emotionally intelligent systems.
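To make the notion of windowed bispectral analysis concrete, the following is a minimal sketch of a generic direct (FFT-based) bispectrum estimator. It is not the authors' implementation: the function name `bispectrum`, the Hann window, and the `nfft` and `hop` parameters are illustrative assumptions, since the abstract does not specify window lengths, overlap, or which bispectral features MLBispec extracts. The estimate averages the triple product X(k1) X(k2) X*(k1 + k2) over segments, so it is large where two frequency components are phase-coupled to a third at their sum frequency, which is the kind of nonlinear harmonic interaction the abstract refers to.

```python
import numpy as np


def bispectrum(signal, nfft=64, hop=32):
    """Direct bispectrum estimate, averaged over overlapping segments.

    Hypothetical helper for illustration; nfft and hop are assumed values,
    not parameters taken from the MLBispec paper.
    Returns a complex (nfft//2, nfft//2) array indexed by FFT bins (k1, k2).
    """
    signal = np.asarray(signal, dtype=float)
    window = np.hanning(nfft)
    half = nfft // 2
    k = np.arange(half)
    # k_sum[k1, k2] = k1 + k2; the largest value is nfft - 2, a valid FFT bin.
    k_sum = k[:, None] + k[None, :]

    acc = np.zeros((half, half), dtype=complex)
    count = 0
    for start in range(0, len(signal) - nfft + 1, hop):
        seg = signal[start:start + nfft]
        seg = (seg - seg.mean()) * window  # remove DC, then taper
        X = np.fft.fft(seg)
        # Triple product X(k1) * X(k2) * conj(X(k1 + k2)) for all bin pairs.
        acc += X[k][:, None] * X[k][None, :] * np.conj(X[k_sum])
        count += 1

    if count == 0:
        raise ValueError("signal is shorter than nfft")
    return acc / count


if __name__ == "__main__":
    # Synthetic test signal (not speech): components at bins 6 and 10, plus a
    # component at bin 16 whose phase is the sum of the other two phases.
    n = np.arange(1024)
    x = (np.cos(2 * np.pi * 6 * n / 64 + 0.3)
         + np.cos(2 * np.pi * 10 * n / 64 + 1.1)
         + np.cos(2 * np.pi * 16 * n / 64 + 1.4))
    B = bispectrum(x)
    peak = np.unravel_index(np.argmax(np.abs(B)), B.shape)
    print("strongest coupling at bins:", peak)
```

In a pipeline like the one the abstract describes, summary statistics of such a matrix for each time window would presumably be joined with the affect-dynamics features before classification; the specific features and classifiers are given in the paper, not here.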

Original publication

DOI

10.1016/j.cmpb.2025.108695

Type

Journal

Comput Methods Programs Biomed

Publication Date

18/03/2025

Volume

265

Keywords

Bispectrum, Conversational speech signals, Emotion recognition in conversations, Emotional climate, MLBispec