Deep Bispectral Analysis of Conversational Speech Towards Emotional Climate Recognition

Alhussein G., Alkhodari M., Khandoker AH., Hadjileontiadis LJ.

Peers' conversational speech plays a significant role in shaping the emotional climate (EC) during interactions. Machine-based recognition of EC provides insights into the emotional perception of conversations by both peers and external observers. In this paper, we propose DeepBispec, a novel approach for EC recognition using deep bispectral analysis. DeepBispec applies windowed bispectral analysis to the 1D conversational speech signal. By capturing higher-order spectral correlations, the bispectrum magnifies the nonlinear characteristics present in speech signals. The estimated 2D bispectrum magnitude contours, representing these interactions, are transformed into colored images and fed into a convolutional neural network (CNN). The CNN learns deep features from the bispectrum magnitude contours, enabling it to predict the valence (V) and arousal (A) labels associated with the EC. Evaluating DeepBispec on the K-EmoCon dataset using 10-fold cross-validation, we achieve an accuracy of 0.789 (A)/0.771 (V), an F1 score of 0.850 (A)/0.836 (V), and an area under the curve (AUC) of 0.812 (A)/0.788 (V). These results surpass existing benchmarks, demonstrating the effectiveness of the bispectrum in capturing nonlinear characteristics and improving EC recognition. DeepBispec introduces an innovative approach to analyzing conversational speech for enhanced EC recognition. By leveraging deep bispectral analysis and a CNN, it uncovers the higher-order spectral correlations and nonlinear dynamics of speech signals. This contributes to a deeper understanding of emotional dynamics in conversations and provides valuable insights into EC perception.
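
The windowed bispectral analysis mentioned in the abstract can be illustrated with a minimal sketch. It assumes the standard direct (FFT-based) estimator, which averages the triple product X(f1) X(f2) X*(f1 + f2) over windowed segments; the abstract does not state which estimator the authors used. The function name `bispectrum_magnitude` and the parameters `nfft` and `hop` are hypothetical and chosen for illustration only, not taken from the paper.

```python
import numpy as np


def bispectrum_magnitude(x, nfft=64, hop=32):
    """Direct-method estimate of the bispectrum magnitude of a 1D signal.

    Illustrative sketch only: nfft and hop are assumed values,
    not the settings reported for DeepBispec.
    """
    x = np.asarray(x, dtype=float)
    window = np.hanning(nfft)
    half = nfft // 2
    f = np.arange(half)                  # non-negative frequency bins
    sum_idx = f[:, None] + f[None, :]    # bin index of f1 + f2 (always < nfft)

    acc = np.zeros((half, half), dtype=complex)
    n_seg = 0
    for start in range(0, len(x) - nfft + 1, hop):
        seg = x[start:start + nfft]
        seg = (seg - seg.mean()) * window    # remove DC, then taper
        X = np.fft.fft(seg)
        # Triple product X(f1) * X(f2) * conj(X(f1 + f2)) for all bin pairs
        acc += X[f][:, None] * X[f][None, :] * np.conj(X[sum_idx])
        n_seg += 1

    if n_seg == 0:
        raise ValueError("signal is shorter than one analysis window")
    return np.abs(acc / n_seg)           # averaged over segments
```

The returned 2D array is symmetric in (f1, f2) and peaks where two frequency components are phase-coupled to a third at their sum, which is the kind of nonlinear interaction the abstract refers to. Rendering this array as a colored contour image and passing it to a CNN would be separate steps that are not shown here.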

DOI

10.1109/IICAIET59451.2023.10291940

Type

Conference paper

Publication Date

2023-01-01T00:00:00+00:00

Pages

170 - 175

Total pages

5
