Abstract:
In speech production, emotional cues can be detected through three main aspects: the excitation source, the vocal tract, and prosodic patterns. This paper addresses the first, extracting six time- and frequency-related features from glottal pulse signals derived from stressed vowels. Four sustained vowels conveying five basic emotions, selected from sentence recordings of the Berlin emotional speech database, were investigated. The effectiveness of these glottal pulse features for emotion recognition was demonstrated through double round-robin quadratic classification in both single-gender and cross-gender stages, reaching average overall hit rates of 63%, 64%, and 53% for male, female, and cross-gender cases, respectively.