Audio-visual emotion recognition

Published:
January 27, 2023

This Audio-Visual Database of Emotional Speech contains 100 recordings of acted emotional content. The files are divided into three modalities (full AV, video-only, and audio-only) and two vocal channels (speech and song). Each file features a single actor portraying one of eight emotions: calm, neutral, happy, sad, angry, fearful, surprised, or disgusted.
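
The labeling scheme described above (eight emotions, three modalities, two vocal channels) could be represented in code roughly as follows. This is a minimal sketch, not the dataset's own tooling: the `Recording` class, its `path` and `actor_id` fields, and the `parse_emotion` helper are hypothetical names introduced for illustration, and the listing does not say how labels are actually stored or encoded in file names.

```python
from dataclasses import dataclass
from enum import Enum


class Emotion(Enum):
    """The eight emotion categories named in the dataset description."""
    CALM = "calm"
    NEUTRAL = "neutral"
    HAPPY = "happy"
    SAD = "sad"
    ANGRY = "angry"
    FEARFUL = "fearful"
    SURPRISED = "surprised"
    DISGUSTED = "disgusted"


class Modality(Enum):
    """The three modalities named in the dataset description."""
    FULL_AV = "full AV"
    VIDEO_ONLY = "video-only"
    AUDIO_ONLY = "audio-only"


class VocalChannel(Enum):
    """The two vocal channels named in the dataset description."""
    SPEECH = "speech"
    SONG = "song"


@dataclass(frozen=True)
class Recording:
    """One file with a single actor and a single emotion.

    Hypothetical record layout: `path` and `actor_id` are assumptions,
    not fields documented by the dataset.
    """
    path: str
    actor_id: int
    emotion: Emotion
    modality: Modality
    channel: VocalChannel


def parse_emotion(label: str) -> Emotion:
    """Map a free-text label to an Emotion, ignoring case and outer spaces.

    Raises ValueError for labels outside the eight categories.
    """
    return Emotion(label.strip().lower())
```

Using enums rather than raw strings means a misspelled or unsupported label fails at load time instead of silently creating a ninth class.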

Dataset Technical Specification

Number of files:
100
Total dataset size:
Duration:
Format:
wav
Sample rate:
Resolution:

Dataset Demographics

Country:
Worldwide
Gender:
50% male, 50% female
Age:
18-55
Number of participants:
50
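
For the files delivered in wav format, properties such as duration, sample rate, and bit depth can be read directly from each file's header. The sketch below uses only Python's standard `wave` module and assumes uncompressed PCM wav files in a single local folder; the function names `describe_wav` and `summarize` and the folder layout are illustrative assumptions, not part of the dataset.

```python
import wave
from pathlib import Path


def describe_wav(path):
    """Return basic header properties of one PCM wav file."""
    with wave.open(str(path), "rb") as wav:
        frames = wav.getnframes()
        rate = wav.getframerate()
        return {
            "channels": wav.getnchannels(),
            "sample_rate_hz": rate,
            # Sample width is reported in bytes; convert to bits.
            "bit_depth": wav.getsampwidth() * 8,
            "duration_s": frames / rate if rate else 0.0,
        }


def summarize(folder):
    """Aggregate header properties over every .wav file in a folder.

    Assumes a flat folder of wav files (hypothetical layout).
    """
    infos = [describe_wav(f) for f in sorted(Path(folder).glob("*.wav"))]
    return {
        "files": len(infos),
        "total_duration_s": sum(i["duration_s"] for i in infos),
        "sample_rates_hz": sorted({i["sample_rate_hz"] for i in infos}),
    }
```

Calling `summarize` on the download folder gives a quick check that the file count matches the listing and that all files share one sample rate before any model training starts.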