A dataset of videos of talking faces with transcriptions

Data were collected from 100 subjects, yielding over a thousand instances of synchronized data.
Files: 1000
Size:
Format: wav
Duration:
Country: Worldwide
Participants: 100
Languages:
Updated: January 27, 2023

Description

This is a large-scale multimodal dataset developed to support machine learning research that combines thermal, visual, and audio data streams. Example applications include human–computer interaction, biometric authentication, recognition systems, domain transfer, and speech recognition.

Version Info

Version:
Last updated:
Owner:

Dataset Technical Specification

Number of files: 1000
Total dataset size:
Duration:
Format: wav
Sample rate:
Resolution:
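
Because the sample rate and duration are not listed above, they can be read directly from the audio files once downloaded. The following is a minimal sketch using Python's standard `wave` module. It assumes the files are uncompressed PCM WAV, which this listing does not confirm, and the function name `describe_wav` and the file name in the usage comment are hypothetical.

```python
import wave


def describe_wav(path):
    """Return basic audio properties read from a WAV file's header.

    Assumption: the file is uncompressed PCM, the only WAV variant
    that the standard-library `wave` module is guaranteed to read.
    """
    with wave.open(path, "rb") as wav_file:
        frame_rate = wav_file.getframerate()
        frame_count = wav_file.getnframes()
        return {
            "sample_rate_hz": frame_rate,
            "channels": wav_file.getnchannels(),
            "bits_per_sample": wav_file.getsampwidth() * 8,
            "duration_s": frame_count / frame_rate,
        }


# Hypothetical usage; the file name is a placeholder, not a real dataset path:
# info = describe_wav("sample_0001.wav")
```

Running this over every file and summing `duration_s` would give the total duration that the specification leaves blank.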

Dataset Demographics

📍 Country: Worldwide
🧍 Gender: M/F 50-50%
📅 Age: 18-55
👥 Number of participants: 100

🛡️ Consent & Compliance