Casual Conversations Dataset

Casual Conversations is a large-scale multimodal (video + audio) benchmark dataset built to evaluate and audit computer vision and speech models for accuracy across diverse ages, genders, apparent skin tones, and lighting conditions.

Files: 45,186
Size: 15 GB
Format: MP4
Duration: ~1 minute per video on average
Country: USA
Participants: 3,011
Languages:
Updated: December 11, 2025

Description

The Casual Conversations dataset contains 45,000+ videos from 3,011 consented participants, created specifically to evaluate the performance and reliability of pre-trained AI models in computer vision and audio applications. Participants were paid, agreed to take part in the project, and explicitly provided their own age and gender labels. The videos were recorded in the U.S. with a diverse set of adults across age, gender, and apparent skin tone groups. Trained annotators labeled each participant’s apparent skin tone using the Fitzpatrick scale and marked whether each video was recorded in low ambient lighting conditions.

To support multimodal and speech research, all spoken content is manually transcribed by human annotators and is available with the dataset. The dataset is intended for use under the permitted purposes defined in the data user agreement.
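The source does not prescribe a metric for scoring speech models against the human transcripts. One common choice is word error rate (WER), and the following is a minimal sketch of it, assuming plain-text reference and hypothesis strings. The function name and the whitespace tokenization are illustrative assumptions, not part of the dataset.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the number of reference words."""
    # Assumption: lowercase + whitespace split is enough normalization here.
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    if not ref:
        raise ValueError("reference transcript is empty")

    # prev[j] holds the edit distance between the reference words seen
    # so far (before the current one) and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(min(
                prev[j] + 1,         # deletion (reference word missing)
                curr[j - 1] + 1,     # insertion (extra hypothesis word)
                prev[j - 1] + cost,  # substitution or exact match
            ))
        prev = curr
    return prev[-1] / len(ref)


# Made-up example strings: one deleted word and one substituted word,
# i.e. 2 edits over 6 reference words.
wer = word_error_rate("i like to walk my dog", "i like walk my dogs")
```

Averaging this score within each age, gender, or skin tone group, rather than over the whole dataset, is what surfaces performance gaps between groups.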

Labels
Age (self-provided): 3,011
Gender (self-provided): 3,011
Skin Tone (human labeled): 3,011
Lighting (human labeled): 45,186
Speech Transcriptions (human labeled): 45,186
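These labels are meant for auditing, so a typical use is to break a model's accuracy down by label group. The sketch below assumes a hypothetical list of per-video records. The field names (skin_tone, low_light, prediction, target) are invented for illustration and do not reflect the dataset's actual annotation schema.

```python
from collections import defaultdict


def accuracy_by_group(records, group_key):
    """Return {group value: fraction of correct predictions}."""
    totals = defaultdict(int)
    correct = defaultdict(int)
    for rec in records:
        group = rec[group_key]
        totals[group] += 1
        correct[group] += int(rec["prediction"] == rec["target"])
    return {group: correct[group] / totals[group] for group in totals}


# Hypothetical records; values are made up, not taken from the dataset.
# Skin tone uses Fitzpatrick types written as Roman numerals (I to VI).
records = [
    {"skin_tone": "II", "low_light": False, "prediction": 1, "target": 1},
    {"skin_tone": "II", "low_light": True,  "prediction": 0, "target": 1},
    {"skin_tone": "V",  "low_light": False, "prediction": 1, "target": 1},
    {"skin_tone": "V",  "low_light": True,  "prediction": 1, "target": 1},
]

by_tone = accuracy_by_group(records, "skin_tone")   # {"II": 0.5, "V": 1.0}
by_light = accuracy_by_group(records, "low_light")  # {False: 1.0, True: 0.5}
```

The labels are only read here to slice evaluation results, which matches the evaluate-only use the license summary permits.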

License

Partner Proprietary License

Version Info

Version: 1
Last updated: December 11, 2025
Owner: Meta

Dataset Technical Specification

Number of files: 45,186
Total dataset size: 15 GB
Duration: ~1 minute per video on average
Format: MP4
Sample rate:
Resolution:
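A couple of derived figures follow directly from the numbers listed on this page. The short calculation below is only a back-of-the-envelope sketch. It treats the "~1 minute" average as exactly one minute, so the total duration is approximate.

```python
num_files = 45_186
num_participants = 3_011
avg_minutes_per_video = 1  # the page says "~1 Minute", so this is approximate

# Average number of videos recorded per participant (about 15).
videos_per_participant = num_files / num_participants

# Rough total footage in hours (about 753), given the approximate average.
approx_total_hours = num_files * avg_minutes_per_video / 60

print(f"{videos_per_participant:.2f} videos per participant")
print(f"about {approx_total_hours:.0f} hours of video")
```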

Dataset Demographics

📍 Country: USA
🧍 Gender:
📅 Age:
👥 Number of participants: 3,011

🛡️ Consent & Compliance

Participants agreed to participate in the project and explicitly provided their age and gender labels themselves.

License: Limited; see the full license language for permitted use.

Summary of license permissions

- You can evaluate models on the provided labels

- You cannot train any model with the provided labels