This dataset contains recordings of licence plates from across Europe, annotated for training licence plate recognition (LPR) models. The footage was recorded at 1080p and is provided in MP4 format.
This corpus contains 15 spontaneous dialogues and multi-participant conversations by deaf signers: 10 were recorded in authentic settings such as a deaf club and a bar, and 5 were recorded in the lab.
Casual Conversations is a large-scale multimodal (video + audio) benchmark dataset built to evaluate and audit computer vision and speech models for accuracy across diverse ages, genders, apparent skin tones, and lighting conditions.
Human lipreading performance increases for longer words, indicating the importance of features capturing temporal context in an ambiguous communication channel.