The Best Facial Recognition Datasets of 2022

Finding high-quality facial recognition datasets can be tricky, especially if you need particular demographics, lighting conditions, or video durations. 

That’s why at Twine we specialize in helping companies create custom facial recognition datasets and will help you get the data you need, no matter the demographics or requirements. 

If you’re specifically looking for an off-the-shelf dataset then we’ve done the hard work for you. 

Here are our top picks for facial recognition datasets:

Flickr-Faces-HQ Dataset (FFHQ)

Flickr-Faces-HQ Dataset (FFHQ) is a dataset of human faces with considerable variation in age, ethnicity, and image background. It also has great coverage of accessories such as eyeglasses, sunglasses, and hats. The images were crawled from Flickr and then automatically aligned and cropped. It consists of 70,000 high-quality PNG images at 1024×1024 resolution.

Access the dataset

Tufts Face Dataset

Tufts Face Dataset is a comprehensive, large-scale face dataset that contains 7 image modalities: visible, near-infrared, thermal, computerized sketch, LYTRO, recorded video, and 3D images. The dataset contains over 10,000 images of 74 females and 38 males from more than 15 countries, ranging in age from 4 to 70 years old.
This is a helpful dataset for benchmarking facial recognition algorithms for sketches, thermal, NIR, 3D face recognition, and heterogeneous face recognition.

Access the dataset

Labeled Faces in the Wild (LFW) Dataset

Labeled Faces in the Wild (LFW) Dataset is a database of face photographs designed for studying the problem of unconstrained face recognition. It serves as a public benchmark for face verification, also known as pair matching: deciding whether two images show the same person. The dataset is 173MB and consists of over 13,000 images of faces collected from the web.
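To make pair matching concrete, here is a minimal sketch of how a verification benchmark can be scored. It assumes you already have an embedding vector for each face from some model of your choice; the function names, the toy vectors, and the 0.8 threshold are all made up for illustration and are not part of LFW itself.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def verification_accuracy(pairs, threshold):
    """Fraction of pairs classified correctly at a given threshold.

    `pairs` is a list of (embedding_a, embedding_b, is_same_person).
    A pair is predicted "same person" when similarity >= threshold.
    """
    correct = 0
    for emb_a, emb_b, is_same in pairs:
        predicted_same = cosine_similarity(emb_a, emb_b) >= threshold
        correct += int(predicted_same == is_same)
    return correct / len(pairs)


# Toy, made-up embeddings; a real run would use a face model's output.
toy_pairs = [
    ([1.0, 0.0], [0.9, 0.1], True),
    ([1.0, 0.0], [0.0, 1.0], False),
    ([0.5, 0.5], [0.6, 0.4], True),
    ([0.2, 0.9], [0.9, 0.2], False),
]
accuracy = verification_accuracy(toy_pairs, threshold=0.8)
```

In practice the threshold would be tuned on held-out pairs rather than fixed by hand.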

Access the dataset

UTKFace Dataset

UTKFace dataset is a large-scale face dataset with a long age span (from 0 to 116 years old). The dataset consists of over 20,000 face images with annotations of age, gender, and ethnicity. The images cover large variations in pose, facial expression, illumination, occlusion, and resolution. This dataset can be used for a variety of tasks, e.g., face detection, age estimation, age progression/regression, and landmark localization.
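If you want to read those annotations programmatically, the sketch below shows one way to do it. It assumes the labels are embedded in each filename as `[age]_[gender]_[race]_[date&time].jpg` with the integer encodings listed in the code; that convention comes from the dataset's own documentation rather than from this article, so verify it against your copy. The function name, the `FaceLabel` type, and the example filenames are hypothetical.

```python
from typing import NamedTuple

# Assumed label encodings; check them against the dataset's documentation.
GENDERS = {0: "male", 1: "female"}
RACES = {0: "White", 1: "Black", 2: "Asian", 3: "Indian", 4: "Others"}


class FaceLabel(NamedTuple):
    age: int
    gender: str
    race: str


def parse_utkface_name(filename):
    """Parse '[age]_[gender]_[race]_[date&time].jpg' into a FaceLabel.

    Returns None when the name does not have the expected fields.
    """
    parts = filename.split("_")
    if len(parts) < 4:
        return None
    try:
        age, gender, race = int(parts[0]), int(parts[1]), int(parts[2])
    except ValueError:
        return None
    if gender not in GENDERS or race not in RACES:
        return None
    return FaceLabel(age, GENDERS[gender], RACES[race])


# Hypothetical filenames for illustration.
label = parse_utkface_name("25_1_2_20170116174525125.jpg")
malformed = parse_utkface_name("39_1_20170116174525125.jpg")
```

Returning `None` for names that do not fit the pattern lets you skip any mislabeled files instead of crashing halfway through a directory.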

Access the dataset

The Yale Face Database

The Yale Face Database (size 6.4MB) contains 165 grayscale images in GIF format of 15 individuals. There are 11 images per subject, one per different facial expression or configuration: center-light, w/glasses, happy, left-light, w/no glasses, normal, right-light, sad, sleepy, surprised, and wink.

Access the dataset

Face Images with Marked Landmark Points Dataset

Face Images with Marked Landmark Points is a Kaggle dataset for predicting keypoint positions on face images. This dataset is 497MB and contains 7,049 facial images, each with up to 15 key points marked on it.

This dataset can be used as a building block to track faces in images and video, analyze facial expressions, detect dysmorphic facial signs for medical diagnosis, and biometrics or facial recognition.
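Because each image has "up to" 15 key points, some rows will have missing coordinates. The sketch below shows one way to handle that, assuming the annotations arrive as a CSV-style record with paired `<name>_x` and `<name>_y` columns; the column names and values in the sample row are illustrative assumptions, so check them against the actual file.

```python
def extract_keypoints(row):
    """Pair '<name>_x' / '<name>_y' columns into {name: (x, y)}.

    Keypoints with a missing or empty coordinate are skipped.
    """
    points = {}
    for column, value in row.items():
        if not column.endswith("_x"):
            continue
        name = column[:-2]  # strip the trailing "_x"
        x_raw, y_raw = value, row.get(name + "_y")
        if x_raw in (None, "") or y_raw in (None, ""):
            continue
        points[name] = (float(x_raw), float(y_raw))
    return points


# Hypothetical row, shaped like one csv.DictReader record.
row = {
    "left_eye_center_x": "66.03", "left_eye_center_y": "39.00",
    "nose_tip_x": "44.42", "nose_tip_y": "57.07",
    "mouth_center_top_lip_x": "", "mouth_center_top_lip_y": "",
}
keypoints = extract_keypoints(row)
```

Here the mouth keypoint is dropped because its coordinates are empty, so downstream code only ever sees complete (x, y) pairs.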

Access the dataset

Google Facial Expression Comparison Dataset

This dataset is a large-scale facial expression dataset that consists of face image triplets along with human annotations that specify which two faces in each triplet form the most similar pair in terms of facial expression. The dataset is 200MB and includes 500K triplets and 156K face images. Its aim is to aid researchers working on topics related to facial expression analysis, such as expression-based image retrieval, expression-based photo album summarization, emotion classification, and expression synthesis.
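As a rough illustration of how triplet annotations like these can be used for evaluation, the sketch below predicts the most similar pair in each triplet from expression embeddings and compares it with the human label. The in-memory triplet format, the function names, and the toy embeddings are all assumptions made for this example; they do not reflect the dataset's actual file layout.

```python
import math


def most_similar_pair(emb1, emb2, emb3):
    """Return the 1-based index pair with the smallest Euclidean distance."""
    embeddings = {1: emb1, 2: emb2, 3: emb3}
    candidate_pairs = [(1, 2), (1, 3), (2, 3)]
    return min(
        candidate_pairs,
        key=lambda p: math.dist(embeddings[p[0]], embeddings[p[1]]),
    )


def triplet_accuracy(triplets):
    """Fraction of triplets where the predicted pair matches the annotation.

    `triplets` is a list of (emb1, emb2, emb3, annotated_pair).
    """
    hits = sum(
        most_similar_pair(a, b, c) == tuple(label)
        for a, b, c, label in triplets
    )
    return hits / len(triplets)


# Toy, made-up embeddings and labels for illustration only.
toy_triplets = [
    ([0.0, 0.0], [0.1, 0.0], [1.0, 1.0], (1, 2)),
    ([0.0, 0.0], [1.0, 1.0], [0.9, 1.0], (2, 3)),
    ([0.0, 0.0], [0.5, 0.0], [0.0, 0.4], (1, 2)),
]
score = triplet_accuracy(toy_triplets)
```

In the third toy triplet the embeddings place faces 1 and 3 closest together, which disagrees with the label, so only two of the three triplets count as hits.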

Access the dataset

Wrapping up

To conclude, here are our top picks for the best facial recognition datasets for your projects:

  1. Flickr-Faces-HQ Dataset (FFHQ)
  2. Tufts Face Dataset
  3. Labeled Faces in the Wild (LFW) Dataset
  4. UTKFace Dataset
  5. The Yale Face Database
  6. Face Images with Marked Landmark Points Dataset
  7. Google Facial Expression Comparison Dataset

We hope that this list has either helped you find a dataset for your project or shown you the range of options available to you.

If there are any datasets you would like us to add to the list then please let us know here.

If you would like to learn more about how we could help build a custom dataset for your project, please don’t hesitate to contact us!

Let us help you do the math – check our AI dataset project calculator.

Ready to learn more? Check out our Dataset Archives.

Twine AI

Harness Twine’s established global community of over 400,000 freelancers from 190+ countries to scale your dataset collection quickly. We have systems to record, annotate and verify custom video datasets at an order of magnitude lower cost than existing methods.