Facial Key Point Detection Using Convolutional Neural Network

Authors: Sandhya Yamala

DOI Link: https://doi.org/10.22214/ijraset.2026.77624

Abstract

These days, facial keypoint detection is a hot issue with many people drawn to its applications, which include the services like how old are you on Snapchat? Finding the facial key points in a particular face is the goal of facial keypoint detection, which is extremely difficult because every person has a completely diverse set of facial traits. Deep learning concepts, such as neural networks and cascaded neural networks, have been used to this issue. Furthermore, these structures produce much superior outcomes than cutting-edge techniques like dimension reduction algorithms and feature extraction. It\'s challenging to address the problem of facial keypoint recognition. Individual variations in facial traits can be observed due to factors such as 3D posture, size, location, viewing angle, and lighting conditions, and even within a single person. Although there has been significant progress in addressing these problems, there are still many areas where computer vision research may be strengthened. In our research, we want to use deep architectures to find the key points in each image to reduce losses for the task of detection and speed up training and testing for practical uses. As baselines, we have built two fundamental neural network architectures: a convolutional neural network and a hidden layer neural network. Additionally, we have suggested a method that uses factors other than raw input to determine the coordinates of face key points more accurately. The study\'s findings demonstrate the value of deep structures for face key point detection tasks, and employing the convolutional neural network model has marginally enhanced detection performance over baseline techniques.

Introduction

Facial key point detection is a critical task in computer vision that focuses on identifying and locating essential facial landmarks (eyes, nose, mouth corners, eyebrows, etc.) to extract nonverbal cues such as identity, emotions, intentions, and head pose. Accurately detecting these points is foundational for applications in face recognition, emotion analysis, gaze tracking, augmented reality, human-computer interaction, security, medical imaging, and entertainment.

Key points are categorized as:

Advantage points – define the precise positions of facial features
Interpolation points – connect features along contours

The process of facial key point detection enables efficient preprocessing for higher-level computer vision tasks, improving accuracy and robustness across varied faces and conditions.

Technologies and Approaches

Convolutional Neural Networks (CNNs) and Hidden Layer Neural Networks are commonly used for automatic key point detection.
Deep learning models reduce computation complexity while capturing local features effectively, leveraging sparsely connected layers and filters of varying sizes.
Pretrained models improve location prediction compared to baseline architectures.

Literature Review Highlights

Levi & Hassner: Proposed CNN-based age and gender classification using the Adience dataset, focusing on smaller networks to prevent overfitting.
Emotion Recognition Systems: CNNs trained on JAFFE and KDEF datasets for facial emotion detection achieved 78.1% accuracy after preprocessing and data augmentation.
Comparative Studies: Random Forests outperformed other traditional ML algorithms, while ANN and CNN were most effective among deep learning models in terms of performance on IoT-related classification datasets.

Methodology

The facial key point detection system consists of several stages:

Dataset:
- Source: Kaggle “Facial Key Points Detection”
- Size: 7049 images, 96×96 pixels
- Features: 15 key points (eyes, eyebrows, nose, mouth)
- Post preprocessing: 7000 images with 8 key features retained after dropping columns with high null values
Data Visualization and Preprocessing:
- Identifying and handling missing values
- Standardizing image sizes and extracting relevant key points
Data Splitting:
- Training set: 90% (6300 images)
- Validation set: 10% (700 images)
- Test set: 1783 images
Algorithm:
- Deep learning with CNNs
- Layers: Input → Convolution → Pooling → Activation → Fully Connected → Output
- Advantages: Efficient visual feature extraction, hierarchical learning, suitable for facial landmark localization
Model Architecture and Training:
- Multiple convolutional and pooling layers followed by dense layers
- Optional techniques: dropout, batch normalization, skip connections
- Training performed over 10 epochs using the fit() method
Evaluation:
- Performance metrics: Accuracy and Loss
- Validation on unseen data to assess generalization

Applications

Facial key point detection supports a wide range of applications:

Face alignment and tracking
Emotion recognition
Augmented reality and filters
Biometric authentication
Human-computer interaction
Medical imaging and analysis

Conclusion

In conclusion, considerable progress in the area of computer vision has been shown by the creation and assessment of a face key point identification system using convolutional neural networks (CNNs). Using CNNs and other deep learning approaches, the facial key point detection model has shown impressive performance in correctly identifying facial important points in a range of photos. The CNN-based system\'s performance assessment has shown its resilience and effectiveness in identifying facial features with high accuracy and precision, even under difficult circumstances such as changing illumination, facial emotions, and postures. This demonstrates how CNNs may be used to efficiently extract and process intricate spatial connections from face pictures, which can result in more accurate and adaptable facial key point recognition systems. Even though the findings of our research are encouraging, but there are still certain limits to be aware of and opportunities for future development. Problems including biases in the datasets, the demand for computing resources, and the sporadic inability to identify important sites in harsh environments draw attention to the need for continuous study and improvement. Overall, in our research the 99.2% accuracy achieved highlights the efficacy of the selected methodology, which most likely included rigorous data preparation, the creation of an intricate Convolutional Neural Network (CNN) architecture, cautious hyperparameter tweaking, and reliable training protocols.

References

[1] Levi, G. and Hassncer, T. (2015) ‘Age and gender classification using Convolutional Neural Networks’, 2015 IEEE Conference on Computer Vision and Pattern Recognition Workshops (CVPRW) [Preprint]. doi:10.1109/cvprw.2015.7301352. [2] Harvey, A. (2014) Adience benchmark, Exposing.ai. Available at: https://exposing.ai/adience/ (Accessed: 26 February 2024). [3] Chatfield, K. et al. (2014) Return of the devil in the details: Delving deep into convolutional nets, arXiv.org. Available at: https://arxiv.org/abs/1405.3531 (Accessed: 26 February 2024). [4] Ghaffar, F. (2022) Facial emotions recognition using the convolutional neural net, arXiv.org. Available at: https://arxiv.org/abs/2001.01456 (Accessed: 27 February 2024). [5] Lyons, M., Kamachi, M. and Gyoba, J. (2023) The Japanese Female Facial Expression (Jaffe) dataset, Zenodo. Available at: https://zenodo.org/records/3451524 (Accessed: 27 February 2024). [6] Lundqvist, D., flykt, A., & hman, A. (1998). the Karolinska directed emotional faces-KDEF (CD ROM). Stockholm Karolinska Institute, Department of Clinical Neuroscience, Psychology Section. - references - scientific research publishing. Available at: https://www.scirp.org/(S(351jmbntvnsjt1aadkposzje))/reference/ReferencesPapers.aspx?ReferenceID=1567781 (Accessed: 27 February 2024). [7] Vakili, M., Ghamsari, M. and Rezaei, M. (2020) Performance analysis and comparison of machine and deep learning algorithms for IOT Data Classification, arXiv.org. Available at: https://arxiv.org/abs/2001.09636 (Accessed: 06 March 2024). [8] Sambare, M. (2020) Fer-2013, Kaggle. Available at: https://www.kaggle.com/datasets/msambare/fer2013 (Accessed: 26 February 2024).

Copyright

Copyright © 2026 Sandhya Yamala. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

Download Paper

Paper Id : IJRASET77624

Publish Date : 2026-02-22

ISSN : 2321-9653

Publisher Name : IJRASET

DOI Link : Click Here