Abstract
Deepfake technology has developed into a powerful tool capable of producing highly realistic synthetic media, raising significant concerns about misinformation and digital security. This study proposes a deep learning–based method for identifying deepfake images using multiple Convolutional Neural Network (CNN) architectures. The system uses five models, GoogLeNet, InceptionV3, VGG16, DenseNet121, and Xception, to perform binary classification between authentic and manipulated images. The dataset undergoes preprocessing and data augmentation to improve the generalization ability of the models. Transfer learning is employed to take advantage of pretrained weights, thereby reducing computational cost. The models are trained and evaluated using standard performance metrics: accuracy, precision, recall, and F1-score. The experimental findings indicate that VGG16 achieves the highest accuracy among all models, while DenseNet121 and Xception also perform strongly. This study demonstrates the effectiveness of CNN-based techniques in detecting manipulated media and highlights the importance of selecting appropriate models to improve detection accuracy.
Introduction
Recent advancements in artificial intelligence (AI) and deep learning have enabled the creation of highly realistic synthetic media, or deepfakes, which can manipulate facial features in images and videos. While deepfakes have applications in entertainment and virtual reality, they raise serious concerns including identity misuse, misinformation, and privacy violations. Detecting deepfakes is increasingly challenging, as traditional image analysis methods often fail to capture subtle manipulations, leading to the adoption of Convolutional Neural Networks (CNNs) for automated detection.
This study aims to develop a robust deepfake detection framework by implementing and comparing multiple CNN architectures, GoogLeNet, InceptionV3, VGG16, DenseNet121, and Xception, leveraging transfer learning and pretrained ImageNet weights. The methodology involves systematic data collection, preprocessing, augmentation, and real-time loading, ensuring efficient training and reliable evaluation. Each CNN model is fine-tuned for binary classification of real versus fake images, with performance assessed through metrics such as accuracy, confusion matrices, and classification reports.
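As one illustration of this transfer-learning setup, the sketch below (in Keras) freezes a pretrained VGG16 backbone and attaches a small binary classification head. The head layers, dropout rate, and optimizer here are assumptions for illustration, not the study's exact configuration:

```python
import tensorflow as tf

def build_vgg16_classifier(input_shape=(224, 224, 3), weights="imagenet"):
    """Frozen VGG16 backbone plus a small binary classification head."""
    base = tf.keras.applications.VGG16(
        weights=weights, include_top=False, input_shape=input_shape)
    base.trainable = False  # keep pretrained ImageNet features fixed
    model = tf.keras.Sequential([
        base,
        tf.keras.layers.GlobalAveragePooling2D(),
        tf.keras.layers.Dense(256, activation="relu"),
        tf.keras.layers.Dropout(0.5),
        tf.keras.layers.Dense(1, activation="sigmoid"),  # real (0) vs fake (1)
    ])
    model.compile(optimizer="adam",
                  loss="binary_crossentropy",
                  metrics=["accuracy"])
    return model
```

The same pattern extends to the other backbones by swapping in `tf.keras.applications.DenseNet121`, `InceptionV3`, or `Xception`, which is what makes a like-for-like architecture comparison straightforward.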
The proposed framework addresses the growing need for automated, reliable deepfake detection and evaluates which CNN architecture is most effective for identifying manipulated media, providing a foundation for mitigating risks associated with deepfake proliferation in social media, journalism, and cybersecurity.
Conclusion
The proposed system for deepfake detection demonstrates the effectiveness of Convolutional Neural Networks in identifying manipulated images. Multiple CNN architectures, including GoogLeNet, InceptionV3, VGG16, DenseNet121, and Xception, were implemented and evaluated for performance comparison.
Among these, VGG16 achieved the best accuracy, mainly due to its deep architecture and strong feature extraction capability. DenseNet121 and Xception also showed competitive results, emphasizing the importance of advanced architectures and efficient feature utilization. The findings indicate that transfer learning plays a key role in improving performance while reducing training time and computational requirements [6]–[10].
The application of data augmentation further improved model generalization by increasing variability within the training data. Evaluation using the confusion matrix, precision, recall, and F1-score provided a comprehensive analysis of model performance [12].
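For concreteness, all four metrics named above can be derived from the entries of a binary confusion matrix. The following is a minimal NumPy sketch of that computation, illustrative rather than the study's actual evaluation code:

```python
import numpy as np

def binary_metrics(y_true, y_pred):
    """Accuracy, precision, recall, F1, and confusion matrix for labels in {0, 1}."""
    y_true = np.asarray(y_true)
    y_pred = np.asarray(y_pred)
    tp = int(np.sum((y_true == 1) & (y_pred == 1)))  # fakes caught
    tn = int(np.sum((y_true == 0) & (y_pred == 0)))  # reals passed
    fp = int(np.sum((y_true == 0) & (y_pred == 1)))  # reals flagged as fake
    fn = int(np.sum((y_true == 1) & (y_pred == 0)))  # fakes missed
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = 2 * precision * recall / (precision + recall) if (precision + recall) else 0.0
    return {
        "accuracy": (tp + tn) / len(y_true),
        "precision": precision,
        "recall": recall,
        "f1": f1,
        "confusion_matrix": np.array([[tn, fp], [fn, tp]]),
    }
```

In practice a library routine such as scikit-learn's `classification_report` produces the same numbers; the point of spelling them out is that precision and recall weight false positives and false negatives differently, which matters when falsely flagging a real image carries a different cost than missing a fake.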
Additionally, the project was extended to a real-world application by deploying the trained model using Flask. This enables users to upload images and receive predictions instantly, making the system practical and user-friendly.
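A deployment of this kind can be as small as a single Flask route. The sketch below is a hypothetical outline: the endpoint name, response format, and the `run_model` stub are assumptions standing in for the trained CNN's actual inference path:

```python
from flask import Flask, request, jsonify

app = Flask(__name__)

def run_model(image_bytes):
    # Placeholder for inference: in the real system the uploaded bytes
    # would be decoded, resized to the CNN's input size, scaled to
    # [0, 1], and passed to the trained model's predict() method.
    return 0.5

@app.route("/predict", methods=["POST"])
def predict():
    upload = request.files.get("image")
    if upload is None:
        return jsonify({"error": "no image uploaded"}), 400
    score = float(run_model(upload.read()))
    return jsonify({"label": "fake" if score >= 0.5 else "real",
                    "score": score})
```

Run with `flask run` during development; loading the model once at startup (rather than per request) keeps per-image latency down to the cost of a single forward pass.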
Overall, this study highlights the potential of deep learning methods in addressing deepfake-related challenges and underscores the importance of selecting suitable architectures for improved detection accuracy [4], [5].
References
[1] J. J. Bird and A. Lotfi, “CIFAKE: Image Classification and Explainable Identification of AI-Generated Synthetic Images,” IEEE Access, vol. 12, 2024.
[2] H. Farid, “Image Forgery Detection: A Survey,” IEEE Signal Processing Magazine, vol. 26, no. 2, pp. 16–25, Mar. 2009.
[3] A. Rössler et al., “FaceForensics++: Learning to Detect Manipulated Facial Images,” Proc. IEEE ICCV, 2019, pp. 1–11.
[4] Y. Li and S. Lyu, “Exposing DeepFake Videos By Detecting Face Warping Artifacts,” Proc. IEEE CVPR Workshops (CVPRW), 2019.
[5] D. Güera and E. J. Delp, “Deepfake Video Detection Using Recurrent Neural Networks,” Proc. IEEE AVSS, 2018, pp. 1–6.
[6] C. Szegedy et al., “Going Deeper with Convolutions,” Proc. IEEE CVPR, 2015.
[7] C. Szegedy et al., “Rethinking the Inception Architecture for Computer Vision,” Proc. IEEE CVPR, 2016.
[8] K. Simonyan and A. Zisserman, “Very Deep Convolutional Networks for Large-Scale Image Recognition,” Proc. ICLR, 2015.
[9] G. Huang et al., “Densely Connected Convolutional Networks,” Proc. IEEE CVPR, 2017.
[10] F. Chollet, “Xception: Deep Learning with Depthwise Separable Convolutions,” Proc. IEEE CVPR, 2017.
[11] B. Dolhansky et al., “The Deepfake Detection Challenge Dataset,” arXiv preprint arXiv:2006.07397, 2020.
[12] H. Nguyen, J. Yamagishi and I. Echizen, “Capsule-Forensics: Using Capsule Networks to Detect Forged Images and Videos,” Proc. IEEE ICASSP, 2019.
[13] X. Yang, Y. Li and S. Lyu, “Exposing Deep Fakes Using Inconsistent Head Poses,” Proc. IEEE ICASSP, 2019.
[14] S. Agarwal et al., “Protecting World Leaders Against Deepfakes,” Proc. IEEE CVPR Workshops, 2019.
[15] Z. Wang et al., “CNN-generated Images are Surprisingly Easy to Spot… for Now,” Proc. IEEE CVPR, 2020.