The advancement of artificial intelligence (AI) has led to the development of powerful generative models such as StyleGAN, DALL·E, and Stable Diffusion, which are capable of creating highly realistic synthetic images. The CIFAKE dataset serves as a benchmark for training deep learning models to distinguish between real and AI-generated images. In this study, we propose an AI-based framework for detecting synthetic imagery using deep learning and explainable AI (XAI) methods. Our approach incorporates Convolutional Neural Networks (CNNs) and Vision Transformers (ViTs) to classify images as either authentic or artificially generated. Additionally, explainability tools such as Grad-CAM and SHAP are employed to highlight the most influential features contributing to the model’s predictions. This system promotes greater transparency in AI decision processes and supports trustworthy authentication of digital content.
Introduction
Background
Advances in deep learning, particularly Generative Adversarial Networks (GANs), have enabled the creation of highly realistic synthetic images.
While useful in entertainment, education, and data augmentation, these technologies raise ethical and security concerns due to their potential for deception.
Proposed Solution: FakeVision AI
FakeVision AI is a detection system that combines Convolutional Neural Networks (CNNs) with Explainable AI (XAI) to identify computer-generated images.
Trained on the CIFAKE dataset, which combines 60,000 real images drawn from CIFAR-10 with 60,000 AI-generated counterparts, the system achieves 92.98% accuracy.
System Components
Interpretability Module – Explains classifier decisions to enhance transparency (a Grad-CAM sketch follows this list).
Image Storage System – Archives metadata and attributes of images.
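As an illustration of the Interpretability Module above, the following is a minimal Grad-CAM sketch for a Keras binary classifier. The layer name `last_conv`, the single sigmoid output, and the calling convention are assumptions for illustration, not the exact FakeVision AI implementation.

```python
import numpy as np
import tensorflow as tf

def grad_cam(model, image, conv_layer_name="last_conv"):
    """Compute a Grad-CAM heatmap for one preprocessed image (assumed Keras CNN)."""
    # Model mapping the input to (conv feature maps, final prediction).
    grad_model = tf.keras.Model(
        model.inputs,
        [model.get_layer(conv_layer_name).output, model.output],
    )

    with tf.GradientTape() as tape:
        conv_out, pred = grad_model(image[np.newaxis, ...])
        score = pred[:, 0]  # single sigmoid output (positive-class score)

    # Gradient of the class score with respect to the conv feature maps.
    grads = tape.gradient(score, conv_out)
    # Channel importance weights: global average of the gradients.
    weights = tf.reduce_mean(grads, axis=(0, 1, 2))
    # Weighted sum of feature maps, then ReLU and normalisation to [0, 1].
    cam = tf.nn.relu(tf.reduce_sum(conv_out[0] * weights, axis=-1))
    cam = cam / (tf.reduce_max(cam) + 1e-8)
    return cam.numpy()  # 2-D heatmap; upsample to 224x224 for an overlay
```

The resulting heatmap can be resized to the input resolution and overlaid on the image to show which regions most influenced the real-versus-fake decision.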
Workflow Modules
Dataset Collection – Balanced gathering of real and fake images.
Image Preprocessing – Resizing (224×224) and normalization for CNN input.
Model Training – CNN trained for binary classification using binary cross-entropy loss (see the preprocessing-and-training sketch after this list).
Model Evaluation – Metrics: accuracy, precision, recall, and F1-score, with visual tools such as a confusion matrix (see the evaluation sketch after this list).
Prediction Interface – Accepts new images for prediction and displays visual explanations.
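The sketch below illustrates the Dataset Collection, Image Preprocessing, and Model Training modules with a small Keras CNN. The directory layout (`cifake/train` and `cifake/test` with `REAL` and `FAKE` subfolders), the architecture, and the hyperparameters are assumptions for illustration, not the exact FakeVision AI configuration.

```python
import tensorflow as tf
from tensorflow.keras import layers

IMG_SIZE = (224, 224)  # resize target from the Image Preprocessing module

# Dataset Collection / Image Preprocessing: the folder layout is assumed;
# classes are inferred alphabetically (FAKE -> 0, REAL -> 1).
train_ds = tf.keras.utils.image_dataset_from_directory(
    "cifake/train", image_size=IMG_SIZE, batch_size=32, label_mode="binary")
val_ds = tf.keras.utils.image_dataset_from_directory(
    "cifake/test", image_size=IMG_SIZE, batch_size=32, label_mode="binary")

# Model Training: a small CNN with a sigmoid output for real-vs-fake.
model = tf.keras.Sequential([
    layers.Rescaling(1.0 / 255, input_shape=IMG_SIZE + (3,)),  # normalization
    layers.Conv2D(32, 3, activation="relu"),
    layers.MaxPooling2D(),
    layers.Conv2D(64, 3, activation="relu"),
    layers.MaxPooling2D(),
    layers.Conv2D(128, 3, activation="relu", name="last_conv"),
    layers.GlobalAveragePooling2D(),
    layers.Dense(1, activation="sigmoid"),
])

model.compile(
    optimizer="adam",
    loss="binary_crossentropy",  # binary cross-entropy loss
    metrics=["accuracy",
             tf.keras.metrics.Precision(),
             tf.keras.metrics.Recall()],
)

model.fit(train_ds, validation_data=val_ds, epochs=10)
```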
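A companion sketch for the Model Evaluation and Prediction Interface modules follows; `model`, `val_ds`, and `IMG_SIZE` come from the training sketch above, and the 0.5 decision threshold is an assumed default.

```python
import numpy as np
import tensorflow as tf
from sklearn.metrics import classification_report, confusion_matrix

# Model Evaluation: accuracy, precision, recall, F1-score, confusion matrix.
y_true, y_pred = [], []
for images, labels in val_ds:
    probs = model.predict(images, verbose=0).ravel()
    y_true.extend(labels.numpy().ravel().astype(int))
    y_pred.extend((probs >= 0.5).astype(int))

# Class indices follow alphabetical folder order: FAKE = 0, REAL = 1.
print(classification_report(y_true, y_pred, target_names=["FAKE", "REAL"]))
print(confusion_matrix(y_true, y_pred))

# Prediction Interface: classify a new image and report its probability.
def predict_image(path):
    img = tf.keras.utils.load_img(path, target_size=IMG_SIZE)
    arr = tf.keras.utils.img_to_array(img)
    prob = float(model.predict(arr[np.newaxis, ...], verbose=0)[0, 0])
    label = "REAL" if prob >= 0.5 else "FAKE"
    return label, prob
```

In a full deployment, the predicted label would be shown alongside the Grad-CAM overlay from the interpretability sketch so that users see both the decision and its visual explanation.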
Results
FakeVision AI effectively distinguishes real from AI-generated images.
Explainable AI integration provides users with insight into classification decisions, increasing trust and interpretability alongside high detection performance.
Conclusion
The swift progress in generative AI technologies has made it increasingly challenging to differentiate between authentic and synthetic images. Reliable image classification has therefore become essential for verifying the credibility of digital media.
By utilizing advanced deep learning architectures such as Convolutional Neural Networks (CNNs) and Vision Transformers (ViTs), along with explainable AI methods, detection accuracy can be significantly enhanced while maintaining transparency in AI-driven decisions. This strategy not only tackles present difficulties but also establishes a strong foundation for defending against emerging AI-generated visual content. In summary, FakeVision AI, trained and evaluated on the CIFAKE benchmark, serves as an effective tool in combating misinformation by offering reliable detection and interpretability, holding great promise for fostering truth and integrity in the digital space.
References
[1] Bird, J. J., & Lotfi, A. (2024). CIFAKE: Image Classification and Explainable Identification of AI-Generated Synthetic Images. IEEE Access, 12.
[2] Selvaraju, R. R., Cogswell, M., Das, A., Vedantam, R., Parikh, D., & Batra, D. (2017). Grad-CAM: Visual Explanations from Deep Networks via Gradient-Based Localization. In Proceedings of the IEEE International Conference on Computer Vision (ICCV).
[3] Dosovitskiy, A., et al. (2020). An Image is Worth 16x16 Words: Transformers for Image Recognition at Scale. arXiv preprint arXiv:2010.11929.
[4] Ribeiro, M. T., Singh, S., & Guestrin, C. (2016). “Why Should I Trust You?”: Explaining the Predictions of Any Classifier. In Proceedings of the ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD).
[5] Lundberg, S. M., & Lee, S.-I. (2017). A Unified Approach to Interpreting Model Predictions. In Advances in Neural Information Processing Systems (NeurIPS).
[6] Chollet, F. (2017). Xception: Deep Learning with Depthwise Separable Convolutions. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[7] Tan, M., & Le, Q. V. (2019). EfficientNet: Rethinking Model Scaling for Convolutional Neural Networks. In Proceedings of the International Conference on Machine Learning (ICML).