Anomaly Detection Using Machine Learning: DBSCAN, Autoencoder and GMM Approaches

Authors: Bhavika Joshi, Prof. Komal Champanerkar

DOI Link: https://doi.org/10.22214/ijraset.2025.69280

Abstract

This research proposes a novel approach to anomaly detection that combines autoencoders and Gaussian Mixture Models (GMMs). Autoencoders are used for dimensionality reduction and feature extraction, while GMMs model the distribution of the encoded data. The hybrid method aims to combine the advantages of both techniques to improve anomaly detection accuracy across various domains. Experiments conducted on multiple datasets compare the proposed method with both conventional and modern approaches. The results show improved precision, recall, and F1-scores, suggesting that the integrated technique is promising for reliable anomaly identification in complex data settings.

Introduction

The research focuses on improving anomaly detection by combining two machine learning techniques: autoencoders and Gaussian Mixture Models (GMMs). Anomaly detection aims to identify data points that deviate significantly from normal patterns, which is crucial in areas like cybersecurity and fraud detection. Traditional methods often struggle with complex, high-dimensional data.

Autoencoders are neural networks that learn compact, lower-dimensional representations of data while preserving key features, making them effective for feature extraction and dimensionality reduction. They detect anomalies by reconstructing input data and identifying high reconstruction errors.
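As a minimal sketch of reconstruction-error scoring, the Python snippet below uses a hand-set, tied-weight linear map as a stand-in for a trained autoencoder. The weight vector `W`, the toy points, and the rule of thresholding at the largest error seen on normal data are assumptions made for illustration; none of them are taken from the paper.

```python
import math

# Hypothetical tied-weight linear autoencoder: 2-D input -> 1-D code -> 2-D output.
# The weights are fixed by hand here; a real autoencoder would learn them.
W = (1 / math.sqrt(2), 1 / math.sqrt(2))

def encode(x):
    """Project a 2-D point onto the 1-D latent axis."""
    return W[0] * x[0] + W[1] * x[1]

def decode(z):
    """Map a latent value back to 2-D input space."""
    return (W[0] * z, W[1] * z)

def reconstruction_error(x):
    """Squared Euclidean distance between x and its reconstruction."""
    x_hat = decode(encode(x))
    return (x[0] - x_hat[0]) ** 2 + (x[1] - x_hat[1]) ** 2

# Toy "normal" points that lie close to the line y = x, which the latent axis captures.
normal = [(1.0, 1.1), (2.0, 1.9), (3.0, 3.05), (4.0, 4.0)]

# Assumed rule: flag anything whose error exceeds the largest error on normal data.
threshold = max(reconstruction_error(p) for p in normal)

def is_anomaly(x):
    return reconstruction_error(x) > threshold
```

A point far from the line, such as `(3.0, 0.5)`, reconstructs poorly and is flagged, while a point on the line, such as `(2.5, 2.5)`, is not.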
Gaussian Mixture Models represent data as a mixture of multiple Gaussian distributions, modeling complex data distributions probabilistically. GMMs identify anomalies as points with low likelihood under the modeled distribution.
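The likelihood-based scoring described here can be sketched for a one-dimensional mixture as follows. The component parameters in `COMPONENTS` and the cut-off `LOG_THRESHOLD` are hypothetical values chosen for illustration; in practice the parameters would be fitted to data (for example by expectation-maximization) and the cut-off tuned on validation data.

```python
import math

# Hypothetical, hand-set mixture: (weight, mean, standard deviation) per component.
COMPONENTS = [(0.6, 0.0, 1.0), (0.4, 5.0, 0.5)]

def log_gaussian(x, mean, std):
    """Log density of a 1-D Gaussian at x."""
    z = (x - mean) / std
    return -0.5 * z * z - math.log(std) - 0.5 * math.log(2 * math.pi)

def log_likelihood(x):
    """Log of p(x) = sum_k w_k * N(x | mean_k, std_k), via log-sum-exp for stability."""
    terms = [math.log(w) + log_gaussian(x, m, s) for w, m, s in COMPONENTS]
    top = max(terms)
    return top + math.log(sum(math.exp(t - top) for t in terms))

LOG_THRESHOLD = -6.0  # assumed cut-off, not a value from the paper

def is_anomaly(x):
    """A point is anomalous when its log-likelihood falls below the cut-off."""
    return log_likelihood(x) < LOG_THRESHOLD
```

Points near either component mean (0.0 or 5.0) score well above the cut-off, while points far from both, such as 10.0 or -4.0, fall below it.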

The hybrid framework leverages autoencoders to extract meaningful features and reduce dimensionality, while GMMs model the distribution of encoded data for better anomaly scoring. This combination aims to capture both the structure and probabilistic nature of normal data, enhancing detection accuracy and robustness.
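One way to picture the hybrid idea is to encode each point and then fit a mixture model on the latent codes. The sketch below does this with a fixed linear projection standing in for a trained encoder and a small two-component EM loop in one dimension. The encoder, the toy data, the EM initialization, the variance floor, and the threshold rule are all assumptions for illustration, not the authors' implementation.

```python
import math

def encode(x):
    """Stand-in encoder (assumption): project 2-D input onto (1, 1) / sqrt(2)."""
    return (x[0] + x[1]) / math.sqrt(2)

def log_gaussian(z, mean, var):
    return -0.5 * ((z - mean) ** 2 / var + math.log(2 * math.pi * var))

def log_mixture(z, params):
    """Numerically stable log p(z) under a 1-D Gaussian mixture."""
    terms = [math.log(w) + log_gaussian(z, m, v) for w, m, v in params]
    top = max(terms)
    return top + math.log(sum(math.exp(t - top) for t in terms))

def fit_gmm(codes, n_iter=50, var_floor=1e-3):
    """Two-component EM on 1-D latent codes; means start at the extremes."""
    params = [(0.5, min(codes), 1.0), (0.5, max(codes), 1.0)]
    for _ in range(n_iter):
        # E-step: responsibility of each component for each code.
        resp = []
        for z in codes:
            logs = [math.log(w) + log_gaussian(z, m, v) for w, m, v in params]
            total = log_mixture(z, params)
            resp.append([math.exp(l - total) for l in logs])
        # M-step: re-estimate weight, mean and variance per component.
        new_params = []
        for k in range(len(params)):
            nk = sum(r[k] for r in resp)
            mean = sum(r[k] * z for r, z in zip(resp, codes)) / nk
            var = sum(r[k] * (z - mean) ** 2 for r, z in zip(resp, codes)) / nk
            new_params.append((nk / len(codes), mean, max(var, var_floor)))
        params = new_params
    return params

# Toy "normal" data forming two groups (illustrative only).
normal = [(1.0, 1.0), (1.2, 0.9), (0.9, 1.1), (1.1, 1.2),
          (4.0, 4.1), (3.9, 4.0), (4.2, 3.9), (4.1, 4.2)]
codes = [encode(x) for x in normal]
params = fit_gmm(codes)

def latent_score(x):
    """Log-likelihood of the encoded point under the fitted mixture."""
    return log_mixture(encode(x), params)

# Assumed rule: anything less likely than the least likely normal point is flagged.
threshold = min(latent_score(x) for x in normal)
```

In this toy setup a point such as `(2.5, 2.5)` falls between the two latent groups and scores below the threshold. A point such as `(3.0, -1.0)` encodes to the same latent value as `(1.0, 1.0)`, so latent density alone would not flag it; this is one reason a pipeline might also look at reconstruction error.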

Background and Related Work

This section reviews traditional and recent anomaly detection techniques, including clustering (K-Means, DBSCAN), outlier detection, supervised learning (SVM, Random Forest), and methods for handling imbalanced data. It also discusses applications, particularly financial fraud detection, credit card fraud detection, and network security.

Proposed System

The system integrates the following components:

Data preprocessing – cleaning and normalizing input data.
Autoencoder – trained on normal data to learn latent representations and detect anomalies via reconstruction error.
DBSCAN – density-based clustering on encoded data to identify clusters and outliers.
GMM – probabilistic modeling of encoded data to assign anomaly likelihood scores.
Anomaly Scoring and Decision Module – combines reconstruction error, clustering results, and GMM likelihood to classify data points as normal or anomalous.
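For the DBSCAN component listed above, the following pure-Python sketch shows how density-based clustering separates dense groups from noise points. The latent points, `eps`, and `min_pts` are hypothetical values chosen for illustration; the paper does not state its parameter settings.

```python
import math

NOISE = -1

def dbscan(points, eps, min_pts):
    """Return one label per point: a cluster id (0, 1, ...) or NOISE."""
    def neighbors(i):
        # Indices of all points within eps of point i (including i itself).
        return [j for j, q in enumerate(points) if math.dist(points[i], q) <= eps]

    labels = [None] * len(points)
    cluster = -1
    for i in range(len(points)):
        if labels[i] is not None:
            continue
        seeds = neighbors(i)
        if len(seeds) < min_pts:
            labels[i] = NOISE          # may later be claimed as a border point
            continue
        cluster += 1
        labels[i] = cluster
        queue = [j for j in seeds if j != i]
        while queue:
            j = queue.pop()
            if labels[j] == NOISE:
                labels[j] = cluster    # border point: reachable but not core
            if labels[j] is not None:
                continue
            labels[j] = cluster
            nbrs = neighbors(j)
            if len(nbrs) >= min_pts:   # j is a core point, so expand through it
                queue.extend(nbrs)
    return labels

# Hypothetical 2-D latent codes: two dense groups and one isolated point.
latent = [(0.0, 0.0), (0.1, 0.0), (0.0, 0.1), (0.1, 0.1),
          (3.0, 3.0), (3.1, 3.0), (3.0, 3.1), (3.1, 3.1),
          (1.5, 5.0)]
labels = dbscan(latent, eps=0.2, min_pts=3)
outliers = [p for p, label in zip(latent, labels) if label == NOISE]
```

Here the isolated point `(1.5, 5.0)` has no neighbors within `eps` and is labelled as noise, which is the signal the pipeline would treat as a candidate anomaly.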

This multi-step approach aims to improve detection of complex and subtle anomalies, particularly in credit card fraud scenarios.
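The source does not specify how the decision module combines its three signals. As one possible sketch, the snippet below uses a simple majority vote; the `Signals` container, the `classify` function, the thresholds, and the voting rule itself are all hypothetical.

```python
from dataclasses import dataclass

@dataclass
class Signals:
    recon_error: float      # autoencoder reconstruction error (higher = more suspicious)
    is_dbscan_noise: bool   # True if DBSCAN labelled the encoded point as noise
    gmm_log_lik: float      # GMM log-likelihood of the encoded point (lower = more suspicious)

def classify(s, recon_threshold, log_lik_threshold, min_votes=2):
    """Assumed fusion rule: flag a point when at least min_votes detectors agree."""
    votes = [
        s.recon_error > recon_threshold,
        s.is_dbscan_noise,
        s.gmm_log_lik < log_lik_threshold,
    ]
    return "anomalous" if sum(votes) >= min_votes else "normal"

# Illustrative inputs only; the numbers are not results from the paper.
suspicious = Signals(recon_error=3.1, is_dbscan_noise=True, gmm_log_lik=-1.2)
ordinary = Signals(recon_error=0.01, is_dbscan_noise=False, gmm_log_lik=-1.5)
```

With `recon_threshold=0.5` and `log_lik_threshold=-6.0`, the first example collects two votes and is classified as anomalous, while the second collects none. A weighted score or a learned combiner would be equally plausible alternatives.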

Conclusion

The proposed system for credit card fraud detection demonstrates significant improvements over traditional methods by combining the strengths of DBSCAN, autoencoders, and Gaussian Mixture Models. Experimental results show enhanced precision, recall, and F1-scores compared to existing techniques. The system's ability to automatically learn relevant features, identify density-based clusters, and model the probabilistic distribution of normal transactions contributes to its robustness and effectiveness in detecting various types of fraudulent activities.
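For reference, the three reported metrics can be computed from binary labels as in the short sketch below. The label vectors are made up purely to exercise the formulas and are not results from the paper.

```python
def precision_recall_f1(y_true, y_pred):
    """Compute precision, recall and F1 for the positive (fraud = 1) class."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return precision, recall, f1

# Made-up labels: 1 = fraud, 0 = normal.
y_true = [1, 1, 1, 0, 0, 0, 0, 0]
y_pred = [1, 1, 0, 1, 0, 0, 0, 0]
precision, recall, f1 = precision_recall_f1(y_true, y_pred)
```

Because fraud is typically rare, these class-specific metrics are generally more informative than plain accuracy for this kind of task.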

Copyright

Copyright © 2025 Bhavika Joshi, Prof. Komal Champanerkar. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

Paper Id : IJRASET69280

Publish Date : 2025-04-20

ISSN : 2321-9653

Publisher Name : IJRASET
