A Hybrid AI-Driven Cybersecurity Framework For Real-Time Threat Detection Using Big Data Analytics

Authors: Gyanendra Kumar Gautam (Research Scholar), Dr. Uday Pratap Singh

DOI Link: https://doi.org/10.22214/ijraset.2026.83364

Abstract

Advances in cloud computing, IoT devices, and high-speed communication networks have produced an environment in which cyberattacks are both more complex and more frequent, posing significant risks to computer security. Conventional intrusion detection techniques are insufficient against sophisticated and zero-day attacks because they are not flexible enough to handle large volumes of real-time traffic. This study develops a hybrid AI-based security system that draws on machine learning, deep learning, and big data analytics. To improve attack categorisation, feature learning, and anomaly detection for both known and new attacks, the proposed framework combines Random Forest (RF), One-Dimensional Convolutional Neural Network (1D-CNN), and One-Class Support Vector Machine (OCSVM) methods. Apache Kafka and Apache Spark Streaming provide scalable distributed stream processing for cybersecurity analytics. A weighted ensemble and adaptive mitigation algorithms are also incorporated to enable intelligent threat analysis, automatic threat containment, and threshold optimisation. In experiments on the CICIDS2017 dataset, the proposed hybrid system achieves a 99.2% detection rate, 99.0% precision, 99.4% recall, and a 99.1% F1-score with few false alarms. Runtime analysis reports an average detection latency of 142 ms, a processing capacity of 52,000 packets per second, and a response time of less than 150 ms, indicating that the framework is scalable. The proposed method supports the development of a scalable, flexible, and effective cybersecurity solution that identifies threats and neutralises them automatically.

Introduction

This paper proposes an adaptive hybrid cybersecurity framework designed to address the growing complexity of cyber threats in modern digital infrastructures such as cloud computing, IoT networks, and high-speed communication systems. Traditional signature-based Intrusion Detection Systems (IDS) are effective for identifying known attacks but struggle to detect advanced threats such as Advanced Persistent Threats (APTs) and zero-day attacks, which often lack recognizable signatures or continuously evolve to evade detection.

To overcome these limitations, the proposed framework integrates machine learning (ML), deep learning (DL), anomaly detection, and big data streaming technologies into a unified cybersecurity platform. The architecture combines Random Forest (RF), One-Class Support Vector Machine (OCSVM), and One-Dimensional Convolutional Neural Network (1D-CNN) models with Apache Kafka and Apache Spark Streaming to enable scalable, real-time threat detection, adaptive analysis, and automated mitigation.

The literature review highlights that traditional supervised learning techniques such as Random Forest, Decision Trees, and SVMs perform well in detecting known attacks but require labeled datasets and are less effective against unknown threats. Unsupervised approaches like OCSVM and Isolation Forest can identify anomalies and zero-day attacks but often suffer from high false-positive rates. Deep learning methods, particularly CNNs, have demonstrated strong capability in extracting complex features from network traffic. However, existing research typically focuses on detection accuracy without fully integrating scalability, real-time processing, and adaptive response mechanisms into a single framework.

The proposed system addresses this research gap through a multi-layer architecture consisting of:

Real-time data collection and ingestion using Apache Kafka.
Stream processing and analytics using Apache Spark Streaming.
Feature engineering techniques including preprocessing, normalization, feature selection, and dimensionality reduction.
Hybrid threat detection using RF, 1D-CNN, and OCSVM.
Adaptive decision-making and automated threat mitigation.
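
The feature engineering layer listed above can be illustrated with a minimal sketch. The paper does not specify which normalization or feature selection methods are used, so min-max scaling and a variance cutoff are assumptions chosen only for illustration, and the function names and the `min_variance` value are hypothetical.

```python
import math
from statistics import pvariance


def clean_rows(rows):
    """Drop rows that contain NaN or infinite values."""
    return [r for r in rows if all(math.isfinite(v) for v in r)]


def min_max_normalize(rows):
    """Scale every column to [0, 1]; constant columns map to 0.0."""
    cols = list(zip(*rows))
    lows = [min(c) for c in cols]
    highs = [max(c) for c in cols]
    return [
        [(v - lo) / (hi - lo) if hi > lo else 0.0
         for v, lo, hi in zip(row, lows, highs)]
        for row in rows
    ]


def select_by_variance(rows, min_variance=1e-3):
    """Keep only columns whose population variance exceeds min_variance.

    min_variance is a hypothetical cutoff, not a value from the paper.
    Returns the kept column indices and the reduced rows.
    """
    cols = list(zip(*rows))
    keep = [i for i, c in enumerate(cols) if pvariance(c) > min_variance]
    return keep, [[row[i] for i in keep] for row in rows]


# Toy flow records: three numeric features per record (values are made up).
flows = [
    [10.0, 1500.0, 6.0],
    [20.0, 400.0, 6.0],
    [float("inf"), 90.0, 6.0],  # malformed record, removed by clean_rows
    [30.0, 950.0, 6.0],
]
kept_columns, features = select_by_variance(min_max_normalize(clean_rows(flows)))
print(kept_columns, features)
```

In this toy input the third column is constant, so it carries no information and is dropped, while the record containing an infinite value is removed before scaling.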

The framework uses the CICIDS2017 dataset for model training and evaluation. Network traffic is represented as feature vectors, which are analyzed by the three detection models. The Random Forest classifier identifies known attack patterns through an ensemble of decision trees. The 1D-CNN model automatically extracts deep features and recognizes complex attack behaviors, while OCSVM detects deviations from normal traffic patterns, making it effective for identifying unknown and zero-day threats.
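
To make the feature extraction step of the 1D-CNN more concrete, the sketch below shows the basic operation such a network applies to a feature vector: a kernel slid along the sequence, followed by a ReLU activation and max pooling. It is an illustration of the building block only, not the paper's model. The kernel values are hand-picked here, whereas a trained network learns them, and the input vector is made up.

```python
def conv1d(signal, kernel, bias=0.0):
    """Valid-mode 1D cross-correlation, the operation a 1D convolution layer applies."""
    k = len(kernel)
    return [
        sum(signal[i + j] * kernel[j] for j in range(k)) + bias
        for i in range(len(signal) - k + 1)
    ]


def relu(values):
    """Zero out negative activations."""
    return [max(0.0, v) for v in values]


def max_pool1d(values, size=2):
    """Non-overlapping max pooling; a trailing partial window is discarded."""
    return [max(values[i:i + size]) for i in range(0, len(values) - size + 1, size)]


# Hypothetical normalized flow features and a hand-picked "rising edge" kernel.
flow_features = [0.0, 1.0, 0.25, 0.75, 0.5, 0.5]
edge_kernel = [-1.0, 1.0]

feature_map = max_pool1d(relu(conv1d(flow_features, edge_kernel)))
print(feature_map)
```

The kernel responds where adjacent feature values increase sharply, so the pooled output summarizes where such jumps occur. Stacking several learned kernels and layers is what allows a 1D-CNN to pick up more complex patterns.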

To improve overall performance, the outputs of the three models are combined using a weighted ensemble decision model:

S(X) = w_1 P_{RF}(X) + w_2 P_{CNN}(X) + w_3 P_{OCSVM}(X)

where the weights are assigned according to the validation performance of each model. This ensemble approach enhances detection accuracy, reduces classification errors, and improves robustness against diverse attack types.
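
A minimal sketch of the weighted ensemble score follows. The paper states only that the weights follow each model's validation performance, so normalizing validation scores to sum to one is an assumption, as are the example numbers. OCSVM produces a signed decision value rather than a probability, so the sketch also assumes a logistic mapping in which negative decision values indicate outliers. All function names are hypothetical.

```python
import math


def weights_from_validation(scores):
    """Turn per-model validation scores into weights that sum to 1 (assumed scheme)."""
    total = sum(scores.values())
    return {name: score / total for name, score in scores.items()}


def anomaly_probability(decision_value):
    """Map a signed OCSVM decision value to (0, 1).

    Assumes negative values mean "outlier", so more negative values
    give a probability closer to 1.
    """
    return 1.0 / (1.0 + math.exp(decision_value))


def ensemble_score(probabilities, weights):
    """S(X) = w1 * P_RF(X) + w2 * P_CNN(X) + w3 * P_OCSVM(X)."""
    return sum(weights[name] * probabilities[name] for name in weights)


# Hypothetical validation F1-scores for the three detectors.
weights = weights_from_validation({"rf": 0.98, "cnn": 0.99, "ocsvm": 0.93})

# Hypothetical per-model outputs for one traffic record.
probabilities = {
    "rf": 0.91,
    "cnn": 0.88,
    "ocsvm": anomaly_probability(-1.2),
}
print(round(ensemble_score(probabilities, weights), 3))
```

Because the weights sum to one and every input lies between 0 and 1, the resulting score also lies between 0 and 1 and can be compared directly against a decision threshold.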

The framework further employs an adaptive threshold-based classification mechanism. An initial threshold value of 0.60 is selected to balance detection accuracy and false alarm rates. Traffic is classified as malicious when the ensemble threat score exceeds the threshold. Beyond detection, the system includes an adaptive mitigation model that automatically determines appropriate responses based on threat severity. High-risk threats may trigger IP blocking or node isolation, medium-risk threats generate alerts, and low-risk activities are monitored for further analysis.
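
The classification and mitigation logic described above can be sketched as follows. Only the initial threshold of 0.60 and the three response types come from the text. The high-risk cutoff, the mapping of score bands to severity levels, and the threshold update rule are assumptions introduced for illustration, since the paper does not give them.

```python
INITIAL_THRESHOLD = 0.60  # initial value stated in the text
HIGH_RISK_CUTOFF = 0.85   # hypothetical boundary between medium and high risk


def classify(score, threshold=INITIAL_THRESHOLD):
    """Label traffic as malicious when the ensemble score exceeds the threshold."""
    return "malicious" if score > threshold else "benign"


def choose_response(score, threshold=INITIAL_THRESHOLD, high_cutoff=HIGH_RISK_CUTOFF):
    """Pick a response by severity band (band boundaries are assumptions)."""
    if score > high_cutoff:
        return "block_ip_or_isolate_node"  # high risk
    if score > threshold:
        return "raise_alert"               # medium risk
    return "monitor"                       # low risk


def adapt_threshold(threshold, false_positive_rate, target_fpr=0.01,
                    step=0.01, lower=0.40, upper=0.80):
    """Hypothetical update rule: raise the threshold when false alarms exceed
    the target rate, lower it otherwise, and clamp it to [lower, upper]."""
    if false_positive_rate > target_fpr:
        threshold += step
    else:
        threshold -= step
    return min(upper, max(lower, threshold))


for score in (0.30, 0.70, 0.92):
    print(score, classify(score), choose_response(score))
```

Keeping the upper clamp below the high-risk cutoff ensures that any score treated as high risk is also classified as malicious, whatever the current threshold is.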

Conclusion

This study presents a cybersecurity architecture that combines big data processing, machine learning, deep learning, and anomaly detection. In particular, the proposed approach integrates streaming techniques and hybrid AI into a single framework for real-time processing and analysis of cybersecurity threats. The architecture is intended as a flexible and scalable foundation for building more sophisticated solutions with modern AI technologies, rather than as a replacement for intrusion detection systems. The experimental results show that the proposed method achieves high accuracy, precision, recall, and F1-score with very few misclassifications. The hybrid approach can improve the identification of both known and previously unseen cyberthreats. Moreover, the adaptive response mechanism improves reaction to threats by quickly containing the effects of an attack. Despite these benefits, the proposed approach has limitations. First, the results reported in this study are derived from generated traffic rather than actual deployments in enterprise networks. Second, the design does not support federated or resource-constrained environments. Although the design has produced encouraging experimental results, it must be deployed in practice and subjected to further performance testing against dynamic cyber threats. To provide better interpretability and intelligence in detection, future research will focus on combining Explainable AI (XAI) with advanced deep learning techniques. The system will also be integrated with edge computing and Internet-of-Things-based cybersecurity technologies so that it can operate effectively in decentralized and resource-constrained contexts.

Copyright

Copyright © 2026 Gyanendra Kumar Gautam, Dr. Uday Pratap Singh. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

Paper Id : IJRASET83364

Publish Date : 2026-06-02

ISSN : 2321-9653

Publisher Name : IJRASET
