Distributed network attacks are usually referred to as Distributed Denial of Service (DDoS) attacks. These attacks exploit the capacity limits that apply to any network resource, such as the infrastructure behind an organization's website. Existing research has worked with the older KDD dataset, so a more recent dataset is needed to reflect the current state of DDoS attacks. This paper uses a machine learning approach to classify and predict DDoS attack types, applying the Random Forest and XGBoost classification algorithms. To evaluate the approach, we propose a complete framework for DDoS attack prediction. For the proposed work, the UNSW-NB15 dataset was obtained from a GitHub repository, and the experiments were implemented in Python. After applying the machine learning models, we generated a confusion matrix to assess each model's performance. In the first classification, Precision (PR) and Recall (RE) are both 89% for the Random Forest algorithm, and the average Accuracy (AC) of the proposed model is 89%. In the second classification, Precision (PR) and Recall (RE) are both approximately 90% for the XGBoost algorithm, and the average Accuracy (AC) of the proposed model is 90%. Compared with existing research works, which reported accuracies of approximately 85% and 79%, respectively, the proposed models improve attack detection accuracy.
Introduction
Objective:
The study aims to enhance DDoS (Distributed Denial of Service) attack detection and prediction using machine learning (ML). It provides a comparative analysis of Random Forest, XGBoost, and Naive Bayes, with an emphasis on Random Forest for its speed and accuracy.
Key Challenges with DDoS Attacks:
Overwhelm network services, leading to downtime and losses.
Traditional security tools (firewalls, IDS) are often ineffective.
Data Source: UNSW-NB15 dataset (includes detailed network traffic and attack labels).
Preprocessing:
Data cleaning, normalization, and feature selection.
Label encoding and visual inspection (heatmaps, class distribution).
Data split into training/testing sets.
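The preprocessing steps listed above can be sketched in a few lines of plain Python. This is a minimal illustration, not the authors' pipeline: the helper names (drop_incomplete, encode_labels, min_max_normalize, train_test_split), the three toy feature columns, and the 80/20 split ratio are all assumptions made for the example, and the records are invented rather than taken from UNSW-NB15.

```python
import random

def drop_incomplete(rows, labels):
    """Data cleaning: discard records that contain a missing (None) value."""
    kept = [(row, label) for row, label in zip(rows, labels) if None not in row]
    return [row for row, _ in kept], [label for _, label in kept]

def encode_labels(labels):
    """Label encoding: map each distinct class name to an integer (sorted order)."""
    classes = sorted(set(labels))
    index = {name: i for i, name in enumerate(classes)}
    return [index[name] for name in labels], classes

def min_max_normalize(rows):
    """Normalization: scale each feature column to [0, 1]; constant columns become 0.0."""
    columns = list(zip(*rows))
    lows = [min(col) for col in columns]
    highs = [max(col) for col in columns]
    return [
        [(v - lo) / (hi - lo) if hi > lo else 0.0 for v, lo, hi in zip(row, lows, highs)]
        for row in rows
    ]

def train_test_split(rows, labels, test_fraction=0.2, seed=0):
    """Shuffle record indices with a fixed seed, then split into train and test parts."""
    order = list(range(len(rows)))
    random.Random(seed).shuffle(order)
    n_test = int(round(len(rows) * test_fraction))
    test_idx, train_idx = order[:n_test], order[n_test:]
    return ([rows[i] for i in train_idx], [rows[i] for i in test_idx],
            [labels[i] for i in train_idx], [labels[i] for i in test_idx])

# Hypothetical toy records: (duration, source bytes, destination bytes).
features = [
    [0.5, 200.0, 50.0],
    [1.5, None, 10.0],      # incomplete record, removed during cleaning
    [2.5, 1200.0, 0.0],
    [0.5, 700.0, 150.0],
    [4.5, 200.0, 100.0],
    [1.0, 450.0, 25.0],
]
raw_labels = ["Normal", "DoS", "DoS", "Normal", "DoS", "Normal"]

clean_rows, clean_labels = drop_incomplete(features, raw_labels)
encoded, classes = encode_labels(clean_labels)
scaled = min_max_normalize(clean_rows)
X_train, X_test, y_train, y_test = train_test_split(scaled, encoded)
print(classes, len(X_train), len(X_test))
```

The sketch normalizes before splitting to mirror the order of the list above. In practice the scaling bounds are often computed on the training split only, so that no information from the test records influences training.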
Models Evaluated:
Random Forest: Efficient ensemble method, handles classification well.
XGBoost: Fast, scalable, and accurate gradient-boosting model.
Naive Bayes: Simpler probabilistic model used for baseline comparison.
Proposed System Workflow:
Preprocess network data (remove noise, encode labels).
Train Random Forest model on selected features.
Evaluate model using metrics like Accuracy, Precision, Recall, and F1 Score.
Compare performance with XGBoost and Naive Bayes.
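The workflow above might look roughly like the following with scikit-learn. Several things here are assumptions rather than details from the study: the data is synthetic (generated with make_classification, not UNSW-NB15), the hyperparameters are library defaults, the metrics use weighted averaging, and scikit-learn's GradientBoostingClassifier is used as a stand-in for XGBoost so that the sketch needs only one library. The printed scores therefore say nothing about the results reported in this paper.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier, RandomForestClassifier
from sklearn.metrics import accuracy_score, f1_score, precision_score, recall_score
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import GaussianNB

# Synthetic stand-in for preprocessed traffic features (NOT UNSW-NB15 data).
X, y = make_classification(n_samples=600, n_features=12, n_informative=6,
                           random_state=42)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=42)

# The three model families compared in the text; gradient boosting from
# scikit-learn replaces XGBoost here purely to avoid an extra dependency.
models = {
    "Random Forest": RandomForestClassifier(n_estimators=100, random_state=42),
    "Gradient boosting (stand-in for XGBoost)": GradientBoostingClassifier(random_state=42),
    "Naive Bayes": GaussianNB(),
}

scores = {}
for name, model in models.items():
    model.fit(X_train, y_train)
    predicted = model.predict(X_test)
    scores[name] = {
        "accuracy": accuracy_score(y_test, predicted),
        "precision": precision_score(y_test, predicted, average="weighted"),
        "recall": recall_score(y_test, predicted, average="weighted"),
        "f1": f1_score(y_test, predicted, average="weighted"),
    }

for name, result in scores.items():
    print(name, {metric: round(value, 3) for metric, value in result.items()})
```

With weighted averaging, recall is mathematically equal to accuracy, which is one possible reason a paper might report identical values for the two. Whether the study used this averaging scheme is not stated.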
Results:
Random Forest:
Accuracy, Precision, Recall: ~89%
Strong performance, especially in classification tasks.
XGBoost:
Slightly higher metrics (~90%) and faster execution.
Outperformed all other models in accuracy and processing time.
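Accuracy, Precision, and Recall figures such as those above are derived from a confusion matrix. The following sketch shows the standard calculation. The function name metrics_from_confusion and the counts in the example matrix are hypothetical, chosen only to make the arithmetic easy to follow; they are not the study's results.

```python
def metrics_from_confusion(matrix):
    """Compute accuracy and per-class precision/recall/F1.

    matrix[i][j] is the number of samples whose true class is i
    and whose predicted class is j.
    """
    n_classes = len(matrix)
    total = sum(sum(row) for row in matrix)
    correct = sum(matrix[k][k] for k in range(n_classes))
    per_class = []
    for k in range(n_classes):
        true_positive = matrix[k][k]
        predicted_as_k = sum(matrix[i][k] for i in range(n_classes))  # column sum
        actually_k = sum(matrix[k])                                   # row sum
        precision = true_positive / predicted_as_k if predicted_as_k else 0.0
        recall = true_positive / actually_k if actually_k else 0.0
        denom = precision + recall
        f1 = 2 * precision * recall / denom if denom else 0.0
        per_class.append({"precision": precision, "recall": recall, "f1": f1})
    return {"accuracy": correct / total, "per_class": per_class}

# Hypothetical counts: rows are true classes, columns are predictions,
# in the order (normal, attack).
example = [
    [40, 10],
    [6, 44],
]
report = metrics_from_confusion(example)
print(round(report["accuracy"], 3))
for label, values in zip(["normal", "attack"], report["per_class"]):
    print(label, {name: round(v, 3) for name, v in values.items()})
```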
Comparison with Prior Work:
CNN: 79% accuracy
LSTM with KDD dataset: 85%
Proposed models (RF & XGBoost) showed superior performance (89–90%) using UNSW-NB15.
Conclusion
In this research, we presented a comprehensive, systematic approach for predicting DDoS attacks. First, we chose the UNSW-NB15 dataset, which includes information about DDoS attacks. After data normalisation, we applied the proposed supervised machine learning approach, using the Random Forest and XGBoost classification algorithms to derive prediction and classification results. Through experimental evaluation and literature review, we demonstrated the effectiveness of Random Forest in detecting DDoS threats. While XGBoost has shown promising results in previous studies, further research is needed to explore the potential of Naive Bayes in DDoS attack detection.