The rapid growth of technology not only makes life easier but also exposes many security issues. As the Internet has advanced over the years, the number of attacks carried out over it has increased. An Intrusion Detection System (IDS) is one of the supporting layers of information security. An IDS provides a healthy environment for business and keeps suspicious network activity away. Recently, Machine Learning (ML) algorithms have been applied in IDS to identify and classify security threats. This paper presents a comparative study of various ML algorithms used in IDS for several applications such as fog computing, Internet of Things (IoT), big data, smart city, and 5G networks. In addition, this work aims to classify intrusions using ML algorithms such as Linear Discriminant Analysis (LDA), Classification and Regression Trees (CART), and Random Forest.
Introduction
The report highlights the rapid growth of internet data (over 1,100% growth from 2000-2019) and the corresponding rise in hacking tools, creating an urgent need for effective information security and intrusion detection. Traditional detection systems struggle with the volume, velocity, and complexity of big data, necessitating advanced big data techniques and machine learning (ML) algorithms to efficiently identify cyber-attacks.
Intrusion Detection Systems (IDS) monitor network traffic to detect suspicious activities and alert administrators. IDS detection methods include anomaly-based (behavioral deviations), signature-based (known attack patterns), and hybrid systems combining both.
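The three detection methods can be illustrated with a minimal sketch. The signature list, the choice of packets per second as the monitored feature, and the z-score threshold of 3.0 are all hypothetical values picked for illustration; they do not come from the paper or from any particular IDS product.

```python
from statistics import mean, pstdev
from typing import Sequence

# Hypothetical signature list: substrings that mark known attack payloads.
KNOWN_SIGNATURES = ("' OR 1=1", "../../etc/passwd")


def signature_match(payload: str) -> bool:
    """Signature-based check: flag payloads containing a known attack pattern."""
    return any(sig in payload for sig in KNOWN_SIGNATURES)


def anomaly_score(value: float, baseline: Sequence[float]) -> float:
    """Anomaly-based check: z-score of a traffic feature against normal history."""
    mu = mean(baseline)
    sigma = pstdev(baseline)
    if sigma == 0:
        return 0.0 if value == mu else float("inf")
    return abs(value - mu) / sigma


def hybrid_alert(payload: str, packets_per_sec: float,
                 baseline: Sequence[float], threshold: float = 3.0) -> bool:
    """Hybrid check: raise an alert if either detector fires."""
    return (signature_match(payload)
            or anomaly_score(packets_per_sec, baseline) > threshold)


# Hypothetical history of normal traffic rates (packets per second).
baseline_rates = [98.0, 102.0, 100.0, 101.0, 99.0]

print(hybrid_alert("GET /index.html", 100.0, baseline_rates))    # normal request
print(hybrid_alert("GET /?id=' OR 1=1", 100.0, baseline_rates))  # known pattern
print(hybrid_alert("GET /index.html", 500.0, baseline_rates))    # traffic spike
```

The signature check can only catch patterns already on its list, while the anomaly check can flag previously unseen behavior at the cost of possible false alarms. Combining the two is the motivation for hybrid systems.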
Machine learning plays a critical role in IDS, with three main types: supervised, unsupervised, and semi-supervised learning. Supervised methods (e.g., Support Vector Machine, Logistic Regression, Random Forest) use labeled data to classify attacks, while unsupervised methods (e.g., K-Means clustering, PCA) identify patterns in unlabeled data. Semi-supervised learning combines both labeled and unlabeled data to improve detection accuracy.
The literature review covers IDS applications in diverse environments such as Internet of Things (IoT), smart cities, big data, fog computing, and mobile networks, highlighting various ML approaches to detect threats like DDoS attacks, unauthorized access, and malware.
The research implements three ML algorithms—Linear Discriminant Analysis (LDA), Classification and Regression Tree (CART), and Random Forest (RF)—on the KDD’99 dataset, a benchmark intrusion detection dataset. Results show that the Random Forest algorithm outperforms others with 99.65% accuracy, confirming that ML techniques, particularly ensemble methods like RF, are effective for intrusion detection. Performance depends on dataset size and application context.
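A comparison of this kind could be set up roughly as follows. This is a sketch under stated assumptions, not the authors' pipeline: it uses scikit-learn, it substitutes a synthetic dataset from `make_classification` for KDD'99 so that it runs offline, and the split ratio and hyperparameters are arbitrary. The accuracies it prints therefore say nothing about the 99.65% figure reported above.

```python
from sklearn.datasets import make_classification
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

# Synthetic stand-in for labelled connection records (NOT KDD'99 data).
X, y = make_classification(
    n_samples=2000, n_features=20, n_informative=8, random_state=0
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=0, stratify=y
)

# The three classifiers named in the text; hyperparameters are assumptions.
models = {
    "LDA": LinearDiscriminantAnalysis(),
    # scikit-learn's decision trees implement an optimised CART variant.
    "CART": DecisionTreeClassifier(random_state=0),
    "RF": RandomForestClassifier(n_estimators=100, random_state=0),
}

scores = {}
for name, model in models.items():
    model.fit(X_train, y_train)
    scores[name] = accuracy_score(y_test, model.predict(X_test))
    print(f"{name}: test accuracy = {scores[name]:.4f}")
```

To run the same loop on the real benchmark, the synthetic `X, y` would be replaced by the preprocessed KDD'99 records, with categorical fields such as protocol type encoded numerically first.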
Conclusion
This paper provides an extensive review of network intrusion detection mechanisms based on ML and DL methods, to give new researchers updated knowledge of recent trends and progress in the field. A systematic approach is adopted for selecting the relevant articles in the field of AI-based NIDS. First, the concept of IDS and its different classification schemes are elaborated on the basis of the reviewed articles. Then the methodology of each article is discussed, and the strengths and weaknesses of each are highlighted in terms of intrusion detection capability and model complexity. Based on this study, the recent trend is to use DL-based methodologies to improve the performance and effectiveness of NIDS in terms of detection accuracy and reduction in FAR.
About 80% of the proposed solutions were based on DL approaches, with AE and DNN the most frequently used algorithms. DL schemes perform much better than ML-based methods in terms of their ability to learn features by themselves and their stronger model-fitting abilities, but they are quite complex and require extensive computing resources in terms of processing power and storage. These challenges need to be addressed to meet the real-time requirements of NIDS and hence improve NIDS performance. The study also shows that 60% of the proposed methodologies were tested using the KDD Cup'99 and NSL-KDD datasets, mainly because extensive results are available for these datasets. However, these datasets are too old to represent modern network attacks, which limits the performance of the proposed methodologies in real-time environments.
For AI-based NIDS methods, the model should be tested with a recent dataset such as CSE-CIC-IDS2018 for better performance in terms of intrusion detection accuracy. This article also highlights the research gaps in improving model performance for low-frequency attacks in a real-world environment and in finding efficient solutions that reduce the complexity of the proposed models. Proposing an efficient NIDS framework that uses less complex DL algorithms and has an effective detection mechanism is a potential direction for future research in this area. In future work, we will use this knowledge to design a novel, lightweight, and efficient DL-based NIDS that effectively detects intruders within the network.
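The two metrics this conclusion relies on, detection accuracy and FAR, can be computed from a binary confusion matrix. The sketch below assumes that FAR stands for false alarm rate, defined as FP / (FP + TN), which is the usual meaning in the IDS literature but is not spelled out in the text. The function name and the counts are hypothetical.

```python
def detection_metrics(tp: int, tn: int, fp: int, fn: int) -> dict:
    """Compute IDS evaluation metrics from binary confusion-matrix counts.

    tp: attacks flagged as attacks     fn: attacks missed
    tn: normal traffic passed          fp: normal traffic flagged (false alarms)
    """
    total = tp + tn + fp + fn
    return {
        "accuracy": (tp + tn) / total,
        # Share of real attacks that were caught (also called recall).
        "detection_rate": tp / (tp + fn) if (tp + fn) else 0.0,
        # Assumed definition of FAR: share of normal traffic wrongly flagged.
        "far": fp / (fp + tn) if (fp + tn) else 0.0,
    }


# Hypothetical counts: 100 attacks among 1,000 records.
metrics = detection_metrics(tp=90, tn=890, fp=10, fn=10)
for name, value in metrics.items():
    print(f"{name}: {value:.4f}")
```

Accuracy alone can look high when attacks are rare, which is why the detection rate and FAR are worth reporting alongside it, especially for the low-frequency attacks mentioned above.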