Real-Time Network Intrusion Detection and Active Honeypot Deployment Using Machine Learning on KDD-Based Traffic Classification

Authors: Yenumula Sri Kanyaka Parameswari, Ms. Kuraganti Sudeepa Kumari, Guntaka Venkata Maheswar Reddy, Sk. Aarifa, Moripalli Lingeswararao

DOI Link: https://doi.org/10.22214/ijraset.2026.81864

Abstract

Keeping a network safe these days is a lot harder than it used to be — attackers are smarter, faster, and more patient. This paper describes a hybrid intrusion detection system built to tackle that reality head-on. The system combines a supervised machine learning classifier trained on the NSL-KDD dataset with a real-time Docker-based honeypot that automatically deploys when something suspicious is spotted. Traffic is classified into five categories: Normal, Denial of Service (DoS), Probe, Remote-to-Local (R2L), and User-to-Root (U2R). Rather than relying on a single model, a voting ensemble of Logistic Regression, Decision Tree, and Support Vector Machine classifiers is used, achieving around 80.5% accuracy. The honeypot — powered by Cowrie inside a Docker container — lures attackers into a safe sandbox where their every move gets logged. Experimental results show the system can catch a broad range of attacks while also building an intelligence trail from real attacker behavior.

Introduction

This research presents a hybrid Intrusion Detection and Response System (IDRS) that combines machine learning-based network attack detection with automated honeypot deployment to improve cybersecurity. Traditional signature-based intrusion detection systems struggle against new and evolving threats, while machine learning can identify attacks by analyzing network traffic patterns. The study uses the NSL-KDD dataset, a standard benchmark for intrusion detection research, and develops a voting ensemble classifier that achieves approximately 80.5% accuracy in classifying five categories of network traffic.

The proposed system operates through a six-stage pipeline. Scapy captures live network packets and extracts key traffic features, which are processed by a machine learning model. A hybrid detection mechanism combines rule-based checks for obvious attacks (such as SYN floods) with machine learning predictions for more complex threats. When malicious activity is detected, the system automatically launches a Cowrie honeypot inside a Docker container and redirects the attacker’s traffic to it, enabling security teams to observe attacker behavior and gather intelligence without risking production systems.
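The hybrid decision and the honeypot launch described above can be sketched roughly as follows. This is a minimal illustration and not the authors' implementation: the `Packet` stand-in, the `SYN_FLOOD_THRESHOLD` value, and the function names are hypothetical, and the command assumes the public `cowrie/cowrie` image with its SSH listener on port 2222. The command is only built, never executed.

```python
from collections import Counter
from dataclasses import dataclass

# Hypothetical threshold: bare SYN packets from one source in a capture window.
SYN_FLOOD_THRESHOLD = 100


@dataclass
class Packet:
    """Minimal stand-in for the fields a live capture would provide."""
    src_ip: str
    tcp_flags: str  # "S" for SYN, "SA" for SYN-ACK, "A" for ACK


def syn_flood_sources(window, threshold=SYN_FLOOD_THRESHOLD):
    """Rule-based check: sources that sent at least `threshold` bare SYNs."""
    syn_counts = Counter(p.src_ip for p in window if p.tcp_flags == "S")
    return {ip for ip, count in syn_counts.items() if count >= threshold}


def classify_source(src_ip, window, ml_predict):
    """Hybrid decision: the rule is checked first, otherwise the model decides.

    `ml_predict` is any callable that maps a list of packets to a class label.
    """
    if src_ip in syn_flood_sources(window):
        return "DoS"
    return ml_predict([p for p in window if p.src_ip == src_ip])


def cowrie_run_command(container_name, host_port):
    """Build (but do not run) a `docker run` command for a Cowrie container.

    Assumption: the `cowrie/cowrie` image listens for SSH on port 2222.
    """
    return [
        "docker", "run", "--detach", "--rm",
        "--name", container_name,
        "--publish", f"{host_port}:2222",
        "cowrie/cowrie",
    ]


# Toy capture window: one noisy source and one quiet source.
window = [Packet("10.0.0.5", "S")] * 150 + [Packet("10.0.0.9", "A")] * 3
verdict = classify_source("10.0.0.5", window, ml_predict=lambda pkts: "Normal")
command = cowrie_run_command("honeypot-1", 2223) if verdict != "Normal" else None
```

Checking the cheap rule before the model means an obvious flood is flagged without running inference. Redirecting the attacker's traffic to the published port is a separate step that is not shown here.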

Feature engineering played a crucial role in improving efficiency and performance. From over 120 encoded features, the researchers selected the 15 most informative features using mutual information analysis, recursive feature elimination, and variance inflation factor analysis. This reduced computational overhead and improved real-time performance. The ensemble classifier, consisting of Logistic Regression, Decision Tree, and Support Vector Machine (SVM) models, outperformed individual classifiers by leveraging soft-voting techniques.
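Two of the ideas in this paragraph, ranking features by mutual information and combining models by soft voting, can be illustrated with a small standard-library sketch. The data and probability vectors below are made up for illustration and are not taken from NSL-KDD or from the paper's results. The helper names are hypothetical, and recursive feature elimination and variance inflation factor analysis are not shown.

```python
from collections import Counter
from math import log2

CLASSES = ["Normal", "DoS", "Probe", "R2L", "U2R"]


def mutual_information(feature, labels):
    """Mutual information (in bits) between a discrete feature and the labels."""
    n = len(labels)
    count_x, count_y = Counter(feature), Counter(labels)
    count_xy = Counter(zip(feature, labels))
    return sum(
        (c / n) * log2(c * n / (count_x[x] * count_y[y]))
        for (x, y), c in count_xy.items()
    )


def top_k_features(columns, labels, k):
    """Rank named feature columns by mutual information and keep the best k."""
    ranked = sorted(
        columns,
        key=lambda name: mutual_information(columns[name], labels),
        reverse=True,
    )
    return ranked[:k]


def soft_vote(probabilities):
    """Average the models' class-probability vectors and pick the top class."""
    averaged = [sum(column) / len(probabilities) for column in zip(*probabilities)]
    best = max(range(len(averaged)), key=averaged.__getitem__)
    return CLASSES[best], averaged


# Toy data, made up for illustration.
labels = ["DoS", "DoS", "Normal", "Normal"]
columns = {
    "flag": ["S0", "S0", "SF", "SF"],               # separates the labels
    "protocol_type": ["tcp", "udp", "tcp", "udp"],  # carries no label information
}
selected = top_k_features(columns, labels, k=1)

# Made-up probability outputs of the three base models for a single flow.
logistic_regression = [0.50, 0.40, 0.05, 0.03, 0.02]
decision_tree = [0.00, 1.00, 0.00, 0.00, 0.00]
svm = [0.45, 0.35, 0.10, 0.05, 0.05]
label, averaged = soft_vote([logistic_regression, decision_tree, svm])
```

In this toy flow, two of the three models rank Normal first, so a majority (hard) vote would return Normal. Averaging the probabilities instead gives DoS about 0.58 against about 0.32 for Normal, so one confident model can outweigh two uncertain ones. If scikit-learn were used, the equivalent would be `VotingClassifier(voting="soft")`, with the SVM built as `SVC(probability=True)` so that it exposes class probabilities.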

Experimental results showed that the voting ensemble achieved 80.48% accuracy, with strong detection of Normal and Denial-of-Service (DoS) traffic. However, detection of rare attack classes such as User-to-Root (U2R) and Remote-to-Local (R2L) remained challenging due to severe class imbalance in the NSL-KDD dataset. The system also demonstrated effective real-time operation, with honeypots successfully deployed in 97.9% of attack scenarios and an average container launch time of 1.8 seconds.
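The effect of class imbalance on a headline accuracy figure can be seen in a toy calculation. The labels below are invented for illustration and do not reproduce the paper's confusion matrix. The helper name is hypothetical.

```python
from collections import Counter


def accuracy_and_recall(y_true, y_pred):
    """Return overall accuracy and a per-class recall dictionary."""
    correct = Counter(t for t, p in zip(y_true, y_pred) if t == p)
    support = Counter(y_true)
    accuracy = sum(correct.values()) / len(y_true)
    recall = {label: correct[label] / support[label] for label in support}
    return accuracy, recall


# Toy labels: 95 Normal flows and 5 U2R flows, scored against a model
# that predicts "Normal" for every flow.
y_true = ["Normal"] * 95 + ["U2R"] * 5
y_pred = ["Normal"] * 100
accuracy, recall = accuracy_and_recall(y_true, y_pred)
```

Here the overall accuracy is 95% even though recall on the rare U2R class is zero, which is why per-class recall is worth reporting alongside accuracy for imbalanced datasets such as NSL-KDD.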

During a 72-hour evaluation involving 243 simulated attacks, the honeypot collected valuable intelligence. Most attackers performed reconnaissance activities such as checking system information and network configurations, while many attempted to download additional tools or establish persistence mechanisms. The honeypot remained fully isolated, preventing any security breaches while providing detailed logs of attacker behavior.

Conclusion

This paper set out to build an IDS that doesn't just detect attacks but acts on that detection, and the result is a system that catches malicious traffic and immediately redirects the attacker into a honeypot. The voting ensemble classifier delivers 80.48% accuracy across five traffic classes on NSL-KDD, with particularly strong results for Normal and DoS traffic. The Cowrie-based honeypot deploys automatically within seconds and runs safely in isolation, capturing detailed attacker behavior without any risk to production systems. The real value isn't just the accuracy number; it's the intelligence the honeypot collects. Every attacker session is a data point about real techniques, real tools, and real patterns that can feed back into improving the detector over time. That feedback loop is what makes this a meaningful step toward a more adaptive security architecture, rather than just another classifier evaluated on a benchmark. The path forward involves testing on more current datasets, adding temporal modeling for better multi-step attack detection, and hardening the honeypot orchestration layer for production-grade reliability. There's plenty left to do, but the foundation is solid.

Copyright

Copyright © 2026 Yenumula Sri Kanyaka Parameswari, Ms. Kuraganti Sudeepa Kumari, Guntaka Venkata Maheswar Reddy, Sk. Aarifa, Moripalli Lingeswararao. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

Paper Id: IJRASET81864

Publish Date: 2026-05-03

ISSN: 2321-9653

Publisher Name: IJRASET
