Signature-based intrusion detection systems struggle with encrypted traffic and zero-day threats. This paper introduces a hybrid NIDS that combines flow-level analysis with a confidence-gated cascade of classifiers: a weighted average of XGBoost and a 1D-CNN (60:40), with an Isolation Forest invoked only when the ensemble confidence falls below a calculated threshold. Trained on the merged CIC-IDS 2017/2018 datasets (3.49 million flows, 5 classes, 23 behavioral features) with SMOTE-Tomek balancing, the model achieves an accuracy of 99.83% and a false-positive rate of 0.20%, with only 2 false positives among zero-days, outperforming the alternative cascade designs (IF-first, AE-first). Validating the same architecture on 1.83 million honeypot flows reveals reduced performance for both the supervised and unsupervised components due to hypervisor-induced bias, caused by the lack of realistic network latency and routing in VM-to-host communication. Results at a 10% level of flow injection show that, while overall performance remains good (81%), the F1-score is notably lower for attacks with specific timing characteristics. Per-flow explainability based on SHAP values is incorporated into the design, and we propose two deployment modes: supervised-only, and cascaded for zero-day experiments.
Introduction
This paper presents a flow-based hybrid Network Intrusion Detection System (NIDS) designed to improve the detection of both known and zero-day cyberattacks. Traditional signature-based IDSs are effective only against known threats and struggle with encrypted network traffic. To overcome these limitations, the proposed system analyzes network flow statistics rather than packet contents and combines supervised and unsupervised machine learning techniques.
The proposed architecture consists of a three-level confidence-gated decision cascade. First, XGBoost and a 1D Convolutional Neural Network (1D-CNN) independently classify network flows, and their predictions are combined using weighted probability averaging (60% XGBoost and 40% CNN). If the prediction confidence is above a threshold (τ = 0.50), the flow is classified as either BLOCKED (attack) or PASS (normal). Low-confidence flows are further analyzed by an Isolation Forest, which identifies potential zero-day attacks and generates an ALERT.
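The decision logic described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the system's implementation: confidence is taken to be the largest fused class probability, class index 0 is assumed to be benign, and a low-confidence flow that the anomaly detector does not flag is assumed to PASS (the text does not specify these details). The names `fuse`, `decide`, and `is_anomalous` are hypothetical; `is_anomalous` stands in for an Isolation Forest check.

```python
from typing import Callable, List, Sequence

W_XGB, W_CNN = 0.6, 0.4  # ensemble weights stated in the text (60:40)
TAU = 0.50               # confidence threshold stated in the text
BENIGN = 0               # assumption: class index 0 is the benign class


def fuse(p_xgb: Sequence[float], p_cnn: Sequence[float]) -> List[float]:
    """Weighted average of the two per-class probability vectors."""
    return [W_XGB * a + W_CNN * b for a, b in zip(p_xgb, p_cnn)]


def decide(
    p_xgb: Sequence[float],
    p_cnn: Sequence[float],
    is_anomalous: Callable[[], bool],
    tau: float = TAU,
) -> str:
    """Return PASS, BLOCKED, or ALERT for a single flow."""
    probs = fuse(p_xgb, p_cnn)
    confidence = max(probs)          # assumption: confidence = top probability
    label = probs.index(confidence)
    if confidence > tau:
        # Confident supervised decision: the anomaly detector is not consulted.
        return "PASS" if label == BENIGN else "BLOCKED"
    # Low confidence: only now is the (stand-in) anomaly detector invoked.
    # Assumption: flows it does not flag are passed.
    return "ALERT" if is_anomalous() else "PASS"
```

Passing the anomaly check as a callable keeps the gate explicit: the unsupervised stage runs only for flows on which the ensemble is uncertain, for example `decide([0.3, 0.3, 0.2, 0.1, 0.1], [0.3, 0.3, 0.2, 0.1, 0.1], lambda: True)` reaches the third stage, whereas a flow with a dominant class probability never does.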
The system uses 23 selected network flow features extracted with CICFlowMeter and applies preprocessing that includes logarithmic transformation, normalization, and feature scaling. The models were trained on 3.49 million network flows from the CIC-IDS 2017 and 2018 datasets, with class imbalance addressed using SMOTE-Tomek sampling. Validation was also performed on 1.83 million honeypot-generated network flows to evaluate real-world performance.
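As an illustration of the preprocessing step, the sketch below applies a logarithmic transform followed by z-score standardization to a single feature column. The exact transforms and their order are not given in the text, so this is one plausible reading rather than the pipeline actually used; the function names are hypothetical, and clamping negative values to zero is an assumption made so that the logarithm is defined.

```python
import math
from statistics import mean, pstdev
from typing import List, Sequence, Tuple


def log_transform(column: Sequence[float]) -> List[float]:
    """Compress heavy-tailed values with log(1 + x).

    Assumption: negative values are clamped to zero before the transform.
    """
    return [math.log1p(max(v, 0.0)) for v in column]


def fit_scaler(column: Sequence[float]) -> Tuple[float, float]:
    """Estimate mean and standard deviation on the training column only."""
    mu = mean(column)
    sigma = pstdev(column)
    if sigma == 0:
        sigma = 1.0  # constant feature: avoid division by zero
    return mu, sigma


def scale(column: Sequence[float], mu: float, sigma: float) -> List[float]:
    """Standardize a column with previously fitted statistics."""
    return [(v - mu) / sigma for v in column]
```

The statistics returned by `fit_scaler` would be computed on the training flows and then reused unchanged for any validation traffic, so that the held-out data does not influence the scaling.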
Experimental results show that the proposed weighted ensemble achieved 99.83% accuracy and 99.68% Macro-F1 score, outperforming other hybrid architectures. However, evaluation on live honeypot traffic revealed performance degradation caused by distribution differences between benchmark datasets and real-world network traffic, emphasizing the importance of representative training data. Noise robustness testing further demonstrated that while overall accuracy remained stable, detection performance for certain attack classes declined under realistic network disturbances.
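Because accuracy and Macro-F1 are reported side by side, it may help to see how they differ when computed from a confusion matrix. The sketch below is illustrative only: the function names are hypothetical, the false-positive rate is assumed to mean the share of benign flows labeled as any attack class, and the matrix in the usage note is made-up toy data, not a result from this work.

```python
from typing import List

Matrix = List[List[int]]  # cm[true_class][predicted_class]


def accuracy(cm: Matrix) -> float:
    """Fraction of all flows on the diagonal of the confusion matrix."""
    total = sum(sum(row) for row in cm)
    return sum(cm[i][i] for i in range(len(cm))) / total


def macro_f1(cm: Matrix) -> float:
    """Unweighted mean of per-class F1, so rare classes count equally."""
    n = len(cm)
    scores = []
    for k in range(n):
        tp = cm[k][k]
        fp = sum(cm[i][k] for i in range(n)) - tp  # predicted k, true other
        fn = sum(cm[k]) - tp                       # true k, predicted other
        denom = 2 * tp + fp + fn
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / n


def benign_fpr(cm: Matrix, benign: int = 0) -> float:
    """Assumed FPR definition: benign flows predicted as any attack class."""
    row = cm[benign]
    return (sum(row) - row[benign]) / sum(row)
```

For the toy matrix `[[90, 10], [5, 95]]`, `accuracy` gives 0.925 and `benign_fpr` gives 0.10. Macro-F1 averages the per-class scores without weighting by class size, which is why it is the more informative figure when attack classes are rare.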
To improve transparency, the system integrates SHAP explainability, allowing analysts to understand which flow features contributed to each prediction. Overall, the proposed hybrid NIDS provides an effective, scalable, and interpretable solution for detecting both known and unknown network attacks while highlighting the challenges of deploying machine learning-based intrusion detection systems in real-world environments.
Conclusion
The proposed approach implements a confidence gate that triggers anomaly detection only when the ensemble's prediction confidence falls below the threshold. As a result, the model achieves 99.83% accuracy and a 0.20% FPR, outperforming approaches that perform anomaly detection on all flows. Honeypot validation shows that flow-based models need traffic captured under real network conditions to work well, whereas flows generated in VM-to-host configurations are statistically too limited to ensure correct classification. The noise study reveals that, although the model performs relatively well overall, detection of attacks that depend on timing attributes degrades in the presence of jitter. SHAP integration provides insight into the reasoning behind each individual flow's classification.
References
[1] P. Laskov et al., “Learning intrusion detection: Supervised or unsupervised?” Proc. 13th Int. Conf. Image Analysis and Processing, pp. 50–57, 2005.
[2] R. Sommer and V. Paxson, “Outside the closed world: On using machine learning for network intrusion detection,” IEEE S&P, pp. 305–316, 2010.
[3] I. Sharafaldin et al., “Toward generating a new intrusion detection dataset and intrusion traffic characterization,” ICISSP, pp. 108–116, 2018.
[4] T. Chen and C. Guestrin, “XGBoost: A scalable tree boosting system,” ACM SIGKDD, pp. 785–794, 2016.
[5] J. Kim et al., “A 1D-CNN based deep learning approach for intrusion detection system,” IEEE BigComp, pp. 605–608, 2018.
[6] F. T. Liu et al., “Isolation Forest,” IEEE ICDM, pp. 413–422, 2008.
[7] S. M. Lundberg and S.-I. Lee, “A unified approach to interpreting model predictions,” NeurIPS, vol. 30, 2017.
[8] G. E. Batista et al., “A study of the behavior of several methods for balancing ML training data,” ACM SIGKDD Explorations, vol. 6, no. 1, pp. 20–29, 2004.
[9] M. A. Ferrag et al., “Deep learning for cyber security intrusion detection: A survey,” J. Inf. Sec. Appl., vol. 52, 2020.
[10] M. Z. Alom et al., “A deep learning approach for network intrusion detection system,” IEEE CCWC, pp. 248–253, 2019.
[11] J. Franco et al., “A survey of honeypots and honeynets for IoT,” IEEE COMST, vol. 23, no. 4, pp. 2351–2383, 2021.
[12] N. Martins et al., “Adversarial ML applied to intrusion and malware scenarios,” IEEE Access, vol. 8, pp. 35403–35419, 2020.