Software security vulnerabilities remain a leading cause of data breaches and cyberattacks in modern enterprise systems. Traditional static analysis tools rely primarily on rule-based detection techniques and often produce high false-positive rates while failing to detect contextual vulnerabilities. This paper presents an Autonomous Threat-Hunting AI Agent designed to automatically detect security vulnerabilities in source code repositories using a hybrid approach combining pattern-based detection, transformer-based machine learning models, and Retrieval-Augmented Generation (RAG) supported by Common Vulnerabilities and Exposures (CVE) knowledge bases. The system integrates FastAPI for backend processing, Streamlit for visualization, and fine-tuned transformer models for semantic vulnerability classification. Experimental evaluation demonstrates improved detection accuracy, reduced false positives, and real-time scanning capability across multiple programming languages. The proposed system provides automated remediation suggestions and severity scoring, making it suitable for enterprise-scale deployment and integration with CI/CD pipelines.
Introduction
This paper presents an AI-driven framework for automated software vulnerability detection and verification, designed to improve the security of modern software systems.
Software vulnerabilities such as SQL injection, cross-site scripting, insecure credential handling, and command injection are major causes of cyberattacks. Traditional security testing methods rely on manual inspection and rule-based tools, which often produce false positives and fail to capture deeper code semantics. As software systems grow in complexity, more intelligent, automated, and scalable solutions are needed.
To address these limitations, this paper proposes an autonomous LLM-based vulnerability detection system that combines transformer-based language models with rule-based analysis and Retrieval-Augmented Generation (RAG) over CVE knowledge bases. This hybrid approach improves both detection accuracy and explainability.
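As a minimal sketch of the hybrid idea, a pattern-based pass can be unioned with a model-based pass. The rule table, pattern names, and the stubbed semantic classifier below are assumptions for exposition, not the system's actual implementation:

```python
import re

# Hypothetical rule table: regexes for two common vulnerability classes.
# Pattern names and expressions are illustrative, not the paper's rule set.
RULES = {
    "sql_injection": re.compile(r"execute\(\s*[\"'].*%s.*[\"']\s*%"),
    "hardcoded_secret": re.compile(r"(password|api_key)\s*=\s*[\"'][^\"']+[\"']", re.I),
}

def rule_based_findings(source: str) -> list[dict]:
    """Pattern-based pass: flag lines matching any known-bad regex."""
    findings = []
    for lineno, line in enumerate(source.splitlines(), start=1):
        for name, pattern in RULES.items():
            if pattern.search(line):
                findings.append({"type": name, "line": lineno, "source": "rules"})
    return findings

def semantic_findings(source: str) -> list[dict]:
    """Placeholder for the fine-tuned transformer classifier described in
    the paper; a real system would run model inference here."""
    return []

def hybrid_scan(source: str) -> list[dict]:
    """Hybrid detection: union of rule-based and model-based findings."""
    return rule_based_findings(source) + semantic_findings(source)

snippet = 'password = "hunter2"\ncursor.execute("SELECT * FROM t WHERE id=%s" % uid)\n'
print(hybrid_scan(snippet))
```

In a full system, the two passes would also be deduplicated and cross-checked, which is one way the hybrid design reduces false positives.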
The system works by accepting uploaded code repositories, parsing and preprocessing source files, and then analyzing them using AI models to detect vulnerabilities. It also retrieves relevant security information from CVE databases to provide context-aware insights. Detected vulnerabilities are assigned severity scores (high, medium, low), and the system generates automated remediation suggestions to help developers fix issues.
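The severity assignment and remediation step described above can be sketched as a simple lookup. The categories, labels, and fix texts are illustrative assumptions, not the system's actual knowledge base:

```python
# Illustrative severity and remediation tables (assumed for this sketch).
SEVERITY = {
    "sql_injection": "high",
    "hardcoded_secret": "high",
    "xss": "medium",
    "insecure_hash": "low",
}

REMEDIATION = {
    "sql_injection": "Use parameterized queries instead of string formatting.",
    "hardcoded_secret": "Move credentials to environment variables or a vault.",
    "xss": "Escape or sanitize user-controlled output.",
    "insecure_hash": "Replace MD5/SHA-1 with SHA-256 or stronger.",
}

def triage(finding_type: str) -> dict:
    """Attach a severity label and a suggested fix to a detected issue."""
    return {
        "type": finding_type,
        # Unknown finding types fall back to a middle severity for review.
        "severity": SEVERITY.get(finding_type, "medium"),
        "fix": REMEDIATION.get(finding_type, "Review manually."),
    }

print(triage("sql_injection"))
```

A production system would derive these labels from CVSS scores and the retrieved CVE context rather than a static table.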
The architecture is built as a scalable pipeline that includes repository upload, code parsing, AI-based vulnerability detection, CVE-based retrieval, severity classification, and fix recommendation modules. It is designed to support DevSecOps workflows and enterprise security environments.
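The pipeline stages named above can be chained as plain functions. Stage names mirror the architecture; the bodies are stubs and everything else (file filtering, placeholder CVE context) is assumed:

```python
from pathlib import Path

def parse_repository(root: Path) -> list[Path]:
    """Repository upload + code parsing: collect source files to analyze.
    Only .py files here for brevity; the described system is multi-language."""
    return [p for p in root.rglob("*.py") if p.is_file()]

def detect(files: list[Path]) -> list[dict]:
    """AI-based vulnerability detection (stubbed: flags every file)."""
    return [{"file": str(f), "type": "sql_injection"} for f in files]

def enrich_with_cve(findings: list[dict]) -> list[dict]:
    """CVE-based retrieval (stubbed lookup returning placeholder context)."""
    for f in findings:
        f["cve_context"] = "CWE-89 / related CVE entries"  # placeholder
    return findings

def classify_severity(findings: list[dict]) -> list[dict]:
    for f in findings:
        f["severity"] = "high" if f["type"] == "sql_injection" else "medium"
    return findings

def recommend_fixes(findings: list[dict]) -> list[dict]:
    for f in findings:
        f["fix"] = "Use parameterized queries."
    return findings

def run_pipeline(root: Path) -> list[dict]:
    """Chain the stages: parse -> detect -> retrieve -> classify -> recommend."""
    return recommend_fixes(
        classify_severity(enrich_with_cve(detect(parse_repository(root))))
    )
```

Each stage takes and returns plain data, which is what makes the pipeline easy to scale out or swap stages in a DevSecOps setting.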
The evaluation shows that the proposed system achieves better performance than baseline methods, with higher accuracy (0.921 vs 0.853) and a lower false positive rate (0.078 vs 0.182), indicating improved reliability and precision in detecting security issues.
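The two reported metrics follow their standard definitions. The confusion-matrix counts below are hypothetical, chosen only so the formulas reproduce the reported values; they are not the paper's data:

```python
def accuracy(tp: int, tn: int, fp: int, fn: int) -> float:
    """Accuracy = (TP + TN) / all predictions."""
    return (tp + tn) / (tp + tn + fp + fn)

def false_positive_rate(fp: int, tn: int) -> float:
    """FPR = FP / (FP + TN): fraction of benign samples flagged as vulnerable."""
    return fp / (fp + tn)

# Hypothetical counts (1000 samples) that yield the reported figures.
tp, tn, fp, fn = 460, 461, 39, 40
print(round(accuracy(tp, tn, fp, fn), 3))     # 0.921
print(round(false_positive_rate(fp, tn), 3))  # 0.078
```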
Conclusion
This paper presented an Autonomous Threat-Hunting AI Agent for intelligent vulnerability detection using transformer-based analysis and RAG-supported CVE knowledge retrieval. The proposed system improves detection accuracy while reducing false positives and providing automated remediation suggestions. It supports multi-language scanning and real-time analysis, making it suitable for modern DevSecOps environments. Overall, the framework contributes to scalable and automated AI-assisted software security solutions.