Phishing and digital messaging fraud remain among the most pervasive and destructive cyber threats, continuously exploiting human cognitive vulnerabilities to compromise secure networks.
Traditional defence mechanisms, such as domain blocklists, signature matching, and basic heuristic rules, are increasingly inadequate. This shortfall is aggravated by the rise of consumer-grade generative artificial intelligence, which allows adversaries to launch sophisticated, grammatically polished, and context-aware social engineering campaigns at unprecedented scale.
To bridge this defensive gap, this paper proposes an advanced, automated detection framework leveraging Natural Language Processing (NLP) paired with hybrid deep learning architectures.
Rather than relying solely on rigid keyword filtering, the proposed system employs transformer-based language representations (such as BERT and RoBERTa) alongside Long Short-Term Memory (LSTM) networks to analyze the nuanced semantic structure, sentiment, and emotional undercurrents (e.g., manufactured urgency, fear, or financial coercion) of incoming text. Furthermore, the framework integrates multi-modal feature engineering, cross-examining unstructured email bodies, SMS text, and metadata like look-alike URL patterns. Empirical evaluation conducted on large-scale, balanced datasets demonstrates that our hybrid NLP model achieves an accuracy rate exceeding 96.5%, significantly reducing the high false-negative rates that plague legacy systems. Additionally, by introducing Explainable AI (XAI) frameworks like SHAP, the system provides transparent, interpretable reasoning behind its threat classifications. This research underscores the vital role of semantic-layer defence in modern cybersecurity pipelines, offering a highly scalable, real-time solution to mitigate evolving, automated digital fraud.
Introduction
The cybersecurity landscape has changed significantly with the rise of generative Artificial Intelligence (AI). Traditional security methods, such as blocklists, pattern matching, and rule-based detection systems, were effective against simple phishing attacks but are no longer sufficient against modern AI-generated threats. Cybercriminals now use Large Language Models (LLMs) to create highly personalized, grammatically correct, and contextually relevant phishing messages that can bypass conventional detection mechanisms. As a result, fraud detection has shifted from spotting obvious keywords and spelling errors to understanding the semantic meaning, intent, and manipulation techniques used within messages.
Modern phishing attacks typically consist of three key components:
Semantic Payload – The main message content designed to manipulate victims through tactics such as authority impersonation, urgency, fear, or financial incentives.
Syntactic Payload – Techniques used to evade detection systems, including character substitutions, homoglyph attacks, and hidden characters.
Structural Metadata – Deceptive elements such as fake sender names, spoofed email headers, and malicious or look-alike URLs.
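As an illustration of the syntactic payload described above, the following minimal sketch flags two common evasion signals: hidden zero-width or direction-control characters, and tokens that mix scripts (a typical homoglyph pattern). It is not the paper's implementation. The function names, the set of code points, and the three-script check are assumptions chosen for brevity, and the sketch uses only the Python standard library.

```python
import unicodedata

# Zero-width and direction-control code points often used to hide or reorder text.
# Illustrative set only; a production list would be longer.
HIDDEN_CODE_POINTS = {
    "\u200b",  # zero width space
    "\u200c",  # zero width non-joiner
    "\u200d",  # zero width joiner
    "\u2060",  # word joiner
    "\ufeff",  # zero width no-break space
    "\u202e",  # right-to-left override
}


def script_of(ch: str) -> str:
    """Return a coarse script label derived from the Unicode character name."""
    name = unicodedata.name(ch, "")
    for script in ("LATIN", "CYRILLIC", "GREEK"):
        if name.startswith(script):
            return script
    return "OTHER"


def syntactic_signals(text: str) -> dict:
    """Count hidden characters and list tokens that mix more than one script."""
    hidden = sum(1 for ch in text if ch in HIDDEN_CODE_POINTS)
    mixed = []
    for token in text.split():
        scripts = {script_of(ch) for ch in token if ch.isalpha()}
        scripts.discard("OTHER")
        if len(scripts) > 1:
            mixed.append(token)
    return {"hidden_chars": hidden, "mixed_script_tokens": mixed}
```

For example, calling `syntactic_signals("Verify your p\u0430ypal acc\u200bount")` reports one hidden character and flags the token containing a Cyrillic "а" among Latin letters. Signals of this kind could be passed to a classifier as auxiliary features alongside the text itself.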
To address these evolving threats, the proposed system introduces a hybrid deep learning and Transformer-based framework for automated phishing and fraud detection. The framework focuses on understanding both the content and context of messages while providing explainable results.
Proposed Methodology
The system uses a dual-pathway feature extraction architecture:
Sequential Temporal Pathway: Processes message sequences through tokenization and embedding techniques to preserve word order, sentence structure, and communication patterns.
Deep Contextual Pathway: Applies subword tokenization (such as Byte-Pair Encoding) followed by Transformer models to capture contextual relationships and semantic meaning across the text.
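To make the Byte-Pair Encoding step in the Deep Contextual Pathway concrete, the toy sketch below learns merge rules from a small word list by repeatedly merging the most frequent adjacent symbol pair. It is a simplification for illustration: RoBERTa's tokenizer applies byte-level BPE with a fixed, pretrained vocabulary, whereas this example works on characters, and the function names here are hypothetical.

```python
from collections import Counter


def most_frequent_pair(words: dict):
    """Return the adjacent symbol pair with the highest total frequency, or None."""
    pairs = Counter()
    for symbols, freq in words.items():
        for a, b in zip(symbols, symbols[1:]):
            pairs[(a, b)] += freq
    if not pairs:
        return None
    # Ties are resolved in favour of the pair seen first.
    return max(pairs, key=pairs.get)


def merge_pair(words: dict, pair: tuple) -> dict:
    """Rewrite every word so that each occurrence of `pair` becomes one symbol."""
    merged = {}
    for symbols, freq in words.items():
        out, i = [], 0
        while i < len(symbols):
            if i + 1 < len(symbols) and (symbols[i], symbols[i + 1]) == pair:
                out.append(symbols[i] + symbols[i + 1])
                i += 2
            else:
                out.append(symbols[i])
                i += 1
        key = tuple(out)
        merged[key] = merged.get(key, 0) + freq
    return merged


def learn_bpe(corpus: list, num_merges: int) -> list:
    """Learn up to `num_merges` merge rules from a list of words."""
    words = dict(Counter(tuple(w) for w in corpus))
    merges = []
    for _ in range(num_merges):
        pair = most_frequent_pair(words)
        if pair is None:
            break
        merges.append(pair)
        words = merge_pair(words, pair)
    return merges
```

Because frequent character sequences are merged into single symbols, rare or deliberately misspelled words still decompose into known subword units instead of falling back to a single unknown token, which is one reason subword tokenization suits adversarial text.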
Hybrid RoBERTa–LSTM Architecture
At the core of the system is a combination of:
RoBERTa (Robustly Optimized BERT Pretraining Approach): A Transformer-based language model that uses self-attention to generate rich contextual representations of text. RoBERTa improves on the original BERT by removing the Next Sentence Prediction objective and applying dynamic token masking, which yields stronger results on downstream language-understanding tasks.
Deep LSTM (Long Short-Term Memory): A recurrent neural network component that captures sequential and temporal patterns in message structures.
By combining RoBERTa’s contextual understanding with LSTM’s ability to model sequence dependencies, the framework can detect sophisticated phishing attempts that traditional systems often miss.
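One plausible way to formalize this combination is sketched below. The wiring is an assumption made for illustration, since the text does not fix it: RoBERTa's token-level outputs $e_1, \dots, e_T$ feed a standard LSTM, and the final hidden state $h_T$ is mapped to a phishing probability $\hat{y}$. The symbols $W_\ast$, $U_\ast$, $b_\ast$, $w$, and $b$ denote learned parameters, $\sigma$ is the logistic sigmoid, and $\odot$ is element-wise multiplication.

```latex
\begin{aligned}
(e_1, \dots, e_T) &= \mathrm{RoBERTa}(x_1, \dots, x_T), \qquad e_t \in \mathbb{R}^{d} \\
i_t &= \sigma(W_i e_t + U_i h_{t-1} + b_i) \\
f_t &= \sigma(W_f e_t + U_f h_{t-1} + b_f) \\
o_t &= \sigma(W_o e_t + U_o h_{t-1} + b_o) \\
g_t &= \tanh(W_g e_t + U_g h_{t-1} + b_g) \\
c_t &= f_t \odot c_{t-1} + i_t \odot g_t \\
h_t &= o_t \odot \tanh(c_t) \\
\hat{y} &= \sigma(w^{\top} h_T + b)
\end{aligned}
```

Under this reading, self-attention supplies each $e_t$ with context from the whole message, while the recurrence over $t$ adds an explicit left-to-right summary of how the message unfolds.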
Conclusion
This paper demonstrated the efficacy of applying advanced Natural Language Processing to combat modern AI-powered social engineering threats. By blending the deep semantic understanding of RoBERTa with the sequential tracking of LSTMs, the proposed framework provides high-accuracy detection capable of intercepting linguistically complex phishing attacks. Future development will focus on multi-modal evasion tactics, such as embedding malicious text directly inside image attachments or using variable font encodings. Mitigating these zero-day vectors requires expanding current text pipelines into holistic, multi-modal systems capable of processing layout, imagery, and text features simultaneously.
References
[1] Alarfaj, F. K. (2026). Clickbait detection in news headlines using RoBERTa-Large language model and deep embeddings. PMC, 12(780), 1–15. https://pmc.ncbi.nlm.nih.gov/articles/PMC12780135/
[2] Ali, A. A. (2025). Email Spam Detection: A Novel Hybrid Approach Using Machine and Deep Learning Techniques. International Network for Advanced Science and Technology, 12(2), 31–44.
[3] De Nardin, A. (2025). Deep Learning-Based Intrusion Detection Systems for Phishing Email Detection: A Short Survey. Computer Vision Foundation Open Access Workshops, 102–111.
[4] Ibrahim, M. (2026). Phishing Email Detection Using BERT and RoBERTa. MDPI Computer Sciences, 14(2), 46–59. https://www.mdpi.com/2079-3197/14/2/46
[5] Mahendru, S., & Pandit, T. (2024). SecureNet: A Comparative Study of DeBERTa and Large Language Models for Phishing Detection. Proceedings of the 2024 IEEE 7th International Conference on Big Data and Artificial Intelligence, 160–169. https://arxiv.org/pdf/2406.06663
[6] Safran, M. (2025). Phishing GNN: Phishing Email Detection Using Graph Attention Networks and Transformer-Based Feature Extraction. IEEE Xplore, 1109–1119.
[7] Zaman, T. A. S. (2025). Context-aware phishing URL detection: Harnessing the power of large language models. DergiPark Journal of Academic Networks, 4(2), 201–214.
[8] Uddin, R., Basit, A., Khan, Y., Shazib, MD S., & Hossain, S. (2025). BERT-Based Fake News Detection: A Transformer-Driven Approach for Misinformation Classification on Twitter. International Journal on Science and Technology, 16(1), 1–12. https://doi.org/10.71097/ijsat.v16.i1.2023