Phishing remains a dominant and highly sophisticated form of cybercrime, where attackers deploy deceptive websites to trick users into revealing sensitive information, such as passwords and financial credentials [1], [5]. Despite significant advancements in cybersecurity, accurately detecting these malicious domains remains a critical challenge due to the lack of universally accepted identification parameters and the rapid emergence of “zero-day” phishing sites [1], [6]. This paper introduces an advanced detection framework that integrates Rough Set Theory-based Hybrid Feature Selection (RSTHFS) with an Innovative Meta-Learning-Based Ensemble approach [3], [6].
The proposed methodology utilizes a multi-layer stacking architecture to capture both global non-linear and local patterns, leveraging base learners such as Residual Multi-Layer Perceptrons (ResMLP) and XGBoost, which are aggregated by a meta-classifier to enhance predictive stability [1], [3]. To ensure the system is lightweight enough for real-time browser deployment, the RSTHFS method is employed to identify a “minimal reduct” of features, successfully reducing the computational feature space by over 60% while maintaining high reliability [5], [6]. Furthermore, the framework incorporates Explainable AI (XAI) through SHAP values to provide granular transparency into the model’s decision-making process [6]. Experimental evaluations on benchmark datasets demonstrate a peak accuracy of 98.4%, providing a scalable, efficient, and interpretable solution for modern web security [3], [5].
Introduction
Phishing is a major cybersecurity threat that tricks users into revealing sensitive information through deceptive websites and messages. Traditional defenses like blacklist-based systems are reactive and ineffective against new (zero-hour) phishing attacks, prompting a shift toward Machine Learning (ML) and Deep Learning (DL) techniques.
Key Challenges in Existing Systems
High computational cost of deep learning models limits real-time use (e.g., browser extensions)
Lack of interpretability (“black-box” models reduce user trust)
Inefficiency in handling evolving phishing patterns
Proposed Solution
The research introduces an advanced phishing detection system combining:
RSTHFS (Rough Set Theory-based Hybrid Feature Selection): Removes redundant features for faster processing
Meta-Learning Ensemble: Combines multiple models for better accuracy
Explainable AI (XAI – SHAP): Provides clear reasons for detection decisions
This system achieves high accuracy (98.4%), fast detection (<100 ms), and transparent outputs for users.
Processing: Real-time URL analysis with caching and API communication
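One plausible way to realize the caching mentioned above is a small in-memory cache with expiry in front of the classifier, so that a repeated URL does not trigger a second model call. The sketch below is an assumption about how this could look, not a description of the deployed system; `VerdictCache`, `check_url`, and the `classify` callback are hypothetical names.

```python
import time
from collections import OrderedDict


class VerdictCache:
    """Small least-recently-used cache with per-entry expiry for URL verdicts."""

    def __init__(self, max_entries=1024, ttl_seconds=300.0, clock=time.monotonic):
        self._entries = OrderedDict()  # url -> (expires_at, verdict)
        self._max_entries = max_entries
        self._ttl = ttl_seconds
        self._clock = clock

    def get(self, url):
        item = self._entries.get(url)
        if item is None:
            return None
        expires_at, verdict = item
        if self._clock() >= expires_at:
            del self._entries[url]  # stale entry: force a fresh classification
            return None
        self._entries.move_to_end(url)  # mark as most recently used
        return verdict

    def put(self, url, verdict):
        self._entries[url] = (self._clock() + self._ttl, verdict)
        self._entries.move_to_end(url)
        if len(self._entries) > self._max_entries:
            self._entries.popitem(last=False)  # evict the least recently used


def check_url(url, cache, classify):
    """Return a fresh cached verdict, otherwise call the classifier once."""
    verdict = cache.get(url)
    if verdict is None:
        verdict = classify(url)
        cache.put(url, verdict)
    return verdict
```

The expiry matters here because a phishing verdict can change as pages are taken down or compromised; the five-minute default is an arbitrary placeholder.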
Results
Accuracy: 98.4% (higher than traditional ML and DL models)
Speed: Latency reduced from 240 ms to 92 ms (about 62% lower)
Feature Reduction: 32 → 11 features
User Trust: Explainable alerts improved user response by 35%
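The relative reductions implied by the before and after values in this list can be checked with a few lines of arithmetic. The helper name `percent_reduction` is introduced here for illustration only.

```python
def percent_reduction(before, after):
    """Relative decrease from before to after, in percent."""
    return (before - after) / before * 100.0


latency_cut = percent_reduction(240, 92)  # detection latency in milliseconds
feature_cut = percent_reduction(32, 11)   # number of input features
speedup = 240 / 92                        # how many times faster

print(f"latency: {latency_cut:.1f}% lower, {speedup:.2f}x faster")
print(f"features: {feature_cut:.1f}% fewer")
```

With the values listed above, this gives a latency reduction of about 61.7% (roughly a 2.6-fold speedup) and a feature reduction of about 65.6%.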
Conclusion
This research has successfully developed and validated an advanced framework for phishing detection that addresses the critical balance between predictive accuracy and computational efficiency [1], [5]. By building upon the foundational concepts of hybrid machine learning [6], we have introduced a multi-layer meta-learning ensemble that leverages the unique strengths of ResMLP, XGBoost, and CatBoost [1], [3]. The integration of a Logistic Regression meta-classifier has proven effective in reducing the variance inherent in single-model architectures, resulting in a peak detection accuracy of 98.4% [3].
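The role of the meta-classifier can be made concrete with a minimal sketch, under strong simplifying assumptions: the three base learners are taken as already trained, their held-out phishing probabilities below are invented toy values, and the logistic regression is fitted by plain batch gradient descent rather than with a library. The names `fit_meta` and `predict_meta` are hypothetical.

```python
import math

# Hypothetical held-out probabilities from three base learners
# (stand-ins for ResMLP, XGBoost, CatBoost), one tuple per website.
BASE_PROBS = [
    (0.9, 0.8, 0.7), (0.8, 0.6, 0.9), (0.6, 0.9, 0.8), (0.7, 0.4, 0.8),
    (0.2, 0.3, 0.1), (0.4, 0.1, 0.2), (0.3, 0.6, 0.2), (0.1, 0.2, 0.4),
]
META_LABELS = [1, 1, 1, 1, 0, 0, 0, 0]  # 1 = phishing, 0 = legitimate


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def predict_meta(w, b, probs):
    """Combine base-learner probabilities into one phishing probability."""
    return sigmoid(b + sum(wi * p for wi, p in zip(w, probs)))


def fit_meta(base_probs, labels, lr=1.0, epochs=5000):
    """Fit logistic-regression weights on base outputs by gradient descent."""
    n_models = len(base_probs[0])
    w, b = [0.0] * n_models, 0.0
    n = len(labels)
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * n_models, 0.0
        for probs, y in zip(base_probs, labels):
            err = predict_meta(w, b, probs) - y  # gradient of the log loss
            for j, p in enumerate(probs):
                grad_w[j] += err * p
            grad_b += err
        w = [wi - lr * g / n for wi, g in zip(w, grad_w)]
        b -= lr * grad_b / n
    return w, b


WEIGHTS, BIAS = fit_meta(BASE_PROBS, META_LABELS)
print(predict_meta(WEIGHTS, BIAS, (0.7, 0.4, 0.8)))
```

The fourth and seventh rows show why a learned combination can help: in each, one base learner disagrees with the other two, and the meta-classifier weighs the three outputs instead of trusting any single model. In a full stacking setup the meta-classifier's training inputs would come from out-of-fold predictions, which this sketch omits.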
A significant contribution of this work is the application of Rough Set Theory-based Hybrid Feature Selection (RSTHFS), which successfully identified a minimal feature reduct, reducing the input dimensionality by 69.11% [5]. This optimization was the key enabler for transitioning the model from a high-resource server environment into a lightweight, real-time Chrome extension with sub-100 ms latency [5], [6]. Furthermore, the inclusion of Explainable AI (XAI) via SHAP values has transformed the system from a “black-box” classifier into a transparent security tool, providing users with the necessary justifications to trust and act upon security warnings [6].
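The kind of per-feature justification that SHAP values provide can be illustrated with an exact Shapley computation on a tiny model. This is a sketch of the underlying attribution idea, not of the SHAP library or of the system's explanations: it enumerates all feature coalitions (feasible only for a handful of features) and replaces missing features with a single baseline, whereas SHAP typically averages over background data. The scoring function and feature names are invented for illustration.

```python
from itertools import combinations
from math import factorial


def shapley_values(model, x, baseline):
    """Exact Shapley attribution; features outside a coalition use baseline."""
    n = len(x)

    def value(coalition):
        return model([x[i] if i in coalition else baseline[i] for i in range(n)])

    phi = []
    for i in range(n):
        others = [j for j in range(n) if j != i]
        total = 0.0
        for size in range(n):
            # Shapley weight for coalitions of this size
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for subset in combinations(others, size):
                s = set(subset)
                total += weight * (value(s | {i}) - value(s))
        phi.append(total)
    return phi


def toy_risk(f):
    """Hypothetical risk score over (has_ip_address, long_url, has_at_symbol)."""
    return 0.1 + 0.5 * f[0] + 0.2 * f[2] + 0.2 * f[0] * f[1]


URL_FEATURES = [1, 1, 0]
BASELINE = [0, 0, 0]
PHI = shapley_values(toy_risk, URL_FEATURES, BASELINE)
print(PHI)
```

For this input the attributions are approximately 0.6, 0.1, and 0.0. They sum to the gap between the score for the URL (0.8) and the baseline score (0.1), which is the property that lets a warning state how much each feature contributed.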
In conclusion, this project provides a scalable and proactive defense mechanism against the evolving threat of zero-hour phishing attacks [2], [4]. The results confirm that the synergy of feature optimization and meta-learning not only enhances the security of the web browsing experience but also sets a new standard for interpretable and efficient cybersecurity solutions in the browser environment [3], [5], [6].
References
[1] L. R. Kalabarige, R. S. Rao, A. R. Pais, and L. A. Gabralla, “A Boosting-Based Hybrid Feature Selection and Multi-Layer Stacked Ensemble Learning Model to Detect Phishing Websites,” IEEE Access, vol. 11, pp. 71180-71193, 2023.
[2] U. Zara, K. Ayyub, H. U. Khan, A. Daud, T. Alsahfi, and S. G. Ahmad, “Phishing Website Detection Using Deep Learning Models,” IEEE Access, vol. 12, pp. 167072-167087, 2024.
[3] S. Naseeb, S. Ramzan, A. Raza, M. S. A. Hashmi, Y. Gu, M. Syafrudin, and N. L. Fitriyani, “Website Phishing Attack Detection Using Innovative Meta Learning-Based Ensemble Approach,” IEEE Access, vol. 13, pp. 164249-164264, 2025.
[4] A. Karim, M. Shahroz, K. Mustofa, S. B. Belhaouari, and S. R. K. Joga, “Phishing Detection System Through Hybrid Machine Learning Based on URL,” IEEE Access, vol. 11, pp. 36805-36822, 2023.
[5] J. H. Setu, N. Halder, A. Islam, and M. A. Amin, “RSTHFS: A Rough Set Theory-Based Hybrid Feature Selection Method for Phishing Website Classification,” IEEE Access, vol. 13, pp. 68820-68840, 2025.
[6] R. M. Dhokane, S. S. Wakchaure, J. S. Kale, S. R. Dange, and G. S. Wakchaure, “Phishing Website Detection Using Hybrid Machine Learning and Feature Optimization Techniques,” SVIT Nashik Research Publication, 2024.
[7] S. Remya et al., “An Effective Detection Approach for Phishing URL Using ResMLP,” IEEE Access, vol. 12, pp. 79367-79380, 2024.
[8] S. Asiri et al., “A Survey of Intelligent Detection Designs of HTML URL Phishing Attacks,” IEEE Access, vol. 11, pp. 6421-6438, 2023.
[9] R. Zieni et al., “Phishing or Not Phishing? A Survey on the Detection of Phishing Websites,” IEEE Access, vol. 11, pp. 18499-18515, 2023.