Employee credentials such as email addresses, passwords and API keys are increasingly targeted by cybercriminals for unauthorized access, phishing and other cyberattacks. According to Verizon's 2025 Data Breach Investigations Report, stolen credentials were involved in 53% of all data breaches in 2025, making credential compromise the most prevalent attack vector in the digital threat landscape. The average cost of a credential-based data breach is estimated at about $4.67 million, and IBM's 2025 Cost of a Data Breach Report shows that organizations take an average of 246 days to identify and contain such incidents. Additionally, 94% of passwords are reused across multiple accounts, which credential stuffing attacks exploit by automating the replay of stolen credentials. While large organizations can afford dedicated Security Operations Centers (SOCs) and expensive breach monitoring tools, small and medium-sized enterprises (SMEs), educational institutions and healthcare organizations often lack even the most basic systems required for credential leak detection. Existing solutions rely on static breach databases or manual user checks, missing the real-time and organization-specific exposures that are critical for early incident response and containment. We developed BreachGuard, an automated real-time credential leak tracking system designed for organizational cybersecurity. The system continuously scans multiple public sources, including the Have I Been Pwned (HIBP) breach database API, Pastebin and GitHub repositories, using pattern-based detection (regex), web scraping and API integration to identify leaked credentials tied to an organization's employees and domain. Upon detection, the system triggers instant Slack and email alerts with a severity classification and well-known remediation steps. Key findings show that the system operates at comparatively low cost, making it much more affordable than its commercial alternatives.
The open-source design facilitates future extensions for dark web monitoring, machine learning-based detection and SIEM/SOAR integration. By bridging the detection gap through automation and enabling early identification of credential leaks, BreachGuard demonstrates that proactive, multi-source credential monitoring is feasible and economically viable for organizations of all sizes, enabling faster detection and remediation of credential-based security incidents and significantly reducing breach costs.
Introduction
Employee credentials have become a major target for cybercriminals, with 53% of all data breaches in 2025 involving credential theft and costing an average of $4.67 million per incident. The problem worsens because 94% of users reuse passwords, increasing the impact of a single leak. Existing tools like HIBP and Google Password Checkup help find compromised passwords but lack real-time monitoring and do not track organization-specific leaks on platforms like Pastebin or GitHub.
To address this gap, the paper introduces BreachGuard, an affordable real-time credential leak monitoring system for SMEs, educational institutions, and healthcare organizations. It continuously scans three sources—HIBP API, Pastebin, and GitHub—and sends immediate Slack alerts with severity ratings and recommended actions. Built using Python, Flask, and SQLite, BreachGuard runs scheduled scans every 5 minutes and stores logs for auditing.
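As an illustration of the pattern-based detection step, the following minimal Python sketch extracts addresses on a monitored domain, together with any password-like token that directly follows them, from raw paste or repository text. The domain `example.edu`, the regular expression and the function name `find_domain_credentials` are assumptions made for illustration; the paper does not list BreachGuard's actual patterns.

```python
import re

# Hypothetical monitored domain; a real deployment would load this from configuration.
ORG_DOMAIN = "example.edu"

# Matches "user@example.edu" optionally followed by ":secret", ";secret" or "|secret",
# a layout commonly seen in credential dumps. The negative lookahead rejects
# look-alike hosts such as "example.edu.evil.com" or "example.education".
CREDENTIAL_RE = re.compile(
    r"\b([A-Za-z0-9._%+-]+@" + re.escape(ORG_DOMAIN) + r")"
    r"(?!\.?[A-Za-z0-9-])"
    r"(?:[:;|](\S+))?",
    re.IGNORECASE,
)


def find_domain_credentials(text):
    """Return (email, secret_or_None) tuples for every address on ORG_DOMAIN."""
    findings = []
    for match in CREDENTIAL_RE.finditer(text):
        email = match.group(1).lower()   # normalise case for de-duplication
        secret = match.group(2)          # None when no secret follows the address
        findings.append((email, secret))
    return findings
```

A scheduled job could pass the body of each newly fetched paste or file to `find_domain_credentials` and treat pairs with a non-empty secret as more severe than a bare address. Purely regex-based matching of this kind can still produce false positives, so results would need review or further filtering.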
The literature review shows that while existing tools use privacy-preserving methods like k-anonymity, they lack continuous monitoring. Research highlights the growing threat, with huge increases in leaked credentials and a high frequency of secret exposures in GitHub repositories. Commercial tools exist but cost ₹40,000–₹400,000 monthly, making them inaccessible to smaller organizations.
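The k-anonymity approach mentioned above can be sketched briefly. In the range-query scheme used by the HIBP Pwned Passwords API, the client hashes the password with SHA-1, sends only the first five hexadecimal characters of the hash (to `https://api.pwnedpasswords.com/range/<prefix>`), and compares the returned `SUFFIX:COUNT` lines locally. The Python sketch below shows only the client-side hashing and matching, with no network call; the function names are illustrative, and it is not presented as part of BreachGuard's own implementation.

```python
import hashlib


def split_sha1(password):
    """Return (prefix, suffix) of the upper-case SHA-1 hex digest.

    Only the 5-character prefix would leave the client; the 35-character
    suffix is kept local and compared against the server's response.
    """
    digest = hashlib.sha1(password.encode("utf-8")).hexdigest().upper()
    return digest[:5], digest[5:]


def count_in_range_response(suffix, response_text):
    """Look up a hash suffix in a range response made of 'SUFFIX:COUNT' lines.

    Returns the reported breach count, or 0 if the suffix is absent.
    """
    for line in response_text.splitlines():
        candidate, _, count = line.strip().partition(":")
        if candidate.upper() == suffix:
            return int(count)
    return 0
```

Because the server only ever sees a five-character prefix shared by many hashes, it cannot tell which password was checked. This protects the query, but it remains a one-off lookup rather than continuous monitoring.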
BreachGuard’s architecture includes:
Presentation layer (Flask UI and dashboard)
Application layer (scanning modules for HIBP, Pastebin, and GitHub + Slack alerting)
Data layer (SQLite database)
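One way the application and data layers could interact is sketched below: a finding is classified, stored in SQLite for auditing, and turned into a Slack message payload only when it has not been seen before. The table schema, the two severity levels and the function names are assumptions for illustration; the paper does not specify them. The payload uses the `text` field accepted by Slack incoming webhooks, and actually posting it is left out.

```python
import sqlite3
from datetime import datetime, timezone

# Hypothetical schema; the UNIQUE constraint prevents repeat alerts for the same finding.
SCHEMA = """
CREATE TABLE IF NOT EXISTS findings (
    id         INTEGER PRIMARY KEY AUTOINCREMENT,
    source     TEXT    NOT NULL,
    email      TEXT    NOT NULL,
    has_secret INTEGER NOT NULL,
    severity   TEXT    NOT NULL,
    found_at   TEXT    NOT NULL,
    UNIQUE (source, email, has_secret)
)
"""


def classify(has_secret):
    """Assumed rule: an address exposed together with a secret is more severe."""
    return "critical" if has_secret else "medium"


def record_finding(conn, source, email, has_secret):
    """Store a finding; return a Slack payload dict if it is new, else None."""
    severity = classify(has_secret)
    cursor = conn.execute(
        "INSERT OR IGNORE INTO findings "
        "(source, email, has_secret, severity, found_at) VALUES (?, ?, ?, ?, ?)",
        (source, email, int(has_secret), severity,
         datetime.now(timezone.utc).isoformat()),
    )
    conn.commit()
    if cursor.rowcount == 0:  # row already existed, so no new alert
        return None
    return {"text": f"[{severity.upper()}] {email} exposed on {source}"}
```

For example, after `conn = sqlite3.connect(":memory:")` and `conn.execute(SCHEMA)`, the first call to `record_finding(conn, "pastebin", "alice@example.edu", True)` returns a payload and an identical second call returns `None`, which keeps repeated five-minute scans from re-alerting on the same leak.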
Testing shows high performance: 97.4% accuracy, 2.4-second alert time, and stable operation. The system is highly cost-effective at ₹250–₹1,200 per month, making it 10–20× cheaper than commercial alternatives.
The discussion emphasizes benefits like low cost, multi-source detection, real-time alerting, fast response times, and reduced manual work. However, limitations include lack of dark web monitoring, regex-based detection issues, possible false positives, limited scalability, and dependence on third-party services.
Future enhancements include dark web integration, machine-learning-based detection, SIEM integration, predictive analytics, automated password resets, and multi-tenant support.
Conclusion
This paper presented BreachGuard, an automated system for detecting credential leaks in real time at low cost. By combining the HIBP API, Pastebin scraping and GitHub scanning, the system provides multi-source monitoring that existing single-source tools cannot match. Testing showed about 98.4% accuracy, a 2.4-second alert time and operation at ₹300–₹1,200 monthly, making it accessible to SMEs and institutions.
BreachGuard demonstrates that real-time, automated credential leak monitoring is feasible and affordable for organizations of all sizes. By integrating multiple data sources such as breach databases, paste sites and code repositories, the system provides a consolidated view of an organization's exposure to leaked credentials. In testing, BreachGuard successfully identified sample leaks and generated structured alerts while operating with minimal latency and cost. Compared to existing tools, it offers distinct advantages: continuous domain-wide scanning, proactive notifications (via Slack/email) and detailed reporting, all at low cost.
In essence, BreachGuard fills a critical gap in proactive cybersecurity: it empowers SMEs and other organizations to detect and remediate leaked credentials before attackers can exploit them. This project emphasizes the value of automation and OSINT in defensive security. Its open architecture invites further research and development, such as adding machine learning and threat feeds. We hope this work will inspire students like us, as well as researchers, to pursue accessible, open-source and integrated solutions in the field of cybersecurity.
References
[1] Verizon Business, "2025 Data Breach Investigations Report (DBIR)," Verizon Commun. Inc., 2025. [Online]. Available: https://www.verizon.com/about/news/2025-data-breach-investigations-report
[2] IBM Security and Ponemon Institute, "2025 Cost of a data breach report," IBM Corp., 2025. [Online]. Available: https://www.ibm.com/think/insights/whats-new-2024-cost-of-a-data-breach-report
[3] Heimdal Security Team, "Password breach statistics in 2025," Heimdal Blog, 2025. [Online]. Available: https://heimdalsecurity.com/blog/password-breach-statistics
[4] L. Li, B. Pal, J. Ali, N. Sullivan, R. Chatterjee, and T. Ristenpart, "Protocols for checking compromised credentials," in Proc. ACM SIGSAC Conf. Comput. Commun. Secur., 2019, pp. 1387–1403. doi: 10.1145/3319535.3354229.
[5] ProjectDiscovery Security Team, "Introducing credential monitoring: Free real-time malware log analysis," ProjectDiscovery Blog, 2025. [Online]. Available: https://projectdiscovery.io/blog/leaked-credential-monitoring
[6] M. Rabzelj et al., "Beyond the leak: Analyzing the real-world exploitation of leaked credentials," Nat. Sci. Rep., vol. PMC12197152, 2025.
[7] Check Point Research, "The alarming surge in compromised credentials in 2025," Check Point Res., 2025. [Online]. Available: https://blog.checkpoint.com/security/the-alarming-surge-in-compromised-credentials-in-2025
[8] M. Meli, M. R. McNiece, and B. Reaves, "How bad can it git? Characterizing secret leakage in public GitHub repositories," in Proc. NDSS, 2019. [Online]. Available: https://www.ndss-symposium.org/wp-content/uploads/2019/02/ndss2019_04B-3_Meli_paper.pdf
[9] S. K. Basak, C. Bailey, C. Zubak, M. Hicks, and Y. Acar, "A comparative study of software secrets reporting by secret detection tools," in Proc. 45th Int. Conf. Softw. Eng. (ICSE), 2023. doi: 10.1109/ICSE48619.2023.00150.
[10] P. Shamunesh, S. Vinoth, and L. N. B. Srinivas, "CyberCheck—OSINT & web vulnerability scanner," in Proc. 2nd IEEE Int. Conf. Edge Comput. Appl. (ICECAA), 2023, pp. 1–8. doi: 10.1109/ICECAA58104.2023.10212207.