Abstract
In the era of digital communication, the volume and complexity of email traffic continue to rise, creating challenges in managing, filtering, and responding effectively to diverse messages. While traditional spam filters offer basic classification, they lack personalization, contextual understanding, and offline privacy guarantees. This paper presents AsthraMailGuard, a privacy-first, AI-powered command-line email assistant that classifies, filters, and responds to emails using a hybrid architecture that combines a lightweight Logistic Regression model with a fallback to Hermes 3, a powerful Large Language Model (LLM) run locally via Ollama.
The system introduces multiple innovations, including a Safe Mode for privacy enforcement, domain-based rules, a self-learning feedback loop, and personalized response drafting guided by user tone. With support for fully offline execution, AsthraMailGuard ensures complete data control while offering intelligent predictions and response generation. Evaluated on real-world email samples, it achieves a hybrid classification accuracy of approximately 93.5% together with improved usability, highlighting its potential as a reliable, scalable solution for email management in both personal and professional environments.
Introduction
Overview
Despite modern messaging tools, email remains essential in personal and professional communication. However, growing email volume mixes important and unwanted messages, overwhelming users. Traditional filters are either rule-based (rigid) or cloud-based (privacy-invasive). Users now demand intelligent, adaptable, and offline solutions.
AsthraMailGuard: Core Idea
AsthraMailGuard is a hybrid AI system that combines:
Lightweight ML (Logistic Regression) for fast, offline classification
Fallback to Local LLM (Hermes 3 via Ollama) for ambiguous cases
Safe Mode, domain-based overrides, feedback-based retraining, and tone-aware response drafting
Research Gap
Existing email classification solutions:
Lack personalization and user retraining
Are either too rigid (rule-based) or not privacy-safe (cloud-based LLMs)
Do not offer local deployment, offline operation, or context-aware replies
Objectives
Build a privacy-first, fully offline email assistant
Enable hybrid classification using ML + LLMs
Offer safe processing, domain overrides, and adaptive learning
Generate professional, tone-aware replies from CLI
Methodology
Preprocessing: Input is cleaned and normalized via the CLI
ML Classification: A Logistic Regression model classifies each email with a confidence score
LLM Fallback: Low-confidence emails are sent to Hermes 3 for deeper analysis
Safe Mode: Blocks LLM processing when an email contains sensitive data (e.g., OTPs, financial information)
Domain Overrides: Emails from known domains (e.g., @linkedin.com) are categorized automatically
Feedback Loop: User corrections are logged and used to retrain the ML model
Draft Reply Generation: Hermes 3 generates replies aligned with the user's tone preferences
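The routing logic of the pipeline above can be sketched in Python. This is a minimal illustration rather than the project's actual code: the TF-IDF + Logistic Regression classifier and the Hermes 3 call are stubbed out, and the 0.75 confidence threshold and the Safe Mode patterns are assumptions for the example.

```python
import re

# Patterns that trigger Safe Mode; this rule set is illustrative, not the project's.
SENSITIVE_PATTERNS = [
    re.compile(r"\botp\b", re.IGNORECASE),
    re.compile(r"\b\d{6}\b"),                    # bare 6-digit one-time codes
    re.compile(r"account\s+number", re.IGNORECASE),
]

CONFIDENCE_THRESHOLD = 0.75  # assumed value; tuned per deployment

def is_sensitive(text: str) -> bool:
    """Safe Mode check: True if the email must never reach the LLM."""
    return any(p.search(text) for p in SENSITIVE_PATTERNS)

def classify_ml(text: str):
    """Stand-in for the TF-IDF + Logistic Regression classifier.

    A real implementation would return the top class and its
    predict_proba() score from a trained scikit-learn pipeline.
    """
    if "winner" in text.lower():
        return "spam", 0.92
    return "inbox", 0.60

def classify_llm(text: str) -> str:
    """Stand-in for the Hermes 3 fallback served locally by Ollama."""
    return "inbox"

def route_email(text: str) -> str:
    """Hybrid routing: fast ML path, Safe Mode guard, then LLM fallback."""
    label, confidence = classify_ml(text)
    if confidence >= CONFIDENCE_THRESHOLD:
        return label                 # confident: trust the ML prediction
    if is_sensitive(text):
        return label                 # Safe Mode: do not expose the email to the LLM
    return classify_llm(text)        # ambiguous and safe: defer to the LLM
```

Note the ordering: the Safe Mode check only matters on the low-confidence path, since high-confidence emails never leave the lightweight classifier in the first place.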
System Architecture
The system follows a modular design with the following components:
CLI input interface
Preprocessing engine
Logistic Regression + TF-IDF
Confidence evaluator
LLM fallback handler
Safe mode filter
Domain rule engine
Feedback logger and retrainer
Response generator
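Among these components, the domain rule engine is simple enough to sketch directly. The domain-to-category map below is a hypothetical example, not the shipped rule set, which would be user-configurable.

```python
# Hypothetical domain-to-category rules; the real rule set is configurable.
DOMAIN_RULES = {
    "linkedin.com": "social",
    "github.com": "updates",
}

def domain_override(sender: str):
    """Return a forced category for a known sender domain, or None.

    Subdomains are matched too, so mail from
    notifications.github.com inherits the github.com rule.
    """
    domain = sender.rsplit("@", 1)[-1].lower()
    for known, category in DOMAIN_RULES.items():
        if domain == known or domain.endswith("." + known):
            return category
    return None  # no override: fall through to the ML/LLM pipeline
```

Returning None (rather than a default category) keeps the rule engine composable: the caller can consult it first and only invoke the heavier classifiers when no override applies.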
Implementation
Language: Python 3.10
Local ML: Scikit-learn + TF-IDF
Local LLM: Hermes 3 via Ollama
No cloud dependencies
Feedback-driven retraining
CLI interface with optional React frontend in progress
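The feedback-driven retraining mechanism can be sketched as follows. This is a minimal illustration under stated assumptions: corrections are appended to a JSONL log, and a retrain is signalled after a fixed batch size. The batch size of 5 and the file format are assumptions; the actual retrain step would refit the TF-IDF + Logistic Regression pipeline on the original data plus the logged corrections.

```python
import json
from pathlib import Path

RETRAIN_EVERY = 5  # assumed batch size before a retrain is triggered

def log_correction(log_path: Path, email_text: str, corrected_label: str) -> bool:
    """Append a user correction to a JSONL log; return True when a retrain is due.

    The real system would then refit the TF-IDF + Logistic Regression
    model on the original training data plus the logged corrections.
    """
    entry = {"text": email_text, "label": corrected_label}
    with log_path.open("a", encoding="utf-8") as f:
        f.write(json.dumps(entry) + "\n")
    with log_path.open(encoding="utf-8") as f:
        count = sum(1 for _ in f)
    return count % RETRAIN_EVERY == 0
```

Batching retrains rather than refitting on every correction keeps the CLI responsive, since refitting a Logistic Regression model is cheap but not free.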
Conclusion
AsthraMailGuard presents a novel hybrid framework that blends traditional machine learning with a local large language model to deliver an intelligent, privacy-first email assistant. The system classifies messages, adapts to user feedback, and optionally drafts context-aware responses, all while operating entirely offline.
Through a modular pipeline combining logistic regression, fallback LLM routing via Ollama, Safe Mode restrictions, and domain-based rules, the system achieves high accuracy while maintaining strong privacy safeguards. Evaluation on real-world emails demonstrated an effective hybrid accuracy of approximately 93.5%, with the fallback and Safe Mode mechanisms working as intended.
AsthraMailGuard bridges the gap between lightweight, rule-based filters and cloud-reliant LLM systems, offering a balanced, customizable, and ethical solution. Its CLI-first design, feedback loop, and local model retraining make it a promising prototype for scalable deployment in individual and organizational environments.
Future improvements include integrating a GUI interface, Gmail API-based automation, and memory-aware response drafting to further enhance usability and impact.