Modern knowledge workers face persistent challenges in managing multi-channel coordination tasks that span email, calendar events, and follow-up communication. Existing digital assistants are either reactive, command-driven tools or rigid rule-based automation systems, and both lack the contextual reasoning and workflow continuity needed to act autonomously on a user's behalf. This paper presents OrchestrAI, a stateful delegated communication system that combines LangGraph-based multi-step agent orchestration with a risk-aware policy engine and a human-in-the-loop approval interface. The system autonomously scans Gmail inboxes, classifies threads, extracts structured tasks and commitments, detects SLA breaches, and drafts follow-up nudges enriched with Google Calendar context. Every proposed outbound action is scored for risk, gated behind an explicit approval workflow, and recorded in an immutable audit trail. Evaluation across ten delegated-communication scenarios shows that OrchestrAI achieves a safety catch rate of 100%, compared to 40% for a one-shot LLM baseline and 0% for a rule-based system, while maintaining comparable task success rates. These results indicate that stateful orchestration combined with governance-aware action gating provides a measurably safer foundation for inbox-automation agents than single-turn or rule-driven approaches.
Introduction
Modern professionals manage communication across multiple platforms such as email, calendars, and messaging applications. However, existing tools lack the ability to perform complete cognitive workflows that involve understanding communication, identifying pending actions, drafting responses, checking schedules, assessing risks, and executing tasks with user oversight. Virtual assistants, workflow automation platforms, and standalone Large Language Model (LLM) tools are limited by their inability to maintain long-term context, reason across workflows, enforce approval mechanisms, or provide auditability.
OrchestrAI is proposed as an intelligent workflow orchestration system designed to bridge this gap. It introduces three key innovations: (1) a LangGraph-powered inbox orchestration pipeline that combines email classification, task extraction, commitment detection, and SLA monitoring into a unified workflow; (2) a risk-aware policy engine that evaluates actions and routes high-risk tasks for human approval; and (3) an immutable audit log that records all workflow state changes for transparency and accountability.
The system builds upon previous research in task-oriented dialogue systems, autonomous AI agents, and workflow orchestration while addressing their limitations in persistent state management, governance, and auditability. Unlike existing approaches, OrchestrAI integrates stateful workflow execution, risk scoring, human-in-the-loop (HITL) governance, and comprehensive auditing.
OrchestrAI uses a four-tier architecture consisting of:
An Angular-based user dashboard,
A FastAPI backend,
A LangGraph orchestration layer,
Integrations with Gmail, Google Calendar, and LLM services.
The system defines structured domain objects such as email threads, task extraction results, commitment records, draft responses, proposed actions, and audit events. Persistent data is stored in relational database tables that maintain thread tracking, tasks, commitments, nudges, actions, user credentials, and audit logs.
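As an illustration of how such domain objects might be typed, the following minimal sketch models four of them as immutable Python dataclasses. All class names, field names, and the three-level risk enumeration are assumptions made for this example; the actual OrchestrAI schema may differ.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone
from enum import Enum
from typing import Optional


class RiskLevel(str, Enum):
    """Hypothetical typed risk levels attached to proposed actions."""
    LOW = "low"
    MEDIUM = "medium"
    HIGH = "high"


@dataclass(frozen=True)
class EmailThread:
    thread_id: str
    subject: str
    participants: tuple[str, ...]
    category: Optional[str] = None  # set once the thread has been classified


@dataclass(frozen=True)
class Commitment:
    thread_id: str
    description: str
    due_at: Optional[datetime] = None  # None when no deadline was stated


@dataclass(frozen=True)
class ProposedAction:
    action_id: str
    thread_id: str
    kind: str  # illustrative value: "send_nudge"
    risk: RiskLevel
    approved: bool = False  # nothing is approved by default


@dataclass(frozen=True)
class AuditEvent:
    event_type: str
    subject_id: str
    detail: str
    created_at: datetime = field(
        default_factory=lambda: datetime.now(timezone.utc)
    )


# Usage: one thread, one proposed action, and the audit event describing it.
thread = EmailThread("t-1", "Q3 report", ("a@example.com", "b@example.com"))
action = ProposedAction("a-1", thread.thread_id, "send_nudge", RiskLevel.MEDIUM)
event = AuditEvent("action_proposed", action.action_id, "nudge drafted")
```

Declaring the records as frozen means a state change is represented by creating a new record rather than mutating an existing one, which fits an append-only audit model.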
For security and privacy, users authenticate through Google OAuth 2.0, with separate consent required for Gmail access. This ensures that email resources are accessed only with explicit authorization.
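One way to realize separate consent is to request sign-in scopes and Gmail scopes in two distinct authorization requests. The sketch below builds the two consent URLs against Google's OAuth 2.0 authorization endpoint using only the standard library. The helper name `build_consent_url`, the client ID, the redirect URI, and the choice of the read-only Gmail scope are assumptions for illustration; the paper does not state which Gmail scopes OrchestrAI requests.

```python
from urllib.parse import parse_qs, urlencode, urlparse

GOOGLE_AUTH_ENDPOINT = "https://accounts.google.com/o/oauth2/v2/auth"

# Scopes for basic sign-in only (no mailbox access).
SIGN_IN_SCOPES = ["openid", "email", "profile"]
# Assumed Gmail scope for this sketch; a deployment may need a different set.
GMAIL_SCOPES = ["https://www.googleapis.com/auth/gmail.readonly"]


def build_consent_url(client_id: str, redirect_uri: str,
                      scopes: list[str], state: str) -> str:
    """Return an authorization URL that requests only the given scopes."""
    params = {
        "client_id": client_id,
        "redirect_uri": redirect_uri,
        "response_type": "code",
        "scope": " ".join(scopes),
        "access_type": "offline",          # ask for a refresh token
        "include_granted_scopes": "true",  # keep previously granted scopes
        "state": state,                    # opaque value echoed back for CSRF checks
    }
    return f"{GOOGLE_AUTH_ENDPOINT}?{urlencode(params)}"


# Step 1: sign-in. Step 2: a later, separate request for Gmail access.
sign_in_url = build_consent_url(
    "example-client-id", "https://app.example/callback", SIGN_IN_SCOPES, "s1")
gmail_url = build_consent_url(
    "example-client-id", "https://app.example/callback", GMAIL_SCOPES, "s2")

requested = parse_qs(urlparse(gmail_url).query)["scope"]
```

Because the first URL carries no Gmail scope, a user who only signs in never grants mailbox access; that grant happens only if the second request is issued and accepted.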
OrchestrAI employs two primary workflows:
Inbox Scan Graph – Fetches and classifies emails, extracts tasks, detects commitments, monitors deadlines, and records audit events.
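The stages of the Inbox Scan Graph can be sketched as a chain of node functions over a shared state. The example below uses plain Python in place of LangGraph so that it runs without dependencies, and it substitutes keyword heuristics for the LLM calls. The state keys, node names, keyword rules, and the 24-hour SLA are all assumptions for illustration, not the system's actual logic.

```python
from datetime import datetime, timedelta, timezone
from typing import TypedDict


class ScanState(TypedDict):
    now: datetime
    emails: list[dict]
    tasks: list[str]
    commitments: list[str]
    overdue: list[str]
    audit: list[str]


SLA = timedelta(hours=24)  # hypothetical response-time target


def fetch(state: ScanState) -> ScanState:
    # Stand-in for a Gmail API call: the emails are already in the state.
    state["audit"].append(f"fetched {len(state['emails'])} emails")
    return state


def classify(state: ScanState) -> ScanState:
    # Keyword heuristic standing in for an LLM classifier; annotates emails in place.
    for email in state["emails"]:
        needs_action = "please" in email["body"].lower()
        email["category"] = "action_required" if needs_action else "informational"
    state["audit"].append("classified emails")
    return state


def extract_tasks(state: ScanState) -> ScanState:
    state["tasks"] = [e["subject"] for e in state["emails"]
                      if e["category"] == "action_required"]
    state["audit"].append(f"extracted {len(state['tasks'])} tasks")
    return state


def detect_commitments(state: ScanState) -> ScanState:
    state["commitments"] = [e["subject"] for e in state["emails"]
                            if "i will" in e["body"].lower()]
    state["audit"].append(f"detected {len(state['commitments'])} commitments")
    return state


def monitor_deadlines(state: ScanState) -> ScanState:
    state["overdue"] = [e["subject"] for e in state["emails"]
                        if e["category"] == "action_required"
                        and state["now"] - e["received_at"] > SLA]
    state["audit"].append(f"flagged {len(state['overdue'])} SLA breaches")
    return state


PIPELINE = [fetch, classify, extract_tasks, detect_commitments, monitor_deadlines]


def run_inbox_scan(emails: list[dict], now: datetime) -> ScanState:
    state: ScanState = {"now": now, "emails": emails, "tasks": [],
                        "commitments": [], "overdue": [], "audit": []}
    for node in PIPELINE:  # each node reads and updates the shared state
        state = node(state)
    return state


now = datetime(2025, 1, 10, 12, 0, tzinfo=timezone.utc)
result = run_inbox_scan([
    {"subject": "Budget sign-off", "body": "Please approve the budget.",
     "received_at": now - timedelta(hours=30)},
    {"subject": "Slides", "body": "I will send the slides on Friday.",
     "received_at": now - timedelta(hours=2)},
], now)
```

In this run the first email is classified as requiring action and, being 30 hours old, is flagged as an SLA breach, while the second is recorded as a commitment. Each node appends one audit entry, so the order of processing is recoverable from the state alone.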
A key safety feature is that the system never sends emails automatically without governance controls. Medium- and high-risk actions require explicit user approval, ensuring responsible and transparent AI-assisted communication management.
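The gating behaviour described above can be illustrated with a small deterministic policy engine. This is a sketch under stated assumptions: the scoring rule (any outbound email is at least medium risk, and email to an external recipient is high risk), the action kinds, and the class and method names are hypothetical and are not taken from the OrchestrAI implementation.

```python
from dataclasses import dataclass, field
from enum import IntEnum


class Risk(IntEnum):
    LOW = 1
    MEDIUM = 2
    HIGH = 3


@dataclass
class Action:
    kind: str        # illustrative values: "label_thread", "send_email"
    recipient: str
    external: bool   # True if the recipient is outside the user's organization


def score(action: Action) -> Risk:
    """Assumed rule: outbound email is never low risk."""
    if action.kind != "send_email":
        return Risk.LOW
    return Risk.HIGH if action.external else Risk.MEDIUM


@dataclass
class PolicyEngine:
    approval_queue: list = field(default_factory=list)
    executed: list = field(default_factory=list)
    audit: list = field(default_factory=list)

    def submit(self, action: Action) -> str:
        risk = score(action)
        if risk >= Risk.MEDIUM:
            # Medium- and high-risk actions wait for an explicit user decision.
            self.approval_queue.append(action)
            self.audit.append(f"queued {action.kind} ({risk.name})")
            return "pending_approval"
        self.executed.append(action)
        self.audit.append(f"executed {action.kind} ({risk.name})")
        return "executed"

    def approve(self, action: Action) -> None:
        # Raises ValueError if the action was never queued.
        self.approval_queue.remove(action)
        self.executed.append(action)
        self.audit.append(f"approved {action.kind}")


engine = PolicyEngine()
label = Action("label_thread", "", external=False)
nudge = Action("send_email", "client@example.com", external=True)

label_status = engine.submit(label)  # low risk, runs immediately
nudge_status = engine.submit(nudge)  # high risk, held for approval
engine.approve(nudge)                # user approves; the action is released
```

Because scoring is a pure function of the action, the same action always receives the same risk level, and no email reaches the executed list without first passing through the approval queue.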
Conclusion
This paper has presented OrchestrAI, a delegated communication system that addresses the gap between single-turn LLM assistants and production-grade autonomous agents through stateful workflow orchestration, risk-aware action gating, and human-in-the-loop governance. Three claims are supported by design and evaluation evidence: (i) stateful agent orchestration using LangGraph is more appropriate for inbox-management tasks than single-shot prompting; (ii) a deterministic policy engine enforcing typed risk levels and approval queuing achieves a 100% safety catch rate, outperforming both rule-based and one-shot LLM alternatives; and (iii) an immutable audit trail makes the system's behaviour inspectable and explainable, a prerequisite for user trust in any system handling personal communication. OrchestrAI contributes a concrete, open architecture for governance-aware agentic systems and a reproducible evaluation protocol for benchmarking delegated-communication agents.
References
[1] Google LLC, "Google Assistant Developer Documentation," Google AI Publications, 2024. [Online]. Available: https://developers.google.com/assistant
[2] Microsoft Corporation, "Microsoft Power Automate Documentation," Microsoft Learn, 2024. [Online]. Available: https://learn.microsoft.com/en-us/power-automate
[3] OpenAI, "Large Language Models for Tool Use and Multi-Step Reasoning," OpenAI Technical Reports, 2024.
[4] J. Qin, L. Chen, and M. Zhou, "End-to-End Task-Oriented Dialogue Systems: A Comprehensive Survey," in Proc. EMNLP, 2023, pp. 1–25.
[5] Z. Chen and W. Xu, "MCAN: Multi-Channel Autonomous Negotiation Agents," in Proc. IJCAI, 2022, pp. 3802–3808.
[6] K. Chawla et al., "CaSiNo: A Corpus of Campsite Negotiation Dialogues for Automated Negotiation Systems," in Proc. NAACL-HLT, 2021, pp. 3167–3185.
[7] H. He, D. Chen, A. Balakrishnan, and P. Liang, "Decoupling Strategy and Generation in Negotiation Dialogues," in Proc. EMNLP, 2018, pp. 2333–2343.
[8] Y. Yang, Y. Li, and X. Zhao, "GNOME: Goal-Oriented Negotiation with Open-Domain Language Models," in Proc. ACL Findings, 2024, pp. 1024–1036.
[9] F. Piccialli et al., "AgentAI: A Survey of Autonomous AI Agents," Information Fusion, Elsevier, vol. 108, 2025, Art. no. 102366.
[10] LangChain Technologies, "LangGraph: Stateful Workflow Orchestration for LLM Agents," 2024. [Online]. Available: https://langchain-ai.github.io/langgraph
[11] H. Kaewtawee and N. Wattanapongsakorn, "Cloning Conversational Voice AI Agents from Call Corpora," arXiv preprint arXiv:2502.09123, 2025.
[12] E. Perez et al., "Risks from Learned Optimization in Advanced Machine Learning Systems," in Proc. ICML Workshop on Safety and Robustness in Decision-Making, 2023.