Abstract
The rapid growth of large language models has increased the presence of AI-generated text across social media, making it increasingly difficult to distinguish automated content from human contributions. This raises concerns about misinformation, artificial engagement, and the integrity of online discussion, creating a need for dependable detection mechanisms. In response, this project develops a web-based system that determines whether a given text sample originates from a human user or an AI model. Development proceeds in stages, beginning with the cleaning and preprocessing of social media text using Natural Language Processing methods, including contextual filtering and sentiment-aware analysis. Classification is carried out with transformer-based models such as BERT, and hybrid configurations with LSTM or CNN layers are explored to capture deeper linguistic and sequential features. To support large-scale and multilingual use, the system is backed by a scalable architecture capable of real-time processing. An interactive interface lets users review predictions, observe analytical trends, and navigate the system efficiently. Explainable AI tools further clarify model behavior by highlighting the textual elements that influence each decision.
Introduction
Advances in artificial intelligence, particularly large language models (LLMs), have made it increasingly difficult to distinguish AI-generated text from human-written content, creating challenges in education, social media, and news environments. Traditional plagiarism detection methods are ineffective here because AI text is generated from learned patterns rather than copied from existing sources. To address this, this work proposes a real-time text classification system that identifies whether a given text is human-authored or AI-generated.
The system is built around transformer-based models, primarily BERT, to extract deep contextual representations. Several architectures are explored, including a fine-tuned BERT classifier, a BERT–LSTM hybrid, and an ensemble approach combining BERT embeddings with a Random Forest classifier. The ensemble model is selected for deployment due to its superior stability, robustness, and confidence consistency, especially for ambiguous inputs. The framework supports multilingual text, real-time processing, and scalable deployment.
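The deployed ensemble can be sketched as follows. This is a minimal illustration, not the authors' implementation: the `embed` function is a deterministic hash-seeded placeholder standing in for BERT [CLS] embeddings (so the sketch runs without the `transformers` library), and the toy corpus and labels are invented for the demo.

```python
import hashlib
import numpy as np
from sklearn.ensemble import RandomForestClassifier

def embed(texts, dim=64):
    """Placeholder for BERT [CLS] sentence embeddings: deterministic
    hash-seeded random vectors, used only so this sketch is runnable."""
    vecs = []
    for t in texts:
        seed = int(hashlib.sha256(t.encode()).hexdigest()[:8], 16)
        vecs.append(np.random.default_rng(seed).standard_normal(dim))
    return np.vstack(vecs)

# Toy labelled corpus (illustrative only): 1 = AI-generated, 0 = human-written.
train_texts = ["the model generates fluent text", "as an ai language model",
               "lol that game was wild", "brb grabbing coffee"]
train_labels = [1, 1, 0, 0]

# Random Forest trained on the (placeholder) sentence embeddings.
clf = RandomForestClassifier(n_estimators=100, random_state=0)
clf.fit(embed(train_texts), train_labels)

# Class probabilities double as the confidence score shown in the interface.
probs = clf.predict_proba(embed(["as an ai language model"]))[0]
```

In the real pipeline, `embed` would instead run each text through a fine-tuned BERT model and return its pooled hidden state; the Random Forest stage is unchanged.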
An interactive web-based interface built with Flask and Dash allows users to submit text, view predictions and confidence scores, and monitor system behavior. To ensure transparency, Explainable AI techniques such as SHAP are integrated to highlight the words that most influence each classification decision. A feedback module lets users confirm or correct predictions, supporting continuous improvement.
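SHAP itself needs the trained model and the `shap` library, but the underlying idea of token-level attribution can be illustrated with a simple leave-one-out scheme: re-score the text with each token removed and treat the score drop as that token's influence. The marker-word scorer below is a hypothetical stand-in for the classifier's AI-class probability, invented for this demo.

```python
def token_influence(text, score_fn):
    """Leave-one-out attribution: a token's influence is the change in
    the model score when that token is removed (assumes distinct tokens)."""
    tokens = text.split()
    base = score_fn(text)
    return {tok: base - score_fn(" ".join(tokens[:i] + tokens[i + 1:]))
            for i, tok in enumerate(tokens)}

# Hypothetical scorer standing in for P(AI-generated | text): the
# fraction of tokens that are stereotypical "AI-like" marker words.
MARKERS = {"furthermore", "moreover", "delve"}

def toy_score(text):
    words = text.split()
    return sum(w.lower() in MARKERS for w in words) / max(len(words), 1)

influence = token_influence("Furthermore we delve into the results", toy_score)
```

In the deployed system, `score_fn` would be the ensemble's predicted probability, and SHAP replaces the leave-one-out loop with proper Shapley-value estimates; the interface then highlights the highest-influence tokens.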
Experimental results show that the BERT + Random Forest ensemble outperforms standalone and hybrid models in accuracy, reliability, and interpretability. Overall, the proposed system offers a practical, transparent, and scalable solution for distinguishing AI-generated text from human writing in modern digital environments.
Conclusion
This work presents a robust and real-time system for distinguishing between human-written and AI-generated text using a hybrid architecture that integrates a fine-tuned BERT model with a Random Forest classifier. The ensemble approach enhances prediction reliability and ensures consistent performance across diverse textual inputs. A key contribution of this work is the incorporation of SHAP-based explainability, which provides transparent, token-level insights into model decisions and strengthens user trust in the system’s outputs.
The system is complemented by a real-time interactive dashboard developed using Dash, enabling clear visualization of predictions, confidence scores, and model behavior. A feedback mechanism allows users to validate or correct results, supporting future model refinement. With a scalable Flask backend and an intuitive frontend, the framework demonstrates strong practical applicability in content moderation, academic integrity verification, and social media analytics.