OWN-GPT is a 2.3-billion-parameter transformer-based Large Language Model (LLM) developed as a full-stack AI system featuring a React.js frontend and a Python FastAPI backend. The system provides conversational text generation, image-based question answering using OCR and visual feature extraction, and PDF document question answering through Retrieval-Augmented Generation (RAG). The backend incorporates PyTorch, HuggingFace Transformers, PyMuPDF, LangChain, FAISS, sentence-transformers, OpenCV, Tesseract, and ViT-Base/16 for multimodal processing. Trained on an 81 GB multi-domain corpus (~40 billion tokens), OWN-GPT achieves a perplexity of 7.8, a BLEU score of 0.68, an image-module character error rate (CER) of 6.8%, and RAG-module faithfulness and relevancy scores of 0.81 and 0.79, respectively. Designed for local deployment in Indian academic institutions, the system ensures data privacy and zero per-query operational cost. A user study reports a System Usability Scale (SUS) score of 77.1, indicating strong usability and user satisfaction.
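For concreteness, the following Python sketch outlines the PDF-RAG retrieval path summarized above, combining PyMuPDF text extraction, sentence-transformers embeddings, and a FAISS index. It is a minimal illustration only: the embedding model (all-MiniLM-L6-v2), the character-based chunking, the chunk size, and the top-k value are assumptions, not OWN-GPT's reported configuration.

import fitz  # PyMuPDF
import faiss
from sentence_transformers import SentenceTransformer

def build_index(pdf_path, chunk_size=500):
    # Extract raw text from every page with PyMuPDF.
    doc = fitz.open(pdf_path)
    text = " ".join(page.get_text() for page in doc)
    # Naive fixed-size character chunking (assumed; the actual chunker is unspecified).
    chunks = [text[i:i + chunk_size] for i in range(0, len(text), chunk_size)]
    # Embed chunks and store them in an exact L2 FAISS index.
    embedder = SentenceTransformer("all-MiniLM-L6-v2")  # assumed embedding model
    vectors = embedder.encode(chunks, convert_to_numpy=True)
    index = faiss.IndexFlatL2(vectors.shape[1])
    index.add(vectors)
    return index, chunks, embedder

def retrieve(query, index, chunks, embedder, k=3):
    # Return the k chunks nearest to the query; these ground the generator's answer.
    q_vec = embedder.encode([query], convert_to_numpy=True)
    _, ids = index.search(q_vec, k)
    return [chunks[i] for i in ids[0]]

In the standard RAG pattern the abstract describes, the retrieved chunks would then be prepended to the user's question as context for the language model.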
Introduction
OWN-GPT is a locally deployable, multimodal Large Language Model (LLM) designed for academic institutions to ensure data privacy, avoid recurring cloud costs, and provide on-premise AI capabilities. It integrates conversational AI, image question answering via OCR and Vision Transformers, and PDF-based Retrieval-Augmented Generation (RAG). The system uses a FastAPI backend, a React.js frontend, and PyTorch-based transformer models trained on an 81 GB multi-domain corpus. Evaluation shows strong language generation, OCR-integrated image understanding, and accurate PDF retrieval, along with high user satisfaction, demonstrating an efficient, privacy-preserving, and practical AI platform for institutional deployment.
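To illustrate the OCR half of the image question-answering module, the sketch below applies OpenCV preprocessing before Tesseract extraction. The grayscale conversion and Otsu thresholding steps are assumptions chosen as common OCR practice; the preprocessing OWN-GPT actually performs is not specified here.

import cv2
import pytesseract

def extract_image_text(image_path):
    # Load the image and convert to grayscale for OCR.
    img = cv2.imread(image_path)
    gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
    # Otsu thresholding (assumed step) tends to sharpen printed text for Tesseract.
    _, binary = cv2.threshold(gray, 0, 255, cv2.THRESH_BINARY + cv2.THRESH_OTSU)
    return pytesseract.image_to_string(binary)

In the full pipeline, text extracted this way would be combined with ViT-derived visual features before being passed to the language model.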
Conclusion
OWN-GPT demonstrates that a fully local, multimodal LLM system can be successfully developed and deployed within academic institutions. Its strong performance across text generation, OCR-based image understanding, and PDF-RAG document analysis makes it a valuable tool for teaching, research, and administration. Cost-free operation after deployment, combined with on-premise data privacy, addresses two major barriers faced by educational organizations.