This paper describes the development of an AI-powered chatbot designed to improve information retrieval from college websites. Traditional keyword-based search matches terms rather than meaning, so it often returns inaccurate or incomplete results when a query is phrased differently from the source text. To address this issue, the proposed system uses Natural Language Processing (NLP) and Large Language Models (LLMs) within a Retrieval-Augmented Generation (RAG) framework. This approach combines semantic retrieval with language generation so that responses are grounded in institutional data.
The literature review traces the evolution of chatbots from rule-based systems to transformer-based models such as BERT and GPT. Modern generative models improve conversational ability, but they can produce fluent yet incorrect statements, known as hallucinations. RAG systems mitigate this problem by retrieving relevant documents before generating a response, so the output is conditioned on source text rather than on the model's parameters alone. The review also identifies semantic embeddings and vector similarity measures as effective techniques for finding relevant information.
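The retrieve-then-generate pattern described above can be sketched as follows. This is a minimal illustration, not the paper's implementation: the names build_grounded_prompt and answer, the prompt wording, and the top_k default are all assumptions, and the retriever and generator are passed in as plain callables so that any search backend or language model could be substituted.

```python
def build_grounded_prompt(question, passages):
    """Assemble a prompt that restricts the model to the retrieved passages.

    The prompt wording is illustrative, not taken from the paper.
    """
    if passages:
        # Number each passage so the generated answer can refer back to it.
        context = "\n".join(f"[{i}] {p}" for i, p in enumerate(passages, start=1))
    else:
        context = "(no relevant passages were retrieved)"
    return (
        "Answer the question using only the context below. "
        "If the context does not contain the answer, say that you do not know.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {question}\n"
        "Answer:"
    )


def answer(question, retriever, generator, top_k=3):
    """Retrieve first, then generate.

    retriever: hypothetical callable mapping a question to a ranked list of passages.
    generator: hypothetical callable mapping a prompt string to generated text.
    """
    passages = retriever(question)[:top_k]
    return generator(build_grounded_prompt(question, passages))
```

Telling the model to admit when the context lacks an answer is one common way to discourage hallucinated responses. It reduces the risk but does not eliminate it.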
The methodology describes a modular system architecture with five stages: data collection, text processing, semantic embedding, retrieval, and response generation. Institutional information is gathered through web scraping, split into smaller text chunks, and transformed into vector embeddings using transformer-based models. These embeddings allow a user query to be matched to relevant chunks by meaning rather than by exact wording, and the retrieved chunks supply the context for response generation.
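The chunking, embedding, and retrieval stages can be sketched with the standard library alone. The embed function below is a toy word-count vector standing in for the transformer-based embedding model, which this sketch does not load. The chunk sizes, the function names, and the sample sentences are assumptions made for illustration. Ranking uses cosine similarity, the measure the paper names in its conclusion.

```python
import math
import re
from collections import Counter


def chunk_text(text, max_words=40, overlap=10):
    """Split text into overlapping word windows (sizes are illustrative)."""
    if overlap >= max_words:
        raise ValueError("overlap must be smaller than max_words")
    words = text.split()
    step = max_words - overlap
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + max_words]))
        if start + max_words >= len(words):
            break  # the last window already reaches the end of the text
    return chunks


def embed(text):
    """Toy stand-in for a transformer embedding: a sparse word-count vector."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine_similarity(a, b):
    """Cosine of the angle between two sparse vectors stored as Counters."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def retrieve(query, chunks, top_k=2):
    """Return the top_k (score, chunk) pairs ranked by cosine similarity."""
    query_vec = embed(query)
    scored = [(cosine_similarity(query_vec, embed(c)), c) for c in chunks]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored[:top_k]


if __name__ == "__main__":
    # Hypothetical scraped sentences, used only to exercise the functions.
    pages = [
        "Admissions open in June for undergraduate programs.",
        "The library is open from 9 am to 8 pm on weekdays.",
        "Hostel fees are paid each semester.",
    ]
    for score, chunk in retrieve("When is the library open?", pages, top_k=1):
        print(f"{score:.2f}  {chunk}")
```

A word-count vector only matches shared vocabulary, so it does not capture meaning the way a sentence-embedding model does. Replacing embed with a real model would leave the chunking and ranking logic unchanged, provided cosine_similarity were adapted to dense vectors.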
Conclusion
This paper presented the design and implementation of an AI chatbot for retrieving institutional information using a Retrieval-Augmented Generation architecture. The system integrates web scraping, semantic embeddings, cosine similarity retrieval, and a local Large Language Model to deliver context-aware responses to user queries. The results demonstrate that combining semantic retrieval with generative models provides a more effective approach to institutional information access than traditional keyword-based search systems. The modular architecture of the system allows for future scalability and integration with advanced vector databases and cloud-based language models.
Future work may include adopting a scalable vector indexing library such as FAISS, integrating cloud-based LLM APIs for improved response quality, and building automated data ingestion pipelines to keep the knowledge base up to date.