A Hybrid Intelligent Framework for Library Demand Forecasting and Learning Path Recommendation Using Knowledge Graphs and Retrieval-Augmented Generative AI
Authors: Mrs. Ilakkia, Harija R, Daamini A, Lokitha K
Libraries today generate large volumes of transactional data through book issue and return activities, yet most existing management systems lack the intelligence to translate this data into actionable insights. This paper presents a Smart Library Usage and Demand Forecasting System, a cloud-native, end-to-end platform built on Amazon Web Services (AWS) that processes book transactions in real time, stores and analyzes data at scale, visualizes usage trends, and applies machine learning to forecast book demand. The architecture integrates API Gateway and AWS Lambda for serverless transaction handling, Amazon S3 for tiered data storage, AWS Glue and Apache Spark for data transformation, Amazon Athena for ad-hoc analytics, and Amazon QuickSight for interactive dashboards. An advanced knowledge graph layer maps book themes and authors into a semantic network, enabling a Generative AI assistant to recommend personalized Learning Paths rather than isolated book titles. Experimental results demonstrate significant improvements in prediction accuracy, operational efficiency, and patron engagement, offering a scalable blueprint for next-generation library management.
Introduction
This paper describes the development of a Smart Library Usage and Demand Forecasting System that modernizes traditional library management using cloud computing, machine learning, and generative artificial intelligence. Traditional library systems rely mainly on relational databases and manual cataloging, which struggle to meet modern user expectations such as instant access, personalized recommendations, and intelligent digital services.
The proposed system aims to transform library management from a reactive process into a predictive and intelligent platform. It uses a layered AWS-based cloud architecture to capture and process library transactions, forecast book demand, and provide personalized learning recommendations through AI-powered assistance.
Previous work spans digital library systems, cloud-based platforms, machine learning forecasting, knowledge graphs, and generative AI. Earlier systems improved operational efficiency but lacked predictive analytics and personalization. Forecasting models such as ARIMA and LSTM demonstrated the ability to predict book demand, while knowledge graphs improved recommendation quality. However, the integration of real-time data pipelines, forecasting, knowledge graphs, and conversational generative AI into a unified platform has remained largely unexplored.
The proposed system architecture consists of several scalable cloud-native layers:
Ingestion Layer:
Amazon API Gateway and AWS Lambda functions handle user activities such as book issue requests, returns, and search queries. The system validates and enriches transaction data before storing it in Amazon S3.
Storage Layer:
Amazon S3 is used for both operational and analytical storage. Raw transaction records are stored temporarily before being archived, while transformed analytical data is stored in optimized Parquet format for efficient querying and cost reduction.
Processing Layer:
AWS Glue and Apache Spark perform ETL operations, including schema validation, data quality checks, metadata enrichment, and feature generation for machine learning models.
Analytics and Visualization Layer:
Amazon Athena enables serverless SQL queries on analytical data, while Amazon QuickSight provides dashboards displaying circulation trends, genre popularity, patron behavior, and demand forecasts.
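As an illustration of the Ingestion Layer described above, the following is a minimal sketch of a Lambda-style handler that validates and enriches one transaction before it is written to raw storage. The field names, the allowed transaction types, the date-partitioned key layout, and the put_object callable (a stand-in for the S3 write) are all assumptions made for this example and are not taken from the deployed system. Search queries would need their own schema and are left out.

```python
import json
import uuid
from datetime import datetime, timezone

# Assumed field names and type labels; the paper does not fix them.
REQUIRED_FIELDS = ("patron_id", "isbn", "transaction_type", "branch")
ALLOWED_TYPES = {"issue", "return"}


def _response(status, payload):
    """Build an API Gateway proxy-style response."""
    return {"statusCode": status, "body": json.dumps(payload)}


def build_raw_key(record):
    """Date-partitioned object key for a raw record (layout is an assumption)."""
    year, month, day = record["timestamp"][:10].split("-")
    return f"raw/year={year}/month={month}/day={day}/{record['transaction_id']}.json"


def handler(event, context=None, put_object=None, now=None):
    """Validate and enrich one transaction received through API Gateway.

    put_object is a hypothetical callable (key, body) standing in for the
    S3 write; now can be injected to make the timestamp deterministic.
    """
    try:
        payload = json.loads(event.get("body") or "{}")
    except json.JSONDecodeError:
        return _response(400, {"error": "invalid JSON"})
    if not isinstance(payload, dict):
        return _response(400, {"error": "body must be a JSON object"})

    missing = [f for f in REQUIRED_FIELDS if not payload.get(f)]
    if missing:
        return _response(400, {"error": "missing fields", "fields": missing})
    if payload["transaction_type"] not in ALLOWED_TYPES:
        return _response(400, {"error": "unknown transaction_type"})

    # Enrichment: server-side identifier and UTC timestamp.
    now = now or datetime.now(timezone.utc)
    record = dict(payload, transaction_id=str(uuid.uuid4()), timestamp=now.isoformat())
    key = build_raw_key(record)
    if put_object is not None:
        put_object(key, json.dumps(record))
    return _response(201, {"transaction_id": record["transaction_id"], "key": key})
```

Passing the storage write in as a callable keeps the validation logic testable without cloud credentials; in a deployment it would presumably wrap the S3 client.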
The data pipeline enforces a fixed transaction schema. Each record contains a transaction ID, patron ID, ISBN, transaction type, timestamp, and library branch. Data quality rules ensure integrity and prevent corrupt records from entering the analytical layer.
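The record layout and quality rules above could be expressed roughly as follows. This is a sketch under stated assumptions: the field names, the specific rules (ISBN-13 checksum, known transaction type, parseable timestamp, unique transaction ID), and the helper names are illustrative choices, since the paper does not enumerate its rules.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass(frozen=True)
class Transaction:
    transaction_id: str
    patron_id: str
    isbn: str
    transaction_type: str
    timestamp: str  # ISO 8601 string
    branch: str


def is_valid_isbn13(isbn):
    """Check the ISBN-13 checksum (weights alternate 1 and 3, total divisible by 10)."""
    digits = isbn.replace("-", "")
    if len(digits) != 13 or not (digits.isascii() and digits.isdigit()):
        return False
    total = sum(int(d) * (3 if i % 2 else 1) for i, d in enumerate(digits))
    return total % 10 == 0


def violations(tx):
    """Return the list of rule violations for one record (empty means clean)."""
    problems = []
    if not tx.transaction_id:
        problems.append("missing transaction_id")
    if not tx.patron_id:
        problems.append("missing patron_id")
    if not is_valid_isbn13(tx.isbn):
        problems.append("invalid isbn")
    if tx.transaction_type not in {"issue", "return"}:  # assumed labels
        problems.append("unknown transaction_type")
    try:
        datetime.fromisoformat(tx.timestamp)
    except ValueError:
        problems.append("unparseable timestamp")
    if not tx.branch:
        problems.append("missing branch")
    return problems


def split_clean(records):
    """Separate clean records from rejected ones, also rejecting duplicate IDs."""
    clean, rejected, seen = [], [], set()
    for tx in records:
        problems = violations(tx)
        if tx.transaction_id in seen:
            problems.append("duplicate transaction_id")
        if problems:
            rejected.append((tx, problems))
        else:
            seen.add(tx.transaction_id)
            clean.append(tx)
    return clean, rejected
```

Keeping the rejected records together with their reasons, rather than dropping them silently, makes it possible to audit why a record never reached the analytical layer.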
Demand forecasting is based on features extracted from transaction data: historical issue counts, rolling averages, seasonal trends, academic calendar indicators, genre information, and patron ratings. After comparing several models, including ARIMA, Prophet, LSTM, and XGBoost, the XGBoost model was selected because it achieved the best forecasting performance with low prediction error. The model predicts future demand for each book, and the system automatically generates a procurement recommendation when forecasted demand exceeds the number of available copies.
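Two parts of this step lend themselves to a short sketch: deriving lag and rolling-average features from a weekly issue series, and applying the procurement rule to a set of forecasts. The weekly granularity, the four-week window, the feature names, and the choice to recommend exactly the shortfall are assumptions made for illustration. Model training itself (XGBoost in the paper) is not shown.

```python
import math


def rolling_mean(series, window):
    """Trailing mean of the previous `window` values, excluding the current one."""
    out = []
    for i in range(len(series)):
        past = series[max(0, i - window):i]
        out.append(sum(past) / len(past) if past else None)
    return out


def build_features(weekly_issues, semester_weeks, window=4):
    """Turn a weekly issue-count series into rows of features plus a target.

    semester_weeks is a set of week indices that fall inside teaching
    periods, used here as a simple academic calendar indicator.
    """
    means = rolling_mean(weekly_issues, window)
    rows = []
    for week, count in enumerate(weekly_issues):
        if week == 0:
            continue  # the first week has no history to learn from
        rows.append({
            "week": week,
            "lag_1": weekly_issues[week - 1],
            "rolling_mean": means[week],
            "in_semester": int(week in semester_weeks),
            "target": count,
        })
    return rows


def procurement_recommendations(forecast, copies_available):
    """Recommend extra copies wherever forecasted demand exceeds holdings."""
    recommendations = {}
    for isbn, demand in forecast.items():
        shortfall = math.ceil(demand) - copies_available.get(isbn, 0)
        if shortfall > 0:
            recommendations[isbn] = shortfall
    return recommendations
```

The rolling mean deliberately excludes the current week so that no feature leaks the value the model is asked to predict.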
The system also integrates a personalized knowledge graph and Generative AI assistant. The knowledge graph stores relationships between books, authors, subjects, and learning competencies. Using this structured knowledge, the AI assistant can generate personalized, context-aware learning paths and recommendations rather than simple single-book suggestions.
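One way to picture how a knowledge graph yields a sequenced learning path rather than a single title is a prerequisite-ordered traversal. The sketch below is not the paper's implementation: the triple format, the relation names "teaches" and "requires", the sample titles, and the rule of picking the first book listed for each competency are all hypothetical. The resulting ordered list is the kind of structured context that could be handed to the generative assistant.

```python
# (subject, relation, object) triples; titles and relation names are illustrative.
TRIPLES = [
    ("Python Basics", "teaches", "programming"),
    ("Statistics Primer", "teaches", "statistics"),
    ("Intro to Machine Learning", "teaches", "machine learning"),
    ("Intro to Machine Learning", "requires", "programming"),
    ("Intro to Machine Learning", "requires", "statistics"),
    ("Deep Learning in Practice", "teaches", "deep learning"),
    ("Deep Learning in Practice", "requires", "machine learning"),
]


def index(triples):
    """Index the graph by competency taught and by book prerequisites."""
    teaches, requires = {}, {}
    for subject, relation, obj in triples:
        if relation == "teaches":
            teaches.setdefault(obj, []).append(subject)
        elif relation == "requires":
            requires.setdefault(subject, []).append(obj)
    return teaches, requires


def learning_path(target, triples, known=()):
    """Order books so every prerequisite competency is covered before it is needed.

    known lists competencies the patron already has; those are skipped.
    """
    teaches, requires = index(triples)
    path, done, in_progress = [], set(known), set()

    def visit(competency):
        if competency in done:
            return
        if competency in in_progress:
            raise ValueError(f"cyclic prerequisite at {competency!r}")
        if competency not in teaches:
            raise KeyError(f"no book teaches {competency!r}")
        in_progress.add(competency)
        book = teaches[competency][0]  # first listed book; ranking is out of scope
        for prerequisite in requires.get(book, []):
            visit(prerequisite)
        in_progress.discard(competency)
        done.add(competency)
        path.append(book)

    visit(target)
    return path


def to_prompt_context(path):
    """Format the path as plain text that an assistant prompt could include."""
    lines = [f"{i}. {title}" for i, title in enumerate(path, start=1)]
    return "Recommended reading order:\n" + "\n".join(lines)
```

For example, learning_path("deep learning", TRIPLES, known=("programming",)) skips the programming title and starts the path at the statistics book.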
Overall, the Smart Library Usage and Demand Forecasting System combines cloud-native infrastructure, machine learning, analytics, and generative AI to create an intelligent library platform. The system improves operational efficiency, enhances user experience, supports proactive resource planning, and provides personalized educational guidance for library patrons.
Conclusion
This paper presented a Smart Library Usage and Demand Forecasting System that unifies serverless cloud ingestion, scalable data pipelines, machine learning forecasting, and generative AI-driven recommendations into a production-ready platform. By building on AWS services including API Gateway, Lambda, S3, Glue, Athena, and QuickSight, together with Apache Spark, the system achieves the scalability, cost efficiency, and operational simplicity required for institutional deployment.
The ML demand forecasting module demonstrated practical accuracy, correctly anticipating semester-driven demand surges with sufficient lead time to inform procurement decisions. The knowledge graph and Learning Path assistant represent a meaningful advancement over conventional library recommenders, shifting the patron experience from reactive search to guided, sequenced learning.
Future work will explore federated learning to aggregate circulation signals across institutions without sharing raw patron data, integration of multimodal book content embeddings for richer similarity computation, and the extension of the knowledge graph to support interlibrary loan optimization. The open architecture described herein provides a solid foundation for these extensions and for the broader mission of transforming libraries into intelligent knowledge discovery platforms.
References
[1] R. Breeding, “Library systems report 2022,” American Libraries, vol. 53, no. 5, pp. 22–35, May 2022.
[2] A. Kumar and S. Mehta, “Migrating academic library systems to the cloud: A case study,” Journal of Library & Information Technology, vol. 41, no. 3, pp. 180–188, 2021.
[3] Y. Zhang, L. Chen, and W. Liu, “Book demand forecasting using LSTM neural networks in public libraries,” in Proc. IEEE Int. Conf. Big Data, 2020, pp. 1452–1459.
[4] F. Liang, X. Wang, and H. Zhao, “Building a library knowledge graph from bibliographic metadata,” Information Processing & Management, vol. 58, no. 2, 2021.
[5] T. Brown et al., “Language models are few-shot learners,” in Advances in Neural Information Processing Systems, vol. 33, pp. 1877–1901, 2020.
[6] Amazon Web Services, “AWS Lambda Developer Guide,” [Online]. Available: https://docs.aws.amazon.com/lambda/
[7] Amazon Web Services, “Amazon QuickSight User Guide,” [Online]. Available: https://docs.aws.amazon.com/quicksight/
[8] O. Vinyals and Q. Le, “A neural conversational model,” arXiv preprint arXiv:1506.05869, 2015.
[9] P. Rajpurkar, J. Zhang, K. Lopyrev, and P. Liang, “SQuAD: 100,000+ questions for machine comprehension of text,” in Proc. EMNLP, 2016, pp. 2383–2392.
[10] S. Merity, C. Xiong, J. Bradbury, and R. Socher, “Pointer sentinel mixture models,” in Proc. ICLR, 2017.