Book recommendation systems enhance the user experience by suggesting books tailored to individual preferences. Traditional approaches, such as collaborative filtering and content-based filtering, have limitations, including cold-start issues and a lack of diversity in recommendations. This paper proposes a hybrid book recommendation system that integrates collaborative filtering with popularity-based filtering to generate personalized yet diverse book suggestions. The system uses vector similarity between books' user-rating vectors for personalized recommendations and user engagement metrics (rating counts and average ratings) to identify popular books. The proposed model is evaluated on recommendation effectiveness, diversity, and user engagement. Results indicate that the hybrid approach improves personalization while mitigating the limitations of the individual filtering methods. Future work includes integrating deep learning techniques and natural language processing (NLP) for further optimization.
Introduction
Overview
This research presents a hybrid book recommendation system that combines collaborative filtering and popularity-based filtering to improve accuracy, personalization, and diversity in book suggestions. It aims to overcome the limitations of traditional methods (the cold-start problem, lack of diversity, and data sparsity) by integrating different filtering strategies and serving them through a modular web application.
Key Components and Methods
Recommendation Techniques
Content-Based Filtering: Suggests books based on a user's past preferences and book metadata (e.g., genre, author). It tends to produce recommendations with limited diversity and suffers from cold-start issues for new users.
Collaborative Filtering: Uses ratings from similar users to predict preferences. It is powerful but affected by data sparsity and requires a large volume of interactions.
Popularity-Based Filtering: Recommends highly rated or frequently rated books. It is simple and effective for surfacing trending books but lacks personalization.
Hybrid Approach: The proposed system combines popularity-based filtering (to recommend highly rated books) and collaborative filtering (to personalize suggestions based on user behavior).
System Implementation
Frontend: Built with React and Tailwind CSS for user interaction.
Backend: Implemented in Flask (Python) to handle recommendation logic.
Microservice (Express + MongoDB): Manages user authentication, book data, and reading history.
Data and Preprocessing
Dataset Source: Kaggle Book Recommendation Dataset.
Books Dataset: Contains book metadata (ISBN, title, author, etc.).
Ratings Dataset: Used to understand user preferences (filtered to include books with ≥50 ratings and users with ≥200 ratings).
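Users Dataset: Identifies the users, so that the analysis can focus on experienced ones (those with ≥200 ratings).
Preprocessing: Merged the datasets, normalized ratings, and filtered out sparsely rated books and low-activity users to make the recommendations more reliable.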
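The threshold filtering described above can be sketched in plain Python. This is a minimal illustration, not the system's actual preprocessing code: the function name `filter_ratings`, the tuple layout `(user_id, isbn, rating)`, and the order of the two steps (users first, then books) are assumptions, since the source only states the thresholds.

```python
from collections import Counter


def filter_ratings(ratings, min_user_ratings=200, min_book_ratings=50):
    """Keep ratings from active users on sufficiently rated books.

    ratings: list of (user_id, isbn, rating) tuples (hypothetical layout).
    """
    # Step 1: keep only users with at least `min_user_ratings` ratings.
    user_counts = Counter(user for user, _, _ in ratings)
    active = [r for r in ratings if user_counts[r[0]] >= min_user_ratings]

    # Step 2: among those, keep books with at least `min_book_ratings` ratings.
    # Counting after step 1 is an assumption; the source does not give the order.
    book_counts = Counter(isbn for _, isbn, _ in active)
    return [r for r in active if book_counts[r[1]] >= min_book_ratings]


# Toy data with tiny thresholds so the effect is visible.
sample = [
    ("u1", "b1", 8), ("u1", "b2", 6), ("u1", "b3", 9),
    ("u2", "b1", 7), ("u2", "b2", 5),
    ("u3", "b1", 10),
]
kept = filter_ratings(sample, min_user_ratings=2, min_book_ratings=2)
# u3 (one rating) and b3 (one remaining rating) are dropped.
```

With the thresholds reported in the paper, the call would be `filter_ratings(ratings)` using the defaults of 200 and 50.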
Filtering Criteria
Book Threshold: Books with fewer than 50 ratings are excluded.
User Threshold: Only users with ≥200 ratings are considered.
Popularity Filtering: Books with at least 250 ratings are ranked by average rating and shown in the Top Books section.
Vector Representation: Books are represented in an 810-dimensional space for similarity analysis using Euclidean distance.
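One way to read the vector representation is that each book becomes a vector with one entry per retained user, which would make the 810 dimensions correspond to 810 users. The sketch below builds such vectors under that reading. The name `build_book_vectors` and the choice to fill unrated entries with 0.0 are assumptions; the source does not say how missing ratings are handled.

```python
def build_book_vectors(ratings):
    """Turn (user_id, isbn, rating) tuples into one rating vector per book."""
    users = sorted({user for user, _, _ in ratings})
    column = {user: i for i, user in enumerate(users)}

    vectors = {}
    for user, isbn, rating in ratings:
        # Entries for users who did not rate the book stay at 0.0
        # (an assumption, not something the source specifies).
        vec = vectors.setdefault(isbn, [0.0] * len(users))
        vec[column[user]] = float(rating)
    return users, vectors


sample = [("u1", "b1", 8), ("u2", "b1", 7), ("u1", "b2", 6), ("u3", "b2", 9)]
users, vectors = build_book_vectors(sample)
# Three users give three dimensions; the paper's filtered data would give 810.
```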
Recommendation Model Architecture
Popularity-Based Filtering:
Ranks books by average rating.
Used to recommend universally liked books (Top 50).
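A popularity ranking of this kind can be sketched as follows. The function name `top_books`, the tuple layout, and the tie-breaking rule (more ratings first, then ISBN) are illustrative assumptions; the source specifies only the ranking by average rating, the minimum of 250 ratings, and the Top 50 cut-off.

```python
from collections import defaultdict


def top_books(ratings, min_ratings=250, top_n=50):
    """Rank books by average rating among books with enough ratings.

    ratings: list of (user_id, isbn, rating) tuples (hypothetical layout).
    Returns a list of (isbn, average_rating, rating_count).
    """
    totals = defaultdict(lambda: [0, 0.0])  # isbn -> [count, sum of ratings]
    for _, isbn, rating in ratings:
        totals[isbn][0] += 1
        totals[isbn][1] += rating

    eligible = [
        (isbn, total / count, count)
        for isbn, (count, total) in totals.items()
        if count >= min_ratings
    ]
    # Highest average first; ties broken by rating count, then ISBN (assumed rule).
    eligible.sort(key=lambda row: (-row[1], -row[2], row[0]))
    return eligible[:top_n]


sample = [
    ("u1", "b1", 10), ("u2", "b1", 8),
    ("u1", "b2", 9), ("u2", "b2", 9), ("u3", "b2", 9),
    ("u1", "b3", 10),
]
ranked = top_books(sample, min_ratings=2, top_n=2)
# b3 has a perfect average but only one rating, so it is excluded.
```

The minimum-count filter is what keeps a book with a single perfect rating from outranking widely read books.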
Collaborative Filtering:
Represents books as user-rating vectors.
Uses Euclidean distance to measure book similarity.
Focuses on personalized suggestions.
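The book-to-book similarity step can be illustrated with a brute-force nearest-neighbour search over rating vectors. This is a sketch under assumptions: the names `euclidean` and `similar_books` and the toy three-dimensional vectors are hypothetical, and a production system would likely use an optimized library rather than this linear scan.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two equal-length rating vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def similar_books(target, vectors, k=5):
    """Return the k books closest to `target`, as (isbn, distance) pairs."""
    distances = [
        (isbn, euclidean(vectors[target], vec))
        for isbn, vec in vectors.items()
        if isbn != target
    ]
    # Smaller distance means more similar; ISBN breaks ties deterministically.
    distances.sort(key=lambda pair: (pair[1], pair[0]))
    return distances[:k]


# Toy vectors: one entry per user, 0.0 where the user gave no rating (assumed).
vectors = {
    "b1": [8.0, 7.0, 0.0],
    "b2": [7.0, 8.0, 0.0],
    "b3": [0.0, 0.0, 9.0],
}
neighbours = similar_books("b1", vectors, k=2)
# b2 was rated similarly by the same users, so it is the nearest neighbour.
```

Each query compares the target against every other book, so the cost grows with both the number of books and the vector length.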
Hybrid Model:
Blends popularity and collaborative filtering for balanced, diverse recommendations.
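The source does not describe how the two components are blended, so the following is only one plausible scheme, not the authors' method: recommend neighbours of books in the user's reading history first, then fill the remaining slots from the popularity ranking. The function name `hybrid_recommend` and all input structures are hypothetical.

```python
def hybrid_recommend(history, neighbours, popular, n=5):
    """Blend personalized and popular picks (assumed scheme, see lead-in).

    history:    ISBNs the user has already read.
    neighbours: dict mapping an ISBN to its similar ISBNs, closest first.
    popular:    ISBNs ranked by popularity, best first.
    """
    seen = set(history)
    picks = []

    def add(isbn):
        # Skip books already read or already picked, and stop at n picks.
        if isbn not in seen and len(picks) < n:
            seen.add(isbn)
            picks.append(isbn)

    # Personalized part: neighbours of every book in the reading history.
    for read in history:
        for isbn in neighbours.get(read, []):
            add(isbn)

    # Popularity part: fills remaining slots and covers users with no history.
    for isbn in popular:
        add(isbn)
    return picks


neighbours = {"b1": ["b2", "b3"], "b4": ["b3", "b5"]}
popular = ["b9", "b1", "b2", "b8"]

personalized = hybrid_recommend(["b1"], neighbours, popular, n=4)
cold_start = hybrid_recommend([], neighbours, popular, n=3)
```

A user with an empty history receives only popular books, which matches the stated goal of serving general recommendations to all users and tailored ones to authenticated users.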
Evaluation Metrics
Precision and Recall: Proposed for future feedback-based evaluation.
Mean Reciprocal Rank (MRR): Measures how early relevant books appear in recommendation lists.
Cosine Similarity: Considered as an alternative similarity metric but not used; the system relies on Euclidean distance.
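For reference, precision, recall, and MRR over ranked recommendation lists can be computed as in the sketch below. The function names and the toy data are illustrative assumptions; the source does not report how relevance would be determined, so `relevant` here simply stands for whatever set of books a user is known to have liked.

```python
def precision_recall_at_k(recommended, relevant, k):
    """Precision@k and recall@k for one ranked recommendation list."""
    hits = sum(1 for isbn in recommended[:k] if isbn in relevant)
    precision = hits / k if k else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    return precision, recall


def mean_reciprocal_rank(recommendation_lists, relevant_sets):
    """Average of 1/rank of the first relevant book in each list."""
    if not recommendation_lists:
        return 0.0
    total = 0.0
    for recommended, relevant in zip(recommendation_lists, relevant_sets):
        for rank, isbn in enumerate(recommended, start=1):
            if isbn in relevant:
                total += 1.0 / rank
                break  # only the first relevant book counts
    return total / len(recommendation_lists)


lists = [["b1", "b2", "b3"], ["b4", "b5", "b6"]]
relevant = [{"b2"}, {"b4", "b6"}]

mrr = mean_reciprocal_rank(lists, relevant)                 # (1/2 + 1/1) / 2
precision, recall = precision_recall_at_k(lists[1], relevant[1], k=2)
```

A list with no relevant book contributes 0 to the MRR, so the score rewards placing a relevant book early in every user's list.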
Conclusion
In this paper, we developed and analyzed a book recommendation system incorporating popularity-based filtering and collaborative filtering techniques to enhance personalized reading suggestions. The system was designed to provide general recommendations to all users while also tailoring recommendations for authenticated users based on their reading history.
A. Summary of Findings
• Popularity-based filtering effectively highlighted the most highly rated books among users but lacked personalization.
• Collaborative filtering using Euclidean distance provided personalized recommendations by identifying similar books based on user rating behavior.
• Dataset preprocessing ensured that only books with at least 50 ratings and users with at least 200 ratings were considered, improving the reliability of the recommendations.
• The system generated a top-50 list of books based on average ratings, along with personalized recommendations based on book similarity in an 810-dimensional vector space.
B. Limitations
Despite the success of the model, some challenges were observed:
• Cold Start Problem: Users with no prior reading history received limited recommendations.
• Computational Complexity: Collaborative filtering with Euclidean distance compares high-dimensional rating vectors pairwise, so it may become computationally expensive as the user base and catalog grow.
• Limited Diversity: Personalized recommendations often consisted of books similar to those already read, limiting exposure to diverse content.
C. Future Work
To improve recommendation effectiveness and scalability, future work will focus on:
• Hybrid Recommendation Systems: Combining collaborative filtering with content-based filtering or deep learning techniques for improved accuracy.
• Improved Similarity Metrics: Exploring cosine similarity, Pearson correlation, or neural embeddings for better recommendation performance.
• Real-time Recommendations: Optimizing model performance to provide real-time book recommendations with minimal latency.
• User Feedback Mechanisms: Allowing users to refine recommendations by explicitly liking or disliking suggested books.