A Comparative Survey on Conversational LLM-Based Multi-Modal Product Recommendation Systems

Authors: Aditya M. Bhosale, Nihal K. Barkade, Akshay V. Choudhari, Sandeep Warhade

DOI Link: https://doi.org/10.22214/ijraset.2025.75205

Abstract

The exponential expansion of e-commerce platforms has increased accessibility but has also introduced challenges for users attempting to locate reliable, unbiased, and well-structured product information. Existing recommendation engines predominantly rely on static filtering algorithms that operate on historical behavioral data. These systems lack the ability to interpret user intent, adapt to contextual preferences, or justify their recommendations. As a result, users frequently invest significant time comparing products across platforms, manually validating reviews, and navigating inconsistent or biased information. This research proposes a Conversational LLM-Based Multi-Modal Product Research Assistant, an intelligent system that enables interactive, context-aware, and evidence-driven product exploration. The system integrates a domain-adapted Large Language Model (LLM) with multi-modal reasoning capabilities to analyze textual and visual product content. It leverages a Retrieval-Augmented Generation (RAG) pipeline to access real-time, verified data from e-commerce sources, ensuring factual consistency and transparency in the generated responses. The assistant also incorporates unbiased comparison logic and sustainability-aware evaluation aligned with the United Nations Sustainable Development Goal (SDG) 12: Responsible Consumption and Production. The outcome of this work is a scalable web-based assistant that supports natural conversational querying, displays relevant images and specifications, and enables users to make informed and responsible purchase decisions. The study demonstrates that integrating conversational AI with multi-modal retrieval can transform digital shopping into a transparent, personalized, and sustainable decision-making experience.

Keywords: Conversational AI, Large Language Models (LLM), Multi-Modal Learning, Product Recommendation, Retrieval-Augmented Generation (RAG), Responsible Consumption, E-commerce.

Introduction

E-commerce has transformed into a vast, globally connected digital ecosystem, but the abundance of products and data often leads to information overload, inconsistent specifications, and biased reviews. Traditional recommendation systems—based on collaborative or content-based filtering—struggle to interpret user intent, reason contextually, or perform transparent cross-platform comparisons. Recent advances in Large Language Models (LLMs) and multimodal learning offer a solution by enabling natural-language interaction, visual understanding, and synthesis of heterogeneous data sources. The analyzed system, a Conversational LLM-Based Multi-Modal Product Research Assistant, uses API-driven data retrieval, Retrieval-Augmented Generation (RAG), and sustainability metrics to provide unbiased, real-time, and environmentally conscious product comparisons. This bridges the gap between conversational AI and responsible, personalized e-commerce decision-making.
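The retrieval step of a RAG pipeline of the kind described above can be illustrated with a minimal sketch. This is not the surveyed system's implementation: the product snippets, the function names (retrieve, build_prompt), and the bag-of-words cosine similarity are all assumptions chosen to keep the example self-contained. A production system would more likely use learned embeddings and live API data, and would pass the resulting prompt to an LLM.

```python
import math
import re
from collections import Counter

# Hypothetical product snippets standing in for data fetched from e-commerce APIs.
DOCS = {
    "phone-a": "Phone A has a 5000 mAh battery, 6.5 inch display and recycled aluminium frame",
    "phone-b": "Phone B has a 4000 mAh battery, 6.1 inch display and fast charging",
    "laptop-c": "Laptop C has a 14 inch display, 16 GB memory and 10 hour battery life",
}


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def cosine(a, b):
    """Cosine similarity between two token-count vectors (Counters)."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def retrieve(query, docs, k=2):
    """Return the ids of the k snippets most similar to the query."""
    q = Counter(tokenize(query))
    scored = [(cosine(q, Counter(tokenize(text))), doc_id) for doc_id, text in docs.items()]
    scored.sort(key=lambda pair: (-pair[0], pair[1]))  # best score first, id breaks ties
    return [doc_id for _, doc_id in scored[:k]]


def build_prompt(query, docs, k=2):
    """Assemble a prompt that restricts the model to the retrieved evidence."""
    ids = retrieve(query, docs, k)
    evidence = "\n".join(f"[{doc_id}] {docs[doc_id]}" for doc_id in ids)
    return f"Answer using only the evidence below and cite snippet ids.\n{evidence}\nQuestion: {query}"


print(build_prompt("phone with large battery and recycled frame", DOCS))
```

Grounding the prompt in retrieved snippets, and asking for snippet ids in the answer, is one simple way to make a response traceable to its sources.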

A comprehensive literature survey highlights broad applications of web scraping, NLP, and AI in domains such as real estate, retail pricing, health research, academic recommendation, and advanced recommender systems:

Real-estate research shows that web-scraped housing listings can act as real-time economic indicators, revealing market changes faster than official datasets.
NLP-enhanced web scraping facilitates the conversion of unstructured textual data into structured insights for sentiment, trend, and entity analysis.
Dynamic pricing models use competitor price scraping to optimize online shop revenue.
Product comparison engines automate cross-platform aggregation of product details to reduce decision fatigue.
Scholarly recommendation tools leverage BERT embeddings to retrieve semantically relevant literature.
Health research frameworks integrate scraped text and spatial analytics to generate contextual health indicators.
Retail studies emphasize web data’s value and provide methodological frameworks for broader adoption.
Advanced recommender systems, including contrastive-learning-based knowledge-aware models (MCCLK), improve robustness under sparse data conditions.
LLM-as-a-Judge frameworks automate evaluation of structured search queries using LLM reasoning.
Hybrid conversational recommenders (CARE) combine collaborative filtering with LLM-based reranking to enhance personalized recommendations.
Multimodal systems merge text and image features through co-attention mechanisms to improve rating prediction accuracy.

Collectively, these studies show a strong shift toward integrating web-scraped data, semantic understanding, and advanced AI models to enhance decision-making across e-commerce, research, public health, and digital retail analytics.
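The hybrid pattern mentioned in the survey list, collaborative filtering followed by LLM-based reranking, can be sketched as a simple score blend. The example below is only loosely in the spirit of such systems and is not the CARE method itself. The ratings table, the blending weight alpha, and the relevance dictionary (a placeholder for scores that an LLM reranker might return) are all hypothetical.

```python
import math

# Hypothetical 1-5 star ratings; a missing item means "not rated".
RATINGS = {
    "u1": {"headphones": 5, "speaker": 4, "charger": 1},
    "u2": {"headphones": 4, "speaker": 5, "charger": 2},
    "u3": {"headphones": 1, "speaker": 2, "charger": 5},
    "u4": {"headphones": 5},
}


def item_vector(item):
    """Ratings for one item across all users, with 0 where unrated."""
    return [RATINGS[user].get(item, 0) for user in sorted(RATINGS)]


def cosine(a, b):
    """Cosine similarity between two equal-length rating vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def cf_score(user, candidate):
    """Item-based CF score in [0, 1]: rating-weighted similarity to items the user rated."""
    rated = RATINGS[user]
    total = sum(rated.values())
    if total == 0:
        return 0.0
    weighted = sum(r * cosine(item_vector(candidate), item_vector(i)) for i, r in rated.items())
    return weighted / total


def rerank(user, candidates, relevance, alpha=0.5):
    """Blend the CF score with a contextual relevance score in [0, 1] and sort descending."""
    scored = [(alpha * cf_score(user, c) + (1 - alpha) * relevance[c], c) for c in candidates]
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [c for _, c in scored]


# Placeholder relevance, e.g. for the request "I need something to charge my phone".
relevance = {"speaker": 0.2, "charger": 0.9}
print(rerank("u4", ["speaker", "charger"], relevance, alpha=1.0))  # CF signal only
print(rerank("u4", ["speaker", "charger"], relevance, alpha=0.5))  # blended with context
```

With alpha set to 1.0 the ranking follows rating history alone; lowering alpha lets the conversational context override it, which is the trade-off a hybrid recommender has to tune.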

Conclusion

This survey presented an extensive comparative analysis of research spanning web data acquisition, deep learning–driven recommender systems, and emerging Large Language Model (LLM)-based multimodal reasoning frameworks. The findings reveal an evident evolution in the design of intelligent product exploration and recommendation systems. Traditional rule-based scraping and filtering methods provided access to product information but lacked contextual awareness, adaptability to platform changes, and interpretability. Deep-learning approaches, particularly those leveraging representation learning, knowledge graphs, and multimodal fusion, enabled stronger personalization and semantic modeling but often suffered from computational overhead, sparse supervision, and limited transparency in decision logic.

More recent advancements employing LLMs and multimodal architectures demonstrate significant progress toward conversational and explainable recommendation ecosystems. Techniques such as Retrieval-Augmented Generation (RAG), cross-modal embedding alignment, and LLM-based re-ranking enable systems to reason jointly over structured and unstructured sources, integrating textual reviews, visual content, and product metadata. However, these systems still face challenges involving hallucination control, inference latency, domain adaptation, and the need for efficient bias-aware optimization to ensure trustworthiness and fairness.
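Cross-modal embedding alignment is often trained with a contrastive objective, and a minimal sketch of one such objective is given here. The symmetric InfoNCE-style loss below is an assumption about how alignment could be done and is not taken from any specific surveyed paper. The function names, the temperature value, and the two-dimensional toy embeddings are illustrative only.

```python
import math


def normalize(v):
    """Scale a vector to unit length so dot products become cosine similarities."""
    n = math.sqrt(sum(x * x for x in v))
    return [x / n for x in v]


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def cross_entropy(logits, target):
    """Negative log-softmax at the target index, using log-sum-exp for stability."""
    m = max(logits)
    lse = m + math.log(sum(math.exp(l - m) for l in logits))
    return lse - logits[target]


def contrastive_alignment_loss(text_embs, image_embs, temperature=0.1):
    """Symmetric contrastive loss: text i and image i are a positive pair, all others negatives."""
    t = [normalize(v) for v in text_embs]
    im = [normalize(v) for v in image_embs]
    n = len(t)
    logits = [[dot(t[i], im[j]) / temperature for j in range(n)] for i in range(n)]
    text_to_image = sum(cross_entropy(logits[i], i) for i in range(n)) / n
    image_to_text = sum(cross_entropy([logits[i][j] for i in range(n)], j) for j in range(n)) / n
    return (text_to_image + image_to_text) / 2


# Toy embeddings: each product's text and image vectors either agree or are swapped.
text = [[1.0, 0.0], [0.0, 1.0]]
aligned_images = [[2.0, 0.0], [0.0, 3.0]]
swapped_images = [[0.0, 3.0], [2.0, 0.0]]
print(contrastive_alignment_loss(text, aligned_images))  # small: pairs agree
print(contrastive_alignment_loss(text, swapped_images))  # large: pairs disagree
```

Minimizing a loss of this shape pulls each product's text and image embeddings together while pushing apart mismatched pairs, which is what lets a single query vector retrieve either modality.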

References

[1] J.-C. Bricongne, B. Meunier, and S. Pouget, “Web-scraping housing prices in real-time: The Covid-19 crisis in the UK,” Journal of Housing Economics, 2023.
[2] V. Pichiyan, S. Muthulingam, G. Sathar, et al., “Web Scraping using Natural Language Processing: Exploiting Unstructured Text for Data Extraction and Analysis,” Procedia Computer Science, vol. 230, pp. 193–202, 2023.
[3] O. Jorge, A. Pons, J. Rius, C. Vintró, J. Mateo, and J. Vilaplana, “Increasing online shop revenues with web scraping: A case study for the wine sector,” British Food Journal, vol. 122, no. 11, pp. 3383–3401, 2020.
[4] M. D’Souza, S. Desai, D. Agrawl, F. Joshi, et al., “Web scraping based product comparison model for e-commerce website,” Journal of Emerging Technologies and Innovative Research, vol. 11, no. 4, 2024.
[5] P. Bandi, “Advanced Google Scholar Scraper: A content-based filtering approach for literature recommendation using BERT,” M.Sc. research project, National College of Ireland, 2023.
[6] P. Galvez-Hernandez, A. Gonzalez-Viana, L. Gonzalez-de Paz, et al., “Generating contextual variables from web-based data for health research,” JMIR Public Health Surveillance, vol. 10, 2024.
[7] J. Y. Guyt, H. Datta, and J. Boegershausen, “Unlocking the potential of web data for retailing research,” Journal of Retailing, vol. 100, pp. 130–147, 2024.
[8] J.-K. Kim et al., “A Knowledge Graph and Large Language Model-Based Hybrid Recommender System,” in Advances in Intelligent Systems and Computing, Springer, 2023.
[9] Y. Wang, Z. Chu, X. Ouyang, et al., “Enhancing Recommender Systems with Large Language Model Reasoning Graphs,” arXiv preprint arXiv:2308.10835, 2023.
[10] X. Ren, W. Wei, L. Xia, et al., “Representation Learning with Large Language Models for Recommendation,” arXiv preprint arXiv:2310.15950, 2023.
[11] A. Nigam, P. Jain, and A. Gupta, “Semantic Product Search for Matching Structured Product Catalogs in E-Commerce,” arXiv preprint arXiv:2008.08180, 2020.
[12] Y. Zhang, X. Liu, and C. Yu, “Learning Variant Product Relationship and Variation Attributes from E-Commerce Website Structures,” in Proc. ACM Web Conference (WWW), 2022.
[13] H. Chen, M. Zhang, and D. Song, “Effective Product Schema Matching and Duplicate Detection with Large Language Models,” arXiv preprint arXiv:2104.09576, 2021.
[14] X. Wang, L. Zhao, and R. Li, “AE-smnsMLC: Multi-Label Classification with Semantic Matching and Negative Label Sampling for Product Attribute Value Extraction,” arXiv preprint arXiv:2310.07137, 2023.
[15] A. Brinkmann, R. Shraga, and C. Bizer, “ExtractGPT: Exploring the Potential of Large Language Models for Product Attribute Value Extraction,” arXiv preprint arXiv:2310.12537, 2024.
[16] C. Fang, X. Li, Z. Fan, et al., “LLM-Ensemble: Optimal Large Language Model Ensemble Method for Product Attribute Value Extraction,” in Proc. 47th Int. ACM SIGIR Conference, 2024.
[17] Z. He, M. Huang, and Q. Lin, “LaTeX-Numeric: Language-Agnostic Text Attribute Extraction for E-Commerce Numeric Attributes,” arXiv preprint arXiv:2403.00863, 2024.
[18] M. S. Baysan, S. Uysal, İ. İşlek, Ç. Çığ Karaman, and T. Güngör, “LLM-as-a-Judge: Automated evaluation of search query parsing using large language models,” Frontiers in Big Data, vol. 8, 2025.
[19] E. Jeong, X. Li, A. (Eunyoung) Kwon, S. Park, Q. Li, and J. Kim, “A Multimodal Recommender System Using Deep Learning Techniques Combining Review Texts and Images,” Applied Sciences, vol. 14, no. 20, 2024.
[20] C. Li, Y. Deng, H. Hu, S.-K. Ng, M.-Y. Kan, and H. Li, “CARE: Contextual Adaptation of Recommenders for LLM-Based Conversational Recommendation,” arXiv preprint arXiv:2508.13889, 2025.

Copyright

Copyright © 2025 Aditya M. Bhosale, Nihal K. Barkade, Akshay V. Choudhari, Sandeep Warhade. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

Paper Id : IJRASET75205

Publish Date : 2025-11-08

ISSN : 2321-9653

Publisher Name : IJRASET
