The number of reviews on shopping platforms is so large that checking them all manually is impractical, and the task becomes harder still when misleading reviews undermine trust in the ratings. This paper presents InsightFlow, a system that helps determine whether reviews are genuine and whether products are good. It analyzes the text of reviews to compare products.
To classify review sentiment, InsightFlow combines three models: Logistic Regression, Bidirectional Long Short-Term Memory (BiLSTM), and DistilBERT. The combination is more accurate than any one of the models alone, reaching 92% accuracy.
InsightFlow also introduces the Dynamic Trust and Evaluation Index (DTEI), a measure of how much a review can be trusted. It considers five factors: how confident the expressed sentiment is, whether the review might be fake, how emotional the writing is, how old the review is, and whether reviews discuss the same aspects. Each factor is computed independently.
Introduction
This paper presents InsightFlow, an AI-powered and explainable framework designed to improve product evaluation and recommendation using online customer reviews. Traditional e-commerce platforms often rely on star ratings, which can be misleading due to fake reviews, review manipulation, and the difficulty of analyzing thousands of customer comments. InsightFlow addresses these challenges by combining advanced Natural Language Processing (NLP) and Machine Learning techniques to provide more trustworthy and interpretable product rankings.
The framework integrates multiple modules, including ensemble sentiment analysis, fake review detection, emotion analysis, and aspect-based review analysis. Sentiment classification is performed using a hybrid model that combines Logistic Regression (TF-IDF), Bidirectional LSTM, and DistilBERT, achieving an overall accuracy of 92%, which outperforms individual models. Fake reviews are detected using Sentence-BERT embeddings and DBSCAN clustering, identifying approximately 18.5% of reviews as suspicious without requiring labeled training data.
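The paper does not spell out here how the three sentiment models are combined. One common option is weighted soft voting, sketched below under that assumption. The function name `soft_vote`, the per-model probabilities, and the weights are all hypothetical and chosen only for illustration.

```python
from typing import List, Sequence


def soft_vote(prob_sets: Sequence[Sequence[float]],
              weights: Sequence[float]) -> List[float]:
    """Return the weighted average of per-model class-probability vectors."""
    if len(prob_sets) != len(weights):
        raise ValueError("need exactly one weight per model")
    total = sum(weights)
    n_classes = len(prob_sets[0])
    return [
        sum(w * probs[c] for w, probs in zip(weights, prob_sets)) / total
        for c in range(n_classes)
    ]


# Hypothetical outputs for one review, as [P(negative), P(positive)].
lr_probs = [0.40, 0.60]          # Logistic Regression on TF-IDF features
bilstm_probs = [0.30, 0.70]      # Bidirectional LSTM
distilbert_probs = [0.10, 0.90]  # DistilBERT

# Assumed weights: the transformer is trusted twice as much as the others.
weights = [1.0, 1.0, 2.0]

combined = soft_vote([lr_probs, bilstm_probs, distilbert_probs], weights)
label = max(range(len(combined)), key=combined.__getitem__)
print(combined, "positive" if label == 1 else "negative")
```

Averaging probabilities rather than hard labels lets a confident model outweigh two uncertain ones, which is one plausible reason an ensemble can outperform its individual members.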
A key contribution of the study is the development of the Dynamic Trust and Evaluation Index (DTEI), a mathematically defined score that evaluates the overall credibility of product reviews. DTEI combines five factors: Sentiment Confidence, Fake Review Reliability, Emotion Stability, Recency Weight, and Aspect Consistency. This multidimensional approach provides a more reliable assessment than traditional star ratings or sentiment-only methods.
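As a rough illustration of how five component scores could be folded into a single index, the sketch below uses a normalized weighted sum. This is an assumption, not the paper's definition: the exact DTEI formula and weights are not given in this section, so the equal weights, the `ReviewSignals` container, and the `dtei` function are hypothetical. Each component is assumed to be already scaled to the range 0 to 1.

```python
from dataclasses import dataclass
from typing import Mapping


@dataclass
class ReviewSignals:
    """The five DTEI components, each assumed to lie in [0, 1]."""
    sentiment_confidence: float
    fake_review_reliability: float
    emotion_stability: float
    recency_weight: float
    aspect_consistency: float


# Assumed equal weights; the weights actually used are not stated here.
ASSUMED_WEIGHTS: Mapping[str, float] = {
    "sentiment_confidence": 0.2,
    "fake_review_reliability": 0.2,
    "emotion_stability": 0.2,
    "recency_weight": 0.2,
    "aspect_consistency": 0.2,
}


def dtei(signals: ReviewSignals,
         weights: Mapping[str, float] = ASSUMED_WEIGHTS) -> float:
    """Weighted sum of the components, normalized by the total weight."""
    total = sum(weights.values())
    weighted = sum(w * getattr(signals, name) for name, w in weights.items())
    return weighted / total


example = ReviewSignals(
    sentiment_confidence=0.9,
    fake_review_reliability=0.8,
    emotion_stability=0.7,
    recency_weight=0.6,
    aspect_consistency=0.5,
)
score = dtei(example)
print(round(score, 3))
```

Normalizing by the total weight keeps the index in the same 0 to 1 range as its inputs even when the weights do not sum to one.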
The system also employs BERTopic for topic modeling and aspect extraction, allowing users to understand specific product strengths and weaknesses. Explainable AI principles are incorporated so that rankings are supported by clear reasons, such as review authenticity, emotional consistency, and aspect-level sentiment.
Experiments were conducted on approximately 5,056 Amazon Electronics reviews. Results showed that the ensemble sentiment model achieved 92% accuracy, while DTEI successfully distinguished trustworthy and untrustworthy review groups with a significant score separation. Comparative analysis demonstrated that DTEI-based rankings outperformed conventional star-rating and sentiment-based approaches by accounting for review authenticity and quality.
Conclusion
This paper presented InsightFlow, a trust-aware, multidimensional framework for intelligent product evaluation grounded in customer review analysis. The system integrates a hybrid sentiment ensemble (Logistic Regression, BiLSTM [6], DistilBERT [2]) achieving 0.920 accuracy, unsupervised fake review detection via Sentence-BERT [3] and DBSCAN [4] identifying 18.5% suspicious reviews, direction-aware emotion analysis [7], BERTopic-based [5] aspect extraction identifying 15 product topic clusters, and the novel Dynamic Trust and Evaluation Index (DTEI), all within an explainable comparison module designed in accordance with XAI principles [11]. The DTEI combines five mathematically grounded components (Sentiment Confidence, Fake Review Reliability, Emotion Stability, Recency Weight, and Aspect Consistency), achieving a separation of 0.262 between trusted positive and negative review pools (means of 0.755 and 0.493 respectively). Experiments on Amazon Electronics reviews demonstrated that DTEI-based ranking produces meaningfully more trustworthy product recommendations than star-rating and sentiment-only baselines. A logarithmic confidence factor further ensures that products backed by larger, more consistent review corpora are ranked appropriately above those with limited review evidence.
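The effect of a logarithmic confidence factor on ranking can be illustrated with a small sketch. The exact form of the factor is not given in this section, so the formula below, log(1 + n) / log(1 + n_reference) capped at 1, is an assumption, as are the names `confidence_factor` and `rank_products`, the saturation count of 200 reviews, and the two example products. The sketch models only corpus size, not corpus consistency.

```python
import math
from typing import Dict, List, Tuple


def confidence_factor(n_reviews: int, n_reference: int) -> float:
    """Logarithmic factor in [0, 1] that grows with the review count.

    n_reference is a hypothetical count at which the factor saturates at 1.
    """
    if n_reviews <= 0:
        return 0.0
    return min(1.0, math.log1p(n_reviews) / math.log1p(n_reference))


def rank_products(products: Dict[str, Tuple[float, int]],
                  n_reference: int = 200) -> List[Tuple[str, float]]:
    """Sort products by mean DTEI scaled by the confidence factor."""
    scored = [
        (name, mean_dtei * confidence_factor(n, n_reference))
        for name, (mean_dtei, n) in products.items()
    ]
    return sorted(scored, key=lambda item: item[1], reverse=True)


# Hypothetical products given as (mean DTEI, number of reviews).
products = {
    "A": (0.80, 3),    # slightly higher mean, but only three reviews
    "B": (0.75, 200),  # slightly lower mean, backed by many reviews
}
ranking = rank_products(products)
for name, adjusted in ranking:
    print(name, round(adjusted, 3))
```

In this example product B ranks first despite its lower mean score, because the logarithm discounts product A's three reviews heavily while adding little extra credit once a corpus is already large.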
There are still plenty of ways this work could be improved and taken further. For example, the fake review detection approach could be tested against a properly labelled dataset of fake reviews to better understand how well it performs. Instead of relying on manually set DTEI weights, more adaptive methods like Bayesian optimisation or reinforcement learning could be used to fine-tune them automatically. The framework could also be expanded to handle multiple languages by using cross-lingual transformer models. Another useful step would be to fine-tune DistilBERT specifically for electronics reviews to make it more domain-aware. Finally, adding support for real-time review streams and combining opinions from different platforms could make the system more practical and robust in real-world use.
References
[1] J. Devlin, M.-W. Chang, K. Lee, and K. Toutanova, “BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding,” in Proc. NAACL-HLT, Minneapolis, MN, 2019, pp. 4171–4186.
[2] V. Sanh, L. Debut, J. Chaumond, and T. Wolf, “DistilBERT, a distilled version of BERT: smaller, faster, cheaper and lighter,” arXiv preprint arXiv:1910.01108, 2019.
[3] N. Reimers and I. Gurevych, “Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks,” in Proc. EMNLP-IJCNLP, Hong Kong, 2019, pp. 3982–3992.
[4] M. Ester, H.-P. Kriegel, J. Sander, and X. Xu, “A Density-Based Algorithm for Discovering Clusters in Large Spatial Databases with Noise,” in Proc. 2nd Int. Conf. on Knowledge Discovery and Data Mining (KDD), Portland, OR, 1996, pp. 226–231.
[5] M. Grootendorst, “BERTopic: Neural topic modeling with a class-based TF-IDF procedure,” arXiv preprint arXiv:2203.05794, 2022.
[6] S. Hochreiter and J. Schmidhuber, “Long Short-Term Memory,” Neural Computation, vol. 9, no. 8, pp. 1735–1780, 1997.
[7] F. A. Acheampong, C. Wenyu, and H. Nunoo-Mensah, “Text-based Emotion Detection: Advances, Challenges, and Opportunities,” Engineering Reports, vol. 2, no. 7, p. e12189, 2020.
[8] N. Jindal and B. Liu, “Opinion Spam and Analysis,” in Proc. ACM Int. Conf. on Web Search and Data Mining (WSDM), Stanford, CA, 2008, pp. 219–230.
[9] L. Zhang, S. Wang, and B. Liu, “Deep Learning for Sentiment Analysis: A Survey,” WIREs Data Mining and Knowledge Discovery, vol. 8, no. 4, p. e1253, 2018.
[10] S. M. Mudambi and D. Schuff, “What Makes a Helpful Online Review? A Study of Customer Reviews on Amazon.com,” MIS Quarterly, vol. 34, no. 1, pp. 185–200, 2010.
[11] A. B. Arrieta et al., “Explainable Artificial Intelligence (XAI): Concepts, taxonomies, opportunities and challenges toward responsible AI,” Information Fusion, vol. 58, pp. 82–115, 2020.
[12] M. Das and N. Mehta, “Detecting Fake and Duplicate Reviews Using Similarity Metrics,” in Proc. IEEE Conf. on Advances in Computing and Communication (ICAC), 2024, pp. 95–101.
[13] A. Verma and D. Patel, “Spam Review Detection Using Semantic Similarity and Pattern Analysis,” in Proc. IEEE Int. Conf. on Data Mining Workshops (ICDMW), 2023, pp. 310–317.
[14] L. Zhou and Y. Zhao, “Customer Opinion Summarization Using NLP and Statistical Modeling,” Information Processing & Management, vol. 61, no. 1, pp. 102–118, 2024.