The digital era has witnessed an unprecedented explosion in user-generated textual data, primarily driven by the rapid expansion of e-commerce and online entertainment platforms. In this landscape, consumer reviews have emerged as a critical form of social proof, significantly influencing purchasing decisions and brand reputation. However, this influence has led to the proliferation of deceptive opinion spam: fake reviews systematically crafted to manipulate product ratings or damage competitors. Detecting these fraudulent entries is a significant challenge because they are written to closely mimic genuine human feedback. Manual verification is impractical given the massive daily volume of reviews.
This study proposes a robust, automated deep learning framework for detecting fake reviews using Natural Language Processing (NLP) and Bidirectional Long Short-Term Memory (BiLSTM) networks. Earlier systems relied on traditional machine learning algorithms such as Naïve Bayes and Support Vector Machines (SVM), which often failed to capture deep contextual dependencies and the sequential nature of human language. To address these limitations, the proposed model utilises a BiLSTM architecture that processes review text in both the forward and backward directions. This dual-directional approach enables the system to use the context on both sides of each word and to identify subtle deceptive patterns, such as exaggerated praise, generic phrasing, and unnatural emotional shifts.
The methodology uses a comprehensive e-commerce dataset consisting of approximately 48,800 labelled reviews, evenly distributed between “Real” and “Fake” classes. A rigorous preprocessing pipeline is applied, including HTML tag removal, text normalisation, tokenisation, stop-word removal, and Porter stemming to clean and standardise the raw text. Furthermore, word embedding techniques are employed to convert textual data into dense 128-dimensional vector representations.
Experimental results demonstrate the effectiveness of the proposed model. The BiLSTM model achieved a test accuracy of 91.90%, outperforming traditional baseline approaches. It also achieved a precision of 0.93 for the “Fake” class, indicating a low rate of false positives. Confusion matrix analysis further confirms the model’s balanced performance and robustness across various product categories and review patterns.
This research concludes that sequential deep learning models provide a scalable and highly accurate solution for maintaining the integrity of online marketplaces. Future work includes incorporating transformer-based architectures such as BERT and extending the model to support multilingual datasets.
Introduction
This project presents a deep learning-based fake review detection system that identifies deceptive product reviews on e-commerce platforms using a Bidirectional Long Short-Term Memory (BiLSTM) model. As online reviews increasingly influence consumer decisions, detecting fake reviews has become essential for maintaining trust and preventing misinformation.
Traditional machine learning methods such as Naïve Bayes and SVM, typically built on TF-IDF features, rely on manual feature engineering and struggle to capture contextual meaning, sarcasm, negation, and sequential language patterns. To overcome these limitations, the proposed system employs a BiLSTM network, which processes text in both forward and backward directions to better understand context and improve classification accuracy.
The methodology includes:
Dataset: Approximately 48,800 balanced e-commerce reviews (real and fake).
Preprocessing: Removal of duplicates and missing values, lowercasing, HTML and punctuation removal, tokenization, stop-word removal, stemming, and label encoding.
Text Representation: Tokenization, sequence padding (length 100), and 128-dimensional word embeddings.
Model Architecture: Embedding layer, two BiLSTM layers, dropout regularization, dense layer, and sigmoid output layer.
Training: Adam optimizer with Binary Cross-Entropy loss and early stopping to prevent overfitting.
Evaluation Metrics: Accuracy, precision, recall, and F1-score.
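The architecture and training setup listed above can be sketched in Keras as follows. This is a minimal illustration rather than the exact configuration used in this work: only the sequence length (100), the embedding size (128), the two BiLSTM layers, the sigmoid output, the Adam optimizer, the Binary Cross-Entropy loss, and early stopping come from the methodology. The vocabulary size, LSTM units, dropout rate, dense layer width, and early-stopping patience are assumed values chosen for illustration.

```python
import numpy as np
import tensorflow as tf

MAX_LEN = 100        # padded sequence length (from the methodology)
EMBED_DIM = 128      # embedding dimension (from the methodology)
VOCAB_SIZE = 20_000  # assumption: vocabulary size is not reported
LSTM_UNITS = 64      # assumption: units per direction are not reported
DROPOUT_RATE = 0.5   # assumption: dropout rate is not reported
DENSE_UNITS = 32     # assumption: dense layer width is not reported


def build_bilstm_model() -> tf.keras.Model:
    """Embedding -> BiLSTM -> BiLSTM -> Dropout -> Dense -> Sigmoid."""
    model = tf.keras.Sequential([
        tf.keras.Input(shape=(MAX_LEN,), dtype="int32"),
        tf.keras.layers.Embedding(VOCAB_SIZE, EMBED_DIM),
        # First BiLSTM returns the full sequence so the second can read it.
        tf.keras.layers.Bidirectional(
            tf.keras.layers.LSTM(LSTM_UNITS, return_sequences=True)),
        # Second BiLSTM returns only its final forward and backward states.
        tf.keras.layers.Bidirectional(tf.keras.layers.LSTM(LSTM_UNITS)),
        tf.keras.layers.Dropout(DROPOUT_RATE),
        tf.keras.layers.Dense(DENSE_UNITS, activation="relu"),
        # Single sigmoid unit: probability that the review belongs to class 1.
        tf.keras.layers.Dense(1, activation="sigmoid"),
    ])
    model.compile(optimizer="adam",
                  loss="binary_crossentropy",
                  metrics=["accuracy"])
    return model


# Early stopping on validation loss; the patience value is an assumption.
early_stop = tf.keras.callbacks.EarlyStopping(
    monitor="val_loss", patience=3, restore_best_weights=True)

if __name__ == "__main__":
    model = build_bilstm_model()
    dummy_batch = np.zeros((2, MAX_LEN), dtype="int32")  # two padded reviews
    print(model(dummy_batch).shape)
```

Training would then pass `callbacks=[early_stop]` to `model.fit` together with a validation split, so that training halts once the validation loss stops improving.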
Experimental results demonstrate that the proposed model achieves 91.9% test accuracy, outperforming many traditional machine learning approaches by effectively capturing contextual and sequential information in text.
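For reference, the four evaluation metrics can be computed directly from the confusion-matrix counts. The sketch below is a plain-Python illustration; the function name and the convention that label 1 denotes the “Fake” class are assumptions, since the label encoding is not specified here.

```python
def classification_metrics(y_true, y_pred, positive=1):
    """Return accuracy, precision, recall and F1 for the `positive` class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    tn = sum(1 for t, p in pairs if t != positive and p != positive)

    accuracy = (tp + tn) / len(pairs) if pairs else 0.0
    # Precision: of the reviews flagged as positive, how many truly are.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    # Recall: of the truly positive reviews, how many were flagged.
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}


if __name__ == "__main__":
    # Toy labels for illustration only; these are not results from this study.
    y_true = [1, 1, 1, 0, 0, 0, 1, 0]
    y_pred = [1, 1, 0, 0, 0, 1, 1, 0]
    print(classification_metrics(y_true, y_pred))
```

Under this convention, a high precision for the “Fake” class means that few genuine reviews are wrongly flagged as fake.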
Conclusion
A working model was developed to classify e-commerce product reviews as Real or Fake using deep learning. Built around a Bidirectional Long Short-Term Memory (BiLSTM) network, the system processes reviews as word sequences, identifying deceptive patterns that span phrases and sentences.
A. Summary of the Research Work
This study presented a context-aware detection approach using both an optimised Logistic Regression baseline and an advanced BiLSTM model. The methodology included cleaning raw textual data through natural language processing techniques such as stemming, stop-word removal, and tokenisation. The BiLSTM model, which processes words in both forward and backward directions, achieved a test accuracy of 91.90%, outperforming the traditional baseline models.
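The cleaning steps summarised above can be illustrated with a short standard-library sketch. It is a simplified stand-in for the actual pipeline: the stop-word list is a tiny hypothetical sample, and stemming is left as an optional callable (for example, the `stem` method of NLTK's `PorterStemmer` could be passed in) so that the example has no external dependencies.

```python
import html
import re
import string

# Tiny illustrative stop-word list; a real pipeline would use a fuller one.
STOP_WORDS = {"a", "an", "the", "is", "it", "this", "that", "and", "or",
              "to", "of", "in", "for", "was", "i"}

# Map every punctuation character to a space.
_PUNCT_TABLE = str.maketrans(string.punctuation, " " * len(string.punctuation))


def clean_review(text, stem=None):
    """Return the cleaned token list for one raw review."""
    text = html.unescape(text)            # decode entities such as &amp;
    text = re.sub(r"<[^>]+>", " ", text)  # strip HTML tags
    text = text.lower()                   # normalise case
    text = text.translate(_PUNCT_TABLE)   # remove punctuation
    tokens = text.split()                 # whitespace tokenisation
    tokens = [t for t in tokens if t not in STOP_WORDS]
    if stem is not None:                  # optional stemming step
        tokens = [stem(t) for t in tokens]
    return tokens


if __name__ == "__main__":
    print(clean_review("<p>This is the BEST phone, I love it!</p>"))
```

The resulting token lists would then be mapped to integer indices and padded to a fixed length before being passed to the embedding layer.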
The key findings of this research highlight the effectiveness of deep learning techniques in detecting deceptive reviews. Sequential models such as BiLSTM outperform traditional methods that rely on simple word frequency because they capture contextual relationships within the text. Bidirectional processing lets the model read each review both forward and backward, which helps it identify subtle inconsistencies and unnatural patterns often present in fake reviews. Preprocessing also plays a crucial role: removing noise such as irrelevant symbols and uninformative words gives the model cleaner input. Overall, the results indicate that the system recognizes deceptive linguistic patterns and distinguishes genuine opinions from spam with high precision.
B. Future Scope
Future work can further enhance the system's accuracy and applicability in several ways. Transformer-based models such as BERT, RoBERTa, or GPT can be explored to improve precision in detecting more sophisticated and subtle forms of deceptive reviews. Expanding the system to support multiple languages by retraining it on localized datasets would allow it to be used across global platforms, overcoming the current limitation of English-only analysis. Additionally, moving beyond binary classification to more granular labeling can help distinguish between different types of spam, such as bot-generated content and paid human reviews. Hybrid models that combine CNNs with BiLSTM may further improve performance by capturing both local word relationships and broader contextual patterns within sentences. Overall, this approach holds strong potential for real-world deployment in areas such as real-time e-commerce moderation, social media brand monitoring, and customer trust evaluation systems.