Sentiment Analysis of Amazon Product Reviews: Leveraging NLP Techniques for Enhanced Classification Accuracy

Authors: Ashish Mathur, Dr. Pharindara Kumar Sharma

DOI Link: https://doi.org/10.22214/ijraset.2025.67634

Abstract

This work addresses sentiment classification of product reviews using a hybrid LSTM-GRU model. The primary objective is to evaluate the model's ability to correctly categorise sentiments (positive, negative, and neutral) in a large review dataset comprising 568,454 rows and 10 columns. The approach begins with data collection from an online retailer's review system, followed by pre-processing steps (text cleaning, tokenising, and padding) that prepare the text for sentiment analysis. The hybrid LSTM-GRU architecture, designed to capture intricate sentiment patterns in textual data, then classifies the reviews. The model is characterised by its accuracy and loss, and its performance is further evaluated using precision, recall, and F1 score. The model achieves a test accuracy of 82.50% with a loss value of 1.56, indicating strong sentiment classification performance. These results show that the hybrid LSTM-GRU model detects sentiment trends in product reviews effectively and suggest potential for real-time sentiment analysis in e-commerce systems. They also point to the model's capacity to generalise across varied review data while keeping prediction error low.

Introduction

1. Introduction & Purpose:
Sentiment classification of product reviews is essential for understanding customer feedback, guiding purchasing decisions, and enhancing product offerings on e-commerce platforms like Amazon. Given the massive volume of reviews, manual analysis is impractical. Thus, automated sentiment analysis using NLP (Natural Language Processing) is crucial for classifying reviews as positive, negative, or neutral.

2. Techniques & Challenges:
The process typically involves:

Data preprocessing (cleaning text, removing stopwords, lemmatization, stemming).
Feature extraction using methods like TF-IDF, Word2Vec, or GloVe.
Model training using traditional ML models like Logistic Regression, SVM, and Naïve Bayes.
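
As a rough illustration of the preprocessing and feature-extraction steps listed above, the following Python sketch cleans text, removes stopwords, and computes TF-IDF weights using only the standard library. The stopword list, the function names clean_and_tokenize and tfidf, the sample reviews, and the particular TF-IDF variant (length-normalised term frequency times log(N/df)) are assumptions made for illustration; the paper does not specify them. Lemmatization and stemming are omitted.

```python
import math
import re
from collections import Counter

# Hypothetical, deliberately tiny stopword list; a real pipeline would use a fuller one.
STOPWORDS = {"a", "an", "and", "the", "is", "it", "this", "of", "to", "was"}


def clean_and_tokenize(text):
    """Lowercase the text, keep alphabetic runs only, and drop stopwords."""
    tokens = re.findall(r"[a-z]+", text.lower())
    return [t for t in tokens if t not in STOPWORDS]


def tfidf(documents):
    """Return one {term: weight} dict per document.

    Assumed variant: tf = count / document length, idf = log(N / df).
    """
    tokenized = [clean_and_tokenize(d) for d in documents]
    n_docs = len(tokenized)
    # Document frequency: in how many documents each term appears.
    doc_freq = Counter(term for tokens in tokenized for term in set(tokens))
    vectors = []
    for tokens in tokenized:
        counts = Counter(tokens)
        total = len(tokens) or 1  # avoid division by zero for empty documents
        vectors.append({
            term: (count / total) * math.log(n_docs / doc_freq[term])
            for term, count in counts.items()
        })
    return vectors


# Made-up sample reviews, not taken from the dataset.
reviews = [
    "This coffee is great and the price is great",
    "The coffee was stale",
    "Great service",
]
vectors = tfidf(reviews)
```

In this toy corpus, a term that occurs in only one review (such as "stale") receives a higher weight than a term shared across reviews (such as "coffee"), which is the behaviour TF-IDF is meant to provide.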

However, newer deep learning methods such as CNNs, RNNs (LSTM, GRU), and transformer-based models like BERT significantly outperform traditional models by better capturing contextual meaning and handling nuances like sarcasm, slang, and mixed sentiments.

3. Literature Review:
Studies reveal that:

LSTM and GRU achieved F1 scores up to 91%.
Logistic Regression with BoW showed 89% accuracy in some cases.
BERT consistently outperformed models like T5, VADER, and traditional ML methods in handling nuanced language.
Google PaLM, a large language model, showed superior performance in classifying complex sentiments in Amazon fashion reviews.

4. Methodology:
A hybrid LSTM-GRU deep learning model was developed using a dataset of 568,454 Amazon product reviews. Key steps include:

Data cleaning and preprocessing, including tokenization, stopword removal, and padding to a fixed length.
Exploratory Data Analysis (EDA) revealed that most reviews were positive (77.7%), and most users posted only one review.
Model architecture includes embedding layers, LSTM and GRU layers, a dropout layer to prevent overfitting, and a final dense layer with softmax for multi-class classification.
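
The tokenization and padding step can be sketched in plain Python as below. This is a minimal illustration, not the authors' code: the reserved ids (0 for padding, 1 for unknown words), the vocabulary cap, the choice to pad and truncate on the right, and the helper names build_vocab, encode, and pad are all assumptions. The paper states only that reviews are tokenized and padded to a fixed length before reaching the embedding layer.

```python
from collections import Counter

PAD_ID = 0  # assumed id reserved for padding positions
OOV_ID = 1  # assumed id reserved for out-of-vocabulary tokens


def build_vocab(tokenized_reviews, max_words=10000):
    """Map the most frequent tokens to integer ids starting at 2."""
    counts = Counter(tok for review in tokenized_reviews for tok in review)
    most_common = [tok for tok, _ in counts.most_common(max_words)]
    return {tok: idx for idx, tok in enumerate(most_common, start=2)}


def encode(tokens, vocab):
    """Replace each token by its id, or OOV_ID if it is not in the vocabulary."""
    return [vocab.get(tok, OOV_ID) for tok in tokens]


def pad(sequence, max_len):
    """Truncate to max_len, or right-pad with PAD_ID up to max_len."""
    trimmed = sequence[:max_len]
    return trimmed + [PAD_ID] * (max_len - len(trimmed))


# Made-up, already-tokenized sample reviews.
tokenized = [["good", "coffee"], ["bad", "coffee", "never", "again"]]
vocab = build_vocab(tokenized)
batch = [pad(encode(review, vocab), max_len=3) for review in tokenized]
```

Every row of batch has the same length, which is what allows the reviews to be stacked into a single integer matrix and fed to an embedding layer.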

5. Results & Performance:

The hybrid LSTM-GRU model achieved a test accuracy of 82.50% and a precision of 83.85% with low loss, indicating strong performance in sentiment classification.
Its ability to learn complex patterns and contextual information led to precise and reliable sentiment predictions.
The model’s success suggests practical applications in e-commerce, customer service analytics, and social media monitoring.
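
The paper evaluates the model with accuracy, precision, recall, and F1 score. The sketch below shows how these metrics can be computed per class for a three-way sentiment task. The label names, the toy predictions, and the function names per_class_metrics and accuracy are illustrative assumptions and do not reproduce the reported results.

```python
def per_class_metrics(y_true, y_pred, labels):
    """Compute precision, recall, and F1 for each label (one-vs-rest)."""
    metrics = {}
    for label in labels:
        pairs = list(zip(y_true, y_pred))
        tp = sum(1 for t, p in pairs if t == label and p == label)
        fp = sum(1 for t, p in pairs if t != label and p == label)
        fn = sum(1 for t, p in pairs if t == label and p != label)
        # Define each ratio as 0.0 when its denominator is zero.
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        if precision + recall:
            f1 = 2 * precision * recall / (precision + recall)
        else:
            f1 = 0.0
        metrics[label] = {"precision": precision, "recall": recall, "f1": f1}
    return metrics


def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true label."""
    correct = sum(1 for t, p in zip(y_true, y_pred) if t == p)
    return correct / len(y_true)


# Toy labels, made up for illustration only.
y_true = ["pos", "pos", "pos", "neg", "neu", "neg"]
y_pred = ["pos", "pos", "neg", "neg", "pos", "neg"]

scores = per_class_metrics(y_true, y_pred, labels=["pos", "neg", "neu"])
overall = accuracy(y_true, y_pred)
```

With a class distribution as skewed as the one reported in the EDA (most reviews positive), per-class scores of this kind show whether the minority classes are being recognised, which a single accuracy figure cannot show.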

Key Insights:

BERT and transformer-based models are state-of-the-art for sentiment analysis due to their contextual understanding.
Deep learning models (especially hybrid ones) significantly outperform traditional ML models.
Effective data preprocessing and feature engineering are critical to improving model accuracy.
Amazon review data, due to its scale and variety, is a valuable benchmark for sentiment analysis research.

Conclusion

In conclusion, the hybrid LSTM-GRU model performs well on sentiment classification of product reviews. With an accuracy of 82.50% and a precision of 83.85%, it outperforms existing models such as the PLSA hybrid ELMo and LDA hybrid ELMo models, which attained accuracies of 79% and 75%, respectively. This suggests that the proposed model captures the complex patterns in review texts more effectively, yielding more accurate sentiment predictions. Its effectiveness in learning sentiment classes is shown by its ability to reach high accuracy while minimising loss (0.35). The results also underline the value of hybrid deep learning architectures, such as the LSTM-GRU combination, in improving performance over conventional models. The comparative study indicates that the hybrid LSTM-GRU model is a more dependable and efficient method for sentiment analysis tasks, since it offers better accuracy and precision. The performance of the proposed model supports its potential for deployment in practical sentiment analysis applications and for the analysis of large-scale text datasets. These results imply that deep learning-based models, especially hybrid architectures, offer a promising path for improving sentiment classification performance across many domains.

Copyright

Copyright © 2025 Ashish Mathur, Dr. Pharindara Kumar Sharma. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

Paper Id : IJRASET67634

Publish Date : 2025-03-19

ISSN : 2321-9653

Publisher Name : IJRASET
