Ijraset Journal For Research in Applied Science and Engineering Technology
Authors: Dr. Prashant Udavant, Suryadip Gujar, Bhagyesh Chaudahry
DOI Link: https://doi.org/10.22214/ijraset.2025.75043
In recent years, the use of artificial intelligence for stock price prediction has become a major research focus in the financial sector. This study compares the forecasting performance of two prominent time series models: the traditional Autoregressive Integrated Moving Average (ARIMA) model and the deep learning-based Long Short-Term Memory (LSTM) network. To improve robustness and capture diverse market dynamics, the machine learning models Random Forest and Support Vector Machines (SVM) are also integrated into the framework. The models are built on historical closing prices from Yahoo Finance and evaluated with three error metrics: Mean Squared Error (MSE), Mean Absolute Error (MAE), and Root Mean Squared Error (RMSE). In the experiments, the LSTM model achieves lower error than ARIMA on all three metrics, which indicates a better ability to capture the complex, nonlinear patterns in stock market data. These findings are consistent with broader research showing LSTM's advantage in modeling time-dependent financial data over both short-term and longer-term horizons. Beyond conventional historical data, this work also incorporates sentiment analysis of financial news and social media, along with event analysis of political, economic, and social occurrences, thereby extending predictive capability to real-world market drivers. The integration of statistical, machine learning, and deep learning techniques with sentiment and event features represents a novel contribution, as no prior project has simultaneously employed ARIMA, LSTM, Random Forest, and SVM in this combined framework. The results of this analysis offer guidance for investors and market analysts aiming to improve the accuracy of future stock price forecasts.
Moreover, this paper contributes to the growing body of literature evaluating the real-world utility of hybrid AI-driven approaches in financial prediction tasks.
The research focuses on leveraging big data and advanced computational models to improve stock price prediction, addressing the limitations of traditional forecasting methods. Stock prediction techniques include time series analysis (e.g., MA, ARMA, ARIMA), technical indicator analysis, and artificial intelligence approaches such as neural networks (BPNN, RNN, LSTM, CNN). LSTM is especially favored for capturing long-term dependencies in stock prices. Traditional machine learning models like SVM and Random Forest are also applied to enhance prediction accuracy. External factors such as market sentiment and events (political, economic, geopolitical) are incorporated to improve reliability.
Dataset & Preprocessing:
The dataset spans daily stock and index data from January 2010 to December 2023 from Yahoo Finance, covering bullish, bearish, and volatile periods. Features include traditional financial indicators (Open, High, Low, Close, Adjusted Close, Volume), sentiment-based variables derived from news, social media, and analyst opinions, and event-based variables (e.g., earnings, policy changes, geopolitical events). Preprocessing involved handling missing values, normalization, stationarity testing (ADF test) for ARIMA, sliding-window creation for LSTM, and encoding of sentiment and event features. The integration of structured market data with unstructured sentiment and event data creates a comprehensive panel that captures both internal trends and external shocks, improving predictive performance across ARIMA, LSTM, Random Forest, and SVM.
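The normalization, differencing, and sliding-window steps listed above can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' pipeline: the helper names `min_max_scale` and `make_windows`, the window length of 3, and the price values are all hypothetical, and the ADF test itself is not implemented here (first differencing is shown only as the usual remedy when a series is found to be non-stationary).

```python
import numpy as np

def min_max_scale(values, lo=None, hi=None):
    """Scale values to [0, 1]. Pass lo/hi taken from the training split
    when scaling test data, so that no future information leaks in."""
    values = np.asarray(values, dtype=float)
    lo = values.min() if lo is None else lo
    hi = values.max() if hi is None else hi
    return (values - lo) / (hi - lo)

def make_windows(series, window):
    """Return (X, y): X[i] holds `window` consecutive values and
    y[i] is the value that immediately follows them."""
    series = np.asarray(series, dtype=float)
    n = len(series) - window
    if n <= 0:
        raise ValueError("series must be longer than the window")
    X = np.stack([series[i:i + window] for i in range(n)])
    y = series[window:]
    return X, y

# Hypothetical closing prices, for illustration only.
close = np.array([10.0, 11.0, 12.0, 11.0, 13.0, 14.0])

# First difference: a common way to make a price series closer to
# stationary before fitting ARIMA (the "I" in ARIMA).
diff = np.diff(close)

# Scaled series and supervised windows, as an LSTM input would need.
scaled = min_max_scale(close)
X, y = make_windows(scaled, window=3)
```

Each row of `X` would become one input sequence for the LSTM and the matching entry of `y` its target; a real pipeline would compute `lo` and `hi` on the training period only.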
Dataset Significance:
The long-term, multi-dimensional dataset enables rigorous evaluation under various market conditions. ARIMA is suitable for stationary trends, while LSTM captures non-linear and volatile movements. Inclusion of sentiment and event features adds context, allowing models to account for investor psychology and exogenous shocks. Hybrid approaches combining statistical, machine learning, and deep learning methods benefit from this holistic dataset.
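For reference, the three error metrics named in the abstract (MSE, MAE, RMSE) can be computed as in the short sketch below. The function names and the sample values are illustrative assumptions, not figures from the study.

```python
import math

def mse(actual, predicted):
    """Mean Squared Error: average of squared differences."""
    return sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual)

def mae(actual, predicted):
    """Mean Absolute Error: average of absolute differences."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)

def rmse(actual, predicted):
    """Root Mean Squared Error: square root of the MSE."""
    return math.sqrt(mse(actual, predicted))

# Hypothetical prices and forecasts, for illustration only.
actual = [100.0, 102.0, 101.0, 105.0]
predicted = [101.0, 101.0, 103.0, 105.0]

mse_value = mse(actual, predicted)    # 1.5
mae_value = mae(actual, predicted)    # 1.0
rmse_value = rmse(actual, predicted)  # sqrt(1.5)
```

RMSE is in the same units as the price, which makes it easier to interpret than MSE; MAE is less sensitive to a few large errors than either of the squared metrics.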
Literature Review – Model Insights:
SVM: Effective for high-dimensional, non-linear classification tasks, short-term trend prediction, and incorporating technical, sentiment, and macroeconomic features. Limitations include kernel sensitivity and computational inefficiency on large datasets.
Random Forest: An ensemble of decision trees that handles noisy, heterogeneous data well. Effective for predicting price direction and volatility and for estimating feature importance. Less sensitive to preprocessing than SVM, but less effective than LSTM at capturing complex temporal dependencies. Hybrid or ensemble approaches improve predictive accuracy.
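As a rough illustration of how SVM and Random Forest can be applied to price-direction prediction, the sketch below fits both with scikit-learn. Everything specific here is an assumption made for the example: the data is a synthetic random walk (not market data), the features are five lagged returns, and the hyperparameters are library defaults or arbitrary choices rather than settings from the study.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

rng = np.random.default_rng(0)

# Synthetic geometric random walk, NOT real market data.
prices = 100.0 * np.exp(np.cumsum(rng.normal(0.0, 0.01, size=400)))
returns = np.diff(prices) / prices[:-1]

# Features: the previous `lags` returns. Label: 1 if the next return is positive.
lags = 5
X = np.stack([returns[i:i + lags] for i in range(len(returns) - lags)])
y = (returns[lags:] > 0).astype(int)

# Chronological split (no shuffling) so the test set lies strictly after training.
split = int(0.8 * len(X))
X_train, X_test = X[:split], X[split:]
y_train, y_test = y[:split], y[split:]

models = {
    # SVM is scale-sensitive, so it is wrapped with a scaler.
    "svm": make_pipeline(StandardScaler(), SVC(kernel="rbf", C=1.0)),
    # Random Forest needs no scaling, reflecting its lower preprocessing sensitivity.
    "random_forest": RandomForestClassifier(n_estimators=100, random_state=0),
}

accuracy = {}
for name, model in models.items():
    model.fit(X_train, y_train)
    accuracy[name] = model.score(X_test, y_test)
```

Because the input is a random walk, there is no real signal to learn, so accuracy near chance is the expected outcome; the point of the sketch is the workflow (lagged features, chronological split, scaling for SVM), not the score.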
This work uses several Python libraries, mainly pandas, to analyze and visualize stock market data, with a focus on technology stocks. It also analyzes stock risk based on past performance and predicts stock prices with the LSTM and ARIMA models. The historical dataset available on the company's website is limited: it covers only a few basic fields, namely high and low prices, opening and closing prices, and trading volume. To improve accuracy, additional variables are generated from these features. The LSTM experiment uses the Apple stock price. Rather than calculating a simple moving average, which averages the last N values, the model uses several randomly selected short subsequences from the training dataset. The pandas call df[col].rolling(N) creates a rolling window on the same principle: for each timestamp t it produces a window of size N covering rows t, t-1, ..., t-(N-1), where N is the number of rows in the window. Filling the window this way preserves the order of the inputs. The inputs for timestamp t are obtained after the values have been shifted, and the last value is set to NaN as required. The last step is to compute the predicted values with…. From Figure 11 and Figure 12, it can....
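The rolling-window and shifting steps described above can be sketched with pandas as follows. This is one plausible reading of the description, not the authors' code: the column names, the window size N = 3, and the price values are hypothetical.

```python
import pandas as pd

# Hypothetical closing prices, for illustration only.
df = pd.DataFrame({"close": [10.0, 11.0, 12.0, 13.0, 14.0]})
N = 3

# Rolling mean over rows t, t-1, ..., t-(N-1); the first N-1 rows are NaN
# because a full window is not yet available.
df["ma"] = df["close"].rolling(N).mean()

# Shift by one row so the feature at timestamp t uses only data up to t-1.
df["ma_prev"] = df["ma"].shift(1)

# Next-day target: shifting by -1 leaves the last row as NaN,
# since there is no following value to predict.
df["target"] = df["close"].shift(-1)

# Keep only rows where both the feature and the target are defined.
train = df.dropna(subset=["ma_prev", "target"])
```

With these five rows, only the row at index 3 survives the `dropna` call, which shows how much data a window of size N plus the two shifts consume at the edges of the series.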
Copyright © 2025 Dr. Prashant Udavant, Suryadip Gujar, Bhagyesh Chaudahry. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
Paper Id : IJRASET75043
Publish Date : 2025-11-04
ISSN : 2321-9653
Publisher Name : IJRASET
