As the variety of mobile applications used in daily life expands, it becomes crucial to discern which apps are safe and which are not, yet users find this judgment difficult to make. Our methodology predicts whether an app is fraudulent using four criteria: ratings, feedback, in-app purchases, and the presence of advertisements. We assess three models: Naïve Bayes, logistic regression, and a decision tree classifier. The models are evaluated on four metrics: F1 score, recall, precision, and accuracy. We regard an F1 score above 0.7 as high and a recall above 0.5 as the threshold for an acceptable model. After analysis, we found that the decision tree was the best-performing model, with an accuracy of 85% and an F1 score.
Introduction
Fraudulent mobile applications are a growing problem on platforms such as the Google Play Store and Apple App Store. As mobile app usage increases, developers compete intensely for visibility and downloads, and some resort to unethical practices such as fake reviews and ratings, bot farms, and crowdturfing to manipulate app rankings. This creates a need for reliable fraud detection systems that help users identify authentic apps before downloading them.
The proposed solution uses four main features (ratings, reviews, advertisements, and in-app purchases) to detect fraudulent apps. Three machine learning models were tested: Naïve Bayes (83% accuracy), logistic regression (84%), and decision tree (85%), with the decision tree performing best.
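To make the evaluation criteria concrete, the following is a minimal sketch of how accuracy, precision, recall, and F1 score can be computed for a binary fraudulent/genuine classifier and compared against the thresholds stated in the abstract (F1 above 0.7, recall above 0.5). The function names, the `Metrics` container, and the sample labels are illustrative assumptions; they are not taken from the study's implementation or data.

```python
from dataclasses import dataclass


@dataclass
class Metrics:
    accuracy: float
    precision: float
    recall: float
    f1: float


def evaluate(y_true, y_pred, positive=1):
    """Compute accuracy, precision, recall and F1 for binary labels."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")

    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    tn = sum(1 for t, p in pairs if t != positive and p != positive)

    total = len(pairs)
    accuracy = (tp + tn) / total if total else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    # F1 is the harmonic mean of precision and recall.
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    return Metrics(accuracy, precision, recall, f1)


def meets_thresholds(m, f1_min=0.7, recall_min=0.5):
    """Check the thresholds from the abstract: F1 > 0.7 and recall > 0.5."""
    return m.f1 > f1_min and m.recall > recall_min


# Hypothetical labels for ten apps: 1 = fraudulent, 0 = genuine.
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 0, 0, 0, 0, 0, 1]

m = evaluate(y_true, y_pred)
print(m, meets_thresholds(m))
```

The same function can be applied to the predictions of each of the three models so that they are compared on identical terms. Reporting recall and F1 alongside accuracy matters here because accuracy alone can look high when fraudulent apps are rare in the data.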
The literature survey reviews previous fraud detection methods, such as ranking-based, review-based, and IP-address-based detection systems. Existing systems rely mainly on static features and suffer from limited feature sets, data imbalance, dependence on user-generated reviews, and difficulty detecting evolving fraud patterns.
To overcome these issues, the proposed system introduces dynamic features, real-time monitoring, sentiment analysis, natural language processing, and advanced machine learning techniques. These improvements aim to enhance fraud detection accuracy, transparency, reliability, and user trust, ultimately providing a safer mobile app environment.
Conclusion
This study presents a comprehensive approach to detecting fraudulent apps on the Google Play Store using machine learning. We demonstrated the effectiveness of our methodology through extensive experiments and analysis. The decision tree model, with its high accuracy and interpretability, proves to be a valuable tool for identifying fraudulent apps. Our proposed system addresses the limitations of existing methods and offers several advantages, including an enhanced feature set, better handling of data imbalance, increased interpretability, and advanced analytical techniques. Future work will focus on further improving the system's accuracy and reliability by incorporating more dynamic features and exploring advanced machine learning techniques.