Android Malware Detection Using Machine Learning: A Hybrid Approach with Twitter-Based Hash Updates and Deep Learning Classification

Authors: M Tanmaya, K Sowmya, I Lavanya

DOI Link: https://doi.org/10.22214/ijraset.2026.81736

Abstract

Smartphones play a pivotal role in daily life, which makes their security and privacy critically important. This is particularly true for the Android operating system, which dominates the smartphone market with a 70.97% share, making it a prime target for malware developers. This paper introduces a multilayer hybrid method for Android malware detection that integrates real-time data extraction from Twitter with machine learning techniques. The Twitter API is used to update a malware hash database (MD5, SHA-1, SHA-256) every 48 hours, capturing the latest malware signatures. In addition, two machine learning models, Random Forest (RF) and Long Short-Term Memory (LSTM), analyze application permissions, achieving 88% accuracy in malware detection. A Java-based prototype anti-malware application demonstrates the practical effectiveness of the proposed system.
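The hash-update step described in the abstract can be illustrated with a minimal Python sketch. It assumes the tweet text has already been fetched (the Twitter API call itself is out of scope here) and that hashes appear in the text as plain hexadecimal strings. The function names `extract_hashes` and `refresh_due` are hypothetical and are not taken from the authors' prototype.

```python
import re
from datetime import datetime, timedelta, timezone

# Hex digest lengths of the three hash types named in the abstract.
HASH_LENGTHS = {32: "md5", 40: "sha1", 64: "sha256"}

# Any standalone run of hexadecimal characters; length is checked afterwards.
HEX_RUN = re.compile(r"\b[0-9a-fA-F]+\b")

# Update period stated in the abstract.
REFRESH_INTERVAL = timedelta(hours=48)


def extract_hashes(text):
    """Return {hash_type: set of lowercase digests} found in free text.

    Assumption: a hex run of length 32, 40, or 64 is treated as a hash.
    """
    found = {name: set() for name in HASH_LENGTHS.values()}
    for token in HEX_RUN.findall(text):
        kind = HASH_LENGTHS.get(len(token))
        if kind is not None:
            found[kind].add(token.lower())
    return found


def refresh_due(last_update, now):
    """True once at least 48 hours have passed since the last update."""
    return now - last_update >= REFRESH_INTERVAL
```

Classifying by digest length is a simplification: it cannot tell an MD5 digest from any other 128-bit hex value, so a production pipeline would likely validate candidates against a second source before storing them.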

Introduction

This paper addresses the growing threat of Android malware, especially as smartphones become central to banking, communication, and daily activities. It notes that Android’s large market share makes it a major target for cyberattacks, including banking trojans such as SharkBot, which can bypass security measures and steal financial data.

To address these threats, the paper proposes a hybrid malware detection system combining two approaches:

Hash-based detection, which identifies known malware by signature matching against a hash database that is updated automatically via the Twitter API and a cloud database.
Machine learning-based detection, which analyzes app permissions using models such as Random Forest and LSTM to detect unknown or suspicious apps.
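The two layers listed above can be sketched as a single scan routine in Python. This is an illustration under stated assumptions, not the authors' Java prototype: the hash database is modeled as in-memory sets, and `classifier` is a hypothetical callable standing in for a trained Random Forest or LSTM model. The names `file_digests`, `encode_permissions`, and `scan` are likewise invented for this example.

```python
import hashlib


def file_digests(data):
    """Compute the three digests the hash database stores for a file."""
    return {
        "md5": hashlib.md5(data).hexdigest(),
        "sha1": hashlib.sha1(data).hexdigest(),
        "sha256": hashlib.sha256(data).hexdigest(),
    }


def encode_permissions(requested, vocabulary):
    """Binary feature vector: 1 if the app requests the permission, else 0."""
    requested = set(requested)
    return [1 if perm in requested else 0 for perm in vocabulary]


def scan(data, requested_permissions, known_hashes, vocabulary, classifier):
    """Layer 1: hash lookup. Layer 2: permission-based classification.

    `classifier` is a placeholder (assumption): any callable that takes a
    feature vector and returns True when the app looks malicious.
    """
    for kind, digest in file_digests(data).items():
        if digest in known_hashes.get(kind, set()):
            return "malicious (known hash)"
    features = encode_permissions(requested_permissions, vocabulary)
    if classifier(features):
        return "suspicious (permissions)"
    return "no detection"
```

Running the hash lookup first means the model is consulted only for files with no known signature, which matches the paper's framing of the permission classifier as the layer for unknown or suspicious apps.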

The problem statement explains limitations in existing systems, such as reliance on outdated signature databases, poor detection of new (zero-day) malware, high false positives in permission-based methods, vulnerability to obfuscation techniques, and lack of real-time, user-friendly applications.

The gap analysis reinforces these issues, noting that most current systems are single-layered, not continuously updated, computationally inefficient at scale, and not designed for real-world deployment or non-technical users.

The literature survey shows that recent research has applied advanced machine learning and deep learning techniques (e.g., LSTM, GRU, ensemble models, graph-based networks) and reports high accuracy (often 97–99% or above), but these approaches still lack robustness, scalability, and practical usability.

Finally, the proposed system integrates real-time hash updates, machine learning classification, and a user-friendly Android application to provide a more adaptive, scalable, and practical solution for detecting both known and unknown malware threats in Android environments.

Conclusion

This paper presented a novel hybrid Android malware detection system combining automated hash-database updates via the Twitter API with machine learning-based permission classification using Random Forest and LSTM models. Trained on 837 samples (288 benign, 549 malicious), the system achieved 88% classification accuracy. The dual-component architecture reduces both false positives and false negatives while remaining adaptive to the evolving malware landscape. Future work will explore: (1) integration of dynamic behavioral features (API call sequences, network traffic patterns) alongside static permission analysis; (2) adversarial robustness training to counter obfuscated malware; (3) expansion of the Twitter-based update pipeline to include other threat intelligence sources such as VirusTotal and MISP feeds; and (4) deployment on cloud platforms for scalable, real-time protection at the enterprise level.
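To put the reported 88% accuracy in context, the class counts given above allow a simple majority-class baseline to be computed. The Python sketch below does that arithmetic and adds a helper for the false positive and false negative rates the conclusion refers to. The helper takes confusion-matrix counts as input because this summary does not report them, and it is not stated here whether the 88% figure was measured on a held-out split.

```python
BENIGN = 288      # benign samples, from the conclusion
MALICIOUS = 549   # malicious samples, from the conclusion


def majority_baseline(benign, malicious):
    """Accuracy of a classifier that always predicts the larger class."""
    return max(benign, malicious) / (benign + malicious)


def error_rates(tp, tn, fp, fn):
    """Accuracy, false positive rate, and false negative rate.

    The counts are inputs supplied by the caller; none are reported
    in this summary of the paper.
    """
    return {
        "accuracy": (tp + tn) / (tp + tn + fp + fn),
        "false_positive_rate": fp / (fp + tn),
        "false_negative_rate": fn / (fn + tp),
    }


baseline = majority_baseline(BENIGN, MALICIOUS)
print(f"majority-class baseline: {baseline:.3f}")  # prints 0.656
```

Because roughly two thirds of the samples are malicious, always predicting "malicious" already scores about 65.6%, so accuracy alone understates how much the per-class error rates matter on this dataset.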

Copyright

Copyright © 2026 M Tanmaya, K Sowmya, I Lavanya. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

Paper Id : IJRASET81736

Publish Date : 2026-05-02

ISSN : 2321-9653

Publisher Name : IJRASET
