Ijraset Journal For Research in Applied Science and Engineering Technology
Authors: D V Divakar Rao, Shravya Shri Sahu, Kilaparthi Durga Mallesh, Vedulla Naga Venkata Nitin, Bandaru Avinash
DOI Link: https://doi.org/10.22214/ijraset.2025.67633
The increase in the use of social media platforms has coincided with a rise in cyber harassment and child predation, raising significant safety concerns. This research presents an AI-driven system aimed at detecting cyber harassers and potential child predators using machine learning (ML) and natural language processing (NLP). The system utilizes a Random Forest Classifier in conjunction with TF-IDF vectorization to classify user messages into categories such as hate speech, offensive content, and neutral interactions. Unlike conventional methods, which depend largely on manual oversight and basic keyword filtering, this approach improves accuracy, contextual understanding, and automated response. The system is implemented with a Flask-based backend and a React-powered frontend, facilitating real-time content analysis, automatic threat detection, and an administrative dashboard for moderation. The experimental findings highlight the system's capability to identify harmful online behaviors, providing a proactive solution for a safer social media environment, especially for at-risk users.
The widespread use of social media has revolutionized communication but also escalated challenges like cyber harassment and predatory behavior. Traditional moderation methods—manual review, keyword filtering, and basic machine learning—are inadequate due to their inability to understand context, detect subtle threats, or keep pace with evolving abuse tactics.
To address these shortcomings, the paper introduces a proactive AI-powered system that detects cyber harassment and child predation in real time. The system uses Machine Learning (ML) and Natural Language Processing (NLP), primarily employing a Random Forest Classifier alongside TF-IDF vectorization for accurate classification of online text into hate speech, offensive language, predatory messages, or neutral content. This combination allows detection of hidden patterns and subtle abuse strategies like grooming or coded language.
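To make this pairing concrete, the following minimal scikit-learn sketch assembles a TF-IDF plus Random Forest pipeline of the kind described above. The dataset path, column names, class labels, and hyperparameters are illustrative assumptions, not values reported in the paper.

```python
# Minimal sketch of the core pipeline: TF-IDF features feeding a Random
# Forest classifier. Dataset path, column names, class labels, and
# hyperparameters are illustrative assumptions.
import pandas as pd
from sklearn.ensemble import RandomForestClassifier
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.model_selection import train_test_split
from sklearn.pipeline import Pipeline

# Hypothetical labeled dataset: a text column and a label column with
# classes such as "hate_speech", "offensive", "predatory", "neutral".
df = pd.read_csv("labeled_messages.csv")
X_train, X_test, y_train, y_test = train_test_split(
    df["message"], df["label"],
    test_size=0.2, stratify=df["label"], random_state=42,
)

# TF-IDF turns each message into a sparse, weighted term vector; the
# Random Forest then classifies by majority vote over its decision trees.
model = Pipeline([
    ("tfidf", TfidfVectorizer(lowercase=True, ngram_range=(1, 2), max_features=50_000)),
    ("clf", RandomForestClassifier(n_estimators=200, n_jobs=-1, random_state=42)),
])
model.fit(X_train, y_train)
print(model.predict(["you are going to regret posting that"]))
```

Wrapping both steps in a single Pipeline keeps the vectorizer vocabulary and the classifier in sync, so one serialized object can be loaded and served by the backend.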
Machine Learning Models: Includes Random Forest, SVM, Logistic Regression, Naïve Bayes, and LSTM. Ensemble and hybrid models are also explored to enhance accuracy and contextual understanding.
NLP Techniques: TF-IDF, Word2Vec, and GloVe are used for feature extraction and for capturing semantic meaning.
Real-Time Monitoring: Built with Flask (backend) and React (frontend) to provide immediate detection and interactive dashboards for moderators and law enforcement; a minimal endpoint sketch follows this list.
Data Handling: Uses labeled datasets, synthetic data, and ethical scraping, adhering to privacy standards like GDPR.
Security Measures: Implements OAuth 2.0, JWT, SSL, and rate limiting to ensure secure and scalable operation.
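As a rough illustration of how the Flask backend, threat classification, and rate limiting could fit together, the sketch below exposes the trained model behind a single moderation endpoint. The model filename, route, and limit values are assumptions; JWT/OAuth checks are omitted, and rate limiting is reduced to a naive in-memory stand-in to show where it sits in the request path. A production deployment would use hardened libraries for both.

```python
# Bare-bones sketch of the real-time moderation endpoint. The model file
# name and route are hypothetical; authentication is omitted and rate
# limiting is a naive in-memory sliding window for illustration only.
import time

import joblib
from flask import Flask, jsonify, request

app = Flask(__name__)
model = joblib.load("moderation_model.joblib")  # serialized TF-IDF + Random Forest pipeline

RATE_LIMIT = 60       # max requests per client per window (assumed value)
WINDOW_SECONDS = 60
_hits: dict[str, list[float]] = {}

def allow(client: str) -> bool:
    """Sliding-window limiter: at most RATE_LIMIT calls per WINDOW_SECONDS."""
    now = time.time()
    recent = [t for t in _hits.get(client, []) if now - t < WINDOW_SECONDS]
    recent.append(now)
    _hits[client] = recent
    return len(recent) <= RATE_LIMIT

@app.post("/api/moderate")
def moderate():
    if not allow(request.remote_addr or "unknown"):
        return jsonify(error="rate limit exceeded"), 429
    text = (request.get_json(silent=True) or {}).get("message", "")
    label = model.predict([text])[0]
    # Anything non-neutral is flagged for the React moderator dashboard.
    return jsonify(label=label, flagged=label != "neutral")

if __name__ == "__main__":
    app.run(port=5000)
```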
Data Collection: Text data from social platforms, pre-labeled datasets, and synthetic messages.
Preprocessing: Cleaning, tokenization, vectorization, and normalization of text.
Model Training: Evaluation of various ML models; Random Forest is favored for its balance of performance and scalability.
System Architecture: Flask-based backend for model deployment and a React-based dashboard for human review and intervention.
Evaluation Metrics: Accuracy, precision, recall, and F1-score are used to assess model performance; a short computation sketch follows this list.
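The preprocessing and evaluation steps can be sketched as follows, reusing the model and held-out split from the training sketch above. The specific cleaning rules are assumptions, since the paper does not enumerate its normalization steps.

```python
# Sketches of the preprocessing and evaluation steps, reusing `model`,
# `X_test`, and `y_test` from the training sketch. The cleaning rules
# below are illustrative assumptions.
import re

from sklearn.metrics import accuracy_score, classification_report

def clean_text(text: str) -> str:
    """Illustrative normalization applied before vectorization:
    lowercase, strip URLs, drop non-letters, collapse whitespace."""
    text = text.lower()
    text = re.sub(r"https?://\S+", " ", text)  # remove URLs
    text = re.sub(r"[^a-z\s]", " ", text)      # remove digits/punctuation
    return re.sub(r"\s+", " ", text).strip()

# classification_report prints per-class precision, recall, and F1
# alongside overall accuracy, matching the metrics listed above.
y_pred = model.predict(X_test)
print("accuracy:", accuracy_score(y_test, y_pred))
print(classification_report(y_test, y_pred, digits=3))
```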
The AI system demonstrated high accuracy in identifying harmful content, maintaining strong performance across metrics such as:
Precision and recall: Effectively identifies threats while minimizing false positives.
Scalability and responsiveness: Performs well under high user loads and in real-time scenarios.
As social media continues to grow, cyber harassment and online abuse become increasingly pressing problems. The ease of digital communication has led to a surge in inappropriate interactions, necessitating intelligent, automated systems that identify and block harmful content before it is shared publicly. This paper proposes an AI-powered content moderation system that uses machine learning (ML) and natural language processing (NLP) to classify user-generated messages into categories such as hate speech, offensive language, and neutral content. With real-time detection, the system identifies and blocks harmful interactions before they can escalate [7].

Unlike traditional moderation techniques that rely on basic keyword filtering, which often misclassifies content and produces a high rate of false positives, the proposed method improves accuracy through TF-IDF vectorization and Random Forest classification, significantly reducing both false positives and false negatives [1]. To support efficient operation and seamless real-time analysis, the system combines a Flask-based backend with a React-driven frontend, providing an intuitive dashboard where moderators can review flagged content, evaluate its severity, and take appropriate action. In extensive evaluation and testing, the proposed system demonstrated high classification accuracy, effectively preventing harmful content from being overlooked [6]. Automatically detecting and blocking inappropriate messages offers a more efficient alternative to manual content moderation, which struggles with the vast volume of user-generated data [8].

In the future, several improvements can enhance the system's capabilities. Incorporating advanced deep learning architectures such as BERT and other Transformer-based models can improve contextual understanding, enabling the system to better interpret linguistic nuances, evolving conversation patterns, and subtle forms of harassment [9]. Expanding multilingual support will make the system more inclusive, allowing it to identify harmful content across languages and cultural contexts [5]. In addition, direct collaboration with major social media platforms would enable automated large-scale content filtering that adheres to platform-specific policies while improving moderation effectiveness [12].

As digital interactions continue to evolve, the need for proactive, intelligent, and scalable moderation tools will only grow. Detecting and preventing harmful interactions in real time is crucial to fostering a safer and more respectful online environment. By integrating machine learning, natural language processing, and cloud-based technologies, this system represents a significant advance in digital safety [10]. With ongoing improvements, real-world deployment, and strategic partnerships, it can become an essential tool against cyber harassment, ultimately contributing to a more secure and inclusive online space for all users [14].
[1] N. AlDahoul, H. Karim, M. Abdullah, M. Fauzi, A. Wazir, S. Mansor, and J. See, "Transfer detection of YOLO to focus CNNs attention on nude regions for adult content detection," Symmetry, vol. 13, no. 1, p. 26, 2020. Available: https://www.mdpi.com/2073-8994/13/1/26
[2] A. Bochkovskiy, "Darknet," 2019. [Online]. Available: https://github.com/AlexeyAB/darknet
[3] S. Avila, N. Thome, M. Cord, E. Valle, and A. de A. Araújo, "Pooling in image representation: The visual codeword point of view," Comput. Vis. Image Understand., vol. 117, no. 5, pp. 453-465, May 2013. Available: https://www.researchgate.net/publication/257484792_Pooling_in_Image_Representation_the_Visual_Codeword_Point_of_View
[4] D. Bogdanova, P. Rosso, and T. Solorio, "Exploring high-level features for detecting cyberpedophilia," Comput. Speech Lang., vol. 28, no. 1, pp. 108-120, Jan. 2014. Available: https://www.researchgate.net/publication/259133212_Exploring_high-level_features_for_detecting_cyberpedophilia
[5] A. E. Cano, M. Fernandez, and H. Alani, "Detecting child grooming behaviour patterns on social media," in Social Informatics (Lecture Notes in Computer Science). Springer, 2014, pp. 412-427. Available: https://link.springer.com/chapter/10.1007/978-3-319-13734-6_30
[6] A. Chatterjee, K. N. Narahari, M. Joshi, and P. Agrawal, "SemEval-2019 task 3: EmoContext contextual emotion detection in text," in Proc. 13th Int. Workshop Semantic Eval., 2019, pp. 1-10. Available: https://dl.acm.org/doi/10.1109/TAFFC.2021.3053275
[7] M. Dadvar and K. Eckert, "Cyberbullying detection in social networks using deep learning-based models," in Big Data Analytics and Knowledge Discovery (Lecture Notes in Computer Science). Springer, 2020, pp. 245-255. Available: https://link.springer.com/chapter/10.1007/978-3-030-59065-9_20
[8] M. Dadvar, D. Trieschnigg, R. Ordelman, and F. D. Jong, "Improving cyberbullying detection with user context," in Advances in Information Retrieval (Lecture Notes in Computer Science). Springer, 2013, pp. 693-696. Available: https://www.researchgate.net/publication/366657071_DETECTING_CYBERBULLYING_IN_SOCIAL_MEDIA_PLATFORMS_USING_MACHINE_LEARNING_ALGORITHMS
[9] J. Devlin, M.-W. Chang, K. Lee, and K. Toutanova, "BERT: Pre-training of deep bidirectional transformers for language understanding," 2018, arXiv:1810.04805. [Online]. Available: https://arxiv.org/abs/1810.04805
[10] M. Ebrahimi, C. Y. Suen, and O. Ormandjieva, "Detecting predatory conversations in social media by deep convolutional neural networks," Digit. Invest., vol. 18, pp. 33-49, Sep. 2016. Available: https://www.researchgate.net/publication/337501984_Predatory_Conversation_Detection
[11] M. Ebrahimi, C. Suen, O. Ormandjieva, and A. Krzyzak, "Recognizing predatory chat documents using semi-supervised anomaly detection," Electron. Imag., vol. 2016, no. 17, pp. 1-9, Feb. 2016. Available: https://www.researchgate.net/publication/322650456_Using_Machine_Learning_to_Detect_Fake_Identities_Bots_vs_Humans
[12] H. Escalante, E. Villatoro-Tello, A. Juárez-González, M. Montes, and L. Villaseñor-Pineda, "Sexual predator detection in chats with chained classifiers," in Proc. 4th Workshop Comput. Approaches Subjectivity, Sentiment Social Media Anal., 2013, pp. 46-54. Available: https://www.researchgate.net/publication/283328545_Automated_Identification_of_Child_Abuse_in_Chat_Rooms_by_Using_Data_Mining_260415-124242
[13] EU COST Action IS0801 on Cyberbullying. EU COST, 2010. [Online]. Available: https://sites.google.com/site/costis0801
[14] EU Kids Online: Researching European Children's Online Opportunities, Risks and Safety, London School of Economics and Political Science, London, U.K., 2014. Available: https://www.lse.ac.uk/media-and-communications/research/research-projects/eu-kids-online
[15] J. Guglani and A. N. Mishra, "DNN based continuous speech recognition system of Punjabi language on Kaldi toolkit," Int. J. Speech Technol., vol. 24, no. 1, pp. 41-45, Mar. 2021. Available: https://dl.acm.org/doi/10.1007/s10772-020-09717-8
[16] J. Davidson, J. Gottschalk, and B. Shodipo, "Child Online Protection: Risks, Regulation, and Research," Journal of Digital Safety and Security, vol. 5, no. 3, pp. 45-62, 2020. Available: https://www.researchgate.net/publication/263716112_Children_and_online_risk
[17] K. Reynolds, R. Kontostathis, and L. Edwards, "Using Machine Learning to Detect Cyber Predators," AI & Society, vol. 29, no. 4, pp. 651-666, 2015. Available: https://www.researchgate.net/publication/254051434_Using_Machine_Learning_to_Detect_Cyberbullying
[18] L. Xu, F. Qian, X. Li, and J. Zhang, "Deep Learning for Cyber Harassment Detection in Social Media: A Comparative Study," IEEE Transactions on Computational Social Systems, vol. 7, no. 2, pp. 253-267, 2021. Available: https://www.mdpi.com/1999-5903/15/5/179
[19] M. Schmidt and P. Wiegand, "AI-Based Approaches to Online Safety: Challenges and Innovations," International Journal of Cybersecurity Research, vol. 12, no. 1, pp. 89-107, 2022. Available: https://www.researchgate.net/publication/377235308_Artificial_Intelligence_in_Cyber_Security
[20] N. Kumar, S. Sharma, and R. Gupta, "Real-Time Detection of Offensive Content on Social Media Using Deep Learning Techniques," Journal of Information and Optimization Sciences, vol. 41, no. 5, pp. 1113-1125, 2020. Available: https://www.researchgate.net/publication/350584572_A_Review_on_the_Detection_of_Offensive_Content_in_Social_Media_Platforms
[21] O. P. John, L. P. Naumann, and C. J. Soto, "Paradigm shift to the integrative Big Five trait taxonomy: History, measurement, and conceptual issues," in Handbook of Personality: Theory and Research, 3rd ed., O. P. John, R. W. Robins, and L. A. Pervin, Eds. New York: Guilford Press, 2008, pp. 114-158. Available: https://www.researchgate.net/publication/289963274_Paradigm_shift_to_the_integrative_big_five_trait_taxonomy_History_measurement_and_conceptual_issues
[22] D. Jurgens, T. Chandrasekharan, L. Hemphill, and E. Gilbert, "A just and comprehensive strategy for using NLP to address online abuse," Proc. ACM Hum. Comput. Interact., vol. 2, pp. 1-33, 2019. Available: https://dl.acm.org/doi/10.1145/3359276
[23] S. Fortuna and S. Nunes, "A survey on automatic detection of hate speech in text," ACM Comput. Surv., vol. 51, no. 4, pp. 1-30, Jul. 2018. Available: https://www.semanticscholar.org/paper/A-Survey-on-Automatic-Detection-of-Hate-Speech-in-Fortuna-Nunes/f9c56fb6e3001f3acbc994a894b4190d78270e1b
[24] A. Schmidt and M. Wiegand, "A survey on hate speech detection using natural language processing," in Proc. 5th Int. Workshop NLP Comput. Soc. Sci., 2017, pp. 1-10. Available: https://aclanthology.org/W17-1101/
[25] B. Gambäck and U. Sikdar, "Using convolutional neural networks to classify hate speech," in Proc. 1st Workshop Abusive Lang. Online, 2017, pp. 85-90. Available: https://aclanthology.org/W17-3013/
Copyright © 2025 D V Divakar Rao, Shravya Shri Sahu, Kilaparthi Durga Mallesh, Vedulla Naga Venkata Nitin, Bandaru Avinash. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
Paper Id : IJRASET67633
Publish Date : 2025-03-19
ISSN : 2321-9653
Publisher Name : IJRASET