This paper presents a robust neural network-based framework for detecting gender-biased hate speech in social media, leveraging optimized text classification techniques. The increasing prevalence of hate speech on platforms like Twitter poses a significant threat to digital discourse, necessitating automated detection systems capable of handling the complexities of natural language. Our proposed model achieves a state-of-the-art classification accuracy of 98.19%, surpassing conventional machine learning approaches. The methodology integrates advanced natural language processing (NLP) techniques with deep learning, employing a dataset of 5,000 tweets collected via the SNSCRAPE API. The system features a meticulously designed text preprocessing pipeline utilizing the NLTK library, vocabulary optimization through frequency-based filtering, and a custom neural network architecture enhanced with dropout regularization for improved generalization.
A key contribution of this work is its ability to address three major challenges in gender-biased hate speech detection: the context-dependent nature of offensive language, extreme class imbalance in real-world datasets, and linguistic variations inherent in short-text social media content. Through extensive experimentation, our model demonstrates a 12.4% improvement in classification performance over traditional approaches such as Support Vector Machines (SVM) and Logistic Regression. Furthermore, the inclusion of an adaptive learning rate mechanism ensures stability in model convergence, while dropout regularization mitigates overfitting. This research contributes to ongoing efforts to develop automated systems for hate speech detection by proposing an optimized deep learning approach capable of handling noisy, unbalanced, and context-sensitive social media data. Future work will extend this framework to multilingual settings and real-time detection applications.
1. Introduction
Hate speech on social media, especially gender-biased content (e.g., misogyny, misandry), is a growing concern. Manual moderation is impractical at scale, and automated systems using NLP and deep learning offer a solution. However, challenges include:
Context-dependency of language
Data imbalance (hate vs. non-hate)
Short text lengths in platforms like Twitter
2. Objective
To develop an optimized neural network-based framework that:
Detects gender-biased hate speech
Handles imbalanced and noisy text data
Outperforms traditional models (e.g., SVM, Naive Bayes)
3. Key Contributions
A custom neural network achieving 98.19% accuracy in binary classification.
A 7-step text preprocessing pipeline achieving 95.6% noise reduction.
Effective class balancing using SMOTE (a minimal sketch follows this list).
Comparative analysis showing clear performance superiority over traditional methods.
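As a concrete illustration of the SMOTE-based balancing mentioned above, the following is a minimal sketch assuming the imbalanced-learn library and an already-vectorized bag-of-words matrix; the array shapes and class ratio are illustrative, not the paper's data.

```python
# Minimal sketch: oversampling the minority class with SMOTE
# (assumes imbalanced-learn; X is a bag-of-words matrix, y holds 0/1 labels).
import numpy as np
from imblearn.over_sampling import SMOTE

rng = np.random.default_rng(42)
X = rng.random((500, 3030))            # illustrative feature matrix (3,030-dim vocabulary)
y = np.array([0] * 450 + [1] * 50)     # illustrative 9:1 class imbalance

X_balanced, y_balanced = SMOTE(random_state=42).fit_resample(X, y)
print(X_balanced.shape, np.bincount(y_balanced))  # both classes now equally represented
```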
4. Methodology
A. Data Collection & Preprocessing
Collected 5,000 tweets using hashtags like #womenarestupid and #menaretrash via SNSCRAPE.
Balanced to 3,973 entries using oversampling.
Preprocessing included:
Removing URLs, punctuation, and stopwords
Lowercasing and token filtering
Minimum token length enforcement
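A minimal sketch of these cleaning steps, assuming NLTK's English stopword list; the URL pattern, whitespace tokenization, and minimum token length of 3 are illustrative choices rather than details taken from the paper.

```python
# Minimal sketch of the tweet-cleaning steps listed above (assumes NLTK).
import re
import string

import nltk
from nltk.corpus import stopwords

nltk.download("stopwords", quiet=True)

STOP_WORDS = set(stopwords.words("english"))
MIN_TOKEN_LEN = 3  # illustrative minimum token length

def clean_tweet(text: str) -> list[str]:
    text = text.lower()                                                 # lowercasing
    text = re.sub(r"https?://\S+|www\.\S+", " ", text)                  # remove URLs
    text = text.translate(str.maketrans("", "", string.punctuation))    # remove punctuation
    tokens = text.split()                                               # simple whitespace tokenization
    return [t for t in tokens                                           # drop stopwords and short tokens
            if t not in STOP_WORDS and len(t) >= MIN_TOKEN_LEN]

print(clean_tweet("Check this out: https://example.com #menaretrash !!!"))
```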
B. Vocabulary Optimization
Reduced vocabulary from 9,564 to 3,030 tokens
Used a frequency threshold θ = 2 (a minimal sketch of this filtering follows)
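A minimal sketch of this frequency-based filtering, applied to an already-tokenized corpus; the helper name and toy data are illustrative.

```python
# Minimal sketch of frequency-based vocabulary filtering with threshold theta = 2:
# keep only tokens that appear at least theta times across the cleaned corpus.
from collections import Counter

THETA = 2  # frequency threshold from the paper

def build_vocabulary(tokenized_tweets: list[list[str]], theta: int = THETA) -> dict[str, int]:
    counts = Counter(token for tweet in tokenized_tweets for token in tweet)
    kept = sorted(token for token, count in counts.items() if count >= theta)
    return {token: index for index, token in enumerate(kept)}  # token -> feature column index

# Illustrative toy corpus; the real input would be the ~5,000 cleaned tweets.
toy_corpus = [["patriarchy", "wage"], ["patriarchy", "gap"], ["wage", "gap"], ["rare"]]
print(build_vocabulary(toy_corpus))  # 'rare' is dropped because it appears only once
```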
C. Model Architecture
Input Layer: 3,030 nodes
Hidden Layer: 50 nodes with ReLU
Output Layer: 1 node with Sigmoid
Dropout: 0.2 to reduce overfitting
Optimizer: Adam (learning rate = 0.001)
Loss: Binary cross-entropy
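The architecture above maps directly onto a small feed-forward network; the following sketch assumes TensorFlow/Keras and bag-of-words inputs of dimension 3,030, with dropout placed after the hidden layer (the exact placement is our reading of the description).

```python
# Minimal sketch of the described architecture (assumes TensorFlow/Keras).
from tensorflow import keras
from tensorflow.keras import layers

VOCAB_SIZE = 3030  # optimized vocabulary size

model = keras.Sequential([
    layers.Input(shape=(VOCAB_SIZE,)),          # input layer: 3,030 nodes
    layers.Dense(50, activation="relu"),        # hidden layer: 50 nodes, ReLU
    layers.Dropout(0.2),                        # dropout of 0.2 to reduce overfitting
    layers.Dense(1, activation="sigmoid"),      # output layer: 1 node, sigmoid
])

model.compile(
    optimizer=keras.optimizers.Adam(learning_rate=0.001),
    loss="binary_crossentropy",
    metrics=["accuracy"],
)
model.summary()
```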
5. Dataset
Consists of diverse, annotated user-generated content.
Focused on English-language posts, with care taken to avoid storing any personally identifiable information (PII).
6. Results
| Metric    | Training | Testing |
|-----------|----------|---------|
| Accuracy  | 99.83%   | 98.19%  |
| Precision | 0.983    | 0.978   |
| Recall    | 0.991    | 0.973   |
| F1-Score  | 0.987    | 0.975   |
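For reference, metrics of this kind are typically computed from held-out predictions; the following is a minimal sketch assuming scikit-learn, with illustrative label arrays rather than the paper's test set.

```python
# Minimal sketch of computing accuracy, precision, recall, and F1 on a held-out split
# (assumes scikit-learn; y_true/y_pred are illustrative stand-ins for real test labels).
from sklearn.metrics import accuracy_score, precision_recall_fscore_support

y_true = [1, 0, 1, 1, 0, 0, 1, 0]
y_pred = [1, 0, 1, 0, 0, 0, 1, 0]

accuracy = accuracy_score(y_true, y_pred)
precision, recall, f1, _ = precision_recall_fscore_support(y_true, y_pred, average="binary")
print(f"accuracy={accuracy:.3f} precision={precision:.3f} recall={recall:.3f} f1={f1:.3f}")
```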
Figure 1: Accuracy and loss curves over training epochs, alongside frequencies of prominent keywords such as “genderpaygap”, “patriarchy”, and “misandry”.
Preprocessing and dropout helped prevent overfitting and improved generalizability.
7. Related Work Highlights
Prior work spans traditional ML, LSTM, CNN, transformer-based models (BERT), and multimodal approaches.
Limitations of earlier models included difficulty handling sarcasm, sensitivity to class imbalance, and a lack of multimodal data support.
8. Discussion & Limitations
Strengths:
High performance
Robust handling of noise and class imbalance
Limitations:
English-only support
Binary classification may miss nuanced hate
May miss slang, coded, or implicit bias
Hashtag-based data collection may introduce sampling bias
Future Work:
Multilingual capability
Multi-class classification
Incorporation of multimodal data (text + images)
Adaptive vocabulary updates
Conclusion
In this study, we have developed a neural network-based approach for detecting gender-biased hate speech in social media content, achieving a state-of-the-art classification accuracy of 98.19%. The model’s architecture, combined with an optimized text preprocessing pipeline and vocabulary reduction strategy, has proven effective in addressing the challenges posed by noisy and unstructured social media data.
To enhance the model’s applicability and robustness, future research should focus on several key areas:
1) Multilingual Support: Incorporate multilingual embeddings or translation models to detect hate speech across various languages, thereby increasing the system’s global applicability.
2) Multi-Class Classification: Develop a more nuanced classification system that can distinguish between different types and severities of hate speech, enabling more targeted content moderation strategies.
3) Real-Time Deployment: Implement the model in a real-time environment, such as a Flask API, to facilitate immediate detection and response to hate speech on social media platforms (a minimal sketch follows this list).
4) Continuous Vocabulary Update: Establish mechanisms for the dynamic updating of the model’s vocabulary to capture emerging terms and slang used in hate speech, ensuring the system remains current with evolving language trends.
5) Comprehensive Data Collection: Expand data collection methods to include a wider array of hate speech examples, encompassing both explicit and implicit instances, to improve the model’s generalization capabilities.
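To make the real-time deployment direction in item 3 concrete, the following is a minimal Flask sketch; the model path, vocabulary subset, and simplified cleaning are illustrative placeholders rather than artifacts of this work.

```python
# Minimal Flask sketch for real-time scoring.  The model path, vocabulary, and
# cleaning logic are illustrative placeholders, not artifacts from the paper.
import re

import numpy as np
from flask import Flask, jsonify, request
from tensorflow import keras

app = Flask(__name__)
model = keras.models.load_model("hate_speech_model.keras")   # assumes a previously trained, saved model
VOCAB = {"patriarchy": 0, "misandry": 1, "stupid": 2}         # illustrative subset of the 3,030-token vocabulary
VOCAB_SIZE = 3030                                             # must match the model's input dimension

def vectorize(text: str) -> np.ndarray:
    """Very simplified cleaning plus bag-of-words encoding (stand-in for the full pipeline)."""
    tokens = re.sub(r"https?://\S+", " ", text.lower()).split()
    x = np.zeros((1, VOCAB_SIZE), dtype="float32")
    for token in tokens:
        if token in VOCAB:
            x[0, VOCAB[token]] = 1.0
    return x

@app.route("/predict", methods=["POST"])
def predict():
    text = request.get_json(force=True).get("text", "")
    score = float(model.predict(vectorize(text), verbose=0)[0, 0])
    return jsonify({"hate_probability": score, "label": int(score >= 0.5)})

if __name__ == "__main__":
    app.run(port=5000)
```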
By addressing these areas, the proposed system can evolve into a more robust and versatile tool for mitigating the proliferation of gender-biased hate speech on social media, contributing to a safer and more inclusive online environment.