Everyone has the right to freedom of expression. However, under the guise of free speech, this right is sometimes abused to discriminate against and harm others, whether physically or verbally. This kind of bigotry is termed hate speech. Hate speech is language that expresses hatred toward an individual or a group based on traits such as race, religion, ethnicity, gender, nationality, disability, and sexual orientation. It can take the form of speech, writing, gestures, or displays that target someone because of their affiliation with a particular group. Hate speech has become more prevalent in recent years, both in person and online. Hateful content is created and shared on social media and other internet platforms, which can ultimately lead to hate crimes. The growing use of social media platforms for information exchange has brought significant benefits, but it has also enabled the spread of hate speech messages. To address this growing problem, recent studies have combined machine learning and deep learning techniques with text mining methods to detect hate speech automatically in real-time social media data. This paper therefore surveys the algorithms used to detect hateful comments and identifies those that perform best on social media datasets. It also considers deployment in real-time social environments, where detected hate speech triggers a mobile notification.
I. Introduction
Social media has become a vital part of daily life, enabling users to express opinions and connect globally. However, it also serves as a platform for hate speech, cyberbullying, blackmail, sexism, racism, and political abuse. The accessibility of high-speed internet and smartphones has increased user engagement, especially among people under 30.
As a result, researchers have utilized vast social media data for sentiment analysis and hate speech detection using Natural Language Processing (NLP) and Machine Learning (ML).
II. Related Work – Key Research Contributions
P. Fortuna & S. Nunes: Discussed challenges in defining and detecting hate speech across languages and platforms. Emphasized the need for shared datasets and annotation standards.
Z. Al-Makhadmeh & A. Tolba: Used KNLPEDNN (killer natural language processing optimizing ensemble deep learning), an approach combining NLP with ensemble deep learning, which improved accuracy and reduced misclassification in Twitter hate speech detection.
R. Cao et al.: Proposed DeepHate, a deep learning model using multi-faceted textual features. Outperformed existing models across three datasets and improved model explainability.
Z. Waseem & D. Hovy: Created a dataset of 16k tweets labeled for hate speech. Found that character-level n-grams and gender features helped improve detection.
T. Davidson et al.: Developed a multi-class classifier to distinguish between hate speech, offensive language, and neutral text using crowd-sourced tags. Noted challenges in context sensitivity.
P. Badjatiya et al.: Tested classifiers such as Logistic Regression, SVM, DNN, CNN, and LSTM using text representations such as TF-IDF, BoW, and word embeddings. Found that DNN-based methods performed well but struggled with language complexity.
M. O. Ibrohim & I. Budi: Built a multi-label hate speech and abusive language dataset from Indonesian Twitter and evaluated SVM, Naive Bayes, and Random Forest (RF) with problem transformation techniques such as Label Powerset (LP) and Classifier Chains. Found that RF with LP yielded the best accuracy.
I. Alfina et al.: Developed a general-purpose hate speech dataset in Indonesian covering religion, race, and gender. Studied combinations of machine learning methods to detect hate speech.
J. Salminen et al.: Built a cross-platform hate speech classifier using BERT, Word2Vec, and TF-IDF features, and found that XGBoost with BERT features gave the best results (F1 = 0.92). Released it as an open-source mobile app.
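The Label Powerset (LP) transformation mentioned in the Ibrohim and Budi entry above can be illustrated with a minimal sketch. It treats every distinct combination of labels as one class, so that any single-label classifier can be trained on multi-label data. The function names and the toy annotations below are hypothetical and are not taken from the cited work; Classifier Chains are not shown.

```python
def label_powerset_fit(label_sets):
    """Assign one class id to each distinct label combination."""
    combos = sorted({tuple(sorted(labels)) for labels in label_sets})
    return {combo: class_id for class_id, combo in enumerate(combos)}


def label_powerset_transform(label_sets, mapping):
    """Replace each label set with the class id of its combination."""
    return [mapping[tuple(sorted(labels))] for labels in label_sets]


# Hypothetical multi-label annotations for five tweets (assumption).
labels = [{"hate", "abusive"}, {"abusive"}, set(), {"hate", "abusive"}, {"hate"}]

mapping = label_powerset_fit(labels)
classes = label_powerset_transform(labels, mapping)
print(mapping)
print(classes)
```

One consequence of this design is that the number of classes grows with the number of label combinations seen in training, and a combination that never appears in the training data cannot be predicted.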
III. Existing Methodologies
A. Keyword-Based Detection
Uses predefined dictionaries (e.g., Hatebase) to detect hateful content.
Fast but lacks context and struggles with subtle or implied hate speech.
High false-positive rate when listed terms appear in non-hateful contexts; low recall when hate is implied without any listed term.
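The behaviour described above can be seen in a minimal sketch of dictionary matching. The lexicon here is a hypothetical placeholder with made-up terms; a real system would load entries from a curated resource such as Hatebase, and the function names are assumptions for illustration only.

```python
import re

# Hypothetical placeholder lexicon (assumption); not real Hatebase entries.
LEXICON = {"slurword", "hateterm"}


def tokenize(text):
    """Lowercase the text and split it into alphabetic tokens."""
    return re.findall(r"[a-z']+", text.lower())


def keyword_flag(text, lexicon=LEXICON):
    """Return the lexicon terms found in the text; an empty list means not flagged."""
    return sorted(set(tokenize(text)) & lexicon)


examples = [
    "you are a slurword",                       # explicit term: flagged
    "the article explains why slurword hurts",  # non-hateful mention: false positive
    "people like you should not exist",         # implied hate: missed
]
for text in examples:
    print(keyword_flag(text), "<-", text)
```

The second and third examples show the two failure modes: the matcher has no notion of context, so a discussion of a term is flagged, while a hostile message that avoids listed terms passes unnoticed.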
B. Machine Learning Approaches
Preprocessing & Feature Selection
Uses n-grams, stemming, TF-IDF, Bag of Words, and word embeddings (e.g., Word2Vec).
Advanced models (RNNs, Transformers) consider word order and context.
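As a rough illustration of the feature extraction steps listed above, the following sketch builds word and character n-grams and computes TF-IDF weights with the standard library only. It assumes the unsmoothed variant tf × log(N / df); libraries often use smoothed variants, so the exact numbers would differ. The corpus and function names are hypothetical.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9']+", text.lower())


def word_ngrams(tokens, n):
    """Contiguous word n-grams, joined with a space."""
    return [" ".join(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def char_ngrams(text, n):
    """Contiguous character n-grams of the raw string."""
    return [text[i:i + n] for i in range(len(text) - n + 1)]


def tfidf(docs):
    """docs: list of token lists. Returns one {term: weight} dict per document."""
    n_docs = len(docs)
    # Document frequency: number of documents containing each term.
    df = Counter(term for doc in docs for term in set(doc))
    vectors = []
    for doc in docs:
        tf = Counter(doc)
        vectors.append({
            term: (count / len(doc)) * math.log(n_docs / df[term])
            for term, count in tf.items()
        })
    return vectors


# Hypothetical toy corpus (assumption).
corpus = ["I love this group", "I hate this group", "this is a post"]
docs = [tokenize(text) for text in corpus]
vectors = tfidf(docs)
print(word_ngrams(docs[1], 2))
print(char_ngrams("hate", 3))
print(vectors[1])
```

A term that occurs in every document, such as "this" in the toy corpus, receives weight zero, whereas a term unique to one document is weighted most heavily. Bag-of-Words and TF-IDF vectors discard word order, which is the limitation that the sequence models mentioned above address.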
Classification Models
Algorithms: Naive Bayes, SVM, Logistic Regression, Random Forest, XGBoost, DNNs
Goal: Train models using labeled datasets to detect hate vs. non-hate content.
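To make the training goal concrete, here is a minimal sketch of one of the listed algorithms, a multinomial Naive Bayes classifier with add-one (Laplace) smoothing, written with the standard library only. The class name, the four training messages, and their labels are hypothetical toy data; a real study would train on a labeled dataset such as those surveyed in Section II.

```python
import math
import re
from collections import Counter, defaultdict


def tokenize(text):
    """Lowercase the text and split it into alphabetic tokens."""
    return re.findall(r"[a-z']+", text.lower())


class NaiveBayes:
    """Multinomial Naive Bayes with add-one smoothing."""

    def fit(self, texts, labels):
        self.class_counts = Counter(labels)
        self.word_counts = defaultdict(Counter)
        for text, label in zip(texts, labels):
            self.word_counts[label].update(tokenize(text))
        self.vocab = {w for counts in self.word_counts.values() for w in counts}
        return self

    def predict(self, text):
        total_docs = sum(self.class_counts.values())
        scores = {}
        for label, n_docs in self.class_counts.items():
            counts = self.word_counts[label]
            denom = sum(counts.values()) + len(self.vocab)
            score = math.log(n_docs / total_docs)  # log prior
            for token in tokenize(text):
                if token in self.vocab:  # skip words never seen in training
                    score += math.log((counts[token] + 1) / denom)
            scores[label] = score
        return max(scores, key=scores.get)


# Hypothetical toy training data (assumption).
train_texts = [
    "i hate those people they are vermin",
    "those people are disgusting vermin",
    "i love this sunny day",
    "what a great game last night",
]
train_labels = ["hate", "hate", "non-hate", "non-hate"]

model = NaiveBayes().fit(train_texts, train_labels)
print(model.predict("they are vermin"))
print(model.predict("great sunny day"))
```

With so few examples the model only memorizes word frequencies per class; it is meant to show the train-then-predict workflow, not a usable detector.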
IV. Key Challenges
Ambiguity between offensive and hateful speech.
Contextual understanding and sarcasm detection remain difficult.
Multilingual and culturally nuanced expressions of hate require robust datasets.
Balancing precision vs. recall and minimizing bias in classifiers.
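The precision versus recall trade-off in the last item can be sketched by varying the decision threshold of a classifier. The scores and gold labels below are hypothetical values chosen for illustration, with 1 denoting hate speech.

```python
def precision_recall_f1(y_true, y_pred, positive=1):
    """Compute precision, recall, and F1 for the positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


# Hypothetical classifier scores and gold labels (assumption).
scores = [0.95, 0.80, 0.60, 0.40, 0.30, 0.10]
y_true = [1, 1, 0, 1, 0, 0]

for threshold in (0.7, 0.5, 0.2):
    y_pred = [int(score >= threshold) for score in scores]
    p, r, f1 = precision_recall_f1(y_true, y_pred)
    print(f"threshold={threshold}: precision={p:.2f} recall={r:.2f} f1={f1:.2f}")
```

On this toy data, the strictest threshold flags only true hate speech but misses one instance, while the most lenient threshold catches every instance at the cost of flagging two benign messages. Which operating point is preferable depends on the relative cost of wrongly removing content versus leaving hate speech online.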
V. Conclusion
This paper surveyed existing machine learning and deep learning models for hate speech detection. The widely used machine learning and deep learning approaches for text classification were explored and compared. We conclude that deep learning models can address a variety of problems in this area. We found that several forms of BPNN perform well in sequential learning tasks and mitigate the vanishing and exploding gradient problems that standard text classification algorithms face when learning long-term dependencies. Furthermore, the performance of BPNN models can be affected by the hidden size and batch size.
References
[1] P. Fortuna and S. Nunes, "A survey on automatic detection of hate speech in text," ACM Comput. Surv., vol. 51, no. 4, pp. 1–30, Sep. 2018.
[2] Z. Al-Makhadmeh and A. Tolba, "Automatic hate speech detection using killer natural language processing optimizing ensemble deep learning approach," Computing, vol. 102, no. 2, pp. 501–522, Feb. 2020.
[3] R. Cao, R. K.-W. Lee, and T.-A. Hoang, "DeepHate: Hate speech detection via multi-faceted text representations," in Proc. 12th ACM Conf. Web Sci., Southampton, U.K., Jul. 2020, pp. 11–20.
[4] Z. Waseem and D. Hovy, "Hateful symbols or hateful people? Predictive features for hate speech detection on Twitter," in Proc. NAACL Student Res. Workshop, San Diego, CA, USA, Jun. 2016, pp. 88–93.
[5] T. Davidson, D. Warmsley, M. Macy, and I. Weber, "Automated hate speech detection and the problem of offensive language," in Proc. ICWSM, Montreal, QC, Canada, May 2017, pp. 15–18.
[6] P. Badjatiya, S. Gupta, M. Gupta, and V. Varma, "Deep learning for hate speech detection in tweets," in Proc. 26th Int. Conf. World Wide Web Companion (WWW Companion), Perth, WA, Australia, Apr. 2017, pp. 759–760.
[7] M. O. Ibrohim and I. Budi, "Multi-label hate speech and abusive language detection in Indonesian Twitter," in Proc. 3rd Workshop Abusive Lang. Online, Florence, Italy, Aug. 2019, pp. 46–57.
[8] I. Alfina, R. Mulia, M. I. Fanany, and Y. Ekanata, "Hate speech detection in the Indonesian language: A dataset and preliminary study," in Proc. Int. Conf. Adv. Comput. Sci. Inf. Syst. (ICACSIS), Jakarta, Indonesia, Oct. 2017, pp. 233–238.
[9] M. O. Ibrohim and I. Budi, "A dataset and preliminaries study for abusive language detection in Indonesian social media," Procedia Comput. Sci., vol. 135, pp. 222–229, Jan. 2018.
[10] J. Salminen, M. Hopf, S. A. Chowdhury, S.-G. Jung, H. Almerekhi, and B. J. Jansen, "Developing an online hate classifier for multiple social media platforms," Hum.-centric Comput. Inf. Sci., vol. 10, no. 1, pp. 1–34, Dec. 2020.
[11] A. Jha and R. Mamidi, "When does a compliment become sexist? Analysis and classification of ambivalent sexism using Twitter data," in Proc. 2nd Workshop NLP Comput. Social Sci., Vancouver, BC, Canada, Aug. 2017, pp. 7–16.
[12] S. Yuan, X. Wu, and Y. Xiang, "A two phase deep learning model for identifying discrimination from tweets," in Proc. EDBT, Bordeaux, France, Mar. 2016, pp. 696–697.
[13] M. Mozafari, R. Farahbakhsh, and N. Crespi, "Hate speech detection and racial bias mitigation in social media based on BERT model," PLoS ONE, vol. 15, no. 8, pp. 1–26, Aug. 2020.
[14] P. Burnap and M. L. Williams, "Cyber hate speech on Twitter: An application of machine classification and statistical modeling for policy and decision making," Policy Internet, vol. 7, no. 2, pp. 223–242, Jun. 2015.
[15] M. Wiegand, J. Ruppenhofer, and T. Kleinbauer, "Detection of abusive language: The problem of biased datasets," in Proc. HLT-NAACL, Minneapolis, MN, USA, Jun. 2019, pp. 602–608.
[16] D. Cer, Y. Yang, S. Kong, N. Hua, N. Limtiaco, R. John, N. Constant, M. Guajardo-Céspedes, S. Yuan, C. Tar, Y. Sung, and R. Kurzweil, "Universal sentence encoder," in Proc. EMNLP, Brussels, Belgium, Mar. 2018, pp. 169–174.
[17] V. Basile, C. Bosco, E. Fersini, D. Nozza, V. Patti, F. M. R. Pardo, P. Rosso, and M. Sanguinetti, "SemEval-2019 task 5: Multilingual detection of hate speech against immigrants and women in Twitter," in Proc. 13th Int. Workshop Semantic Eval., Minneapolis, MN, USA, Jun. 2019, pp. 54–63.