Linguistic Feature-Driven Fake News Detection

Authors: Vineet ., Sunil Kumar Nandal

DOI Link: https://doi.org/10.22214/ijraset.2025.74154

Abstract

This study presents a data mining-driven prediction approach for assessing the well-being of IT workers. The approach groups individuals by their likelihood of becoming ill: low, medium, or high. It considers factors such as physical activity, diet, stress, sleep quality, and medical history. The approach combines preprocessing, feature selection, and classification algorithms to produce accurate and reliable results. Several tables and figures support the experimental findings and show that the model can provide useful health information. The findings illustrate how it may help individuals stay healthy at work and recognize risks early. The approach is intended to bring IT-based decision-making and employee health monitoring closer together.

Introduction

In the digital age, the rapid spread of information via social media and online news sites has made it easier for misinformation and fake news to proliferate, affecting public opinion, democracy, and social harmony. Detecting fake news quickly is crucial but challenging, especially when traditional fact-checking methods, which rely on content verification, user reactions, and network metadata, are ineffective or unavailable.

To address this, linguistic feature-based fake news detection has emerged. It relies on text-analysis signals such as word frequency, part-of-speech distribution, readability scores, and discourse structure. The method uses natural language processing (NLP) together with machine learning (ML) or deep learning (DL) models to detect subtle stylistic and semantic differences between true and false news in real time, without depending on external fact-checking databases. The approach is scalable and language-independent, and it offers explainability by revealing which linguistic features contribute to a classification.
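
As a rough illustration of the surface-level features mentioned above, the following Python sketch computes word frequency, a type-token ratio, average sentence length, and a Flesch Reading Ease score for a piece of text. It is a minimal sketch under stated assumptions, not the authors' feature extractor: the function names, the regex tokenizer, the vowel-group syllable heuristic, and the exclamation ratio are all illustrative choices, and part-of-speech features are omitted because they would need an external tagger.

```python
import re
from collections import Counter


def tokenize(text):
    """Lowercase word tokens from a deliberately simple regex tokenizer."""
    return re.findall(r"[a-z]+(?:'[a-z]+)?", text.lower())


def count_syllables(word):
    """Rough syllable estimate: number of consecutive-vowel groups (heuristic)."""
    return max(1, len(re.findall(r"[aeiouy]+", word)))


def linguistic_features(text):
    """Return a small dictionary of surface-level linguistic features."""
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = tokenize(text)
    if not words or not sentences:
        raise ValueError("text must contain at least one word")
    words_per_sentence = len(words) / len(sentences)
    syllables_per_word = sum(count_syllables(w) for w in words) / len(words)
    return {
        "word_count": len(words),
        "top_words": Counter(words).most_common(3),
        # Lexical diversity: distinct words divided by total words.
        "type_token_ratio": len(set(words)) / len(words),
        "avg_sentence_length": words_per_sentence,
        # Flesch Reading Ease: higher scores indicate easier text.
        "flesch_reading_ease": (
            206.835 - 1.015 * words_per_sentence - 84.6 * syllables_per_word
        ),
        # Assumed stylistic cue: exclamation marks per sentence.
        "exclamation_ratio": text.count("!") / len(sentences),
    }


if __name__ == "__main__":
    sample = "Shocking! You will not believe this. Experts are stunned!"
    for name, value in linguistic_features(sample).items():
        print(name, value)
```

Features of this kind are cheap to compute and easy to inspect, which is one reason they suit real-time use; a production system would likely replace the heuristics with a proper NLP toolkit.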

The motivation behind this research is to counter the societal harm caused by fake news, including increased political polarization and public mistrust. By extracting and analyzing linguistic cues like sentiment polarity, lexical diversity, syntax, and discourse coherence, the research aims to develop robust, cross-domain, multilingual, and interpretable fake news detection models.

Key Research Contributions Include:

Advanced frameworks for extracting syntactic, semantic, and stylistic features.
Multi-layer linguistic analysis to improve accuracy and reduce false positives.
Curated datasets with annotated fake and real news for training.
Hybrid detection models integrating linguistic features with ML/DL techniques.
Validation across domains like politics, health, and entertainment.
Real-time scalable detection systems.
Enhanced interpretability to build user trust.
Performance benchmarking showing improvements over existing methods.
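
The combination of linguistic features, an ML classifier, and interpretability listed above can be sketched in a few lines of Python. This is only an illustrative stand-in, not the model proposed in the paper: logistic regression is chosen for simplicity, the three feature names are hypothetical, and the six training rows are toy values invented for the example rather than taken from any dataset.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def train_logistic_regression(rows, labels, lr=0.5, epochs=2000):
    """Fit weights and bias by batch gradient descent on the log loss."""
    n_features = len(rows[0])
    weights = [0.0] * n_features
    bias = 0.0
    for _ in range(epochs):
        grad_w = [0.0] * n_features
        grad_b = 0.0
        for x, y in zip(rows, labels):
            error = sigmoid(sum(w * xi for w, xi in zip(weights, x)) + bias) - y
            for j, xi in enumerate(x):
                grad_w[j] += error * xi
            grad_b += error
        weights = [w - lr * g / len(rows) for w, g in zip(weights, grad_w)]
        bias -= lr * grad_b / len(rows)
    return weights, bias


def predict_proba(x, weights, bias):
    """Estimated probability that the article is fake (label 1)."""
    return sigmoid(sum(w * xi for w, xi in zip(weights, x)) + bias)


def explain(x, weights, feature_names):
    """Per-feature contribution (weight * value), largest magnitude first."""
    contributions = [(n, w * xi) for n, w, xi in zip(feature_names, weights, x)]
    return sorted(contributions, key=lambda item: abs(item[1]), reverse=True)


# Hypothetical toy data, invented purely for illustration (not from the paper).
FEATURE_NAMES = ["exclamation_ratio", "type_token_ratio", "sentence_length_scaled"]
X = [
    [0.9, 0.5, 0.2], [0.8, 0.4, 0.3], [0.7, 0.6, 0.2],  # labelled fake (1)
    [0.1, 0.8, 0.7], [0.0, 0.9, 0.8], [0.2, 0.7, 0.6],  # labelled real (0)
]
y = [1, 1, 1, 0, 0, 0]

weights, bias = train_logistic_regression(X, y)

if __name__ == "__main__":
    article = [0.85, 0.45, 0.25]  # hypothetical feature vector for a new article
    print("P(fake) =", round(predict_proba(article, weights, bias), 3))
    for name, contribution in explain(article, weights, FEATURE_NAMES):
        print(name, round(contribution, 3))
```

Because each prediction decomposes into per-feature contributions, a reader can see which cues pushed an article toward the fake or real label. Deeper models would need a separate attribution method to offer a comparable explanation.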

Literature Review Highlights:

Various hybrid models combining textual, contextual, and multimodal data show improved fake news detection.
Graph neural networks (GNNs) and attention mechanisms enhance detection by modeling propagation and user behavior.
Sentiment analysis, stylometry, and linguistic style features contribute valuable signals but have limitations like domain or language specificity.
Transformer-based models excel in many scenarios but require extensive pretraining and diverse datasets.
Challenges include dataset bias, annotation difficulties, multilingual adaptation, computational costs, and balancing fairness, transparency, and accuracy.
Future directions stress multimodal integration, scalable real-time systems, and improving model generalizability and robustness.

This research domain is critical for developing trustworthy AI tools to combat misinformation and protect democratic discourse in an increasingly complex digital information ecosystem.

Conclusion

The proposed data-mining prediction model for assessing the health of IT professionals has shown that it can effectively categorize and evaluate health risk levels (Low, Medium, and High) using significant lifestyle, occupational, and physiological variables. The data show that 45% of individuals fall into the low-risk category, indicating that a substantial share of workers are healthy.

Copyright

Copyright © 2025 Vineet ., Sunil Kumar Nandal. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

Paper Id : IJRASET74154

Publish Date : 2025-09-08

ISSN : 2321-9653

Publisher Name : IJRASET
