The widespread growth of online platforms has led to an overwhelming volume of user-generated questions, many of which are redundant, poorly categorized, or unclear in purpose. Effective question classification has thus become a crucial task in natural language processing, especially for applications such as intelligent search systems, educational forums, and conversational agents. This study explores a transfer learning-based approach for classifying questions into predefined categories by leveraging RoBERTa, a robustly optimized transformer model. We rely on a preprocessed version of the Quora Question Pairs dataset for training and evaluation. For more reliable learning, we apply fine-tuning strategies to a subsampled subset and analyze linguistic properties and semantic embeddings. Results show higher accuracy, especially on overlapping or fine-grained semantic classes. Furthermore, our method demonstrates strong capability for detecting and removing duplicate questions, leading to cleaner data and more effective information retrieval. These results underscore the utility of transfer learning for tackling challenging language problems with only limited manual feature engineering. As future work, we plan to extend this model to multi-label classification, domain adaptation, and real-time question stream analysis in continual systems.
Introduction
With the surge of user-generated content on platforms like Quora and Stack Overflow, organizing and classifying user-posed questions has become critical. Traditional ML methods (e.g., SVM, Naïve Bayes) struggle with natural language complexity, ambiguity, and diverse phrasing. Transformer-based models, especially RoBERTa, offer a powerful alternative thanks to their deep contextual understanding.
Problem & Motivation
Question classification is difficult due to ambiguous intent, slang, and multi-topic phrasing.
Traditional models fail in nuanced or overlapping categories (e.g., “Why do we get a fever?” could relate to health, biology, or science).
There is a growing need for context-aware, adaptive, and explainable systems.
Transfer learning (e.g., fine-tuning RoBERTa) provides a way to apply learned language understanding to specific tasks like classification and duplicate detection—without extensive manual feature engineering.
Proposed Methodology
The study fine-tunes RoBERTa, a pre-trained language model, for the task of question classification and duplicate detection.
A. Key Steps
Dataset: Uses a subset of the Quora Question Pairs dataset.
Preprocessing: Cleaning, normalization, and tokenization using RoBERTa’s tokenizer.
Model Fine-Tuning: A classification head is added to RoBERTa; the model is trained with cross-entropy loss and the Adam optimizer, with early stopping (see the sketch after this list).
Training Split: Divided into train, validation, and test sets to ensure robust evaluation.
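To make the preprocessing and fine-tuning steps concrete, below is a minimal sketch assuming the Hugging Face transformers library (the paper does not name its framework). The tiny in-memory dataset, label count, learning rate, and patience are illustrative placeholders, not values from the study.

```python
# Fine-tuning sketch, assuming Hugging Face `transformers`; dataset,
# label count, and hyperparameters below are placeholders.
import torch
from torch.optim import AdamW
from transformers import RobertaTokenizer, RobertaForSequenceClassification

train_data = [("How do I learn Python?", 0), ("What causes fever?", 1)]
val_data = [("What is the best way to study Python?", 0)]

tokenizer = RobertaTokenizer.from_pretrained("roberta-base")
model = RobertaForSequenceClassification.from_pretrained(
    "roberta-base", num_labels=2)              # placeholder label count
optimizer = AdamW(model.parameters(), lr=2e-5)

def batch(pairs):
    """Tokenize (question, label) pairs into padded model inputs."""
    texts, labels = zip(*pairs)
    enc = tokenizer(list(texts), truncation=True, padding=True,
                    max_length=128, return_tensors="pt")
    enc["labels"] = torch.tensor(labels)
    return enc

best_val, patience, bad_epochs = float("inf"), 2, 0
for epoch in range(10):
    model.train()
    out = model(**batch(train_data))           # loss here is cross-entropy
    out.loss.backward()
    optimizer.step()
    optimizer.zero_grad()

    model.eval()                               # early stopping on val loss
    with torch.no_grad():
        val_loss = model(**batch(val_data)).loss.item()
    if val_loss < best_val:
        best_val, bad_epochs = val_loss, 0
    else:
        bad_epochs += 1
        if bad_epochs >= patience:
            break
```

Passing labels to RobertaForSequenceClassification makes the model compute cross-entropy internally, matching the loss named above; AdamW is the Adam variant commonly paired with transformer fine-tuning.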
B. Evaluation Metrics
Accuracy, Precision, Recall, F1-Score, and Confusion Matrix used for performance analysis.
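As an illustration, these metrics can be computed with scikit-learn (our tooling choice here, not stated in the paper); y_true and y_pred stand in for test-set labels and model predictions, and macro averaging is our assumption:

```python
# Metric computation sketch using scikit-learn (an assumed tooling choice).
from sklearn.metrics import (accuracy_score, confusion_matrix,
                             precision_recall_fscore_support)

y_true = [0, 1, 1, 0, 1]   # placeholder test labels
y_pred = [0, 1, 0, 0, 1]   # placeholder model predictions

acc = accuracy_score(y_true, y_pred)
prec, rec, f1, _ = precision_recall_fscore_support(
    y_true, y_pred, average="macro")  # macro averaging is our assumption
print(f"accuracy={acc:.3f} precision={prec:.3f} recall={rec:.3f} f1={f1:.3f}")
print(confusion_matrix(y_true, y_pred))  # rows: true, columns: predicted
```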
C. Duplicate Detection
Semantic similarity (e.g., cosine similarity on embeddings) used to detect and filter near-identical questions, improving user experience.
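A minimal sketch of this step follows, assuming mean-pooled RoBERTa embeddings and an illustrative 0.9 similarity threshold; the paper specifies neither the pooling scheme nor a threshold value.

```python
# Duplicate-detection sketch: cosine similarity between mean-pooled
# RoBERTa embeddings. Pooling scheme and threshold are assumptions.
import torch
from transformers import RobertaModel, RobertaTokenizer

tokenizer = RobertaTokenizer.from_pretrained("roberta-base")
encoder = RobertaModel.from_pretrained("roberta-base")

def embed(question: str) -> torch.Tensor:
    """Mean-pool the final hidden states into one sentence vector."""
    enc = tokenizer(question, truncation=True, return_tensors="pt")
    with torch.no_grad():
        hidden = encoder(**enc).last_hidden_state   # (1, seq_len, 768)
    return hidden.mean(dim=1).squeeze(0)

def is_duplicate(q1: str, q2: str, threshold: float = 0.9) -> bool:
    sim = torch.nn.functional.cosine_similarity(embed(q1), embed(q2), dim=0)
    return sim.item() >= threshold

print(is_duplicate("How do I learn Python?",
                   "What is the best way to learn Python?"))
```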
Key Contributions
Effective Fine-Tuning: Achieved high classification accuracy with minimal data using RoBERTa.
Low Resource Need: The method works without task-specific deep architectures or heavy compute, making it well suited to practical deployment.
Duplicate Question Detection: Showed promise in detecting semantically similar questions, essential for QA platforms.
Minimal Feature Engineering: Relied on RoBERTa’s pretrained language capabilities rather than handcrafted features.
Results
RoBERTa significantly outperforms traditional models (e.g., SVM, Naïve Bayes, LSTM) on question classification benchmarks.
Models like BERT and RoBERTa consistently surpass 95% accuracy on datasets like TREC.
Fine-tuned models generalize well, even with modest training data, proving more effective and scalable than prior approaches.
Research Gaps Identified
Limited contextual awareness in dialog-based or follow-up questions.
Lack of multi-label classification, despite real-world questions often fitting multiple categories.
Domain adaptation issues when applied to specialized topics (e.g., legal, medical).
Low explainability, which is problematic for critical applications.