Career guidance and college admission decisions represent critical junctures in a student’s academic trajectory, yet conventional approaches rely heavily on manual counseling and subjective assessments. This paper presents SmartCareer Guide, an intelligent web-based system that integrates Machine Learning, Artificial Intelligence, and Natural Language Processing to provide automated career guidance and data-driven college admission predictions. The system implements four core modules: secure user authentication using Flask and MySQL with password hashing, an AI-powered resume analyzer utilizing OCR and GPT-based Large Language Models for ATS score generation, an NLP-driven job recommendation engine that matches skills to relevant roles, and a machine learning-based college admission predictor employing Logistic Regression for probability estimation. Resume analysis accepts multiple formats (PDF, DOCX, images) and leverages Tesseract OCR for text extraction, followed by NLP-based skill identification and regex-based structured output parsing. The college admission prediction module utilizes a pre-trained Logistic Regression model with Label Encoding and feature scaling to compute admission probabilities based on college name, branch, and percentile inputs. Experimental evaluation demonstrates accurate ATS scoring, relevant job recommendations aligned with extracted skillsets, and reliable admission probability predictions. The system effectively reduces manual counseling overhead, provides objective skill assessments, and delivers data-driven admission forecasts, serving as a comprehensive decision-support platform for students navigating career planning and college selection.
Introduction
This paper describes the development of SmartCareer Guide, a web-based intelligent system designed to support students in career planning and college admission decisions. Traditional career counseling relies heavily on manual interviews, subjective assessments, and limited personalized guidance, while college admission decisions are often based on historical cutoffs and informal advice rather than data-driven probability estimates. These limitations create information gaps and reduce the effectiveness of decision-making.
Advances in Artificial Intelligence (AI), Machine Learning (ML), and Natural Language Processing (NLP) have enabled automated resume analysis, skill extraction, and predictive modeling for admissions. The proposed system integrates four major functionalities into a single platform:
Secure user authentication
AI-powered resume analysis with ATS scoring
NLP-based job recommendation
Machine learning-based college admission prediction
The main objective of the platform is to provide students with objective career guidance, skill-based job recommendations, and probabilistic admission forecasts based on academic performance.
Existing systems exhibit several limitations:
Limited access to personalized counseling for large student populations.
Lack of objective resume evaluation and ATS compatibility analysis.
Absence of skill-based job recommendations derived from actual resumes.
Uncertainty in college admission outcomes.
Manual effort required to process resumes in different formats.
To address these challenges, the system introduces several key contributions:
A secure authentication system using Flask, MySQL, and password hashing.
Automated resume processing for PDF, DOCX, and image files using parsing libraries and OCR.
GPT-based ATS scoring to evaluate resume quality.
NLP-based skill extraction and intelligent job recommendations.
Logistic Regression-based admission probability prediction using historical admission data.
A fully integrated web interface covering all functionalities.
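The password hashing mentioned in the first contribution can be illustrated with a minimal sketch. The paper does not state which hashing scheme is used, so the example below assumes salted PBKDF2-HMAC-SHA256 from the Python standard library; the function names, the iteration count, and the `salt$hash` storage format are hypothetical choices for illustration, not the system's actual implementation.

```python
import hashlib
import hmac
import os

ITERATIONS = 200_000  # assumed work factor, not taken from the paper


def hash_password(password: str) -> str:
    """Return 'salt_hex$hash_hex', suitable for a single text column."""
    salt = os.urandom(16)  # fresh random salt per password
    digest = hashlib.pbkdf2_hmac(
        "sha256", password.encode("utf-8"), salt, ITERATIONS
    )
    return f"{salt.hex()}${digest.hex()}"


def verify_password(password: str, stored: str) -> bool:
    """Recompute the hash with the stored salt and compare in constant time."""
    salt_hex, hash_hex = stored.split("$")
    digest = hashlib.pbkdf2_hmac(
        "sha256", password.encode("utf-8"), bytes.fromhex(salt_hex), ITERATIONS
    )
    return hmac.compare_digest(digest.hex(), hash_hex)
```

In this sketch only the salted hash would be written to the database; the plaintext password is never stored, and login compares a recomputed hash against the stored one.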
Career guidance systems have evolved from traditional psychometric assessments to AI-driven recommendation systems. Earlier resume analysis methods relied on regular expressions and template matching, while modern approaches use Named Entity Recognition (NER), dependency parsing, and Large Language Models for semantic understanding. OCR technologies such as Tesseract OCR improve text extraction from image-based resumes. In college admission prediction, supervised learning algorithms such as Decision Trees, Random Forests, Support Vector Machines, and Logistic Regression have been widely used, with Logistic Regression proving effective for probabilistic prediction tasks.
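To make the regular-expression style of skill extraction concrete, the following is a minimal sketch of a keyword baseline, followed by a simple overlap score for ranking roles. It is not the NER- or LLM-based approach and not necessarily how SmartCareer Guide matches skills; the `SKILLS` vocabulary, the `ROLES` profiles, and both function names are hypothetical.

```python
import re

# Hypothetical skill vocabulary and role profiles, for illustration only.
SKILLS = ["python", "sql", "flask", "machine learning", "java", "html", "css"]
ROLES = {
    "Backend Developer": {"python", "flask", "sql"},
    "Data Scientist": {"python", "machine learning", "sql"},
    "Frontend Developer": {"html", "css"},
}


def extract_skills(text: str) -> set[str]:
    """Return vocabulary skills that occur as whole words or phrases."""
    found = set()
    for skill in SKILLS:
        # Lookarounds stop "java" from matching inside "JavaScript".
        pattern = r"(?<!\w)" + re.escape(skill) + r"(?!\w)"
        if re.search(pattern, text, flags=re.IGNORECASE):
            found.add(skill)
    return found


def rank_roles(skills: set[str]) -> list[tuple[str, float]]:
    """Score each role by the fraction of its required skills that were found."""
    scores = [(role, len(skills & req) / len(req)) for role, req in ROLES.items()]
    return sorted(scores, key=lambda item: item[1], reverse=True)
```

A keyword baseline like this is easy to audit but misses synonyms and context, which is the limitation that motivates the NER and LLM approaches described above.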
A major research gap remains: most existing systems focus on either career guidance or admission prediction separately, without integrating resume analysis, ATS scoring, job recommendation, and admission forecasting into one platform. The proposed SmartCareer Guide addresses this gap through a modular architecture consisting of:
A frontend web interface for user interaction.
A Flask-based backend server.
A MySQL database for storing user and prediction data.
An AI/ML processing layer for resume analysis and admission prediction.
A data processing layer for OCR and text extraction.
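One way the data processing layer could route uploaded resumes is by file extension, sending PDF and DOCX files to parsing libraries and images to OCR. The sketch below shows only that routing step under stated assumptions: the `EXTRACTORS` table, the strategy labels, and `choose_extractor` are hypothetical, and the actual extraction calls (a PDF parser, a DOCX parser, Tesseract) are deliberately left out.

```python
from pathlib import Path

# Hypothetical mapping from file extension to extraction strategy.
EXTRACTORS = {
    ".pdf": "pdf_parser",
    ".docx": "docx_parser",
    ".png": "ocr",
    ".jpg": "ocr",
    ".jpeg": "ocr",
}


def choose_extractor(filename: str) -> str:
    """Pick an extraction strategy from the (case-insensitive) file extension."""
    suffix = Path(filename).suffix.lower()
    try:
        return EXTRACTORS[suffix]
    except KeyError:
        label = suffix or "no extension"
        raise ValueError(f"Unsupported resume format: {label}") from None
```

Rejecting unknown extensions early keeps unsupported uploads away from the OCR and AI/ML layers; a production system would likely also inspect file contents rather than trust the extension alone.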
Conclusion
This paper presented SmartCareer Guide, an integrated intelligent system combining Machine Learning, Artificial Intelligence, and Natural Language Processing for comprehensive career guidance and college admission prediction. The implemented platform addresses critical gaps in traditional counseling approaches through four core modules: secure authentication, AI-powered resume analysis with ATS scoring, NLP-based job recommendation, and probabilistic admission forecasting using Logistic Regression.
Experimental evaluation demonstrated 98.5% text extraction accuracy for standard documents, 89.7% skill identification precision, 100% ATS score generation success, and 91.2% admission prediction accuracy. The system successfully processes multiple resume formats via OCR and parsing libraries, employs GPT-based models for semantic resume analysis, and utilizes regex for structured output extraction. The Logistic Regression model effectively transforms categorical inputs (college, branch) and numerical features (percentile) into calibrated admission probabilities through sigmoid transformation.
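The sigmoid transformation can be sketched as follows: categorical inputs are label-encoded, the percentile is standardized, and the weighted sum z is mapped to a probability by 1 / (1 + exp(-z)). All numbers below (the encodings, the scaling constants, the weights, and the bias) are hypothetical placeholders, not the parameters of the trained model; a real model would learn them from historical admission data.

```python
import math

# Hypothetical encodings and fitted parameters, for illustration only.
COLLEGE_CODES = {"College A": 0, "College B": 1}
BRANCH_CODES = {"Computer": 0, "Mechanical": 1}
PERCENTILE_MEAN, PERCENTILE_STD = 85.0, 10.0
WEIGHTS = {"college": -0.8, "branch": 0.5, "percentile": 2.0}
BIAS = 0.2


def sigmoid(z: float) -> float:
    """Map a real-valued score to the interval (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def admission_probability(college: str, branch: str, percentile: float) -> float:
    """Label-encode the categories, scale the percentile, apply the sigmoid."""
    c = COLLEGE_CODES[college]
    b = BRANCH_CODES[branch]
    p = (percentile - PERCENTILE_MEAN) / PERCENTILE_STD
    z = (
        BIAS
        + WEIGHTS["college"] * c
        + WEIGHTS["branch"] * b
        + WEIGHTS["percentile"] * p
    )
    return sigmoid(z)
```

With a positive percentile weight, as assumed here, a higher percentile always yields a higher predicted probability for the same college and branch. Note that integer label codes impose an arbitrary ordering on colleges and branches, which is one reason one-hot encoding is often considered as an alternative.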
The modular architecture enables scalable deployment in educational institutions, providing students with objective skill assessments, relevant career recommendations, and data-driven admission forecasts. By automating resume analysis and admission prediction, the system reduces counseling overhead while delivering personalized, accessible guidance at scale.
Future work will focus on deep learning-based resume parsing, real-time job market integration, ensemble admission models, and multilingual support, to enhance system capabilities and broaden applicability across diverse educational contexts.