Authors: Smitha Kurian, Mohammed Aman Khan, Hema Priya P, Manavpreet Singh, Lobsang Norbu
Traditional recruiting approaches are becoming insufficient as employment markets expand rapidly. Businesses are growing at a rapid pace, and each organisation receives a large number of applications. In practice, manually reading and analysing every resume is prohibitively time-consuming. To address this issue, this research offers a detailed and repeatable evaluation of the performance of automatic interviewing software, providing a framework for future benchmarking and comparison studies. We also found that spaCy consistently outperforms the other toolkits considered for Information Extraction and Named Entity Recognition.
Online video-based interviews have become increasingly popular in the hiring process in recent years. They benefit both interviewers and interviewees: human resources (HR) staff can review recordings and make decisions offline, and can assess multiple job applicants in a short time frame. They also pave the way for computerized performance analytics that can help with initial HR decisions and may decrease human biases. According to the surveyed papers, the interviewer traditionally assesses the success or failure of the interviewee's attempt subjectively, either through a holistic impression or through quantitative assessments. As a result, within a limited period of time, interviewees must successfully convey their enthusiasm and knowledge using multimodal behaviors such as speech content, prosody, gaze direction, facial expressions, and other nonverbal cues. The study discussed below relies on audio and video analysis captured in a web browser, which makes it expensive and heavy to execute on every system. The spaCy library was found to be the best fit for most use cases owing to its adaptability and ease of use in a variety of environments, and other research articles report that spaCy provides all of the essential algorithms for this analysis. The findings, which are explored further below, argue for training the model using the present crowd. When sourcing interviews, it is critical to train the models on the best interviews first so that their predictions come close to human judgment. The papers further suggest that data should first be collected from resumes and sorted, and that only a shortlist of applicants should then be chosen for online interviews. This decreases the number of video interviews, and hence the computing cost.
Existing automated resume screening reduces time to hire by saving recruiters the hours spent manually reading resumes, and automated pre-qualification through chatbots enhances the candidate experience by providing continuous, real-time feedback. A few of the existing systems failed because they could not bridge gaps such as humanising the recruitment process, inefficient hiring, and the failure of technology to understand interpersonal skills. Advanced technologies and methodologies can bridge the gap left by systems that failed to satisfy users' requirements or were difficult to use. This survey anticipates that these concerns will be addressed by the new system, providing users with a user-friendly environment and experience that will improve their confidence. It is expected to address efficiency, accuracy, and reliability difficulties, as well as the limits of present technologies.
II. INFORMATION EXTRACTION
Long Short-Term Memory (LSTM) networks and Named Entity Recognition (NER) are both relevant to extracting information from resumes. LSTM is a recurrent neural network (RNN) architecture used for sequence prediction, whereas NER is the name of an NLP task. Most companies use AI scanners. In 2018, the paper "A Two-Step Resume Information Extraction Algorithm" introduced an algorithm for extracting information from resumes for further processing. Text extraction is usually done using NLP. The Naive Bayes algorithm was used for text extraction from resumes, but it failed to meet the required efficiency, and deep learning helped to overcome this issue. Named Entity Recognition (NER) is the task to which deep learning can be applied for information extraction from resumes. “NER is a sub-task of information extraction that seeks to locate and classify named entity mentions in unstructured text into predefined categories such as the person names, organizations, locations, etc, based on context”.
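To make the NER task concrete, the following is a minimal rule-based sketch of locating and labelling entity mentions in resume text. It is a simplified stand-in for illustration only, not the deep learning approach described above: the skill list, the regular expressions, and the sample resume are all hypothetical assumptions.

```python
import re

# Hypothetical skill gazetteer (an assumption for illustration); a learned NER
# model would infer such entities from annotated resumes instead.
SKILLS = {"python", "java", "sql", "machine learning"}

# Simplified patterns; real e-mail and phone formats vary far more widely.
EMAIL_RE = re.compile(r"[\w.+-]+@[\w-]+(?:\.[\w-]+)+")
PHONE_RE = re.compile(r"\+?\d[\d\s-]{8,}\d")


def extract_entities(text):
    """Return (label, mention) pairs found in the resume text."""
    entities = []
    for match in EMAIL_RE.finditer(text):
        entities.append(("EMAIL", match.group()))
    for match in PHONE_RE.finditer(text):
        entities.append(("PHONE", match.group()))
    lowered = text.lower()
    for skill in sorted(SKILLS):
        # Whole-word match so that "java" does not fire on "javascript".
        if re.search(r"\b" + re.escape(skill) + r"\b", lowered):
            entities.append(("SKILL", skill))
    return entities


resume = "Asha Rao, asha.rao@example.com, +91 98765 43210. Skills: Python, SQL."
for label, mention in extract_entities(resume):
    print(label, mention)
```

Rules like these only cover entities with a fixed surface form; names, organisations, and job titles depend on context, which is why the surveyed work turns to learned sequence models.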
NER is a very domain-specific problem, so a deep learning model generally has to be built from scratch for the target domain. More advanced algorithms use natural language processing, machine learning, statistical approaches, or a combination of these to extract complex concepts like sentiment or tonality.
NLP algorithms are used to provide automatic summarization of the key points in a given text or document. NLP algorithms are also used to classify text according to predefined categories or classes, and are used to organize information.
Named Entity Recognition, Sentiment Analysis, Text Summarization, Aspect Mining, and Topic Modelling are examples of common information extraction approaches.
The first stage in building a model from scratch is determining the model architecture. LSTMs (a type of Neural Network) are utilised in the model, according to research papers and other NLP literature, since they consider the context of a word in a phrase. Once the overall architecture has been determined, work on selecting a dataset for model training and testing should begin. This is the most time-consuming stage and must be prepared in advance.
The data utilised to train the system is one of the most important elements to consider. The data must be unlabeled and must not contribute to the uncertainty. Online services that can be used to collaborate on manual annotation efforts within the team are also extremely beneficial. 
III. SENTIMENT ANALYSIS USING AUDIO, VIDEO AND TEXT
Manual classification of nonverbal behaviours, according to research, is arduous and time-consuming, and thus does not scale to large amounts of data. As a result, automated data-driven quantification of both verbal and nonverbal behaviour has been found to be useful, although this has yet to be tested in the context of job interviews. This study provides an overview of the issues encountered during automated interpretation of multimodal human interactions, such as facial expression, prosody, and language. It focuses on forecasting social interactions in the context of college students' job interviews, which is an intriguing and somewhat unexplored subject. Such analysis aids decision-making and is an essential human feature since it informs us of "what other people think." As a result, sentiment and emotion analysis are widely used to comprehend social, political, and corporate activities. According to the papers, nonverbal actions are nuanced, transient, subjective, and may even be contradictory. Following a review of the papers, it was determined that the following technique will be most effective for analysing and predicting interviewee attitudes and emotions.
Using a camera and a microphone, the system captures video and audio of the user and provides feedback and judgement on a variety of low-level behavioural patterns, such as average smile intensity, pause duration, speaking tempo, and pitch variation. The surveyed work automatically extracts numerous features from the interview videos for the prediction framework. The goal of this training is twofold: first, to anticipate Turker evaluations on overall performance and each behavioral feature, and second, to quantify and derive significant insights on the relative relevance of each modality and their interplay.
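Two of the low-level features named above, pause duration and speaking tempo, can be sketched from time-aligned words. This is a minimal illustration under stated assumptions: the `Word` record, the 0.3-second pause threshold, and the sample timings are hypothetical, and the word timings are assumed to come from some upstream speech recogniser that is not shown.

```python
from dataclasses import dataclass


@dataclass
class Word:
    text: str
    start: float  # start time in seconds
    end: float    # end time in seconds


def prosody_features(words, pause_threshold=0.3):
    """Compute speaking tempo and pause statistics from time-aligned words."""
    if not words:
        return {"words_per_minute": 0.0, "pause_count": 0, "mean_pause": 0.0}
    duration = words[-1].end - words[0].start
    # Silence between the end of one word and the start of the next.
    gaps = [nxt.start - cur.end for cur, nxt in zip(words, words[1:])]
    # Only gaps at or above the threshold are counted as pauses.
    pauses = [gap for gap in gaps if gap >= pause_threshold]
    return {
        "words_per_minute": 60.0 * len(words) / duration if duration > 0 else 0.0,
        "pause_count": len(pauses),
        "mean_pause": sum(pauses) / len(pauses) if pauses else 0.0,
    }


answer = [Word("I", 0.0, 0.5), Word("am", 0.5, 1.0), Word("ready", 2.0, 3.0)]
print(prosody_features(answer))
```

Features of this kind are cheap to compute once a transcript with timings exists, which makes them a reasonable first input to a prediction framework before heavier video features are added.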
According to the surveyed work, facial features of interviewees are extracted from each frame of the video. Faces were first detected using the SHORE framework, and a classifier was then trained to discriminate between neutral and happy faces. The AdaBoost technique is used to train the classifier.
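The AdaBoost idea can be sketched with decision stumps in plain Python. This is a toy illustration, not the surveyed system's classifier: the two feature names (mouth curvature and cheek raise), the sample values, and the number of rounds are hypothetical assumptions. Labels are +1 for happy and -1 for neutral.

```python
import math


def train_stump(samples, labels, weights):
    """Find the one-feature threshold rule with the lowest weighted error."""
    best = None
    for f in range(len(samples[0])):
        for threshold in sorted({x[f] for x in samples}):
            for polarity in (1, -1):
                preds = [polarity if x[f] >= threshold else -polarity for x in samples]
                error = sum(w for w, p, y in zip(weights, preds, labels) if p != y)
                if best is None or error < best[0]:
                    best = (error, f, threshold, polarity)
    return best


def adaboost(samples, labels, rounds=5):
    """Train a weighted ensemble of stumps; returns (alpha, feature, threshold, polarity) tuples."""
    n = len(samples)
    weights = [1.0 / n] * n
    ensemble = []
    for _ in range(rounds):
        error, f, threshold, polarity = train_stump(samples, labels, weights)
        error = min(max(error, 1e-10), 1 - 1e-10)  # avoid log(0) and division by zero
        alpha = 0.5 * math.log((1 - error) / error)
        ensemble.append((alpha, f, threshold, polarity))
        # Increase the weight of misclassified samples, decrease the rest.
        updated = []
        for w, x, y in zip(weights, samples, labels):
            pred = polarity if x[f] >= threshold else -polarity
            updated.append(w * math.exp(-alpha * y * pred))
        total = sum(updated)
        weights = [w / total for w in updated]
    return ensemble


def predict(ensemble, x):
    """Sign of the alpha-weighted vote of all stumps."""
    score = sum(a * (pol if x[f] >= thr else -pol) for a, f, thr, pol in ensemble)
    return 1 if score >= 0 else -1


# Hypothetical per-frame features: (mouth_curvature, cheek_raise)
samples = [(0.1, 0.2), (0.2, 0.1), (0.3, 0.4), (0.7, 0.6), (0.8, 0.9), (0.9, 0.7)]
labels = [-1, -1, -1, 1, 1, 1]
model = adaboost(samples, labels, rounds=3)
print(predict(model, (0.85, 0.8)), predict(model, (0.15, 0.1)))
```

Each round fits the stump that does best on the currently hardest samples, so the ensemble can combine several weak facial cues into one neutral-versus-happy decision.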
There are two types of datasets that have been presented for computer studies in emotion analysis: short text and long text.
The equation $x_i = sR(\bar{x}_i + \Psi_i q) + t$ is used in calculating the emotion, where $x_i$ is the coordinate of the $i$th interest point and $\bar{x}_i$ denotes its mean location, pre-trained from a large collection of hand-labeled training images. $\Psi_i$ denotes the bases of local variations for the $i$th interest point. According to the survey, models are also trained to predict additional interview-specific variables, such as eagerness, friendliness, engagement, and awkwardness. The predictions were tested using a variety of regression models: SVR (Support Vector Machine Regression), Lasso, L1 Regularized Logistic Regression, Gaussian Process Regression, and others.
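The interest-point equation can be evaluated numerically for a single 2D point. The text does not define s, R, t, or q; the sketch below assumes the usual point-distribution-model reading, in which s is a global scale, R a rotation, t a translation, and q a vector of local deformation parameters. All numeric values are hypothetical.

```python
import math


def place_point(mean_xy, basis, q, scale, angle, translation):
    """Evaluate x_i = s R (mean_i + Psi_i q) + t for one 2D interest point.

    mean_xy     : (x, y) mean location of the point
    basis       : Psi_i as two rows of length k (a 2 x k matrix)
    q           : k deformation parameters (assumed meaning, see lead-in)
    scale       : s, a scalar
    angle       : rotation angle in radians that defines R
    translation : t as (tx, ty)
    """
    # Local variation: Psi_i q
    dx = sum(b * qk for b, qk in zip(basis[0], q))
    dy = sum(b * qk for b, qk in zip(basis[1], q))
    px, py = mean_xy[0] + dx, mean_xy[1] + dy
    # Rotation R applied to the deformed point
    cos_a, sin_a = math.cos(angle), math.sin(angle)
    rx = cos_a * px - sin_a * py
    ry = sin_a * px + cos_a * py
    # Global scale and translation
    return (scale * rx + translation[0], scale * ry + translation[1])


identity_basis = [[1.0, 0.0], [0.0, 1.0]]
print(place_point((1.0, 0.0), identity_basis, [0.5, 0.0], 2.0, 0.0, (10.0, 20.0)))
```

Fitting such a model to a video frame means searching for the s, R, t, and q that best align all interest points with the observed face; the fitted parameters then serve as facial features.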
IV. MATCHING SIMILARITY IN TEXT
The resemblance of text based on meaning is referred to as semantic similarity. Semantic matching is required to determine the relevance or similarity of text phrases, sentences, or paragraphs. Finding commonalities in text documents is an NLP problem with numerous applications. In order to find commonalities among texts, two aspects must be defined: the process of converting text into an embedding (a real-valued representation of the text), and the method for calculating the similarity of embeddings.
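The two aspects can be sketched with one common pairing: TF-IDF vectors as the embedding and cosine similarity as the comparison. This is one possible choice among many, not the method prescribed by the surveyed papers; the smoothed IDF variant and the sample token lists are assumptions made for illustration.

```python
import math
from collections import Counter


def tfidf_vectors(docs):
    """Map each tokenised document to a sparse {term: weight} TF-IDF vector."""
    n = len(docs)
    # Document frequency: in how many documents each term appears.
    df = Counter(term for doc in docs for term in set(doc))
    vectors = []
    for doc in docs:
        tf = Counter(doc)
        vectors.append({
            # Relative term frequency times a smoothed inverse document frequency.
            term: (count / len(doc)) * (math.log((1 + n) / (1 + df[term])) + 1)
            for term, count in tf.items()
        })
    return vectors


def cosine(a, b):
    """Cosine similarity between two sparse vectors."""
    dot = sum(weight * b.get(term, 0.0) for term, weight in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


job = ["python", "developer", "sql"]
resume_a = ["python", "engineer", "sql"]
resume_b = ["pastry", "chef", "baking"]
vec_job, vec_a, vec_b = tfidf_vectors([job, resume_a, resume_b])
print(cosine(vec_job, vec_a), cosine(vec_job, vec_b))
```

TF-IDF only matches identical tokens, so it captures lexical overlap rather than meaning; a semantic embedding would be needed to score "developer" and "engineer" as related.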
Before text is converted into an embedding, it is typically passed through the following preprocessing pipeline.
Normalization: converting the text to lower case and eliminating any special characters and punctuation.
Tokenization: taking the normalized text and separating it into a list of tokens.
Stop word removal: stop words are the most frequently used terms in a language and add little meaning to the text. Examples include 'the', 'a', and 'will'.
Stemming: determining the root of words, which may or may not be the same as the morphological root of the word; the purpose of stemming is to map related words to the same stem. The simplest way to achieve this is with a dictionary.
Lemmatization: obtaining the same term for a group of inflected word forms. For example, "is" and "was" both become "be".
This pipeline produces a list of cleaned tokens.
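The preprocessing steps can be sketched as a single function. This is a minimal illustration: the stop word list and the lookup dictionary are tiny hypothetical samples, and the dictionary lookup stands in for both stemming and lemmatization, which a toolkit would handle with far larger resources.

```python
import re

# Tiny illustrative stop word list (an assumption, not a complete list).
STOP_WORDS = {"the", "a", "an", "will", "and", "of", "to"}

# Hypothetical lookup dictionary mapping inflected forms to a base form.
BASE_FORMS = {"is": "be", "was": "be", "teams": "team", "managed": "manage"}


def preprocess(text):
    """Normalize, tokenize, drop stop words, and map tokens to base forms."""
    # Normalization: lower-case and replace anything that is not a letter,
    # digit, or whitespace with a space.
    normalized = re.sub(r"[^a-z0-9\s]", " ", text.lower())
    # Tokenization: split on whitespace.
    tokens = normalized.split()
    # Stop word removal.
    tokens = [token for token in tokens if token not in STOP_WORDS]
    # Dictionary lookup; unknown tokens are kept unchanged.
    return [BASE_FORMS.get(token, token) for token in tokens]


print(preprocess("The candidate managed a team, and was promoted!"))
```

The returned token list is what an embedding step would consume, so every document being compared should pass through the same function.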
V. COMPARISON OF NLP TOOLKITS
A. Natural Language Toolkit (NLTK)
NLTK is a popular open-source Python package for natural language processing (NLP) (NLTK Project, 2018). It provides methods for text tokenization, stemming, stop word removal, classification, clustering, PoS tagging, parsing, and semantic reasoning. It also offers wrappers for other NLP packages. One important aspect of NLTK is that it gives access to over 50 corpora and lexical resources, including WordNet.
B. Stanford CoreNLP
Stanford CoreNLP is a Java toolkit that integrates NLP research with application development (The Apache Software Foundation, 2018). It is open-source, flexible, and may be used as a basic web service. It supports Arabic, Chinese, English, French, German, and Spanish. Annotation of arbitrary texts is also supported by CoreNLP. It combines numerous NLP tools from Stanford, including the PoS tagger, named entity recognizer, parsers, coreference resolution system, sentiment analyzer, bootstrapped pattern learning, and open information extraction (IE) tools.
C. TextBlob
TextBlob is a Python (2 and 3) text processing package. It offers a straightforward API for common natural language processing (NLP) tasks including part-of-speech tagging, noun phrase extraction, sentiment analysis, classification, translation, and more.
D. spaCy
spaCy is a Python-based open-source Natural Language Processing (NLP) package with several built-in features. It is increasingly used in NLP for data processing and analysis. Unstructured textual data is generated on a massive scale, and it is critical to analyze and gain insights from it. To do so, the data must first be expressed in a computer-readable manner. spaCy can assist in accomplishing this.
According to the survey papers, video and audio processing for sentiment and emotion analysis is demanding for interviewing systems, because it requires processing live video and audio at the same time. The study demonstrates how NLP, NEM, SVM and other machine learning algorithms are employed with the help of NLP toolkits like spaCy and NLTK to provide the best natural language processing mechanisms.
Iftekhar Naim, M. Iftekhar Tanveer, Daniel Gildea, and Mohammed (Ehsan) Hoque, “Automated Analysis and Prediction of Job Interview Performance,” p. 1.
R. Curhan and A. Pentland, “Thin slices of negotiation: predicting outcomes from conversational dynamics within the first 5 minutes,” Journal of Applied Psychology, vol. 92, no. 3, p. 802, 2007.
G. Sandbach, S. Zafeiriou, M. Pantic, and L. Yin, “Static and dynamic 3d facial expression recognition: A comprehensive survey,” Image and Vision Computing, vol. 30, no. 10, pp. 683–697, 2012.
G. Castellano, S. D. Villalba, and A. Camurri, “Recognising human emotions from body movement and gesture dynamics,” in Affective computing and intelligent interaction. Springer, 2007, pp. 71–82.
V. Soman and A. Madan, “Social signaling: Predicting the outcome of job interviews from vocal tone and prosody,” in ICASSP. Dallas, Texas, USA: IEEE, 2010.
Iftekhar Naim, M. Iftekhar Tanveer, Daniel Gildea, and Mohammed (Ehsan) Hoque, “Automated Analysis and Prediction of Job Interview Performance,” p. 5.
B. Froba and A. Ernst, “Face detection with the modified census transform,” in Automatic Face and Gesture Recognition (FG). IEEE, 2004, pp. 91–96.
Iftekhar Naim, M. Iftekhar Tanveer, Daniel Gildea, and Mohammed (Ehsan) Hoque, “Automated Analysis and Prediction of Job Interview Performance,” p. 7.
A. J. Smola and B. Scholkopf, “A tutorial on support vector regression,” Statistics and computing, vol. 14, no. 3, pp. 199–222, 2004.
R. Tibshirani, “Regression shrinkage and selection via the lasso,” Journal of the Royal Statistical Society. Series B (Methodological), pp. 267–288, 1996.
Edward Loper and Steven Bird, “NLTK: The Natural Language Toolkit.”
Venkat N. Gudivada and Kamyar Arbabifard, “6.2 Stanford CoreNLP Toolset,” in Handbook of Statistics, 2018.
Steven Loria, “Textblob.”
Mark Neumann, Daniel King, Iz Beltagy, and Waleed Ammar, Allen Institute for Artificial Intelligence, Seattle, WA, USA.
Jie Chen, Chunxia Zhang, and Zhendong Niu, “A Two-Step Resume Information Extraction Algorithm.”
Anaswara R and Aswathy T, “Resume Information Extraction Framework.”
Evanthia Faliagka, Athanasios Tsakalidis, and Giannis Tzimas, “An integrated e-recruitment system for automated personality mining and applicant ranking.”
Suhas Tangadle Gopalakrishna and Vijayaraghavan Varadharajan, “Automated Tool for Resume Classification Using Semantic Analysis.”
Jayashree Rout, Sudhir Bagade, Pooja Yede, and Nirmiti Patil, “Personality Evaluation and CV Analysis using Machine Learning Algorithm.”
Hongyu Gong, Tarek Sakakini, Suma Bhat, and Jinjun Xiong, “Document Similarity for Texts of Varying Lengths via Hidden Topics,” University of Illinois at Urbana-Champaign, USA, and T. J. Watson Research Center, IBM, arXiv:1903.10675v1 [cs.CL], 26 Mar 2019.
D Gunawan, C A Sembiring, and M A Budiman, “The Implementation of Cosine Similarity to Calculate Text Relevance between Two Documents,” Department of Information Technology and Department of Computer Science, Universitas Sumatera Utara, Jl. dr. Mansur No. 9 Kampus USU Medan 20155.
Amir Jalili Fard, Vinicius Fernandes Caridá, Alex Fernandes Mansano, Rogers S. Cristo, and Felipe Penhorate Carvalho da Fonseca, “Semantic Sensitive TF-IDF to Determine Word Relevance in Documents,” Data Science Team - Digital Customer Service, Itaú Unibanco, São Paulo, Brazil, and Federal University of Minas Gerais, Brazil, arXiv:2001.09896v2 [cs.IR], 25 Jan 2021.
Copyright © 2022 Smitha Kurian, Mohammed Aman Khan, Hema Priya P, Manavpreet Singh, Lobsang Norbu. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.