With advances in AI and deep learning, automated personality analysis from video interviews has emerged as a key area in personality computing and psychological assessment. Leveraging computer vision and pattern recognition, modern models can now interpret nonverbal cues to estimate personality traits directly from visual input. Recruitment, however, still relies largely on manual assessments that are susceptible to bias and inconsistency, making objective candidate evaluation challenging. While asynchronous video interviews (AVIs) offer scalability and convenience, they fall short of capturing deeper personality-related cues. This research introduces an Automatic Personality Recognition (APR) framework that leverages multimodal data (text, audio, and visuals) to assess candidates along the Big Five personality traits. By applying advanced deep learning techniques to recorded interviews, the system delivers objective and scalable personality evaluations, enhancing the fairness and effectiveness of hiring decisions and addressing key limitations of both conventional and technology-driven recruitment practices.
Introduction
Overview:
Asynchronous Video Interviews (AVIs) are gaining popularity in recruitment due to their flexibility and scalability. However, traditional personality assessments are prone to human bias and inconsistency. To address this, this study introduces an Automated Personality Recognition (APR) system that uses multimodal AI techniques to objectively evaluate the Big Five personality traits: Extraversion, Agreeableness, Conscientiousness, Neuroticism, and Openness.
Key Components of the APR System:
1. Multimodal Data Integration:
Textual, visual, and audio data are extracted from video interviews.
Each modality offers unique personality cues:
Text: word choice and sentence structure
Audio: vocal tone and pitch
Video: facial expressions and body language
2. Model Architecture:
Combines CNNs, LSTM networks, and pretrained models (like VGG16 and BERT).
Uses ChaLearn First Impressions V2 dataset with 1,891 interview clips (≈63 hours).
Features are extracted using:
MFCCs for audio
Frame-wise CNN + LSTM for video
BERT embeddings + dense layers for text
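Of the features listed above, MFCCs are the most self-contained to illustrate. The following is a minimal NumPy/SciPy sketch of the standard MFCC recipe (framing, power spectrum, mel filterbank, DCT); the function name, frame sizes, and filter counts here are illustrative assumptions, not the exact parameters of the paper's audio pipeline:

```python
import numpy as np
from scipy.fft import dct

def mfcc(signal, sr=16000, n_fft=512, hop=256, n_mels=26, n_mfcc=13):
    # Frame the signal and apply a Hann window
    frames = np.lib.stride_tricks.sliding_window_view(signal, n_fft)[::hop]
    frames = frames * np.hanning(n_fft)
    # Per-frame power spectrum
    power = np.abs(np.fft.rfft(frames, axis=1)) ** 2
    # Triangular mel filterbank: equally spaced points on the mel scale
    mel_pts = np.linspace(0, 2595 * np.log10(1 + sr / 2 / 700), n_mels + 2)
    hz_pts = 700 * (10 ** (mel_pts / 2595) - 1)
    bins = np.floor((n_fft + 1) * hz_pts / sr).astype(int)
    fbank = np.zeros((n_mels, n_fft // 2 + 1))
    for m in range(1, n_mels + 1):
        l, c, r = bins[m - 1], bins[m], bins[m + 1]
        fbank[m - 1, l:c] = (np.arange(l, c) - l) / max(c - l, 1)
        fbank[m - 1, c:r] = (r - np.arange(c, r)) / max(r - c, 1)
    # Log mel energies, then a DCT to decorrelate them -> MFCCs
    logmel = np.log(power @ fbank.T + 1e-10)
    return dct(logmel, type=2, axis=1, norm='ortho')[:, :n_mfcc]

# One second of a 440 Hz tone as a stand-in for interview audio
sig = np.sin(2 * np.pi * 440 * np.arange(16000) / 16000)
feats = mfcc(sig)
print(feats.shape)  # (num_frames, 13)
```

In practice a library such as librosa would compute these; the resulting per-frame coefficient matrix is what gets pooled or fed to the audio branch of the model.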
3. Fusion and Prediction:
Features from all modalities are concatenated and passed through fully connected layers.
The model regresses continuous scores for each of the Big Five traits using Mean Absolute Error (MAE) as the loss function.
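The fusion step described above can be sketched in a few lines of NumPy. This is an illustrative forward pass only, with made-up embedding sizes and random weights, not the paper's trained architecture:

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical per-modality embedding sizes (illustrative assumptions)
audio_feat = rng.standard_normal(128)   # e.g. pooled MFCC statistics
video_feat = rng.standard_normal(256)   # e.g. CNN+LSTM summary vector
text_feat  = rng.standard_normal(768)   # e.g. a BERT sentence embedding

# Late fusion: concatenate all modality features into one vector
fused = np.concatenate([audio_feat, video_feat, text_feat])  # shape (1152,)

# Fully connected layers ending in 5 outputs, one per Big Five trait
W1 = rng.standard_normal((fused.size, 64)) * 0.01
W2 = rng.standard_normal((64, 5)) * 0.01
hidden = np.maximum(fused @ W1, 0)           # ReLU
traits = 1 / (1 + np.exp(-(hidden @ W2)))    # trait scores in (0, 1)

# MAE loss against ground-truth trait annotations
target = np.array([0.6, 0.5, 0.7, 0.4, 0.55])
mae = np.mean(np.abs(traits - target))
print(traits.shape, float(mae))
```

The design point is that concatenation keeps each modality's features intact and lets the dense layers learn cross-modal interactions, while MAE penalizes every trait's error linearly.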
Performance Results:
Training MAE: 0.0781
Validation MAE: 0.1085
Test MAE: 0.1105 → Accuracy (1 - MAE): 88.95%
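The accuracy figure above is simply one minus the test MAE, since trait scores lie in [0, 1]:

```python
# Converting the reported test MAE into the paper's accuracy figure
test_mae = 0.1105
accuracy = 1 - test_mae
print(f"Accuracy: {accuracy:.2%}")  # Accuracy: 88.95%
```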
The small gap between training and validation MAE indicates that the model generalizes well without significant overfitting.
Real-World Application:
A web-based platform was developed to implement this model.
It allows recruiters to run AVIs and get automated, data-driven personality insights via an intuitive dashboard—improving both scalability and fairness in hiring.
Conclusion
This research marks a significant advancement in recruitment and selection methodologies through the development of an Automated Personality Recognition (APR) system. By leveraging multimodal data spanning audio, visual, and textual inputs, and by employing a hybrid deep learning framework, the proposed system demonstrates strong potential for delivering accurate, scalable, and unbiased personality assessments. The comparative analysis of various models underlines the importance of balancing predictive accuracy with computational efficiency, particularly in real-world, large-scale applications. This work addresses inherent limitations of traditional assessment methods by reducing human bias and establishing an automated pipeline that aligns with the dynamic needs of modern hiring processes. Moreover, it lays the groundwork for ethical and interpretable AI solutions within recruitment and talent analytics.
Looking ahead, this research opens several promising avenues. One critical future direction is the incorporation of personalized interview feedback, wherein AI-driven insights can help candidates reflect on and improve their communication style, confidence, and engagement. Another is the development of real-time personality prediction systems, enabling live feedback during video interviews to assist recruiters with immediate behavioural insights. Furthermore, integrating APR systems into Applicant Tracking Systems (ATS) could standardize candidate evaluations across organizations, enhancing consistency and fairness in hiring. The framework also supports candidate screening based on personality fit, team building through trait complementarity analysis, and personalized training tailored to individual learning styles. There is also scope to expand trait analysis beyond the Big Five model, incorporating extended personality and behavioural metrics for richer psychological profiling. Finally, embedding principles of Explainable AI (XAI) into personality prediction models will be essential for transparency, helping stakeholders understand and trust model decisions—thereby fostering wider adoption in sensitive decision-making contexts.
Together, these directions pave the way for the evolution of ethical, intelligent, and impactful AI systems that not only enhance recruitment but also redefine how organizations understand and engage with human potential.