Emotion Detection using Machine Learning

Authors: Chirag Sharma, Harshit Singh, Harshit Rana, Kartik Kumar

DOI Link: https://doi.org/10.22214/ijraset.2023.53251

Abstract

In this article, machine learning algorithms are used to identify human emotions from a person's facial expression. The method is reported to provide a detection precision of 87%, and three separate algorithms are evaluated for their accuracy. The development of machine learning has been one of the most disruptive advances of the past ten years, with a big impact on reliability and accurate prediction. Emotion recognition is one of the crucial non-verbal skills employed in human communication, and it helps to ascertain a person's mindset and attitude. Machines could be more useful to mankind if they could recognise and understand human emotions. The two most popular methods for determining emotions are those that depend on speech and on facial expression. According to some psychologists, about 55% of communication takes place through facial expressions. The objective of this paper is to examine recent deep learning research on automatic facial emotion recognition (FER). We concentrate on the architectures and databases used and on the contributions made, and we highlight the progress achieved by contrasting the proposed methods and the results obtained. The goal of this study is to aid and guide scholars by analysing earlier work and making recommendations for how to further this subject.

I. INTRODUCTION

Communication is essential for human-machine interaction to progress. People clearly prefer to communicate with technology in natural language, which has developed over time. In addition to the language we use to express ourselves, emotions also serve as a means of communication between people. If machines were able to understand human emotions, communication would be taken one step further.

Emotion recognition can be performed in numerous ways, but recognition based on speech and on facial expression are the main areas of focus. Both can be carried out in a number of ways, including with deep learning and with standard machine learning techniques. This article employs non-deep-learning-based techniques, often known as classical machine learning.

Human-machine interaction can have a major impact only through effective communication. Emotion identification is one of the crucial non-verbal techniques used in this communication, and both verbal and non-verbal techniques are useful. Emotions can be captured from a variety of channels, such as speech, gestures, and facial and bodily expressions. According to some psychologists, about 55% of communication takes place through facial expressions. Facial features change in a variety of ways as a result of emotion-induced changes in facial muscle activity. By identifying variations in facial features, one can infer the mood depicted in a picture. Owing to advances in related fields, particularly machine learning, image processing, and human cognition, FER has evolved greatly in recent years. As a result, the influence and prospective applications of automatic FER have been expanding in a variety of fields, such as human-computer interaction, robot control, and driver state monitoring.

Facial emotion identification is a difficult problem because it requires the ability to recognise a wide variety of facial shapes, poses, and variations. The eyes, mouth, and brows are some of the important features that are detected and evaluated to identify the mood. The nose wrinkler, lip tightener, inner brow raiser, upper lid raiser, outer brow raiser, mouth stretcher, lip corner depressor, and lips part are additional vital elements of facial expressions that help identify the emotion. The nasolabial region, brows, eyes, forehead, cheeks, and lips are therefore regions of focus, because these are the areas where the various emotions are produced by the movement of the underlying muscles.

Affective computing technologies can recognise the user's emotions through the use of sensors, microphones, and cameras, and respond by applying certain predetermined characteristics of a product or service. One way to think about affective computing is as a form of human-computer interaction in which a device is able to perceive and respond to the emotions that users are showing.

Flow Chart

2. The Technology Behind Emotion Detection

Machine learning techniques aim to make software programmes more accurate at predicting outcomes without being explicitly programmed, while also assuring data integrity. For the most part, machine learning uses historical data as input to predict new output values.

ML has proved beneficial because it can solve problems at a speed and scale that the human mind alone cannot match. By applying massive processing power to a single task or to a number of separate tasks, machines can be trained to recognise patterns in, and correlations between, incoming data. Because of this, machines can now automate repetitive operations.

The purpose of this paper is to examine how far machine learning improves the accuracy of detection results on data from diverse sources. In this study, a CNN is used to make predictions because it performs more accurately than the other machine learning algorithms. Once facial expression analysis has successfully detected an emotion, additional steps or actions can be taken.

A. Models used

CNN: Convolutional neural networks. The dropout rate, the number of dense layers, and the activation functions can be adjusted to improve performance. We also applied transfer learning using VGG, a pre-trained convolutional neural network for image classification.
KNN: K-nearest neighbours is a non-parametric learning technique, so it assumes nothing about how the data are distributed. Our input features were the Euclidean distances between the facial landmark points. Our precision was around 50%, so we went on to investigate different non-linear models.
Multi Layer Perceptron: MLPs are a subset of neural networks made up of one or more layers of neurons. The input layer receives the data and may be followed by one or more hidden layers; the predictions come from the output layer. Our accuracy increased from about 50% to about 80% when we used the distances between facial landmarks rather than pixel values. However, as we required models that would be far more accurate, we opted to use CNNs.
Pooling: Once the convolution layer has produced the feature maps, a pooling (subsampling) layer should be added after it. Like the convolutional layer, the pooling layer is responsible for reducing the spatial size of the convolved feature. This dimensionality reduction decreases the processing resources required to process the data.
Fully Connected Layer: Once the input image has been given a suitable form, it is flattened into a column vector. The flattened output is fed into a feed-forward neural network, and back-propagation is applied in each training cycle. The model uses the softmax classification method to distinguish between dominant and particular low-level features in images.
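
The landmark-distance features used with the KNN and MLP models above can be illustrated with a minimal sketch. It assumes that each face is already available as a list of (x, y) landmark coordinates; the function names `landmark_distances` and `knn_predict`, the choice k = 3, and the toy coordinates and labels are illustrative assumptions and are not taken from the authors' implementation.

```python
import math
from collections import Counter
from itertools import combinations


def landmark_distances(landmarks):
    """Return the Euclidean distance between every pair of (x, y) landmarks."""
    return [math.dist(a, b) for a, b in combinations(landmarks, 2)]


def knn_predict(train_features, train_labels, query, k=3):
    """Label `query` by majority vote among its k nearest training vectors."""
    ranked = sorted(
        zip(train_features, train_labels),
        key=lambda pair: math.dist(pair[0], query),
    )
    votes = Counter(label for _, label in ranked[:k])
    return votes.most_common(1)[0][0]


# Hypothetical toy data: three landmarks per face (two mouth corners and
# one lower-lip point), each paired with an emotion label.
faces = [
    ([(0, 0), (10, 0), (5, 3)], "happy"),
    ([(0, 0), (11, 0), (5, 3)], "happy"),
    ([(0, 0), (10, 0), (5, 4)], "happy"),
    ([(0, 0), (6, 0), (3, 1)], "neutral"),
    ([(0, 0), (6, 0), (3, 2)], "neutral"),
    ([(0, 0), (7, 0), (3, 1)], "neutral"),
]
train_features = [landmark_distances(points) for points, _ in faces]
train_labels = [label for _, label in faces]

query_face = [(0, 0), (10, 0), (5, 3)]
prediction = knn_predict(train_features, train_labels,
                         landmark_distances(query_face))
print(prediction)
```

Because the feature vector holds distances rather than raw pixel values, it does not change when the whole face is translated within the image, which is one plausible reason why such features can outperform pixel inputs for simple classifiers.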

B. Software Requirement Properties

Python 3.6.0 is a dynamic, object-oriented programming language that can be used for many kinds of software development projects. It is quick to learn, comes with a sizable standard library, and offers strong support for integration with other languages and technologies. Many Python programmers report that their productivity has increased significantly and that the language encourages the writing of better-quality, easier-to-maintain code.

Using Jupyter Notebook: The Jupyter Notebook App is a server-client application that enables notebook documents to be edited and run through a web browser. It can be run locally on a desktop without an internet connection, or remotely on a server that is accessed over the internet.

II. RELATED WORK

The capacity to read emotional facial expressions accurately is one of the factors that affect the quality of interpersonal relationships. When people can effectively discern the feelings of others, more such exchanges take place. Social interaction problems are a feature of some psychopathological disorders and may be partially explained by difficulties in reading facial expressions of emotion. These deficits have been reported in numerous clinical populations. The findings to date regarding facial emotions, however, have been contradictory. This essay's objectives are to explore the topic of emotion and the development of emotional expressions, to emphasise the benefits and drawbacks of related studies, to contrast the results, and to call attention to this novel issue for Turkey.

The market is divided into software and services according to its components. IoT-based technology makes it possible to monitor and respond to any devices that track a person's behaviour and emotions. The devices exchange data such as sensor input, which controls the output of a remote industrial process without the need for human intervention. The use of technology to detect emotions is progressively becoming commercially viable. Deleted from it without going into detail regarding the pre-processing technique used. With a revenue share of 30.6% in 2021, North America had the highest revenue share. The Internet of Things (IoT) is expanding quickly, and wearable devices are becoming more and more popular, opening up considerable potential for the local market. Increased government spending and growing demand from different businesses for better services and security both contribute to the expansion.

III. METHODOLOGY

The following are the three key elements of emotion detection: Image preprocessing, Feature Extraction and Classification of Features.

A. Image Preprocessing

Image recognition is the technology that identifies places, brands, people, products, and other items in photos. It is a subset of computer vision and can identify and locate an object in a digital image or video. The field of computer vision covers techniques for gathering, processing, and analysing data from videos or still photos captured in the real world. These sources generate high-dimensional data that can be used to reach numerical or symbolic decisions. In addition to image recognition, computer vision also includes object recognition, learning, event detection, video tracking, and image reconstruction. A computer can distinguish raster graphics from vector graphics: raster images are composed of discrete, numerically valued pixels, whereas vector images are made up of a collection of colour-annotated polygons. To interpret images, the geometric encoding is transformed into constructs that represent physical properties and objects, and the computer then analyses these constructs logically. The second stage involves developing a predictive model that can be used with a classification algorithm.

B. Feature Extraction

Feature extraction reduces the dimensionality of an initial set of raw data so that it can be processed more easily. The features of an image characterise its content; a feature, such as a point or an edge, is essentially a pattern in the image. Feature extraction is useful when data must be analysed with fewer resources while retaining the important and relevant information, and it also reduces the amount of redundant data. The features are extracted after various image preprocessing techniques, such as thresholding, scaling, normalising, and binarizing, have been applied to the sampled image. Feature extraction techniques are used to obtain features for image classification and recognition. The Colour Gradient Histogram and ORB are two feature detection techniques. Computer systems also use corner detection to extract features. The contents of an image are then inferred from the extracted features.
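
Two of the preprocessing steps named above, normalising and binarizing, can be sketched as follows. This is a minimal illustration that assumes a grayscale image stored as a nested list of intensities; the function names, the min-max form of normalisation, and the threshold of 0.5 are assumptions made for the example and are not specified in the paper.

```python
def normalise(image):
    """Min-max scale pixel values to the range [0.0, 1.0]."""
    flat = [p for row in image for p in row]
    lo, hi = min(flat), max(flat)
    span = (hi - lo) or 1  # avoid division by zero on a uniform image
    return [[(p - lo) / span for p in row] for row in image]


def binarise(image, threshold=0.5):
    """Map each pixel to 1 if it is at or above `threshold`, otherwise 0."""
    return [[1 if p >= threshold else 0 for p in row] for row in image]


# Hypothetical 2 x 3 grayscale patch with 8-bit intensities.
gray = [
    [0, 64, 128],
    [255, 32, 200],
]
scaled = normalise(gray)
mask = binarise(scaled)
print(mask)
```

Normalising first means that the same threshold can be reused across images with different overall brightness, which is the usual motivation for applying the two steps in this order.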

Corner detection can be used for a variety of tasks, including motion detection, image registration, video tracking, image mosaicing, 3D modelling, and object recognition. During the detection stage, a window of the target size is moved across the input image, and the Haar features are computed for each section of the image. Different features yield different values. The resulting difference is then compared with a cutoff that separates objects from non-objects. Because each Haar feature detects only marginally better than random guessing, it is referred to as a "weak classifier".

A CNN performs convolution on an input image using a filter, or kernel. Convolution slides the filter over the image, starting in the top-left corner and moving across the full width, then stepping down and repeating until the entire image has been scanned. At each position the filter's features are matched against the corresponding part of the image: each image pixel is multiplied by the corresponding feature pixel, the products are added together, and the sum is divided by the total number of pixels in the feature.
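
The Haar-feature detection stage described above can be sketched as follows. The example assumes a two-rectangle feature (lower half of the window minus upper half) evaluated with an integral image; the function names, the feature layout, the toy image, and the cutoff of 100 are illustrative assumptions rather than details given in the paper.

```python
def integral_image(image):
    """Build a summed-area table with one extra row and column of zeros."""
    h, w = len(image), len(image[0])
    table = [[0] * (w + 1) for _ in range(h + 1)]
    for y in range(h):
        row_sum = 0
        for x in range(w):
            row_sum += image[y][x]
            table[y + 1][x + 1] = table[y][x + 1] + row_sum
    return table


def rect_sum(table, top, left, height, width):
    """Sum of the pixels inside a rectangle, using four table lookups."""
    bottom, right = top + height, left + width
    return (table[bottom][right] - table[top][right]
            - table[bottom][left] + table[top][left])


def haar_two_rect(table, top, left, height, width):
    """Two-rectangle feature: lower half of the window minus upper half."""
    half = height // 2
    upper = rect_sum(table, top, left, half, width)
    lower = rect_sum(table, top + half, left, half, width)
    return lower - upper


def scan(image, win_h, win_w, cutoff):
    """Slide the window over the image; keep positions above the cutoff."""
    table = integral_image(image)
    hits = []
    for top in range(len(image) - win_h + 1):
        for left in range(len(image[0]) - win_w + 1):
            if haar_two_rect(table, top, left, win_h, win_w) > cutoff:
                hits.append((top, left))
    return hits


# Hypothetical 4 x 4 image: a dark band (10) above a bright band (200).
image = [
    [10, 10, 10, 10],
    [10, 10, 10, 10],
    [200, 200, 200, 200],
    [200, 200, 200, 200],
]
hits = scan(image, win_h=2, win_w=2, cutoff=100)
print(hits)
```

Only the windows that straddle the dark-to-bright boundary exceed the cutoff. A single feature of this kind is a weak classifier; practical detectors combine many of them.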

C. Classification of Features

Once the input image has been given a suitable form, it is flattened into a column vector. The flattened output is fed into a feed-forward neural network, and back-propagation is applied in each training cycle. The model uses the softmax classification technique to classify images by locating dominant and specific low-level features. We now have all of the parts required to build a CNN: convolution, ReLU, and pooling. The output of max pooling is fed to the multi-layer perceptron classifier described earlier. In CNNs, these layers are frequently applied several times, as in the following sequence: Convolution -> ReLU -> Max-Pool -> Convolution -> ReLU -> Max-Pool. The fully connected layer will not be covered at this time.
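
The layer sequence above can be sketched in plain Python. This is a minimal, untrained illustration of a single Convolution -> ReLU -> Max-Pool pass followed by flattening and softmax; the kernel values, the toy image, the two-class dense weights, and all function names are hypothetical and do not come from the authors' model.

```python
import math


def convolve2d(image, kernel):
    """Slide the kernel over the image (no padding) and sum the products.

    As in most CNN libraries, the kernel is not flipped (cross-correlation).
    """
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    return [
        [
            sum(image[y + i][x + j] * kernel[i][j]
                for i in range(kh) for j in range(kw))
            for x in range(out_w)
        ]
        for y in range(out_h)
    ]


def relu(fmap):
    """Replace negative activations with zero."""
    return [[max(0, v) for v in row] for row in fmap]


def max_pool(fmap, size=2):
    """Non-overlapping size x size max pooling; leftover edges are dropped."""
    return [
        [
            max(fmap[y + i][x + j] for i in range(size) for j in range(size))
            for x in range(0, len(fmap[0]) - size + 1, size)
        ]
        for y in range(0, len(fmap) - size + 1, size)
    ]


def flatten(fmap):
    """Turn a 2-D feature map into a single vector."""
    return [v for row in fmap for v in row]


def dense(vector, weights, biases):
    """Fully connected layer: one weighted sum per output class."""
    return [sum(w * v for w, v in zip(row, vector)) + b
            for row, b in zip(weights, biases)]


def softmax(scores):
    """Convert raw scores into probabilities that sum to 1."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


# Hypothetical 5 x 5 image with a bright vertical stripe, and an
# edge-detecting 2 x 2 kernel.
image = [[0, 0, 9, 9, 0] for _ in range(5)]
kernel = [[-1, 1], [-1, 1]]

fmap = convolve2d(image, kernel)   # 4 x 4 feature map
pooled = max_pool(relu(fmap))      # 2 x 2 after ReLU and pooling
flat = flatten(pooled)             # vector of length 4

# Hypothetical, untrained weights for two output classes.
weights = [[0.1, 0.0, 0.1, 0.0], [0.0, 0.1, 0.0, 0.1]]
probs = softmax(dense(flat, weights, [0.0, 0.0]))
print(pooled, probs)
```

In a trained network the kernel and dense weights would be learned by back-propagation, and the Convolution -> ReLU -> Max-Pool block would typically be stacked more than once before the flattening step.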

IV. FUTURE SCOPE

Future developments in emotion recognition will make it possible for machines to understand human emotion, which is the first step towards fulfilling our objectives. Visionify develops distinctive computer vision solutions that are customised to meet your specific needs. We assist companies in resolving urgent problems and improving their operations. Our systems gather information from video sources, analyse it, and interpret it to produce beneficial results for corporate operations. In the data analytics process, technology is used to clean, analyse, and model data, and insights are then derived from the data. The resulting knowledge is put to use in business decision-making. As organisations become more tech-driven and fast-paced, specialists in data analysis are playing an increasingly significant role in them. overfitting, to create even more accurate models. Both startups and global tech giants are looking to hire data analysts who can collect, analyse, and interpret data to help with decision-making. The performance of FER has improved through the incorporation of DL approaches. In the modern world, it is essential to create intelligent machines that can distinguish different people's facial expressions and respond appropriately. The creation and application of emotion-focused DL techniques using IoT sensors has been suggested. This is expected to improve FER's performance to a level that is on par with that of humans, which will be very helpful in the sectors of security, surveillance, and investigation.

Conclusion

Based on the video data, the output of the proposed model predicts the subject's sentiment. Because the output indicates the severity of the subject's mental problems and level of stress, it can be used in a number of situations. Peers and family members can therefore act to improve the subject's mental state and foster harmony and peace of mind if the subject hears "critical" comments. These sentiment analysis techniques are therefore essential to building a thriving society. This study compiled the findings of a number of studies, making an effort to incorporate as many references from recent years as feasible. Based on these reviews, the study addressed some of the issues in facial expression identification by using a variety of face detection, feature extraction, analysis, and classification methods. The paper offers thorough details on the techniques used for Facial Expression Recognition (FER) at all stages, including the strategies currently employed at each phase of the discipline's development, which is beneficial to both experienced and new researchers in FER. This knowledge helps researchers to better understand existing trends and improves their prospects for future research.


Copyright

Copyright © 2023 Chirag Sharma, Harshit Singh, Harshit Rana, Kartik Kumar. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

Paper Id : IJRASET53251

Publish Date : 2023-05-28

ISSN : 2321-9653

Publisher Name : IJRASET
