Authors: Koshe Ahana Hemant, Mankani Neha Haresh , Sayyed Shafiya Majid
Certificate: View Certificate
Crime is one of the dominant and alarming aspect of our society. Over the past few years, the crime rate across globe has increased exponentially. So, preventing the crime from occurring is a vital task. In the recent time, it is seen that artificial intelligence has shown its importance in almost all the field and crime prediction is one of them. However, it is necessary to maintain a proper database of the crime that has occurred. The ability to predict the crime on the basis of time, location and so on which can occur in future can help the law enforcement agencies in preventing the crime before it occurs from a strategical perspective. However, predicting the crime accurately is a challenging task because crimes are increasing at an alarming rate. Thus, the crime prediction and analysis methods are very important to detect the future crimes and reduce them. In Recent time, many researchers have conducted experiments to predict the crimes using various machine learning methods and particular inputs. For crime prediction, KNN, K-means and Random Forest and some other algorithms are used. Our system can predict regions which have high probability for crime occurrence and can visualize crime prone areas. The main purpose is to highlight the worth and effectiveness of machine learning in predicting violent crimes occurring in a particular region in such a way that it can be used by police to reduce crime rates in the society.
Crime is increasing considerably day by day. Crime is among the main issues which is growing continuously in intensity and complexity. Crime patterns are changing constantly because of which it is difficult to explain behaviours in crime patterns. Crime is classified into various types like kidnapping, theft murder, rape etc. The law enforcement agencies collects the crime data information with the help of information technologies (IT). But occurrence of any crime is naturally unpredictable and from previous searches it was found that various factors like poverty, employment affects the crime rate . It is neither uniform nor random. With rapid increase in crime number, analysis of crime is also required. Crime analysis basically consists of procedures and methods that aims at reducing crime risk. It is a practical approach to identify and analyse crime patterns. But, major challenge for law enforcement agencies is to analyse escalating number of crime data efficiently and accurately. So it becomes a difficult challenge for crime analysts to analyse such voluminous crime data without any computational support. A powerful system for predicting crimes is required in place of traditional crime analysis because traditional methods cannot be applied when crime data is high dimensional and complex queries are to be processed. Therefore a crime prediction and analysis tool were needed for identifying crime patterns effectively. This paper introduces some methodologies with the help of which it can be predicted that at what place and time which type of crime has a higher probability of occurrence. Ever Since the Coronavirus Pandemic hit the globe like a shockwave, multiple global data prediction systems have shown unexpected amount of fluctuations. Given the circumstances, it is essential to make accurate predictions based on data analysis.
One of the primary advantages of big data analytics software is that it can evaluate huge quantities of data much faster than humans can, plus spot trends they’d likely miss. So, from a crime-solving point of view, data analytics could help catch criminals who are trying to evade arrest.
Predictive modeling was used for making predictions since it has the method which is able to build a model and has the capability to make predictions. This method consists of different algorithms of Machine Learning that can study properties from the data used for training which is used for producing predictions. It is split in two major classes one is Regression and other is classification of patterns. Regression models are based upon analysis of the relationship that are present between trends and variable in order to make predictions about the continuous variables. Whereas,the job of classification is to assign a particular class labels to a data value as output of the prediction. Division of pattern classification is in two ways i.e., Supervised and Unsupervised learning. It is already known in supervised learning that which class labels are to be used for building classification models. In unsupervised learning, these class labels are not known.
A. Working Of Project
Predictive modeling was used for making predictions since it has the method which is able to build a model and has the capability to make predictions. This method consists of different algorithms of Machine Learning that can study properties from the data used for training which is used for producing predictions. It is split in two major classes one is Regression and other is classification of patterns. Regression models are based upon analysis of the relationship that are present between trends and variable in order to make predictions about the continuous variables. Whereas, the job of classification is to assign a particular class labels to a data value as output of the prediction. Division of pattern classification is in two ways i.e., Supervised and Unsupervised learning. It is already known in supervised learning that which class labels are to be used for building classification models. In unsupervised learning, these class labels are not known. This paper deals with supervised learning.
The dataset that we have used is of 5 Denver City. The different columns or as we refer to them as features are as followed
Here the values are as followed :-
IV. PROPOSED SYSTEM
A. System Components
B. System Chart
C. Collaboration Diagram
V. EDA AND DATA PRE-PROCESSING
A. Data Cleaning
The dataset is not in the right format to use directly with the model hence data preprocessing is required . In our model we did a few steps as followed :
2. Changing the data types of the column / features .
3. Sorting the data by offense category id
VI. MODEL CREATION
A. Splitting the Dataset
For model creation we split the data for training and testing. We defined a function that splits the dataset . The function divided the dataset in training and testing and further into the training dependent and independent variables.
We fit the model using the training split of the dataset by passing the dependent and the independent variables . We passed the testing dataset in the validation dataset.
VII. EXPERIMENTAL ANALYSIS
After testing various machine learning models like Logistic Regression, Decision Tree, Random Forest, Support Vector Machine & K-Nearest Neighbor the result are as follows:
After inserting the Longitude and Latitude as input it predicted which crime is likely to happen in that particular area.
Crime prediction is one the current trends in the society. Crime prediction intends to reduce crime occurrences. It does this by predicting which type of crime may occur in future. Here, analysis of crime and prediction are performed with the help of various approaches some of which are KNN, K means Clustering & Random Forest. However which model will work best is totally dependent on the dataset that is being used. This research work offers a way to foresee and predict crimes and frauds within a city. It focuses on having a crime prediction tool that can be helpful to law enforcement. This paper is aimed at increasing the prediction accuracy as much as possible. As compared to the previous work, this work was successful in achieving the highest accuracy in prediction. The KNN system helps law implementing agencies for improved and exact crime analysis. The result of the optimized k-means algorithm is efficient and provides improved accuracy of the final cluster reduced the number of iterations. We know that the prediction accuracy of random forest model can be improved based on historical crime data and covariates (POI data and demographic information). From the overall prediction results, the prediction model of random forest with covariates has better performance compared with Naive Bayes model and logistic regression model. As a combinatorial classification model, random forest overcomes the limitations of single decision tree classification and can effectively avoid the problem of overfitting. Among the model evaluation indexes selected in this paper, the random forest prediction model with covariates is better than other models.
 Pratibha, A. Gahalot, Uprant, S. Dhiman and L. Chouhan, \"Crime Prediction and Analysis,\" 2nd International Conference on Data, Engineering and Applications (IDEA), 2020, pp. 1-6, doi: 10.1109/IDEA49133.2020.9170731.  S. Yao et al., \"Prediction of Crime Hotspots based on Spatial Factors of Random Forest,\" 2020 15th International Conference on Computer Science & Education (ICCSE), 2020, pp. 811-815, doi: 10.1109/ICCSE49874.2020.9201899.  S. G. Krishnendu, P. P. Lakshmi and L. Nitha, \"Crime Analysis and Prediction using Optimized K-Means Algorithm,\" 2020 Fourth International Conference on Computing Methodologies and Communication (ICCMC), 2020, pp. 915-918, doi: 10.1109/ICCMC48092.2020.ICCMC-000169.  A. Kumar, A. Verma, G. Shinde, Y. Sukhdeve and N. Lal, \"Crime Prediction Using K-Nearest Neighboring Algorithm,\" 2020 International Conference on Emerging Trends in Information Technology and Engineering (ic-ETITE), 2020, pp. 1-4, doi: 10.1109/ic-ETITE47903.2020.155.
Copyright © 2022 Koshe Ahana Hemant, Mankani Neha Haresh , Sayyed Shafiya Majid. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.