1 Introduction

Crimes are real impediments to societies, irrespective of their cultural, social and economic background. Since crime incidents are considered negative indicators of the wellness of any civilization, they play a key role in the life of the individual, the family and the state. Punishment is one of the ways to reduce crime rates. The general public always looks for preventive measures that would stop crime from occurring. Crime can only be prevented by implementing effective strategies, and those strategies may be derived from the analysis of existing data. Law enforcement agencies have practiced such organized analysis for the past few centuries. It was known as hot spot analysis, and it required a substantial number of people and an abundant amount of time.

People have always dreamed of a predictive mechanism that correctly identifies and forecasts crime events that may occur in the future. The concept of crime prediction has been the cradle of many science fiction works. The Minority Report, a short story by Philip K. Dick, was published in 1956 and adapted as a movie in 2002. The story achieved massive success globally in both forms, in spite of the cultural, linguistic and geographic differences among its readers and viewers. Since safety is a universal concern, everyone hopes that crime can be predicted and stopped. Until the last decade this was fantasy or fiction; it no longer is [1] (Table 1).

Table 1 IPC and SLL crimes in India

Agencies and organizations such as police departments, judicial organizations and NGOs hold prolific quantities of crime-related data. They have information about the type and the spatial–temporal properties of crimes, along with the socioeconomic background. The data listed above was gleaned from NCRB-India [2]; it shows an 85% increase in cognizable crime incidents over the past 35 years. The trend clearly indicates that crime occurrences are likely to increase in the future. Crime control strategies can only be designed by studying the data closely. Hence, data mining and machine learning algorithms have to be used to explore historical information and predict crimes. Nevertheless, there are a few challenges in the accuracy of prediction and decision making. Forthcoming methodologies are expected to resolve these obstacles.

2 Survey Classification

Crime data mining is an interdisciplinary field that comprises criminology, data mining algorithms and statistics. Presently, many researchers are working on designing effective crime prediction models. A comprehensive review of varied and extensive research in crime data mining is presented here. It identifies the techniques used, the methodologies implemented, the conclusions obtained and the future directions indicated in the existing works. The surveyed articles are classified based on the following factors:

  2.1 Predicting crimes based on socioeconomic, spatial–temporal and demographic factors.

  2.2 Predicting crimes based on their nature (i.e., grading by severity).

  2.3 Predicting crimes by the characteristics of victims (e.g., women, children, elderly people).

  2.4 Predicting crimes by using different methodologies (e.g., machine learning, deep learning, transfer learning).

3 Literature Survey

Cyber forensics is the process of finding evidence linked to a technical crime. There are numerous kinds of e-crime, including hacking of user accounts, financial fraud, data theft, phishing and blackmailing. These unlawful activities severely affect people, organizations and society. In most cases, detecting a cybercrime and locating the criminals are considered intricate tasks. Whenever a technological advancement is rolled out, it elevates the risk of crime. To tackle this issue, technical forensic experts study past data, identify the associated risks and try to fix them with new approaches. Preventing cybercrime would save millions of dollars every year. Once an e-crime has happened, it can cause irreversible damage to the system. Even though they have limitations, data mining and machine learning models are considered the most successful methodologies in e-crime prediction. Emerging methodologies like deep learning give hope that these challenges can be addressed through effective forecasting. Since deep learning is a methodology of artificial intelligence, it tries to imitate the human brain during execution. It can also be effective at processing unstructured data.

Karie et al. [3] proposed a framework that aids the cyber forensic process by implementing a deep learning algorithm, called the DLCF framework for short. It contains various layers, including initialization, data source identification, deep learning-enabled investigation, reporting, decision making and closure. After the evidence is identified, the relationships between entities are established by deep learning. The authors hope that the algorithm can study various incidents, derive patterns and deliver superior predictions.

Studies of urban crime data mining outnumber rural ones. The frequency of crime occurrences, the availability of datasets and the density of population are a few of the factors that attract researchers to issues related to urban communities. There is very little research on small cities, towns and rural areas. The characteristics of urban and rural societies contrast with each other, and so do their crimes. Approaches to urban crime prediction may not be suitable for rural areas, and vice versa. Bolger and Bolger [4] performed a survey on fear of crime in a small town. Fear of crime has been studied by examining individual demographic factors, community-level effects or a combination of both. The authors took the vulnerability model and the incivilities model for their analysis.

The vulnerability model focuses on susceptible demographic factors and their influence on fear of crime; for example, females are more likely to be victimized in sexual crimes like rape. The incivilities model tries to explain the effects of neighborhood disorder on fear of crime. Social disorder and dissatisfaction with law enforcement agencies increase the fear. The authors examined the trend of fear by analyzing neighborhood and demographic variables. They report that social disorder is the strongest factor inducing fear of crime. The findings confirmed that the presence of police or other legal authorities reduces the fear of crime in town and rural populations.

Crime is a sporadic event associated with many external factors, including the economy of the country. The relationship between crime and the economy has been examined by various scholars in the past. The USA went through a recession in the 1930s, and the crime rate dipped. Later, in the 1950s, the economy boomed, and the crime rate rose as well. During the 1980s and 1990s, there were ups and downs in crime rates. Hence, many researchers dismissed the connection between the economy and the crime rate. A few others, who still believe that these factors are correlated, continue to work on the question.

Mittal et al. [5] examined the economic crisis in India and tried to demonstrate the influence of the economy on crime rates. The crime data and economic indicators such as Gross State Domestic Product (GSDP), Net State Domestic Product (NSDP), per capita Net State Domestic Product, the unemployment rate and the Consumer Price Index (CPI) for the years 2004–2013 were obtained from various government portals. Independent variables (Gross District Domestic Product (GDDP) and the unemployment rate) and dependent variables (theft, burglary and robbery) were evaluated with decision tree, random forest, linear regression and neural network algorithms. Of the four, linear regression produced the most accurate results. It corroborated the correlation between unemployment and robbery. This paper reopened the debate about the influence of economic factors on crime incidents.
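The kind of relationship described above can be illustrated with a minimal sketch of ordinary least squares regression and Pearson correlation. It is not the authors' code or data: the function names `fit_line` and `pearson_r` are hypothetical, and the unemployment and robbery values are made-up placeholders chosen only to make the example runnable.

```python
from math import sqrt


def fit_line(xs, ys):
    """Ordinary least squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equally long sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return sxy / sqrt(sxx * syy)


# Placeholder values (NOT real statistics): yearly unemployment rate in percent
# and robbery incidents per 100,000 inhabitants for a hypothetical district.
unemployment_rate = [4.0, 5.0, 6.0, 7.0, 8.0]
robbery_rate = [11.0, 12.0, 15.0, 15.0, 18.0]

slope, intercept = fit_line(unemployment_rate, robbery_rate)
r = pearson_r(unemployment_rate, robbery_rate)
print(f"robbery ~ {slope:.2f} * unemployment + {intercept:.2f}, r = {r:.3f}")
```

A positive slope together with a correlation coefficient close to 1 would be read as support for a link between the two variables, although correlation alone does not establish causation.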

The population of an area is an important factor in crime prediction. Mostly, predictions are derived from the residential population alone. However, the influence of the ambient population and other related factors on the crime rate is not negligible. Kadar and Pletikosa [6] collected data from various sources, including crime data from the police department, census information, location-based social networks, subway rides and taxi rides. Two types of models are used in crime prediction. Long-term crime prediction models target crime rates accumulated over 1–5 years. Short-term crime prediction models, by contrast, target a short period that may vary from one day to one month.

Location-based social networks like Foursquare provide information about people on the move and the purpose of their trips. This data about the public was correlated with the crime data obtained from the police department. Census, spatial and temporal features were evaluated with three different tree-based machine learning models. The residential population, combined with spatial–temporal features, delineates the characteristics of the ambient population, which leads to effective crime prediction by the machine learning models. The outcome of this human mobility data-based model is superior to that of census-based models.

A cyber-attack is an action executed against a computer or an entire network. The invaders primarily try to access the data available on the computer, knock out the system or partially disable its utilities. Such activities can have a catastrophic effect on national security; in that case they are called 'cyberterrorism.' Experts estimated that cybercrimes would cost six trillion dollars annually by 2021 [7]. There are many types of cyber-attacks, including phishing attacks, drive-by-download attacks, denial-of-service attacks, man-in-the-middle attacks, password attacks, snooping attacks, cryptographic attacks, malware attacks, zero-day exploits, cryptojacking, SQL injection attacks, cross-site scripting attacks and spoofing attacks.

A cyber-attack can damage the reputation of a person, collapse a company or even start socioeconomic warfare. In modern society guns have fallen silent, and cybercrimes fire the shots instead. Okutan et al. [8] analyzed unconventional signals obtained from the Global Database of Events, Language, and Tone (GDELT), the Open Threat Exchange (OTX) and Twitter data. They then evaluated predictive signal imputation (PSI), aggregating signals with significant lags (ASL) and SMOTE++ for imbalanced data against the incidents. Missing data in some attributes was filled with plausible values to increase the quantity of training data and make the signals more significant. Their model addressed the problem of predicting the type of cyber-attack.
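To make the idea of oversampling imbalanced data concrete, the following is a minimal sketch of plain SMOTE-style interpolation. It is not SMOTE++ itself, whose details are not given here, and it is not the authors' implementation. The function name `smote`, its parameters and the sample points are all assumptions made for illustration.

```python
import random


def smote(minority, n_new, k=2, seed=0):
    """Create n_new synthetic minority samples by interpolating between a
    randomly chosen minority point and one of its k nearest minority neighbours."""
    if len(minority) < 2:
        raise ValueError("need at least two minority samples to interpolate")
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # Rank the other minority points by squared Euclidean distance to base.
        others = [p for p in minority if p is not base]
        others.sort(key=lambda p: sum((a - b) ** 2 for a, b in zip(base, p)))
        neighbour = rng.choice(others[:k])
        gap = rng.random()  # position along the segment base -> neighbour
        synthetic.append(tuple(a + gap * (b - a) for a, b in zip(base, neighbour)))
    return synthetic


# Hypothetical two-feature samples of a rare attack type (placeholder values).
minority = [(1.0, 2.0), (1.5, 2.5), (2.0, 2.0), (2.5, 3.0)]
new_points = smote(minority, n_new=5, k=2, seed=1)
print(new_points)
```

Each synthetic point lies on a line segment between two existing minority samples, so the rare class gains training examples without exact duplication.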

CCTV cameras are installed in public places to ensure the safety of individuals, organizations and institutions. Technological advancements have led to the mass production of budget-friendly CCTV cameras. Hence, people have become keen to install these devices in and around their property. Governments are also concerned about public safety, and the respective public departments allot money for installing surveillance cameras. There are millions of cameras installed throughout the globe, and they produce an enormous amount of data every second. Usually, this data is consulted during an investigation to collect proof. A few technologies automatically identify suspicious people by studying their actions. Nevertheless, the effectiveness of those technologies is still questionable.

Collecting data from the video surveillance system, subtracting the background, identifying objects and detecting the moving ones are a few of the steps followed in the video data mining process. Video surveillance systems concentrate on a region of interest (ROI); their focus is on a particular area or object, including humans. Decisions are made based on the suspicious action or movement of an object. In real scenarios, these systems mostly fail to identify the exact object. Kim et al. [9] proposed a new approach based on the Gaussian mixture model (GMM) and convolutional neural networks (CNNs). The GMM was used to extract moving objects through background subtraction, and the CNN classified the objects inside the ROI. This model identified even small objects moving in the distance, and its accuracy is comparatively better than that of the existing approaches.

Social disorganization theory suggests that the likelihood of a person becoming a criminal depends on the location where he or she lives. It argues that ecological or neighborhood factors contribute more to shaping a person's character than any other factors. Vomfell et al. [10] took social disorganization theory as a tool for analyzing crime data. They considered the population of the locality, points of interest (POI), taxi flows and the tweets of people who reside in the specified location. The crime analysis was done from a social and structural point of view. The collected data was analyzed with spatial linear regression, a Poisson generalized linear model (GLM) and machine learning methods. They examined property and violent crimes and confirmed the correlation between crime and spatial factors. Hence, the approach would be effective in crime prediction as well as in designing policing strategies.

Organized crimes like corruption, drug distribution, arms trafficking, smuggling, blackmailing, human trafficking, money laundering, robbery, gambling and murder are mostly committed by gangs. Since these crime organizations operate in a tight-knit environment, their activities are hard to perceive. Most of the time, legal agencies receive only partial information about gang crimes, and predicting a crime from partial data is a truly complex task. Specially trained police officers are deployed in the criminal identification process. They analyze the circumstances, the suspect, the victim and the type of crime, and come to a conclusion about the particular incident. However, this is neither a time-efficient approach nor an effective way to resolve cases. Machine learning algorithms, by contrast, identify patterns by analyzing the relationships between attributes. Missing data, however, affects the learning process and the accuracy of prediction. Seo et al. [11] proposed a solution to this issue with the help of a partially generative neural network (PGNN) architecture. It is effective in generating missing values and improves the performance of classification and prediction.

If an algorithm analyzes regions with the same demographic traits, it will obviously yield similar results. To overcome this challenge, inter- and intra-temporal–spatial patterns should be considered as primary factors. Zhao and Tang [12] recommended a new approach that deals with two regions sharing common characteristics. The model is trained on one region or administrative block. The learning is then transferred to and tested on another region.

Crimes can be partitioned in many ways; one of them is segmenting crimes by their severity. Mohd et al. [13] studied filtering methods for feature selection in crime data. Factors like race, income class, age, family structure, education, population, locality and unemployment rate were considered important entities in crime subdivision. The influence of these features on crime differs case by case; hence, finding the relevant features of a crime is considered a complex process. Appropriately identified features influence the outcome of the prediction algorithm. A hybrid feature selection method was also implemented in this research, and it was identified as the best performer compared with the other feature selection methods.

Hardyns et al. [14] used the term intelligence-led policing (ILP) in the sense of an integrative approach. It includes the explication of tactical strategies and intelligent pre-emptive policing. Crime hot spots are identified by a knowledge-based smart model. The Crime Anticipation System (CAS) of the Amsterdam Police Department is able to process 200 attributes and marks 3% of cells as high risk. The ensemble model is a combination of logistic regression and neural networks. In this study, the ensemble model managed to achieve a better balance between hit rate and precision. Such predictive policing mechanisms assist city law enforcement agencies by providing early forecasts.

Classification is a supervised learning approach, which helps computers to learn from the input data. There are many types of machine learning classification algorithms including linear classifiers, kernel estimation, support vector machines, decision trees, quadratic classifiers, neural networks and learning vector quantization.

Vural and Gok [15] chose the decision tree and the Naive Bayes classifier and assessed their capability. The decision tree algorithm represents possible decisions in a tree-like model. The tree comprises a root, branches and leaf nodes, which hold sets and subsets representing the attributes of the data. The Naive Bayes classifier is a probabilistic classification algorithm based on Bayes' theorem. It estimates the probability of an event given that the evidence has occurred, and it assumes that every feature is independent and contributes equally. Vural and Gok found a correlation between the size of the data and the accuracy of predictions: as the data size increases, the accuracy also improves. The Naive Bayes classifier outperformed the decision tree algorithm in precision. It also showed higher sensitivity when processing larger amounts of data. The effectiveness of the Naive Bayes classifier has already been proven in other domains like medicine, genetics and phishing. The outcome of this research confirmed that it can be used in crime prediction as well. Even though it surpassed the other machine learning classifier with its high accuracy rate, there is still a need for improvement.
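How a Naive Bayes classifier combines per-feature evidence under the independence assumption can be sketched as follows. This is an illustrative categorical Naive Bayes with Laplace smoothing, not the implementation evaluated in [15]. The class name `CategoricalNaiveBayes`, the feature choice (time of day, area type) and the labeled rows are hypothetical placeholders.

```python
from collections import Counter
from math import log


class CategoricalNaiveBayes:
    """Naive Bayes for categorical features with add-one (Laplace) smoothing."""

    def fit(self, rows, labels):
        self.class_counts = Counter(labels)
        n_features = len(rows[0])
        # value_counts[label][i][value] -> how often feature i took this value
        self.value_counts = {
            c: [Counter() for _ in range(n_features)] for c in self.class_counts
        }
        self.feature_values = [set() for _ in range(n_features)]
        for row, label in zip(rows, labels):
            for i, value in enumerate(row):
                self.value_counts[label][i][value] += 1
                self.feature_values[i].add(value)
        return self

    def log_posteriors(self, row):
        """Unnormalized log P(class | row) = log prior + sum of log likelihoods."""
        total = sum(self.class_counts.values())
        scores = {}
        for c, count in self.class_counts.items():
            score = log(count / total)
            for i, value in enumerate(row):
                num = self.value_counts[c][i][value] + 1
                den = count + len(self.feature_values[i])
                score += log(num / den)
            scores[c] = score
        return scores

    def predict(self, row):
        scores = self.log_posteriors(row)
        return max(scores, key=scores.get)


# Placeholder training rows: (time of day, area type) -> crime type.
rows = [
    ("night", "commercial"), ("night", "residential"), ("day", "commercial"),
    ("day", "commercial"), ("night", "residential"), ("day", "residential"),
]
labels = ["burglary", "burglary", "theft", "theft", "burglary", "theft"]

model = CategoricalNaiveBayes().fit(rows, labels)
print(model.predict(("night", "residential")))
```

Summing log-probabilities instead of multiplying raw probabilities avoids numerical underflow when many features are involved, and the smoothing term keeps an unseen feature value from forcing a class probability to zero.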

Whenever a crime hot spot identification process is initiated, it deals with spatial–temporal attributes. No crime can be defined without the place and time of the specific event. The combination of spatial–temporal data and socioeconomic factors forms the main attributes in crime hot spot identification. Socioeconomic factors hardly change within a specific period of time. If a user wants to predict crime hot spots weekly from these two kinds of attributes, the system would therefore show no variation in its predictions. Hence, Ding et al. [16] concentrated not only on hot spot identification but also on the borders of hot spots. They tried three different RNN architectures and proposed a spatio–temporal neural network (STNN)-based model. During the evaluation, the authors used a deep recurrent neural network algorithm, and it outperformed the traditional machine learning algorithms.

Location and environment are the prime factors of any crime. The characteristics of urban crimes differ from those of rural crimes. These days, crime prediction and prevention methods are included in urban planning processes. Urban areas are equipped with public transportation facilities like buses, taxis and trains. At the same time, transportation facilities are associated with crime incidents. These risks affect the day-to-day activities of the people who use those facilities. Geographic information systems (GIS), spatial clustering and artificial neural networks are used together in transportation risk identification tools. Data mining methods, specifically neural network-based algorithms, are used to assess the risks associated with transportation and to forecast crimes. As Kouziokas [17] emphasized, the scaled conjugate gradient algorithm learns quickly and is time-efficient compared with others. The optimum neural network model categorized the crime hot spot areas by examining spatial data.

The predictive analytical process consists of statistical and analytical tools that study historical and current data. These tools are built on algorithms that carefully analyze the relationships between different attributes and produce predictions about the future. Russell [18] suggested that the effectiveness of predictive modeling can be measured by validity, equity, reliability and usefulness. Validity is determined by the successful outcomes derived from the tests executed. The receiver operating characteristic (ROC) curve plots the true positive rate against the false positive rate of the tests. The true positive rate indicates the proportion of actual positive cases that the model correctly predicts as positive. The false positive rate represents the proportion of actual negative cases that the model incorrectly predicts as positive. True negative and false negative rates also play a vital role in validity. The performance of the model in sub-populations is measured by equity. A model should produce equivalent outcomes when different users analyze the same information; this is called reliability. The usefulness of the model is very important from the end user's point of view, irrespective of its effectiveness (Table 2).
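The rates that make up a ROC curve can be computed as in the following minimal sketch. The function names (`confusion_counts`, `rates`, `roc_points`), the labels and the risk scores are hypothetical placeholders, not taken from [18].

```python
def confusion_counts(y_true, y_pred):
    """Return (tp, fp, tn, fn) for binary labels coded as 1 and 0."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    return tp, fp, tn, fn


def rates(y_true, y_pred):
    """True positive rate TP / (TP + FN) and false positive rate FP / (FP + TN)."""
    tp, fp, tn, fn = confusion_counts(y_true, y_pred)
    tpr = tp / (tp + fn) if tp + fn else 0.0
    fpr = fp / (fp + tn) if fp + tn else 0.0
    return tpr, fpr


def roc_points(y_true, scores, thresholds):
    """One (fpr, tpr) point per threshold; a score >= threshold counts as positive."""
    points = []
    for threshold in thresholds:
        y_pred = [1 if s >= threshold else 0 for s in scores]
        tpr, fpr = rates(y_true, y_pred)
        points.append((fpr, tpr))
    return points


# Placeholder data: 1 = a crime occurred in the cell, score = predicted risk.
y_true = [1, 1, 0, 1, 0, 0]
scores = [0.9, 0.8, 0.7, 0.4, 0.3, 0.1]

for fpr, tpr in roc_points(y_true, scores, [1.0, 0.5, 0.0]):
    print(f"FPR = {fpr:.2f}, TPR = {tpr:.2f}")
```

Sweeping the threshold from high to low traces the curve from the point (0, 0), where nothing is flagged, to (1, 1), where everything is flagged; a model is more valid the closer its curve comes to the top-left corner.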

Table 2 Key findings and interpretation

4 Research Gap

Criminology is one of the oldest intellectual disciplines in the world. Crimes evolve with human civilization, and they challenge society with increasingly catastrophic effects. In response, society tries to build systems that would prevent the recurrence of crime. Systems like policing, law and firewalls can only be designed by analyzing existing crime incidents. The role of identified crime features in the classification and prediction process is irreplaceable. Crime features depend on socioeconomic, cultural and demographic factors. In particular, the influence of socioeconomic factors on crime is undeniable. These factors vary with the social structure and hierarchy.

Much of the Western world is structured by class hierarchy. Hence, researchers from developed countries have concentrated on the association between crimes and economic factors. Indian society, however, is still influenced by castes and religions. There are many subcultural groups inside these social structures. A few of these groups insist that others follow certain impractical customs and values. When this compulsion meets resistance, it leads to social unrest and crimes. The subcultural theory of criminology holds that such crimes are committed through learned behavior. The youth population of the problematic subgroups has to be addressed with strict legal and social policies to resolve these issues.

In India, hate crimes have been rising over the past few years. An Amnesty International report [19] records 721 such cases from 2015 to 2018. They were committed against religious minorities, Dalits, sexual minorities, women and children. The association of subcultural groups with these crimes can no longer be ignored.

During the feature identification process in data mining, the dominance of subgroups over crimes should be considered as an element along with other social factors. Hate crimes should be classified under a distinct category (for example, murder and honor killing should not be classified under the same group). The distinct categorization of hate crimes, together with spatial–temporal information and other data, would help to improve the accuracy of predictions (Fig. 1; Table 3).

Fig. 1 Inclusion of attributes related to subcultural groups in mining process

Table 3 Pseudocode of the workflow

5 Conclusion and Future Work

The features of an unlawful act play a significant role in crime occurrences. Since the features are determined by sociocultural factors, the feature identification process should be localized. The influence of economic factors on crimes has been investigated by many Indian researchers. Cultural factors also have a considerable effect, but their impact is yet to be explored in depth. Hence, this study proposed a new conceptual data mining approach toward crime feature identification.

In future, the impact of cultural factors on different crimes should be examined with machine learning algorithms. The crime data has to be classified based on the impact of the features. Deep learning algorithms will be used for knowledge extraction, pattern identification and prediction. Based on the accuracy of prediction, a new crime prediction model will be designed and implemented. This feature-based crime prediction model would be helpful in predictive policing and legal policy making.