1 Introduction

In recent years, cyberattacks have been emerging at a drastic pace. As per [1], the average cost of a cyberattack in 2017 was recorded as 153 million USD. Given the growing number and sophistication of malware, network security must be enhanced, particularly against threats from cyber threat actors (CTAs). In these circumstances, cyberthreat information plays a major role in indicating the attacks and vulnerabilities that a CTA employs in a particular campaign. The major limitation of present cyber threat intelligence (CTI) is that information sharing is largely limited to static and heuristic signature-based approaches, which cannot defend against dynamic and complex threats. Generally, CTI is classified into four categories: technical, strategic, operational, and tactical threat intelligence [2]. Strategic CTI concerns the identification and analysis of risks in the collected information to support decision-making. Operational CTI concerns vulnerabilities and attacks gathered from zero-day reports, the dark web, or hacker collectives in closed forums. Tactical CTI relates to CTAs and the tactics, techniques, and procedures (TTPs) of advanced persistent threats (APTs), which are an integral part of how a CTA operates and trains. APT TTPs characterize how a CTA interacts with the affected network and operating system. A CTA invests substantial effort in developing its operational modules, using customized software tools for attack campaigns. Because of this, it is difficult for a CTA to cope with changes in APT TTPs when a minor or new attack emerges within a given period [3]. Hence, attacks must be identified through the APT TTPs observed in attack incidents to prevent losses to the organization. Recent prevalent cyberattack incidents and CTA behavior are extracted to establish relations among APT TTP attack patterns [4].

CTI employs several technical resources such as intrusion detection systems and firewalls. Their indicators are attack signatures such as IP addresses, file hashes, Command and Control (C&C) domains, and virus signatures. These indicators are ineffective at identifying cyberthreats because CTAs modify them each time to bypass firewalls and intrusion detection systems [5, 6]. Tactical and strategic threat intelligence are considered effective tools and have a longer useful lifetime than technical and operational threat intelligence. Machine learning has shown significant performance in evaluating TTPs; hence, neural networks are well suited to evaluating TTPs in CTI.

Moreover, a burgeoning market sells stolen data and malicious tools, particularly in Eastern Europe and Russia. Research on enculturation into the hacker subculture examines how status is recognized within a particular hacking community. With Internet use in almost all countries, hacking has increased drastically. Throughout the world, mainstream political and social movements depend on broadcasting their ideologies. Several tactics have been applied in response to perceived injustices and to secure access for people. Movements such as the Zapatistas in Chiapas, Mexico, focused on post-information quality [7]. Chinese hackers have conducted wide-ranging cyberattacks in which sensitive information was obtained from US resources through network infrastructure. In April 2006, data were hacked in a conflict involving Russian and Estonian hackers [8,9,10,11,12].

On the Internet, user applications involve several assets, especially in industrial settings. Attacks begin by identifying the different assets in the network [13]. Hence, it is necessary to identify the assets that hackers target. In this research, a deep learning approach integrated with a metaheuristic approach is developed to identify APT TTPs and assets. The data for analysis are collected from several hacking forums and used to train the deep neural network. The focus domain is the CTI document, which is based on the Structured Threat Information Expression (STIX) standard. Through training and testing, APT TTPs are identified in the hacking forums. The results demonstrate that the file and discovery TTPs are performed most frequently by hackers. For evaluation, this research considers 15 APT TTPs. In the analysis of assets, hackers focus on technological aspects, particularly for hacking data.

This paper provides a metaheuristics-based approach developed for target assets. The proposed metaheuristic considers 15 target assets to predict which assets hackers target. The paper is organized as follows: Sect. 1 presents a general description of cyberassets. Sect. 2 presents existing work related to target assets. Sect. 3 presents the materials adopted for target assets along with the TTP regulatory framework. Sect. 4 presents the results obtained with the proposed metaheuristic approach. Sect. 5 states the overall conclusions.

2 Illustrations

The analysis of CTI focuses on proactive and reactive defense mechanisms for analytics. CTAs are characterized by their behavior, capability, and persona. From a cyberdefense perspective, profiling involves the collection of personal data [14]. In real-time scenarios, threats are estimated from attributes. Within hacking communities, CTAs use several software tools and assets for hacking campaigns [15].

In hacking communities, data on generic topics are a target asset for hackers. Data collected from social media are labeled with several emerging tools such as crypters, database and Web vulnerabilities, and keyloggers. These tools target particular vulnerabilities and exploit certain factors. This information also gives security administrators effective guidance for proactive defense. To identify attack patterns, it is necessary [16] to streamline the workflow memory and to perform Indicator of Compromise (IoC) analysis of the attack pattern generated in each phase of the attack chain using machine learning algorithms. Malware analysis provides automated solutions for evaluating the interrelationships among malware instances [17]. The review of the existing literature evaluates several domains to measure the significance of the tactics involved in CTI. Hence, this research focuses on the TTPs involved in the regularities of attacks on the system. The existing literature evaluates target assets in terms of attack detection. However, existing techniques focus only on classifying the attacks launched by attackers and fail to examine the assets that hackers target.

The APT TTPs of CTAs exhibit certain regularities. The framework comprises a feature extractor, a feature selector, and ARM miner modules, and it operates on CTI documents encoded using STIX. The CTI documents selected for this research are those concerning cyberthreats [18]. Each collected CTI document carries a unique ID attribute among its STIX elements. TTPs are represented as STIX elements and, like CTI documents, have unique IDs. Each TTP describes the sub-elements of its TTP features. For a selected CTI document, the feature extractor module gathers the TTP information, using the document's unique reference and the information stored in the database. The extracted features are passed to the feature selector module to identify the most effective APT TTPs. To represent CTAs effectively, several deep learning approaches have been implemented for attack identification.
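As an illustration of the feature extraction step, the following minimal sketch gathers TTP entries by their unique IDs from a STIX document. It assumes a STIX 2.x JSON bundle, in which TTP-like information is carried by `attack-pattern` objects; the STIX version used in this work may differ, and the sample bundle, its IDs, and the function name `extract_ttps` are hypothetical.

```python
import json

def extract_ttps(bundle_json):
    """Collect attack-pattern objects from a STIX 2.x bundle, keyed by unique ID."""
    bundle = json.loads(bundle_json)
    ttps = {}
    for obj in bundle.get("objects", []):
        if obj.get("type") != "attack-pattern":
            continue  # skip non-TTP objects such as malware or indicators
        ttps[obj["id"]] = {
            "name": obj.get("name", ""),
            "description": obj.get("description", ""),
            "phases": [p.get("phase_name") for p in obj.get("kill_chain_phases", [])],
        }
    return ttps

# Hypothetical bundle with one TTP and one unrelated object (illustrative only).
bundle_json = json.dumps({
    "type": "bundle",
    "id": "bundle--00000000-0000-4000-8000-000000000000",
    "objects": [
        {
            "type": "attack-pattern",
            "id": "attack-pattern--00000000-0000-4000-8000-000000000001",
            "name": "Example discovery technique",
            "description": "Placeholder description.",
            "kill_chain_phases": [
                {"kill_chain_name": "example-chain", "phase_name": "discovery"}
            ],
        },
        {
            "type": "malware",
            "id": "malware--00000000-0000-4000-8000-000000000002",
            "name": "Example malware",
        },
    ],
})

ttps = extract_ttps(bundle_json)
for ttp_id, info in ttps.items():
    print(ttp_id, info["name"], info["phases"])
```

The resulting dictionary could then be handed to a feature selector; how that selector ranks TTPs is not specified here.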

At present, deep neural networks are widely employed owing to the evolution of artificial neural networks. They allow training networks with tens of millions [19,20,21] or even billions of parameters [22]. To withstand cyberthreats, data from hacking forums are used to train and test models on large volumes of data. Little research has been conducted on what drives the hacker community. According to the Honeynet Project, hackers have six motivations: money, entertainment, ego, cause, entrance to a social group, and status [23]. These factors are the main reasons hackers steal particular data.

Although a deep neural network is similar to a conventional neural network, it is harder to train and requires an appropriate amount of data. Its major advantage is that it provides humans with information for predicting possible attacks on the network. A typical DNN pipeline ingests data from different sources, which yields sequences of artifacts for a probabilistic ML model that uses pre-trained components. Every model is trained with various procedures to identify artifacts from pre-existing, labelled sequence data. The output of the model is a set of probabilities, obtained by applying a set of procedures fed by other trained ML models [24].

Fig. 1 Architecture of target asset prediction and prioritization (TAPP)

The data collected from the hacking forums are split for training and testing. For the collected TTP data, with assigned variables, the deep network is trained using a cognitive agent. The cognitive agent has prior knowledge of previously observed attack models and of the asset inventory [25]. It determines the extent to which the sequence of methods reported by the ML models indicates that a given asset has been compromised. If the cognitive agent observes an ongoing attack, it can flag this to the human controller. It does so using a ranked list of probable operations for predicting the objective [26].

3 Proposed System

3.1 Target Asset Prediction and Prioritization (TAPP) Architecture

In comparison with the proposed scheme, Samtani et al. [2] collected data from various online hacker forums such as OpenSc, Reverse 4you, and ExploitIN. It is observed that the hacker details show sufficient trends, although their reliability and credibility are a concern because such data may mislead the defending community. Brynielsson et al. [1] examined data collected from cyberdefense exercises, whose main purpose is to train participants in the skills needed to defend a constructed network and to organize cyberwarfare. According to [1], data from such sources are credible and reliable, but the exercises are limited in scope and cannot represent a full-fledged CTA campaign. M. Lee and D. Lewis [11] examined the Symantec Anti-Virus (AV) and Intrusion Prevention System (IPS) corpus for tactical CTI, which is authentic and reliable for attack processing. The comparative analysis shows that the proposed technique offers a proactive and reactive defense strategy for recent CTA tactics (Fig. 1).

Data collection

To evaluate the TTP assets in industrial applications, deep learning with a metaheuristic approach is applied. For analysis, data were collected from a hacking forum. According to Alexa, a service that measures Web traffic, the forum is ranked No. 1 in the hacking forum subcategory. According to Alexa's analysis, the forum's users come from various countries, including India (16.7%), the USA (21%), and the UK (9.1%). The forum is semi-closed, and a login is required to access it. It is organized like other forums, with discussions arranged in threads. The post that initiates a thread is referred to as the header. Other users' comments on the header are included as replies, so a header post may be discussed as a thread and attract a wide range of replies. In this paper, an online hacker forum is used to assess target assets; the forum gives no information about users' intentions beyond the sharing of common interests. For data analysis, users are selected based on noticeably active posting in the forum.

The forum data were collected from February 2007 to August 2018. The downloaded data comprise a wide range of posts related to hacking. Specifically, the following types of information were considered in the dataset [27,28,29,30].

Post-centric data: Each post is either a header or a reply and includes post ID, post title, post category, post content, and post author.

User-centric data: The data registered in the forum are used for user reputation and include user ID, date, user level, and user name.
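The two record types and the activity-based user selection can be sketched as follows. The field names mirror the attributes listed above, while the class names, the `is_header` flag, the `min_posts` threshold, and the sample posts are illustrative assumptions rather than details of the actual dataset.

```python
from collections import Counter
from dataclasses import dataclass

@dataclass(frozen=True)
class Post:
    """Post-centric record: a header or a reply within a thread."""
    post_id: int
    title: str
    category: str
    content: str
    author: str
    is_header: bool  # assumed flag: True for the post that starts a thread

@dataclass(frozen=True)
class User:
    """User-centric record used for reputation."""
    user_id: int
    name: str
    level: str
    date: str  # kept as a plain string; its exact meaning is not specified

def active_users(posts, min_posts):
    """Return authors with at least `min_posts` posts (hypothetical threshold)."""
    counts = Counter(p.author for p in posts)
    return sorted(author for author, n in counts.items() if n >= min_posts)

# Illustrative sample only; not taken from the collected forum data.
posts = [
    Post(1, "Thread A", "tools", "header text", "alice", True),
    Post(2, "Thread A", "tools", "reply text", "bob", False),
    Post(3, "Thread B", "web", "header text", "alice", True),
]
selected = active_users(posts, min_posts=2)
print(selected)
```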

The final downloaded dataset contains information on 26,691 users with initiation of 90,054. The downloaded data also comprise 47,257 threads and 749, 9555 posts. The details of the collected forum are presented in Table 1.

Table 1 Forum details
Table 2 TTP and its description

APT TTPs

For analysis, 15 TTPs are identified from hacking forums and described. The selected APT TTPs are fed into the deep learning technique through training and testing. Predicting an operation includes the adversary techniques involved, and potential measures are used to predict the objective and future techniques. The human controller receives the information produced by the cognitive agent through deep learning. For deep learning, the selected feature sets are large, and associations among them need to be identified. Table 2 describes the APT TTPs used to prioritize the assets in our research.

Asset Inventory

Data from industrial control systems were evaluated to illustrate the assets schematically. Hacking forum data are used to predict and prioritize the assets. Through the optimization approach, assets are scheduled, which helps reduce attacks on the network [31]. This research considers 14 assets for evaluating priority in the network. The assets considered are listed in Table 3.

Table 3 Asset inventory

Metaheuristics of ABCTAPP

This paper uses industrial data on the assets targeted by hackers; for analysis, data were collected from Alexa. The collected dataset was processed with the artificial bee colony (ABC) metaheuristic algorithm as follows. Initially, the ABC optimization algorithm constructs elements at random positions within the boundary range of the asset values. Dataset elements with attribute 1 are selected for processing; all others are eliminated. The selection of elements is stated in Eq. (1).

$$x_{m} = l_{i} + {\text{rand}}(0,1)*(u_{i} - l_{i} )$$
(1)
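A minimal sketch of the initialization in Eq. (1) is given here, assuming each food source is a vector with one entry per asset attribute and that `lower` and `upper` hold the bounds \(l_i\) and \(u_i\). The function name, the bounds, and the number of sources are hypothetical.

```python
import random

def init_food_sources(lower, upper, n_sources, rng=None):
    """Eq. (1): x_m[i] = l_i + rand(0,1) * (u_i - l_i) for every dimension i."""
    rng = rng or random.Random()
    return [
        [l + rng.random() * (u - l) for l, u in zip(lower, upper)]
        for _ in range(n_sources)
    ]

# Hypothetical bounds for three asset attributes (illustrative values only).
lower = [0.0, 10.0, -1.0]
upper = [1.0, 20.0, 1.0]
sources = init_food_sources(lower, upper, n_sources=5, rng=random.Random(0))
for x in sources:
    print(x)
```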

In Eq. (1), the food source \(x_{m}\) denotes the assets. The parameters \(u_{i}\) and \(l_{i}\) are the upper and lower bounds of the solution space, and rand(0,1) is a random number in the range [0,1]. The assets targeted by the attacker are identified by considering the candidate solution given in Eq. (2):

$$v_{mi} = x_{mi} + \phi_{mi} \left( {x_{mi} - x_{ki} } \right)$$
(2)
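The candidate generation of Eq. (2) can be sketched as below. The sketch follows the common ABC convention of perturbing a single randomly chosen dimension i using a randomly chosen partner source k different from m, with \(\phi_{mi}\) drawn uniformly from [-1, 1]; the function name and the sample sources are hypothetical.

```python
import random

def neighbour(sources, m, rng=None):
    """Eq. (2): v_mi = x_mi + phi_mi * (x_mi - x_ki) on one random dimension i."""
    rng = rng or random.Random()
    x_m = sources[m]
    # Partner source k is chosen at random and must differ from m.
    k = rng.choice([j for j in range(len(sources)) if j != m])
    i = rng.randrange(len(x_m))    # randomly selected dimension
    phi = rng.uniform(-1.0, 1.0)   # random perturbation weight
    v = list(x_m)                  # copy so the original source is untouched
    v[i] = x_m[i] + phi * (x_m[i] - sources[k][i])
    return v, i

# Illustrative sources only.
sources = [[0.0, 0.0], [1.0, 1.0], [2.0, 2.0]]
candidate, changed_dim = neighbour(sources, m=0, rng=random.Random(1))
print(candidate, changed_dim)
```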

The index i is selected at random, \(x_{ki}\) denotes the corresponding attribute of a randomly selected food source, and \(\phi_{mi}\) is a random number in [−1, 1]. The candidate \(v_{mi}\) provides the estimated asset value, which is assessed with the fitness evaluation in Eq. (3):

$${\text{fit}}_{i} = \begin{cases} \dfrac{1}{f_{i} + 1}, & f_{i} > 0 \\ 1 + \left| f_{i} \right|, & f_{i} \le 0 \end{cases}$$
(3)
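Eq. (3) translates directly into a small function. The sketch assumes \(f_i\) is the objective value of a candidate and that smaller objective values are better, so that lower \(f_i\) yields higher fitness; the function name is hypothetical.

```python
def fitness(f):
    """Eq. (3): 1 / (f + 1) for f > 0, otherwise 1 + |f|."""
    if f > 0:
        return 1.0 / (f + 1.0)
    return 1.0 + abs(f)

# Fitness decreases monotonically as the objective value grows.
for f in (-2.0, 0.0, 1.0, 3.0):
    print(f, fitness(f))
```

Both branches agree at \(f_i = 0\), where the fitness equals 1, so the mapping is continuous.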

In Eq. (3), \(f_{i}\) is the value of the asset objective function, from which the fitness of an asset targeted by the hacker is obtained. From the fitness of the individual assets, the probability of selecting each asset is estimated using Eq. (4):

$$p_{i} = \frac{{{\text{fit}}_{i} }}{{\sum\nolimits_{n = 1}^{N} {{\text{fit}}_{n} } }}$$
(4)
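The selection probabilities of Eq. (4) normalize the fitness values so that they sum to one. The sketch below also shows a roulette-wheel draw based on these probabilities, which is the usual way they are used in ABC; the function names and the sample fitness values are hypothetical.

```python
import random

def selection_probabilities(fits):
    """Eq. (4): p_i = fit_i / sum of fit_n over n = 1..N."""
    total = sum(fits)
    return [f / total for f in fits]

def roulette_pick(probs, rng=None):
    """Pick one index with probability proportional to probs."""
    rng = rng or random.Random()
    return rng.choices(range(len(probs)), weights=probs, k=1)[0]

# Illustrative fitness values for three assets.
fits = [0.5, 1.0, 2.5]
probs = selection_probabilities(fits)
print(probs)
print(roulette_pick(probs, random.Random(0)))
```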

N denotes the total number of assets targeted by the hackers, and \({\text{fit}}_{i}\) is the fitness value used to identify the asset targeted by the hackers. The ABC algorithm for target asset prediction and prioritization (TAPP) in hacking forums is presented below:

[Figure a: ABC algorithm for TAPP]
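For orientation, a compact sketch of a standard ABC loop built from Eqs. (1) to (4) follows. It is not a reproduction of the listing referred to above: the employed, onlooker, and scout phases, the abandonment `limit`, the clamping to bounds, and the sphere objective used in the demo are assumptions taken from the commonly described form of the algorithm.

```python
import random

def fitness(f):
    """Eq. (3): map an objective value to a fitness value."""
    return 1.0 / (f + 1.0) if f > 0 else 1.0 + abs(f)

def abc_minimise(objective, lower, upper, n_sources=10, limit=20, cycles=200, seed=0):
    """Minimise `objective` inside the box [lower, upper] with a basic ABC loop."""
    rng = random.Random(seed)
    dim = len(lower)

    def random_source():
        # Eq. (1): random position within the bounds.
        return [lower[i] + rng.random() * (upper[i] - lower[i]) for i in range(dim)]

    xs = [random_source() for _ in range(n_sources)]
    fs = [objective(x) for x in xs]
    trials = [0] * n_sources  # consecutive failed improvements per source
    best = min(range(n_sources), key=lambda m: fs[m])
    best_x, best_f = list(xs[best]), fs[best]

    def try_neighbour(m):
        nonlocal best_x, best_f
        k = rng.choice([j for j in range(n_sources) if j != m])
        i = rng.randrange(dim)
        v = list(xs[m])
        # Eq. (2), followed by clamping to the bounds (an added assumption).
        v[i] += rng.uniform(-1.0, 1.0) * (xs[m][i] - xs[k][i])
        v[i] = min(max(v[i], lower[i]), upper[i])
        fv = objective(v)
        if fitness(fv) > fitness(fs[m]):  # greedy replacement
            xs[m], fs[m], trials[m] = v, fv, 0
            if fv < best_f:
                best_x, best_f = list(v), fv
        else:
            trials[m] += 1

    for _ in range(cycles):
        for m in range(n_sources):  # employed phase
            try_neighbour(m)
        fits = [fitness(f) for f in fs]
        # Onlooker phase: weights proportional to fitness, as in Eq. (4).
        for m in rng.choices(range(n_sources), weights=fits, k=n_sources):
            try_neighbour(m)
        for m in range(n_sources):  # scout phase: abandon stagnant sources
            if trials[m] > limit:
                xs[m] = random_source()
                fs[m] = objective(xs[m])
                trials[m] = 0
                if fs[m] < best_f:
                    best_x, best_f = list(xs[m]), fs[m]
    return best_x, best_f

# Demo on a hypothetical objective (sphere function), not on asset data.
best_x, best_f = abc_minimise(lambda x: sum(t * t for t in x), [-5.0, -5.0], [5.0, 5.0])
print(best_x, best_f)
```

In the TAPP setting, the objective would be replaced by the asset objective function \(f_i\), whose exact form is not given in the text.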

Deep Learning—Deep Neural Network (DNN)

For Fig. 2, let the set of samples (assets) be \(D = \left\{ {\left( {X_{i} ,Y_{i} } \right)} \right\}_{i = 1}^{N}\). The data collected from the Web sites over the time period \(\left[ {\Gamma - T + 1,\Gamma } \right]\) of length T are represented as \(X_{i}\). The prediction of assets targeted by hackers is denoted as \(Y_{i} \in Y = \left\{ {0,1} \right\}\) and refers to the target window \(\left[ {\Gamma + 1,\Gamma + \tau } \right]\) of length τ. The user data \(X_{i}\) involve three heterogeneous primitive sub-components: data observed at various granularity \(X_{ia}\), dynamic user information \(X_{id}\), and static user profiles \(X_{is}\), as given in Eq. (5) (Fig. 3) [31]:

$$X_{i} = \left( {X_{ia} ,X_{id} ,X_{is} } \right) \in X$$
(5)

Fig. 2 ABCTAPP

Fig. 3 Schematic overview of assets

For the log components, the window of time span T is considered, over which the dynamic information is denoted as in Eq. (6):

$$X_{id} = X\left( {\Gamma - T + 1} \right)_{id} ,X\left( {\Gamma - T + 2} \right)_{id} , \ldots ,X\left( {\Gamma - 1} \right)_{id} ,X\left( \Gamma \right)_{id}$$
(6)
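The windows used in Eq. (6) and in the label definition can be sketched as simple slices. The sketch assumes a time series stored in a Python list whose first element corresponds to time step 1; the function names and the sample series are hypothetical.

```python
def dynamic_window(series, gamma, T):
    """Eq. (6): return [X(Γ-T+1), ..., X(Γ)] from a series indexed from time 1."""
    if T < 1 or gamma - T + 1 < 1 or gamma > len(series):
        raise ValueError("window [gamma - T + 1, gamma] falls outside the series")
    return series[gamma - T:gamma]

def target_window(series, gamma, tau):
    """Return the values in the target window [Γ+1, Γ+τ]."""
    if tau < 1 or gamma < 0 or gamma + tau > len(series):
        raise ValueError("window [gamma + 1, gamma + tau] falls outside the series")
    return series[gamma:gamma + tau]

# Illustrative series for time steps 1..6.
series = [10, 11, 12, 13, 14, 15]
observed = dynamic_window(series, gamma=5, T=3)  # time steps 3, 4, 5
future = target_window(series, gamma=5, tau=1)   # time step 6
print(observed, future)
```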

The deep learning approach learns a mapping rule from the feature space to the label space, represented as R(·): X → Y, and R(·) is subsequently used to estimate future samples. The probability that sample i is in attrition can be denoted as \(p\left( {y_{i} = \left. 1 \right|X} \right)\).
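One way to realize such a mapping is a small feed-forward network whose final sigmoid unit outputs \(p(y_i = 1 | X)\). The sketch below shows only the forward pass; the layer sizes, the ReLU activation, and the weights are hypothetical placeholders, and no claim is made that they match the network used in this work.

```python
import math

def sigmoid(z):
    """Logistic function mapping a real value to (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))

def predict_proba(x, layers):
    """Forward pass: ReLU hidden layers, then a single sigmoid output unit.

    `layers` is a list of (W, b) pairs; W is a list of rows, b a list of biases.
    """
    h = x
    for W, b in layers[:-1]:
        h = [max(0.0, sum(w * a for w, a in zip(row, h)) + bias)
             for row, bias in zip(W, b)]
    W_out, b_out = layers[-1]
    z = sum(w * a for w, a in zip(W_out[0], h)) + b_out[0]
    return sigmoid(z)

# Hypothetical weights: 2 inputs -> 2 hidden units -> 1 output.
layers = [
    ([[1.0, -1.0], [0.5, 0.5]], [0.0, 0.0]),
    ([[1.0, -1.0]], [0.0]),
]
p = predict_proba([2.0, 1.0], layers)
label = 1 if p >= 0.5 else 0  # threshold of 0.5 is an assumption
print(p, label)
```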

4 Experimental Analysis and Results

The metrics considered for comparison are data sources, CTI type, data features, defense strategy, and outcome. The data source metric denotes the source from which the analyzed data were collected; it bears on the authenticity and credibility of the analysis. For future comparisons, the data sources provide a ground-truth baseline. As stated earlier, this research uses the ATT&CK dataset [16]. The APT TTP list extracted from ATT&CK covers each phase of the intrusion Cyber Kill Chain, connecting CTAs with the software they use. Fig. 4 presents the TTPs targeted by attackers in the industrial application. The analysis of the proactive and reactive defense strategy shows that IC and SUD exhibit the highest TTP utilization by hackers, at 93%. DC has a TTP utilization of 92%, failed directory deletion (FDD) 90%, FD 85%, VA 83%, CD 78%, and WAS and WMI 73% and 70%, respectively (Tables 4 and 5).
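The utilization rates quoted above can be ranked with a few lines of code, which makes the ordering of the TTPs explicit. The values are copied from the paragraph; the variable names are illustrative, and ties are broken alphabetically as an arbitrary choice.

```python
# TTP utilization rates in percent, as reported in the text.
rates = {
    "IC": 93, "SUD": 93, "DC": 92, "FDD": 90, "FD": 85,
    "VA": 83, "CD": 78, "WAS": 73, "WMI": 70,
}

# Sort by descending rate; ties are resolved by name.
ranked = sorted(rates.items(), key=lambda kv: (-kv[1], kv[0]))
for name, rate in ranked:
    print(f"{name}: {rate}%")
```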

Fig. 4 APT TTP utilization in each phase of the intrusion chain

Table 4 Data sources with hacker’s forum
Table 5 Applying the ABCTAPP with DNN to calculate target asset value

Table 4 shows that data can be collected from various sources such as cyberdefense exercises [1], online hacker forums such as OpenSc, Reverse 4you, and ExploitIN [2], and the Symantec Anti-Virus (AV) and Intrusion Prevention System (IPS) corpus [16]. In comparison with existing techniques, Brynielsson et al. [1] state that the purpose of the exercises is to train participants in the skills needed to defend a constructed network and to organize cyberwarfare. Samtani et al. [2] observe that hacker details show sufficient trends, although their reliability and credibility are a concern because such data may mislead the defending community. M. Lee and D. Lewis [16] examined their corpus for tactical CTI, which is authentic and reliable for attack processing. The proposed analysis technique is a combined model of deep neural networks and ABCTAPP. Table 5 offers a proactive and reactive defense strategy for recent CTA tactics. As noted for Fig. 4, IC and SUD show the highest TTP utilization by hackers (93%), followed by DC (92%), FDD (90%), FD (85%), VA (83%), CD (78%), WAS (73%), and WMI (70%). The asset prioritization is presented in Fig. 5; the collected dataset contains 667 users who exhibit a similar pattern in the casual hacker forum. The analysis shows that the behavioral pattern decreases with knowledge provision. The examination of assets shows that IP address, protocol, and role number are the assets identified first by the attackers. At the secondary stage, serial number and host name are targeted. The graphical representation of the target asset values calculated by ABCTAPP with DNN shows which assets are targeted most heavily by the attackers (Fig. 6).

Fig. 5 Target asset prediction and prioritization

Fig. 6 ABCTAPP with DNN to calculate target asset value

5 Conclusion

TTPs represent the behavior of a CTA when interacting with the victim's resources, such as the operating system and network. This research adopted a deep learning method to identify the TTPs used in a hacking forum. Through the application of a deep neural network, single and multiple target assets have been evaluated. The collected cyberassets fall into the categories of process, people, technology, and infrastructure (PPTI). The assets are prioritized with the help of a DNN trained on 15 TTPs, with the asset inventory processed by the ABC algorithm. The simulation analysis shows that IC and SCO are the most heavily utilized in the hacking forum. Further, hackers use protocol, host address, and IP address as targeted assets in the network. In the future, this research can be applied to other wireless communication systems such as WSN and IoT for attack identification.