
1 Introduction

A so-called “phishing attack” is a cyber crime in which an attacker (the phisher) deploys a website that mimics another site in order to induce victims to provide sensitive information. Although significant efforts to defend against phishing attacks have been made both in academia and in industry, the fight between attackers and defenders continues. The Anti-Phishing Working Group (APWG) reports having detected 266,387 phishing sites during the third quarter of 2019, the highest number in three years, with more than two thirds of them using SSL, the highest percentage since this metric started being tracked [5]. Phishers keep improving their techniques to avoid detection, for example by using SSL or adding multiple redirections [6].

Most of the literature on anti-phishing focuses either on detecting phishing emails sent to the victims (e.g. [11, 22, 26, 36, 49, 53] and many more) or on detecting phishing web pages (e.g. [2, 12, 13, 18, 23, 32, 44, 45, 59] and many more). These solutions are centered on the victims: the goal is to protect the victims from the attacks, that is, to cut off the channel between victims and attacks. However, very little work has been done on the channel between the attackers and the attacks, even though that channel is just as critical: breaking it defeats the attack. This is the topic of this paper, and more specifically the channel through which the attacker collects the stolen information from the phishing site. This approach helps web hosting providers and network owners combat phishing by detecting immediately that an attack is being deployed on their network. It is a new idea that is not centered on the actual victims (most victims have no connection to the network on which the attack is being deployed), and thus it is a new tool that can work in combination with existing ones.

In most phishing attacks, the stolen information is exfiltrated back to the phisher by email: the code of the phishing site simply sends an email to a “drop address” each time someone submits something to the phishing site. Each email contains the data submitted by one victim. In the case of a multi-page phishing site, several emails are often sent for a single victim. Therefore, detecting and blocking these emails is a different and complementary means of combating phishing attacks. In the following, we call these emails sent by the phishing site to the phisher “exfiltrating emails”.

In this paper, we evaluate three different machine learning techniques to detect exfiltrating emails: word-based, pattern-based and structure-based detection. We test the robustness of our three models against potential attacks. Although all three models are shown to be very effective at detecting these messages, the deep-learning model, called DeepPK, is the best overall since it remains quite effective even when the messages are altered to avoid detection. The key idea of DeepPK is to treat the email structure as a sequence of components that follows specific grammar rules. The key component of DeepPK is a bidirectional Long Short-Term Memory (LSTM) network [46]. It allows DeepPK to automatically learn the difference between the structural grammar rules of exfiltrating emails and those of regular emails. In order to effectively represent email structure, we propose a new encoding method, the structure token, which uses a small vocabulary of only 14 symbols.

We train and test our models on a realistic database of exfiltrating emails. These emails are built from the combination of two real datasets: a database of exfiltrating emails generated by actual “phishing kits”, which gives us the patterns of the exfiltration emails but not the data provided by the victims, and a database of values that have been submitted to real phishing sites. By combining these two databases, we end up with a system that can generate a large number of exfiltrating emails. In this paper, we use almost 65,000 such messages to train and test our models.

The solution described in this paper has a key advantage over most existing ones: it does not require the attack to be first reported, or to be somehow actively discovered. Instead, it is the attack’s own network traffic that is detected and stopped. Therefore, this technique can be used to stop a phishing attack immediately, preventing even a single delivery of stolen information to the phisher.

The paper is organized as follows: In Sect. 2 we explain what detecting exfiltrating emails achieves that current phishing detection methods do not. In Sect. 3, we present our exfiltrating email database. Then in Sect. 4, we introduce our machine learning-based approaches. In Sect. 5, we present the evaluation of our models, followed by the robustness tests in Sect. 5.4 (with further details in Appendix B). We provide an overview of the literature in Sect. 6 before concluding in Sect. 7. All of the source code and some of the non-sensitive data used in this paper will be made available after the anonymous review.

2 Motivations

As already mentioned, most anti-phishing efforts are directed at protecting victims, either by preventing the attacker’s message from reaching its target, or by detecting that a site is not genuine. However, not every potential victim uses these mechanisms, and even when they do, these mechanisms are not perfect: for instance, Hu et al. [29] have shown that even email providers that do apply anti-spoofing detection techniques do not always prevent forged emails from reaching the victims. It is therefore also important to help network administrators proactively detect that a phishing site has been deployed on their network, without being notified of the URL first. Such a phishing site can be deployed on a network because the attacker has compromised one of the servers, or because the attacker has a legitimate right to deploy a website there. Very little work has been done in this area. Of course, one could use any phishing site detection method and scan the network to look for such sites, but this can be difficult due to the network’s size and the lack of control over what is being deployed there. More importantly, scanning would probably yield limited success without first somehow knowing the actual URL of the phishing site on the server. Waiting to be notified about the attack has the obvious disadvantage of being out of the control of the network’s administrator, and of opening a window of time during which the attack is live on the network. Closing that gap is necessary to prevent victims from providing data to the phishing site, and to preserve the reputation of the network, which may otherwise end up being blacklisted.

Detecting exfiltrating emails is a new tool that provides web hosting providers with a new way to learn that a phishing attack is hosted on their network and to stop it immediately, by monitoring outgoing emails instead of scanning their own network. It removes the need to find out or be informed of the exact URL of the phishing attack, since it is the network traffic of the phishing site itself that triggers the detection. Another considerable advantage of such a system is that it can detect and block the attack as soon as someone submits data to the site, preventing the attackers from collecting any information. In addition, drop email addresses are uncovered and can be reported to the email providers and suitable authorities.

This tool will be useful to web hosting companies, but also to any entity managing a large and relatively open network, such as a university.

As already demonstrated in [52], our work can also be useful to email providers: it is possible to reliably and rapidly detect that one of their mailboxes is the recipient of such exfiltrating emails and to block access to it immediately, again completely preventing the attacker from accessing the data. This is of particular interest to free email providers, which are used extensively by phishers to create dedicated drop email addresses.

In this research, we focus on phishing attacks that use clear-text drop emails as their exfiltration technique. It often surprises the academic community that such a basic and vulnerable exfiltration technique is used in practice. A vast array of other techniques is of course possible to exfiltrate the data, including but not limited to simple email encryption, pushing the data out using another protocol such as http(s) or (s)ftp, more covert methods such as DNS-based exfiltration [38], or storing the data on the server and switching from a push-based model to a pull-based model. Several easy-to-find implementations of phishing-kit proof-of-concepts do in fact provide alternate ways of exfiltrating data. In practice, the almost exclusive reliance on plain-text emails is well documented by practitioners [20, 31, 34, 39, 42, 52, 57]. Most recently, [52] reports that all of the 10,000 kits analyzed in that study use the PHP mail() command to exfiltrate data. In [31], an analysis of 1,000 phishing kits done in 2018 found that “the vast majority of kits (98%) used email to exfiltrate stolen data to attackers”. Another study from 2018, [39], does not mention any other means of data exfiltration. This also matches our own empirical evidence from working with well over 10,000 live attacks over the past couple of years: attackers today use almost exclusively clear-text drop emails for data exfiltration. Even when the phishing kit offers other alternatives (usually some level of encryption), these alternatives are almost never enabled in live attacks. One explanation for this is that phishing attacks are very low-skill attacks, and any complication would negatively impact the model (see Sect. 7 for more discussion). It is also possible that only some of the attacks use clear-text drop emails, and that for some reason these are the attacks that we discover. Even if that is the case, it remains that a large number of attacks use clear-text drop emails as their exfiltration technique, and stopping these ones is a step in the right direction.

Our tool is not meant to replace existing ones. Detecting classical phishing emails is still necessary but serves a different purpose: it prevents email users of the domain from being victimized by phishing sites that are usually hosted somewhere else. Our tool is a new and effective mechanism to secure networks against hosting phishing attacks themselves. Classical phishing email detection does not provide any direct protection against that.

We believe that there are two main contributions in this paper: first, we provide a new direction for detecting exfiltrating emails using neural networks trained on the structural information of the message. We introduce an encoding method that effectively extracts that structural information with only 14 symbols. Second, we identify a missing piece in the fight against phishing. The hosting mechanism and the data exfiltration techniques are an essential and somewhat overlooked part of the equation. The detection models that we present here work very effectively on current phishing attacks. Other detection models might be as effective, and attackers will certainly take countermeasures to prevent detection in the future. Nevertheless, it remains that web hosting providers must now be included in the defense against phishing, and that proactive techniques such as the one presented here must be developed and maintained as the situation evolves.

3 Exfiltrating Email Generation

One difficulty with this research is gaining access to exfiltrating emails to train and test the models. We are not aware of any such database prior to this work. Some prior work may have had indirect access to some exfiltrating emails (e.g. using honeypots [27]), but only in limited quantity.

In this work, the starting point for the generation of exfiltrating emails is two datasets that the forensic teams of our industry partners have collected from real attacks:

  1. A set of 3,162 distinct phishing kits, which are actual phishing websites written in the PHP language, and

  2. a collection of 370 files containing various amounts of data collected by real phishing sites.

The generation process involves three stages, described in the next subsections: Phishing Kit Deployment, Data File Parsing, and Email Generation.

3.1 Phishing Kit Deployment

Each phishing kit is deployed in a custom sandbox environment. By redefining functions and certain global objects of the PHP standard library used by the phishing kits, we arrange for calls requesting the values of HTTP GET/POST request variables and cookies to return special placeholder values, which we can later use to identify which value was requested (e.g. a POST request variable named “username”).

Any email messages sent are captured. These messages are parsed, identifying all special placeholder values as well as a small number of special patterns including IP address, date/time and user agent; the end result is a sequence of static strings and dynamic value specifications termed an email template. A sample is shown in Fig. 1. A total of 6,448 unique email templates acceptable for use in the subsequent steps are generated from the data. As previously noted, phishing kits often send more than one message, either because the attack is done in several steps and each step triggers a separate message, or because the phishing kit contains more than one phishing site.
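As an illustration of this parsing step, the following Python sketch turns a captured message into a template. The placeholder marker format (e.g. "__PK_POST_username__") and the regular expressions used for the special patterns are assumptions made for this sketch, not the exact markers used by our sandbox.

import re

# Hypothetical marker format: the sandbox is assumed to return a value such as
# "__PK_POST_username__" when the kit reads $_POST['username'].
PLACEHOLDER_RE = re.compile(r"__PK_(GET|POST|COOKIE)_([A-Za-z0-9_]+)__")
IP_RE = re.compile(r"\b(?:\d{1,3}\.){3}\d{1,3}\b")
DATETIME_RE = re.compile(r"\b\d{4}-\d{2}-\d{2}[ T]\d{2}:\d{2}(?::\d{2})?\b")

def extract_template(captured_body):
    """Turn a captured message into a sequence of static strings and value slots."""
    # Dynamic values coming from request variables or cookies become named slots.
    template = PLACEHOLDER_RE.sub(lambda m: "{%s:%s}" % (m.group(1).lower(), m.group(2)), captured_body)
    # A few special patterns (victim IP address, date/time) also become slots.
    template = IP_RE.sub("{ip}", template)
    template = DATETIME_RE.sub("{datetime}", template)
    return template

print(extract_template("User: __PK_POST_username__\nPass: __PK_POST_password__\nIP: 203.0.113.7"))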

Fig. 1. Sample exfiltrating email template extracted from a phishing kit (manually modified for obfuscation)

3.2 Data File Parsing

Our data files contain sets of values that have been collected during phishing attacks and recovered by forensics teams. These values correspond to what the victims provide to the phishing site (and thus to what is then exfiltrated in the emails). The type of data found in this dataset is what one expects from a phishing site: mostly credentials for websites and other systems, but also credit card information and other personal information. In addition, the IP address of the victim, the time of access, the type of browser, etc. are often collected by phishers.

It is worth noting that in a typical phishing attack, the majority of the values submitted to the site are not genuine. Instead, the majority of the inputs seem to come from users attempting to “get back” at the phishers by submitting a flurry of random data, insults and denial-of-service attempts. Nevertheless, these are the values that a typical phishing attack will receive and exfiltrate, and thus all of these values are valid and indeed necessary for our purpose.

We parsed all of our data files to extract the individual values and match them to the values requested by the phishing kits. The end result of this process is the population of an Exfiltration Database with data for 115,713 entries comprising 332,224 values.

3.3 Email Generation

The general idea is to generate emails from each email template by filling in placeholders using data from the Exfiltration Database. For each of the 6,448 email templates, we generate 10 email messages by randomly filling in placeholders using data from the database. When doing so, we require that all template values be populated, although we do not insist that the data all belong to a single entry or even come from the same file. This resulted in 64,480 exfiltration emails; two examples are provided in Fig. 2. To ensure that our models are trained and tested on different datasets, email messages coming from the same template are either all used for training or all used for testing.
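A minimal sketch of this generation step is given below, reusing the hypothetical slot syntax from the previous sketch; the layout of the Exfiltration Database (a mapping from slot names to lists of recovered values) is likewise an assumption made for illustration.

import random

def generate_instances(template, slots, exfil_db, n=10, rng=None):
    """Fill each slot of a template with randomly drawn stolen values, n times."""
    rng = rng or random.Random(0)
    emails = []
    for _ in range(n):
        body = template
        for slot in slots:
            # Values need not all come from a single victim entry or a single file.
            body = body.replace("{%s}" % slot, rng.choice(exfil_db[slot]))
        emails.append(body)
    return emails

db = {"post:username": ["alice@example.com", "bob@example.org"],
      "post:password": ["hunter2", "123456"],
      "ip": ["198.51.100.4"]}
template = "User: {post:username}\nPass: {post:password}\nIP: {ip}"
print(generate_instances(template, ["post:username", "post:password", "ip"], db, n=2))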

Fig. 2. Two instances of exfiltrating emails generated from the template of Fig. 1 (values manually obfuscated)

4 Methodology

We have trained three different models to recognize exfiltrating emails. In this section, we first introduce two approaches that are commonly used in email classification: the word-based and pattern-based detection models. We then introduce our structure-based model.

4.1 Word-Based Detection Model

Naive Bayes approaches have been shown to be very successful in text classification tasks [8, 58]. Therefore, we included one such implementation among our exfiltrating email classifiers.

Specifically, the model learns from the training set the probability of each word conditioned on each category, along with the prior probability of each category, and uses these probabilities to predict the probability that a new text belongs to a given category. Formally, we work from a set of documents consisting of n unique word tokens \([w_1, w_2,\ldots , w_n]\). These documents are classified into p categories \([C_1, C_2,\ldots ,C_p]\). Each document can be represented as a vector \(x=(x_1, ..., x_n)\), where \(x_i\) represents the relative weight of \(w_i\) in that document.

In our case, to effectively represent word features, we first extract runs of consecutive alphanumeric characters using the regexp [0-9A-Za-z]+ to get a “word” list. We then form 1-grams and 2-grams to create word tokens. The corpus of the model is built using the 5,000 most frequent tokens. We apply a “scaled term frequency” to calculate the frequency of each token. Formally, the scaled term frequency of the word token \(w_i\) in the document \(d_j\) is

$$\begin{aligned} 1+\log ( \# \text { of occurrences of } w_i \text { in the document } d_j). \end{aligned}$$

We then apply tf-idf using the scaled tf to vectorize the tokens. For vector normalization, we apply an “L2” normalization: the sum of squares of vector elements is 1. Finally, for each document (email), we end up with a 5,000-dimension vector.

The probability that a document of vector \((x_1, ..., x_n)\) belongs to the category \(C_k\) is \(p(C_k|x_1, ..., x_n)=\frac{p(C_k)\prod _{i=1}^np(x_i|C_k)}{p(x_1, ..., x_n)}\).

Note that \(x_i\) is the TF-IDF value of the word token \(w_i\), which depends only on the set of documents. In other words, given a set of documents, \(p(x_1, ..., x_n)\) is the same constant for every category \(C_k\). Therefore, \(p(C_k|x_1, ..., x_n)\) is proportional to \(p(C_k)\prod _{i=1}^np(x_i|C_k)\). We apply the Gaussian Naive Bayes algorithm to estimate the likelihood of features, \(p(x_i|C_k)=\frac{1}{\sqrt{2\pi {\sigma _{C_k}}^2}}\text {exp}\left( -\frac{(x_i-\mu _{C_k})^2}{2{\sigma _{C_k}}^2}\right) \), where the parameters \(\sigma _{C_k}\) and \(\mu _{C_k}\) are learnt by the model during training. \(p(C_k)\) is also a learnable parameter, equal to

$$\begin{aligned} \frac{\# \text { of documents in }k^{th}\text { category}}{\# \text { of documents}} \end{aligned}$$

Once the model is trained, it is used to assign a new document of vector \(x'_1, ..., x'_n\) to the category \(C_i\) which maximizes \(p(C_k|x'_1, ..., x'_n)\). In the following sections, we name this model NB.
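As a rough sketch of this pipeline, the vectorization and classification steps can be written as follows using Scikit-learn (which our implementation is also based on, see Sect. 5.1); the exact vectorizer parameters below are our mapping of the description above and should be read as assumptions.

import numpy as np
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.naive_bayes import GaussianNB

# 1- and 2-grams over alphanumeric "words", 5,000 most frequent tokens,
# scaled (sublinear) term frequency, tf-idf weighting and L2 normalization.
vectorizer = TfidfVectorizer(token_pattern=r"[0-9A-Za-z]+",
                             ngram_range=(1, 2),
                             max_features=5000,
                             sublinear_tf=True,
                             norm="l2")

def train_nb(train_texts, train_labels):
    X = vectorizer.fit_transform(train_texts).toarray()   # GaussianNB needs dense input
    clf = GaussianNB()
    clf.fit(X, np.asarray(train_labels))                   # learns mu, sigma and class priors
    return clf

def predict_nb(clf, texts):
    return clf.predict(vectorizer.transform(texts).toarray())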

4.2 Pattern-Based Detection Model

In addition to using a different set of words (when compared to regular emails), exfiltrating emails also tend to follow distinctive patterns. For instance, they are often organized following the format: <header> + <field name> + <delimiter> + <value>. Therefore, we also trained a classifier that looks for patterns. We first encode the content of the messages using only five character classes: letters (L), digits (D), punctuation (P), newline (N) and whitespace other than newline (W). Each email is first encoded using these five classes. We then compute all n-grams of lengths 10 to 16 on the two encoded email sets, exfiltrating emails and regular emails, and we keep only the n-grams that appear in exactly one of the two sets, that is, n-grams that are found at least once in the exfiltrating (resp. regular) training set but never appear in the regular (resp. exfiltrating) training set. A greedy set cover algorithm is then applied to obtain a smaller token cover set that covers the same set of documents. We derive a classifier that uses only the token cover set of the exfiltrating email class: a document is classified as exfiltrating if and only if its token set contains one of the tokens in the exfiltrating token cover set.

Formally, let t be a tokenizer function and A and B be email classes. Let \(D(A,B) = \bigcup t(A) \setminus \bigcup t(B)\) and similarly let \(D(B,A) = \bigcup t(B) \setminus \bigcup t(A)\). Then, using a set cover algorithm, select a small subset \(C(A,B) \subseteq D(A,B)\) such that \(\{ m\in A \mid t(m)\cap D(A,B) \ne \emptyset \} = \{ m\in A \mid t(m)\cap C(A,B) \ne \emptyset \}\), and similarly for \(C(B,A)\). Let \(C_0\) be the set of clean messages and \(C_1\) be the set of exfiltrating emails. Define a classifier c by

$$ c(M)= {\left\{ \begin{array}{ll} 1 &{} \text { if } t(M) \cap C(C_1,C_0) \ne \emptyset \\ 0 &{} \text {otherwise} \end{array}\right. } $$

In the following sections, we name this model Set-cover.
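The sketch below illustrates the main steps under the assumption that a token is any 10- to 16-character n-gram of the five-class encoding; the greedy cover simply picks, at each step, the discriminative token covering the largest number of still-uncovered exfiltrating messages.

def encode(text):
    """Five character classes: L(etter), D(igit), P(unctuation), N(ewline),
    W(hitespace other than newline)."""
    def cls(ch):
        if ch == "\n": return "N"
        if ch.isspace(): return "W"
        if ch.isalpha(): return "L"
        if ch.isdigit(): return "D"
        return "P"
    return "".join(cls(ch) for ch in text)

def ngrams(encoded, lo=10, hi=16):
    return {encoded[i:i + n] for n in range(lo, hi + 1) for i in range(len(encoded) - n + 1)}

def build_cover(exfil_msgs, regular_msgs):
    """Greedy cover built from n-grams that never appear in regular emails."""
    exfil_tokens = [ngrams(encode(m)) for m in exfil_msgs]
    regular_tokens = set().union(*(ngrams(encode(m)) for m in regular_msgs))
    discriminative = set().union(*exfil_tokens) - regular_tokens
    uncovered = {i for i, toks in enumerate(exfil_tokens) if toks & discriminative}
    cover = set()
    while uncovered:
        best = max(discriminative, key=lambda t: sum(t in exfil_tokens[i] for i in uncovered))
        cover.add(best)
        uncovered -= {i for i in uncovered if best in exfil_tokens[i]}
        discriminative.discard(best)
    return cover

def classify(message, cover):
    """1 = exfiltrating, if any cover token occurs in the encoded message."""
    return int(bool(ngrams(encode(message)) & cover))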

4.3 Structure-Based Detection Model

As discussed in Sect. 4.2, exfiltrating emails tend to follow a specific format that is rarely used in regular emails. If we look at the structure of the document as a grammar, exfiltrating emails and regular emails follow two different grammars. Deep learning algorithms are known to be effective at learning the underlying grammars of text documents [7, 14, 51]; we therefore also include a deep learning-based classifier.

As in Sect. 4.2, we first encode the message, this time using a new structure token over an alphabet of 14 symbols. The details of that encoding are provided in Appendix A.1. In addition to the structure token, our model also includes two “semantic” features: the content entropy and the text proportion, which are detailed in Appendix A.2.

Recurrent Neural Networks (RNN) are often used for problems with sequential information as input and have been shown to be effective in a variety of natural language processing problems [9, 35]. For this model, we use a Long Short-Term Memory (LSTM) RNN, which has been shown to perform well with complex patterns and long sequences [28, 50]. The details of our use of LSTM, which we call DeepPK, are provided in Appendix A.3.
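The overall shape of such a model can be sketched in Keras as follows. Since the encoding and architecture details are deferred to Appendix A, the embedding size and the way the two semantic features are merged with the LSTM output are assumptions made for this sketch; the sequence length (600) and number of memory units (128) are the defaults reported in Sect. 5.3.

from tensorflow.keras import layers, Model

VOCAB_SIZE = 14 + 1   # 14 structure-token symbols plus one padding index
SEQ_LEN = 600         # input length (default used in Sect. 5.3)
UNITS = 128           # LSTM memory units (default used in Sect. 5.3)

# Sequence branch: embedded structure tokens fed to a bidirectional LSTM.
tokens = layers.Input(shape=(SEQ_LEN,), name="structure_tokens")
x = layers.Embedding(input_dim=VOCAB_SIZE, output_dim=16, mask_zero=True)(tokens)
x = layers.Bidirectional(layers.LSTM(UNITS))(x)

# Semantic branch: content entropy and text proportion (two scalar features);
# concatenating them with the LSTM output is an assumption of this sketch.
semantic = layers.Input(shape=(2,), name="semantic_features")
merged = layers.Concatenate()([x, semantic])
output = layers.Dense(1, activation="sigmoid")(merged)

model = Model(inputs=[tokens, semantic], outputs=output)
model.compile(optimizer="adam", loss="binary_crossentropy", metrics=["accuracy"])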

5 Experiment

We now report our basic results, starting with a description of our experiment environment.

5.1 Experiment Environment

We have developed DeepPK using Keras with Tensorflow as the back end. For HTML email preprocessing, we use Beautifulsoup to extract the text from the HTML emails. Our models NB and Set-cover are implemented using Scikit-learn. Our experiments are performed on a Windows-based system with an Intel i5 CPU at 3.5 GHz and 16 GB RAM. DeepPK is trained and tested on an NVIDIA GeForce GTX 1060 with 6 GB RAM. Our source code can be found on our website, http://ssrg.site.uottawa.ca/phishing_kit/.

5.2 Exfiltration Email and Regular Email Database

We obtained our regular email database from the Enron email dataset, which was collected and prepared by a third-party organization and contains about 0.5 million messages coming from 150 users. Our exfiltrating email database consists of 64,480 messages generated from 6,448 unique exfiltration email templates using the approach discussed in Sect. 3.

To ensure that training and testing data are separated, we first split our 6,448 unique exfiltration email templates into two sets at a ratio of 4:1: 5,158 templates are randomly selected for training, and the remaining 1,290 are used for testing. This yields 51,580 email instances for training and 12,900 email instances for testing. For the regular email database, we create a balanced training set by randomly sampling 51,580 messages from the Enron email dataset. For the regular email test set, we use 5 times the number of test exfiltrating emails, for a total of 64,500 regular emails. This imbalance mimics a real-life scenario, since exfiltration emails would only be a fraction of the mail traffic in reality.

As described in Appendix A.3, we inject into some of the (encoded) exfiltration emails token segments taken from regular emails, in order to avoid the model learning only the prefix of these messages. Specifically, we inject into 8 of the 10 instances generated from each template a token segment randomly sampled from the regular training set. The size of the segment is randomly selected between 1 and 50 characters.

In order to avoid overfitting during training, we further split our training set: 80% is used for the actual training, while 20% is used for validation. Accordingly, we end up with 41,260 messages in each exfiltration email set and regular email set used for training, and 10,320 messages in each set used for validation. During training, we store the model which yields the best performance on the validation set and then evaluate it on the test set.

5.3 Model Evaluation

In order to evaluate the effectiveness of our models, we compared them on the same experiments and report the results here. By default, we use the following parameters for DeepPK: the input length is 600 and the number of memory units is 128. Since DeepPK uses tumbling windows to process the data, to ensure a fair comparison, we also test the NB model with tumbling windows (that model is denoted NB-Window below). We tried window sizes of 5 to 10 lines and report only the one with the best performance.

We apply five standard metrics to evaluate the performance of the models: false positive rate (FP), false negative rate (FN), precision (pre)=\(\frac{\text {TP}}{\text {TP}+\text {FP}}\) (TP stands for true positive), recall (rec)=\(\frac{\text {TP}}{\text {TP}+\text {FN}}\) and f-score=\(\frac{2*\text {pre}*\text {rec}}{\text {pre}+\text {rec}}\). The results are shown in Table 1.
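For reference, these metrics can be computed from the confusion matrix as in the following short sketch (using Scikit-learn; the helper name is ours).

from sklearn.metrics import confusion_matrix, precision_score, recall_score, f1_score

def report(y_true, y_pred):
    """y_true / y_pred: 1 = exfiltrating, 0 = regular."""
    tn, fp, fn, tp = confusion_matrix(y_true, y_pred, labels=[0, 1]).ravel()
    return {"FP rate": fp / (fp + tn),           # fraction of regular emails flagged
            "FN rate": fn / (fn + tp),           # fraction of exfiltrating emails missed
            "precision": precision_score(y_true, y_pred),
            "recall": recall_score(y_true, y_pred),
            "f-score": f1_score(y_true, y_pred)}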

Table 1. Performance comparison between models

For the NB model, we note that using a tumbling window improves the false negative rate but at the expense of the false positive rate. For DeepPK, the variant that only uses a single LSTM yields the best false negative rate (0.29%) but the worst false positive rate (0.97%). Through manual inspection of these false positives, we found that most of them are very short regular emails. The variant that uses a bidirectional LSTM fixes this issue thanks to the additional information provided by the backward direction. The performance is further improved by using our semantic features, which help the model correctly classify regular emails with a structure similar to that of the exfiltrating emails (e.g. the case shown in Fig. 3). In general, the variant that uses the bidirectional LSTM and the semantic features yields the best false positive rate (0.34%) and the best F1 score (98.91%) across all models.

5.4 Model Robustness

Our results in Sect. 5.3 show that all three proposed models perform well in detecting exfiltrating emails. In this section, we discuss several possible ways an attacker could modify exfiltrating emails to evade detection, and we evaluate how resilient the models are to these modifications. When looking at these potential detection evasion techniques, we specifically focus on solutions that would be relatively easy for the attacker to implement and that would modify the exfiltrating emails without preventing automatic processing at the receiving end. More advanced evasion techniques are of course possible, but they would likely negatively impact the “business model” of phishing by requiring more advanced technical skills from attackers (see Sect. 7). Here, we consider two potential attacks:

  • Injection attack. In this attack, the phisher injects additional noise into the exfiltrating email, which is otherwise unchanged. In practice, the injected text can be random strings, or pieces of text extracted from regular emails. The latter is a more effective attack because it introduces “negative” noise (segments possibly matching what the model has learned from the regular emails), which is more likely to result in misclassification. In our study, we consider a worst-case scenario and use actual text segments from our regular email database to increase the chances of defeating the models. We test four different ways of injecting “negative” noise: injecting at the top of the message, at the bottom of the message, in the middle of the message, and finally scattering the injected text throughout the exfiltrating email.

    We run several experiments. When injecting at the top, middle or bottom of the message, we inject a piece of text whose size ranges from 10% to 100% of the original exfiltration email, measured by the length of the resulting structure token. So in the worst case, 50% of the resulting structure token comes from injected text. When scattering the injection throughout the text, the injection is measured in terms of the number of lines in the original text. In our experiments, we increase the number of injected lines, going from one line randomly inserted in the original text to one line inserted between each line of the original text.

  • Replacement attack. In this attack, the phisher replaces parts of the text of the exfiltrating emails with strings that the model has rarely or never seen. The purpose of the attack is to eliminate “positive” indicators. An easy way to perform such an attack is to systematically replace existing field names with other strings. Note that because DeepPK detects exfiltrating emails based on our structure token and not on the message itself, this model is not impacted by this attack if the strings used for replacement have the same length as the strings they replace (since this would yield the same structure token). In order to have an effective attack against our model, we apply what we call “incremental injection”, where the size of the injected strings is gradually increased.

    We run several experiments with this attack as well. First, as mentioned, we change the length of consecutive tokens, trying various increments from 17 to 101. This ensures that each experiment produces different structure token fragments. For each length, we try three different types of replacements: we first replace only “words” (that is, sequences encoded as C in the structure token). We then replace only “non-words” (that is, sequences encoded as N, L or S in the structure token), and finally, we replace everything. A sketch of both attack transformations follows this list.
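The sketch below illustrates both transformations on a raw message. The exact placement choices and the incremental replacement scheme are simplified assumptions; in particular, the scattered injection shown here corresponds to the worst case of one injected line after every line of the original message.

import random
import re

def inject(body, noise, where="middle", rng=None):
    """Injection attack: splice text taken from regular emails into the message."""
    rng = rng or random.Random(0)
    lines = body.splitlines()
    if where == "top":
        return noise + "\n" + body
    if where == "bottom":
        return body + "\n" + noise
    if where == "middle":
        k = len(lines) // 2
        return "\n".join(lines[:k] + [noise] + lines[k:])
    # "scatter": one noise line inserted after every line of the original message
    noise_lines = noise.splitlines() or [noise]
    out = []
    for line in lines:
        out.extend([line, rng.choice(noise_lines)])
    return "\n".join(out)

def replace_words(body, start_len=17):
    """Replacement attack (incremental variant): each alphanumeric "word" is
    replaced with a filler string whose length keeps growing, so that the
    resulting structure token differs from the original one."""
    state = {"n": start_len}
    def repl(_match):
        filler = "x" * state["n"]
        state["n"] += 1
        return filler
    return re.sub(r"[0-9A-Za-z]+", repl, body)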

Table 2. Attack test sets

In these experiments we use the models trained on the original database, so the modified exfiltration messages have never been seen by the models before. We do not report the results on the regular emails again, since these would not be impacted by these experiments. We use the test set discussed in Sect. 5.2. Instead of using 10 instances per template, we randomly choose one instance from each template, ending up with 1,290 exfiltrating emails that we modify for the experiments. As explained, the injected text segments are randomly sampled from the regular test set. In order to facilitate the comparison, we use the same random seed for all our experiments (Table 2).

When faced with injection attacks, DeepPK generally performs well, with an error rate of at most 5%, except on the test set inject_line. On that test, the error rate increases with the proportion of injected text, reaching 28% at the highest injection level. This is because, as expected, this injection breaks up the sequence of structure tokens, eliminating some key tokens. The Set-cover model is stable in the injection tests, with an error rate of at most 6%. This is not surprising since the Set-cover model only looks for learned “bad” tokens in the message. Injecting noise does not impact the presence of these tokens, and the noise is simply ignored by this model. Still, except on the test set inject_line, the Set-cover model performs worse than DeepPK even with a relatively high proportion of injected text (up to 70 to 90% of the original message depending on the test). The NB model does not perform well in the injection tests. The model breaks down significantly as more “negative” content is injected. The use of tumbling windows does help, but the performance is still worse than that of the other two models. More details are available in Fig. 9 of Appendix B.

The word replacement attack has almost no effect on the performance of DeepPK, with an error rate peaking at 5%. On the other hand, the performance of DeepPK on the test sets replace_non_word and replace_all is quite inconsistent: it sometimes performs very well with an error rate of less than 2%, but in some cases the error rate goes above 80% (Fig. 10 of Appendix B). To better analyze this phenomenon, we conducted a complete set of tests on the test set replace_all, varying the injection proportion from 1 to 100, one step at a time. Out of these 100 tests, the error rate is below 10% 42 times, and below 5% 31 times. The explanation might be that “non-words” in the template are important indicators of exfiltrating emails for DeepPK. However, to successfully conduct such an attack, the attacker needs to break up the part of the structure that happens to have been learned by DeepPK, which is quite challenging and a process of trial and error. Generating such exfiltrating emails would be significantly more difficult than what is currently done. What is more, interpreting these emails once they are received would also be orders of magnitude harder than in the current situation. Therefore, this attack, however effective, seems of limited practicality. Set-cover and NB are essentially defeated by this attack; see Appendix B for more details.

6 Related Works

Most studies on phishing attack detection focus on identifying phishing pages and the phishing emails that are used to spread phishing links.

Most proposed phishing site detection techniques look for some intrinsic characteristics of the attack. For instance, [16, 24, 33, 37, 40, 54, 55] use an array of machine learning models to train a binary classifier. Some work has also been done to compare these approaches [1, 36]. But as mentioned in Sect. 2, detecting that a site is a phishing site does not address the needs of a network administrator if, as is the case in these papers, the site’s exact URL is needed for the detection.

The main general approach for detecting phishing emails is to apply machine learning techniques to detect the characteristics of content that is designed to deceive the victim. Fette et al. [22] propose such a method. The features of their model focus mainly on the phishing link embedded in the email, such as the number of dots and the number of domains in the URL, rather than on the email content. They report a 99.5% accuracy and a 0.13% false positive rate on a dataset of 860 phishing emails and 6,950 regular emails. In [48, 53], the authors suggest combining natural language processing techniques and contextual information to identify phishing emails. In [53], the authors report a 98% true positive rate and a 0.7% false positive rate on a dataset of 2,000 phishing emails and 1,000 regular emails. In [48], the authors report an accuracy of 92.2% and a 4.9% false positive rate on a dataset of 14,370 phishing emails and 14,370 regular emails. Some researchers suggest also using delivery information to detect phishing emails. In [11, 26], sets of features such as the consistency between the sender domain and the embedded link are used. Stringhini et al. [49] propose a detection model for spear phishing attacks by profiling the email sender: writing habits, composing habits, and interaction habits. Such behavioral-based detection would not be directly suitable for our purpose, since in our case no impersonation is taking place. More generally, none of these techniques would likely be very effective at detecting exfiltrating emails, because exfiltrating emails do not contain URLs or deceptive text, are sent to the attacker’s drop email address, and, from the header’s viewpoint, are no different from regular emails.

The work most closely related to ours is [52], in which a large-scale analysis of credential theft is conducted. The authors work on a corpus of about 10,000 kits, and propose a method to extract phishing templates by parsing the kits’ source code. They then look for instances of these templates on Gmail, using Gmail’s built-in anti-abuse detection system. They detect over 12 million exfiltration emails between March 2016 and March 2017. This work confirms that most phishing attackers (who use Gmail 75% of the time for their drop email addresses) simply use plain text when exfiltrating data, and thus detecting and blocking these messages at the hosting site would currently be extremely effective. The detection method that they use is however based on text matching; as we have shown in Sect. 5.4, attackers could evade such detection merely by using different keywords. Our method is more resistant and is aimed primarily at hosting providers.

One general problem with the above methods is that the attacks need to be first discovered and reported, which means some delay between the attack and its detection (about 10 h according to the report from the APWG [25]). Our method can identify a phishing attack as soon as it starts to collect information. It essentially prevents the attack from succeeding at all if exfiltrating emails are scanned in real time, at the source or at the receiving end. In [17], two “zero delay” phishing attack detection methods are presented: one uses domain names to infer that a site will host an attack, and the other does proactive “blind” scanning of the network. By contrast, the method proposed here works regardless of the domain name used (in particular, it works even when the domain name is not related to the attack) and works without knowing or guessing the URL of the attack.

The main difference between our work and all the above methods is that the goal of these methods is to protect a victim from an attack. Although they could indirectly help network administrators detect a phishing site on their network, they usually require the URL of the attack to be known, which usually means that someone needs to first report the attack to the administrator. In contrast, the goal of our work is to directly help the administrator detect a phishing site on their network, and it does so automatically and without delay. In [27], a system is presented in which honeypots are safely deployed and phishing kits are monitored. This is probably the closest work to ours, but the aim is quite different. That system does not provide a way to detect an attack being deployed on a live network. It is however one possible way to learn new email exfiltration patterns, and thus it can work in combination with our system. In [43, 56], the authors propose to monitor spam botnets and infer regular expressions matching the messages sent by these botnets. A similar approach might also achieve good performance in our context. However, as explained before, in our case the attacker controls the entire channel, from message creation to message consumption, and thus simple rule-based systems would be easier to defeat by simply changing the message body, as we did in Sect. 5.4. As we showed, the models that we propose, in particular our deep learning-based model, can be quite resistant to simple pattern modifications of the messages.

In addition to phishing detection, there is a significant body of academic work focusing on email classification for other purposes, such as spam detection. For instance, Blanzieri et al. [10] present a 2008 survey of supervised machine learning algorithms for spam detection. These methods treat the email content either as a set of word tokens, or as a text in natural language. A binary classifier is then trained on the extracted features to identify spam. Some methods also combine other information, such as attachments, headers and embedded images, to improve the performance. Elssied et al. [21] apply a k-means clustering technique to identify spam. Not all solutions rely on machine learning-based classifiers; e.g. Pérez-Díaz et al. [41] propose a method using a set of rules.

In general, all these spam detection methods mainly focus on email content and use semantic features to build classifiers. To the best of our knowledge, we are the first to propose a machine learning method which uses structural features of the messages to classify emails.

7 Limitations and Conclusion

One clear limitation of our empirical evaluation is that the attacker controls the entire exfiltration system and can therefore, in theory, very easily change it to avoid detection. As previously mentioned, using simple email encryption or switching to a completely different exfiltration technique would defeat the detection methods evaluated in this paper. Such a switch would not be terribly difficult to achieve from the attacker’s viewpoint. We argue that forcing phishers to step up their game and implement more advanced exfiltration techniques is a good thing that will hurt the business of phishing attacks. The main reason for this is that phishing attacks are very low-skill attacks. In [15], 15 phishing attack “vendors” are surveyed. In general, these individuals have very low technical skills, and claim only the most basic web-programming abilities. Their clients, who are the actual attackers, presumably have even lower technical skills. Empirically, we can confirm that the code that we have seen in thousands of phishing kits is of very low quality and does not suggest any kind of programming understanding. In [19], an analysis of the evolution of phishing attacks over time also shows that only the most basic updates are performed on live attacks by the attackers. Raising the technical bar even slightly will likely exclude many of the current players. Another reason is the low return that phishing attacks yield, and the poor quality of the data collected. In [15], it is reported that the cost of a tailor-made phishing site ranges from $15 to $250. As mentioned in Sect. 3.2, in our experience the vast majority of the data sent to a phishing site is bogus, and thus processing the data to identify usable information is a time-consuming process. Adding a decryption step, or using less structured exfiltration formats, would complicate data processing further and reduce profitability even more. We can report that in practice, we have almost never seen an attack in which the phisher bothered to encrypt the content of the exfiltration emails.

Of course, if our system or a similar one becomes widely adopted, this will force attackers to step up their game and, e.g., start encrypting their messages. As explained, we think that this will hurt their business. Nevertheless, when that time comes, new detection techniques will have to be found, depending on the new exfiltration trends. For example, several approaches have been proposed to work on encrypted traffic by comparing traffic patterns going to the same destination [3, 4, 30]. If the main exfiltration technique remains email-based, then some protection could be expected from a wide adoption of standards such as SPF and DKIM, which limit the ability to successfully send email from hacked servers that are not meant to send emails.

Another possible criticism of our work is that we will not be able to detect exfiltrating emails that follow a completely different pattern. This criticism is mitigated by the fact that any new pattern can simply be added to our training set once known, and that we see far fewer patterns than there are attacks, suggesting a vast amount of code-sharing among phishers. In practice, it is likely that our current model would catch many of the actual exfiltrating emails sent in North America and Europe at the time of writing. Systems such as the one described in [27] could also be used to discover new patterns as they are introduced.

We also acknowledge that our database is heavily biased toward North-American and European attacks. This is not a limitation of our method but a limitation of our database. Training our model on a larger database should address this issue.

The solution proposed here is, as far as we know, the first one to suggest detecting exfiltrating emails using structural information. This method has the advantage of working very well in our experiments, and of being robust against evasion techniques that try to avoid detection by modifying the email content. We also introduce a new “structure token” encoding, which proves to be very effective when combined with our deep learning algorithm. Our work is also, to our knowledge, the first to be tested on synthetic but realistic exfiltration emails, generated from a combination of two real datasets.

Unlike usual solutions that can be deployed at the end-user end, our solution needs to be deployed by hosting providers, where the phishing sites are being deployed, or by email providers, where the exfiltrating emails are being received. This can be seen as a limitation, but also as a strength, since a handful of very large-scale players could deploy our system and have a significant and immediate impact on phishing activities.