1 Introduction

The use of web-based applications has grown rapidly over the last two decades [1,2,3]. Applications such as online banking, e-commerce, blogs and social networking sites have become a common platform for the transmission of information and the provision of services online. Because these applications handle sensitive data and operations, they are an attractive and lucrative target for attackers, whose goals include acquiring confidential user data, financial gain, and carrying out other illegal activities. According to the Symantec Internet Security Threat Report (ISTR) [4], over 76% of scanned web applications were rated vulnerable, and there was a 62% increase in malicious botnets targeting web applications.

In this context, worldwide revenue for corporate web security solutions is expected to grow from nearly $3.7 billion in 2019 to an estimated $6.1 billion by 2023 [5]. The long-term security goal of web applications is to maintain the trust of users. The security policy for web applications must therefore be defined to guarantee security objectives such as [6]:

  • Confidentiality: ensures that only authorized users have access to the information and exchanged resources intended for them.

  • Integrity: ensures the accuracy of information by preventing inappropriate alteration by unauthorized third parties.

In order to meet these security objectives, several solutions have been proposed [1, 6, 7] to ensure adequate security for web applications.

Fig. 1. The logical mapping between the TCP/IP stack and cyber-security attacks. Web-related attacks target the application layer, and network-related attacks focus on the other layers.

Recent protection solutions include the use of intrusion detection systems (IDS) and intrusion prevention systems (IPS) for modern web applications [1, 8]. An IDS is a defense tool used to detect intrusions and report them to the administrator. An IPS is an extension of an IDS that is also capable of responding to attacks [9]. IDS and IPS are primarily designed to observe, detect, and prevent malicious activity on the network. However, the characteristics of traditional network attacks are very different from those of web attacks [1, 2, 10]. The former target the network layer, while the latter exploit weaknesses of the application layer (Fig. 1).

In addition, the architecture of modern web applications and their running environments are complex (Fig. 2), managed by cross-platform databases, and typically created by developers with limited computer security skills [6, 10, 11]. Therefore, designing an intrusion detection and prevention system (IDPS) to recognize and prevent suspicious activity on a web application requires a very different approach from that of an IDPS designed to monitor network traffic.

To help the cybersecurity community build secure and efficient IDPS for modern web applications, this paper aims to:

  1. overview the core concepts of intrusion detection and prevention;

  2. present the design challenges of an IDPS specifically proposed for the web;

  3. evaluate certain open-source IDPS based on criteria related to the current context of modern web applications.

Fig. 2. An abstract model of modern web application architecture and its running environments [12]

The rest of the paper is structured as follows. Section 2 overviews the core concepts of the intrusion detection and prevention research area. In Sect. 3, we identify and discuss several specific challenges that make it difficult for an IDPS to monitor, detect, and prevent web attacks. Section 4 evaluates four of the most deployed open-source IDPS, namely AppSensor, ModSecurity, Shadow Daemon, and AQTRONIX WebKnight. The assessment is based on security features that a web IDPS must incorporate in order to overcome the identified challenges. Finally, Sect. 5 concludes the paper.

We should note that in the literature the terms “ID/IP systems” [13], “IDP” [14, 15], and “IDPS” [9, 16,17,18,19] can be used interchangeably to refer to intrusion detection and prevention systems. In addition, many vendors and web security experts consider Web Application Firewalls (WAFs) a special case of IDPS that work at the TCP/IP application layer [20, 21]. In this paper, we use IDPS to designate intrusion detection and prevention systems in the context of modern web applications.

2 An Overview of Intrusion Detection and Prevention

2.1 Intrusion Detection and Prevention

Intrusion detection was introduced in 1980 by J.P. Anderson [22] with the aim of identifying the use of a computer system for unauthorized purposes and detecting possible breaches of a system’s security policy. Anderson defines an intrusion as a violation of the system’s security policy, that is, a violation of one of the confidentiality, integrity or availability properties of the system. The purpose of intrusion detection is to report intrusions to a computer system's security administrator, so that they can take appropriate action. IDPS are required to follow two criteria [9]:

  1. The reliability of the IDPS: any intrusion must effectively give rise to an alert. An unreported intrusion (false negative) constitutes a failure of the IDPS.

  2. The relevance of alerts: the four alert types are listed in Table 1. Any alert must correspond to an actual intrusion. Any false alarm (false positive) decreases the relevance of the IDPS. A good IDPS should have as few false positives as possible.

Table 1. Types of IDPS alerts
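The two criteria above can be quantified from alert counts. The following is a minimal sketch, assuming that the four alert types are the usual true positive, false positive, true negative and false negative outcomes; the class name, function names and sample counts are hypothetical and only serve as an illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AlertCounts:
    """Counts of the four alert outcomes over an evaluation period (assumed types)."""
    true_positives: int   # intrusion occurred and an alert was raised
    false_positives: int  # no intrusion, but an alert was raised
    true_negatives: int   # no intrusion and no alert
    false_negatives: int  # intrusion occurred, but no alert was raised


def detection_rate(c: AlertCounts) -> float:
    """Reliability: fraction of real intrusions that raised an alert."""
    intrusions = c.true_positives + c.false_negatives
    return c.true_positives / intrusions if intrusions else 0.0


def false_positive_rate(c: AlertCounts) -> float:
    """Fraction of benign events that wrongly raised an alert."""
    benign = c.false_positives + c.true_negatives
    return c.false_positives / benign if benign else 0.0


def precision(c: AlertCounts) -> float:
    """Relevance: fraction of raised alerts that match a real intrusion."""
    alerts = c.true_positives + c.false_positives
    return c.true_positives / alerts if alerts else 0.0


# Illustrative numbers only, not measurements from any evaluated system.
counts = AlertCounts(true_positives=90, false_positives=30,
                     true_negatives=870, false_negatives=10)
print(detection_rate(counts), false_positive_rate(counts), precision(counts))
```

A reliable IDPS pushes the detection rate towards 1, while relevant alerts correspond to a low false positive rate and a high precision.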

Intrusion detection research also includes the notion of automatic response to intrusions; in addition to warning the security administrator, the intrusion detection system takes measures to block the intrusion using an intrusion prevention system (IPS) [9]. An IPS is an extension of an IDS with all of the functionality of the latter, but it is also capable of responding automatically to attacks. For this, IPSs use several response techniques, which can be divided into the following groups [18]:

  • The IPS stops the attack automatically, without intervention from the network administrator. For example, the IDPS terminates the network connection or the user session used for the attack.

  • The IPS changes the security environment: the IDPS can change the configuration of other security controls to disrupt an attack. Some IDPS can even cause a host to be patched if they detect that the host has vulnerabilities.

  • The IPS modifies the attack content: some IDPS technologies can remove or replace malicious parts of an attack to make it harmless.

2.2 Detection Methods in IDPS

In general, there are three main approaches to intrusion detection [7, 9, 23, 24]: signature-based detection (also known as misuse detection), anomaly-based detection, and hybrid detection. Each detection approach operates on a specific set of principles. Table 2 summarizes the advantages and disadvantages of each detection methodology.

  1. Signature-based detection: Signature-based intrusion detection methods base their detection on recognizing, in the flow of events generated by one or more probes, attack signatures that are contained in a signature database. They thus use a set of signatures representing patterns of already known attacks to filter malicious activity [23]. A signature-based intrusion detection system consists of [18]: (1) one or more probes, generating a flow of events, which can be of network or host type, (2) a signature database, and (3) a system for recognizing patterns in the flow of events.

  2. Anomaly-based detection: In the literature, there are many definitions of an anomaly, depending on the problem domain [25, 26] (e.g., network intrusion detection, fraud detection, natural language processing, image processing, etc.). One of the most widely accepted definitions is: “An anomaly is an observation which deviates so much from other observations as to arouse suspicions that it was generated by a different mechanism” [27]. Anomaly detection was first presented in 1987 by Denning in her seminal work on the host-based IDES system [28]. The problem of anomaly detection can be modeled as a classification problem [29]. Anomaly-based IDSs are systems that model the normal behavior of the network and/or system (legitimate actions) and then identify outliers or deviations from that normal behavior to detect attacks (malicious actions). In particular, an anomaly-based approach is used to recognize unknown attacks (also called zero-day attacks). Due to the complexity of today's technologies, it is very hard to define normal behavior precisely, which leads to high false-positive rates [30, 31]. Purely anomaly-based IDSs are very rare in the literature, because most systems rely on hybrid approaches [30].

  3. Hybrid-based detection: A hybrid system is the fusion of different intrusion detection approaches into a single integrated detection system. Hybrid systems provide better performance by using the strengths of several approaches to overcome the limitations of individual techniques. Hybrid-based detection was first explored in [32]. This method tries to increase detection accuracy and at the same time make detection more efficient.

Table 2. Comparison of IDPS detection methods
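To make the three detection approaches concrete, the sketch below combines a tiny signature matcher with a one-feature anomaly model. It is an illustration under strong assumptions, not a method taken from the surveyed systems: the signature patterns, the choice of payload length as the only feature, and the z-score threshold are all hypothetical.

```python
import re
from statistics import mean, pstdev

# Hypothetical signature database: each entry is a known attack pattern.
SIGNATURES = {
    "sql_injection": re.compile(r"(?i)\bunion\b.+\bselect\b|'\s*or\s+'?1'?\s*=\s*'?1"),
    "xss": re.compile(r"(?i)<script\b"),
    "path_traversal": re.compile(r"\.\./"),
}


def signature_match(payload: str) -> list[str]:
    """Signature-based detection: names of the known patterns found in the payload."""
    return [name for name, pattern in SIGNATURES.items() if pattern.search(payload)]


class LengthAnomalyModel:
    """Anomaly-based detection on a single assumed feature: the payload length."""

    def __init__(self, threshold: float = 3.0) -> None:
        self.threshold = threshold  # maximum tolerated z-score
        self.mu = 0.0
        self.sigma = 1.0

    def fit(self, normal_payloads: list[str]) -> None:
        """Learn the normal behavior from attack-free training payloads."""
        lengths = [len(p) for p in normal_payloads]
        self.mu = mean(lengths)
        self.sigma = pstdev(lengths) or 1.0  # avoid division by zero

    def is_anomalous(self, payload: str) -> bool:
        """Flag payloads whose length deviates too far from the learned profile."""
        return abs(len(payload) - self.mu) / self.sigma > self.threshold


def hybrid_detect(payload: str, model: LengthAnomalyModel) -> str:
    """Hybrid detection: check known signatures first, then the anomaly model."""
    if signature_match(payload):
        return "known-attack"
    if model.is_anomalous(payload):
        return "anomaly"
    return "normal"


model = LengthAnomalyModel()
model.fit(["q=shoes", "q=red+hat", "q=laptop", "page=2"])
print(hybrid_detect("q=<script>alert(1)</script>", model))
print(hybrid_detect("q=" + "A" * 500, model))
print(hybrid_detect("q=boots", model))
```

The signature stage only recognizes what is already in its database, whereas the anomaly stage can flag an unseen payload at the cost of possible false positives when legitimate behavior drifts from the training data.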

2.3 Basic Architecture of an IDPS

As illustrated in Fig. 3, a typical IDPS is composed of four components [33]:

  1. Event generators (E-Box): This type of block is made up of sensor elements that monitor the target system, capturing events and information to be analyzed by other blocks. For example, the decoding process is included in this phase.

Fig. 3. Typical architecture of an IDPS.

  2. Event analyzers (A-Box): Processing modules for event analysis and detection of potentially hostile behavior, so that an alarm of the appropriate type is generated if necessary. In anomaly-based IDPS, this component should include two sub-components:

    a. Preprocessor (preprocessing phase): performs all actions that precede the classification of incoming requests, such as dataset creation, data cleaning, normalization, request parsing, feature extraction and feature selection. This sub-phase is critical in anomaly-based IDPS [34].

    b. Detection engine (detection phase): an engine that analyzes the requests, searching for intrusions.

  3. Event databases (D-Box): These elements store information from the E-Boxes for further processing by the A-Boxes and R-Boxes.

  4. Response units (R-Box): The main function of this type of block is to execute, if an intrusion occurs, a response that thwarts the detected threat. The action depends on the type of system: an IDS raises an alarm when intrusions are found, while an IPS blocks the requests to prevent them from reaching the target server.
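The interaction between the four components can be sketched as a small pipeline. This is a simplified illustration under our own assumptions: the class names, the `handle_request` helper and the three suspicious tokens are hypothetical, and the detection engine is reduced to a substring check.

```python
from dataclasses import dataclass
from urllib.parse import unquote_plus


@dataclass
class Event:
    source_ip: str
    raw_request: str
    decoded: str = ""
    verdict: str = "unknown"


class EventGenerator:
    """E-Box: captures a request and decodes it for the other blocks."""

    def capture(self, source_ip: str, raw_request: str) -> Event:
        return Event(source_ip, raw_request, decoded=unquote_plus(raw_request))


class EventAnalyzer:
    """A-Box: a preprocessor followed by a (deliberately naive) detection engine."""

    SUSPICIOUS = ("<script", "union select", "../")  # hypothetical tokens

    def preprocess(self, event: Event) -> str:
        # Normalization step: lower-case and trim the decoded request.
        return event.decoded.lower().strip()

    def analyze(self, event: Event) -> Event:
        features = self.preprocess(event)
        hit = any(token in features for token in self.SUSPICIOUS)
        event.verdict = "intrusion" if hit else "benign"
        return event


class EventDatabase:
    """D-Box: stores events for later processing."""

    def __init__(self) -> None:
        self.events: list[Event] = []

    def store(self, event: Event) -> None:
        self.events.append(event)


class ResponseUnit:
    """R-Box: an IDS only raises an alert, an IPS blocks the request."""

    def __init__(self, prevention: bool) -> None:
        self.prevention = prevention

    def respond(self, event: Event) -> str:
        if event.verdict != "intrusion":
            return "forward"
        return "block" if self.prevention else "alert"


def handle_request(ip: str, raw: str, db: EventDatabase, unit: ResponseUnit) -> str:
    """Run one request through the E-Box, A-Box, D-Box and R-Box in turn."""
    event = EventAnalyzer().analyze(EventGenerator().capture(ip, raw))
    db.store(event)
    return unit.respond(event)


db = EventDatabase()
print(handle_request("10.0.0.1", "q=%3Cscript%3Ealert(1)%3C%2Fscript%3E",
                     db, ResponseUnit(prevention=True)))
```

Switching `prevention` to `False` turns the same pipeline from IPS behavior (blocking) into IDS behavior (alerting only).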

3 Web IDPS Design Challenges

Intrusion detection and prevention is still immature in the field of web application security [1, 8]. Intrusion detection and prevention systems are primarily used as network security appliances. However, the design of a web IDPS requires a different approach from that of a traditional network IDPS in order to manage the complexities associated with modern web applications [1]. In our study, we identified several specific issues that make it difficult for an IDPS to monitor and detect web-related security issues. In this section, we present what we believe to be the main design challenges of IDPS in the context of modern web apps.

3.1 Web-Related Security Issues

In general, web-specific security problems are very different from traditional network-related attacks. A web security threat or issue is defined as a potential malicious activity that exclusively targets one or more components of a web application's architecture, such as the user's browser or the server hosting the web app [2]. Li and Xue [6] classified web security issues into three core web-specific vulnerabilities:

  1. Input validation or injection vulnerabilities: Validating input data (for example, data entered by a user into an authentication form) is an important part of a security policy for a web application. For this reason, web developers use validation and sanitization routines to identify and clean up unreliable user input, and to let only safe data pass by filtering or escaping suspicious characters. Incorrect or insufficient input validation can enable various injection attacks, such as SQL injection, XSS attacks, code injection attacks or memory buffer overflow attacks. These attacks make it possible to alter program execution, run unexpected commands or obtain unauthorized access to resources and sensitive data of the application's users. For example, on an authentication interface that does not reliably verify authentication data, a user can enter malicious code and gain access to the customer area.

  2. Business logic vulnerabilities: Business logic designates the set of rules and algorithms implemented in an application to manage the flow of exchanges between the web browser and the web application's back-end servers (e.g. database server). Business logic vulnerabilities, or logic vulnerabilities, allow the legitimate processing flow of an application to be used in a way that has negative consequences for the application, which leads to various attacks such as access control bypass and application flow bypass attacks.

  3. Session management vulnerabilities: Modern web applications rely mainly on the HTTP protocol to send requests to and receive resources from the web application's servers [35]. However, HTTP is a stateless protocol: it treats each request as independent of all the others. This design is not suitable for modern web applications (for example, e-commerce applications and online banking), which require a mechanism on top of HTTP to manage user sessions. To exchange information between the user and the web server, the server creates a web session, which is a sequence of HTTP request and response transactions associated with the same user. Designing and implementing secure and efficient web session management is a complex task, and flaws in it usually lead to several security attacks (e.g. session hijacking, session fixation, and CSRF) that compromise web application security [36].

Fig. 4. Classification of web application security issues by Deepa and Thilagam [2].

Deepa and Thilagam [2] extended the taxonomy of Li and Xue [6] by introducing the attacks that exploit each vulnerability category. Figure 4 illustrates their classification.
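The authentication example given for input validation vulnerabilities can be reproduced in a few lines. The sketch below is a hypothetical illustration using Python's standard `sqlite3` module; the table, the credentials and the function names are our own assumptions, and the password is stored in plain text only to keep the example short.

```python
import sqlite3


def make_db() -> sqlite3.Connection:
    """Create an in-memory database with a single (hypothetical) user."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE users (name TEXT, password TEXT)")
    conn.execute("INSERT INTO users VALUES ('alice', 's3cret')")
    return conn


def login_vulnerable(conn: sqlite3.Connection, name: str, password: str) -> bool:
    """Builds the query by string concatenation: user input becomes SQL code."""
    query = (f"SELECT 1 FROM users WHERE name = '{name}' "
             f"AND password = '{password}'")
    return conn.execute(query).fetchone() is not None


def login_safe(conn: sqlite3.Connection, name: str, password: str) -> bool:
    """Uses a parameterized query: user input is always treated as data."""
    query = "SELECT 1 FROM users WHERE name = ? AND password = ?"
    return conn.execute(query, (name, password)).fetchone() is not None


conn = make_db()
payload = "alice' --"  # closes the string, then comments out the password check
print(login_vulnerable(conn, payload, "wrong"))  # access granted without the password
print(login_safe(conn, payload, "wrong"))        # access denied
```

The injected comment marker removes the password condition from the concatenated query, which is exactly the kind of request a web IDPS is expected to recognize before it reaches the database.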

3.2 Placement of the IDPS

According to our study, an IDPS for web apps has four possible deployment approaches.

  1. Database-side: The database is a sensitive component in the architecture of web applications (Fig. 2). Multiple attacks are made possible by the fact that a single back-end database is used to store all of a web application's persistent information. Therefore, if malicious queries are allowed to access data stored in the main database, or if an attacker exploits a code vulnerability to gain access to a limited portion of the database content, it may be possible to extend that access and retrieve sensitive information. In [10], the authors proposed an anomaly-based IDS on the database side. This system is able to detect malicious SQL queries and withstand SQL injection attacks. An overview of this database-side anomaly detector is illustrated in Fig. 5.

Fig. 5. Architecture of the database anomaly detector proposed by Vigna et al. [10].

  2. Web server-side: To mitigate the security risks associated with web servers, IDPS are deployed to analyze and filter incoming requests. The goal is to quickly detect malicious activity and potentially prevent more serious damage. G. Vigna et al. presented WebSTAT [37], an intrusion detection system that analyzes web requests for evidence of malicious behavior. This system uses a signature-based detection methodology. It is capable of detecting and describing attack scenarios, and of detecting variants of attacks similar to the malicious behavior already specified. Another signature-based IDS was presented by M. Almgren et al. in [38]. The proposed system is able to analyze log entries to recognize malicious activity on the web server and includes mechanisms to reduce the number of false alarms.

Fig. 6. WebSTAT architecture [37]

  3. IDS/IPS as a proxy: It is also possible to deploy an IDS or an IPS as a proxy that intercepts incoming requests from clients and outgoing responses from web servers. This placement option is also known as a standalone appliance. In this placement, all traffic passes through this point and is analyzed. The authors in [39] presented an intrusion detection system that relies on anomaly-based detection to identify attacks against web applications. The system analyzes client requests that reference server-side programs and creates models for a wide range of functionalities related to those requests. Another proxy IDPS was proposed by N. Agarwal and S.Z. Hussain in [8]. It is a web intrusion detection system with a prevention mechanism that acts as a reverse proxy, intercepting incoming requests from the client and outgoing responses from the server.

Fig. 7. Block diagram of the IDPS proposed by N. Agarwal and S.Z. Hussain [8].

  4. Browser-side: The browser-side approach is considered to be one of the latest trends in IDPS deployment [40]. The idea is to extend browsers with intrusion detection and prevention functionalities. The advantage for web applications is that they are automatically protected against a large number of browser-side bugs and vulnerabilities. In [40], the authors introduced WPSE (Web Protocol Security Enforcer), a browser-side attack detection and prevention system that addresses the unique challenges of web protocols such as OAuth 2.0 and SAML. The current prototype of this solution is implemented as an extension for Google Chrome.

Table 3 gives an overview of some IDS/IPS proposed in the literature and their placement. The question then arises as to whether there is an ideal approach that should be preferred when deploying an IDPS for web apps. Unfortunately, each of the above deployment strategies is vulnerable to specific attacks [3]. Moreover, it is very difficult to build an IDPS that withstands all types of web attacks without affecting system efficiency. Therefore, we believe that the ideal solution would be to design an IDPS based on a distributed approach deployed across all four basic placements.

Table 3. Overview of some IDS/IPS designed for web applications

3.3 Communication Protocol (HTTP/HTTPS)

Attackers rely on the HTTP/HTTPS protocols to exploit vulnerabilities in web applications. Hypertext Transfer Protocol (HTTP) is a request-response protocol designed to facilitate communication between client and server, and HTTPS provides a secure, encrypted connection. From the perspective of an IDPS observing HTTPS traffic, the packet data is encrypted and cannot be inspected. These systems can inspect HTTPS traffic if they have access to the private key of the SSL certificate; however, due to security considerations, sharing the web server's private key is inadvisable.

Additionally, HTTP/HTTPS requests and responses carry a variety of values, and choosing the appropriate validation approach (whitelist or blacklist) depends largely on the type of value defined. The positive validation approach (whitelist) defines the data expected by the application. It includes several types of validation checks, such as data type (string, integer), minimum and maximum length, and specific patterns. In contrast, the negative (blacklist) approach involves filtering out the values containing the attack patterns. Signature-based systems include both positive (whitelist) and negative (blacklist) validation, while anomaly-based systems only deal with positive validations.
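The difference between the two validation approaches can be illustrated with a short sketch. The field names, rules and attack patterns below are hypothetical examples chosen for this illustration; a real deployment would derive the whitelist from the application's own input specification.

```python
import re

# Positive (whitelist) rules: what the application expects for each field.
WHITELIST_RULES = {
    "username": {"type": str, "min_len": 3, "max_len": 20,
                 "pattern": re.compile(r"[A-Za-z0-9_]+")},
    "age": {"type": int, "min": 0, "max": 150},
}

# Negative (blacklist) rules: known attack patterns to filter out.
BLACKLIST_PATTERNS = [
    re.compile(r"(?i)<script\b"),
    re.compile(r"(?i)\bunion\b.+\bselect\b"),
    re.compile(r"\.\./"),
]


def passes_whitelist(field: str, value: object) -> bool:
    """Accept a value only if it matches the expected type, length and pattern."""
    rule = WHITELIST_RULES.get(field)
    if rule is None or type(value) is not rule["type"]:
        return False  # unknown fields and wrong types are rejected by default
    if isinstance(value, str):
        return (rule["min_len"] <= len(value) <= rule["max_len"]
                and rule["pattern"].fullmatch(value) is not None)
    return rule["min"] <= value <= rule["max"]


def passes_blacklist(value: str) -> bool:
    """Accept a value unless it contains a known attack pattern."""
    return not any(p.search(value) for p in BLACKLIST_PATTERNS)


unknown_attack = "<img src=x onerror=alert(1)>"
print(passes_blacklist(unknown_attack))               # the blacklist misses it
print(passes_whitelist("username", unknown_attack))   # the whitelist rejects it
```

The last two lines show the trade-off: the blacklist lets through an attack whose pattern it does not know, while the whitelist rejects anything that falls outside the expected format, at the cost of having to be kept in sync with the application.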

3.4 Users and Session Management

Web applications are generally accessible to several users with different privilege levels [36]. These privileges are controlled by the authorization process, which ensures that the user performs only authorized operations. Applications use a session management mechanism [35] to track individual client-server interaction and map the request to a particular user in order to decide whether the request should be processed or denied. Allowing different users with a different set of privileges places several demands on the IDPS:

  • the functionality to track user sessions in order to link a user's requests to the corresponding session;

  • the ability to monitor the integrity of a session to ensure that the session is being used by the same person who logged in to the application;

  • the ability to perform continuous monitoring of resource usage and user activity during a session.
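The three demands above can be sketched as a small session monitor. This is a hedged illustration rather than a recommended design: binding a session to a hash of the client IP address and User-Agent is only one possible heuristic (and can misfire when a legitimate user changes network), and the class name, return values and request limit are hypothetical.

```python
import hashlib
import secrets
from dataclasses import dataclass


@dataclass
class Session:
    user: str
    fingerprint: str        # assumed integrity heuristic: hash of IP and User-Agent
    request_count: int = 0  # simple usage counter for continuous monitoring


class SessionMonitor:
    def __init__(self, max_requests: int = 1000) -> None:
        self.sessions: dict[str, Session] = {}
        self.max_requests = max_requests

    @staticmethod
    def _fingerprint(ip: str, user_agent: str) -> str:
        return hashlib.sha256(f"{ip}|{user_agent}".encode()).hexdigest()

    def login(self, user: str, ip: str, user_agent: str) -> str:
        """Session tracking: create a session ID and remember who owns it."""
        session_id = secrets.token_urlsafe(32)
        self.sessions[session_id] = Session(user, self._fingerprint(ip, user_agent))
        return session_id

    def check_request(self, session_id: str, ip: str, user_agent: str) -> str:
        """Link a request to its session, then check integrity and usage."""
        session = self.sessions.get(session_id)
        if session is None:
            return "unknown-session"
        if session.fingerprint != self._fingerprint(ip, user_agent):
            return "possible-hijacking"
        session.request_count += 1
        if session.request_count > self.max_requests:
            return "excessive-usage"
        return "ok"


monitor = SessionMonitor(max_requests=2)
sid = monitor.login("alice", "10.0.0.1", "Firefox")
print(monitor.check_request(sid, "10.0.0.1", "Firefox"))
print(monitor.check_request(sid, "203.0.113.9", "curl"))
```

A request that presents a valid session ID from a different client context is reported as a possible hijacking instead of being processed on behalf of the logged-in user.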

3.5 Continuous Changes and Performance

Today's web apps change frequently. This rapid and continuous change is a major challenge for IDPS, which must be tuned and maintained over time to accommodate changes to an application. In the context of anomaly detection, frequent modifications handicap the system, because the current model does not account for the modified version of the application [41]. Consequently, it would incorrectly classify the new legitimate behavior as intrusive. These detectors need to be retrained to accommodate the changes. In signature-based IDPS, the blacklist is not greatly affected, because the attack patterns remain the same, but the whitelist should be updated according to the change in application behavior [8].

In addition, the great complexity and interactivity of modern web apps profoundly affect the performance of intrusion detection systems [1]. They make the job of an IDPS more difficult: the more interactive the application, the harder it is for an IDS to detect intruders.

3.6 Bots Requests

Web traffic can be generated by bots rather than humans. Bots are automated scripts designed to perform a specific set of activities on web applications. Automated attacks are comparatively cheaper than manual attacks because they allow adversaries to target large numbers of web applications with less time and effort. For example, Angler [4] was a sophisticated exploit kit that was the source of many advanced attack techniques. Angler was able to download and run malware from memory, without having to write files to disk, to avoid detection by traditional security technology. Signature-based systems are not strong enough to recognize requests from scripts designed to mimic user activities, because such malicious requests differ from legitimate requests by intent, not by content, and a signature-based IDPS inspects malicious content rather than intent.

4 Assessment of Open-Source Web IDPS

The evaluation of intrusion detection and prevention systems is a very important concept in improving the security of computer systems [9, 24]. It helps provide essential data and conclusions to help developers improve their IDPS and let users know the capabilities and limitations of the IDPS system in use. In this section, we will evaluate some intrusion detection and prevention tools designed for the web. The assessment is based on various security features that must be incorporated in an IDPS. In particular, to overcome the IDPS design challenges discussed in the previous section, we argue that an IDPS for web applications is required to integrate the following functionalities:

  • Correct input validation: The main problem with insecure web applications is that users have complete control over the data submitted to the server. They can provide arbitrary input in any parameter, including header fields, or even change the values stored by the application on the client side in the form of hidden fields, cookies and URL parameters [42]. Incorrect or missing input validation on the server side is the root cause of most vulnerabilities in web applications, including XSS, SQL injection, and unvalidated redirects [2, 43]. The validation process checks whether the input meets a predefined set of rules to prevent insecure data from entering the application. It includes checking all input parameters, including URLs, form data, cookies, and query strings.

  • Output sanitization: In general, output validation or sanitization protects a web application against unintentional disclosure of sensitive information. For example, a web application can expose internal details if it does not handle error messages properly. Specifically, attackers can take advantage of overly descriptive error messages returned by the database when a query is rejected [1].

  • Session verification: A large part of web attacks involve session hijacking, session fixation and session replay [11, 35, 36]. In these attacks, an attacker steals or overwrites a user's session ID to impersonate the valid user and perform operations on their behalf. The session verification performed by an IDPS checks whether the session is used by the same user as the one who logged in to the application. It may also include detecting any attempt by an adversary to illegitimately obtain the session ID.

  • Access control: Access control defines the policies to regulate the privileges granted to the user of the application. Some detection systems also provide the functionality of monitoring users’ activities to prevent unauthorized access to information or services offered by the application [23]. Systems can also detect requests that attempt to gain unauthorized access to objects (such as files and directories) that are mistakenly exposed via URL or a form.

  • Web bots detection: In order to resist automated attacks, an IDPS must also be able to differentiate the requests of normal users from the requests generated by certain malicious bots [44]. More specifically, an IDPS should have the potential to distinguish between good and bad bots. Non-malicious bots are automated programs that are beneficial to the service provider, such as search engine bots that help improve the ranking of a web application.

  • Response time: The time an IDPS requires to deal with a potential threat is a critical issue. For instance, it was demonstrated that, in less than fifteen minutes, some sophisticated attacks were able to stop a large part of the Internet from functioning normally [45]. Thus, online detection, also known as real-time detection, is considered to be more efficient and secure than off-line detection [9, 46, 47].
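As an illustration of the output sanitization functionality listed above, the following sketch replaces responses that contain overly descriptive error messages with a generic one. The list of leak patterns and the replacement text are hypothetical assumptions made for this example and are far from exhaustive.

```python
import re

# Hypothetical patterns of error output that reveal internal details.
LEAK_PATTERNS = [
    re.compile(r"(?i)you have an error in your sql syntax"),
    re.compile(r"(?i)traceback \(most recent call last\)"),
    re.compile(r"ORA-\d{5}"),
]

GENERIC_ERROR = "An internal error occurred. Please try again later."


def sanitize_response(body: str) -> tuple[str, bool]:
    """Return the body to send to the client and whether a leak was masked."""
    for pattern in LEAK_PATTERNS:
        if pattern.search(body):
            return GENERIC_ERROR, True
    return body, False


leaky = "You have an error in your SQL syntax near 'users WHERE id=''"
print(sanitize_response(leaky))
print(sanitize_response("<h1>Welcome</h1>"))
```

Masking the whole response is the most conservative choice; an IDPS could also log the original body so that the administrator can still diagnose the underlying error.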

Using the critical functionalities above, we assess some of the most used open-source IDPS exclusively designed for securing web applications. The evaluated IDPS are: OWASP AppSensor [48], ModSecurity [49], Shadow Daemon [50], and AQTRONIX WebKnight [51]. OWASP AppSensor [48] is a conceptual framework that offers prescriptive guidance to implement application intrusion detection and automated response in real time. OWASP offers a reference implementation of this framework in Java [19].

Since modern web application firewalls (WAFs) also offer security services similar to IDPS designed for web applications [1], we also evaluate three commonly used open-source WAFs, namely ModSecurity, a web application firewall based on signature detection; Shadow Daemon [50], free software that intercepts requests and filters malicious parameters; and AQTRONIX WebKnight [51].

The results of the assessment are shown in Table 4. Three of the four evaluated IDPS rely on a signature-based detection methodology. This demonstrates that, as in the network security context, the signature-based approach is widely deployed in web applications [52]. As discussed in Sect. 2, this detection methodology has several advantages, such as easy installation and deployment. Real-time response is integrated in all the evaluated IDPS. For the session verification criterion, only ModSecurity qualifies, as it contains a function to fix insecure session cookies. AQTRONIX WebKnight is a good example of an IDPS integrating access control and bot detection, as it can monitor access to certain important files or limit the number of requests from a single IP address. In particular, AQTRONIX WebKnight is the only evaluated IDPS that includes web bot detection, which it provides in four ways:

  1. A large database to block known bad bots or any additional bots the administrator specifies.

  2. A bad bot trap mechanism that enables blocking bots that are not included in (1).

  3. An aggressive bot trap filter, which blocks bots that request too many web pages in a short period of time.

  4. URL rewriting, supported in WebKnight 2.5 and later, to prevent certain robots or hackers from seeing the true contents of the robots.txt file.

Table 4. Assessment of some IDPS exclusively designed for web applications
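The idea behind a filter that blocks clients requesting too many pages in a short period can be sketched with a sliding time window per IP address. This is not WebKnight's actual implementation, which we do not reproduce here; the class name, thresholds and timestamps are hypothetical.

```python
from collections import defaultdict, deque


class RequestRateMonitor:
    """Flags an IP address that exceeds max_requests within window_seconds."""

    def __init__(self, max_requests: int = 5, window_seconds: float = 10.0) -> None:
        self.max_requests = max_requests
        self.window_seconds = window_seconds
        self._history: dict[str, deque] = defaultdict(deque)

    def is_aggressive(self, ip: str, now: float) -> bool:
        """Record a request made at time `now` (in seconds) and evaluate the rate."""
        history = self._history[ip]
        history.append(now)
        # Drop the requests that have fallen out of the sliding window.
        while history and now - history[0] > self.window_seconds:
            history.popleft()
        return len(history) > self.max_requests


monitor = RequestRateMonitor(max_requests=3, window_seconds=10.0)
for t in (0.0, 1.0, 2.0, 3.0):
    print(t, monitor.is_aggressive("198.51.100.7", now=t))
```

Passing the timestamp explicitly keeps the sketch deterministic; a deployed filter would read a monotonic clock and would also need to exempt well-behaved bots such as search engine crawlers.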

5 Conclusion

The rapid evolution of web technologies has made the structure of modern web applications, and the interaction between their client-side and server-side components, more and more complex, which has led to several security issues. In this context, intrusion detection and prevention systems (IDPS) are among the recent security solutions to protect web applications. IDPS are primarily designed to observe, detect, and prevent malicious activity on the network. However, the characteristics of traditional network attacks are very different from those of web-related attacks. The former target the TCP/IP network layer, while the latter exploit weaknesses of the application layer.

In this paper, we overviewed the core concepts in the area of intrusion detection and prevention. Next, we discussed the main IDPS design challenges for modern web applications, which make it difficult for an IDPS to monitor, detect, and prevent web-related security attacks. Finally, we evaluated four of the most deployed open-source IDPS exclusively proposed for web app security.