
1 Introduction

Big data technologies and algorithms, along with their applications, have attracted significant attention over the past few years. An increasing number of enterprises invest in big data analytics and try to exploit its potential in order to obtain useful insights about their performance and gain a competitive advantage [1]. To this end, the scientific field of data analytics has emerged, going beyond simple raw data analysis on large datasets [1]. Analytics, as a multidisciplinary concept, is defined as the means to acquire data from diverse sources, process them to elicit meaningful patterns and insights, and distribute the results to the proper stakeholders [2].

Data analytics is categorized into three main stages, characterized by different levels of difficulty, value, and intelligence [3]: (i) descriptive analytics, answering the questions “What has happened?”, “Why did it happen?” and “What is happening now?”; (ii) predictive analytics, answering the questions “What will happen?” and “Why will it happen?” in the future; (iii) prescriptive analytics, answering the questions “What should I do?” and “Why should I do it?”. The maturity of the first two stages is substantiated by the large number of research works, associated platforms and business solutions. The current paper investigates the literature on prescriptive analytics and identifies the existing research challenges on this topic. To the best of our knowledge, this is the first literature review on prescriptive analytics.

The rest of the paper is organized as follows. Section 2 presents an overview of prescriptive analytics along with three use cases in order to explicitly show the differences between the three stages of analytics. Section 3 describes our methodology for the literature review, while Sect. 4 presents the results of the literature review. Section 5 provides a discussion of the results and identifies the research challenges, while Sect. 6 concludes the paper.

2 Towards Prescriptive Analytics

Prescriptive analytics is able to suggest (prescribe) the best decision options in order to take advantage of the predicted future, and to illustrate the implications of each decision option [3]. It incorporates the predictive analytics output and utilizes artificial intelligence, optimization algorithms and expert systems in a probabilistic context in order to provide adaptive, automated, constrained, time-dependent and optimal decisions [4,5,6]. Prescriptive analytics involves two levels of human intervention: decision support, e.g. providing recommendations, and decision automation, e.g. implementing the prescribed action [6]. It is the most sophisticated type of business analytics and can bring the greatest intelligence and value to businesses [3]. The effectiveness of the prescriptions depends on how well the underlying models incorporate a combination of structured and unstructured data, represent the domain under study and capture the impacts of the decisions being analysed [3, 5]. In order to show the potential of prescriptive analytics, we illustrate motivating scenarios from three different application domains.

Industry 4.0

Industry 4.0 refers to the current trend of automation and data exchange in manufacturing technologies. For example, consider the case of predictive maintenance, in which sensors generate a multitude of data serving as indicators of equipment degradation. Descriptive analytics algorithms monitor the current condition of the manufacturing system and provide alerts in cases of abnormal behaviour. This is achieved by comparing the actual measurements of several degradation-related parameters against their normal values. When the measurements deviate from the normal values, an alert triggers the predictive analytics algorithms. The alert is evaluated and, if it indicates a potentially hazardous state of the manufacturing equipment, the predictive analytics algorithms generate predictions about the future health state of the manufacturing system, e.g. a prediction of the time-to-failure. On the basis of this prediction, prescriptive analytics algorithms are able to provide recommendations about the optimal mitigating actions and the optimal time for their implementation, such that the expected loss and risk are minimized. The Industry 4.0 scenario is based upon the research works of [7] and [8].
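The three analytics stages of this scenario can be sketched in a few lines of code. The degradation model, thresholds and cost figures below are illustrative assumptions and are not taken from [7] or [8]:

```python
def detect_anomaly(reading, normal_mean=50.0, tolerance=10.0):
    """Descriptive step: flag a sensor reading outside the normal band."""
    return abs(reading - normal_mean) > tolerance

def predict_time_to_failure(reading, failure_level=100.0, degradation_rate=2.0):
    """Predictive step: naive linear extrapolation of the degradation signal."""
    return max(0.0, (failure_level - reading) / degradation_rate)

def prescribe_maintenance(ttf, horizon=15, failure_cost=500.0,
                          maintenance_cost=50.0, revenue_per_period=30.0):
    """Prescriptive step: pick the maintenance time t minimizing expected loss.

    Delaying maintenance earns revenue per period of operation but raises
    the failure probability (here a crude quadratic ramp toward the
    predicted time-to-failure)."""
    best_t, best_cost = 0, float("inf")
    for t in range(horizon + 1):
        p_fail = min(1.0, (t / ttf) ** 2) if ttf > 0 else 1.0
        expected_cost = (p_fail * failure_cost + maintenance_cost
                         - t * revenue_per_period)
        if expected_cost < best_cost:
            best_t, best_cost = t, expected_cost
    return best_t
```

For instance, a reading of 72.0 triggers an alert, yields a predicted time-to-failure of 14 periods, and the prescription schedules maintenance a few periods ahead of the predicted failure, balancing lost production against failure risk.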

Transportation

Traffic congestion increasingly concerns modern, crowded cities, and there are ongoing attempts to relieve city centers of traffic jams. Currently, sensors can detect vehicles passing through monitored areas. These data, along with historical data from traffic monitoring networks, can be utilized for further analysis by descriptive analytics algorithms, which derive outcomes such as induction loop information and vehicle location information in aggregated form. These results feed into the predictive analytics algorithms, which provide predictions about the traffic flow (congestion level) of the system. To do this, they also exploit predictive models that have been developed from historical data and that take into account contextual information (e.g. peak times). The predictions trigger the prescriptive analytics algorithms, which execute actions with the aim of reducing the congestion level proactively (e.g. traffic light control). The actions change the current state of the system and affect its future states in order to maximize the total reward (reduction of congestion). The transportation scenario is based upon the research work of [9].
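A toy sketch of the prediction-to-prescription hand-off in this scenario follows; the congestion model and the proportional green-time heuristic are illustrative assumptions, not the approach of [9]:

```python
def predicted_congestion(counts, capacity=30):
    """Predictive step placeholder: congestion = vehicles beyond capacity."""
    return {road: max(0, c - capacity) for road, c in counts.items()}

def prescribe_green_times(congestion, cycle=60, min_green=10):
    """Prescriptive step: split the signal cycle proportionally to the
    predicted congestion of each approach, respecting a minimum green time
    (rounding may leave a small remainder in general)."""
    total = sum(congestion.values())
    n = len(congestion)
    if total == 0:
        return {road: cycle // n for road in congestion}
    spare = cycle - n * min_green
    return {road: min_green + round(spare * c / total)
            for road, c in congestion.items()}
```

With counts of 50 and 35 vehicles on two approaches, the congested approach receives proportionally more of the 60-second cycle.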

Healthcare

Healthcare is a key domain that can benefit from data analytics due to its regulatory requirements and the large amounts of data that have the potential to improve the quality of healthcare delivery. In several cases, reliable analytics can mean the difference between life and death (e.g. trauma monitoring for blood pressure, operating room monitors for anesthesia). For example, real-time capture of large volumes of data from in-hospital and in-home devices can feed into descriptive analytics algorithms for safety monitoring. When hazardous correlations are detected in streams of physiological data from patients with brain injuries, an alert is received by predictive analytics algorithms, which provide a prediction about a bleeding stroke from a ruptured brain aneurysm. On this basis, prescriptive analytics algorithms provide medical professionals with critical and timely prescriptions in order to aggressively treat complications. The healthcare scenario is based upon the research work of [10].

3 Literature Review Methodology

In this section, we outline the methodology of our literature review, which is based upon the methodology proposed by [11]. We searched the following scientific databases: ACM, ArXiv, Elsevier, Emerald, IEEE and Springer. Since prescriptive analytics is a new and emerging research field, we used only the query term “prescriptive AND analytics”. In the first phase, we queried the scientific databases for papers that contain the query term in their full record, including the full text of the publication. As shown in Fig. 1, there has been near-exponential growth in the use of the term “prescriptive analytics” in publications over the last few years. This trend indicates increasing interest in this domain and constitutes a motivation for our literature review.

Fig. 1. The trend for the “Prescriptive Analytics” term

The first phase of our search resulted in 2,971 papers. Since the first phase of the search includes works that do not necessarily contribute to the field of prescriptive analytics, we conducted a second phase in order to look for research works with the query term in their metadata, i.e. title, abstract, keywords or other metadata of their record. The second phase resulted in 107 papers. The third phase of our search was conducted according to the following inclusion criteria: (i) The papers contribute to the field of prescriptive analytics; (ii) the publication date is between January 2010 and February 2018; (iii) the publication type is journal, book or conference. The third phase resulted in 44 papers, consisting of 13 journal articles and 31 conference papers. The results of the three phases are shown in Table 1.

Table 1. The three phases of search

4 Analysis

4.1 Classification of Reviewed Papers

We classified the reviewed papers into four categories according to their contribution: (i) conceptual models, frameworks and architectures; (ii) algorithms and methods; (iii) information systems; (iv) algorithms and methods along with information systems. This classification, along with the number of papers in each category and the specific references, is shown in Table 2. The fact that most of the reviewed papers propose prescriptive algorithms and/or platforms indicates that the potential of prescriptive analytics is already recognized by the research community, and that researchers focus on exploring aspects of its applicability and utilization.

Table 2. Classification of papers

The most prominent application domains of prescriptive analytics in the reviewed literature are shown in Table 3, while individual approaches for other domains, e.g. aerospace, travelling and the computer industry, have also been proposed. Moreover, we found that nineteen (19) out of the forty-four (44) papers deal with generic approaches for prescriptive analytics, while twenty-five (25) papers deal with domain-specific approaches. This classification points out that the research interest in addressing specific topics with a prescriptive solution is almost equal to the quest for widely applicable prescriptive solutions.

Table 3. Application Domains

4.2 Methods and Techniques for Prescriptive Analytics

A broad part of the reviewed literature proposes optimization methods and techniques; optimization has been considered the most appropriate approach for prescriptive analytics [14, 17]. Indicative methods and approaches include: linear optimization, including mixed-integer, binary integer and fractional programming [16, 19, 26,27,28, 37, 44, 47]; non-linear optimization methods such as binary quadratic and mixed-integer non-linear programming [23, 24]; stochastic optimization for handling uncertainty in the decision-making process [38]; distributionally robust optimization; and Efron's statistical bootstrap [33]. In addition, simulation methods and approaches have been developed as an enabler of prescriptive analytics [2, 13, 14, 17, 20, 50, 51, 54].
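As a toy instance of the binary integer programming formulations cited above, consider choosing a subset of candidate actions that maximizes total benefit under a budget constraint. The surveyed works use dedicated MIP solvers; the exhaustive enumeration and the data below are illustrative simplifications:

```python
from itertools import product

def solve_binary_program(benefits, costs, budget):
    """Maximize sum(b_i * x_i) s.t. sum(c_i * x_i) <= budget, x_i in {0, 1},
    by enumerating all 2^n assignments (feasible only for small n)."""
    n = len(benefits)
    best_x, best_value = None, float("-inf")
    for x in product((0, 1), repeat=n):
        cost = sum(c * xi for c, xi in zip(costs, x))
        if cost <= budget:
            value = sum(b * xi for b, xi in zip(benefits, x))
            if value > best_value:
                best_x, best_value = x, value
    return best_x, best_value
```

For three candidate actions with benefits [10, 6, 4], costs [5, 4, 3] and a budget of 8, the optimal prescription selects the first and third actions.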

Since business data may be non-numeric, business solutions may also rely on qualitative analysis, logic, reasoning, collaboration and negotiation [55]. This encourages the utilization of decision rules and decision trees in the decision-making process. Relevant research works include: decision rules for continuously improving business processes using real-time predictions and recommendations [22]; business rules in combination with a simulation and optimization prescription mechanism [2]; an information system for prescriptive maintenance in which the decision is derived from rules in combination with mathematical functions [50]; and an architecture using proactive event processing rules that combines complex event processing (CEP) engines with predictive analytics [31].
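The rule-based style of prescription can be sketched as follows: each rule pairs a condition on the predicted state with a recommended action, and the first matching rule fires. The rules, field names and actions are invented for illustration:

```python
# Ordered rule base: (condition on predicted state, prescribed action).
RULES = [
    (lambda s: s["predicted_load"] > 0.9, "add server capacity"),
    (lambda s: s["predicted_load"] > 0.7, "defer batch jobs"),
    (lambda s: True, "no action"),  # default rule
]

def prescribe_by_rules(state, rules=RULES):
    """Return the action of the first rule whose condition holds."""
    for condition, action in rules:
        if condition(state):
            return action
```

Rule order encodes priority, so the most severe condition must be listed first; this is the same trade-off faced by the CEP-based architectures cited above.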

Although the role of machine learning in predictive analytics is well established, research works dealing with machine learning in prescriptive analytics are rare. Four (4) of the reviewed papers deploy machine learning techniques: decision trees and real-time Random Forests (RF) to support production maximization and cost minimization of natural gas and hydrocarbon liquids [49]; k-Nearest Neighbors (k-NN), kernel methods, trees and ensembles to construct the weights of a prescription problem [38]; and Random Forest, Bayesian Belief Networks (BBN) and Auto-Regressive Integrated Moving Average (ARIMA) in combination with stochastic simulation to identify significant KPIs and estimate the earnings per share in the computer industry [18]. Finally, twelve (12) papers propose more sophisticated solutions that consist of combinations of optimization, simulation, custom ratings and measures, search policies and other heuristic techniques [8, 20, 25, 27, 32, 34, 39,40,41, 48, 52, 53].
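The weighting idea of [38] can be illustrated with a small sketch: past observations near the current context receive k-NN weights, and the prescription minimizes the weighted historical cost. The newsvendor-style cost function and the data below are illustrative assumptions, not the formulation of [38]:

```python
def knn_weights(history_x, x, k=3):
    """Weight 1/k for the k past contexts nearest to x (1-D distance)."""
    order = sorted(range(len(history_x)), key=lambda i: abs(history_x[i] - x))
    nearest = set(order[:k])
    return [1.0 / k if i in nearest else 0.0 for i in range(len(history_x))]

def prescribe_knn(history_x, history_y, x, decisions, cost, k=3):
    """Choose the decision z minimizing sum_i w_i(x) * cost(z, y_i)."""
    w = knn_weights(history_x, x, k)
    return min(decisions,
               key=lambda z: sum(wi * cost(z, yi)
                                 for wi, yi in zip(w, history_y)))

def newsvendor_cost(z, y, underage=3.0, overage=1.0):
    """Example cost of ordering z units against realized demand y."""
    return underage * max(y - z, 0) + overage * max(z - y, 0)
```

Given past (context, demand) pairs, a query context near the high-demand regime yields a prescription driven by the nearby high-demand observations rather than the global average.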

4.3 Prescriptive Model Lifecycle

Based on a synthesis of the reviewed literature, the prescriptive model lifecycle consists of three conceptual steps: model building, model solving and model adapting. These steps are described below.

Model Building.

Model building may rely on expert knowledge, on data, or on a combination of both. The literature review reveals a clear interest in modelling the problem in the best possible way. The first approach refers to the manual building of the model by an expert based on domain knowledge. The second approach is based on the premise that the optimization problem can be inferred or learned from previously observed decisions taken by an expert [56]; in this sense, the model can be built from the data collected in past cases, in a data-driven way and without any user interference. The third approach learns or mines the model parameters/weights from data and provides them as input to a static model predefined by the domain expert. In the last two approaches, machine learning and rule-based techniques have been used.
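The third approach can be sketched as follows: the expert fixes the model structure, while its parameters are learned from historical data. The simple safety-stock model and the demand figures are illustrative assumptions:

```python
import statistics

def learn_parameters(demand_history):
    """Data-driven step: estimate the demand distribution's parameters."""
    return statistics.mean(demand_history), statistics.stdev(demand_history)

def expert_model(mean, stdev, safety_factor=1.0):
    """Expert-defined structure: order mean demand plus a safety buffer.
    The structure stays fixed; only the learned parameters change."""
    return mean + safety_factor * stdev
```

Re-running `learn_parameters` on fresh data and feeding the result into the unchanged `expert_model` is also a minimal instance of the model adapting step discussed below.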

Model Solving.

Model solving takes place after model building and provides the expected prescription. This step is a well-studied area, and the majority of model solving approaches employ optimization algorithms. Examples include a modification of the Goemans-Williamson MAX-CUT approximation algorithm for solving a binary quadratic programming problem related to price optimization [23], and the gradient-projection algorithm for solving a mixed-integer non-linear optimization problem related to industrial maintenance [24]. Other approaches have also been developed in recent years, such as the evaluation and filtering of rules for recommendation-based business processes [22].

Model Adapting.

Model adapting is conducted in two different ways, according to the approach followed for model building: rebuilding and retraining the model based on the data observed after prescriptions, or updating the parameter values of a static model after mining and analyzing the gathered data. Model adapting usually includes model validation, with the aim of assuring the reliability of the model. For example, feedback and adaptation mechanisms can be utilized in order to validate the accordance of the prescriptions with the system objectives [2].

Table 4 classifies the reviewed papers according to whether model building is conducted solely by the domain expert, solely in a data-driven way, or by a combination of both, as well as according to whether the approach incorporates the step of model adapting. It should be noted that thirty (30) papers recognize and attempt to exploit big data. In thirteen (13) out of the forty-four (44) reviewed papers, model building is conducted based on the knowledge of the domain expert, while in twenty-four (24) papers the domain knowledge is combined with the collected data. Six (6) papers provide a fully data-driven solution for model building. Moreover, six (6) papers consider the development of a mechanism for model adapting: one (1) of them adapts a model that is fully built in a data-driven way, while five (5) adapt models that have been built based on both domain knowledge and data.

Table 4. Model building and model adapting in the reviewed papers

5 Discussion and Research Challenges

Due to the emergence of big data technologies, there is a need for methodologies and algorithms capable of analyzing all these data and deriving useful insights. To this end, in recent years there has been an increasing number of research works dealing with prescriptive analytics. However, the power of data is not yet fully incorporated in the prescriptive analytics solutions proposed in the literature, and the field of prescriptive analytics is still immature due to several challenges.

Prescriptive analytics applications are usually developed in an ad-hoc way with domain-specific prescriptive models, while many generic approaches have been described only at a conceptual level. In some works, the prescriptive model is built by the domain expert but the values of its parameters are obtained from historical data. Since data-driven modelling has only just started to emerge, there is the possibility to go a step further, with prescriptive models that are built automatically by analyzing data in batch mode. Future research can focus on developing widely applicable approaches that are less dependent on domain knowledge.

The dynamic and complex nature of today’s enterprises causes continuous changes to the problem formulation, e.g., new types of constraints have to be added or the problem environment changes. To this end, approaches and algorithms for model adapting are required. Such approaches become even more important in a real-time/streaming computational environment. Currently, the research works using feedback mechanisms for tracking the suggested recommendations and continuously improving the prescriptive models are limited, and the relevant works deal mainly with conceptual frameworks.

Although it is clear from the definition of prescriptive analytics that it is applied on the basis of predictive analytics output, the related works in the literature exhibit only a loose integration with predictive analytics, e.g. [12, 19, 21, 25, 34, 39,40,41, 45, 48, 52]. This research direction would require methods, algorithms and systems enabling the flow of the prediction output to prescriptive analytics and its utilization for decision making under uncertainty ahead of time. To do this, several interoperability issues between predictive and prescriptive analytics should be researched.

Finally, the vast majority of the existing research works in prescriptive analytics deal with “decision support”, in the sense that they provide recommendations and a human implements the prescriptions. Research works dealing with “decision automation”, i.e. the automated implementation of actions, are rather limited. This direction is expected to evolve over the next years due to the emergence of the Internet of Things, leading to a higher level of data analytics maturity.

6 Conclusion

Prescriptive analytics is an emerging and promising field of data analytics that aims to prescribe decisions on the basis of predictive analytics outcomes. There is increasing interest in the literature on this topic due to its potential for advancing data analytics and enabling enterprises to gain a competitive advantage. In this paper, we conducted a literature review on prescriptive analytics, formulated the prescriptive model lifecycle, identified the existing research challenges and discussed potential future directions of research. Due to the emergence of big data technologies, there is a need for methodologies and algorithms capable of deriving useful insights. Currently, prescriptive analytics applications are usually developed in an ad-hoc manner and are usually unable to adapt, through feedback mechanisms, to the modern dynamic and complex business environment. To this end, data-driven model building and adapting has started to emerge in an attempt to build and update the prescriptive model by analyzing the existing data. Finally, there is a loose integration between predictive and prescriptive analytics, while decision automation has not been sufficiently investigated yet.