1 Introduction

In recent years, the prevalence of airborne infectious diseases such as SARS-CoV, influenza, and many others has become one of the major challenges faced by present-day society, owing to rising population density and social interaction. The future of global healthcare systems should depend on early detection of such diseases rather than on delayed intervention and costly treatment.

1.1 Ebola virus disease

Ebola is one of the deadliest and most infectious viruses; it spreads rapidly and can affect a large share of a population. The Ebola virus (EboV) spreads among people through contact with an infected person and causes Ebola virus disease (EVD), a severe viral hemorrhagic fever accompanied by symptoms such as nausea, anorexia, abdominal pain, headache, myalgia, and sore throat (Sureau 1989). The incubation period of EVD varies from 2 to 21 days and, in severe forms of the disease, death may occur within 5–10 days after the onset of illness (Tseng and Chan 2015). According to the World Health Organization (WHO 2016), a total of 28,601 Ebola cases had been detected in Liberia, Guinea, and Sierra Leone by January 2016, of which 11,300 resulted in death. Healthcare professionals engaged in the treatment of EVD patients have unknowingly contracted the virus from those patients and then further facilitated its transmission (William et al. 2014). Lehmann et al. (2016) observed that healthcare professionals hesitate to associate with EVD patients because of the high risk of infection. The 2014 EVD outbreak in West Africa was the largest episode, with a fatality rate of 76.4 % (Fasina et al. 2015). EVD has large epidemic potential: the number of cases can rise quickly, and the disease may be transmitted from one country to another. EVD patients can be cured through proper educational campaigns and treatment, as in the case of the French nurse cured of EboV (Rachah and Torres 2016). Continuous monitoring of an outbreak shifts attention from the disease itself to predicting and preventing the spread of the virus. Hence, there is a need to use antiviral drugs or vaccines together with IT-based strategies to reduce its harmful effects.

1.2 Cloud computing and wireless body area networks

In the modern era, remote detection and continuous long-term monitoring of patients with infectious diseases are an escalating requirement. Existing healthcare systems are unable to control and monitor such diseases efficiently and at minimal cost. An intelligent and efficient system is therefore required to provide ubiquitous healthcare support services in real time using IT infrastructure. A major challenge in any nationwide healthcare system is acquiring health-related information from patients in real time, which in turn requires large storage capacity and high computational power to store and process the data. Recent advances in sensing and distributed technologies such as the Internet of Things (IoT) and cloud computing make it possible to design a smart healthcare system that enables remote, continuous monitoring of patients in an unobtrusive and seamless manner. Such technologies reduce not only the workload of health workers but also the cost of medical healthcare services.

Cloud computing together with a wireless body area network (WBAN) can be efficiently employed for prevention and patient monitoring. A WBAN consists of a group of small, lightweight body sensors attached to the patient’s body to collect vital parameters, which can be used for efficient long-term monitoring of chronic diseases (Sareen et al. 2016a, b; Andreu-Perez et al. 2015). These sensors produce a vast amount of data that needs to be stored and processed in real time. Cloud computing can offer virtually unlimited resources for the remote storage and computation of the big data collected from patients’ WBANs. Through the use of cloud computing, the physiological data generated by WBANs can be shared among doctors, healthcare agencies, and government agencies.

In the healthcare domain, the most challenging task is not only to monitor an outbreak of an infectious disease continuously but also to intervene in real time. To control the spread of the disease, close proximity interactions (CPIs) between infected and uninfected users need to be prevented. For this purpose, RFID technology is used to sense the CPIs between users. Storing and processing such large volumes of sensor data in real time from a large affected population in a specific region, such as a city, state, or country, requires dedicated and scalable resources. Cloud computing provides massive computing power, high scalability, and immense storage. Hence, the integration of WBAN, RFID, and cloud infrastructure can provide a highly reliable healthcare system for monitoring and detecting EboV at its initial stages.

1.3 Cloud computing and WBAN in EVD diagnosis and monitoring

The objective of this work is to design a cloud-based model for predicting and monitoring an EboV outbreak efficiently and in real time using WBAN-based data collection. The proposed system keeps track of the current status of the outbreak and identifies the infected users responsible for the spread of the disease. To achieve these objectives, a real-time Ebola diagnosis and monitoring system is proposed based on IoT, RFID, mobile phone, and cloud computing infrastructure. Initially, each patient is registered through a mobile application by entering personal and contact information. The system automatically generates a unique identification number (UID) and allocates it to each registered patient. The UID is used for all future communication between patients, doctors, and hospitals. EVD symptom data from the body sensors is collected in digital form through the patient’s mobile phone over Bluetooth. The data stream is continuously collected and stored in the cloud database for in-depth analysis. The data is classified using a J48 decision tree, which assigns users to six categories based on their symptoms. Once users are classified, they are monitored continuously using WBAN and RFID. RFID senses close proximity interactions between infected and uninfected patients, and an alert message is generated and sent to the uninfected patient so that such contact can be avoided. Several outbreak metrics, such as clustering coefficient, centrality, and temporal path length, are computed using Temporal Network Analysis (TNA).

1.4 Contributions

The contributions of this paper are summarized as follows:

  • We introduced a scalable and cost-effective cloud-based computing model designed for monitoring and controlling an EboV outbreak. This model integrates WBANs and RFID with cloud computing to capture and process the massive data generated by the sensors in real time.

  • We provided an automatic categorization of patients into different categories using a J48 decision tree. This algorithm is applied periodically for the monitoring of any change in the categories of the users.

  • We proposed the use of RFID technology for capturing the close proximity interactions between users. An alert message is generated by the system and sent to the mobile phone of an uninfected user so that contact with the infected user can be avoided.

  • We proposed a Temporal Network Analysis (TNA) to generate a graph showing the contacts between infected and uninfected users. Using TNA metrics, the infected users or regions are identified that are involved in the spread of the disease.

  • We provided a detailed experimental testing of the proposed model on Amazon EC2 cloud in order to evaluate its performance and accuracy. Our model provides 94 % classification accuracy and 92 % utilization of cloud resources.

The remainder of the paper is organized as follows: Sect. 2 reviews related work on EboV infection and use of sensor technology and cloud computing in the detection and monitoring of EVD patients. A model to monitor and detect the EboV is proposed in Sect. 3. Section 4 presents RFID-based prevention of EVD outbreak. In Sect. 5, we present and analyze the experimental results of our proposed model. Section 6 offers conclusions coming out of this model and possibilities for future work.

2 Related work

Related work is divided into three sections: pandemic EboV infection, mathematical and network models in the Ebola epidemic, and integration of IoT and cloud computing in healthcare services. The first section covers the characteristics, causes, and results of Ebola outbreaks. The second section describes the use of mathematical and network models in EboV infections. The last section presents the integration of IoT and cloud computing in the field of health services.

2.1 Pandemic Ebola virus infection

The first EboV outbreak occurred in 1976 in the Equateur province of Zaire, during which 318 persons were infected and 280 died. Many authors have studied the causes, effects, and precautionary measures of the 1976 outbreak. The history, epidemiology, and infection cycle of EboV were analyzed by Guenno and Galabru (1997), who identified the initial symptoms and their effects on the human body. Takada and Kawaoka (2001) examined the pathogenesis of EboV infection and the available vaccines and effective therapies. Park et al. (2015) evaluated sequences from 232 EboV-infected patients in Sierra Leone and analyzed viral evolution during prolonged transmission between users. Matua et al. (2015) performed an in-depth analysis of the strategies used to control the spread of the outbreak and proposed new techniques to improve the management and control of EboV during future outbreaks. Edelsburg and Shir-Raz (2015) examined the role of the media in the spread of the disease in such a crisis situation. Moghadam et al. (2015) conducted another review of the different species and structures of EboV. Liu et al. (2015a, b) reviewed the techniques that were used to deal with the plague epidemic that occurred in Northeast China, highlighting experience that can be used to fight the current Ebola epidemic in West Africa. Passerini et al. (2016) studied the effect of EboV in males and females and evaluated the differences in case fatality rate, incubation period, duration of hospitalization, clinical signs, and symptoms between the two groups.

2.2 Mathematical and network models in Ebola epidemic

Mathematical and network models have been extensively used for the monitoring and early prevention of this deadly epidemic. Althaus (2014) proposed an SEIR model to predict and analyze the spread of EboV by estimating the basic and effective reproduction numbers of EboV during the 2014 outbreak in West Africa. The model provides real-time estimates of EboV transmission parameters during an ongoing outbreak. Nsoesie et al. (2014) proposed a Dirichlet process model for forecasting and classifying epidemics, based on matching current influenza activity with historical patterns. Lamma et al. (2006) proposed a knowledge-based expert model for monitoring and analyzing dangerous infections using data mining techniques. Burkhead and Hawkins (2015) proposed an agent-based model for monitoring the spread of EboV. Browne et al. (2015) proposed an SEIR model of contact tracing for monitoring Ebola outbreaks using the effective reproduction number. Ivorra et al. (2015) designed a model to analyze the spread of infectious diseases within and between countries, using deterministic spatial-temporal and SEIHRDB methods to predict and control the Ebola outbreak.

Various social-network-based systems have been proposed for preventing Ebola outbreaks. Salathe et al. (2010) analyzed infectious disease transmission using a human contact network. Data is collected from wearable proximity sensors that detect close proximity interactions between individuals, and a dynamic time-varying graph is created and updated regularly. Bansal et al. (2010) explored recent research efforts to study the effect of dynamic contact networks on infectious disease transmission. Rizzo et al. (2016) proposed a model for monitoring the spread of Ebola virus disease based on an activity-driven network of contacts that varies with time. Takaguchi (2015) analyzed dynamic social interactions using temporal networks for monitoring disease spread. Smith et al. (2009) analyzed different kinds of social media networks to develop additional network metrics and analytical tools. Tang et al. (2013) studied the application of TNA metrics to real-world networks and demonstrated that metrics from temporal network analysis provide more accurate information about dynamic contact networks. Isella et al. (2011) analyzed behavioral networks of close proximity interactions in the context of static and dynamic processes.

2.3 Integration of IoT and cloud computing in healthcare services

Mukhopadhyay (2015) reviewed different human activity monitoring systems that use wearable sensors. Zheng et al. (2014) reviewed the sensing and wearable technologies that can be used to develop efficient pervasive healthcare systems. Liu et al. (2015a, b) proposed an architecture for smart urban sensing. This framework provides service APIs that perform data collection, processing, and transmission. Urban sensing applications deployed on the cloud customize their data acquisition, transmission, and processing functions through these service APIs, which reduces their complexity. Kaushik et al. (2016) explored different strategies to control the transmission of EboV as well as diagnostic tools to detect the virus accurately and rapidly. The authors also demonstrated the use of miniaturized sensing technology to achieve point-of-care EboV detection. Barbosa et al. (2016) demonstrated the use of sensing technologies, smartphones and networks, cloud computing, and the Internet of Things (IoT) to develop point-of-care testing devices that diagnose patients accurately and in real time. Sarangan et al. (2008) proposed a framework to increase the speed of reading RFID tags. Ma et al. (2015) proposed a personal communication system using mobile phone, RFID, and cloud computing technologies. The position of the mobile device is monitored using RFID, and the session initiation protocol establishes a Voice over Internet Protocol (VoIP) connection for the mobile phone.

Laskowski et al. (2011) proposed an architecture to examine the spread of influenza virus in an emergency department in Winnipeg, Canada. They proposed agent-based modeling in which the system is modeled as a collection of people and objects. Kumar et al. (2016) proposed an RFID-enabled authentication scheme for cloud-based healthcare systems using a Petri Nets-based authentication model. Chen et al. (2010) proposed a system that incorporates coded information which is dynamically stored in the RFID tag using mobile agents. It enables other applications to perform on-demand activities for different objects in different situations and can be useful in healthcare applications. Zhang and Liu (2016) reviewed the use of biosensors and bioelectronics on a smartphone for biochemical detection. Gope and Hwang (2016) proposed a secure IoT-based healthcare system using body sensor networks. They used the offset codebook authenticated encryption scheme, which provides fast and secure data communication. Hassan et al. (2017) proposed a network model using a wireless body area network and cloud computing to manage patient data in the form of text, image, and voice on the cloud. Quwaider and Jararweh (2016) proposed a model for public health awareness using body sensors and cloud computing. The big data generated by the sensors is processed using a MapReduce infrastructure to detect abnormalities in the data in real time. Mamun et al. (2017) proposed a framework for detecting and monitoring Parkinson’s patients using cloud computing. Such patients can be monitored remotely by doctors who diagnose their voice signals over the cloud. Patients can send their voice samples through their mobile phones regardless of their location. Zhang et al. (2015) proposed a cluster-based framework for monitoring and controlling epidemics using smartphone-based body area networks. In this model, the population is grouped into clusters, and epidemic control strategies are applied at cluster level based on social contact networks. Fabian et al. (2014) proposed a framework for secure sharing of patient data among different organizations. In this model, a secret sharing scheme is used to decompose data across multiple clouds. Abbas et al. (2016) proposed a cloud-based model for risk assessment of different types of diseases using social network analysis techniques. The framework also allows users to seek advice from health experts available on Twitter. Chen et al. (2015) proposed an intelligent emotion interactive system using wearable sensors and cloud infrastructure, which provides healthcare services in both physiological and psychological aspects. Botta et al. (2016) reviewed the integration of cloud computing and IoT and explored various application scenarios in which the combination of these technologies can be effectively used.

3 Proposed model

The proposed model to detect and monitor the spread of EboV is shown in Fig. 1. Our model is mainly based on the continuous remote monitoring of infected patients in real time using cloud computing. Table 1 lists the abbreviations that are used in the proposed model definition and construction. Table 2 represents the various tasks that will be performed by our proposed model.

Fig. 1 An architecture of the proposed model

Table 1 Abbreviations used in the system definition and construction
Table 2 Task flow of EVD detection using cloud computing and WBAN

3.1 Data collection component

The data collection component collects personal information, vital body symptoms, and social contact information simultaneously. Each user is first registered with the system by entering a mobile number and other personal details through the user’s mobile phone. A UID is generated for each user and is used in all future communication. The personal information of the users is stored in the EVD database, as shown in Table 3.

The primary symptoms, such as body temperature and blood pressure, are captured through the WBAN and transmitted to the mobile phone via Bluetooth, from where the data is forwarded to the cloud server over WiFi or 3G/4G in real time. At the same time, users can enter their secondary and advanced symptoms through the interface provided by the mobile application. The value for each symptom is entered as ‘Y’ or ‘N’. Once the user has entered the responses for these symptoms, the data is sent to the cloud.

Scalable storage is proposed for the data generated by the WBAN so that big data can be handled efficiently. Table 4 shows the attributes of EVD symptoms and the respective responses collected from different users. These attributes are categorized as primary, secondary, and advanced symptoms. Secondary symptoms may be present in any user, depending on the user’s condition. As EboV replicates in the body, it produces advanced symptoms that indicate a worsening condition, so that immediate hospitalization and treatment are required. A user with advanced symptoms is highly infectious, and healthcare workers and uninfected users must take precautions to avoid direct contact with that user.

The physical social interactions between users, which may cause the epidemic to spread, are captured through RFID tags attached to the users’ bodies. A user carries a mobile phone with an RFID reader that senses the RFID tag, and the information is transmitted to the cloud. An Android-based application is designed to upload the aggregated data to the cloud. Table 5 shows the close proximity interaction attributes of different users, which are used to create or update the TNA graph. The data collection component thus gathers three types of information from the patient: (1) EVD symptoms; (2) personal attributes; (3) close proximity interactions. The personal attributes of a patient need to be kept confidential. The proposed system incorporates a secret sharing scheme to hide the personal information of the user (Sareen et al. 2016a, b).

Table 3 Personal attributes of users suffering from Ebola virus
Table 4 Symptoms of Ebola virus disease
Table 5 Close proximity interaction attributes of users

3.2 Data classification component

This component uses a J48 decision tree to classify each user, based on EVD attribute data, into one of six categories: U (uninfected), S (susceptible), E (exposed), I (infectious), H (highly infectious), or R (recovered). A decision tree is used because it graphically displays the classification process from the given EboV attributes to the output categories. Weka 3.6 (Hall et al. 2009), a data mining package containing a collection of machine learning algorithms, is used to generate the J48 decision tree. As shown in Fig. 2, generated by Weka 3.6, a user is placed in category S if he has no infection but has low immunity along with a cough or a sore throat. A patient in category E shows mild fever along with a cough or a sore throat; during this phase, the level of infection in the body is low. If fever and sore throat are severe and are accompanied by secondary symptoms such as headache or body ache, the patient is infectious and falls into category I. At this stage, the patient carries enough infection to transmit it to other susceptible individuals. In category H, patients show severe symptoms in addition to the symptoms of category I. Finally, once the patient’s immune system has cleared the infection, the patient is no longer infectious and falls into category R (recovered). A user is treated as uninfected if none of the above conditions holds. The life cycle of EVD through these states is shown in Fig. 3.
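The category rules described above can be sketched as a hand-written rule function. This is only an illustration of the kind of decision logic a J48 tree might encode, not the tree learned by Weka: the field names, the two fever thresholds, and the order of the tests are all assumptions made for this example.

```python
from dataclasses import dataclass

# Hypothetical thresholds; the paper does not state numeric cut-offs.
MILD_FEVER_C = 37.5
HIGH_FEVER_C = 38.6


@dataclass
class Symptoms:
    # Field names are illustrative, not taken from the paper's Table 4.
    temperature_c: float        # body temperature reported by the WBAN
    cough_or_sore_throat: bool  # primary symptom flag ('Y'/'N' in the app)
    low_immunity: bool
    secondary: bool             # e.g. headache, body ache
    severe: bool                # severe (advanced) symptoms
    recovered: bool             # infection cleared after an earlier episode


def classify(s: Symptoms) -> str:
    """Return one of U, S, E, I, H, R following the prose rules."""
    if s.recovered:
        return "R"
    fever = s.temperature_c >= MILD_FEVER_C
    high_fever = s.temperature_c >= HIGH_FEVER_C
    if high_fever and s.cough_or_sore_throat and s.secondary:
        # Category H is category I plus severe symptoms.
        return "H" if s.severe else "I"
    if fever and s.cough_or_sore_throat:
        return "E"
    if s.low_immunity and s.cough_or_sore_throat:
        return "S"
    return "U"
```

In the proposed system the tree is learned from data rather than written by hand, so the actual split attributes and thresholds would come from the trained model.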

Fig. 2 A tree visualization of classification algorithm in Weka

Fig. 3 Life cycle of an SEIHR model for EVD patients

3.3 Monitoring component for Ebola virus infected users

EboV-infected users require continuous monitoring for at least 21 days in consultation with the relevant health department. Monitoring refers to the regular examination of the treatment and symptoms of individual users, so that a complete progress history of each patient can be maintained by the system. Patients are monitored at time intervals that depend on the infection category assigned by the J48 decision tree. A time interval of 2 h is chosen for highly infectious patients, as they show severe symptoms and need to be monitored more frequently. However, the monitoring interval can be changed in consultation with a specialized doctor. Table 6 shows the monitoring time interval for the different categories of infected patients.

Infected patients are continuously monitored and examined until they have completely recovered from the infection. Notifications and alert messages are generated by the system and sent to the mobile phones of infected patients. Alert messages are also sent to nearby hospitals or healthcare agencies, depending on the GPS location of the patient’s mobile phone. The proposed system performs the classification process periodically to re-evaluate the category of each patient, as shown in Algorithm 1. If the category of a patient changes, alert messages are generated by the system and sent to the user and the nearby hospital. The patient record is also updated accordingly.
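The periodic re-evaluation described above can be sketched as follows. This is a minimal illustration, not a reproduction of Algorithm 1: only the 2 h interval for category H comes from the text, while the other intervals, the record layout, and the alert strings are placeholders assumed for this example.

```python
from datetime import datetime, timedelta
from typing import List, Optional

# Only the 2 h value for category H is stated in the text; the remaining
# values are placeholders standing in for Table 6.
MONITORING_INTERVAL_H = {"H": 2, "I": 4, "E": 8, "S": 12}


def next_check(category: str, now: datetime) -> Optional[datetime]:
    """Time of the next examination, or None if no monitoring is scheduled."""
    hours = MONITORING_INTERVAL_H.get(category)
    return None if hours is None else now + timedelta(hours=hours)


def reclassify(record: dict, new_category: str, now: datetime) -> List[str]:
    """Update a patient record in place and return the alerts to send."""
    alerts = []
    old_category = record["category"]
    if new_category != old_category:
        record["category"] = new_category
        record["history"].append((now, old_category, new_category))
        change = f"{record['uid']}: category {old_category} -> {new_category}"
        alerts.append("to patient: " + change)
        alerts.append("to nearby hospital: " + change)
    record["next_check"] = next_check(new_category, now)
    return alerts
```

A scheduler would call `reclassify` whenever `next_check` is due, passing the category produced by the latest run of the classifier.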

Table 6 Monitoring interval of EVD infected patients
Algorithm 1

3.4 Controlling the spread of Ebola virus outbreak

Controlling the spread of an Ebola outbreak is one of the important steps in our proposed model. Temporal Network Analysis (TNA) is used in our model: each user is represented as a node, and an edge is formed between two users who have close proximity interactions (CPIs). Different node colors represent different categories of infection. TNA plays an important role in describing the state of an epidemic. With the help of the TNA graph, the evolution of the epidemic can be predicted and the infected users most responsible for the spread of the disease can be identified. Gephi 0.9.1 (Fu et al. 2016) is used to generate the TNA graph. Gephi is open-source software for visualizing and analyzing temporal network graphs. It uses a 3D render engine to display graphs in real time and speed up exploration. Using this tool, infected patients and their connections with susceptible or uninfected users can be depicted effectively, as shown in Fig. 4.

Fig. 4 Visualization of temporal network graph in Gephi 0.9.1. Snapshots are taken at time intervals: (a) \(t_1\) = 250 s, \(t_2\) = 500 s; (b) \(t_1\) = 750 s, \(t_2\) = 1000 s; and (c) \(t_1\) = 1500 s, \(t_2\) = 1750 s

4 RFID based Ebola outbreak prevention

Some pathogens, such as EboV, can be transmitted through different routes. The main modes of transmission of EboV infection are airborne and via droplets. In airborne transmission, the pathogens are released by an infectious user through coughing and breathing. The air carries the pathogens up to a certain distance, depending on the environmental conditions, where they can be inhaled by an uninfected user. Droplets from an infected user are transmitted to an uninfected user when the two come into contact, which makes CPIs highly relevant for the spread of the virus. In the proposed architecture, identifying close-range proximity or contact between infected and uninfected users is therefore of paramount importance for preventing the spread of an EVD outbreak. A clear picture of the network structure, showing the contacts between infected and uninfected users, will help government and healthcare agencies to control the outbreak as early as possible.

RFID is proposed to detect high-resolution proximity between infected and uninfected users. RFID devices exchange radio waves when they come into proximity with each other, and RFID is one of the most promising technologies for automatic identification of objects (Read et al. 2012). The small size, light weight, and long battery life of recent RFID devices make them ideal for social network studies. RFID tags are attached to the chests of users in the geographic area being monitored for an EVD outbreak, so that contacts are detected only when persons approach each other (e.g., face-to-face interactions). A mobile phone with an RFID reader is used to sense the RFID tag carried by another user. Whenever an uninfected user carrying the mobile phone comes into contact with an infected user wearing an RFID tag, the mobile phone senses the tag and identifies the presence of the infected user. The two users must be within 1–2 m of one another. This threshold is used to sense only those CPIs during which EboV can be transmitted (Vanhems et al. 2013). The contact details captured by the mobile phone are sent to the cloud via a 3G/4G internet connection for storage, processing, and continuous monitoring. An alert message is automatically generated by the system and sent to the mobile phone of the uninfected user. The objective is to avoid contact with an infected patient and thereby prevent the epidemic from spreading. Proximity detection is performed periodically, and each RFID tag sends contact information to the mobile phone of another user every few seconds (Stehle et al. 2011). An interval of 20 s is used, over which proximity can be evaluated with a confidence level of 99 % (Cattuto et al. 2010). Figure 5 shows the architecture of close proximity interaction between infected and uninfected users.
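The alerting rule described above can be sketched as a simple filter over tag readings. The reading format, the availability of a distance estimate from the reader, and the choice of which categories count as infected are all assumptions made for this illustration; only the 2 m upper bound is taken from the text.

```python
# Assumed set of categories that should trigger an alert.
INFECTED = {"E", "I", "H"}
# Upper end of the 1-2 m proximity range given in the text.
MAX_RANGE_M = 2.0


def proximity_alerts(readings, category_of):
    """Return one alert per in-range contact of an uninfected reader
    with an infected tag.

    readings: iterable of (reader_uid, tag_uid, distance_m, timestamp_s),
              a hypothetical record format.
    category_of: dict mapping UID to its current category letter.
    """
    alerts = []
    for reader_uid, tag_uid, distance_m, timestamp_s in readings:
        if distance_m > MAX_RANGE_M:
            continue  # too far apart to count as a CPI
        tag_infected = category_of.get(tag_uid) in INFECTED
        reader_infected = category_of.get(reader_uid) in INFECTED
        if tag_infected and not reader_infected:
            alerts.append(
                {"to": reader_uid, "about": tag_uid, "time": timestamp_s}
            )
    return alerts
```

In a deployment the readings would arrive every 20 s from each phone, and repeated alerts for the same pair would probably need to be rate-limited; that policy is not specified in the text and is left out here.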

Fig. 5 Close proximity interaction between infected and uninfected user

4.1 Creating temporal network graph

In the TNA graph, an edge between two nodes (users) appears or disappears depending on whether the two users are in proximity to each other at a specific time. A graph built from interactions over time therefore changes its structure continuously, following the users’ activity. Recent technological advances such as RFID and mobile devices further support real-time gathering of information on human-to-human interactions. Once the category of each user has been determined by the classification component, a TNA graph is created and then updated whenever new CPI data is received from users. Algorithm 2 is used to create or update the TNA graph from the CPI data generated by RFID in real time. From the TNA graph, conclusions can be drawn that help healthcare agencies to control the Ebola outbreak.
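One way to picture the create-or-update step is to bucket CPI records into fixed time windows, each window holding the edges present during it. The sketch below assumes records of the form (source, target, start, end) with times in seconds and a 20 s window; it is an illustration under those assumptions, not a reproduction of Algorithm 2.

```python
from collections import defaultdict


def update_snapshots(snapshots, contacts, window_s=20):
    """Add CPI records to a dict {window_index: set of undirected edges}.

    contacts: iterable of (source, target, start_s, end_s) records.
    An edge is present in every window that the contact overlaps and is
    absent from windows where no contact was reported.
    """
    for u, v, start_s, end_s in contacts:
        if u == v:
            continue  # ignore self-contacts
        first = int(start_s) // window_s
        last = max(first, (int(end_s) - 1) // window_s)
        for window in range(first, last + 1):
            snapshots[window].add(frozenset((u, v)))
    return snapshots


# Create the graph from a first batch, then update it with a later batch.
snapshots = defaultdict(set)
update_snapshots(snapshots, [("u1", "u2", 0, 40)])
update_snapshots(snapshots, [("u2", "u3", 20, 40)])
```

Here the edge u1-u2 is present in windows 0 and 1, while u2-u3 appears only in window 1, which is the appear/disappear behavior described in the text.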

Algorithm 2

4.2 Temporal network graph metrics

Identifying the critical users (nodes) responsible for the spread of EboV is an important step of our model. An infected user with high geodesic proximity to other users can spread EboV quickly to a large number of users. Some of the important metrics that can be derived from TNA graphs are discussed in this section.

Definition 1

Characteristic temporal path length. The characteristic temporal path length represents how fast EboV can be transmitted from an infected user to another user in the network. A small value indicates faster transmission of EboV. It is defined as the mean temporal distance over all pairs of nodes:

$$\begin{aligned} L = \frac{1}{N(N-1)} \sum _{ij} d_{ij} \end{aligned}$$

where N is the number of nodes in the node set \(V = \{ 1,2,\ldots ,N \}\) and \(d_{ij}\) represents the length of the temporal shortest path from node i to node j. The temporal global efficiency of a time-varying graph can be computed as follows:

$$\begin{aligned} E = \frac{1}{N(N-1)} \sum _{ij} \frac{1}{d_{ij}} \end{aligned}$$
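The two quantities above can be illustrated on a small sequence of snapshots. The sketch below makes several assumptions that the text does not fix: the temporal distance \(d_{ij}\) is measured in snapshots starting from the first one, the infection can cross at most one edge per snapshot, unreachable pairs are skipped when averaging the path length, and they contribute zero to the efficiency.

```python
import math


def temporal_distances(nodes, snapshots):
    """dist[i][j] = number of snapshots until j is first reached from i.

    snapshots: list of lists of undirected edges (u, v), in time order.
    """
    dist = {}
    for src in nodes:
        reached = {src: 0}
        for t, edges in enumerate(snapshots, start=1):
            frontier = set(reached)  # nodes reached before this snapshot
            for u, v in edges:
                if u in frontier and v not in reached:
                    reached[v] = t
                elif v in frontier and u not in reached:
                    reached[u] = t
        dist[src] = {j: reached.get(j, math.inf) for j in nodes if j != src}
    return dist


def path_length_and_efficiency(nodes, snapshots):
    """Return (L, E) under the conventions stated in the lead-in."""
    dist = temporal_distances(nodes, snapshots)
    values = [d for row in dist.values() for d in row.values()]
    finite = [d for d in values if math.isfinite(d)]
    length = sum(finite) / len(finite) if finite else math.inf
    efficiency = sum(1.0 / d for d in finite) / len(values) if values else 0.0
    return length, efficiency


# Three users: a meets b in the first snapshot, b meets c in the second.
L, E = path_length_and_efficiency(["a", "b", "c"], [[("a", "b")], [("b", "c")]])
```

In this example c can reach b but never a, because the a-b contact happened before c was in contact with anyone; such asymmetry is what distinguishes temporal paths from static ones.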

Definition 2

Temporal correlation coefficient. Finding the probability that clusters of Ebola-infected patients form in a region is of high importance, because it helps government agencies to isolate that region and stop all travel from it. The temporal correlation coefficient (TCC) is used in the proposed architecture to identify the probability of cluster formation. It can be computed as:

$$\begin{aligned} TCC = \frac{1}{N} \sum _i C_i = \frac{1}{N} \sum _i \frac{1}{M-1} \sum _{m=1}^{M-1} C_i (t_m, t_{m+1}) \end{aligned}$$

where M is the number of snapshots of the time-varying graph, \(a_{ij}(t_m)\) equals 1 if users i and j are in contact at time \(t_m\) and 0 otherwise, and \(C_i (t_m, t_{m+1})\) is the topological overlap of the neighborhood of infected user i in the time interval \([t_m, t_{m+1}]\), defined as follows:

$$\begin{aligned} C_i (t_m, t_{m+1}) = \frac{\sum _j a_{ij} (t_m) a_{ij} (t_{m+1})}{\sqrt{\left[ \sum _j a_{ij} (t_m)\right] \left[ \sum _j a_{ij} (t_{m+1})\right] }} \end{aligned}$$
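The computation can be sketched directly from the two formulas. The snapshot representation (one dict of neighbor sets per time step) is an assumption of this example, as is the convention that the overlap is zero when a user has no contacts in either of the two snapshots, a case in which the formula itself is undefined.

```python
import math


def topological_overlap(neigh_a, neigh_b):
    """C_i(t_m, t_m+1) for one user, given the neighbor sets at both times."""
    if not neigh_a or not neigh_b:
        return 0.0  # assumed convention for an empty neighborhood
    return len(neigh_a & neigh_b) / math.sqrt(len(neigh_a) * len(neigh_b))


def temporal_correlation_coefficient(nodes, snapshots):
    """TCC over a list of snapshots, each a dict {node: set of neighbors}."""
    M = len(snapshots)
    if M < 2 or not nodes:
        return 0.0
    total = 0.0
    for i in nodes:
        overlaps = [
            topological_overlap(
                snapshots[m].get(i, set()), snapshots[m + 1].get(i, set())
            )
            for m in range(M - 1)
        ]
        total += sum(overlaps) / (M - 1)  # C_i averaged over time
    return total / len(nodes)


# Two snapshots: a keeps its contact with b and gains a contact with c.
snaps = [
    {"a": {"b"}, "b": {"a"}},
    {"a": {"b", "c"}, "b": {"a"}, "c": {"a"}},
]
tcc = temporal_correlation_coefficient(["a", "b", "c"], snaps)
```

For this example the per-user overlaps are \(1/\sqrt{2}\) for a, 1 for b, and 0 for c, so the TCC is their mean, roughly 0.57.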

Definition 3

Temporal betweenness centrality. Temporal betweenness centrality is an effective metric that shows the level of involvement of an infected user in spreading the outbreak. An infected user with a large number of neighbors contributes more to the spread of the outbreak. The temporal betweenness centrality (\(TC^B\)) of a node i is the fraction of temporal shortest paths passing through node i and is defined as follows:

$$\begin{aligned} TC^B_i = \sum _{j \in V} \sum _{k \in V, k \ne j} \frac{\sigma _{jk}(i)}{\sigma _{jk}} \end{aligned}$$

where \(\sigma _{jk}\) is the number of temporal shortest paths from user j to user k, \(\sigma _{jk} (i)\) is the number of such temporal shortest paths that pass through the infected user i.

Definition 4

Temporal closeness centrality Sometimes a user is not in direct contact with an Ebola-infected patient but knows another user who is in direct contact with an Ebola-infected patient. The closeness centrality of a user i describes how close user i is to all other users, infected or uninfected. It can be computed as the inverse of the average length of the temporal shortest paths from user i to every other user j:

$$\begin{aligned} TC^C_i = \frac{N-1}{\sum _j d_{ij}} \end{aligned}$$

where \(d_{ij}\) is the length of the temporal shortest path from user i to j.
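The following minimal sketch shows one way to obtain temporal distances from a list of timestamped contacts and to turn them into the closeness value defined above. It assumes that the temporal distance is the earliest arrival time of a time-respecting path that starts at time 0, that contacts are undirected, and that a user reached at time t can pass the infection on only at a later time. These are our simplifying assumptions, and all names and data are hypothetical.

```python
import math

def temporal_distances(contacts, nodes, source):
    """Earliest arrival time from `source` to every node, starting at time 0.

    contacts: iterable of (u, v, t) undirected contacts with integer t >= 1.
    """
    arrival = {v: math.inf for v in nodes}
    arrival[source] = 0
    for u, v, t in sorted(contacts, key=lambda c: c[2]):
        # Strict inequality: a node reached at time t cannot forward at time t.
        if arrival[u] < t < arrival[v]:
            arrival[v] = t
        elif arrival[v] < t < arrival[u]:
            arrival[u] = t
    return arrival

def temporal_closeness(contacts, nodes, i):
    """(N - 1) divided by the sum of temporal distances from user i."""
    d = temporal_distances(contacts, nodes, i)
    total = sum(d[j] for j in nodes if j != i)
    # An unreachable user makes the sum infinite, which yields closeness 0.
    return (len(nodes) - 1) / total if total > 0 else 0.0

# Hypothetical contact list: (user, user, time step).
NODES = ["a", "b", "c"]
CONTACTS = [("a", "b", 1), ("b", "c", 2), ("a", "c", 5)]

if __name__ == "__main__":
    for user in NODES:
        print(user, temporal_closeness(CONTACTS, NODES, user))
```

In this toy example, user a reaches user c at time 2 through user b, whereas user c reaches user a only at time 5, so the two users have different closeness values even though the contacts are undirected.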

5 Experiment setup and performance analysis

We searched extensively for publicly available, symptom-based data on real EVD patients but were not able to retrieve any such data to test our proposed model. Therefore, synthetic data is generated to conduct the experiments and the performance evaluation of the proposed model. Our experiment is divided into the following segments:

  • Synthetic data generation.

  • Classification of synthetic data using J48 decision tree.

  • Testing of the proposed model on Amazon EC2 cloud.

  • Computation of outbreak metrics using TNA.

  • Cost analysis.

5.1 Synthetic data generation

Since symptom-based data for Ebola patients is not available for the proper evaluation of the proposed model, synthetic data is generated in such a way that all possible combinations of symptoms are covered. Table 7 shows the probability with which each EboV symptom is incorporated in a newly generated case while creating the synthetic dataset for Ebola virus. Algorithm 3 is designed to create this patient dataset.
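The idea of drawing each symptom independently with a fixed probability can be sketched as follows. This is not Algorithm 3 itself. The symptom list is taken from the introduction, but the probability values are hypothetical placeholders and not the values of Table 7, and the function name and case layout are our own.

```python
import random

# Hypothetical placeholder probabilities, NOT the values of Table 7.
SYMPTOM_PROBABILITIES = {
    "fever": 0.9,
    "headache": 0.5,
    "nausea": 0.4,
    "myalgia": 0.4,
    "sore_throat": 0.2,
}

def generate_cases(n_cases, probabilities, seed=0):
    """Generate synthetic cases; each symptom is an independent Bernoulli draw."""
    rng = random.Random(seed)  # fixed seed keeps the dataset reproducible
    cases = []
    for case_id in range(n_cases):
        case = {"id": case_id}
        for symptom, p in probabilities.items():
            case[symptom] = rng.random() < p
        cases.append(case)
    return cases

if __name__ == "__main__":
    dataset = generate_cases(5000, SYMPTOM_PROBABILITIES)
    print(dataset[0])
```

With independent draws and a few thousand cases, every combination of symptoms with non-negligible probability is expected to appear in the dataset, although rare combinations may still be missing.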

Data about the proximity contacts generated by RFID is also required for the appropriate evaluation of the monitoring process. We have used a real dataset measured by the SocioPatterns infrastructure (SocioPatterns 2016) that contains the contact details of students during five days in December 2013. The file contains 188,508 entries, and each entry describes a close proximity interaction (CPI) between two students during a 20-s interval. Each line contains the source, target, start time, and end time, where source and target are the IDs of the students that come within a proximity of 1–2 m during the interval between the start time and the end time (Mastrandrea et al. 2015). Algorithm 4 is designed to generate synthetic data of 2 million users by mapping the CPI data between students at different time intervals onto the 5000 generated Ebola cases. This data is used to create the TNA graph.

Table 7 Probabilities set for Ebola virus symptoms

5.2 Classification of synthetic data using J48 decision tree

Once the 5000 cases of Ebola virus are generated by Algorithm 3, they are classified into different categories using the J48 decision tree in Weka 3.6 (Hall et al. 2009). The decision tree created by Weka 3.6 is shown in Fig. 2. A 10-fold cross-validation is applied to evaluate the performance of the J48 decision tree. The 5000 cases are tested in Weka 3.6, and the statistical results are shown in Tables 8, 9 and 10. The decision tree classifies the users with an accuracy of 94 %. Table 8 shows the detailed accuracy for each category classified by the J48 decision tree.
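To make the evaluation protocol concrete, the sketch below shows how a 10-fold cross-validation partitions 5000 cases: each case is used for testing exactly once and for training nine times. It only illustrates the splitting step and does not reproduce Weka's implementation, which may additionally stratify the folds by class. The function name is hypothetical.

```python
import random

def k_fold_indices(n_samples, k=10, seed=0):
    """Yield (train, test) index lists for k-fold cross-validation."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    # Fold f takes every k-th shuffled index, starting at offset f.
    folds = [indices[f::k] for f in range(k)]
    for f in range(k):
        test = folds[f]
        train = [idx for g in range(k) if g != f for idx in folds[g]]
        yield train, test

if __name__ == "__main__":
    for fold, (train, test) in enumerate(k_fold_indices(5000)):
        print(fold, len(train), len(test))
```

With 5000 cases and ten folds, every round trains on 4500 cases and tests on the remaining 500, and the reported accuracy is the average over the ten rounds.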

The true positive (TP) rate, also known as sensitivity, is the proportion of Ebola cases of a category that are correctly classified by the classifier. The false positive (FP) rate, which equals one minus the specificity, is the proportion of cases wrongly assigned to a category by the classifier. The J48 classification algorithm produces a high TP rate of 0.941 and a low FP rate of 0.054. The relevance of the results is measured by two parameters, precision and recall, for which the proposed classification algorithm achieves high values of 0.901 and 0.912, respectively. The other statistical parameters, F-Measure and ROC area, both summarize classification accuracy, and an algorithm with higher values of F-Measure and ROC area is more accurate. Our J48 decision tree provides an F-Measure of 0.880 and an ROC area of 0.874. Hence, the use of the J48 decision tree in our proposed architecture is justified.
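The relation between these measures can be made explicit with a short sketch that derives them from the four counts of a two-class confusion matrix. The counts used below are hypothetical and are not the values of Table 10, and the function name is our own.

```python
def binary_metrics(tp, fp, fn, tn):
    """Standard rates derived from a two-class confusion matrix."""
    tp_rate = tp / (tp + fn)        # sensitivity, identical to recall
    fp_rate = fp / (fp + tn)        # equals 1 - specificity
    precision = tp / (tp + fp)
    f_measure = 2 * precision * tp_rate / (precision + tp_rate)
    return {
        "tp_rate": tp_rate,
        "fp_rate": fp_rate,
        "specificity": 1.0 - fp_rate,
        "precision": precision,
        "recall": tp_rate,
        "f_measure": f_measure,
    }

if __name__ == "__main__":
    # Hypothetical counts, NOT taken from the paper's confusion matrix.
    for name, value in binary_metrics(tp=90, fp=5, fn=10, tn=95).items():
        print(f"{name}: {value:.3f}")
```

For a multi-class problem such as the categorization of users, these rates are computed per category by treating that category as the positive class and all others as negative.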

Table 8 Category wise detailed accuracy for J48 decision tree in Weka 3.6
Table 9 Summary of tenfold cross-validation of J48 decision tree tested in Weka 3.6
Table 10 Confusion matrix of J48 decision tree in Weka 3.6

5.3 Testing of the proposed model on Amazon EC2 cloud

The performance of the proposed model was evaluated in real time by hosting it on the cloud. The synthetically generated Ebola cases are stored in the cloud provided by Amazon EC2. Compute-optimized c3.xlarge instances (Amazon 2016) are used to set up the application on the cloud. Synthetic data of 2 million users is used to evaluate the performance. Initially, the system was started with 10,000 requests; then, after every 5 min, the number of requests to the system was increased by 10,000, and the system performance was studied for a total experiment time of 100 min.
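The load profile described above can be written down as a simple schedule. The sketch below assumes that the increase is applied at the start of each new 5-min interval and that no further increase happens at the 100-min mark. Both points are our reading of the description, and the function name is hypothetical.

```python
def request_schedule(initial=10_000, step=10_000, interval_min=5, total_min=100):
    """Return (interval start in minutes, number of requests) pairs."""
    return [
        (start, initial + step * (start // interval_min))
        for start in range(0, total_min, interval_min)
    ]

if __name__ == "__main__":
    for start, requests in request_schedule():
        print(f"minute {start:3d}: {requests:,} requests")
```

Under these assumptions the experiment covers 20 intervals, and the load in the last interval is 200,000 requests.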

Figure 6a shows how the resources utilized by the proposed model vary with the number of users. The system reaches saturation very quickly once the number of users reaches 25,000, because more resources are consumed to process them. Similarly, the response time of the proposed model for different numbers of users is shown in Fig. 6b. The response time is low for a small number of users, as there are fewer records in the database on which to perform any operation. Figure 6c shows the latency of the proposed model for different numbers of users.

Fig. 6 Performance analysis of proposed model: a resource utilization of system, b response time of system, c latency time of system

The accurate categorization of users is a vital step in our proposed model. Other classification algorithms, namely random tree, Naive Bayes, and REPTree, are also tested in Weka 3.6 to compare their performance with our proposed J48 decision tree. We have evaluated the classifiers on the Amazon EC2 cloud for the given synthetic dataset of 2 million users and compared their performance. Table 11 shows the comparative classification performance of the different classification models tested in Weka 3.6. The J48 decision tree provides higher accuracy than the other classifiers. Figure 7 illustrates the comparison of the accuracy of the different classification algorithms.

Table 11 Detailed accuracy of J48 tree and other models for the classification of EVD patients
Fig. 7 Performance analysis of classification algorithms on Amazon EC2 cloud

The performance of the proposed J48 classification algorithm is also evaluated on the Amazon EC2 cloud. Classifying a large dataset of users into different categories of infection requires a high level of performance. The performance of the J48 decision tree algorithm in terms of classification accuracy and classification time is shown in Fig. 8. Figure 8a shows the classification accuracy of the algorithm, that is, the percentage of infected and uninfected users classified into their correct category. Initially, the J48 decision tree shows an accuracy of 70 %, but as time progresses its accuracy rises because a larger user dataset becomes available. Figure 8b shows the time taken by the system to classify the users. The classification algorithm takes more time for a larger dataset, as shown in the graph.

Fig. 8 Performance analysis of proposed model: a classification accuracy of system, b classification time of system

Table 12 Summary statistics for temporal metrics tested in Gephi 0.9.1

5.4 Computation of outbreak metrics using temporal network analysis

An in-depth analysis of the CPI network structure is an important tool for controlling the outbreak. The most commonly used control technique is vaccination, which can reduce the level of infection of infected users and prevent its spread. However, it is not possible to vaccinate the entire population in order to prevent outbreaks. Therefore, different metrics are computed from the TNA graph that can be used to identify the highly infected areas or the users accountable for spreading the disease.

Synthetic data of 2 million users is loaded into Gephi 0.9.1 to simulate outbreak prevention using TNA techniques. The temporal metrics evaluated are shown in Table 12. Figure 9 shows different outbreak metrics generated from the TNA graph using Gephi 0.9.1. Figure 9a shows the harmonic closeness centrality distribution of the graph. The eccentricity distribution of the TNA graph is depicted in Fig. 9b. The betweenness centrality distribution of the graph is shown in Fig. 9c. Figure 9d provides the eigenvector centrality distribution of the TNA graph. The experimental results show that TNA is a useful and effective tool for analyzing the state of the outbreak using different parameters.

Fig. 9 Different outbreak metrics generated using TNA: a harmonic closeness centrality distribution, b eccentricity distribution, c betweenness centrality distribution, d eigenvector centrality distribution

5.5 Cost analysis

Cost is an important factor that needs consideration to evaluate the economic feasibility of our proposed model in low-income countries such as Guinea, Liberia and Sierra Leone. Purchasing cloud services is a fundamentally different approach from owning the infrastructure: no maintenance or installation is required, and the upfront cost can be eliminated entirely by using a pay-per-use payment method that charges the user by rounding up to the nearest hour of usage time. AWS offers three usage tiers: on-demand, one-year reserved, and three-year reserved. In order to deploy our proposed model, Amazon offers the EC2 service for renting virtual computers (known as instances) with a variety of hardware specifications. The most basic instance is a single-core CPU with 1 Gbyte of RAM, priced at US $0.013 per hour (Amazon EC2 2016). The cost of RFID tags has fallen significantly over the past few years and now ranges from US $0.05 to 0.07 per tag (RFID 2016). The analysis shows that the cloud computing services and the sensor technology are cost effective and can be borne by the government of any nation at the time of an outbreak.
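The quoted prices can be combined into a rough worked example. The sketch below uses the hourly price of the most basic instance and the RFID tag price range stated above. The 100-min usage time and the 2 million tags are illustrative inputs borrowed from the experiment setup, not a cost figure reported in this paper, and the function names are hypothetical.

```python
import math

def instance_cost(minutes, hourly_rate=0.013):
    """Pay-per-use cost in US dollars, rounding usage up to whole hours."""
    billed_hours = math.ceil(minutes / 60)
    return billed_hours * hourly_rate

def rfid_cost_range(n_tags, low=0.05, high=0.07):
    """Lower and upper bound in US dollars for purchasing n_tags RFID tags."""
    return n_tags * low, n_tags * high

if __name__ == "__main__":
    # 100 min of usage is billed as 2 hours of the most basic instance.
    print(f"instance: ${instance_cost(100):.3f}")
    low, high = rfid_cost_range(2_000_000)
    print(f"RFID tags: ${low:,.0f} to ${high:,.0f}")
```

Under these inputs the instance cost is 2 x 0.013 = US $0.026, while tagging 2 million users would cost between US $100,000 and US $140,000, so the sensor hardware rather than the basic cloud instance dominates the budget in this example.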

6 Conclusion

EVD is a global challenge for every country and healthcare agency. In this paper, we proposed a cloud-based architecture for predicting and preventing EVD using TNA and wearable body sensor technology. The vital body symptoms and social interactions are captured using WBAN and RFID, respectively. Unlike traditional offline models, our approach is based on capturing real-time close proximity contacts and health information so as to control the spread of the epidemic. The J48 decision tree is used to classify the users into different categories. TNA is used to represent each Ebola-infected user on the TNA graph. Different temporal metrics are computed to identify those infected individuals or regions that are highly involved in the spreading of the epidemic. The proposed model is tested on the Amazon EC2 cloud, where it provides 94 % classification accuracy and around 92 % utilization of cloud resources. A limitation is that users are sometimes not willing to carry RFID tags and body sensors. In future work, we will focus on estimating the missing data of such users in order to improve the efficiency of the system.