1 Introduction

The current state of medicine, diagnostics and medical imaging has led to rapid growth in the diagnostic data available to physicians and medical research teams. The volume of this data is often so large that physicians and researchers cannot analyze and interpret it with acceptable accuracy and within an acceptable time. To address this problem, one emerging trend in medical IT systems is to build private cloud systems that can store, archive and process this data on the fly.

Compared with other sectors, public healthcare has significantly underused technology to improve operational efficiency. In Poland, some systems still rely on paper medical records, although this situation is changing rapidly. Information that is digitized is typically not portable, which inhibits information sharing among the different healthcare actors. Use of technology to facilitate collaboration within the medical community is limited. The healthcare industry is shifting toward an information-centric care delivery model, enabled in part by open standards that support cooperation, collaborative workflows and information sharing. Thus, healthcare information systems (HIS) should be modernized, and cloud computing is at the center of this transformation [1].

Cloud computing provides an infrastructure that allows hospitals, medical practices and research facilities to tap improved computing resources at lower initial capital outlays. Additionally, cloud environments lower the barriers to innovation and to the modernization of HIS and their applications.

Many universities and pharmaceutical enterprises are starting to use the cloud to improve research and drug development. The growth in medical data from next generation sequencing, as well as the growing importance of biologics in the research process, is making cloud-based computing an increasingly important aspect of research and development (R&D). Currently, such institutions do not have the capacity to store and process large datasets. Physician collaboration solutions such as remote video conference physician visits are well known [2–4]. Cloud technology supports collaboration among medical teams and makes it possible to use applications based on exchanging or sharing large sets of medical cases. This technology provides the required level of security and privacy. Cloud computing facilitates the dissemination of information and insights in near real time. The most current, complete insights and clinical knowledge are available to support care providers' decisions and to let them focus on the real problems rather than on tools. Information contained within a cloud can be better analyzed by using computationally efficient methods and tools.

2 Existing Solutions

The amount of digital information collected and stored in modern hospitals offers a great opportunity for extensive data analysis, enabling medical researchers to discover relationships in the data, extend knowledge of diseases and therefore improve treatment. However, because of numerous legal and administrative difficulties in collecting medical records, sensitive medical data is rarely gathered by hospitals in a cooperative manner. This also prevents the establishment of commercial systems. Nevertheless, examples of emerging systems of this kind can already be found in the literature.

Rolim et al. [5] presented the concept of a system that automatically collects and stores data from medical devices using wireless sensors. The authors propose an architecture in which a cloud system stores the data and makes it available for analysis by medical experts, accompanied by local servers located in hospitals that gather data from the sensors and make it accessible to medical staff. The system, however, relies on the wireless sensors being straightforward to connect to medical devices, without considering the diversity of interfaces and the potential problems in collecting the data. Pandey et al. [6] proposed a cloud system for collecting and analyzing ECG data from users’ wearable devices. The solution serves as a health monitoring system. Data analysis is possible via dedicated services hosted in the cloud system. The authors emphasize the requirement of high scalability, which is met by implementing a cloud computing architecture. Patel et al. [7] presented a system for collecting medical images, along with their supplementary data, in a centralized database. The system enables efficient storage of the images, reducing the cost of gathering a dataset for medical research. The system is demonstrated on the acquisition of digital mammography images.

Another type of system uses cloud architecture strictly to provide data analysis services. Hsieh and Hsu [8] proposed a cloud-based solution for tele-consulting 12-lead ECG examinations on mobile devices. Shen et al. [9] presented a Service Oriented Architecture (SOA) system for EEG signal classification. Lai et al. [10] used the Knowledge as a Service (KaaS) model to provide a collaborative system facilitating a radiotherapy dynamic treatment service. Finally, noteworthy systems have also been proposed strictly for the analysis of large collections of data. In [11] we reviewed existing data mining applications for knowledge extraction from medical data. In addition, rapidly developing neural network and deep learning methods are increasingly used for data analysis. We return to this area later in the text.

Unlike the systems listed above, our system is based on web services (software plugins) for professional healthcare devices. Our goal is to acquire data that is valuable to medical researchers but hard to collect. A single doctor usually has only a small number of valuable examinations (often a few tens), whereas our platform makes it possible to collect hundreds. Moreover, data obtained from the personal medical devices used in the papers mentioned above has a high degree of uncertainty compared with data from hospitals.

3 The IPMed Platform

The proposed system was designed as a comprehensive solution for medical researchers seeking to gather and analyze large amounts of diverse medical data. So far, the system has been successfully deployed in three independent hospital wards, where it enables the gathering of extensive data on stroke cases, including impedance cardiography (ICG) examination records. The general functionality of the platform can be outlined as follows:

  • Medical data gathering. The platform can be flexibly extended with dedicated modules for communicating with particular medical devices, as well as with front-end interfaces through which medical staff enter data.

  • Medical data storing. The data is automatically stored on a local server at the hospital as well as sent to a centralized database, ensuring data security and persistence.

  • Data analysis supporting tools. The data collected in the centralized database can be analyzed by medical researchers using a set of statistics and data mining algorithms.

  • Remote teamwork of researchers. The platform enables organization of medical research groups for cooperative working on the data.

The data analysis and teamwork modules are still being extended to respond as well as possible to medical researchers’ needs. The high versatility and flexibility of the platform is worth emphasizing: it is designed to handle a wide range of medical data types and can therefore be applied in various areas of medicine.

3.1 General Architecture

The IPMed system is based on a component structure in which every component is responsible for a particular task. The components communicate with each other over the network using the REST (Representational State Transfer) architectural style. The communication is secured by encryption, and users are authenticated by a login and password pair. The databases that contain sensitive data (e.g. medical data) are encrypted, and access to the key is granted only to authenticated and authorized users, which keeps the data safe even in case of hardware theft. Figure 1 presents an overview of the main IPMed components and their interconnections. A detailed description of every component is presented in the next section.

Fig. 1

Main IPMed components and their interconnections
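
The key handling described above, in which the database key is kept only in encrypted form and is released by an authorized user's key, can be illustrated with a short sketch. This is not the IPMed implementation: the function names, the 32-byte key size and the HMAC-SHA256 construction are assumptions chosen so that the example runs with the Python standard library alone. A production system would more likely rely on a vetted key-wrap or authenticated-encryption primitive.

```python
import hashlib
import hmac
import secrets
from typing import Dict, Tuple

KEY_LEN = 32  # bytes; assumed size of both the database key and the user key


def _subkeys(user_key: bytes) -> Tuple[bytes, bytes]:
    """Derive separate encryption and MAC subkeys from one user key."""
    enc = hmac.new(user_key, b"sketch/enc", hashlib.sha256).digest()
    mac = hmac.new(user_key, b"sketch/mac", hashlib.sha256).digest()
    return enc, mac


def wrap_database_key(db_key: bytes, user_key: bytes) -> Dict[str, bytes]:
    """Encrypt the database key under a user key (hypothetical helper)."""
    if len(db_key) != KEY_LEN:
        raise ValueError("database key must be 32 bytes")
    enc, mac = _subkeys(user_key)
    nonce = secrets.token_bytes(16)
    # One 32-byte pad from a keyed hash of a fresh nonce, XORed with the key.
    pad = hmac.new(enc, nonce, hashlib.sha256).digest()
    wrapped = bytes(a ^ b for a, b in zip(db_key, pad))
    # The tag lets unwrap() reject a wrong user key or a modified blob.
    tag = hmac.new(mac, nonce + wrapped, hashlib.sha256).digest()
    return {"nonce": nonce, "wrapped": wrapped, "tag": tag}


def unwrap_database_key(blob: Dict[str, bytes], user_key: bytes) -> bytes:
    """Recover the database key; fails unless the right user key is given."""
    enc, mac = _subkeys(user_key)
    expected = hmac.new(mac, blob["nonce"] + blob["wrapped"], hashlib.sha256).digest()
    if not hmac.compare_digest(expected, blob["tag"]):
        raise PermissionError("wrong user key or corrupted key blob")
    pad = hmac.new(enc, blob["nonce"], hashlib.sha256).digest()
    return bytes(a ^ b for a, b in zip(blob["wrapped"], pad))


if __name__ == "__main__":
    db_key = secrets.token_bytes(KEY_LEN)    # protects the database files
    user_key = secrets.token_bytes(KEY_LEN)  # held by the user, off-server
    blob = wrap_database_key(db_key, user_key)
    assert unwrap_database_key(blob, user_key) == db_key
```

Under these assumptions, only `blob` would need to be stored next to the database, so a stolen disk would not contain the database key in usable form.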

3.2 Components

The IPMed system contains several basic component types, distinguished by their role. The central point of the system is the main repository, which gathers the anonymous data that it receives from local hospital data servers. Each hospital that participates in the grant gathers data from medical equipment and physicians. The data stored in these hospitals is encrypted, and decryption requires an authorized user to authenticate. On the other side of the system, a cooperation web server serves a website for cooperation among physicians.

  1.

    Main repository. This is a well-secured web server that is accessible only to applications possessing the trusted encryption key used to encrypt communication. All data kept on this server is completely anonymous (anonymization is performed on the local hospital data servers before the data ever leaves the source hospital). The repository gathers data from hospitals, along with its whole history, using RESTful web services, and the communication uses the encrypted HTTPS protocol. The gathered data is then served directly to the cooperation web server. The main repository also provides access to statistical tools that may be used to analyze the gathered data.

  2.

    Local hospital data servers. Each hospital has its own local data server, which holds patients’ medical data in an encrypted database. This is the only part of the system that stores sensitive data. The key used for encrypting the database is itself encrypted, using users’ keys that are not stored on the same server. In this way, access to the database is possible only after an authorized user authenticates with their key, and the data therefore remains safe even after hardware theft or physical access to the database files.

    The local hospital data server automatically downloads and converts data from medical equipment (PhysioFlow and NICCOMO cardiographs) via FTP, SMB or a USB flash drive. Downloaded data is temporarily stored in an unencrypted queue (because the database key is itself encrypted, the database cannot be opened until an authorized user authenticates with their key and thereby decrypts the database key), but as soon as the database is opened the downloaded data is moved into the database and encrypted. The data stored on a local hospital data server is exposed through RESTful web services that enable browsing and modification, but these services require devices and applications (e.g. the dedicated Android client application) to authenticate using a trusted key and require the user (i.e. the physician) to authenticate using a login and password pair.

    Each local hospital data server connects to the main repository at fixed intervals in order to send the data. This communication is encrypted and requires both sides to authenticate using keys, and the main repository must provide a key to decrypt the local database. At this step all of the data is also carefully anonymized before it leaves the hospital, because there is a legal obligation that none of the patients’ personal data may leave the hospital.

    The local hospital data servers have redundant network links: they are connected to the hospital's local area network, and they can also connect to the Internet using a GPRS modem. This redundancy was introduced primarily to make the local hospital data servers as reliable as possible, because they were designed to work as unattended, independent devices whose entire administration and maintenance is done remotely from the administration centre. The requirement for remote administration also created the need to perform some basic administration tasks (retrieving error logs, restarting the device, etc.) even when the device is not connected to the network, so these tasks are carried out using SMS messages.

  3.

    Administration centre. This is a special server used for administration purposes. All local hospital data servers connect to it and create an SSH (Secure Shell) tunnel (using trusted keys) that may then be used for maintenance and administration.

  4.

    Data source devices. These are the devices that gather medical data. They are connected to local hospital data servers and upload data to them. In the prototype application we use Manatec PhysioFlow and Medis NICCOMO cardiographs as our data source devices.

  5.

    User devices in hospitals. Data stored on local hospital data servers is accessed via user devices such as Android or iOS tablets or other devices with a web browser (see Fig. 2). The devices have preinstalled IPMed client applications, which connect to the local hospital data server via RESTful web services. As elsewhere in the system, the devices have to authenticate using trusted keys. Apart from device authentication, users also have to authenticate using a login and password pair. To ensure that no data is compromised if a device is stolen, none of the data (medical data, passwords, keys, etc.) is stored persistently on the device.

    Fig. 2

    Screen of the data management frontend in the Android application for stroke patients data gathering

  6.

    Cooperation web server. This web server provides a platform for cooperation among physicians from different medical facilities by giving them access to the data stored in the main repository in one common, uniform data format. The platform also contains a recommender system, based on an expert system, that classifies patients by their hemodynamic profile. This part of the system enables research for both physicians and computer scientists. From the physicians’ point of view, the cooperation platform makes it possible to share the knowledge and medical data needed for research. From the computer scientists’ point of view, it makes it possible to use the gathered data in research on artificial intelligence, particularly for finding regularities in the data that may be useful to the physicians using the system.

    In the future, this platform is planned to make it possible to query the central repository for cases similar to a provided one, e.g. finding patients who probably have the same disease based on a measured ECG (electrocardiography) signal, using artificial intelligence and especially deep learning techniques. This feature is currently at the design stage, and for our pilot application (electrocardiography) the literature does not provide a direct solution to the problem. There is a large number of publications on biometric recognition of human identity using ECG [12–14], which is a similar task and mostly uses hand-designed ECG features, the most common of which are temporal features (e.g. length of the QRS complex), amplitude features (e.g. amplitude of the R peak) and slope features (e.g. slope of the R peak). However, finding similar cases requires extracting higher-level features, because otherwise two cases would be considered similar only if they come from the same patient, so machine learning methods seem more suitable. There have been some attempts at automatic recognition of diseases using machine learning. For example, Dorffner et al. [15] used recurrent as well as feedforward (with hand-designed features) neural networks for recognizing ischemic heart disease; they achieved results even better than skilled cardiologists, with a slight advantage of the recurrent over the feedforward network. Silipo et al. [16] also used neural networks for disease recognition. They tried to recognize arrhythmia, ischemia and chronic myocardial diseases and used a different network for each of the three tasks: a feedforward network with R-R intervals as input for arrhythmia, a recurrent network for detecting ST segment anomalies caused by ischemia, and, for chronic myocardial diseases, a hybrid feedforward network with normalized radial basis function units in the first layer and two layers of standard sigmoid units above it. In our case, however, we will need a single method that is able to extract features that, in general, discriminate well between patients with different diseases.
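
Returning to the local hospital data servers described in item 2, the anonymization performed before data leaves the hospital could look roughly like the sketch below. The paper does not give the record schema or the anonymization rules, so every field name, the reduction of a birth date to a year and the use of a keyed hash to keep a patient's history linkable are assumptions made for illustration only. Whether such a scheme satisfies a particular legal definition of anonymization is outside the scope of the sketch.

```python
import hashlib
import hmac

# Hypothetical field names: the actual IPMed record schema is not published.
DIRECT_IDENTIFIERS = {"first_name", "last_name", "national_id", "address", "phone"}


def pseudonym(patient_id: str, hospital_secret: bytes) -> str:
    """Map a local patient ID to an opaque, stable identifier.

    A keyed hash (HMAC-SHA256) is used so that the mapping cannot be
    recomputed without the secret, which never leaves the hospital.
    """
    digest = hmac.new(hospital_secret, patient_id.encode("utf-8"), hashlib.sha256)
    return digest.hexdigest()[:16]


def anonymize(record: dict, hospital_secret: bytes) -> dict:
    """Return a copy of the record that is safe to upload (assumed rules)."""
    cleaned = {
        key: value
        for key, value in record.items()
        if key not in DIRECT_IDENTIFIERS and key != "patient_id"
    }
    # The same patient always gets the same subject ID, so the repository
    # can keep the history of a case without knowing who the patient is.
    cleaned["subject"] = pseudonym(record["patient_id"], hospital_secret)
    # Generalize the birth date (ISO format assumed) to the year only.
    if "birth_date" in cleaned:
        cleaned["birth_year"] = int(cleaned.pop("birth_date")[:4])
    return cleaned


if __name__ == "__main__":
    secret = b"replace-with-a-random-per-hospital-secret"
    record = {
        "patient_id": "P-001",
        "last_name": "Example",
        "birth_date": "1950-03-12",
        "stroke_type": "ischaemic",
    }
    print(anonymize(record, secret))
```

The function returns a new dictionary and leaves the local record untouched, which matches the description of the hospital server keeping the full data while only the cleaned copy is sent to the main repository.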

4 The IPMed Platform as a Cloud

Our platform can be divided into three parts: the device control layer, the support layer and the backend (as shown in Fig. 1). In the near future we plan to migrate the backend to a private medical cloud platform. Cloud computing delivers infrastructure, platform and software as services and can accordingly be classified into SaaS, PaaS and IaaS. Chen contended that cloud business models include medical cloud, education cloud, telecom cloud, financial cloud, manufacture cloud and logistics cloud [17]. The following six domains for the application of cloud computing to healthcare have been identified: telemedicine/teleconsultation, medical imaging, public health and patients’ self-management, hospital management/clinical information systems, therapy, and secondary use of data. Our cloud falls into the secondary use of data domain, which covers the use of cloud computing to enable secondary use of clinical data, e.g. for data analysis, text mining or clinical research.

Business models include private cloud service, public cloud, common cloud and hybrid cloud. We plan to build a private medical cloud based on our HPC lab, which has a small cluster of 50 nodes, each with four Intel Pentium III Xeon processors (700 MHz, 1 MB L2 cache). The nodes are connected using the Scalable Coherent Interface. We use Linux Virtual Server as load-balancing software running on Debian Linux, Xen for virtualization, OpenNebula as management and monitoring software, and GlassFish as the web application server. We treat the entire cluster as PaaS.

Currently, the gathered data is stored in PostgreSQL, but to achieve better scalability we plan to use the Hadoop platform to address the exchange, storage and sharing of medical data, especially images and videos (see Fig. 3).

Fig. 3

IPMed planned components in cloud environment

By using Mule ESB as the communication method between components and services, we can extend the platform with additional components such as modeling and analytics tools and support for high-performance algorithms, for example the deep learning methods mentioned in Sect. 3.2. A service oriented architecture enables easy integration with other services, e.g. data converters that increase HIS interoperability or decision support services with a desired level of reliability [18].

5 Conclusions and Future Work

The IPMed platform incorporates four neurological wards in Polish hospitals, enabling the gathering of data on stroke cases, including ICG. Hemodynamic disturbances occurring in 60–100 % of 70,000 patients with stroke are the second most common cause of death, responsible for 15 % of deaths in the hospital phase. Fifty-six complete examinations have been collected so far. The examination results relate to 45 patients with ischaemic stroke (TIA) and 16 patients with hemorrhagic stroke. Statistical significance was not achieved in either group. This is most likely due to the small sample size and confirms the need for further research. Detailed results of the clinical research are presented in [19].

In this paper we have presented a service oriented medical platform supporting cooperative work by medical teams, currently focused on stroke treatment, ICG examinations and hemodynamics. We have described the IPMed platform and outlined its migration to a cloud system. In the near future we plan the hardware configuration, the installation of low-level software (PaaS) and the adaptation of the IPMed platform to a cloud service (SaaS).

The following concerns can be identified for this application of cloud computing: safety and security of data as a threat to privacy, and the reliability and transparency of data handling by third parties. In the design of our system we put a lot of effort into ensuring a high level of data safety and security. In addition, the service oriented architecture improves system flexibility and allows the system to be extended at low cost (with third-party data converters). We can therefore consider these requirements to be addressed appropriately.

In conclusion, the development of data gathering platforms opens great opportunities for medical data analysis. Large sets of medical data are of tremendous importance to science. Not only can medical researchers perform conventional statistical analysis of the data, but emerging data mining techniques can also make a great contribution. This can lead to the discovery of important relations in the data and to the development of new treatments for many diseases. Therefore, after successful implementation of the system in the area of stroke-related data, the future direction is to apply the IPMed platform in other fields of medicine.