1 Introduction

Every day, both healthy and unhealthy people generate a huge amount of health information that is lost because such data are not systematically collected on a single European (cross-border) e-health platform. The value, quality and cost of current and future healthcare services could be improved by enabling the creation, collection, delivery and analysis of these valuable data, together with their transformation into useful medical knowledge.

In the following, the word “patient” refers to any individual who needs temporary or continuous assistance from physicians, nurses, volunteers, etc. operating in the healthcare ecosystem. However, we also use the term “individual” to emphasize that the KNOWLEDGE-CARE concept can help in maintaining health and wellness, and not only in dealing with the illness that the term “patient” implies.

Traditional healthcare environments are extremely complex and challenging to manage, as they are required to cope with an assortment of individuals’ and healthcare business conditions under various circumstances with a number of resource constraints.

The prevailing business model of healthcare dictates that medical practitioners and other actors operating in the healthcare ecosystem keep and protect their own individual records of patients. These person-based and machine-based data are seldom integrated, analysed and shared dynamically. A general issue in today’s medical practice in most European countries is that a medical practitioner, unless from the same clinic, cannot gain access to a patient’s full medical history and previous diagnoses from other entities, so that diagnosis and therapy selection rest on partial information. In most European countries, the exchange of medical records between healthcare providers still follows conventional channels and is restricted for outside access [1, 2]. This hinders obtaining more healthcare value with fewer resources.

Digital health records are classified as follows:

  • Electronic Medical Records (EMRs): they are digital versions of the paper charts in clinician offices, clinics, and hospitals. EMRs contain notes and information collected by and for the clinicians in that office and are mostly used for diagnosis and treatment. EMRs are more valuable than paper records because they enable clinicians to track data over time, identify patients for preventive visits and screenings, monitor patients, and improve healthcare quality.

  • Electronic Health Records (EHRs): they are built to go beyond standard clinical data collected in a clinician’s office and are inclusive of a broader view of a patient’s care. EHRs contain information from a set of clinicians involved in a patient’s care (e.g. a hospital or a municipality).

  • Personal health records (PHRs): they can contain the same types of information as EMRs and EHRs (e.g. diagnoses, medications, immunizations, family medical histories, and provider contact information), but they are designed to be set up, accessed, and managed by patients only. Patients can use PHRs to maintain and manage their health information in a private, secure, and confidential environment. PHRs can include information from a variety of sources including home monitoring devices and patients themselves.

According to the Office of the National Coordinator for Health Information Technology’s annual report on EHR adoption [3], despite progress in establishing standards and services to support health information exchange and interoperability, practice patterns have not changed to the point that healthcare providers share patient health information electronically across organizational, vendor, and geographical boundaries. In addition, electronic health information is not yet sufficiently standardized to allow seamless interoperability, as it is still inconsistently expressed through technical and medical vocabulary, structure, and format, thereby limiting the potential uses of the information to improve healthcare. Furthermore, paper-based medical records are still in use in many European countries; these records are often handwritten with poor legibility, which contributes to medical errors, and the storage and processing of this information is usually not possible.

The missing block is a single platform, common to all European countries, which gathers healthcare information from globally spread citizens and makes it possible to offer personalized healthcare services anywhere and anytime the individual demands them, supporting the following:

  1. Physicians in diagnosis/therapy and contributing to the pool of tools for medical research.

  2. The patient or person to take care of him/herself.

Standardisation bodies are currently working on this problem [4]. However, their objective is not to converge on a single e-health platform with PHR, EMR and EHR functionalities.

The KNOWLEDGE-CARE vision proposes a new knowledge-based cross-disciplinary approach (including ICT engineering, business, systems medicine, telemedicine and ethics) to attack the problem using an e-health cloud platform, where individuals can “upload” and “download” personal healthcare data while healthcare stakeholders can do the same, each under the respective personal security and ethical rules. Knowledge is the theoretical and/or practical understanding of a subject that can be gained through data collection, sharing and processing. Knowledge-based healthcare management requires the exchange of any useful information between different entities (patient, general practitioner, specialist medical practitioner, hospitals, medical laboratories). Enhancement of medical therapies requires better knowledge of health-related observable and hidden phenomena that are currently not tracked because of the lack of multi-feature monitoring systems. The KNOWLEDGE-CARE vision can be achieved through the following actions:

  • Integration of PHR, EMR and EHR: one of the main objectives is to develop an e-health cloud platform as a single framework solution which integrates the functionalities of PHR, EMR and EHR in a privacy-preserving manner, collecting data coming from individuals and physicians. Cloud computing has great potential to enhance the collaboration among different healthcare organizations operating in the healthcare ecosystem and to fulfil common requirements such as scalability, agility, cost effectiveness and availability. Moreover, the migration of health records to cloud storage relieves healthcare providers of infrastructure management tasks and enables individuals to take a proactive role in their own healthcare processes. We call this cloud platform the “KNOWLEDGE-CARE platform”.

  • Widespread Adoption of Integrated PHR, EMR and EHR: another objective is to develop organizational strategies based on improved and new business models implemented in a scalable and flexible platform with both basic and on-demand dimensions/components and rewarding the production of health data useful for healthcare, personal wellness and medical research. Additionally, as mentioned in the previous point, we aim at promoting the interoperability of existing EHR from several providers and the coordination of different actors through the KNOWLEDGE-CARE cloud platform. We believe that this approach also increases the adoption of the KNOWLEDGE-CARE cloud platform.

  • Extensive Data Collection from Certified and Uncertified Sources: the KNOWLEDGE-CARE vision aims to support several options for healthcare data collection, from medical devices to consumer devices such as smartphones, PCs and tablets, with a focus on a high level of usability and acceptance. It covers the collection of a multitude of data related to health and wellness (including genetic information, patient-reported symptoms, lifestyle information and sports activity information) and the extension of healthcare services to healthcare suggestions, wellness suggestions, virtual trainers for sports activities, and knowledge sharing between individuals and physicians. Rather than studying each disease individually, systems medicine and integrated care take into account the intertwined gene-environment and socio-economic interactions, medically unexplained symptoms and co-morbidities that lead to individual-specific complex phenotypes [5]. These types of individual information typically come from certified sources within medical organizations (e.g. medical instruments, clinical laboratory equipment). On the other hand, following the concept of environmental epidemiology and medicine, we assume that health is correlated or associated in many different ways with the following location- and person-dependent factors: lifestyle, weather conditions, air quality/pollution, stress factors, etc. This type of information typically comes from uncertified sources within non-medical organizations (e.g. weather stations, air pollution monitoring stations, smartphones). Such data will be collected using personal commercial devices (smartphone sensors, indoor weather stations, indoor air quality stations), by connecting to servers of non-medical organizations (weather reports, outdoor air quality stations) and by connecting to servers of health organizations (hospital databases, medical laboratories, pollen forecasts, etc.). In particular, when some kind of information, usually related to lifestyle and psychosocial factors, is not available, it will be inferred from indirect sources of information (e.g. social networks).

  • Knowledge Extraction from the Stored Data: finally, this vision focuses on optimized knowledge discovery algorithms operating on the cloud-stored healthcare data, for real-time support to physicians in diagnosis and/or therapy selection and for long-term optimization of therapies, combining genomics, proteomics, metabolomics and epigenetics with models of disease, human individuality, lifestyle and environmental factors. Data mining tools are required to extract information from the gathered health and wellness data in a precise and personalised manner. This would also include techniques for the identification of unreliable data sets; the prediction accuracy of the platform is a key parameter.

Overall, the KNOWLEDGE-CARE vision is to enable new e-health services that are personalized, preventive, predictive, pre-emptive, participatory, pervasive and precise. The objective is to move from reactive medicine to predictive medicine, developing models through knowledge extraction and using such models for personalized clinical healthcare. This calls for a more responsive and relevant model, based on knowledge care management, that lives within the symbiosis between the healthcare system and “the patient and personal healthcare system”, learns over time, and adapts to the variation seen in the actual real-world population.

In summary, a global cloud-based e-health system needs to be developed as an integrated framework solution for the collection, exchange and knowledge extraction of healthcare data. The goal is to improve the efficiency of medical care, enhancing the value and quality of health and wellness services, while personalising the services and lowering healthcare costs.

By means of novel data mining techniques, the system can provide comprehensive information, meaningful feedback and data management tools to different users, customers and network partners, including physicians, virtual communities, pharmacies, patients, athletes, etc. The value of such health-related data is immeasurable and, hence, the collected data should be released (disclosed) through open access to public and private research institutions, provided that privacy is guaranteed. The gathered healthcare data will be used for the analysis of lifestyle, co-morbidities, trends and side-effects of medications, and for the development of new therapies. Every individual who shares personal healthcare information becomes proactive in enhancing his/her own and others’ healthcare therapies and lifestyle.

2 Integration of PHR, EMR and EHR

An integrated PHR/EMR/EHR should be more than just a repository for patient data; it should combine data collection, assisting software tools and knowledge extraction methods which help individuals to become active participants in their own health and wellness.

Currently, two different approaches have been followed in the management of a PHR: the standalone PHR and the tethered PHR. In the first approach (e.g. Microsoft HealthVault or Google Health), an individual creates his or her PHR using commercially available applications, ranging from stand-alone systems to Web-based applications, and enters and accesses his or her health data through them. In its simplest form, the PHR is a stand-alone application that does not connect with any other system. In the second approach, the tethered PHR, the PHR functionality is provided by allowing patients to view their own health information stored in their healthcare provider’s EHR; tethered PHRs are integrated only with a healthcare organization’s EHR. With a tethered PHR, patients can access their own records through a secure portal and see, for example, the trend of their lab results over the last year, their immunization history, or due dates for screenings. In some cases, patients may add supplemental information that may or may not subsequently be incorporated into the provider’s EHR.

The KNOWLEDGE-CARE approach foresees integrating the functionalities of the PHR with those of the EHR, and also those of the EMR, for complete interoperability. Furthermore, the owner of this integrated PHR/EHR/EMR is the individual, while the infrastructure is owned by a third party (e.g. by a nonprofit organization).

KNOWLEDGE-CARE will evaluate the use of incentives to encourage healthcare organizations (hospitals, medical laboratories, etc.) to share their EHRs with the KNOWLEDGE-CARE cloud platform.

The infrastructure of the integrated PHR/EHR/EMR will be based on a cloud platform and will guarantee interoperability with existing and future EMRs and EHRs, removing the wide fragmentation of digital health records.

3 Widespread Adoption of Integrated PHR, EMR and EHR

The KNOWLEDGE-CARE vision follows a multidisciplinary approach to motivate the widespread adoption of integrated PHR, EMR and EHR, developing new business models and new mobile applications, and solving ethical, security and privacy issues. Privacy concerns should not stop us from learning how PHR technology can improve healthcare. Particular emphasis should be placed on understanding and supporting patients’ and practitioners’ needs. PHR users should feel that using a PHR helps them know more about their health and about the care that their doctor gives them.

High affordability, achieved for instance by adopting free basic services, is expected to increase the level of acceptance and the quantity of health information. Though most PHR users tend to be younger, highly educated and of higher income, those with less education and lower income and those with chronic illnesses derive more value from using a PHR, including becoming more educated, inquisitive, and proactive about improving their health.

From the technical point of view, the key features considered by the KNOWLEDGE-CARE vision are:

  • High level of usability: usability by elderly people, teenagers and athletes is expected to increase the level of acceptance and the quantity of health information.

  • High-value services: high service value is expected to increase the level of acceptance, the quantity of healthcare information, and the number of people and businesses interacting with the KNOWLEDGE-CARE system.

4 Extensive Data Collection from Certified and Uncertified Sources

Collecting many different types of data (including physiological, genetic and lifestyle information) and collecting data from a huge number of individuals are two conflicting objectives, considering the wide variety of users and the different levels of acceptance by each category of user. The KNOWLEDGE-CARE vision addresses this problem through an approach which focuses on:

  • Integration of PHR, EMR and EHR: a new model based on the convergence of PHR, EMR and EHR is expected to increase the level of acceptance and the quantity of health information.

  • Convergence of services for both healthcare and wellness: a new model based on the convergence of healthcare and wellness is expected to increase the level of acceptance and the quantity of health information. The system should not be designed for one or a few specific diseases; it should be both usable and useful for any disease. This objective can be achieved through a modular system that can include additional subsystems and services upon request.

  • High-level of interoperability: interoperability with available medical devices or useful sensors is expected to increase the level of acceptance and the quantity of health information [6, 7, 8].

  • Mobility: healthcare service availability anywhere and at any time, allowing 24/7 monitoring of patients (in particular patients with chronic diseases and elderly people) by “secured anybody and anything”, is a key factor [9, 10, 11]. Satellite networks together with High Altitude Platforms (HAPs) should be considered the most effective tools to provide large-scale ubiquitous coverage [12, 13, 14].

Following the concept of environmental epidemiology/medicine, we assume that health is correlated or associated in many different ways with the following location- and person-dependent factors:

  • lifestyle (eating, physical activities, working environment, use of hazardous materials);

  • weather conditions (rainy, cloudy, snowing);

  • outdoor air-quality (temperature, humidity, pressure);

  • indoor air-quality (temperature, humidity, pressure);

  • stress factors (work environment, car traffic, noise pollution);

  • electromagnetic pollution.

Several types of unconventional information sources should be integrated in the KNOWLEDGE-CARE cloud platform, including:

  • The user, who manually enters personal/lifestyle information.

  • Sensors integrated within the smartphone (accelerometers, gyroscopes, microphone, camera, GPS receiver, luxmeter).

  • Devices connected with the smartphone (Bluetooth cardio bracelet, heart rate monitor, pedometer, digital thermometer, weighing scale, etc.).

  • Weather databases from organizations which distribute current and historical weather information.

  • Air pollution monitoring stations.

  • Outdoor air quality stations.

  • Indoor air quality stations.

Furthermore, we will integrate several types of conventional information sources, including:

  • Databases of Hospitals

  • Databases of medical laboratories

  • Genetic information

  • Pollen monitoring stations

  • Portable medical devices such as: blood glucose meter, pulse oximeter.

One of the primary focuses of KNOWLEDGE-CARE is homogenizing data collection from all available data sources directly or indirectly related to health. The collection of such data poses challenges that need to be addressed with a novel approach, taking into account the diversity of tools that need to be developed. These tools will collect data from sources that range from personal smartphones and non-official data sources to official medical records and medical devices providing health-related information. An important part of these tools will be data storage and management. Data annotation and organization will be one of the primary tasks addressed in KNOWLEDGE-CARE. Annotating and storing such heterogeneous data is an unexplored field that needs multidisciplinary expertise and a multi-view approach.
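As an illustration of what homogenized, annotated storage could look like, the following minimal Python sketch maps measurements from heterogeneous sources onto one record type with a canonical unit and a certified/uncertified provenance tag. All names here (`Observation`, `SourceKind`, `normalize`, the feature names and the unit tables) are hypothetical and chosen only for this example; they are not part of any KNOWLEDGE-CARE specification.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from enum import Enum


class SourceKind(Enum):
    CERTIFIED = "certified"      # e.g. hospital database, medical device
    UNCERTIFIED = "uncertified"  # e.g. smartphone sensor, weather station


@dataclass(frozen=True)
class Observation:
    """One annotated measurement in a common (hypothetical) schema."""
    subject_id: str
    feature: str           # canonical feature name
    value: float           # value expressed in the canonical unit
    unit: str
    timestamp: datetime
    source: str
    source_kind: SourceKind


# Hypothetical canonical unit per feature, and conversions into it.
CANONICAL_UNITS = {"body_temperature": "degC", "body_mass": "kg"}
CONVERTERS = {
    ("degF", "degC"): lambda v: (v - 32.0) * 5.0 / 9.0,
    ("lb", "kg"): lambda v: v * 0.45359237,
}


def normalize(subject_id, feature, value, unit, timestamp, source, source_kind):
    """Convert a raw reading to the canonical unit and wrap it with provenance."""
    target = CANONICAL_UNITS[feature]
    if unit != target:
        value = CONVERTERS[(unit, target)](value)
    return Observation(subject_id, feature, value, target,
                       timestamp, source, source_kind)
```

Keeping the provenance tag on every record is one possible design choice: it would let later analysis steps weight or filter uncertified data without losing it.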

5 Knowledge Extraction from the Stored Data

The two challenges in knowledge extraction from datasets are: use of unstructured datasets and the “small n, large p” problem.

For the first challenge, our approach is to design a cloud-based system where all users (physicians, patients, athletes, etc.) are motivated to provide only structured information, building structured datasets that allow a more reliable and prompt analysis than unstructured datasets such as those gathered from a physician’s scribbles on a patient’s chart (unstructured free text).

The second challenge is quite common in the medical sector, where the number p of features (attributes including physiological, genetic, environmental and lifestyle information) of the available dataset can be even higher than the number n of records (measurements) for a single patient or for a population. This situation limits the validity of the results of knowledge extraction algorithms.

We can address the “small n, large p” problem by enhancing the validity of knowledge extraction algorithms in the medical sector and increasing the number of measurements (see previous approach to extensive data collection).
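To make the “small n, large p” difficulty concrete, the following worked equations consider an ordinary linear model as an illustrative case. The linear model and the ridge penalty are assumptions introduced here for illustration only; they are one standard textbook setting and remedy, not a method prescribed by the KNOWLEDGE-CARE vision.

```latex
% Linear model with n records and p features, where p > n
y = X\beta + \varepsilon, \qquad X \in \mathbb{R}^{n \times p}, \quad p > n

% The normal equations have no unique solution because X^T X is singular
X^{\top} X \hat{\beta} = X^{\top} y, \qquad
\operatorname{rank}(X^{\top} X) = \operatorname{rank}(X) \le n < p

% Ridge regularisation (assumed remedy, \lambda > 0) restores uniqueness,
% since X^T X + \lambda I_p is positive definite
\hat{\beta}_{\lambda} = \left( X^{\top} X + \lambda I_p \right)^{-1} X^{\top} y
```

Because infinitely many coefficient vectors fit the data equally well when p > n, a fitted model can reproduce the training records without generalising; increasing n, as proposed above, or constraining the model both reduce this ambiguity.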

For a more effective usage of structured datasets, we aim to optimize knowledge discovery algorithms by lowering their computational complexity.

We expect that, after a relatively short period of collection and analysis of individuals’ information, a system based on the KNOWLEDGE-CARE vision will be able to automatically suggest the most appropriate therapy/lifestyle/training/activity.

The other primary focus is data analysis and content extraction. Analysing data from EHR records, where data are well organized and directly related to e-health, is a manageable task, while analysing data from unofficial sources is challenging: such data vary unpredictably in quality. Data mining algorithms that fuse and analyse such a variety of data and extract content-related information need to be developed.

In 10 years or so, each individual will be surrounded by an integrated PHR/EMR/EHR of billions of data points, and the addition of genetic data extends the set of features that are available for data mining tasks. Therefore, we need to develop appropriate dimensionality reduction methods to simplify the hypotheses about health and disease for each individual.
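As a minimal sketch of what dimensionality reduction means in practice, the example below applies variance-threshold feature filtering, one of the simplest feature selection techniques: features that barely vary across records carry little information and are dropped. The function names, the record layout (a list of dictionaries with identical keys) and the threshold value are assumptions made for this illustration; the methods actually needed for integrated PHR/EMR/EHR data would likely be far more elaborate.

```python
from statistics import pvariance


def select_informative_features(records, min_variance=1e-3):
    """Return the names of features whose population variance exceeds min_variance.

    `records` is a list of dicts mapping feature name -> numeric value;
    every record is assumed to contain the same keys.
    """
    if not records:
        return []
    names = sorted(records[0])
    return [name for name in names
            if pvariance([r[name] for r in records]) > min_variance]


def project(records, kept):
    """Keep only the selected features in every record."""
    return [{name: r[name] for name in kept} for r in records]
```

A typical use would be `project(records, select_informative_features(records))`, which returns the same records restricted to the retained features.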

We can extract knowledge from integrated PHR/EMR/EHR using both descriptive data mining methods (i.e. one-way association, two-way association, clustering, frequent sequential pattern discovery) and predictive data mining methods (i.e. classification, anomaly detection, regression).

Unknown phenotypes will be defined and analysed using iterative cycles of modelling and testing. Furthermore, novel biomarkers can be identified by combining different datasets. One important role of biomarkers will be to stratify (cluster) a given disease into its different subtypes so that appropriate and distinct therapies can be selected for each subtype.

Examples of results from descriptive data mining methods include: geographical analysis of an industrial hazard; clusters of diseases based on geographical location, lifestyle, etc.; associations between symptoms and geographical location, lifestyle, etc.
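An association between a symptom and, say, a location can be quantified with the standard support, confidence and lift measures used in association rule mining. The sketch below is a minimal illustration; the function name `association_metrics` and the representation of each record as a set of item labels are assumptions of this example.

```python
def association_metrics(records, antecedent, consequent):
    """Compute (support, confidence, lift) for the rule antecedent -> consequent.

    `records` is a list of sets of item labels; `antecedent` and
    `consequent` are sets of labels.
    """
    n = len(records)
    n_a = sum(1 for r in records if antecedent <= r)
    n_c = sum(1 for r in records if consequent <= r)
    n_ac = sum(1 for r in records if (antecedent | consequent) <= r)
    support = n_ac / n                       # P(A and C)
    confidence = n_ac / n_a if n_a else 0.0  # P(C | A)
    lift = confidence / (n_c / n) if n_c else 0.0  # P(C | A) / P(C)
    return support, confidence, lift
```

A lift above 1 indicates that the consequent occurs more often together with the antecedent than it does overall; it does not by itself establish a causal link.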

The “No Free Lunch Theorems” formally proved that a data mining technique which is good when the true model is A may be poor compared with another technique when the true model is B. Therefore, once the needed data mining task has been identified, there is no single algorithm that is better than all the others on all problems. As a consequence, both available and new data mining methods need to be explored to find the best solution for the integrated PHR/EMR/EHR datasets.
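In practice, this means that candidate methods have to be compared empirically on the data at hand. The following minimal sketch selects between two deliberately simple predictors using k-fold cross-validation; the two predictors, the fold assignment by index and all function names are assumptions chosen for illustration, and the example demonstrates the practical consequence of the theorems rather than the theorems themselves.

```python
def fit_mean(xs, ys):
    """Constant predictor: always returns the training mean."""
    mean = sum(ys) / len(ys)
    return lambda x: mean


def fit_nearest(xs, ys):
    """1-nearest-neighbour predictor on a single numeric feature."""
    pairs = list(zip(xs, ys))
    return lambda x: min(pairs, key=lambda p: abs(p[0] - x))[1]


def cross_val_mse(fit, xs, ys, k=5):
    """Mean squared error over k folds; record i goes to fold i % k."""
    total = 0.0
    for fold in range(k):
        train = [(x, y) for i, (x, y) in enumerate(zip(xs, ys)) if i % k != fold]
        test = [(x, y) for i, (x, y) in enumerate(zip(xs, ys)) if i % k == fold]
        model = fit([x for x, _ in train], [y for _, y in train])
        total += sum((model(x) - y) ** 2 for x, y in test)
    return total / len(xs)


def select_model(candidates, xs, ys, k=5):
    """Return the name of the candidate with the lowest CV error, plus all scores."""
    scores = {name: cross_val_mse(fit, xs, ys, k) for name, fit in candidates.items()}
    return min(scores, key=scores.get), scores


CANDIDATES = {"mean": fit_mean, "nearest": fit_nearest}
```

On a smooth trend the nearest-neighbour predictor wins, while on a signal that alternates between neighbouring points the constant predictor wins, so neither method dominates across both datasets.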

6 Conclusions

The value, quality and cost of current and future healthcare services have a strong impact on our life. In this frame, it is of paramount importance to consider new visions for development of future healthcare systems and services able to improve the effectiveness of the whole healthcare sector.

The KNOWLEDGE-CARE vision is based on four pillars:

  1. Integration of PHR, EMR and EHR.

  2. Widespread Adoption of Integrated PHR, EMR and EHR.

  3. Extensive Data Collection from Certified and Uncertified Sources.

  4. Knowledge Extraction from the Stored Data.

We believe that the proposed multidisciplinary approach grounded on systems medicine and knowledge discovery concepts is the most effective way to tackle the problem of future healthcare systems.