
1 Introduction

With increased digitization in the healthcare sector, privacy risks and security concerns about data storage, access, and transfer among healthcare providers and patients have increased as well [55, 174]. Thus, information security has become an ongoing challenge in the healthcare sector, with critical data breaches exposing sensitive records of millions of patients [10]. One such major data breach occurred in 2015 when a phishing scam exploited the credentials of five employees at Anthem, a health insurance organization, compromising the Personally Identifiable Information (PII) of 80 million individuals [194]. Data breaches in healthcare can occur for a variety of reasons, including a lack of employee awareness about data security, technological shortcomings, and the dearth of technological implementations [53]. Despite the proliferation of data security focused research in the community, the field lacks a comprehensive synthesis and analysis of the body of healthcare privacy and security research, especially from the user perspective [14].

Towards this, we conducted a systematic literature review to provide a holistic overview and a basis for the research undertaken in this area, an approach that has proven helpful in other domains [125]. We collected 2,903 research articles on data security and privacy preservation in healthcare organizations. Thereafter, we performed a thematic analysis on a selected set of \(N=205\) papers. From the \(N=205\) papers, we further discuss insights from \(n1=97\) papers that focused on technological implementation. Finally, we present an in-depth analysis of \(n2=18\) papers that focused on human (user) factors. We found that the majority of the security research done in healthcare focused on technologies, with a severe lack of focus on understanding and improving the human factor aspect. Furthermore, even among the work focused on technologies, we observed a gap in research with applications to private practice healthcare organizations. The disparity is noteworthy.

Our contributions through this work are as follows:

  • While other Systematizations of Knowledge (SoKs) have been published on specific technologies related to healthcare, ours is among the first to take a systematic approach to structuring existing knowledge on security and privacy in healthcare organizations.

  • In this SoK we make a holistic observation on security and privacy in healthcare and point out gaps that remain to protect patients’ health data.

  • To the best of our knowledge, our SoK is the first paper to focus on an overview of privacy and security research of patient data from a human perspective.

Our study concludes that the technological solutions are outpacing the foundational analysis of the ways the healthcare workforce is using and defending patient data today. Moreover, the existing research focuses on a narrow scope of medical settings which neglects the large population of patients and healthcare workers engaged in private healthcare practices.

2 Method

Our systematic literature review includes a corpus of 205 papers published through February 14, 2021, collected from different digital libraries. The literature review comprised six steps: (i) database search, (ii) title screening, (iii) abstract screening, (iv) full-text screening, (v) data extraction, and (vi) thematic analysis.

Inclusion Criteria: Papers were included if they were: (1) Published in a peer-reviewed publication including journals and conferences; (2) Written and available in English; and (3) Focused on the security and privacy of data in healthcare organizations. We contacted publication venues and authors to obtain papers that were not available for public access, and obtained all the papers in our list.

Exclusion Criteria: Papers were excluded if: (1) they were presented as work-in-progress (posters, extended abstracts, etc.); (2) the content analysis showed that the research was not directly related to patient/consumer health-related data security and/or privacy in healthcare organizations; or (3) they were patents or book chapters.

Figure 1 details all the steps carried out throughout this analysis.

Fig. 1. A snapshot of the data collection, screening, and analysis methodology along with the number of papers screened in each stage of the literature review.

2.1 Database and Keyword-Based Search

We conducted our search by exploring seven digital technology and medical databases: ACM DL, Google Scholar, SSRN, ScienceDirect, IEEE Xplore, PubMed, and MEDLINE. We specifically searched healthcare-focused journals in MEDLINE but were not able to find any relevant papers based on our topic of research, so we removed it from our database list. Our selection process was based on an iterative evaluation. We started by defining appropriate keywords for our subject matter. This was followed by filtering the results to meet our requirements. Subsequently, a systematic analysis was conducted on the final collection of research articles. This procedure was adapted from prior literature reviews by Stowell et al. [185], Das et al. [51, 52], and other related works [123, 127, 139].

After the initial search to obtain the keywords, we collected the papers through a keyword-based search as mentioned above, using the Publish or Perish software to retrieve articles from Google Scholar. Thereafter, we explored individual digital libraries to collect papers relevant to this research. Boolean search strings were developed for searching the databases, including up to 88 AND/OR operators and 17 NOT operators across the following keyword terms: Healthcare Data Security, Healthcare Data Breach, Healthcare Data Theft, Medical Data Theft, Medical Data Security, Medical Data Breach, Patient Data Security, Patient Data Theft, and Patient Data Breach. Our initial database and keyword-based search resulted in a total of 2,903 papers.
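As a rough illustration of how such a Boolean string might be assembled, the nine keyword terms above can be generated from three subject words and three incident words and joined with OR. This is only a sketch: the helper names `keyword_terms` and `build_query` and the excluded word are hypothetical, and the actual search strings used in the review are not reproduced here.

```python
from itertools import product

# The nine keyword terms in the text are the 3 x 3 combinations below.
SUBJECTS = ["Healthcare", "Medical", "Patient"]
INCIDENTS = ["Security", "Breach", "Theft"]


def keyword_terms():
    """Return every '<subject> Data <incident>' phrase."""
    return [f"{s} Data {i}" for s, i in product(SUBJECTS, INCIDENTS)]


def build_query(terms, excluded=()):
    """Join quoted phrases with OR and append NOT clauses (hypothetical helper)."""
    included = " OR ".join(f'"{t}"' for t in terms)
    query = f"({included})"
    for word in excluded:
        query += f' NOT "{word}"'
    return query


terms = keyword_terms()
# The excluded word is an assumption chosen only for illustration.
query = build_query(terms, excluded=["patent"])
print(query)
```

Each digital library has its own query syntax, so a string built this way would need adapting per database.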

2.2 Title Screening: Google Scholar

We noticed that every digital library except Google Scholar returned a limited number of papers. Thus, we did not apply title-based screening to those digital libraries. For Google Scholar, we conducted a title-based search with the above-mentioned keywords and excluded patents and citations from the results. In the title-based search, we looked for the keywords in the title itself to emphasize relevance, resulting in a total of 352 papers.

2.3 Duplicate and Work-in-Progress Removal

In the next phase, we removed 72 duplicate articles, which left us with 280 papers. We also removed any papers that were works in progress, such as posters and extended abstracts. We screened out self-identified work-in-progress papers and reviewed the remaining papers to check whether they were works in progress. Because of the varying nature of publication of these works, we could not reliably identify work-in-progress papers by page count alone on the assumption that such papers are short. However, we did remove any papers shorter than four pages. After this procedure, we were left with a data set of 231 papers.

2.4 Abstract and Full-Text Screening

Each individual research paper was assessed to determine its relevance to the topic of our research by reviewing the abstract and full-text. To do this, two researchers trained in qualitative coding determined the relevance of the individual papers to the research by analyzing the abstract and full-text. If there were any discrepancies with determining the relevance to the research then a third researcher was introduced to resolve the issue. Thus, 26 papers were excluded in this phase. After this screening, there remained a total of \(N=205\) papers on which we conducted our detailed thematic analysis [51].
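The paper counts reported in Sects. 2.1 to 2.4 can be checked with a few lines of arithmetic. The sketch below treats the stages as a single linear funnel, which is an assumption made for illustration (Fig. 1 remains the authoritative breakdown); the stage labels and the function name `removed_per_stage` are ours.

```python
# Paper counts remaining after each stage, as reported in the text.
STAGES = [
    ("database and keyword search", 2903),
    ("title screening", 352),
    ("duplicate removal", 280),
    ("work-in-progress and short-paper removal", 231),
    ("abstract and full-text screening", 205),
]


def removed_per_stage(stages):
    """Return (stage name, papers removed, papers remaining) for each step."""
    rows = []
    for (_, before), (name, after) in zip(stages, stages[1:]):
        rows.append((name, before - after, after))
    return rows


rows = removed_per_stage(STAGES)
for name, removed, remaining in rows:
    print(f"{name}: removed {removed}, remaining {remaining}")
```

The differences agree with the figures stated in the text: 72 duplicates removed and 26 papers excluded during abstract and full-text screening.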

2.5 Analysis

Our final set of data included a total of \(N=205\) papers on which we conducted detailed analysis in two parts. First, a thematic analysis was conducted to evaluate specific aspects of the papers including technical applications and policies. Thereafter, a detailed analysis of the user studies was conducted to understand more about the user issues as per the goal of this work.

Table 1. Distribution of the number of papers based on the thematic analysis

Thematic Analysis: To perform a thematic analysis, we reviewed the abstract, methods, results, discussion, and conclusion of the 205 collected papers obtained from full-text screening. Two researchers evaluated this collection of papers by first reviewing 20 randomly selected papers to generate the codebook. The codebook consisted of 119 open codes which were themed into eight overarching themes including: technological solutions proposed, evaluation of current model with privacy frameworks, systematic literature reviews, evaluation of patient data focusing on the big data storage and management, ethical and legal implications of research, author notes and overview of the current healthcare practices to protect user data, case studies on particular incidents occurred as in data breaches, and finally the user studies.

Table 1 shows the distribution of the papers as per the categorization of all of the 205 papers. This can be further examined in Fig. 2. Any paper that included any form of user study, even if that was not the paper’s primary theme, was marked in the user study category. This was specified given the user-focused aspect of the paper. After conducting the first set of analysis, we performed another set of thematic analysis to categorize the papers which studied technological solutions to address healthcare privacy and security challenges. Given the large number of technical solution-focused papers, we have detailed them in Table 2 to explore more on what type of technical solutions were proposed by the prior works.

Fig. 2. A snapshot of the themes discussed throughout the analysis

User Study Analysis: After the two phases of thematic analysis, we conducted a detailed user study analysis where we focused on the \(n2=18\) user studies. We extracted the quantitative and qualitative findings to assess which user and technical perspectives prior healthcare-focused research had examined. We provide details of both the technical solutions analyzed in this work and the user studies in the following section.

Out of the 18 user-focused studies, four were qualitative [1, 24, 48, 99], 12 were quantitative [23, 36, 43, 49, 69, 71, 80, 140, 143, 162, 170, 177], and two were mixed-methods studies [28, 133]. The quantitative studies comprised nine survey-based studies [23, 49, 69, 71, 80, 143, 162, 170, 177], one cross-sectional study [36], one in-lab simulation-based study [43], and one randomized controlled experiment [140]. The qualitative studies comprised three interview-based studies [1, 24, 48] and one field study [99]. Among the qualitative studies, Baker et al. also observed their interview participants [24]. Of the two mixed-methods studies, one combined an online survey with content analysis [133], and the other conducted semi-structured interviews with 16 care managers at 12 health centers in three states [28].

3 Findings and Discussions

As described earlier, we first started with the thematic analysis of the collected papers where we found eight overarching themes. Thereafter, we detailed the technical solutions provided by the papers, and finally performed a detailed literature analysis on the small subset of user studies identified. In this section, we will first provide details of the thematic analysis and thereafter, we will provide details and evaluation of the user studies.

3.1 Thematic Analysis

For each of the 205 papers, we collected details about the methods, results, analysis, discussion, and implications. Thereafter, we analyzed the data collected, and categorized them into eight themes as shown in Table 1. For this we particularly looked into the methods, results, and discussions of the mentioned papers. We then performed a detailed analysis on the technical solutions and the user studies, which will be discussed in the later subsections.

Technical Solutions Discussed: Nearly half of the collection, \(n1=97\) (47.32%) out of \(N=205\) papers, focused on proposing a technology-based solution for the privacy and security issues of the healthcare sector. To understand further, we classified the technical solutions proposed by the authors. Table 2 as well as Fig. 3 show the distribution of the papers based on the several types of technological solutions proposed by the authors to enhance the privacy and security of the data transferred and accessed in the healthcare sector. Many of the papers use a combination of the technical solutions, for example using cryptography for authentication or using encryption to do image processing. However, here we used mutually exclusive codes to focus on the primary solution proposed by the paper after going through the full-text.

Table 2. Distribution of the papers providing technical solutions out of the \(n1=97\) papers which proposed privacy and security solutions of the healthcare organization

Fig. 3. A snapshot of different technology based solutions for healthcare data privacy and security

Data Encryption: Out of the \(n1=97\) technology-focused papers, about a third discussed encryption techniques to protect the data. A total of 32.99% of the papers discussed how patient data can be encrypted and anonymized for robust security of health-related data [6, 16, 17, 25, 29, 30, 31, 42, 57, 86, 92, 96, 103, 120, 147, 153, 159, 161, 166, 167, 188, 189, 195, 197, 201, 209, 210, 211]. For example, Sudha and Ganesan, while discussing the lack of security of Electronic Medical Records (EMR), propose a Pervasive Mobile Healthcare system in which multimedia medical records are protected using an Elliptic Curve Cryptography algorithm [186]. Gupta and Metha discuss the importance of transmission of medical data over unsecured networks, and propose a chaos-based encryption scheme to secure medical images [76].
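To give a sense of what "chaos-based" encryption means in general, the toy sketch below derives a keystream from the logistic map and XORs it with pixel bytes. It is not the scheme proposed in [76] and is not secure; the function names, the default key values `x0` and `r`, and the burn-in length are all assumptions chosen for illustration.

```python
def logistic_keystream(x0, r, length, burn_in=100):
    """Generate `length` key bytes by iterating the logistic map x -> r*x*(1-x)."""
    x = x0
    for _ in range(burn_in):  # discard early iterations
        x = r * x * (1.0 - x)
    stream = bytearray()
    for _ in range(length):
        x = r * x * (1.0 - x)
        stream.append(int(x * 256) % 256)  # quantize the state to one byte
    return bytes(stream)


def xor_cipher(data, x0=0.3141592, r=3.99):
    """XOR data with the keystream; applying it twice restores the input."""
    keystream = logistic_keystream(x0, r, len(data))
    return bytes(b ^ k for b, k in zip(data, keystream))


pixels = bytes(range(16))  # stand-in for a row of grayscale pixel values
encrypted = xor_cipher(pixels)
decrypted = xor_cipher(encrypted)
print(decrypted == pixels)
```

The pair (`x0`, `r`) plays the role of the secret key here; published chaos-based image ciphers typically add permutation and diffusion stages that this sketch omits.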

Blockchain: Another important focus on the technological solution found in our collected sample was on blockchain technology [8, 13, 33, 38, 50, 72, 75, 112, 149, 154, 155, 181]. These papers explore the peer-to-peer network topology of the blockchain, which implements a distributed ledger technology focusing on the transparency of the network [141]. For example, Brunese et al. propose a blockchain-based technology aimed to protect information exchanges in hospital networks, with particular regard to magnetic resonance images by implementing formal equivalence checking to validate the network of the transiting data [38].

Image Protection: As discussed previously, there are papers which discussed how encryption and blockchain can be used to protect medical data in the form of images. However, we found 12 papers which explored the different technical implementations to specifically protect medical images [18, 21, 27, 40, 42, 62, 97, 102, 114, 171, 173, 182]. For example, Kumar et al. propose embedding patient information into a medical image through data hiding to improve security and confidentiality when medical information is disseminated [114]. Their proposal is notable because it addresses not only embedding the text into images, but also the importance of protecting these images.
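One common form of data hiding is least-significant-bit (LSB) embedding, sketched below on a flat list of 8-bit pixel values. This is a generic illustration and not necessarily the method used in [114]; the function names and the sample identifier are hypothetical.

```python
def embed(pixels, message):
    """Hide `message` bytes in the least-significant bits of `pixels`."""
    bits = [(byte >> i) & 1 for byte in message for i in range(7, -1, -1)]
    if len(bits) > len(pixels):
        raise ValueError("image too small for message")
    stego = list(pixels)
    for idx, bit in enumerate(bits):
        stego[idx] = (stego[idx] & ~1) | bit  # overwrite only the lowest bit
    return stego


def extract(pixels, n_bytes):
    """Read back `n_bytes` bytes from the least-significant bits."""
    out = bytearray()
    for b in range(n_bytes):
        value = 0
        for p in pixels[b * 8:(b + 1) * 8]:
            value = (value << 1) | (p & 1)
        out.append(value)
    return bytes(out)


image = list(range(100, 180))  # 80 stand-in grayscale pixel values
patient_id = b"ID:12345"       # hypothetical identifier, 8 bytes = 64 bits
stego_image = embed(image, patient_id)
recovered = extract(stego_image, len(patient_id))
print(recovered)
```

Because only the lowest bit of each pixel changes, every pixel value differs from the original by at most one, which is why such embedding is visually unobtrusive.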

Watermarking: A particular aspect of image protection was digital watermarking. There were eight papers which focused on the watermarking aspect of medical image protection [22, 65, 66, 98, 137, 165, 182, 200]. Vidya and Padmaja focused on enhancing the security of Electronic Patient Record (EPR) data which enable tele-diagnosis. They propose watermarking by embedding EPR into the facial photograph of the patient and discussed implementing a Photoplethysmography signal from the forefinger tip of the patient for authentication which had a success rate of 98% against security breaches [200].

Access Control and Authentication: Seven papers focused on making the security protocols of the healthcare system robust by addressing access control and authentication in particular [20, 73, 85, 105, 157, 172]. Izza et al. focused on Internet of Things-based e-healthcare and a radio frequency identification (RFID) authentication scheme for Wireless Body Area Networks (WBANs). Their protocol, which they report to be effective against digital threats, implements an elliptic curve digital signature with message recovery [85].

Mathematical Modeling: We found that five of the collected papers utilized statistical and other mathematical models to provide solutions to the security threats of the healthcare organization [44, 121, 122, 124, 192]. Chaudhury et al. discuss the Supervisory Control And Data Acquisition (SCADA) systems used for medical data transfer and how Impulsive Statistical Fingerprinting (ISF) can be implemented to substantiate the conversion of sensitive health data through the ISF into a secure Health Level 7 (HL7) format [44].

Network-based Solutions: Five (5.15%) of the papers discussed network-based solutions to resolve the privacy and security complexities of healthcare systems [26, 88, 106, 206, 213]. For example, Wang et al. detail the WBAN and introduce the key technologies and characteristics of wireless sensor networks, with a focus on node localization. They emphasize the importance of network localization algorithms and of performance evaluation indicators for wearable 3D node localization algorithms in protecting patients' healthcare data [206].

AI and ML-based Solutions: Out of \(n1=97\) papers, we found that four (4.12%) papers discussed artificial intelligence and machine learning-based solutions to address the privacy and security issues of the healthcare sector [45, 95, 156, 183]. PraveenKumar proposes health and temperature sensors to monitor patient health data, which is transmitted to a microcontroller. The real-time data is then monitored and analyzed using k-means clustering and can inform both the patient and the doctor [156].

Web-based Solutions: Web-based solutions were proposed in four papers in our collection, where any form of web-based technical solutions to improve privacy and security of the sensitive data of the patients was discussed [19, 117, 191, 207]. Tian et al. looked into clinical prognosis prediction models based on electronic health record data and developed a web service based on multi-center clinical data called POPCORN. The PrognOsis Prediction based on multi-center clinical data CollabORatioN (POPCORN) focused on the standardization of clinical data expression, the preservation of patient privacy during model training using a multivariable meta-analysis, and a Bayesian framework [191].

Cloud-based Solutions: Four out of 205 papers discussed cloud-based solutions to address the privacy and security issues of patient data protection [12, 58, 101, 134]. Khan et al. present a secure cloud-based mobile healthcare framework using WBANs, where the framework secures inter-sensor communication through a multi-biometric-based key generation scheme [101].

Edge Computing-based Solutions: Several prior papers have discussed edge computing, but we found three papers which focused on edge-computing-based solutions [7, 9, 119]. Edge computing is a distributed, open IT architecture that features decentralized processing, in which data is processed by the device itself or by a local computer or server rather than being transmitted to a data center [175].

Treatment Continuity: An interesting paper by Zhang et al. [214] pointed out a serious consequence of cybersecurity breaches: pausing or preventing the continuous treatment of patients suffering from critical ailments. Their proposed solution focuses on periodically and automatically retrieving essential information from the clinical radiation oncology information systems for each patient under treatment and providing backup through secondary data servers in the event of an attack on one of the servers [214].

Healthcare Frameworks: Of the 205 papers collected, 34 (16.59%) papers studied or introduced new healthcare data management frameworks. A paper was considered under the theme of healthcare frameworks if the main subject of its study was a security, privacy, or design framework [5, 11, 15, 35, 37, 39, 60, 67, 70, 83, 84, 90, 110, 113, 115, 116, 118, 126, 128, 132, 135, 145, 150, 152, 160, 164, 178, 180, 198, 208]. These papers particularly describe methods to design a secure and private technology for healthcare data usage. One such paper, “A Security Framework for Mobile Health Applications”, introduced a security framework for healthcare mobile applications, taking usability and security into consideration [190]. Ibrahim et al. introduced a framework for securely sharing electronic health records over the cloud between different healthcare providers. This framework ensures the confidentiality, integrity, authenticity, availability, and auditability of the electronic health records [82].

Data Storage and Management: Papers were classified as data storage and management if the research was related to healthcare data access, manipulation, or the different technologies allowing for medical data storage. We found 18 (8.78%) such papers in our corpus [32, 54, 68, 74, 77, 81, 94, 107, 108, 109, 129, 138, 151, 163, 168, 179, 202, 205]. In particular, Duque et al. introduce a distributed data management architecture with a focus on healthcare data security and high performance requirements [54]. On the other hand, Petković was concerned about the reliability of data transmitted through remote patient monitoring systems, since the data is collected by patients with no medical supervision [151]. Petković addresses this issue by proposing several approaches that minimize the risks and ensure high information reliability.

Overviews: Overview papers include works which consolidate the prior work on healthcare privacy and security by adding details of the current state of privacy and security in the organizations and also adding details of the new technologies implemented. Of the 205 papers, 16 papers (7.80%) discuss or review the healthcare privacy and security domain [2, 3, 4, 41, 59, 100, 130, 142, 144, 148, 158, 187, 196, 204]. Of these, Paksuniemi et al. give an overview of wireless technology devices and reveal the importance of implementing security measures in these technologies to enable secure patient monitoring [144]. Moreover, Wang provides an overview of the security threats imposed by smart devices which monitor patients through internet-connected technologies. Wang details two primary security-related issues for Internet-based tele-medicine systems that need to be addressed: (1) medical data protection needs; and (2) system design issues [204].

Ethical and Legal Implications: Of the 205 papers, ten (4.88%) papers studied the ethical and legal ramifications of data leaks occurring due to healthcare data breaches [63, 72, 78, 79, 91, 104, 131, 169, 193, 199]. These papers particularly explore violations in U.S. healthcare standards, including the Health Information Technology for Economic and Clinical Health (HITECH) Act [79], which proposes the meaningful use of interoperable electronic health records throughout the U.S. healthcare delivery system as a critical national goal. Hollis also discusses how, beyond medical data security, healthcare staff are ethically required to anonymize the data so other staff are unable to uniquely identify a patient through their stored data [78].

Case Studies and Data Breaches: Case studies and data breaches both document real-world outcomes including common violations of security and privacy. Both are insightful to illuminate contemporary issues and research should seek to help develop proactive defenses that decrease the prevalence and impact of incidents and data breaches. We found that six (2.93%) of the 205 corpus papers were case studies and data breaches, classified as such when authors studied a particular organization, data protection practices, or particular incidents of data breaches. Some of these case studies chose different countries for their analysis [34, 46, 61, 136, 176, 212]. The organizations which were studied spanned global geography including India [176], United States [46], Saudi Arabia [34], and Morocco [136].

Yesmin and Carter created an evaluation framework for automated privacy auditing and found that 98.09% of 55,000 accesses of protected health information by staff in a hospital were identified as appropriate and the tool was unable to identify the remaining 1.91% of accesses [212]. Choi et al.’s work estimated changes in health information technology investments by tracking spending by U.S. hospitals between 2012 and 2016. Their results found that health information technology spending increased by 26.0% in one year after a breach [46]. These studies have been critical to understanding the real world but do not mention the stakeholders who were responsible or whose data were breached and how that may impact patients’ lives.

Systematic Literature Reviews: Of the 205 papers analyzed, six (2.93%) were systematic literature reviews [47, 87, 89, 93, 146, 203]. These studies gave an overview of the current standards and practices followed in the healthcare sector while mentioning the importance of a focus on healthcare privacy and security. However, these studies did not focus on or explore the user perspective. For example, Walker et al. implemented a mixed-method systematic review by analyzing about 300,000 papers and found evidence of high heterogeneity across crude data, indicating that the effectiveness of security measures varies significantly in healthcare, but concluded without a solution for insider attacks [203].
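The theme counts quoted in this subsection can be cross-checked against the corpus size. The sketch below uses the counts as stated in the running text and recomputes each share to two decimal places; the dictionary labels and the function name `theme_shares` are ours, and Table 1 remains the authoritative source.

```python
# Theme counts as stated in the text of Sect. 3.1 (user studies: n2 = 18).
THEME_COUNTS = {
    "Technical solutions": 97,
    "Healthcare frameworks": 34,
    "Data storage and management": 18,
    "User studies": 18,
    "Overviews": 16,
    "Ethical and legal implications": 10,
    "Case studies and data breaches": 6,
    "Systematic literature reviews": 6,
}


def theme_shares(counts):
    """Return the corpus total and each theme's share in percent (2 decimals)."""
    total = sum(counts.values())
    shares = {theme: round(100 * n / total, 2) for theme, n in counts.items()}
    return total, shares


total, shares = theme_shares(THEME_COUNTS)
print(f"total papers: {total}")
for theme, pct in shares.items():
    print(f"{theme}: {THEME_COUNTS[theme]} ({pct}%)")
```

The eight counts sum to the corpus size of 205, which is consistent with each paper being assigned to exactly one theme.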

3.2 Analysis of User Studies

In addition to our analysis of the technical solutions proposed in the collection, we performed a detailed analysis of the user studies (\(n=18\)). Our goal was to understand and assess the studies which evaluated user perception towards the privacy and security of their healthcare-related data. We performed a thorough analysis of the user studies and analyzed certain aspects of the study such as type of study conducted, study populations, duration, and medical settings.

Table 3. Percentage and number of studies in settings with various population densities along with details about the user study durations.

Study Method: Of the 18 user studies in our corpus, 66.67% (12) were quantitative studies. Among these, 50% (9) of the 18 studies were surveys [23, 49, 69, 71, 80, 143, 162, 170, 177], 5.56% (1) was a quantitative descriptive study [36], 5.56% (1) was a simulation-based study with a quantitative sample [43], and 5.56% (1) was a randomized controlled trial [140]. Of the other studies, 11.11% (2) were mixed-methods surveys with open-ended questions and a smaller population sample [28, 133], 5.56% (1) was a field study [99], and 16.67% (3) were qualitative interview-based studies [1, 24, 48]. Among the 18 user studies, only one assessed a proposed technological intervention. This evaluation involved the efficiency and convenience of a mobile app for managing diabetes [1]. Participants noted that one of its advantages was compliance with hospital regulations for patient data security.

Study Duration: The majority of the studies were completed in a single session (Table 3) [23, 49, 69, 71, 133, 143, 170, 177]. However, an evaluation of a diabetes management app ran over 12 weeks [1], and the randomized controlled trial of telehealth ran over a 12-month period [140]. In addition, a survey of public perception of mobile phones’ effect on healthcare was repeated in 2013 and 2014 [162], and a field study in Nigeria was conducted over four weeks [99]. Such longitudinal studies are particularly important for understanding users’ privacy and security perspectives and how these perspectives change (or do not change) over time.

Population Distribution: As shown in Table 3, many of the 18 papers did not report the population distribution of the participants (44.44%, 8) [23, 28, 49, 69, 99, 143, 177]. Most of the remaining studies were conducted in urban settings (33.33%, 6) [1, 36, 43, 80, 140, 170], and one (5.56%) was conducted in a rural setting [24]. No papers reported on suburban population settings.

Table 4. Percentage and number of studies conducted in various healthcare facilities along with the number of study participants for different user studies.

Study Population Setting: Nine of the 18 user-focused papers studied healthcare providers [24, 28, 48, 49, 69, 71, 99, 133, 177], while only one studied patients exclusively [170]. Three papers studied a mixed population of patients and healthcare providers [1, 23, 140]. The mixed-methods studies focused only on healthcare providers; similarly, 75% of the qualitative studies focused on healthcare providers.

Study Geographical Location: Out of the 18 studies, four were conducted in the USA [24, 28, 140, 162] and five in the European Union [48, 69, 71, 133, 177], and one was conducted in both Europe and USA [49]. One paper that conducted their study with participants in Europe included 30 countries [177] and one included 24 European countries [133]. Only one study was conducted in Turkey [36], two in Africa (both in Nigeria)  [23, 99], and two in the Middle East [1, 170]. Three quantitative studies were conducted in Asia specifically India, Malaysia, and Hong Kong [43, 80, 143].

Study Context: Two qualitative studies were conducted in medical settings other than hospitals and private practice [1, 99]; one was conducted in private practices [24] and one in three different hospitals [48] (Table 4). Quantitative studies reported settings including hospitals [69, 71, 177], medical settings other than hospitals and private practice, such as medical schools [36, 43, 49, 80, 143], patients’ home environments [162, 170], and mixed settings [23, 140]. None of the quantitative studies focused on private practice settings. This is noteworthy, as the privacy and security of medical data are critical irrespective of the setting. Thus, studies focusing on more diverse medical settings are needed.

Number of Participants: One of the four qualitative studies did not report the sample size. Of the remaining three, the largest reported 50 participants and the other two each reported 14 participants. All the quantitative and mixed-methods studies reported the sample size. A total of 94 participants were in qualitative studies, 5,856 (Median=429, IQR=581, Range=50–1242) in quantitative studies, and 117 (Median=58.5, IQR=42.5, Range=16–101) in mixed-methods studies.

4 Implications

We acknowledge the contribution of these previous works towards enhancing the privacy and security of sensitive patient data. However, we note that more research is needed to fully understand the challenges to healthcare security and privacy.

4.1 Holistic Security Approach

When security or privacy are a secondary goal of the users, research is needed to understand the motivations behind the circumvention of controls. From our analysis of the user studies, we have identified three major themes pertaining to the human factors of information security in healthcare, namely: inconsistent access controls, non-compliant and insecure communication modes, and disruptive update and backup policies. The majority of the past security research involving people in healthcare has focused on understanding how providers may circumvent authentication [184], including the discovery that providers often share login credentials with each other due to inconsistencies in access control policies [24, 48].

Access controls and privileges in healthcare are often designed without considering the individual provider’s needs or the multitude of tasks conducted by them on a day-to-day basis. Rather, they are often designed in a tiered manner where senior doctors have the most privileges and junior doctors and nurses are assigned limited privileges [24, 48, 69]. Therefore, whenever a provider (e.g., a nurse or junior doctor) needs immediate access to a certain system or patient record for providing critical care but does not have the necessary privileges, credentials are shared, usually by the senior doctors in these settings. This type of credential sharing also occurs when someone needs access at a critical time but has not completed the necessary training [47]. In addition, past research also discusses other general, known issues associated with password usage, such as using insecure passwords, task interruptions, disabling authentication, or keeping machines unlocked for long periods of time. Access control cards are used to counter these password usage issues, but they still do not address the security circumvention issues discussed earlier [177].

The other dominant theme involved secure communication between providers and patients, or the lack thereof. A few papers noted that providers often used non-HIPAA compliant messaging software to share test results with patients and also with each other [1, 48]. For example, providers have been known to share images of scan reports with patients using WhatsApp, a popular messaging platform from Facebook. Providers may be placing inappropriate trust in these messaging platforms based on the end-to-end encryption claims made by these platforms. More research is necessary to understand the challenges involving the use of recognized, HIPAA compliant messaging systems (e.g., American Messaging System or AS) for communicating securely between providers and between providers and patients.

The final theme that emerged from our analysis concerned the application of security updates and automatic backups. Providers report updates and backups appearing at inappropriate times, such as while engaging with patients [48]. More research is necessary to determine how updates can be timed so that they are reasonably quick and non-disruptive to the workflow of the providers. Unsurprisingly, technologies including encryption, blockchain, cloud, and access controls were popular topics in the research literature. While these technologies represent an important area for future opportunities and threats in healthcare, they remain distant and disconnected from real-world needs today. Their over-representation in the literature, therefore, overshadows the analysis of security and privacy practices today.

The rollout of any new technology in healthcare is slow given strict legal and compliance constraints. Despite these new technologies, other technical solutions were notably missing that may hold promise for healthcare security and privacy. For example, continuous authentication may aid healthcare workers by using biometrics or hardware tokens to lock and unlock computers when an authorized user is in physical proximity. The user studies of security circumvention suggest that automated security features may be helpful, building on the effectiveness of features such as automated software updates. Additionally, despite the popularity of machine learning solutions in various fields, we were surprised that these solutions were not prominent in our healthcare corpus.

4.2 Focus on Private Practice Healthcare

The studies we analyzed focused heavily on hospitals and other large medical settings despite the fact that those represent a narrow view of all healthcare workplace settings. Hospitals are atypical because they are among the most well-resourced settings for controlling, implementing, and enforcing security and privacy controls. Those resources enable higher than average investment in security and privacy solutions, technical support, and organizational security culture. The problems that manifest in hospitals, and solutions for them, should not be assumed to generalize to other medical settings.

The literature appears to emphasize that improving health is the primary objective in healthcare, with security and privacy among secondary goals. Small businesses may have slimmer margins to apply to those non-primary goals. They need help to prioritize spending and implementation of privacy and security controls, and the research community should prioritize the most impactful needs first. In a study of private practice audiology clinics, Dykstra et al. found that expertise, time, and money were reported as the primary limitations to better cybersecurity [56]. While these limitations are not unique to healthcare, they must be more explicitly acknowledged when proposing new security and privacy mitigation measures. For example, one might imagine that a doctor in a single-provider clinic may circumvent a compliant telehealth solution and revert to a non-compliant personal device given a hardware failure in the practice. Thus, a focus on studies reviewing such nuances will be critical, especially for private practice and other resource-constrained healthcare organizations.

4.3 Studies in Rural Setting and Developing Nations

Along these lines, we observed scarce security and privacy research related to rural settings and developing nations. The resource limitations of these settings demand a dedicated study of the population and appropriate technological mitigation techniques. The healthcare sector and research communities alike require the insights of economics. None of the papers in our survey offered a robust analysis of the probability of various vulnerabilities that would aid resource-limited organizations in prioritizing solutions. Economic models, such as the Gordon-Loeb model, may be effective in suggesting investment strategies [64]. Economics research may also wish to explore the costs and benefits of cybersecurity policy decisions in medical settings, insights about attacker motivations, and oppositional human factors to disrupt attacker cognition and decision making.
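To illustrate how such an economic model could inform prioritization, the following is a minimal sketch of the Gordon-Loeb calculation. It assumes the model's first class of breach probability functions, \(S(z, v) = v / (\alpha z + 1)^{\beta}\), where \(v\) is the vulnerability (the probability of a breach with no investment), \(z\) is the security investment, and \(\alpha\) and \(\beta\) are productivity parameters. The function names and all numeric values are hypothetical and chosen only for illustration; none of them come from the surveyed papers.

```python
import math


def breach_probability(z, v, alpha, beta):
    """Assumed class I form: S(z, v) = v / (alpha * z + 1) ** beta."""
    return v / (alpha * z + 1) ** beta


def expected_net_benefit(z, v, loss, alpha, beta):
    """Reduction in expected loss from investing z, minus the investment."""
    return (v - breach_probability(z, v, alpha, beta)) * loss - z


def optimal_investment(v, loss, alpha, beta):
    """Investment z* that maximizes expected net benefit for the class I form.

    Setting the derivative of the net benefit to zero gives
    (alpha * z + 1) ** (beta + 1) = v * alpha * beta * loss.
    """
    marginal = v * alpha * beta * loss
    if marginal <= 1:
        # The first unit of spending returns less than it costs.
        return 0.0
    return (marginal ** (1 / (beta + 1)) - 1) / alpha


# Hypothetical small clinic: 50% breach likelihood, 100,000 potential loss.
v, loss, alpha, beta = 0.5, 100_000, 0.001, 1.0
z_star = optimal_investment(v, loss, alpha, beta)
print(f"optimal investment: {z_star:.2f}")
print(f"Gordon-Loeb upper bound (v * loss / e): {v * loss / math.e:.2f}")
```

Under these assumptions, the optimal investment stays below \(1/e\) (about 37%) of the expected loss \(vL\), which is the well-known Gordon-Loeb bound. For a resource-limited practice, the value of such a sketch lies less in the exact figure than in making the assumed vulnerability and loss estimates explicit.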

4.4 Understanding the Patient’s Perspective

Among the user studies we analyzed, the majority have focused on understanding the security behaviors of healthcare workers. However, patients’ perspectives appear to be largely overlooked. Security and privacy requirements should be informed and driven primarily by the desires of patients about their own data. Patients, as voting citizens, influence healthcare laws and regulations through their choice of elected officials. Patients are also the most directly impacted by security breaches. More research is necessary to understand the gaps in patients’ understanding of the implications of a security breach for their personal data. Research is also necessary to understand how much (or how little) trust patients place in their healthcare organizations to protect their personal data [111].

5 Limitations and Future Work

Healthcare is a broad and diverse sector with many niche journals and publications. Despite our best efforts, we may have missed important contributions reported in publications for medical sub-specialities published in paid venues or otherwise excluded by our search criteria. Future work is needed to understand when, how, and why healthcare workers circumvent compliant workflows and tools. Prior work has focused primarily on authentication-related circumvention and usability, and a broader examination is warranted. Further, past research has drawn heavily from surveys, so in-situ data would provide further grounding and accuracy.

6 Conclusion

As the healthcare sector is increasingly digitized, privacy risks and security concerns about data storage, access, and transfer have greatly increased. However, the question remains about how the research community is addressing these concerns from the technical and user perspectives. To this end, we conducted a detailed systematic literature review after collecting 2,903 papers and thematically analyzing \(N=205\) of them. These peer-reviewed research articles were published and available over seven digital spaces: ACM DL, Google Scholar, SSRN, ScienceDirect, IEEE Xplore, PubMed, and MEDLINE. We examined the security and privacy of patient data in healthcare organizations as studied by prior literature. We found that current research focuses primarily on data encryption and frameworks while understudying users' risk perceptions of privacy and security. Along the socio-technical component of healthcare privacy and security, it was concerning to note that < 9% of the papers conducted any user studies. Among those, the studies were dominated by survey designs rather than in-depth, longitudinal user-focused studies. Additionally, these studies focused primarily on larger settings, largely ignoring organizations with limited resources such as the private healthcare sector. We conclude with actionable recommendations from the rich literature we studied that can enhance the privacy and security aspects of the healthcare sector.