1 The Discipline of Social Media Research: Scope, Theories and Principles

1.1 An Introduction to Social Media Research

The internet has had widespread uptake around the globe and offers opportunities and challenges for risk communication on the safety of medicines. In 2015, there were 3.2 billion internet users worldwide, 63% of them from low- and middle-income countries (International Telecommunication Union (ITU) 2015). Even in least-developed countries, a significant number of people access the internet regularly, especially from handheld devices operating over cellular data networks. The internet is used for disseminating and accessing information via websites, electronic mail, purchasing goods, and engaging via social media. In 2015, nearly two-thirds of adults (65%) in the United States (US) used a social networking site. While the majority of these individuals were aged between 18 and 29, 35% of adults aged 65 and older were using social media (Perrin 2015).

1.1.1 Social Media Listening

Billions of people interacting with the internet, or being “online”, on a daily basis generate traces of important information that can be aggregated and analysed for research purposes. The process of using social media to understand how consumers discuss specific topics in online spaces is known as social media listening (Powell et al. 2015). Typically, social media listening is a passive process for the social media users and has been used for commercial purposes, like marketing and retail. However, a large amount of daily discussions in social media pertains to health information and diseases, as well as biomedical and medical products that address these conditions (medicines, devices, vitamins, supplements, etc.) (Powell et al. 2015). Many of these health-related discussions are generated by patients, comprising a large corpus of free-text narratives that can be leveraged for health-specific research (Powell et al. 2015).

1.1.2 Crowdsourcing of Information

Crowdsourcing of information, on the other hand, is generally an active process whereby online participants are solicited for specific information. It may be defined as the systematic effort to collect information from a wide audience, particularly through online tools that can provide mutual benefits to participants and activity sponsors (Bahk et al. 2015).

In practice, both active and passive processes may be used within a single research project. For example, social media listening may be used to form hypotheses, which are then tested using crowdsourced data. Conversely, social media listening can be used post hoc to contextualise and make sense of unexpected crowdsourced information, such as jargon and acronyms.

Information from patients is traditionally captured through qualitative research or surveys, usually with a series of standardised questions (see Chap. 8). However, listening to the patient voice in this structured way may limit the scope of patients’ responses and their willingness to discuss sensitive topics. In contrast, unstructured discussions acquired from online forums, particularly those dedicated to a specific therapeutic area or treatment, come directly from patients and could provide a wealth of information that is typically not captured in traditional studies. Metadata derived from posts and user account profiles can provide a more complete picture for research applications than a single post alone, and may paint a more comprehensive view of a patient’s life than a cross-sectional survey response.

The body of literature on digital health is expanding rapidly (Rothman et al. 2015). While the use of social media and the development of social media strategies are important topics for research, in this chapter we narrow the focus. Our aim is to describe how to apply emerging tools of social media research, a new discipline under formation, to the post-authorisation safety surveillance of medicinal products and to pharmacovigilance overall. This includes their application to medicinal product risk communication research, in particular for planning and evaluating communication interventions.

1.2 Pharmacovigilance, Risk Communication and the Social Media

Pharmacovigilance monitors a medicinal product to identify and assess adverse events that may occur in patients. Adverse events causally associated with a medicine (i.e. adverse reactions) pose a patient and public health problem. However, both rare and late reactions are difficult to uncover through clinical trials during the development of a medicine because trials typically include at most a few thousand patients and are relatively short compared to long-term use of medicines in real life. For this reason, safety surveillance after product approval by the regulatory body and during use in healthcare is of critical importance to safeguarding the availability and development of medicines. Legal obligations for pharmaceutical manufacturers and established practices during this post-authorisation phase concern characterising, preventing, and minimising risks related to medicinal products. Fundamental to these pharmacovigilance processes are the continuous exchange and (re)assessment of risk information. Many organisations currently use a combination of automated and manual processes to perform the necessary pharmacovigilance duties, including the handling of traditional individual case safety reports, i.e. reports of an adverse reaction suspected in a patient, which are submitted as so-called spontaneous reports through national reporting systems. Harms related to medication errors or product quality concerns may also be reported, depending on national definitions and requirements. Reports can be submitted via telephone, paper, email, fax, online forms, and mobile apps. Nowadays, evidence from observational studies, in addition to spontaneous reports, is very important for further investigating safety concerns or proactively monitoring a medicine at the population level.

More recently, regulatory authorities and other stakeholders have recognised the importance of capturing the patient voice and patients’ data contributions for pharmacovigilance. As such, regulatory authorities in many countries encourage patients to report adverse events they suspect with their medicines, and recommend testing risk communication for patient comprehension, even asking for patient input on proposed risk minimisation/communication plans and strategies (Snipes 2015). In general, the patient voice has been established as an important addition to a variety of medical research initiatives (Smith and Benattia 2016). Patient-reported outcomes are now accepted in clinical trials, and there is a renewed focus on patient-reported outcomes derived from unstructured data in other types of research, such as comparative effectiveness (Peacock 2014).

Starting in 2011, questions about the future of social media and pharmacovigilance were raised by senior figures in the field (Edwards and Lindquist 2011). With the rise of social media usage, there is potential for social media to be incorporated into effective pharmacovigilance (PatientsLikeMe 2019), including risk communication, by manufacturers, regulators, and others involved. Social media can be perceived as a new data source to inform pharmacovigilance and risk communication. Nevertheless, the sheer volume of posts and their tenuous evidence of causality give rise to legitimate concerns about muddling data from social media with vetted data from carefully honed pharmacovigilance information systems. Yet, the processes in place globally for pharmacovigilance information processing offer a potential framework for dealing with social media data. This will require a careful balance of human and machine tasks, tempered by vastly different concepts of privacy and collaboration.

This chapter provides an overview of how social media research may be used to augment current medicines safety surveillance and risk communication practices through case studies, discussion of its potential opportunities for benefits and limitations, ethical and legal concerns, as well as practical lessons learnt and future outlook. This includes a synopsis of the current public debate on the usefulness of social media research in pharmacovigilance, underpinned by examples. Many high-quality reviews of existing applications have been published recently (Rees et al. 2018; Convertino et al. 2018; Tricco et al. 2018; Wong et al. 2018; Demner-Fushman and Elhadad 2016; Golder et al. 2015; Lardon et al. 2015; Sloane et al. 2015; Sarker et al. 2015) and should be consulted for more in-depth discussion of topics like the merits of particular data sources and computational methods.

2 Research Approaches and Methods

2.1 Selection of Social Media Sites

For clinical trials and epidemiological studies, site selection is central to investigating causal inference from observed associations. Similarly, a wide variety of social media platforms currently exist, and each may be used primarily by a different population; therefore, one social media site may be more appropriate for a specific research purpose than another. Permissions associated with a specific site might only allow for use of certain information. Additionally, each site’s users may have a unique demographic profile that could change over time. For research projects that are interested in specific, well-defined topics or events, Twitter might be useful due to the hashtag (#) feature, which groups posts by topic; hashtags are a means of organising content in social media, akin to folders in traditional computer operating systems or electronic mail (Grajaless et al. 2014), although the platform limits the length of each post. Researchers specifically have been able to utilise Twitter to connect with patients or potential patients about a variety of health topics. However, for privacy reasons, healthcare professionals and patients should be cautious about what content they publicly share (Grajaless et al. 2014). Closed social media platforms, such as a site for patients of a clinical practice, allow patients to be actively involved in their care coordination, track their clinical progress, and have greater access to their physicians (Grajaless et al. 2014). While this is beneficial to the patient, this information is often unavailable for research projects. Alternatively, online patient communities offer a theoretically more secure healthcare forum for patients to communicate with one another. These sites are more likely to partner with stakeholders who are interested in using online patient narratives in research that will directly benefit the patients who originally generated the data; however, a site’s terms of use may require organisations to pay or to follow certain guidelines to access the raw data, and standards for informing patients or obtaining their consent vary.

2.2 Study Designs

Studies using social media data often default to cross-sectional epidemiologic designs because they are straightforward to conduct. Metadata about the user account (such as patient gender and location) that accompanies an individual message posted to a site may be used to define prospective cohorts, bringing such research more in line with other epidemiological study designs. For example, if a medicine safety communication intervention is targeted to a high-risk subset of patients (say, women of reproductive age who are actively seeking to become pregnant and should avoid a suspected teratogenic medicine), then individuals with the underlying disease condition who meet the high-risk criteria could be identified in social media from post histories and metadata. This subset of patients could be enrolled in a prospective cohort to evaluate message penetration (say, by seeing if these individuals repost warning materials generated from the information campaign).
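The cohort definition described above can be illustrated with a short sketch. It is not taken from any cited study: the record fields, the age range of 15 to 49 years, and the example accounts are all hypothetical, and in practice such fields would have to be inferred from post histories and profile metadata with considerable uncertainty.

```python
from dataclasses import dataclass


@dataclass
class AccountRecord:
    """Hypothetical summary of one social media account (all fields assumed)."""
    user_id: str
    gender: str                    # as stated in the profile; may be missing or wrong
    age: int                       # as stated or inferred; an assumption of this sketch
    mentions_condition: bool       # post history mentions the underlying disease
    mentions_pregnancy_plan: bool  # post history mentions trying to conceive


def meets_high_risk_criteria(record, min_age=15, max_age=49):
    """Return True if the account matches the illustrative high-risk definition."""
    return (
        record.gender == "female"
        and min_age <= record.age <= max_age
        and record.mentions_condition
        and record.mentions_pregnancy_plan
    )


def select_cohort(records):
    """Identify account IDs eligible for the prospective cohort."""
    return [r.user_id for r in records if meets_high_risk_criteria(r)]


def message_penetration(cohort_ids, reposting_ids):
    """Share of cohort members who reposted campaign warning material."""
    cohort = set(cohort_ids)
    if not cohort:
        return 0.0
    return len(cohort & set(reposting_ids)) / len(cohort)


# Made-up example accounts.
records = [
    AccountRecord("u1", "female", 29, True, True),
    AccountRecord("u2", "male", 35, True, False),
    AccountRecord("u3", "female", 62, True, True),
    AccountRecord("u4", "female", 33, True, True),
]
cohort = select_cohort(records)
penetration = message_penetration(cohort, reposting_ids={"u4", "u9"})
```

Reposting is only one possible proxy for message penetration; a real study would need to define exposure and outcome more carefully.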

2.3 Social Media Listening

Early initiators (Knezevic et al. 2011; Bian et al. 2012; Wu et al. 2013; Chary et al. 2013; Abou Taam et al. 2014) presented technical modalities when social media surfaced as an untapped data source for pharmacovigilance. The general approach to social media listening remains the same, even as new tools are developed:

  • First, data are generated by users of a social media site, usually a general-purpose social network or a disease-specific patient forum.

  • Second, with permission from site administrators, unformatted text and metadata on user characteristics are transferred to servers held by the analyst.

  • Third, text is standardised and formatted for machine processing, including removal of verbatim multiplicate copies (e.g. reposts or forwards) (Sharpe 2014), perhaps with steps to preserve anonymity of social media users.

  • Then, an automated or semi-automated process is conducted to isolate the name of the medicine and the description of the suspected adverse reaction or another medicine-related problem, often with the use of purpose-built or existing publicly available medical semantic language tools. Machine learning tools are usually required to separate the indication for using the medicinal product from the suspected adverse reaction, and to remove spam, advertisements, etc.

  • A further step of manual review is often executed, with vastly different amounts of human effort involved. The most intensive individual case reviews are conducted by pharmacovigilance experts, and more commonly cursory review is completed by entry-level analysts.

  • Finally, quantitative descriptive statistics are generated through summarisation, including comparisons to traditional sources of pharmacovigilance data, leading either to a publication for disseminating the evidence or to support internal decision-making, such as for risk management at a pharmaceutical company.
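The automated steps in the list above (standardisation, removal of verbatim copies, isolation of medicine and event terms, and summarisation) can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the pipeline of any cited group: the two lexicons and the example posts are invented, and simple dictionary matching cannot separate an indication from a suspected adverse reaction, which is why the text notes that machine learning tools and manual review are usually required.

```python
import re
from collections import Counter

# Hypothetical lexicons; real projects use much larger, curated dictionaries.
MEDICINES = {"ibuprofen", "metformin"}
EVENT_TERMS = {            # colloquial phrase -> standardised concept (assumed)
    "headache": "Headache",
    "nausea": "Nausea",
    "threw up": "Vomiting",
}


def normalise(text):
    """Lower-case, drop URLs and punctuation, and collapse whitespace."""
    text = text.lower()
    text = re.sub(r"https?://\S+", " ", text)
    text = re.sub(r"[^a-z0-9\s]", " ", text)
    return re.sub(r"\s+", " ", text).strip()


def deduplicate(posts):
    """Keep the first copy of posts that are identical after normalisation."""
    seen, unique = set(), []
    for post in posts:
        key = normalise(post)
        if key not in seen:
            seen.add(key)
            unique.append(post)
    return unique


def extract_pairs(post):
    """Return (medicine, event concept) pairs mentioned together in one post."""
    clean = normalise(post)
    medicines = sorted(MEDICINES & set(clean.split()))
    events = sorted(
        {concept for phrase, concept in EVENT_TERMS.items()
         if re.search(rf"\b{re.escape(phrase)}\b", clean)}
    )
    return [(m, e) for m in medicines for e in events]


def summarise(posts):
    """Count medicine-event pairs across de-duplicated posts."""
    counts = Counter()
    for post in deduplicate(posts):
        counts.update(extract_pairs(post))
    return counts


# Made-up posts; the second is a repost of the first.
posts = [
    "Took ibuprofen and threw up all night!!",
    "took ibuprofen and threw up all night",
    "Metformin gives me nausea and a headache http://example.com",
]
pair_counts = summarise(posts)
```

Counts produced this way are descriptive only; co-mention in a post says nothing about causality.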

Social media listening to patient and other relevant communities can be performed manually or through automated tools that filter and/or classify information acquired from social media. It is most commonly performed through a mixed-methods process of automated tools coupled with manual review or curation (Tufts Center for the Study of Drug Development 2014). Automated data processes typically employ normalisation (i.e. organising data so that there is no redundancy, and ensuring related items are stored together), text-matching, and natural language processing techniques to collect and filter data, enabling researchers to amass a larger, more complete database (Sharpe 2014). Best analytical practices will likely require a hybrid approach leveraging automated and manual processes to contextualise the data. Manual work may be needed to develop taxonomies for translating colloquial phrases from social media into standardised medicine and medical condition concepts. Human curation is crucial for validating and improving outputs from machine learning tools for data classification. In essence, machine learning tools are excellent at consistently replicating tasks that humans perform well. On the other hand, machine learning stumbles on tasks where discretion is involved, such as when humans disagree on classifications, highlighting the importance of human curation.

There are specific challenges with using data from social media listening in pharmacovigilance that have been well addressed in the scientific literature: determining which posts deserve manual review (Comfort et al. 2018; Alvaro et al. 2015), vernacular patient language (i.e. the language commonly spoken in the respective region as mother tongue) (Sharpe 2014; Jiang et al. 2018a; Emadzadeh et al. 2018; Cocos et al. 2017; Carbonell et al. 2015), misspellings of medicine and disease names (Bian et al. 2012; Carbonell et al. 2015), drawbacks of manually annotating a training corpus (Jiang et al. 2018b; Gupta et al. 2018; Liu et al. 2018; Nikfarjam et al. 2015), and separating side effects from indications or benefits within a post (Liu and Wang 2018; Abdellaoui et al. 2017; Eshleman and Singh 2016; Liu et al. 2016; Sarker et al. 2016; Segura-Bedmar et al. 2015). Other issues being addressed by creative computer science include: dealing with constantly evolving internet slang and visual elements of text (e.g. emoticons, emoji), geolocation of social media posts, maintenance costs of complex dynamic visualisation displays of real-time data, burn-out from the demands of human curation, purposefully misleading information disseminated by malicious actors using automated methods (e.g. bots), the ability to perform retrospective analyses on historical data, and the ability to remove personally identifiable information (PII) (Tufts Center for the Study of Drug Development 2014).
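Two of the challenges listed above, misspelt medicine names and the removal of PII, lend themselves to a brief illustration. The sketch below uses approximate string matching from the Python standard library and two simple regular expressions. It is an assumption-laden example rather than the method of any cited paper: the medicine list, the similarity cutoff of 0.8, and the placeholder tokens are hypothetical, and real PII removal requires far broader coverage than e-mail addresses and user handles.

```python
import difflib
import re

# Hypothetical reference list of medicine names.
KNOWN_MEDICINES = ["ibuprofen", "metformin", "atorvastatin"]


def correct_medicine_name(token, cutoff=0.8):
    """Map a possibly misspelt token to the closest known medicine name.

    Returns None when no name is similar enough; the cutoff is an assumed
    value that would need tuning against annotated data.
    """
    matches = difflib.get_close_matches(
        token.lower(), KNOWN_MEDICINES, n=1, cutoff=cutoff
    )
    return matches[0] if matches else None


# Simplified patterns; they will miss many real-world forms of PII.
EMAIL_PATTERN = re.compile(r"[\w.+-]+@[\w-]+\.[\w.-]+")
HANDLE_PATTERN = re.compile(r"(?<!\w)@\w+")


def redact(text):
    """Replace e-mail addresses and user handles with placeholder tokens."""
    text = EMAIL_PATTERN.sub("[EMAIL]", text)
    return HANDLE_PATTERN.sub("[USER]", text)


corrected = correct_medicine_name("ibuprofin")
unknown = correct_medicine_name("aspirin")
redacted = redact("@jane_doe mail me at jane@example.com")
```

A lower cutoff catches more misspellings but also raises the risk of mapping one medicine onto another with a similar name, so outputs of this kind still need human review.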

Social media listening can be used for a number of research purposes, including understanding aspects of medicines use and risks, or simply understanding what kind of information patients are asking for. It can also be used to understand audiences of risk communication, their characteristics, communication needs, and preferences more comprehensively for communication planning. Following a communication intervention, social media listening can be used to evaluate its impact.

2.4 Understanding Aspects of Medicines Use and Risks

Social media listening, or monitoring, involves two-way communication, where organisations engage in disseminating messages and also in listening to populations. For pharmacovigilance, insights may be obtained to serve risk assessment and provide for the contextualisation of risk—for example, what it means to patients—in communication materials.

More specifically, healthcare professionals generally underutilise voluntary spontaneous reporting systems for adverse reactions to medicines because time constraints leave them little opportunity to submit reports. Patients and informal caregivers may be unaware of the importance of reporting adverse reactions or of the mechanisms for doing so. Additionally, some national authorities may be wary of becoming inundated with reports of minor side effects, as it could distract them from paying attention to more serious problems. Further limitations of spontaneous reporting, regardless of whether it is voluntary or mandatory, include significant underreporting of events, data of insufficient quality or completeness for clinical evaluation, a lack of geographic diversity (most reports are from the US and Europe), persistent reporting of known adverse reactions, duplicate reports, and unspecified causal links (Sarker et al. 2015). Spontaneous reporting has been described as efficient for rare and very serious events. However, the sizeable limitations leave information gaps among regulatory agencies, healthcare professionals, stakeholders, and patients. While social media cannot fill all gaps and overcome all problems, there may be certain areas in which social media content can complement what is collected via traditional systems.

Two case studies (see Figs. 11.1 and 11.2) provide a methodological introduction and exemplify how social media listening can support understanding aspects of medicines use and risks. These examples demonstrate that social media data can provide the context of real world use of medicines, help identify safety concerns and risk factors, and offer additional information not typically captured by existing reporting systems, such as benefits or lack of efficacy. These two case studies provide interesting parallels and contrasts. Case study 1 (see Fig. 11.1) was conducted using Facebook and Twitter data by a large pharmaceutical company with considerable reliance on manual review and an annotated training corpus. Case study 2 (see Fig. 11.2) comes from an academic group that used consumer-generated product reviews from the Amazon online marketplace in a highly automated manner. Both approaches revealed new insights into the safety of the substances and patient perceptions of them. A third case study (see Fig. 11.3) describes how online news and social media could be used to understand infectious disease outbreaks and support safety surveillance of anti-infectives, as well as support patients in making healthy choices.

Fig. 11.1 Case study 1 on social media listening for routine post-authorisation safety surveillance of medicines (Powell et al. 2015)

Fig. 11.2 Case study 2 on social media listening by analysing consumer reviews on an online marketplace for identifying potentially unsafe nutritional supplements (Sullivan et al. 2016)

Fig. 11.3 Case study 3 on surveillance of disease outbreaks and safety of anti-infectives through social listening based on news media and social media monitoring

2.5 Understanding Audiences of Risk Communication and Their Information Needs

Since various social media platforms are used by large proportions of the general population, they can provide stakeholders with access to more diverse and comprehensive patient cohorts than those used in traditional studies (Rothman et al. 2015). Integration of traditional data sources with alternatives such as social media, partnered with rapid buy-in from key stakeholders, may allow regulators, the pharmaceutical industry, academia, and healthcare professionals to better understand the patient communities they serve. This in turn enables patients’ first-hand experiences to improve the care they receive (Smart Patients, Inc 2015). To leverage this effectively, methods are needed to filter out noise and distil insights from patients (Larkin 2014). A 2015 analysis of vaccine sentiments among Twitter users in the US illustrates how social media listening can be applied to better understand audiences and to develop strategies and communication interventions that address their concerns. The analysis showed which themes and terms were more prevalent in positive, neutral, and negative sentiment networks. This approach could guide which messages and words to use for reaching the respective populations and improving their vaccine confidence. Methodologically, the study was performed through coding, creation of semantic networks, and their analysis (Kang et al. 2017).
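To give a sense of what a semantic network by sentiment might look like computationally, the sketch below counts how often pairs of terms occur together in posts that have already been coded as positive, neutral, or negative. It is a simplified illustration and not the procedure used by Kang et al. (2017): the stop-word list, the whitespace tokenisation, and the three coded example posts are assumptions made for this example only.

```python
from collections import Counter, defaultdict
from itertools import combinations

# Minimal, assumed stop-word list; real analyses use fuller lists.
STOPWORDS = {"a", "an", "and", "the", "to", "of", "our", "my"}


def cooccurrence_by_sentiment(coded_posts):
    """Build one term co-occurrence network per sentiment label.

    coded_posts is an iterable of (text, sentiment) tuples, where the
    sentiment label is assumed to come from prior manual coding.
    Each network maps an alphabetically ordered term pair to the number
    of posts in which both terms appear.
    """
    networks = defaultdict(Counter)
    for text, sentiment in coded_posts:
        terms = sorted({w for w in text.lower().split() if w not in STOPWORDS})
        networks[sentiment].update(combinations(terms, 2))
    return networks


# Made-up, already coded posts.
coded_posts = [
    ("vaccines protect children", "positive"),
    ("vaccines protect our communities", "positive"),
    ("vaccines cause harm", "negative"),
]
networks = cooccurrence_by_sentiment(coded_posts)
strongest_positive_edge = networks["positive"].most_common(1)[0]
```

The heaviest edges in each network point to the term pairs that dominate a given sentiment group, which is the kind of information that could inform word choice in a communication intervention.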

2.6 Crowdsourcing of Information

Social media cannot fill all gaps and overcome all problems seen with traditional data sources used for pharmacovigilance. Nonetheless, there may be certain areas in which social media content can complement what traditional systems collect, such as data directly from patients. Traditional systems for spontaneous reporting of suspected adverse reactions are burdensome and time-consuming for healthcare professionals and patients, for whom reporting is mostly voluntary. Completing a report through traditional channels can take a patient up to an hour. As a result, only 2% of reports received by the US Food and Drug Administration (US FDA) are submitted by patients directly, i.e. not by or via a healthcare professional. Online and mobile tools have been developed to address barriers to reporting, streamline the reporting process, and make it more user-friendly. Additional tools have been developed to perform digital disease detection in the form of online surveillance and social media listening, allowing for a more complete, accurate picture of medicinal product and adverse event pairs (Bahk et al. 2015). A hallmark of these tools is their ability to support a concept known as crowdsourcing.

Crowdsourcing tools enable stakeholders to directly engage with a patient community. Patient community outreach can be successful if conducted through social media platforms where community groups may pre-exist. These communities may look different depending on the networking site. For example, Facebook hosts pages or member groups that can be set up by any member to provide a space for dedicated discussion according to a patient population, interest group, or disease area (Bahk et al. 2015). A Twitter-based community would be organised by hashtags that identify different patient populations or concepts and aggregate posts, much like a folder system, so that they can be easily retrieved through a simple query (Grajaless et al. 2014). For example, Twitter users may use the hashtag #teamnosleep to self-identify as insomniacs. Social media patient communities typically openly discuss experiences with their disease(s) and/or treatment(s) that include conversations about adverse events and benefits of medicines, news in scientific journals, and official communications, such as regulatory guidelines, label changes, and product recalls. Organisations can access these group members by contacting the group administrator(s) for permission to engage with members and discuss the benefits of utilising an online crowdsourcing tool (Bahk et al. 2015). Administrators may encourage the group to participate in the crowdsourcing. This could include utilising social media to share information about potential adverse reactions of a medicine among a specific patient group (Bahk et al. 2015). This method of patient engagement is illustrated in the motivation-incentive-activation-behaviour (MIAB) concept. In the MIAB concept, motivation is the reason for patient interest, and incentive is what leads the patient to act. Activation is the set of factors that lead to the patient’s actual participation, and behaviour is the activity of interest and outcome, in this case submitting a suspected adverse reaction report (Bahk et al. 2015). It has been shown that patients are more likely to engage in activities that reduce their own burden or that provide some benefit in exchange for an equal level of effort (Bahk et al. 2015). A proven history of patient buy-in to social listening and to other digital tools for pharmacovigilance may encourage patients to participate in crowdsourcing activities. This can be seen as a more active form of two-way communication, which has implications for traditional communication efforts as well as offering opportunities.

3 Utility of Applied Methods for Researching Medicinal Product Risk Communication

3.1 Opportunities of Social Media Research

“Fast”, “cost-effective”, “large-scale”, “transparent”, “patient-generated”, “real-time” and “general usefulness” are all phrases commonly used to describe the strengths of social media listening and crowdsourcing.

Social media listening is often available prospectively and in real time, allowing stakeholders to quickly grasp disease prevalence and other epidemiological insights, the impact of a medical intervention (such as a medicine), health topics, and questions of interest to medicine users. Pharmaceutical companies often use such listening alongside launches of new medicines or post-authorisation studies to gather information on how the patient population is responding to treatments. It has also been used to determine where to host a study or launch a new product or intervention due to previously unknown medical need and patient demand (Larkin 2014). Just as importantly, medicinal product risk communication may benefit from social media mining, in monitoring and evaluations of communication interventions, or even in the planning phase of communication. Reliance on online health forums for medical advice could be risky to patients; they could be misinformed by each other, improperly self-diagnose, or inappropriately use a medication. Hence, it could be beneficial to capture complex topics and confusing messages. These insights can be used to inform healthcare professional communications to patients, for example. Social media listening enables capturing a large amount of unsolicited, patient-generated data that are available publicly or with permission. End users are provided with the resulting data either in verbatim form or in aggregate, via datasets, summary reports, or visualisations. Since the population of social media users is pre-existing, this method is thought to be cost-effective for the potential amount of data and information gathered from these sources (O’Connor et al. 2014). To collect, clean, analyse, and visualise the same volume of data from other sources would take years, and the timely actionable insight provided would be limited due to the time required to disseminate results (Donahue 2012).

As patients become more knowledgeable about their medical conditions, their articulation of first-hand experiences and perspectives contributes to a valuable data source that can improve the care they receive (Smart Patients, Inc 2015). The widespread use of social media platforms provides communication researchers and practitioners with the ability to understand and design communication interventions for populations that would otherwise be hard-to-reach audiences. New technology and the rapid uptake of social media should allow these researchers and practitioners to respond better to the patient communities they serve.

Many patients report a lack of trust in healthcare professionals, preferring to share information with fellow patients and caregivers (Peacock 2015). Since some diseases, specifically rare diseases or those with social stigma, are associated with an isolating experience that can span several geographical areas, many individuals look to social media to communicate with their peers (Peacock 2015). These patient forums offer anonymity and privacy that may result in patients providing unfiltered data that are more readily available than data from traditional sources. This content can be incredibly beneficial to organisations leveraging social media listening as a research tool: these conversations are unsolicited, and often unfiltered and unabashed. Online discussions among patients about medicines often extend to wider aspects of use, such as off-label use (i.e. use with a medical purpose not in accordance with the terms of the marketing authorisation), as well as issues with product quality, formulation, handling and disposal, sensitive or stigmatised topics, and reluctance to adhere to treatment due to troublesome adverse reactions.

Crowdsourcing offers the opportunity to specifically solicit information on medicines’ use behaviours, risk knowledge and perceptions, communication needs, and preferences as well as feedback on communication events.

Finally, information from patient populations may reflect preconceived notions of shared beliefs due to community mentality, which should be considered in research projects. A carefully planned social listening campaign that accounts for the nuances of social media data and potential biases can glean insights from a diverse range of global patient populations.

3.2 Limitations of Social Media Research

While social media data may be readily available in unprecedented volumes, these data represent unsolicited responses, often making it challenging to understand their quantity or quality. Once personal identifiers are removed from social media data, it is impossible (and ethically challenging) to verify a reported adverse event by following up with a social media user. Additionally, it is difficult to validate the information until data from traditional sources are available for a comparison analysis. Despite the exuberance generated by the potential of social media mining, in practice there has been a vigorous and necessary debate about the practical application of social media mining for pharmacovigilance. In fact, multiple recent, sophisticated, large-scale efforts and systematic reviews have concluded that routine use of social media for pharmacovigilance underperforms established pharmacovigilance data collection systems, including industry-dominated traditional reports of suspected adverse reactions submitted to national authorities (Rees et al. 2018; Convertino et al. 2018; Caster et al. 2018; Kheloufi et al. 2017; Pierce et al. 2017). Others have acknowledged these limitations and noted that social media may fill niche knowledge gaps in medicine safety or may require the use of more sophisticated computing tools (Lardon et al. 2018; Bousquet et al. 2018; Anderson et al. 2017). In most cases of serious adverse reactions identified by regulatory authorities, vigilant physician reporters were the most consistent and earliest source of information on new safety signals, compared to social media.

The authors of the largest evaluation to date (Caster et al. 2018) identified key limitations. In their evaluation, they analysed more than two million Twitter, Facebook, and patient forum posts, using an automated Bayesian classifier and purpose-built patient vernacular dictionary to assign risk scores to posts. Two reference datasets of known positive and negative controls were used for comparison. In addition, a major global database of adverse reactions (i.e. VigiBase) was used in head-to-head comparisons with social media. The analysis calculated traditional pharmacovigilance reporting disproportionality ratios for each medicine in social media and compared them against controls. The results were extensive and decisive: “This study investigated the potential usefulness of social media as a broad-based stand-alone data source for statistical signal detection in pharmacovigilance. Our results provide very little evidence in favour of social media in this respect: in neither of the two complementary reference sets, containing validated safety signals and label changes, respectively, did standard disproportionality analysis yield any predictive ability in a large dataset of combined Facebook and Twitter posts… [M]anual assessment of Facebook and Twitter posts underlying 25 early signals of disproportionality showed that only 40% of posts contained the correct drug and the correct event as an adverse experience, and for only three of those 25 signals did the posts strengthen the belief in a causal association” (Caster et al. 2018). The authors offered some possible explanations. First, some medications may have very little discussion in social media channels. Second, identifying rare events in social media may be difficult if the specific colloquial terms are not detected, and the underlying algorithm to detect adverse reactions may have limited detection ability for the types of very rare events of interest to safety reviewers. 
Third, there is possible bias when comparing social media results to established reference or validation datasets of known signals. Relatively few reference datasets are in public scientific literature, and the nature of the comparison can vary greatly. Fourth, using statistical aberration detection methods originally optimised for traditional pharmacovigilance systems may not be appropriate for social media-based applications (Caster et al. 2018).
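To make the notion of a reporting disproportionality ratio concrete, the following minimal sketch computes one commonly used measure, the proportional reporting ratio (PRR), from a two-by-two table of counts. It is offered as an illustration only: the text above does not state which disproportionality measure the evaluation used, and the function name, the 95% interval based on a normal approximation of the log ratio, and all counts are assumptions made for this example rather than study data.

```python
import math


def proportional_reporting_ratio(a, b, c, d):
    """Return (PRR, lower, upper) for a 2x2 table of post or report counts.

    a: medicine of interest, event of interest
    b: medicine of interest, all other events
    c: all other medicines, event of interest
    d: all other medicines, all other events

    The interval is an approximate 95% confidence interval computed on
    the log scale. The function name and interface are hypothetical.
    """
    if a <= 0 or c <= 0 or b < 0 or d < 0:
        raise ValueError("a and c must be positive; b and d must not be negative")

    # Share of the medicine's posts mentioning the event, divided by the
    # same share among posts about all other medicines.
    prr = (a / (a + b)) / (c / (c + d))

    # Standard error of ln(PRR) under the usual large-sample approximation.
    se = math.sqrt(1 / a - 1 / (a + b) + 1 / c - 1 / (c + d))
    lower = math.exp(math.log(prr) - 1.96 * se)
    upper = math.exp(math.log(prr) + 1.96 * se)
    return prr, lower, upper


# Made-up counts for illustration only (not taken from any study).
prr, lower, upper = proportional_reporting_ratio(a=20, b=980, c=100, d=98900)
print(f"PRR = {prr:.1f}")
```

A ratio well above 1, with a lower interval bound also above 1, is conventionally read as disproportionate reporting that merits review. As the fourth explanation above cautions, thresholds tuned for traditional spontaneous reporting systems may not transfer to social media posts.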

As with many other data sources, social media data used in medicinal product risk communication research have inherent biases that must be considered when interpreting results. Biases specific to social media data result from each social media network having its own user demographic profile, making it difficult to generalise findings to a larger population of patients who may not fit this profile. This could, for example, limit the availability of useful data on medicines most commonly used by specific populations, like older or paediatric patients. In addition, certain brands or types of medicinal products may be represented differentially in social media; thus, an organisation ought to consider determining how often products are discussed online prior to launching a social media research project. Another dimension of bias in using unstructured text is literacy bias. Individuals with limited written language skills will only be represented in the data if someone else posts about their experiences for them. The use of emoji and voice-to-text tools may be able to mitigate some of this bias.

For some products, such as medicines against the human immunodeficiency virus (HIV) and acquired immune deficiency syndrome (AIDS), or hepatitis B, individuals may not be willing to communicate publicly about their treatment experiences due to stigmas associated with their diseases and concerns about being identifiable. This could result in substantial self-selection bias or incomplete information sharing. Honest conversations are more likely to be found on specific patient forums than on public social media sites. Moreover, if patients suspect that they are being monitored, they may go elsewhere to post comments about their disease or treatment regimen, posing a risk to social listening projects.

There is another issue to consider. The need to improve health outcomes, increase the safety and safe use of medicines, and manage risk is a major driver behind collecting patient data. Communication practitioners and researchers should note that, as more data have been collected, concerns have grown that extend beyond patient privacy. Notably, one of the biggest lessons learnt from using social media for pharmacovigilance is that patients will talk. While this may seem to many like a treasure trove of information, there is major concern that patients will become unblinded when social media is used alongside clinical trials (Lipset 2014). This occurred during a 2009 clinical trial when a patient discovered that she had been placed on the study product (as opposed to placebo or comparator) (Lipset 2014). This realisation led to more individuals seeking online patient communities to share symptoms and compare notes about pill formulations and taste to try to determine which treatment they were receiving. Many patients do not understand the consequences of these interactions, which could end a clinical trial early, and delay or even prevent a new treatment from becoming available to other afflicted patients. This underscores the importance of clinical trial subjects understanding that their social media discussions may compromise randomisation and pose an inherent threat to the validity of clinical trials. Such discussions among clinical trial participants should be discouraged or sequestered while the clinical trial is underway. Social media monitoring for such discussions could therefore be useful to proactively understand this threat to clinical trial validity.

3.3 Ethical and Legal Aspects

Regulatory guidelines and best practices are slowly emerging regarding when and which organisations have the legal responsibility for mining patient narratives through social media listening (Lengsavath et al. 2017). The regulatory dimensions are addressed as part of the WEB-RADR project (web-radr.eu) (Ghosh and Lewis 2015) and by a few authors (Sloane et al. 2015; Lengsavath et al. 2017; Naik et al. 2015). Despite the ambiguity and evolving regulatory environment, major pharmaceutical companies have executed social media listening projects in recent years (Powell et al. 2015; Comfort et al. 2018; Caster et al. 2018). Currently, the most evident disadvantage to using social media for research relating to medicinal product safety and communication is the lack of regulatory guidance and best practices regarding the use of social media data.

Social media listening also poses ethical and privacy concerns, especially within private online communities (Stergiopoulos 2014). To meet their moral obligations, many organisations will only listen in and/or engage with patients on public social media platforms once they have announced their affiliation and presence to the patient(s) (Stergiopoulos 2014). In addition, ethical and privacy regulations differ across geographic regions. Hence, organisations that wish to engage in social media listening must be cognisant of these differences to avoid or address privacy breaches in a timely manner (Stergiopoulos 2014). Due to the speed at which information travels on social media, a researcher may benefit from considering in advance the issues that may arise from inappropriately using social media (Stergiopoulos 2014).

Pharmaceutical manufacturers must also consider, as part of their protocol, how to conduct social media listening activities in a way that addresses liability and compliance and meets regulatory requirements. Legally required reporting of suspected adverse reactions necessitates patient information. This poses a challenge in social media listening, as there is limited ability to confirm that individuals are using their true identity when posting on social media sites, or to approach them if they are obviously using an alias. When monitoring social media alongside clinical trials, this challenge becomes more complicated, as there is often no way of confirming a patient’s participation in a specific clinical trial (Thompson 2014). Furthermore, even if a person can be confirmed as a trial participant, there would be no way of confirming to which arm of the trial that participant is assigned, which treatment(s) that participant is receiving, or whether any adverse event reported in social media has already been recorded and dealt with appropriately (Barry 2014). It is therefore highly recommended that legal and compliance departments review the use of any social media for recruitment or use alongside a clinical trial, prior to the start of social media listening activities (Dizon et al. 2012). This practice could also be subject to institutional review board (IRB) approval and require compliance with national privacy laws (Dizon et al. 2012). By contrast, the rules and requirements for surveillance campaigns and observational studies are often less stringent. In either case, it is important to determine the feasibility of using social media for a specific project prior to committing resources.

When considering the use of a third-party vendor to acquire social media data, an organisation should ensure that the vendor meets all compatibility and accountability standards required for the research project and provides all needed software services. The regulatory and societal expectations of privacy with social media data are rapidly changing and should be considered in earnest to maintain the credibility and viability of the research effort.

More specifically, Appendix 11.1 provides an introduction to the data protection regulation applicable in the European Union (EU) and derives some globally applicable principles.

4 Outlook: Relevance, Improvements and Future Potential

As a field, we are at a crossroads in pharmacovigilance. The potential of social media is hard to deny, but the execution in relation to the collection of adverse reactions has borne little fruit (Rees et al. 2018; Convertino et al. 2018; Caster et al. 2018). Yet, many researchers regularly derive new insights from monitoring social media content (Lardon et al. 2018; Kurzinger et al. 2018; Patel et al. 2018; Keller et al. 2018; Chen et al. 2018). One research article’s title summarises this succinctly: “Descriptions of adverse drug reactions are less informative in forums than in the French pharmacovigilance database but provide more unexpected reactions.” (Karapetiantz et al. 2018). This may very well be the key insight from the past decade of efforts to understand the role of social media for collecting adverse reaction data, given that any surveillance system is inherently designed to identify what is expected, as broadly defined within its scope of outcomes. The challenge for the future will be to narrow the scope of inquiry and to focus on social media mining applications that are most likely to generate new knowledge; our focus to date has been on information more generally. When considering an assessment of a new safety concern with a medicine, evidence from animal studies, laboratory findings, clinical trials, pharmacoepidemiological studies, and treatment experience all come into play. Machines do not appear to be on the cusp of replacing this complex human assessment in the immediate future; perhaps harvesting new knowledge from the exuberant promise of social media will require the development of automated multi-factorial safety reviewing.

A further objective of social media research for pharmacovigilance purposes is to capture information about patients and medicinal products through a patient-centric lens. This is achieved by turning to social media to amplify the patient voice, to understand patients’ knowledge, attitudes, and behaviours (that is, to understand them as audiences of our communication), and to collect data that help in the evidence-based planning and evaluation of communication interventions that support informed therapeutic choice and safe use of medicines. Social media is a communication channel, which is an important research topic in itself. Such research may determine who uses social media and how, with a view to informing communication strategies that incorporate social media not only for listening but also for messaging. Beyond pharmacovigilance per se, social media data present the tantalising possibility of providing insight into how physicians communicate with each other (Albarqouni et al. 2019; Graff et al. 2018; Falzon et al. 2016), topics that patients want to know more about (Charlie et al. 2018), and how the public reacts to health news in real time (Adams and Schiffers 2017). These broader dimensions of medicines safety and communication have not yet been adequately evaluated in social media.

In conclusion, social media listening and crowdsourcing of information provide a timely and insightful complement to traditional methods for medicinal product risk communication research, and are applicable globally. Given people’s increasing use of the internet and social media, and patients’ views on the prospects of its utility for data gathering in support of patient-centred care (see Chap. 16), the emerging discipline of social media research is becoming an essential part of a multidisciplinary and multilayered approach to medicinal product risk communication research (see Chap. 1). As a source for data on real-time patient discussions, social media can be used to understand aspects of use of medicines in healthcare, information needs and adverse reactions as characterised by patients, as well as to monitor and improve risk communication efforts. Online discussions among patients about medicines often extend to wider aspects of use, such as off-label use, issues with product quality, formulation, handling and disposal, and even reluctance to adhere to treatment regimens due to adverse reactions experienced by the patient. Social media can also be used to identify specific patient groups for soliciting perspectives on certain safety concerns and risk communication needs. Lastly, as social media listening and crowdsourcing of information gain traction as viable sources of insight, it will become necessary to acknowledge their myriad challenges, in particular inherent noise, incomplete data when follow-up is impossible, privacy and patient protection, and lack of regulatory guidance. More coordinated research among academics, regulators, the pharmaceutical industry, and subject matter experts is needed to develop best practice guidance.
Practical solutions that adequately address these social media research challenges without impacting the usefulness of the data for pharmacovigilance, including improving communication about risks and safe use of medicines, will be of utmost importance.

Conclusions

  • Social media research can provide a timely and insightful complement to traditional data sources for pharmacovigilance as well as medicinal product risk communication research, in particular for planning and evaluating communication interventions.

  • As a source for real-time patient discussions, social media listening can facilitate understanding aspects of use of medicines in healthcare, adverse reactions as characterised by patients, audiences and their information needs as well as help monitor and improve risk communication efforts. Online discussions among patients about medicines often extend to wider aspects of use, such as off-label use, as well as issues with product quality, formulation, handling and disposal, sensitive or stigmatised topics, and reluctance to adhere to treatment due to adverse reactions.

  • Social media can also be used to identify specific patient groups for soliciting perspectives on certain safety concerns and risk communication needs, an approach called crowdsourcing of information.

  • Social media is an evolving global communication channel. Understanding who uses these media and how is important for informing communication strategies, for both listening and tailoring messaging.

  • Social media research needs to consider specific potential for bias as well as ethical and legal concerns. Therefore, more collaboration is needed among researchers, regulators, the pharmaceutical industry, and subject matter experts. This collaboration is critical to develop best practice guidance and practical solutions that adequately address these challenges without impacting the usefulness of the data for pharmacovigilance and communication about risks and safe use of medicines.