Background

Industrialization, together with changes in diet and lifestyle, has dramatically increased the prevalence of diseases of lifestyle. Currently, seven of the top 10 causes of death globally are non-communicable diseases, accounting for 74% of all deaths, with diabetes the ninth leading cause of death [1].

The forefront of diabetes medicine includes medicines targeting insulin production [2,3,4,5,6,7,8,9,10], such as semaglutide, a human incretin glucagon-like peptide-1 (GLP-1) receptor agonist, injected subcutaneously to increase insulin production [11], for weight loss in diabetic patients with obesity. Users report suppressed appetite and hunger [12], and marked weight loss [13], which reduces after a year of treatment [14], and reverses when treatment stops [15, 16]. Novo Nordisk’s semaglutide (administered weekly) [3] results in greater weight loss with long-term use than its predecessor liraglutide (administered daily) [17].

After clinical trials [2, 11,12,13, 15, 16, 18,19,20,21,22,23], the U.S. Food & Drug Administration approved two injectable semaglutide formulations (Ozempic and Wegovy) only “as an adjunct to a reduced calorie diet and increased physical activity for chronic weight management in [obese or overweight] adult patients” [24,25,26]. Oral semaglutide (Rybelsus) was approved later [27]. Semaglutide received regulatory approval in the USA, the EU, and Australia, amongst others, in the period 2017–2019 [28].

One of the most common comorbid conditions alongside diabetes is obesity—nine of every ten adults diagnosed with type 2 diabetes mellitus (T2DM) are overweight or obese [29]. Conversely, obesity is a key risk factor for developing T2DM [30,31,32,33]. The strong weight-reducing property of GLP-1 receptor agonists makes them the preferred medical treatment for diabetic patients with obesity [26, 34].

Social media has impacted society profoundly in only two decades (e.g., Facebook was founded in 2004, Reddit in 2005, Twitter in 2006, and TikTok in 2016). Initially, studies reflected on how social media would change public health [35] but gave little thought to the possible deleterious effects of social media on public health.

Social media influencers are increasingly powerful and marketable, and their ability to influence consumer behavior [36, 37] is well established and worth millions—although this follows a power-law distribution where a few influencers wield the most influence and earn the most income [38, 39].

The overlap between overweight/obesity and diabetes has converged with social media trends, giving rise to a new health challenge where T2DM patients on semaglutide prescription find themselves competing with non-diabetic overweight/obese people using semaglutide off-label for weight loss [40,41,42], which regulators [43], conventional media [44,45,46] and recent academic literature [42, 47,48,49,50] attribute to semaglutide promotion by social media influencers/celebrities. Some examples from cited news pieces include: the Dr. Oz show featured a New York Times article [51] on 15 February 2021 [52]. Kim Kardashian wore Marilyn Monroe’s “Happy Birthday, Mr. President” dress to the Met gala (2 May 2022), with rife (but unconfirmed) internet speculation that she used semaglutide to lose weight to fit the dress. Elon Musk tweeted about semaglutide on 24 April 2022 that “… semaglutide (aka Ozempic/Rybelsus) appears to be effective in appetite control with minor side effects” [53]. Then on 22 October 2022 a tweet thread asked him “Hey, @elonmusk what’s your secret? You look awesome, fit, ripped & healthy. Lifting weights? Eating healthy?” and he replied: “Fasting…. And Wegovy” [54]. And on 23 May 2023 he replied in a weight loss thread that “Semaglutide actually works” [55]. Sometimes, celebrities actively denied using semaglutide, but internet speculation still held that semaglutide was behind their sudden weight loss, as with Adele when she posted a photo of herself on Instagram on 5 May 2020. While most of these examples are of US-based social media influencers, it appears that their influence has stretched across the world [44, 46].

The growing semaglutide interest fed a surge in off-label prescribing, leading to a worldwide shortage of semaglutide, as Novo Nordisk battled to produce semaglutide in enough quantities to meet the surge in demand. Medicines regulators in numerous countries (e.g., the Therapeutic Goods Administration (TGA) in Australia) issued warnings of semaglutide shortages, which the TGA attributed directly to off-label prescription driven by viral TikTok videos “about achieving rapid weight loss with Ozempic” [43, 56].

While conventional channels remain useful for monitoring such trends, the availability of internet data presents epidemiologists with means to more rapidly monitor developing trends—a field now called infodemiology [57, 58]. Although social media surveillance holds promise for infodemiology, it is complicated by social media platforms limiting or changing access to data, the constant flux of the social media landscape, with changing user numbers, and new platforms (e.g., TikTok) appearing and attracting large user bases, and the changing nature of social media content itself, transitioning from primarily text to image and now video—presenting new computing challenges. Some studies examining social media and semaglutide have ventured into this area, with analyses of content from one or more of Reddit, TikTok, YouTube, and X [48, 49, 59,60,61,62,63,64]. A different, but promising, data source for infodemiology is internet search data, e.g., Google Trends (GT) [65, 66]. After email, internet search is the most common internet activity and although internet search data are not as rich as social media data, they do provide important insights into the motivations of search users [67, 68], when used appropriately [69]. While T2DM is a legitimate health concern, weight loss, arguably, has resonances with both aesthetics and health [49]. Many studies have used GT to examine trends in aesthetic surgery [70]. The influence of celebrities on search interest in aesthetic procedures has also been demonstrated [71]. Very few studies have investigated semaglutide interest using Google Trends data [42, 72] and these did not employ appropriate analyses for autocorrelated time series data. Our study aims to provide a panoramic perspective on worldwide GT data related to semaglutide, exploring global trends that should be succeeded by more detailed local investigations.

Methods

To capitalize on Google’s machine learning classifiers which group related user queries into categories (important to overcoming data variations related to misspelling, different languages, and query variations), we used the topic “semaglutide”, expressed by the Google KnowledgeGraph ID: /g/11dyzd5snl [73, 74] (hereafter: Semaglutide).

We used the GT Extended for Health (GTEH) Application Programming Interface (API) and the GT extraction tool [75] for all the GT data extractions described below. First, we retrieved six samples of regional online interest (ROI) per country (worldwide list of region values) from January 2019 to December 2022. Google scales the highest value amongst all regions to 100 and expresses all other region values relative to that. Figure 1 shows the median ROI for all countries (and enlarged for Europe). We selected all countries with a median ROI across samples ≥ 20 (Additional file 1: Table S1) for this study, as we estimated from past experience [76] that these countries would provide acceptable amounts of non-missing data for further analysis.
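The country selection rule described above (median ROI across the six samples ≥ 20) can be illustrated with a minimal sketch. The function name, the country labels, and the ROI values are hypothetical and serve only to show the rule; this is not the extraction tool used in the study.

```python
from statistics import median

def select_countries(roi_samples, threshold=20):
    """Return countries whose median ROI across samples is >= threshold.

    roi_samples maps a country name to its list of ROI values (0-100),
    one value per repeated sample.
    """
    medians = {country: median(values) for country, values in roi_samples.items()}
    return sorted(c for c, m in medians.items() if m >= threshold)

# Hypothetical ROI values for three made-up countries, six samples each.
example = {
    "Country A": [100, 98, 100, 97, 100, 99],
    "Country B": [22, 19, 21, 20, 18, 23],
    "Country C": [5, 7, 6, 4, 8, 6],
}
selected = select_countries(example)
```

Taking the median across samples rather than a single draw makes the selection less sensitive to sample-to-sample variation in the data Google returns.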

Fig. 1

Regional online interest worldwide (a) and with an enlargement of Europe (b)

We extracted and compared the top search queries for each country, for the years 2021, 2022, and 2023 (January to August), which showed that the Semaglutide topic adequately captured information for the other possible search terms (such as “semaglutide”, “Ozempic” or “Wegovy” as terms). Also, as noted, Semaglutide allowed us to capture search interest for the same overall topic in countries where the proportion of English speakers was lower (Additional file 1: Table S1).

We downloaded 30 samples of GTEH daily search probabilities for semaglutide, for each target country, from January 2019 to August 2023. Google provides the GTEH data as unscaled search probabilities (multiplied by $10^7$) to registered API users, unlike search values from the GT website, which are scaled to 100 for the highest value. GTEH data allowed us to compare search values across time and between countries, as the values reflect the probability of a search given the number of users in the time frame in the geographical location (i.e., all the probabilities are comparable relative to the underlying population size and time frame). We aggregated the 30 samples for each country using the median daily value [77]. To visualize the trends over time, we aggregated these data into the median for each week for each country; to examine the effect of specific social media posts, we used the aggregated daily data.
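The two aggregation steps (median across samples per day, then median per week) could be sketched as follows. This is an illustration only: the data layout, the function names, and the use of ISO calendar weeks are assumptions, and the values are invented.

```python
from collections import defaultdict
from datetime import date
from statistics import median

def daily_median(samples):
    """Collapse repeated samples into one median value per day.

    samples is a list of dicts, each mapping a date to a search probability.
    """
    by_day = defaultdict(list)
    for sample in samples:
        for day, value in sample.items():
            by_day[day].append(value)
    return {day: median(values) for day, values in sorted(by_day.items())}

def weekly_median(daily):
    """Collapse a daily series into the median per ISO week (year, week)."""
    by_week = defaultdict(list)
    for day, value in daily.items():
        iso_year, iso_week, _ = day.isocalendar()
        by_week[(iso_year, iso_week)].append(value)
    return {week: median(values) for week, values in sorted(by_week.items())}

# Three invented samples covering two days in one week and one day in the next.
samples = [
    {date(2023, 1, 2): 10.0, date(2023, 1, 3): 14.0, date(2023, 1, 9): 30.0},
    {date(2023, 1, 2): 12.0, date(2023, 1, 3): 16.0, date(2023, 1, 9): 20.0},
    {date(2023, 1, 2): 11.0, date(2023, 1, 3): 18.0, date(2023, 1, 9): 25.0},
]
daily = daily_median(samples)
weekly = weekly_median(daily)
```

The daily series would feed the event-level comparisons, and the weekly series the trend plots.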

We retrieved the headers of all news articles mentioning semaglutide (additionally using the brand names Ozempic, Wegovy, and Rybelsus) contained in the ProQuest news data service (for magazines, newspapers, blogs, podcasts, and websites; query string shown in Additional file 2) and aggregated these by date to create an estimated time series of the amount of news coverage for semaglutide for the study period.

Granger causality test

Granger causality analysis is a statistical technique that evaluates whether historic values of one time series (the “indicator” series) are useful in forecasting future values of a second time series, beyond the predictive capability of past values of that second series alone [78]. It examines predictive relationships rather than determining true causation between the data series. The methodology involves fitting two vector autoregressive (VAR) models and comparing their performance. Given two stationary time series $x_t$ and $y_t$, the first (unrestricted) VAR model regresses $y_t$ on $m$ lags of itself and $m$ lags of $x_t$ (Eq. 1). The second (restricted) VAR model excludes the $x_t$ lags and includes only the $m$ lags of $y_t$ as predictors (Eq. 2).

$$y_t=\alpha_0+\sum_{j=1}^{m}\alpha_j y_{t-j}+\sum_{j=1}^{m}\beta_j x_{t-j}+\epsilon_t$$
(1)
$$y_t=\alpha_0+\sum_{j=1}^{m}\alpha_j y_{t-j}+\epsilon_t$$
(2)

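where

$y_t$ = the predicted time series variable at time $t$.

$\alpha_0$ = the constant.

$\alpha_j$ = coefficient on the $j$th lag of $y_t$.

$y_{t-j}$ = lagged value of $y_t$.

$\beta_j x_{t-j}$ = lagged values of the indicator time series $x_t$, each weighted by its coefficient $\beta_j$.

$m$ = the number of lags of each series (the lag length, denoted $p$ in the lag-selection step).

$\epsilon_t$ = error term at time $t$.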

The null hypothesis is that

$${H}_{0}:{\beta }_{1}={\beta }_{2}=\cdots ={\beta }_{m}=0$$
(3)

An F-test compares the fits of the restricted and unrestricted models. If the unrestricted model with the $x_t$ lags yields a statistically significant improvement according to the F-test, then $x_t$ is said to “Granger cause” $y_t$: the history of $x_t$ contains useful information for predicting $y_t$ above and beyond the history of $y_t$ alone.

$$F=\frac{\left(\frac{\left(RS{S}_{1}-RS{S}_{2}\right)}{{P}_{2}-{P}_{1}}\right)}{\left(\frac{RS{S}_{2}}{n-{P}_{2}}\right)}$$

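where $n$ is the number of data points, RSS is the residual sum of squares, and $P$ is the number of estimated parameters. Subscript 1 denotes the restricted model (2) and subscript 2 the unrestricted model (1), so that $P_2 > P_1$ and $RSS_1 \ge RSS_2$.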
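As an illustration of the test described above, the following sketch fits the restricted and unrestricted regressions by ordinary least squares and computes the F statistic. It assumes NumPy is available; the function names and the simulated series are hypothetical, and this is not the Stata code used in the study.

```python
import numpy as np

def lagged(series, m):
    """Columns series[t-1], ..., series[t-m] for t = m .. len(series)-1."""
    n = len(series)
    return np.column_stack([series[m - j:n - j] for j in range(1, m + 1)])

def rss(X, y):
    """Residual sum of squares from an OLS fit of y on X."""
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ coef
    return float(resid @ resid)

def granger_f(x, y, m):
    """F statistic for H0: the m lags of x add nothing to predicting y."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    target = y[m:]
    const = np.ones((len(target), 1))
    restricted = np.hstack([const, lagged(y, m)])         # Eq. (2)
    unrestricted = np.hstack([restricted, lagged(x, m)])  # Eq. (1)
    rss_r, rss_u = rss(restricted, target), rss(unrestricted, target)
    n = len(target)
    p_r, p_u = restricted.shape[1], unrestricted.shape[1]
    return ((rss_r - rss_u) / (p_u - p_r)) / (rss_u / (n - p_u))

# Simulated example: y follows x with a one-step delay, but not vice versa.
rng = np.random.default_rng(0)
x = rng.normal(size=500)
noise = rng.normal(size=500)
y = np.empty(500)
y[0] = noise[0]
y[1:] = 0.8 * x[:-1] + noise[1:]

f_xy = granger_f(x, y, m=3)  # does x help predict y?
f_yx = granger_f(y, x, m=3)  # does y help predict x?
```

The statistic would then be compared with an F distribution with $m$ and $n - P$ degrees of freedom (where $P$ is the number of parameters in the unrestricted model) to obtain a p value. In this simulated case `f_xy` should be large and `f_yx` close to what the null hypothesis predicts.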

Determining time lag between time series (p)

The vector auto-regression (VAR) model was used to determine the optimal lag length (annotated as p) to accurately capture the time lag of interest between the two time-series for the regression model. To quantitatively determine p for each time series pair, we considered the Akaike Information Criterion (AIC) [79] and Bayesian Information Criterion (BIC) [80]. Based on the AIC and BIC values for each estimated VAR model, we set the maximum lag length p equal to 3 when fitting the VAR models for the Granger causality tests. Granger analyses were calculated with Stata [81] and results were visualized with R [82].
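A minimal sketch of lag selection by information criteria is given below. It uses the textbook forms $\mathrm{AIC}(p)=\ln|\hat\Sigma_p|+2pK^2/n$ and $\mathrm{BIC}(p)=\ln|\hat\Sigma_p|+\ln(n)\,pK^2/n$ for a $K$-variable VAR; statistical packages may report differently scaled versions, so the numbers are not expected to match Stata output. NumPy, the function names, and the simulated data are assumptions made for illustration.

```python
import numpy as np

def var_criteria(data, p, max_lag):
    """AIC and BIC of a VAR(p) fitted by OLS.

    data has shape (T, K). The first max_lag rows are held out as
    presample values so every p is scored on the same observations.
    """
    data = np.asarray(data, dtype=float)
    T, K = data.shape
    n = T - max_lag
    Y = data[max_lag:]
    lags = [data[max_lag - j:T - j] for j in range(1, p + 1)]
    X = np.hstack([np.ones((n, 1))] + lags)
    coef, *_ = np.linalg.lstsq(X, Y, rcond=None)
    resid = Y - X @ coef
    sigma = resid.T @ resid / n              # residual covariance (ML form)
    log_det = np.linalg.slogdet(sigma)[1]
    k = p * K * K                            # number of lag coefficients
    return log_det + 2 * k / n, log_det + np.log(n) * k / n

def select_lag(data, max_lag):
    """Return the lag lengths minimizing AIC and BIC, respectively."""
    scores = {p: var_criteria(data, p, max_lag) for p in range(1, max_lag + 1)}
    aic_lag = min(scores, key=lambda p: scores[p][0])
    bic_lag = min(scores, key=lambda p: scores[p][1])
    return aic_lag, bic_lag

# Simulated bivariate series whose true lag order is 2.
rng = np.random.default_rng(1)
T = 2000
e = rng.normal(size=(T, 2))
sim = np.zeros((T, 2))
for t in range(2, T):
    sim[t, 0] = 0.3 * sim[t - 1, 0] + e[t, 0]
    sim[t, 1] = (0.5 * sim[t - 1, 1] - 0.4 * sim[t - 2, 1]
                 + 0.3 * sim[t - 2, 0] + e[t, 1])

aic_lag, bic_lag = select_lag(sim, max_lag=5)
```

BIC penalizes extra lags more heavily than AIC, so the two criteria can disagree; when they do, the choice between them is a judgment call.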

Trend analysis

Joinpoint regression fits segmented linear regression models to data to model non-constant trends, identifying periods with differing trends between joinpoints and estimating the slope within each segment. We used the Joinpoint Regression Program (version 5.0.2) [83] to calculate the average weekly percent change (AWPC) and so quantify the rate of change in semaglutide online search trends in each country. While slopes may differ between segments, the AWPC averages the segment slopes into a single summary of the mean weekly percent change over the whole analysis period (2019 to 2023) for each country. We also modeled the monthly percent change (MPC) in semaglutide online searches for the same period, using semaglutide search probabilities as the dependent variable and month as the independent variable. For MPC analyses, we used a weighted Bayesian Information Criterion (WBIC) to select the final model and allowed a maximum of five joinpoints.
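To make the averaging step concrete, the sketch below assumes the common log-linear formulation, in which each segment has slope $b_i$ on the log scale, the percent change within a segment is $(e^{b_i}-1)\times 100$, and the average percent change is $(e^{\sum_i w_i b_i/\sum_i w_i}-1)\times 100$ with $w_i$ the segment length. The function names and the example slopes are hypothetical, and the sketch does not reproduce the Joinpoint Regression Program itself.

```python
import math

def percent_change(slope):
    """Percent change per time unit implied by a log-linear slope."""
    return (math.exp(slope) - 1.0) * 100.0

def average_percent_change(segments):
    """Length-weighted average percent change over all segments.

    segments is a list of (slope, length) pairs, where slope is the
    coefficient of time in a regression of log(value) on time and length
    is the number of time units (e.g., weeks) the segment spans.
    """
    total = sum(length for _, length in segments)
    mean_slope = sum(slope * length for slope, length in segments) / total
    return percent_change(mean_slope)

# Invented example: 100 weeks growing 1% per week, then 50 weeks at 3% per week.
segments = [(math.log(1.01), 100), (math.log(1.03), 50)]
awpc = average_percent_change(segments)
```

Because the averaging happens on the log scale, the result is a geometric rather than arithmetic mean of the segment growth rates, and longer segments count for more.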

Thematic analysis

We downloaded 30 samples of the top GTEH queries related to Semaglutide and aggregated the samples by including all queries, regardless of how many samples they were present in, and taking the median scaled value across all samples. We obtained the top queries for the full time period (January 2019–August 2023) as well as for every quarter (with the last covering only July–August 2023). We used the PROC HPTMINE natural language processing (NLP) procedure of SAS Text Miner [84] to compare all the top queries across all the quarters for each country and classify them into themes.

Results

Overall, semaglutide online searches showed an upward trend over time. Figure 2 displays weekly Google search interest for semaglutide from January 2019 through August 2023 across different countries. In 2019, search interest was nearly flat, followed by a slight year-on-year increase starting from January 2022. Interest then increased exponentially across all countries, with the highest search probability observed in the USA in May 2023.

Fig. 2

Per-country semaglutide Google searches from January 2019 to August 2023. Semaglutide (topic) online search probability (× 10 million); ROI: regional online interest values (Scale 0–100)

We conducted Granger causality tests to examine whether the ProQuest-derived estimate of semaglutide news coverage could predict online search probabilities in all of the countries individually and whether the news coverage combined with the search probabilities in one country could predict the search probabilities in other countries (i.e., whether any countries were ahead of other countries in search probability). A left-skewed distribution of p values corresponds to an indicator variable with a strongly leading relationship across other countries. Figure 3 describes the p values of the leading indicators’ relationships. The ProQuest news coverage estimates best predicted the trends in several countries at lag-3 (i.e., at a 3-week time gap), and the news-and-country combined predictions for other countries were especially strong for Germany and the UK, while several countries showed high variation in the predictive relationship strength to semaglutide online searches. The significant relationships are plotted in Fig. 4 and the underlying p-values for each relationship are tabled in Additional file 3: Table S2.

Fig. 3

Granger causality significance test for news about semaglutide and its impact on Google searches, 2019–2023

Fig. 4

Granger causality significance tests (p < 0.01) for ProQuest as a sole and combined predictor of search probability trends in all countries. Red lines indicate significant paths for ProQuest. Black lines indicate significant paths between countries. Continents coded as follows: Africa (purple), Australasia (orange), Europe (green), Middle East (blue), North America (red), South America (yellow)

The estimated AWPC values and accompanying 95% CIs for online search interest from January 2019 to August 2023 are listed for each country in Table 1, with the AWPC over the 5-year period (247 weeks) quantified through joinpoint regression modeling. Overall, statistically significant increasing trends were evident in most countries, ranging from 0.1% (95% CI 0.1–0.2) in the UAE to 1.7% (95% CI 1.6–1.7) in the USA.

Table 1 Average weekly percent change in semaglutide search probability by year and country

In 2019, only five countries showed increasing online search trends, with Brazil reporting the highest change at 3.0% (95% CI 2.7–3.4, p < 0.001), followed by Canada at 1.6% (95% CI 1.1–2.1, p < 0.001) and the USA at 0.6% (95% CI 0.4–0.7, p < 0.001). Most countries exhibited positive weekly trends in 2021 and 2022, while a negative trend was observed in some countries in 2023. In 2021, Australia (3.7%, 95% CI 3.1–3.6, p < 0.001) recorded the highest search trend, while Belgium, Saudi Arabia, Serbia, and South Africa showed negative shifts. All countries demonstrated consistently increasing weekly semaglutide search trends in 2022, with seven exhibiting declines in 2023.

The joinpoint regression analysis revealed multi-phasic non-linear patterns in semaglutide search probabilities across the 27 countries examined from 2019–2023 (Table 2). Until mid-2020, most of the countries demonstrated a significant downward trend in monthly semaglutide search probability, except for the Netherlands (MPC 0.7%, 95% CI 0.0–1.3), the USA (2.4%, 95% CI 1.3–3.2) and Brazil (16.7%, 95% CI 14.3–19.9). These decreasing phases reversed to rising trends beginning around mid-to-late 2020, when the monthly search probabilities significantly increased to peak levels in 2021–2022 across all countries. However, the scale and duration of uptrends varied substantially by country. For instance, semaglutide monthly search probabilities peaked in Brazil from April to September 2020 (24.6%), while Australia (16%), the UK (50.5%), the USA (23.0%), and Spain (16.5%) peaked in late 2020. In 2021, peak search probabilities were observed only in Romania (3%), Germany (13.7%), and Poland (31.1%). Peak upward trends in semaglutide search probabilities were observed in the Netherlands, Norway, Poland, Saudi Arabia, and the UAE during early 2022, followed by Ireland, Serbia, South Africa, Sweden, and Switzerland in late 2022.

Table 2 Monthly trends in semaglutide search probabilities

The thematic analysis for each country across the full time period is shown in Additional file 4: Tables S3–S29. There were some important differences between countries (e.g., which countries did and did not include a theme of searching for semaglutide side effects in general, and also for a particular side effect, known as “Ozempic face” [85]). There were also similarities with regional variation. For example, many countries contained queries about how to obtain semaglutide, often including the name of the dominant pharmaceutical retailer for that particular country (e.g., Chemist Warehouse in Australia, Boots in the UK, Dischem in South Africa). Many countries included searches related to buying Ozempic, even in the local language. Germany showed a strong theme related to buying Ozempic without a prescription (e.g., “Ozempic kaufen ohne rezept”). Almost every country contained a theme around semaglutide and weight loss searches, but only two countries (Germany and the U.S.) had themes around semaglutide and diabetes. Additional file 5 and Additional file 6 show the relative search priority for all the various queries over time (by quarter) for each country. Because the single-term search queries “Ozempic” and/or “semaglutide” (or regional spellings of semaglutide) tended to dominate, we show the relative search priority with these terms in Additional file 5: Figures S1–S27, and without them in Additional file 6: Figures S28–S54 for a clearer focus on the other topics included in searches. These plots reveal that regional Ozempic searches (e.g., “Ozempic Australia”) were still very strong and also that the searches related to weight loss tended to figure quite strongly (although there was variation between different countries). Furthermore, weight loss searches did not start at the same time in all countries.
For example, Additional file 6: Figure S54 shows that “Ozempic weight loss” and “weight loss Ozempic” searches were both prominent in the USA, and both were already present in the first quarter of 2019, as also in the UK, Ireland, and Canada, while in South Africa and Australia, these searches were less prominent until late 2021. More countries showed diabetes as a theme in the quarterly data, but this theme showed low search interest in those quarters in which it did appear.

Lastly, Fig. 5 shows the raw GTEH search probabilities for 25 of the 27 countries. Some of the events mentioned in the introduction are plotted to show their relation to the timeline. Specific charts for each country, which allow more nuanced analysis, are shown in Additional file 7: Figures S55–S81. It appears that the Dr. Oz alert coincided with a strong spike in search interest in sixteen countries shown in Fig. 5, but that the various Elon Musk tweets and the 2022 Met gala event did not show similar correspondence to search trends. Also, many countries showed spikes in search interest that did not correspond to the events listed, which investigations with a specific regional focus might uncover.

Fig. 5

Google Trends Extended for Health raw search probabilities for twenty-five countries plotted against social media events

Discussion

This study demonstrated semaglutide search interest in numerous countries across the world. The shifts in search probability for each country have coincided less with the per-country approval dates of semaglutide, and more with the media reporting and social media discussion of semaglutide. Our study focused on internet search data, rather than social media data. Examination of discussions on social media could provide additional context but is hampered by continuing changes to the level of data access that for-profit social media companies grant researchers (e.g., full and free researcher access to Twitter through its v2 API was restricted after rebranding to X). Further challenges are the amorphous landscape of social media platforms (e.g., TikTok launched internationally in September 2017, around the same time as the first regulatory semaglutide approvals), and the shift of social media content from text to image to video, with corresponding analysis challenges. Nonetheless, finer-grained analyses would allow a detailed examination of the exact media and social media drivers around interest in semaglutide in each country. More detailed analyses of country-specific semaglutide searches could examine the full context of the top searches for semaglutide within smaller time frames (e.g., were searches only for semaglutide, or for semaglutide and weight loss, or for how to purchase semaglutide). This additional information is also available from GT, although it will require additional data extraction specific to each country. Using the joinpoints identified per country in this study, researchers could identify key changes to the “semaglutide narrative” in each country.
For example, researchers could examine the themes identified in the top search queries for smaller time frames, and how these themes might change over time (i.e., whether, and when, new themes emerge, and existing themes decline), and then also track the search patterns for these themes (and not only the specific queries). This more detailed information could be combined with regulatory and other information to tease apart the impact of regulatory communication, media information, and more. Also, the impact of specific social media events within each country’s timeline could be tested using interrupted time series analysis.

The popular appeal and the number of offerings (both competing medicines and different formats available on the market) [86] of anti-obesity medications have exploded. For example, after Ozempic, Novo Nordisk released semaglutide in different formulation strengths specifically targeted at weight loss (as Wegovy) and then subsequently as Rybelsus, an oral formulation [18, 87, 88]. Both traditional media exposure and social media hype continue to fuel a preference for seeking medical, not lifestyle, interventions to address diseases of lifestyle [47, 89]. As such, little doubt exists that the problems recently experienced with semaglutide merely foreshadow even greater future medicine-access societal inequalities in the domain of noncommunicable diseases [47, 90, 91]. Searches of the medicines regulator websites of many countries examined in this study (e.g., https://www.gov.uk/search/all?keywords=semaglutide&order=relevance or https://www.tga.gov.au/search?keywords=semaglutide&submit=Search) have revealed numerous news alerts related to, initially, supply shortages [43], and more recently, alerts of counterfeit semaglutide on the market [92, 93]. These counterfeit medicines pose a serious health risk to a significant proportion of the population. Related to counterfeiting is the response of compounding pharmacists, compounding semaglutide to supply large numbers of patients without proper medical supervision, which, at least in Australia, is illegal [94, 95]. This is evidenced by multiple postings on online marketplaces such as eBay or Facebook Marketplace, where doses of semaglutide are offered for hundreds of dollars, which may be counterfeiting, compounding, or just scams [42, 50, 96, 97].

Even in pure form, GLP-1 receptor agonists have side effects, especially gastrointestinal [98,99,100,101,102]. One trial examining long-term use found over 80% of respondents reported “mild to moderate” gastrointestinal side effects for both daily liraglutide and weekly semaglutide [17]. Another trial detailed these adverse events as typically “nausea, diarrhea, vomiting, and constipation” and while reported as “mild-to-moderate in severity,” they were “transient, and resolved without permanent discontinuation of the regimen” [23]. However, the trial’s supplementary data show that constipation (second only to nausea in incidence) was less transient. Given these high rates of adverse events, and an as yet unverified adverse event signal of suicidal thoughts and self-injury [49, 103,104,105] (albeit with some recent contradictory evidence [106]), it is not advisable that GLP-1 receptor agonists be administered without proper medical consultation.

For all of these reasons, surveillance of the demand for GLP-1 receptor agonists such as semaglutide must continue, and internet search surveillance can play an important role in this infodemiological application. Although GT web data provide some indications of search activity, the fact that these values are scaled complicates comparisons between countries and time periods. Researcher-facing API services, such as GTEH, provide invaluable insights into the online activity of people across all walks of life.

Our study has several limitations. Our use of the Semaglutide topic relies on Google’s proprietary (i.e., non-transparent) algorithms to aggregate searches related to semaglutide. This allowed us to transcend the language barrier for the global perspective of this paper, but the degree to which Google’s algorithms correctly classify searches for each individual country is unknown [107, 108]. Future studies focusing on individual countries should broaden the scope of search terms to include both Semaglutide and local terms relevant to the topic [109]. Our study also did not investigate the impact of the COVID-19 pandemic on search interest in semaglutide. There is a large body of literature showing correlations between Google Trends searches and COVID-19 hospitalizations in various countries, but these studies mostly correlate only a single sample of GT website COVID-19 search interest with local data and do not account for media influence. Even though there may be some similarity between the gastrointestinal symptoms of COVID-19 and the gastrointestinal side effects of semaglutide, and even though lockdowns might have led to searches for medical weight loss options in the presence of decreased physical activity, a cursory examination of the search probabilities shown in Fig. 5 appears to indicate that semaglutide search interest measured through the Semaglutide topic was not affected by the COVID-19 pandemic. This is further corroborated by the top search queries for each country (Additional file 4: Tables S3–S29), which did not include queries possibly relatable to COVID-19 or COVID-19 symptoms.

A further limitation is that the NLP we employed did not perform equally well across all languages—regional studies will benefit from using NLP packages with language libraries tailored to the specific country. Our reliance on ProQuest as the indicator of print media coverage will have underestimated the level of news coverage, but the database’s scope will have captured the overall trend, which is sufficient for our analyses.

This study has two strengths. First, we relied on the superior GTEH data, accessed using the API keys of all the authors. These data are not scaled to 100 and are not rounded to integer values (with the accompanying loss of information), instead presenting the raw search probability for each geographical region included in the study, relative to the size of the underlying population, allowing direct comparison between all the countries involved. Second, we used multiple samples of both regional interest and timeline data for the analyses, providing better estimates of the search trends than ordinarily reported in the literature [110].

Conclusions

Our study has shown definite changes in global search interest in semaglutide. At some—but not all—points in time, these have coincided with media reporting and social media influence. Regulators have attributed shortages in semaglutide supply to social media promotion driving up demand for off-label use. This reflects a health risk reality related to a trifecta of inadequate supply to patients who really need these medicines, improper use by users for whom these medicines are not approved, and the proliferation of potentially harmful counterfeit medicines flooding markets in attempts to profit from limited supply and public desperation for medical interventions to lifestyle diseases.