1 Introduction

Social Media has become a buzzword in public discussions, steadily increasing its attraction for both academia and industry in recent years. In this article, we follow Kaplan and Haenlein (2010) who define Social Media as a “group of Internet-based applications that build on the ideological and technological foundations of Web 2.0, and that allow the creation and exchange of User Generated Content” (p. 61). This term includes all the well-known websites where people share their thoughts, pictures, or videos with the Internet community (e.g., Facebook, Twitter, Google+, YouTube).

The number of people involved in Social Media has largely increased in recent years. According to Trendstream’s Global Web Index (Q4 2012), 693 million people are active users on Facebook, followed by Google+ (343 million), Youtube and Twitter (both ~280 million). These numbers indicate that virtually any Internet user participates in Social Media today.

The value of user-generated content in terms of business forecasts has been shown in the literature. For instance, online consumer reviews can be used to predict movie success (e.g., Chintagunta et al. 2010; Dellarocas et al. 2007), video game sales (Zhu and Zhang 2010), music sales (Heimbach and Hinz 2012), or book sales (Chevalier and Mayzlin 2006).

Some research has already been done to investigate the influence of user-generated content on stock returns. Generally, one can distinguish between sentiment detection with respect to specific objects of interest and the analysis of mood levels, i.e., the strength of positive or negative mood states. Methods of the former kind focus, for example, on measuring company sentiment by analyzing consumer reviews (e.g., Tirunillai and Tellis 2012) or the contents of stock message boards (Antweiler and Frank 2004). Twitter has also been used to extract sentiment with respect to commodity markets and currency rates (Rao and Srivastava 2012).

While these approaches aim to determine the degree of positivity or negativity towards a firm or product, this article will primarily deal with the second approach, the analysis of mood levels. We will use Twitter to determine mood states on a general level. Behavioral finance and neurofinance researchers attempt to explain the link between investors’ emotions and their trading behavior (e.g., Tseng 2006). For instance, individuals tend to be loss-averse, which means that they assign more importance to losses than to gains (Tversky and Kahneman 1991).

While early research was typically done in experimental settings, Social Media applications can now help reveal the social mood (Nofsinger 2005). Individuals in good mood are more willing to invest in risky assets, such as stocks (Johnson and Tversky 1983). Thus, stock returns depend on the investors’ risk appetite which in turn depends on their mood states.

The impact of feelings and emotions on the stock market was measured by means of Twitter (e.g., Bollen et al. 2010), Facebook (Karabulut 2011), or LiveJournal (Gilbert and Karahalios 2010). The prediction of share returns based on mood states can be seen as a market anomaly contradicting the efficient market hypothesis (e.g., Kamstra et al. 2000).

However, virtually no study has considered social interactions of Internet users when showing the relationship between mood levels and stock returns. We therefore aim to extend previous research by including the number of Twitter followers in the analysis. The importance of every tweet depends on the number of users recognizing the original message. There is wide evidence that lead-users exert a large influence on other members of the community. Studies have also shown mood contagion, i.e., the transfer of emotions from leaders to followers (Bono and Ilies 2006; Sy et al. 2005) or between persons in general (Neumann and Strack 2000). A number of recent studies found evidence for emotional contagion on the Internet (e.g., Coviello et al. 2014; Guillory et al. 2011; Kramer et al. 2014). According to these findings, mood states can spread among Internet users through text-based communication.

First, we study the influence of changing social mood levels on share returns without considering the community structure. This enables us to answer the question of whether mood effects, which have been found by other researchers before, still exist in today’s financial markets. These effects might have diminished in recent years due to potential data mining strategies of investors. Afterwards, we include the importance of each tweet, as measured by the number of followers, in the analysis. It will become clear whether the predictive ability of mood states can be improved by considering social interactions of Internet users. After investigating the relationship between the social mood and the stock market in the training period, we apply a trading strategy to a different time period. Results of our virtual portfolio will show whether investors can actually profit from mood states in monetary terms.

In the next section we develop our hypotheses and present previous research which investigated the influence of emotions on stock returns. We then describe the empirical study including the calculation of the Social Mood Indices (SMI and WSMI), our data set, method and results. On the basis of our results in the training period, we create a trading strategy for the German stock market. The paper concludes with a brief summary as well as implications for researchers and practitioners.

2 Previous Research

2.1 Behavioral Finance

Since the early 1990s, behavioral finance researchers have continuously shown that the stock market is driven by investors’ psychology. Investors are human beings who are prone to errors or at least emotion-based decisions (Footnote 1). Market anomalies were observed which contradict the efficient market hypothesis (Fama 1970), according to which the prediction of share prices should not be possible since market prices reflect every available piece of information.

For instance, calendar anomalies refer to seasonal movements of stock market returns. The January effect states that returns are on average higher in January compared to other months of the year (e.g., Thaler 1987). One reason for this anomaly might be tax-loss selling. Investors aim to avoid taxes by selling shares which have performed badly over the year. Then, at the beginning of the year, share prices recover from such selling pressure (Brown et al. 1983). Researchers also identified the Monday effect (also known as day of the week effect or weekend effect), implying that returns on a Monday are relatively low compared to those on the Friday before (e.g., Jaffe et al. 1989).

Anomalies can also have a technical background. The momentum effect implies that past winners (losers) continue to perform well (badly). This has been observed for single stocks (Jegadeesh and Titman 1993) as well as for indices (Chan et al. 2000). Investors also use the past performance of mutual funds as an indicator for future returns although persistence cannot be expected according to the efficient market hypothesis (Grinblatt et al. 1995).

Researchers provide different explanations for these market inefficiencies. Reasons for technical- and calendar-related anomalies are out of the scope of this paper. Instead, we focus on anomalies which are driven by feelings and emotions.

Behavioral finance researchers refer to two groups of investors which are important for the pricing information. First, rational arbitrageurs are well-informed investors who are not prone to sentiment. This group of investors is also known as “smart money” in the literature (De Long et al. 1990). On the other hand, noise traders irrationally rely on sentiment and other non-fundamental information which is unimportant in the eyes of rational traders (Black 1986). These noise traders follow trends and often over- or underreact to news.

The proponents of the efficient market hypothesis argue that rational arbitrageurs trade against noise traders, driving prices immediately back to fundamental values after exogenous shocks. Noise traders can therefore influence prices only for a very short time before rational traders take positions against them until the market equilibrium is reached (Fama 1970).

However, behavioral finance researchers have shown that the power of rational arbitrageurs in trading against noise traders is limited. De Long et al. (1990) refer to positive-feedback strategies: more and more noise traders might follow other noise traders when buying or selling stocks. In this case, noise traders buy (sell) in case of rising (falling) prices. Thus, rational speculators can anticipate the behavior of tomorrow’s noise traders and also buy the stocks today, driving prices even higher.

There are a number of other factors which limit the capability of rational investors to trade against the uninformed individual investors. For instance, smart money might have short-selling constraints and other trading risks (Shiller 2003). Since rational investors are mostly risk averse, the fundamental risk (e.g., variance of share values) can also prevent arbitrageurs from trading for a certain period of time.

The overall conclusion from this line of research is that sentiment can influence share prices when arbitrage is limited. Different sentiment measures have been proposed in order to forecast share returns, such as investor and consumer surveys (Brown and Cliff 2005; Lemmon and Portniaguina 2006; Qiu and Welch 2004), trading volume (Baker and Stein 2004), or market volatility (Whaley 2000). In this article, we focus on mood levels which have also been used as a proxy for investor sentiment (Baker and Wurgler 2007).

2.2 Influence of Mood on Share Returns

According to neuropsychologists, mood is influenced by different factors. While dopamine was found to mediate the cognitive effects of positive mood, serotonin may be responsible for negative mood (Mitchel and Philipps 2007). During the day, not only events and stress levels influence people’s mood states (van Eck et al. 1998) but also social interactions with other people (Vittengl and Holt 1998).

The literature reports many examples of mood-related anomalies. Saunders (1993) studied the period between 1927 and 1989 and found that stock returns on the New York Stock Exchange are lower on cloudy days than on sunny days. The weather effect was confirmed by Hirshleifer and Shumway (2003) who show that sunshine is positively correlated with returns in 26 countries between 1982 and 1997. Both studies argue that sunshine creates a good mood, which in turn affects investment behavior.

Sport events can also influence people’s mood levels (Wann et al. 1994). Following this intuition, Edmans et al. (2007) studied the effect of international soccer game results on stock returns. The authors observe that domestic stock markets react negatively to losses of national soccer teams in international competitions (e.g., World Cup, Asia Cup). For instance, elimination from the World Cup leads to abnormal stock returns of −49 basis points on the next trading day. This loss effect holds for other sports, such as cricket or basketball. Chang et al. (2012) show that NFL game outcomes influence returns of companies which are locally headquartered, confirming results on the national level.

Apart from sport events or weather conditions, sleeping habits are another area of interest for studying the influence of emotions on asset prices. Kamstra et al. (2000) refer to the so-called “daylight saving anomaly”, which means that Mondays after daylight-saving weekends have lower stock returns than regular Mondays over the year. The reason for poorer returns lies in the fact that individuals tend to shy away from risky assets due to increased anxiety which is caused by losses or gains of sleep.

Investors’ mood might also be influenced by the level of air pollution. According to Levy and Yagil (2011), regions with a higher degree of air pollution (as measured by the Air Quality Index) show smaller returns compared to ecologically cleaner areas. Finally, Kamstra et al. (2003) investigated the role of depression in investment behavior. Many individuals (and thus investors) suffer from seasonal affective disorder (SAD) during autumn and winter months when sunshine hours are scarce. Consequently, longer nights lead to significantly lower returns for a number of stock markets in the world. The SAD effect was observed to be more pronounced in countries with a long distance to the equator (e.g., Sweden).

Thus, single events (e.g., sport results, daylight saving anomaly) or continuous effects (e.g., weather effect, seasonal affective disorder, air pollution) influence people’s emotions. These mood-related anomalies can be explained by the misattribution bias according to which people make risky decisions depending on mood states (Johnson and Tversky 1983). Individuals in a good mood are more optimistic with respect to uncertain future events. A person’s emotional well-being is therefore important for subjective probability evaluations (Wright and Bower 1992).

The relationship between positive and negative mood states and the risk-taking tendency can be explained by the Affect Infusion Model (AIM), which postulates that people in a positive mood rely on positive cues to make decisions (Forgas 1995). Because of the mood priming effect, people in positive moods associate risks with positive results, in contrast to people in negative moods. Thus, the risk-taking tendency is higher for people in positive moods since they use heuristics and perceive the consequences of risky situations as more positive. People in negative moods are more prone to see the danger and are thus more careful in the decision process. Therefore they shy away from risks due to the negative associations with the risky decision (Schwarz 1990).

The AIM was confirmed by a number of laboratory experiments. For instance, Yuen and Lee (2003) induced subjects to positive and negative mood by showing corresponding movie clips. Results reveal that people in a bad mood show a more conservative risk-taking behavior compared to people in neutral or positive mood. Using a similar method, Chou et al. (2007) also report a higher risk-taking tendency for people in good mood compared to those in bad mood.

Depressive mood states have also been widely studied in the literature, especially by linking depression to levels of “sensation seeking”, which is another measure for risk-taking tendency (e.g., Zuckerman 1984). It has been shown that depressive subjects have reduced sensation seeking compared to normal people (Carton et al. 1992). Bell et al. (2000) found that differences in risk behaviors can be explained by the levels of sensation seeking. Wong and Carducci (1991) show that high sensation seekers have a greater risk-taking tendency in financial decisions than people with lower scores of sensation seeking. Furthermore, Eisenberg et al. (1998) show that depression correlates with risk aversion.

We argue that mood fluctuations influence the risk attitude, which in turn exerts an influence on the willingness to invest in risky assets, such as stocks (Fig. 1). This relationship was shown in the above cited studies of behavioral finance. Stock returns are therefore expected to be influenced by mood states of market participants.

Fig. 1 Theoretical framework

2.3 Predictive Value of Social Media

While earlier research used exogenous factors as variables of interest (e.g., weather, sport results), Social Media applications now allow researchers to precisely measure mood fluctuations by analyzing people’s statements about their emotional well-being.

In a seminal work, Bollen et al. (2010) have shown that mood levels extracted from public tweets have predictive value for the Dow Jones Industrial Average. At a time when the overall mood is calm (or to some extent happy), the authors find statistically significant evidence for an associated reaction of the DJIA a few days afterwards. Some other studies using Twitter to predict the stock market appeared in recent years. For instance, Rao and Srivastava (2012) combined Twitter sentiment with Google search volumes to predict returns, trading volume and volatility of commodities (e.g., oil, gold) and stocks. Sprenger et al. (2013) focus on tagged tweets (e.g., $MSFT representing Microsoft) and find a correlation of r = 0.166 between Twitter sentiment and returns. Based on user posts from Twitter, online message boards as well as company news, Nann et al. (2013) created a trading model which outperformed the S&P 500 index by 0.24 % per trade after the consideration of transaction costs. Results from Oh and Sheng (2011), who study a 3-month period of roughly 70,000 postings on stocktwits.com, also reveal the predictive value of micro-blog messages for the stock market development.

Other social networks have likewise been investigated. Gilbert and Karahalios (2010) studied emotions extracted from LiveJournal, showing that the S&P 500 declines in case of increasing levels of anxiety. In a recent study, Karabulut (2011) found that Facebook’s Gross National Happiness (GNH) can predict returns in the US stock market.

In sum, studies from the offline as well as online world provide evidence that the stock market is driven by mood states of market participants. We therefore hypothesize: H1: Increased positive social mood levels derived from Twitter lead to higher stock market returns.

There is also evidence in the literature that the community structure plays an important role when extracting mood from Social Media applications. Studies of diffusion processes and information cascades have a long tradition in the field of social network analysis as well as computer science (Granovetter 1973; Kempe et al. 2003; Leskovec et al. 2007; Hinz and Spann 2008).

We already know from experimental research that mood states are contagious (Hatfield and Cacioppo 1994). For instance, Bono and Ilies (2006) as well as Sy et al. (2005) found that followers and group members are influenced by positive mood states of their leaders. Neumann and Strack (2000) show that feelings are automatically transferred between individuals who listen to each other. Another example of emotional contagion in the real world comes from Fowler and Christakis (2008) who observed the spread of happiness in a real social network during a 20-year period.

A number of recent studies confirm these findings using an Internet setting. In the online world, it was shown that text-based communication can spread emotions among group members (Guillory et al. 2011; Hancock et al. 2008; Kramer 2012). Emotional contagion occurs on the Internet even in the absence of direct social interactions. In a recent experiment, Kramer et al. (2014) manipulated the volume of emotionally positive and negative posts in News Feeds of 689,003 Facebook users. It turned out that people who were exposed to less positive content produced fewer positive status updates themselves. On the other hand, if fewer negative posts occurred in their News Feeds, people published fewer negative status updates. The study confirms results of Coviello et al. (2014) who found that rainfall exerts an influence on the status messages of Facebook users as well as messages of geographically separated friends. Thus, emotions of Facebook members influence the emotions of other Facebook members. This relationship shows that textual content can spread emotions without direct social interactions.

Online shopping behavior also suggests that Internet users rely on the opinions of other community members. Conducting a field experiment, Grahl et al. (2014) were recently able to draw causal conclusions between social recommendations and purchase volume. Displaying Facebook Likes increases online store revenues by almost 13 % within 1 month, which indicates that Internet users are infected by opinions of their peers.

The Twitter network structure has also been investigated in previous research. So far, researchers only focused on the level of information or sentiment spread but not on mood and emotional contagion. According to Lerman et al. (2012), Twitter users are closely connected with each other: Following friends and re-tweeting messages leads to a large social network where news stories and other content can easily spread. The authors previously presented a framework for studying information cascades in online social networks (Ghosh and Lerman 2011). In general, using the number of re-tweets might be interesting for measuring emotional contagion. However, we realized that only a very small fraction of tweets are re-tweeted. This observation is supported by empirical studies which also found few re-tweets. For instance, Boyd et al. (2010) collected 720,000 tweets for studying the re-tweeting behavior on Twitter. Only 3 % of the tweets were re-tweets in this sample.

The number of Twitter followers has frequently been used as a measure for influence and popularity within the community (e.g., Cha et al. 2010; Kwak et al. 2010). The follower influence is also known as in-degree influence in the literature and describes the potential audience a user might reach (Bakshy et al. 2011; Ye and Wu 2010). Bakshy et al. (2011) quantified user influence on Twitter and concluded that, on average, “individuals who have been influential in the past and who have many followers are indeed more likely to be influential in the future”. It is therefore reasonable to assume that the number of followers constitutes an appropriate measure for social influence (see for example Hinz et al. 2011). Ruiz et al. (2012) study conversations about companies on Twitter and show correlations with share prices under consideration of user activity and interaction (e.g., number of followers, number of re-tweets).

Hence, we hypothesize: H2: Increased positive follower-weighted social mood levels derived from Twitter lead to higher stock market returns.

3 Empirical Study

3.1 Data Collection and Method

We conducted our empirical study in three steps. First, we study a historical time period in order to replicate previous studies investigating the relationship between Twitter mood and stock returns, because it is unclear whether market actors already incorporate this new information and whether this market anomaly still exists. We therefore collected tweets that were published in Germany between January 1, 2011 and March 17, 2012. Afterwards, we integrate the number of followers into the analysis to see whether social interactions of Internet users help to predict the fluctuation of share prices. This second sample captures the period between December 1, 2012 and November 30, 2013.

We split this sample equally into a training period (December 1, 2012–May 31, 2013) and a testing period (June 1–November 30, 2013) in order to apply a trading strategy for investors. In the training period, we aim to investigate the predictive power of social mood states by integrating up to four lags into the model. Results of the trading strategy in the testing period will show whether investors might consider mood states for trades in the real world. We used a different time period for testing since applying a trading strategy in the same period would only reproduce existing results and therefore decrease the validity of the results (see for example Bollen et al. 2010 or Hill and Ready-Campbell 2011 who used a similar approach).

We accessed the data through the Twitter API (Footnote 2). Each tweet includes the tweet ID, time of publication, information on followers and re-tweets as well as text content, which is restricted to 140 characters. We eliminated all tweets that cannot be categorized as either positive or negative according to the dictionary approach described below.

For the mood analysis, we used Dalbert’s (1992) “Aktuelle Stimmungsskala” (ASTS) which is the German version of the Profile of Mood States (POMS) originally developed by McNair et al. (1971). We therefore followed the seminal work of Bollen et al. (2010) who also used a modified version of POMS for extracting mood levels from public tweets. However, in contrast to these authors, we focused on one specific region (Germany) instead of collecting world-wide tweets.

The ASTS consists of 19 adjectives which belong to 5 mood dimensions: grief, hopelessness, tiredness, anger and positive mood. Respondents usually indicate on a 5-point scale how accurately each adjective describes their current feelings. For instance, the words hopelessly, discouraged and desperately are part of the hopelessness dimension. We expanded the original ASTS from 19 to 529 items by deriving synonyms from the German dictionary Wortschatz (Biemann et al. 2004). This larger scale is called WASTS (Table 1). We translated all items into English, which is the predominant language on Twitter. Only 1 % of all Twitter messages are written in German, while 50 % are written in English (Semiocast 2013). While the percentage of German tweets is higher in Germany, English is widely used on Twitter in this region (Leetaru et al. 2013). It is therefore reasonable to consider both English and German tweets when measuring mood levels in Germany. However, it should also be clear that Twitter users in Germany are primarily German native speakers. According to Lewis (2009), more than 80 % of all German native speakers live in Germany. Emotional states are expressed differently across cultures and languages, which differ widely in the size of their emotion lexicons (e.g., Benedict 1934; Boucher 1979; Brown and Gilman 1960; Gehm and Scherer 1988; Pavlenko 2008). Thus, using the English POMS scale for English tweets primarily written by Germans would ignore the cultural differences, which is why we translate the German WASTS scale into English (see also Gehm and Scherer 1988 for a similar approach).

Table 1 Depressive mood states derived by WASTS

Our approach enables us to classify tweets into one (or more) of the five WASTS mood dimensions. For instance, the tweet “I’m feeling good today” would increase the positive mood score by one point because of the occurrence of the word “good”.
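The dictionary lookup described above can be sketched as follows. This is only an illustration: the word lists are small hypothetical stand-ins for the 529-item WASTS lexicon (only the three hopelessness words are taken from the text), and the tokenization rule is an assumption rather than the procedure used in the study.

```python
import re
from collections import Counter

# Hypothetical mini-lexicon; the actual WASTS scale has 529 items (Table 1).
# Only the hopelessness words are quoted from the text, the rest are placeholders.
LEXICON = {
    "grief": {"sad", "unhappy"},
    "hopelessness": {"hopelessly", "discouraged", "desperately"},
    "tiredness": {"tired", "exhausted"},
    "anger": {"angry", "furious"},
    "positive": {"good", "happy", "cheerful"},
}

def score_tweet(text):
    """Count lexicon word occurrences per mood dimension in a single tweet."""
    # Assumed tokenization: lowercase runs of letters (including German umlauts).
    tokens = re.findall(r"[a-zäöüß]+", text.lower())
    scores = Counter()
    for dimension, words in LEXICON.items():
        scores[dimension] = sum(1 for tok in tokens if tok in words)
    return scores

scores = score_tweet("I'm feeling good today")
print(dict(scores))
```

Because a tweet is scored per dimension, a message containing words from several word lists contributes to more than one dimension, which matches the "one (or more)" wording above.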

Our variable of interest for this study is the “Social Mood Index” (SMI), which we simply define as the share of positive mood in all mood word occurrences (sum of positive and negative mood states).

$${\text{Social Mood Index}} = \frac{\text{Positive Mood}}{{{\text{Grief}} + {\text{Hopelessness}} + {\text{Tiredness}} + {\text{Anger}} + {\text{Positive Mood}} }}$$
(1)

That is, we summed up all positive and negative tweets each day in order to calculate SMI values. We used Central European Time (12 midnight) as cutoff time since we measured the social mood in Germany. The SMI is comparable to Facebook’s Gross National Happiness (GNH) Index, which indicates the mood of Facebook users based on their status updates. The advantage of the SMI is that we do not have to rely on an external source (i.e., black box).
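As an illustration of Eq. 1, the daily aggregation might look like the sketch below. The input format (pairs of a day label and per-dimension counts) and the function name are assumptions made for this example. Day labels are assumed to be already assigned using the midnight cutoff in Central European Time.

```python
from collections import defaultdict

NEGATIVE = ("grief", "hopelessness", "tiredness", "anger")

def daily_smi(scored_tweets):
    """Compute the SMI per day from (day, {dimension: count}) pairs (Eq. 1).

    Days without any mood word occurrence are skipped, since the ratio
    is undefined for them.
    """
    pos = defaultdict(int)
    neg = defaultdict(int)
    for day, scores in scored_tweets:
        pos[day] += scores.get("positive", 0)
        neg[day] += sum(scores.get(dim, 0) for dim in NEGATIVE)
    return {
        day: pos[day] / (pos[day] + neg[day])
        for day in set(pos) | set(neg)
        if pos[day] + neg[day] > 0
    }

# Hypothetical input: three scored tweets on one day, one on the next.
sample = [
    ("2013-01-02", {"positive": 1}),
    ("2013-01-02", {"anger": 1}),
    ("2013-01-02", {"positive": 1}),
    ("2013-01-03", {"grief": 1}),
]
smi = daily_smi(sample)
print(smi)
```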

The SMI represents the Twitter mood in Germany. Every tweet published in Germany during our observation period reflects a part of the social mood. One could argue that the social mood is not representative for the investors’ mood. Indeed, if we were able to measure the investors’ mood solely, we would expect this assessment to be more accurate. However, not every investor has a public Twitter account and it is furthermore very difficult to identify all investors’ nick names on Twitter. This is why we analyzed the social mood on a macro level and assume that the overlap between social mood and investors’ mood is sufficient.

In this article, we follow Nofsinger (2005) who also used the term social mood for collective mood states (Footnote 3). He argues that “interaction with others has a strong influence and leads to a shared emotion, or social mood. Collectively shared opinions and beliefs shape individual decisions, which aggregate into social trends, fashion, and action” (p. 8). According to this definition, the SMI is likely to capture a certain part of emotions of stock market participants.

In addition, we especially aim to account for the social character of mood states by integrating the number of followers into the analysis. The weighted social mood index (WSMI) is simply an extension of the original SMI in that, for each day, we sum up the follower counts of the authors of all positive and negative mood tweets instead of counting each tweet once:

$${\text{Weighted Social Mood Index}} = \frac{{{\text{Positive Mood}} \times {\text{Followers}}}}{{{\text{Positive Mood}} \times {\text{Followers}} + {\text{Negative Mood}} \times {\text{Followers}}}}$$
(2)

For instance, if an influential individual with 10,000 followers on Twitter posts “I’m feeling good today”, this positive tweet would increase the positive score by 10,000 points instead of one point (original SMI, see above).
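The follower weighting in Eq. 2 can be illustrated with a short sketch. The input format, a list of (is_positive, followers) pairs for one day, is an assumption for this example. The follower counts other than the 10,000 from the text are hypothetical.

```python
def weighted_smi(tweets):
    """Follower-weighted SMI for one day (Eq. 2).

    tweets: iterable of (is_positive, followers) pairs. Each tweet
    contributes its author's follower count instead of a single point.
    Returns None if the day has no follower-weighted mood at all.
    """
    tweets = list(tweets)  # allow generators, since we iterate twice
    pos = sum(followers for is_pos, followers in tweets if is_pos)
    neg = sum(followers for is_pos, followers in tweets if not is_pos)
    total = pos + neg
    return pos / total if total else None

# One positive tweet from an account with 10,000 followers and two
# negative tweets from smaller (hypothetical) accounts.
day = [(True, 10_000), (False, 500), (False, 1_500)]
wsmi = weighted_smi(day)
print(round(wsmi, 4))
```

In this constructed case the unweighted SMI would be 1/3, whereas the WSMI is dominated by the influential account. Note that, under Eq. 2 as written, a tweet from an account without followers carries no weight.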

Our dependent variable is the DAX intraday return, which we simply define as the percentage gain or loss between the first price and last price of the trading day. We then study whether SMI and WSMI values have predictive value for share returns. Most of the previous studies have found a relationship between shifts in mood states and a stock market reaction on the next trading day (see Sect. 2). For instance, Kamstra et al. (2000) show that time changes on Sunday (“daylight saving anomaly”) lead to abnormal negative returns on the following Monday. Edmans et al. (2007) found a negative stock market reaction on the trading day after the elimination of the national soccer team at the World Cup. According to Karabulut (2011), changes of Facebook’s Gross National Happiness predict changes of the S&P 500 on the next trading day. However, Bollen et al. (2010) found significant values for different time lags, so we take this possibility into account by including more than one lag in the analysis. This is especially interesting when investigating emotional contagion effects (H2).
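Written out, the verbal definition of the intraday return corresponds to the expression below. The price symbols are notation assumed here for illustration and do not appear in the original text.

$$r_{t} = \frac{P_{t}^{\text{last}} - P_{t}^{\text{first}}}{P_{t}^{\text{first}}} \times 100$$

With hypothetical prices, a first price of 7,600 points and a last price of 7,676 points would give an intraday return of (7,676 − 7,600)/7,600 × 100 = 1.0 %.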

It should be noted that the DAX is dominated by foreign investors. However, these investors are mostly institutional investors such as banks or insurance companies which should not be prone to sentiment changes. For instance, the world’s biggest money manager BlackRock owns 4 % of DAX total value (Footnote 4). In contrast, individual investors and noise traders are mostly domestic investors, living in Germany in our case (see Sect. 2 for a discussion on noise traders). The preference for domestic stocks is known as “home bias” in the literature (French and Poterba 1991). The reason why retail investors prefer local stocks might be familiarity (Huberman 2001; Grinblatt and Keloharju 2001) or superior information (Coval and Moskowitz 1999). We therefore assume that a visible stock market reaction can be observed if noise traders, who are primarily German retail investors, are affected by changing mood levels which in turn influence their risk-taking tendencies.

Equation 3 shows that we control for a number of anomalies, which have been discussed in the previous research section. We account for technical anomalies by including the DAX intraday performance on the previous day (r_{t-1}). This momentum variable represents the general market development (bull or bear market). The DAX index consists of 30 major German companies. It has been shown that past winners are often future winners and vice versa (Chan et al. 2000). In addition, we control for calendar anomalies (see Sect. 2.1). To this end we integrate dummy variables for trading days after the weekend (Monday_t) and national holidays (Holiday_t). Further, the tax dummy variable equals 1 for December 28, 2012 (last trading day of the tax year) as well as January 2–8, 2013 (first five trading days of the tax year) in order to account for tax-loss selling. We also take the lunar cycle into account (Dichev and Janes 2003) by constructing a dummy variable which equals 1 for the (−3; +3) window around full moon days and 0 otherwise. Finally, we control for a time trend by including Time_t. This variable equals 1 on the first trading day of the observation period, 2 on the second trading day and so forth.
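A possible construction of some of these calendar controls is sketched below. Several points are assumptions rather than details from the study: the Monday dummy is approximated by the weekday, the moon window is measured in calendar days, full-moon dates are supplied by the caller (the date in the example should be checked against an ephemeris), and the holiday dummy is omitted because its exact definition is not spelled out here.

```python
from datetime import date

# Tax dummy days as stated in the text: last trading day of 2012 and
# the first five trading days of 2013.
TAX_DAYS = {date(2012, 12, 28)} | {date(2013, 1, d) for d in (2, 3, 4, 7, 8)}

def calendar_controls(trading_days, full_moons):
    """Return one dict of control variables per trading day, in date order."""
    rows = []
    for i, day in enumerate(sorted(trading_days), start=1):
        rows.append({
            "date": day,
            "Monday": int(day.weekday() == 0),  # assumption: weekday proxy
            "Tax": int(day in TAX_DAYS),
            # 1 inside the (-3; +3) calendar-day window around a full moon
            "Moon": int(any(abs((day - fm).days) <= 3 for fm in full_moons)),
            "Time": i,  # 1 on the first trading day supplied, 2 on the next, ...
        })
    return rows

# Illustrative subset of trading days in January 2013.
days = [date(2013, 1, d) for d in (7, 8, 9, 23, 24, 28)]
rows = calendar_controls(days, full_moons=[date(2013, 1, 27)])
for row in rows:
    print(row)
```

In a real application the time trend would count from the first trading day of the whole observation period, so the full list of trading days would have to be passed in.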

We also account for investor sentiment proxies: trading volume, stock market volatility, and consumer confidence. Trading volume and volatility have been shown to interact with stock indices in the past (e.g., Chen et al. 2001; Chordia and Swaminathan 2000; French et al. 1987; Gallant et al. 1992; Karpoff 1987). \({\text{TradingVolume}}_{t}\) represents the turnover of all DAX shares on day t. \({\text{Volatility}}_{t}\) is the stock market volatility on day t as measured by the VDAX-NEW. This index indicates the implied volatility of the DAX which is expected by market participants for the next 30 days (Footnote 5). In addition, we include consumer confidence in the analysis. Qiu and Welch (2004) have shown that consumer sentiment correlates well with investor sentiment. Furthermore, Lemmon and Portniaguina (2006) used consumer confidence as a measure for investor sentiment in order to forecast share returns. One prominent measure for consumer confidence in Germany is the GfK index, which is published by the market research group GfK once a month (Footnote 6). \({\text{ConsumerConfidence}}_{t}\) indicates the consumer confidence as measured by the GfK index (in points) in the month that contains day t.

We use OLS in order to measure the effect of Twitter mood on stock returns. We estimate our model (Eq. 3) with robust standard errors due to heteroskedasticity (Breusch-Pagan test p < 0.01).

$$r_{t} = \beta_{0} + \beta_{1} \times {\text{SMI}}_{t - 1} + \beta_{2} \times {\text{SMI}}_{t - 2} + \beta_{3} \times {\text{SMI}}_{t - 3} + \beta_{4} \times {\text{SMI}}_{t - 4} + \beta_{5} \times r_{t - 1} + \beta_{6} \times {\text{TradingVolume}}_{t} + \beta_{7} \times {\text{Volatility}}_{t} + \beta_{8} \times {\text{ConsumerConfidence}}_{t} + \beta_{9} \times {\text{Monday}}_{t} + \beta_{10} \times {\text{Holiday}}_{t} + \beta_{11} \times {\text{Tax}}_{t} + \beta_{12} \times {\text{Moon}}_{t} + \beta_{13} \times {\text{Time}}_{t} + e_{t}$$
(3)
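To make the estimation idea concrete, the following sketch regresses the return on a single lagged mood value and computes a heteroskedasticity-robust (White/HC0) standard error for the slope. It is a deliberately reduced, one-regressor version of Eq. 3, written under the assumption that plain lists of daily values are available. The function names `lag_pairs` and `ols_hc0` and the sample numbers are hypothetical, and the article does not state which robust estimator variant was used.

```python
import math

def lag_pairs(returns, mood, lag=1):
    """Pair each return r_t with the mood index observed `lag` days earlier."""
    return [(mood[t - lag], returns[t]) for t in range(lag, len(returns))]

def ols_hc0(pairs):
    """Simple OLS of y on x with an intercept.

    Returns (slope, robust_se), where robust_se is the White (HC0)
    standard error of the slope.
    """
    n = len(pairs)
    x_bar = sum(x for x, _ in pairs) / n
    y_bar = sum(y for _, y in pairs) / n
    sxx = sum((x - x_bar) ** 2 for x, _ in pairs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in pairs)
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    # "Meat" of the sandwich estimator: squared residuals weighted by
    # each observation's squared deviation from the mean of x.
    meat = sum(((x - x_bar) ** 2) * (y - intercept - slope * x) ** 2
               for x, y in pairs)
    return slope, math.sqrt(meat) / sxx

# Hypothetical daily values, for illustration only.
mood = [0.62, 0.64, 0.63, 0.66, 0.61, 0.65]
returns = [0.001, -0.002, 0.004, 0.000, 0.003, -0.001]
slope, robust_se = ols_hc0(lag_pairs(returns, mood, lag=1))
```

The full model would add the remaining lags and control variables as further regressors, which calls for a matrix-based estimator rather than the closed-form slope used here.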

3.2 Results

3.2.1 Descriptive Statistics

In our historical sample, we observe the highest SMI value (0.679) on January 1, 2012, whereas the Twitter mood was rather low after Amy Winehouse’s death (0.617) or during a terrorist attack in Moscow on January 24, 2011 (0.602). It should be noted that we do not aim to show a causal relationship between these events and share returns in this article. As described in Sect. 2, mood states can be influenced by many factors, such as stress levels, weather conditions, social interactions, etc.

Overall, the historical sample period contains 310 trading days between January 1, 2011 and March 17, 2012. The mean value of the SMI during this period is 0.637, which means that roughly two thirds of tweets were recognized as being positive. The phenomenon that positive words are used more often than negative words is known as the “Pollyanna effect” in the literature (e.g., Boucher and Osgood 1969).

This number is compatible with previous studies extracting sentiment from Internet messages. For instance, Rao and Srivastava (2012) studied stock and commodity discussions on Twitter and found that 67.14 % of tweets were positive. The ratio between positive and negative tweets persists when calculating WSMI values. Figure 1 in the Online Appendix shows a comparison between the WSMI and SMI over time.

Overall, we collected roughly 100 million tweets in the 3-year period between January 2011 and November 2013. On average, 102,084 tweets per day were recognized by the German and English version of the ASTS scale. While 60 % of tweets are English, 40 % are recognized as German tweets.

3.2.2 Relationship Between Social Mood (SMI) and the Stock Market

Surprisingly, we find no significant relationship between Twitter mood as measured by the SMI and share returns on the next 4 trading days in Germany (Table 2). We can therefore reject Hypothesis 1. One explanation might be that market actors have incorporated the mood level in their models so that the market anomaly no longer persists. Multicollinearity does not seem to be a problem, with all VIFs below 10 (mean VIF = 1.57).

Table 2 Influence of SMI on the stock market (01/2011–03/2012)

3.2.3 Relationship Between Follower-Weighted Social Mood (WSMI) and the Stock Market

Previous research has shown that mood states and emotions are contagious on the Internet (e.g., Kramer et al. 2014). We also know that Internet users heavily interact with each other on micro-blogs. It is therefore reasonable to investigate whether the predictive ability of the SMI improves when weighting each tweet according to its importance within the Twitter atmosphere. We therefore include the number of followers in the analysis and create the WSMI as described in Sect. 3.

Please note that this information is not available for the historical data set that we used in our first analysis. Our second sample includes tweets that were published in Germany between December 1, 2012 and May 31, 2013. We study the influence of the WSMI on the stock market on 117 trading days.

Table 3 shows that the DAX intraday return is positively influenced by increased WSMI values, supporting H2 (p < 0.05). A 1 % increase of the WSMI compared to the previous day exerts an influence of 3.3 basis points on the next day’s DAX return (Footnote 7).

Table 3 Influence of WSMI on the stock market (12/2012–05/2013)

The relatively small effect of 3.3 basis points is in line with existing studies which investigated the predictive value of mood states and online sentiment for the stock market. Most researchers observe only weak magnitudes (e.g., Antweiler and Frank 2004; Karabulut 2011). Compared to other studies in the field of share price forecasting, our \(R^{2}\) value of 25 % is relatively high (Footnote 8). Usually small \(R^{2}\) values are reported because share prices are influenced by a number of factors which cannot all be included in one regression. Even the \(R^{2}\) of 3.2 % which we obtained in the historical sample (Table 2) is at the upper end of existing studies. As a robustness check, we also calculated (W)SMI values without the anger dimension because anger might foster risk-taking tendencies and thus lead to higher stock market returns. However, we obtained qualitatively similar results compared to our original SMI and WSMI measures (see Online Appendix, Tables 2–4).

We found only one working paper which included the number of followers into the Twitter mood analysis. In contrast to our results, Zhang et al. (2010) do not report any significant influence of follower-weighted mood levels on the US stock market. However, the authors only present correlation coefficients of Twitter mood variables with the US stock market and do not perform more sophisticated analyses or control for other mood and technical-related anomalies.

We adopted a bivariate VAR model in order to test Granger causality. The model is estimated with the following equation:

$$z_{t} = \alpha + \mathop \sum \limits_{j = 1}^{n} \gamma_{j} \times z_{t - j} + \beta \times x_{t} + e_{t}$$
(4)

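where \(z_{t}\) is a vector of the WSMI and DAX intraday return on day t, and \(x_{t}\) is a vector of our control variables: \({\text{TradingVolume}}_{t}\), \({\text{Volatility}}_{t}\), \({\text{ConsumerConfidence}}_{t}\), \({\text{Monday}}_{t}\), \({\text{Holiday}}_{t}\), \({\text{Tax}}_{t}\), \({\text{Moon}}_{t}\), \({\text{Time}}_{t}\).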

In contrast to OLS regression, the VAR model allows us to capture linear interdependencies between the follower-weighted social mood and share returns. That is, the variables are explained in the VAR system both by their own delayed values and by the delayed values of the other variable. Testing up to 10 time lags, we obtained the lowest Akaike Information Criterion (AIC) and Schwarz’ Bayesian Information Criterion (BIC) when choosing 1 lag (=1 day). The WSMI exerts a significant influence (p < 0.05) on the next day’s DAX return (see Table 4). Furthermore, the Granger causality test shows that the WSMI does actually Granger-cause the DAX intraday return (p < 0.05).
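Written out for the selected lag length of one day, Eq. 4 becomes a pair of equations. The following is a sketch under the assumption that \(z_{t}\) is ordered as \(({\text{WSMI}}_{t}, r_{t})'\). The element labels \(\alpha_{1}, \alpha_{2}\), \(\gamma_{11}, \ldots, \gamma_{22}\), \(\beta_{1}, \beta_{2}\) and the error terms \(e_{1,t}, e_{2,t}\) are notation introduced here for illustration and do not appear in the original equation:

$$\begin{aligned} {\text{WSMI}}_{t} &= \alpha_{1} + \gamma_{11} \times {\text{WSMI}}_{t - 1} + \gamma_{12} \times r_{t - 1} + \beta_{1}^{\prime} \times x_{t} + e_{1,t} \\ r_{t} &= \alpha_{2} + \gamma_{21} \times {\text{WSMI}}_{t - 1} + \gamma_{22} \times r_{t - 1} + \beta_{2}^{\prime} \times x_{t} + e_{2,t} \end{aligned}$$

In this notation, testing whether the WSMI Granger-causes the DAX intraday return amounts to testing the null hypothesis \(\gamma_{21} = 0\) in the second equation, while \(\gamma_{12} = 0\) is the corresponding null hypothesis for the reverse direction.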

Table 4 Results of VAR model (December 1, 2012–May 31, 2013)

The unweighted SMI variable, which measures the social mood without the consideration of follower numbers, is again far from being significant in the sample period (see Table 1, Online Appendix).

3.3 Trading Strategy

Based on our results, we created a virtual portfolio and applied a simple trading strategy. Individuals can easily invest in stock indices with the help of exchange-traded funds (ETF). These highly liquid funds can be bought and sold during regular trading hours and fully replicate the index performance. If the WSMI increases compared to the previous day, we buy the iShares DAX ETF (ISIN DE0005933931), which is the most popular ETF in the German market. We then hold the investment for one trading day so that our win or loss is the difference between the last price and the first price of the focal trading day. In case of decreasing WSMI values, we buy the db x-trackers ShortDAX ETF (ISIN LU0292106241), which is a liquid instrument for benefiting from decreasing DAX values.

The trading strategy was applied for a time period (June 1–November 30, 2013) different from the training period in order to test whether there actually is a predictive value associated with social mood. Again, only tweets that have been identified as being relevant by our dictionary were stored in the database. The WSMI was calculated in the same way as described in Sect. 3.1.

The following example illustrates our approach: The WSMI decreased from 0.75 points on Wednesday, June 19 to 0.71 points on Thursday, June 20. We then bought the ShortDAX ETF on Friday, June 21. On this day, the DAX decreased by 1.98 % from 7946.32 points (first price in the morning) to 7789.24 points (last price in the evening). The ShortDAX ETF increased by 1.91 % so that the portfolio realized a win of roughly 1.9 % before transaction costs. These numbers illustrate that long as well as short ETFs replicate the index performance virtually on a 1:1 ratio. We chose two highly liquid ETFs in order to create a realistic investment scenario. However, investors are not restricted to these ETFs and might use other instruments.
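The decision rule and the arithmetic of this example can be reproduced with a short Python sketch. The function names `choose_position` and `intraday_return` are hypothetical, and the handling of an unchanged WSMI (no trade) is an assumption, since the text does not cover that case.

```python
def choose_position(wsmi_two_days_ago, wsmi_yesterday):
    """Map the latest day-over-day WSMI change to a position for today."""
    if wsmi_yesterday > wsmi_two_days_ago:
        return "long"   # buy the DAX ETF
    if wsmi_yesterday < wsmi_two_days_ago:
        return "short"  # buy the ShortDAX ETF
    return "flat"       # assumption: no trade when the WSMI is unchanged

def intraday_return(first_price, last_price):
    """Relative change between the first and last price of the trading day."""
    return last_price / first_price - 1.0

# Values from the June 2013 example in the text.
position = choose_position(0.75, 0.71)         # WSMI fell from June 19 to June 20
dax_move = intraday_return(7946.32, 7789.24)   # DAX on June 21, about -1.98 %
ideal_short_gain = -dax_move                   # a perfect 1:1 short tracker
```

The idealized short gain of about 1.98 % is slightly above the 1.91 % reported for the ShortDAX ETF, which reflects that the fund tracks the inverse index only approximately.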

Starting with € 100,000 on June 1, 2013, this portfolio would increase to € 121,012 by the end of our observation period on November 30, 2013 (Table 5). Thus, this simple trading strategy delivered a return of more than 20 % within 6 months while the DAX index itself only increased by 13.4 % (see also P&L chart in Fig. 2, Online Appendix).

Table 5 Trading strategy

The outperformance against the DAX persists even if we control for transaction costs. Assuming a brokerage fee of € 5 per trade (Footnote 9), transaction costs would reduce the return of the portfolio by € 10 each day. However, this trading strategy would still realize a positive six-month performance of 19.11 %, increasing the value of the portfolio from € 100,000 to € 119,114.
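The effect of a fixed daily fee on a compounding portfolio can be sketched as follows. This is an illustration under stated assumptions, not a reconstruction of Table 5: the function name `run_portfolio` is hypothetical, the daily returns are made-up numbers, and the € 10 (one buy and one sell at € 5 each) is assumed to be deducted at the end of each trading day.

```python
def run_portfolio(start_value, daily_returns, cost_per_day=0.0):
    """Compound daily strategy returns and deduct a fixed fee per trading day."""
    value = start_value
    for r in daily_returns:
        value = value * (1.0 + r) - cost_per_day
    return value

# Hypothetical strategy returns for three trading days.
sample_returns = [0.010, -0.005, 0.002]
gross = run_portfolio(100_000, sample_returns)
net = run_portfolio(100_000, sample_returns, cost_per_day=10.0)

# Total returns implied by the portfolio values reported in the text.
reported_gross_return = 121_012 / 100_000 - 1.0  # about 21.0 %
reported_net_return = 119_114 / 100_000 - 1.0    # about 19.1 %
```

Because the fee is fixed in euros rather than proportional, its relative weight shrinks as the portfolio grows and would be larger for a smaller starting amount.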

The strategy can be further improved by investing in leveraged ETFs. These funds are also easy to buy, tracking the index performance on a ratio of, for example, 2:1 or 3:1. We use the db x-trackers LevDAX ETF (ISIN LU0411075376) for long investments and the db x-trackers ShortDAX x2 (ISIN LU0411075020) in order to benefit from decreasing DAX values. The 2x leveraged ETF strategy would achieve a return of 35.63 % after transaction costs.

Next, we calculate the Sharpe Ratio, which is a common reward-to-volatility measure (Sharpe 1966):

$${\rm Sharpe\, Ratio} = \frac{{(R_{a} - R_{b} )}}{\sigma }$$
(5)

where \(R_{a}\) represents the return of an asset (DAX return in our case); \(R_{b}\) denotes the return of the riskless investment as measured by the risk-free interest rate; \(\sigma\) represents the standard deviation of the excess returns \(\left( {R_{a} - R_{b} } \right)\).

The Sharpe Ratio determines the return per unit of risk. Assuming 260 trading days, the average daily return in our case is 0.164 % or 42.60 % on an annual basis. If we further deduct the risk-free interest rate of 3 %, which is close to the long-term mean value (e.g., Hill and Ready-Campbell 2011), we obtain an excess return of 39.60 %. The standard deviation of daily returns is 0.0016 or 0.104 annualized. Thus, the Sharpe Ratio of the trading strategy is 3.8, which means that the investor is compensated well for the risk taken.
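The calculation can be retraced with the annualized figures given above. The sketch below assumes simple (non-compounded) annualization of the mean daily return, which is what the reported numbers suggest. The function names are hypothetical.

```python
def annualize_mean(daily_mean, trading_days=260):
    """Scale an average daily return to a yearly figure without compounding."""
    return daily_mean * trading_days

def sharpe_ratio(annual_return, risk_free_rate, annual_std):
    """Excess return per unit of risk, as in Eq. 5."""
    return (annual_return - risk_free_rate) / annual_std

# 0.164 % per day scales to about 42.6 % per year.
annual_from_daily = annualize_mean(0.00164)

# Annualized figures from the text: 42.60 % return, 3 % risk-free rate,
# 0.104 standard deviation.
sharpe = sharpe_ratio(0.4260, 0.03, 0.104)  # about 3.8
```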

Despite this promising performance, we are aware that there are usually other transaction costs in addition to the brokerage fee. The bid/ask spread might be severe, especially for less liquid investment products. However, this spread is virtually zero for DAX ETFs due to large turnover rates and the great competition among market makers. Operating expenses (i.e., costs for administration, portfolio management, etc.) are another part of transaction costs. However, these are very low for ETFs since there is no portfolio management in contrast to actively managed funds. For instance, the total expense ratio of the iShares DAX ETF is only 0.17 % per year. In sum, we are confident that investors can use social mood states for their investment success, even after the consideration of transaction costs.

4 Conclusions

Our results provide evidence that follower-weighted social mood levels can predict share returns. An improved WSMI of 1 % led to a 3.3 basis points DAX increase on the next trading day during our training period. This effect is persistent even if we control for other anomalies, such as calendar effects. Surprisingly, our results do not support the view that the simple aggregation of mood states of all individuals in the Twitter blogosphere is sufficient to predict the stock market. Instead, it is necessary to consider the community structure (i.e., followers). An explanation for this phenomenon might be emotional contagion among Internet users as has been shown by previous research (e.g., Kramer et al. 2014).

The missing effect of the non-weighted SMI might be explained by the fact that some investors already conduct data mining and collect messages from Social Media applications in order to buy or sell stocks according to mood levels. Mood analysis is increasingly gaining attention and a number of companies emerged in recent years, offering their clients solutions to analyze big data on the Internet. Previous research used Twitter and Facebook data primarily from the years between 2007 and 2011 (e.g., Bollen et al. 2010; Karabulut 2011). Meanwhile, many articles were published by academic journals and the media so that investors are more likely to be aware of the large potential of user-generated content on the Internet. Our sample covers a more recent time period between 2011 and 2013. Thus, while previous research regarded social mood states primarily as private data (i.e., not visible for most investors), Twitter mood could be public data by now (i.e., visible for many or large investors), making financial markets more efficient and decreasing the predictive value of Social Media applications.

The diminishing influence of Twitter messages on the stock market might be compared with other mood-related anomalies, such as the weather effect. Saunders (1993) presented evidence for a sunshine effect in the US stock market during a 100 year period, although results in the last period (1983–1989) have not been statistically significant. In addition, researchers tried to reproduce Saunders’ study in subsequent years but many of them could not find a significant relationship between weather conditions and share prices (e.g., Krämer and Runde 1997; Trombley 1997; Worthington 2009). This lack of significance might be the product of data mining strategies, which have made financial markets more efficient over the years. Our study may potentially indicate similar effects for mood states derived from Social Media applications, although at this point in time we can only speculate.

However, one has to be careful when interpreting these results. The insignificance of the SMI might also be caused by our measurement. We are confident that the German and English version of the WASTS scale is most suited to assess mood states of German Twitter users. It might, however, be problematic to use the English POMS scale to assess mood states of German native speakers due to cultural differences in emotion lexicons (e.g., Pavlenko 2008). Nevertheless, the WASTS was used here for the first time to study the influence of mood states on share returns. It deviates to some extent from other scales previously used by researchers who found significant mood effects (e.g., Bollen et al. 2010).

The consideration of social interactions among community members delivers promising results. Follower-weighted social mood states have predictive value for stock returns. Our simple trading strategy, which we applied to the German stock market, delivered returns between 19.11 and 35.63 % after the consideration of transaction costs. We were therefore able to outperform major international benchmark indices by double-digit percentage points.

Our results have strong implications for investors as well as the entire economy. The financial industry might integrate mood levels into traditional forecast models to make better trading decisions. Especially the combination of mood analysis with established capital market models would be an interesting area for future research in order to further improve forecast accuracy.

Implications of our results are not restricted to the financial industry. Future research might also investigate the relationship between social mood levels and other areas of our economy. For instance, the buying behavior of consumers seems to be influenced by emotions and feelings (Weinberg and Gottwald 1982). Researchers might predict online sales with the help of social mood levels derived from Twitter or Facebook.

Our results might be the first indication that emotional contagion caused by online messages can influence people’s behavior in the offline world, particularly the economic behavior. It therefore might be possible for Facebook, Twitter or another massive social network to manipulate the amount of positive messages shown to users in order to improve the economy. However, we cannot actually prove emotional contagion at this point in time. We can only assume the spread of mood states among Twitter users. Although there is evidence for emotional contagion on the Internet and Facebook in particular (e.g., Coviello et al. 2014; Kramer et al. 2014), the magnitude of mood transfers on Twitter still should be identified by future research projects.

Another avenue for future research would be to study intraday instead of inter-day effects of mood swings. There is already some evidence that shifts of investors’ mood states can influence share prices during the trading day (e.g., Chang et al. 2008; Lo and Repin 2002), and it would be interesting to study the influence of intraday mood swings derived from Twitter or Facebook. In addition, researchers could include other Internet sources, such as discussion boards or news sites. Especially the consideration of market news would help to compare the influence of mood states with the influence of events which occur in the real world.

Despite our promising results, our research still has some shortcomings. There may be fake messages in our sample. However, according to Twitter, only 5 % of all accounts are fake (D’Onfro 2013). Studies focusing on the predictive value of Twitter also found similar numbers of spam accounts (e.g., Conover et al. 2011). It is furthermore questionable whether these accounts actually produce fake messages which potentially pose a threat to the validity of our research.

Our dictionary approach does not consider specific features of tweets, such as emoticons and Internet slang (e.g., Bifet and Frank 2010). These features might also convey mood, which is currently not captured by our SMI and WSMI.

Our dataset for studying the influence of follower-weighted mood states is relatively small. Overall, it captures the one-year period between December 1, 2012 and November 30, 2013. Further analyses with larger datasets are required in order to confirm our results. Especially changing market phases might deliver different results of our trading strategy. We used different time periods for training and testing and therefore followed Bollen et al. (2010) as well as other authors who used data of Social Media applications (e.g., Hill and Ready-Campbell 2011). However, several researchers (e.g., Ali and Pazzani 1992; Holte et al. 1989) argue that using different market phases for training and testing might cause incorrect results due to the problem of “small disjuncts”. Therefore it might be interesting to apply our trading strategy in the real world in order to test the validity of the results.

Sentiment and mood analysis with the help of Social Media is still a relatively young research domain. However, academia and industry are increasingly aware of its huge potential for predicting company success. It is difficult to evaluate how mood analysis will change the financial industry. According to our results, the network structure should be considered when studying the relationship between mood levels and share returns. In sum, opportunities in the field of mood analysis seem to be unlimited for researchers and practitioners, which is why we expect numerous research projects over the next few years.