1 Introduction

Many studies have examined the role of investor sentiment in the stock market. Although traditional theories do not treat investor sentiment as one of the fundamental factors that affect the stock market, Shefrin and Statman (1994) show that investor sentiment can serve as one of the main drivers of stock pricing. Indeed, as DeBondt and Thaler (1985), Lakonishok et al. (1994) and Barberis et al. (1998) argue, investor sentiment appears to influence irrational or unexpected valuations that traditional theories or models leave unexplained.

A number of studies empirically examine how investor sentiment is related to stock returns. Some studies focus on the direct causal relationship between stock returns and sentiment. Jansen and Nahuis (2003) employ survey data collected under the European Commission and find that stock returns cause consumer confidence. Brown and Cliff (2004) show that stock returns cause sentiment, using survey data as direct sentiment measures. Wang et al. (2006) and Spyrou (2012) report similar results. However, Schmeling (2009) finds bi-directional causality between stock returns and consumer confidence across different countries, and Chen and Kuo (2014) show similar bi-directional causality using the Conference Board Consumer Confidence Index (CBCCI) as a proxy for consumer confidence.

Other studies investigate more general relationships. Using the Michigan survey as the consumer confidence index and the Wilshire 5000 stock index, Otoo (1999) finds that sentiment increases with stock prices and that stock returns act as a leading indicator. Fisher and Statman (2003) adopt two separate sentiment measures, the American Association of Individual Investors (AAII) survey as an individual sentiment measure and the Merrill Lynch survey as an institutional sentiment measure, and find that individual sentiment shows a significant relationship and that sentiment has some predictive power for small cap stock returns. Baker and Wurgler (2006) find that stock returns are conditional on sentiment and conclude that the asset pricing model should incorporate sentiment.

There are also studies that consider other aspects of the market. Huth et al. (1994) show that sentiment indices are useful for forecasting aggregate spending and business activity, using the University of Michigan survey as a consumer sentiment index and the Conference Board’s index of consumer confidence. Fisher and Statman (2006) use the bullish sentiment index as an individual sentiment measure and find that it can work as a market timing instrument. Liao et al. (2011) use principal component analysis to show that investor sentiment can play a critical role in explaining fund manager herding. Chen and Kuo (2014) investigate how investor sentiment affects the Eurodollar option smile and find a significant relationship between the interest rate volatility smile and sentiment. As the use of online trading increases, several studies attempt to build sentiment indices from online stock message boards that measure individual sentiment. These studies include Tumarkin and Whitelaw (2001), Antweiler and Frank (2004) and Das and Chen (2007).

Most previous studies provide their results based on a single proxy for sentiment. Baker and Wurgler (2006), however, argue that all single proxies for investor sentiment are imperfect and defective. They construct a sentiment index on the basis of the first principal component of six major sentiment proxies. Since their index represents overall sentiment, it can be considered the composite sentiment index that combines individual and institutional sentiments. For our study, we choose the composite sentiment index developed by Baker and Wurgler (2006).

Although previous studies provide a range of evidence on the role of investor sentiment, there is still room to investigate investor sentiment in the stock market. First, unlike previous studies, our study examines the relationship between investor sentiment and fundamental market variables: stock prices, dividends, and earnings. It is important to study investor sentiment with dividends and earnings as well as stock prices because they constitute fundamental market ratios and provide critical information to market participants. Second, most previous studies directly investigate the causal relationship without testing for co-integration. Many studies investigate co-integration among market variables other than investor sentiment, including Chang and Nieh (2001), Assaf (2006) and Gosh and Clayton (2006). We implement the co-integration test to identify the long run equilibrium between market variables and investor sentiment. If two variables are co-integrated, we need to incorporate an error correction term into the causality test. Third, we use real data rather than nominal data, which makes our results comparable over the whole period of our study regardless of inflation and helps identify the true relationship between market variables and investor sentiment. Fourth, we examine how investor sentiment moves with market variables when extreme market events occur. We calculate bivariate probabilities based on two copulas and see how the dependence structure between investor sentiment and market variables affects their relationship under extreme market scenarios. Lastly, we measure the predictive power of investor sentiment. We examine the predictive power of investor sentiment not only over the whole period of our study but also across sub-periods separated by structural breaks.

We describe all data in Sect. 2 and show empirical results and discussions from Sect. 3 through Sect. 6. Then, we conclude in Sect. 7.

2 Data

We use S&P 500 stock index data downloadable from Robert Shiller’s website. His data provide all real values for market prices, dividends, and earnings. For the sentiment index, we employ the composite sentiment index of Baker and Wurgler (2006) obtained from Jeffrey Wurgler’s website. His website provides changes in the composite sentiment index as well as the composite sentiment index level. All stock and sentiment data are monthly data. We select the data from 1965 to 2010 because the latest (monthly) version of the composite sentiment index covers July 1965 to December 2010. Figures 1, 2, 3 and 4 show real prices, real dividends, real earnings and the composite sentiment index, respectively.

Fig. 1 Historical real prices

Fig. 2 Historical real dividends

Fig. 3 Historical real earnings

Fig. 4 Historical sentiment index

We also collect monthly data of some economic variables from the website of the Federal Reserve Bank of St. Louis. These variables include the consumer price index, industrial production index, and 10-year Treasury note. They are used as control variables in two regressions to measure the predictive power of sentiment changes. We truncate data at the beginning or the end of the whole period if necessary for short or long-term computations.

3 Co-integration and causality

First of all, we need to implement the unit root test for all variables. We examine if variables have a unit root in their levels and differences. The typical unit root test requires the following regressions:

$$\Delta x_{t} = \beta_{0} + \beta_{1} x_{t - 1} + \mathop \sum \limits_{j = 1}^{n} \gamma_{j}\Delta x_{t - j} + \varepsilon_{t}$$
(1)
$$\Delta x_{t} = \beta_{0} + \beta_{1} x_{t - 1} + \beta_{2} t + \mathop \sum \limits_{j = 1}^{n} \gamma_{j}\Delta x_{t - j} + \varepsilon_{t}$$
(2)

where the null hypothesis is \(\beta_{1} = 0\) (\(x_{t}\) has a unit root). Equation (1) has no trend whereas Eq. (2) has a trend. Although several rules compete for determining the optimal lag length, no rule is perfect. Throughout our study, we adopt the Akaike information criterion (AIC), which has been traditionally and widely used. In addition, to reduce sensitivity to the choice of lag length, we show co-integration results for three cases: the optimal lag length, one lag more, and one lag less.
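As a rough illustration of Eq. (1), the sketch below estimates the special case with no lagged differences (n = 0) by ordinary least squares and returns the t-statistic on \(\beta_{1}\). The function name `dickey_fuller_t` and the simulated series are illustrative assumptions, not the data or software used in this study. The statistic has to be compared with Dickey–Fuller critical values rather than the usual Student t table.

```python
import math
import random


def dickey_fuller_t(x):
    """t-statistic on beta_1 in  dx_t = beta_0 + beta_1 * x_{t-1} + e_t
    (Eq. (1) with n = 0 lagged differences; an illustrative simplification)."""
    lagged = x[:-1]
    diffs = [b - a for a, b in zip(x[:-1], x[1:])]
    m = len(diffs)
    mean_lag = sum(lagged) / m
    mean_diff = sum(diffs) / m
    sxx = sum((v - mean_lag) ** 2 for v in lagged)
    sxy = sum((v - mean_lag) * (d - mean_diff) for v, d in zip(lagged, diffs))
    beta1 = sxy / sxx
    beta0 = mean_diff - beta1 * mean_lag
    rss = sum((d - beta0 - beta1 * v) ** 2 for v, d in zip(lagged, diffs))
    se_beta1 = math.sqrt(rss / (m - 2) / sxx)  # standard error of the slope
    return beta1 / se_beta1


# Simulated data (assumption): white noise shocks and their running sum.
random.seed(0)
shocks = [random.gauss(0.0, 1.0) for _ in range(500)]

random_walk = []
level = 0.0
for e in shocks:
    level += e
    random_walk.append(level)

t_walk = dickey_fuller_t(random_walk)  # series with a unit root
t_noise = dickey_fuller_t(shocks)      # stationary series
print(f"random walk: {t_walk:.2f}, white noise: {t_noise:.2f}")
```

The stationary series should produce a far more negative statistic than the random walk, which is the pattern the test relies on when comparing levels with differences.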

In Table 1, we show results for the unit root test using the Augmented Dickey–Fuller (ADF) and the Phillips–Perron (PP) methods. For variables’ levels in Panels A and B, we fail to reject the null hypotheses for all variables at both the 1 and 5 % significance levels, except for sentiment with no trend under the ADF method. In Panels C and D, the same unit root test is repeated with variables’ log differences. For sentiment differences, however, we use the changes in the composite sentiment index that Baker and Wurgler (2006) provide, which are also downloadable from Wurgler’s website. All null hypotheses with differences are rejected even at the 1 % significance level. As a result, real prices, real dividends, real earnings, and sentiment turn out to be integrated of order one.

Table 1 Unit root test

Since all variables are I(1), we investigate whether a co-integrating relationship exists between sentiment and market variables. The co-integration test identifies the existence of a long run equilibrium between variables. If two variables are co-integrated, we have to construct the vector error correction model (VECM) to conduct the causality test. Then, we can see how the two variables move toward their long run equilibrium through the error correction term.

We employ the Johansen test to identify the existence of the co-integration between sentiment and market variables. To get statistically robust results, we test the co-integration with one lag ahead of and behind the optimal lag length as well. In Table 2, R is the number of co-integrating vectors and N is the number of lags included in the test. The null hypothesis for the trace method is that the number of co-integrating vectors is less than or equal to R whereas the null hypothesis for the maximum eigenvalue method is that the number of co-integrating vectors is equal to R.

Table 2 Co-integration—Johansen test

For real dividends and sentiment, we fail to reject the null hypotheses with all three lags using both trace and maximum eigenvalue methods. This means that there is no co-integrating relationship between real dividends and sentiment. Regarding real prices and sentiment, we fail to reject the null hypotheses except only for the case that the number of lags is 11. We, however, identify the existence of the co-integration between real earnings and sentiment. With the trace method, we reject the null hypotheses with all three lags when R is 0 and fail to reject the null hypotheses when R is 1. The maximum eigenvalue method also shows the same result in the case that the number of lags is 2. As a result, only real earnings and sentiment are co-integrated.

Now, we examine the causal relationship between market variables and sentiment. Since all variables should be stationary for the Granger causality test, we use stock returns, dividend changes, earnings changes, and sentiment changes because they are all stationary as shown in Table 1. According to Engle and Granger (1987), there has to be at least one directional causality if two variables are co-integrated. Also, we should incorporate the error correction term into the causality test in order to correctly identify the causal relationship between co-integrated variables. To do this, we construct the VECM for variables that are co-integrated and the vector autoregressive (VAR) model for variables that are not co-integrated.

$$\Delta x_{t} = \alpha + \mathop \sum \limits_{j = 1}^{n} \beta_{j}\Delta x_{t - j} + \mathop \sum \limits_{j = 1}^{n} \gamma_{j}\Delta y_{t - j} + \varepsilon_{t}$$
(3)
$$\Delta x_{t} = \alpha + \alpha_{1} \widehat{e}_{t - 1} + \mathop \sum \limits_{j = 1}^{n} \beta_{j}\Delta x_{t - j} + \mathop \sum \limits_{j = 1}^{n} \gamma_{j}\Delta y_{t - j} + \varepsilon_{t}$$
(4)

where \(\alpha_{1}\) is the adjustment coefficient of the error correction term and \(\widehat{e}_{t - 1}\) is the lagged residual obtained from the co-integrating regression, \(x_{t} = \mu_{0} + \mu_{1} y_{t} + \varepsilon_{t}\). We use the VECM for earnings changes and sentiment changes and the VAR model for other variables. When co-integrated variables deviate from their long run equilibrium, the departure should be corrected through the error correction term in the VECM. We calculate F-values for the null hypothesis that \(\gamma_{j} = 0\) for all j, that is, that y does not Granger-cause x.
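The F-value for this null hypothesis can be sketched as the standard comparison of a restricted regression (the lags of \(\Delta y\) dropped) with the unrestricted regression of Eq. (3) or Eq. (4). The function name and the residual sums of squares below are hypothetical numbers chosen for illustration, not results from Table 3.

```python
def granger_f_stat(rss_restricted, rss_unrestricted, n_lags, n_obs, n_params):
    """F-statistic for H0: all n_lags coefficients on lagged dy are zero.

    rss_restricted   -- residual sum of squares without the dy lags
    rss_unrestricted -- residual sum of squares of Eq. (3) or Eq. (4)
    n_params         -- coefficients in the unrestricted regression,
                        including the intercept (and alpha_1 in the VECM)
    """
    numerator = (rss_restricted - rss_unrestricted) / n_lags
    denominator = rss_unrestricted / (n_obs - n_params)
    return numerator / denominator


# Hypothetical inputs: 2 lags, 205 usable observations, 5 coefficients.
f_value = granger_f_stat(110.0, 100.0, n_lags=2, n_obs=205, n_params=5)
print(f_value)  # (10 / 2) / (100 / 200) = 10.0
```

The statistic is compared with an F distribution with `n_lags` and `n_obs - n_params` degrees of freedom.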

In Panel A of Table 3, we do not find any significant causal relationship. In particular, the causality between dividend changes and sentiment changes is not statistically significant at all. Since the dividend payout is much more likely to be affected by the firm’s dividend policy, this finding is not surprising.

Table 3 Causality test

For stock returns and sentiment changes, the causal relationship does not appear to be statistically significant. This finding is inconsistent with previous studies (see Brown and Cliff 2004; Spyrou 2012) that report one directional causality from stock returns to sentiment changes. We may consider a very weak causality from stock returns to sentiment changes, but its F-value (1.38) is significant at best at the 20 % significance level. Since we use real values instead of nominal values, the true causal relationship between real stock returns and sentiment changes seems to be statistically insignificant.

On the other hand, Panel B of Table 3 shows that sentiment changes cause earnings changes rather than vice versa. This finding is new and interesting because it implies that we can use sentiment changes as a leading indicator for earnings changes. As Schmeling (2009) mentions, sentiment is influenced by various economic factors and news media, so the fact that sentiment changes lead earnings changes appears to be quite reasonable. The adjustment coefficient of the error correction term is negative and statistically significant for both directions. This means that whenever the variables deviate from their long run equilibrium, the error is adjusted in the opposite direction, back toward the long run equilibrium.

4 Copula and extreme movements

In this section, we investigate how sentiment changes move with stock returns, dividend changes, and earnings changes under extreme market events (e.g., a stock market crash or bubble). A copula is very useful for explaining multivariate extreme movements. We use two asymmetric Archimedean copulas: the Clayton copula and the Gumbel copula. Archimedean copulas are popular and widely used in many applications. While the Clayton copula can be used to examine extreme movements under the bearish market scenario, the Gumbel copula can be used to examine extreme movements under the bullish market scenario. According to Klugman et al. (2008), a multivariate copula is the joint distribution function of uniform random variables as follows:

$${\text{K}}\left( {u_{1} , u_{2} , \ldots , u_{p} } \right) = \Pr \left( {U_{1} \le u_{1} , U_{2} \le u_{2} , \ldots , U_{p} \le u_{p} } \right)$$
(5)

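where \(U_{i} \sim U\left( {0, 1} \right)\). According to Sklar’s theorem, for any joint distribution function F with marginal distribution functions \(F_{1}, \ldots, F_{p}\), there exists a copula K that satisfies the following equation; K is unique when the marginal distributions are continuous.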

$${\text{F}}\left( {x_{1} , x_{2} , \ldots , x_{p} } \right) = {\text{K}}\left( {F_{1} \left( {x_{1} } \right), F_{2} \left( {x_{2} } \right), \ldots , F_{p} \left( {x_{p} } \right)} \right)$$
(6)

Therefore, we can build a multivariate joint distribution from marginal distributions and a specifically defined copula. One class of copulas is the Archimedean copula, which can be easily constructed from a defined generator. The form of the Archimedean copula is as follows:

$${\text{K}}\left( {u_{1} , u_{2} , \ldots , u_{p} } \right) = \pi^{ - 1} \left( {\pi (u_{1} ) + \pi (u_{2} ) + \cdots + \pi (u_{p} )} \right)$$
(7)

where \(\uppi\left( {\text{u}} \right)\) is a generator. Since we investigate a copula-based relationship between market variables and sentiment, we use bivariate copulas.

It is common to have dependence measures associated with the copula. The most popular dependence measure is Kendall’s tau (τ) that has a mathematical link with Archimedean copulas. The definition of Kendall’s tau is as follows:

$$\uptau\left( {X, Y} \right) = \Pr \left[ {\left( {X_{1} - X_{2} } \right)\left( {Y_{1} - Y_{2} } \right) > 0} \right] - \Pr \left[ {\left( {X_{1} - X_{2} } \right)\left( {Y_{1} - Y_{2} } \right) < 0} \right]$$
(8)

where iid bivariate random variables \((X_{1}, Y_{1})\) and \((X_{2}, Y_{2})\) have marginal distribution \(F_{X}(x)\) for \(X_{1}\) and \(X_{2}\) and marginal distribution \(F_{Y}(y)\) for \(Y_{1}\) and \(Y_{2}\). The relationship between Kendall’s tau and the copula function K is given by the following equation:

$$\uptau\left( {{\text{X}}, {\text{Y}}} \right) = 4\mathop \int \limits_{0}^{1} \mathop \int \limits_{0}^{1} K\left( {u_{1} , u_{2} } \right)k\left( {u_{1} ,u_{2} } \right)du_{1} du_{2} - 1$$
(9)

where \(k\left( {u_{1} , u_{2} } \right) = \frac{{\partial^{2} K\left( {u_{1} , u_{2} } \right)}}{{\partial u_{1} \partial u_{2} }}\). The Clayton copula has greater dependence weight to the negative tail and can be constructed as follows:

$${\text{K}}\left( {u_{1} , u_{2} } \right) = Max\left[ {\left( {u_{1}^{ - \theta } + u_{2}^{ - \theta } - 1} \right)^{ - 1/\theta } , 0 } \right]$$
(10)
$$\uptau = \theta /\left( {2 + \theta } \right)$$
(11)

where \(\uppi\left( {\text{u}} \right) = - \frac{1}{\theta }\left( {1 - u^{ - \theta } } \right)\) and θ is the dependence parameter. Equation (11) is the mathematically derived relationship between Kendall’s tau and the single parameter in the Clayton copula. The Gumbel copula has greater dependence weight to the positive tail and can be derived as follows.

$${\text{K}}\left( {u_{1} , u_{2} } \right) = { \exp }\left\{ { - \left[ {( - \ln u_{1} )^{\lambda } + ( - \ln u_{2} )^{\lambda } } \right]^{1/\lambda } } \right\}$$
(12)
$$\uptau = 1 - 1/\lambda$$
(13)

where \(\uppi\left( {\text{u}} \right) = ( - \ln u)^{\lambda }\) and λ is the dependence parameter.

In Table 4, we calculate Kendall’s tau for each bivariate case and estimate the single parameter of the copula function. Panel A shows dependence parameter estimates in both Clayton and Gumbel copulas. For the bivariate case of dividend changes and sentiment changes, the dependence parameter of the Gumbel copula is not defined.
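The estimation step can be sketched by computing a sample Kendall’s tau and inverting Eqs. (11) and (13), which give \(\theta = 2\tau /(1 - \tau)\) and \(\lambda = 1/(1 - \tau)\). The function names and the toy data are illustrative assumptions. The sketch also shows one possible reason (an interpretation, not stated in the text) why a Gumbel parameter can be undefined: the Gumbel copula requires \(\lambda \ge 1\), which Eq. (13) cannot deliver when the sample tau is negative.

```python
from itertools import combinations


def kendall_tau(xs, ys):
    """Sample Kendall's tau: (concordant - discordant) / number of pairs."""
    concordant = 0
    discordant = 0
    for (x1, y1), (x2, y2) in combinations(zip(xs, ys), 2):
        sign = (x1 - x2) * (y1 - y2)
        if sign > 0:
            concordant += 1
        elif sign < 0:
            discordant += 1
    n_pairs = len(xs) * (len(xs) - 1) // 2
    return (concordant - discordant) / n_pairs


def clayton_theta(tau):
    """Invert Eq. (11): tau = theta / (2 + theta)."""
    return 2.0 * tau / (1.0 - tau)


def gumbel_lambda(tau):
    """Invert Eq. (13): tau = 1 - 1 / lambda; None when tau < 0 (lambda < 1)."""
    if tau < 0:
        return None
    return 1.0 / (1.0 - tau)


# Toy data (assumption): 8 concordant and 2 discordant pairs out of 10.
tau = kendall_tau([1, 2, 3, 4, 5], [2, 1, 4, 3, 5])
print(tau, clayton_theta(tau), gumbel_lambda(tau))
```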

Table 4 Copula parameter estimates and extreme probabilities

In Panel B of Table 4, we calculate probabilities for extreme movements using copulas. Under the bearish market scenario with the Clayton copula, the probability that stock returns and sentiment changes fall within their lowest 10th percentile at the same time is 3.13 %. This is about 2.4 times as high as the probability (1.28 %) that earnings changes and sentiment changes fall within their lowest 10th percentile at the same time. The probability that stock returns and sentiment changes fall within their lowest 1st percentile at the same time is 0.22 % which is about nine times as high as the probability (0.024 %) that earnings changes and sentiment changes fall within their lowest 1st percentile at the same time. Probabilities for dividend changes and sentiment changes are the lowest in all three percentiles. As a result, under the assumption that large losses occur in the stock market, stock returns and sentiment changes are likely to move most closely together.

On the other hand, under the bullish market scenario with the Gumbel copula, we calculate the highest nth percentile using the survival function of the copula.

$$K_{S} \left( {u_{1} , u_{2} } \right) = 1 - u_{1} - u_{2} + K\left( {u_{1} ,u_{2} } \right)$$
(14)

where \(K_{S} \left( {u_{1} , u_{2} } \right)\) is the survival function of the copula, \(K\left( {u_{1} , u_{2} } \right)\). The probability that stock returns and sentiment changes fall within their highest 1st percentile at the same time is 0.23 % which is about five times as high as the probability (0.044 %) that earnings changes and sentiment changes fall within their highest 1st percentile at the same time. In the same way, the probability that stock returns and sentiment changes fall within their highest 10th percentile at the same time is 2.92 % which is about 2.2 times as high as the probability (1.30 %) that earnings changes and sentiment changes fall within their highest 10th percentile at the same time. When we assume that large gains occur in the stock market, stock returns and sentiment changes are also likely to move most closely together.
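The joint tail probabilities can be sketched directly from Eqs. (10) and (12), using the identity \(\Pr (U_{1} > u_{1} , U_{2} > u_{2} ) = 1 - u_{1} - u_{2} + K(u_{1} , u_{2} )\) for the upper tail. The parameter values below are illustrative assumptions and are not the estimates in Table 4; the Clayton function assumes \(\theta > 0\).

```python
import math


def clayton_cdf(u1, u2, theta):
    """Clayton copula, Eq. (10); this sketch assumes theta > 0."""
    return max((u1 ** -theta + u2 ** -theta - 1.0) ** (-1.0 / theta), 0.0)


def gumbel_cdf(u1, u2, lam):
    """Gumbel copula, Eq. (12); requires lam >= 1."""
    s = (-math.log(u1)) ** lam + (-math.log(u2)) ** lam
    return math.exp(-(s ** (1.0 / lam)))


def joint_lower_tail(q, theta):
    """Pr(both variables fall within their lowest q fraction), Clayton."""
    return clayton_cdf(q, q, theta)


def joint_upper_tail(q, lam):
    """Pr(both variables fall within their highest q fraction), Gumbel."""
    u = 1.0 - q
    return 1.0 - 2.0 * u + gumbel_cdf(u, u, lam)


# Illustrative parameters (assumptions, not the paper's estimates).
print(joint_lower_tail(0.10, theta=1.0))  # 1 / 19, above the 1 % under independence
print(joint_upper_tail(0.10, lam=1.0))    # lam = 1 is independence: 0.1 * 0.1
print(joint_upper_tail(0.10, lam=2.0))    # stronger upper tail dependence
```

With \(\lambda = 1\) the Gumbel copula reduces to independence, so the joint probability equals the product of the two marginal tail probabilities; larger parameters raise the joint tail probability above that benchmark.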

As we see in the previous section, only earnings and sentiment are co-integrated and their causal relationship is formed from sentiment changes to earnings changes. When, however, we assume extreme market events that may occur in the stock market, stock returns and sentiment changes turn out to move most closely together. The dependence structure between stock returns and sentiment changes may provide important information under extreme market events.

5 Sentiment and prediction

To investigate the predictive power of sentiment changes, we adopt two approaches. One uses the incremental adjusted \(R^{2}\) as in Spyrou (2012) and Schmeling (2009), and the other uses the prediction criterion (PC) developed by Amemiya (1980). Although recent related studies employ incremental adjusted \(R^{2}\)s to examine the predictive power of sentiment changes, we also use the PC because it explicitly considers a loss function to measure prediction errors. Since dividend changes are not significantly related to sentiment changes in previous sections, we examine the predictive power of sentiment changes only for stock returns and earnings changes. We construct two separate regressions to examine the predictive power of sentiment changes for future stock returns and earnings changes.

$$\Delta y_{t + i} = \alpha_{0} + \mathop \sum \limits_{j = 1}^{n} \beta_{j}\Delta c_{jt} + \varepsilon_{t}$$
(15)
$$\Delta y_{t + i} = \alpha_{0} + \mathop \sum \limits_{j = 1}^{n} \beta_{j}\Delta c_{jt} + \delta\Delta s_{t} + \varepsilon_{t}$$
(16)

where \(\Delta y_{t + i}\) is the subsequent i-month (real) stock return or earnings change at time t, \(\Delta s_{t}\) is the sentiment change at time t, and \(\delta\) is the coefficient on the sentiment change. We use some macroeconomic variables as control factors (\(c_{jt}\)). These variables are the consumer price index (CPI), the industrial production index (IPI), and the 10-year Treasury note rate (TN). Since GDP data are available only on a quarterly or annual basis, we include IPI instead as one of the control factors.

As in Spyrou (2012) and Schmeling (2009), we check the incremental adjusted \(R^{2}\) to see if sentiment changes add some explanatory power to forecast future stock returns or earnings changes. We calculate the incremental adjusted \(R^{2}\) by examining the adjusted \(R^{2}\) of Eq. (16) relative to that of Eq. (15). We look into the predictive power of sentiment changes to forecast subsequent 1-month stock returns or earnings changes through subsequent 12-month stock returns or earnings changes. Thus, we can compare short-, medium-, and long-term predictive powers of sentiment changes.

Schmeling (2009) shows that the sentiment factor adds some explanatory power to predict future short- and medium-term stock returns by reporting incremental adjusted \(R^{2}\)s. Spyrou (2012) finds that sentiment changes add weak explanatory power to forecast stock returns in terms of incremental adjusted \(R^{2}\)s.

In Table 5, Panel A shows incremental adjusted \(R^{2}\)s, obtained by subtracting the adjusted \(R^{2}\)s of Eq. (15) from those of Eq. (16). For subsequent stock returns, sentiment changes have the most predictive power at the 1-month horizon, where the incremental adjusted \(R^{2}\) is 0.86 %. Sentiment changes also show some predictive power for subsequent 6- and 11-month stock returns, but their incremental adjusted \(R^{2}\)s are relatively small.

Table 5 Predictive power of sentiment changes—stock returns and earnings changes

For subsequent earnings changes, sentiment changes have the most predictive power for subsequent 6-month earnings changes, with an incremental adjusted \(R^{2}\) of 1.29 %. Sentiment changes also have some predictive power for subsequent 7-, 8-, 9-, and 10-month earnings changes. In other words, sentiment changes have some explanatory power for future medium-term earnings changes. Although the adjusted \(R^{2}\)s are small, this seems typical of previous studies.

On the other hand, Panel B shows decremental PCs between Eqs. (15) and (16). Since the PC should be minimized, the positive decremental PC represents the added explanatory power of sentiment changes. As we see in the 4th and 7th columns, the results shown in Panel B are very similar to those of Panel A. Particularly, we confirm that sentiment changes have the most predictive power for medium-term earnings changes.
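Both measures can be sketched from the residual sums of squares of Eqs. (15) and (16). The sketch assumes the usual adjusted \(R^{2}\) formula and the common textbook form of Amemiya’s criterion, \(PC = \frac{RSS}{T - K}\left( 1 + \frac{K}{T} \right)\), where T is the number of observations and K the number of estimated coefficients including the intercept. The function names and input numbers are hypothetical and are not taken from Table 5.

```python
def adjusted_r2(rss, tss, n_obs, n_params):
    """Adjusted R^2; n_params counts all coefficients including the intercept."""
    return 1.0 - (rss / (n_obs - n_params)) / (tss / (n_obs - 1))


def amemiya_pc(rss, n_obs, n_params):
    """Prediction criterion (assumed form): RSS / (T - K) * (1 + K / T)."""
    return rss / (n_obs - n_params) * (1.0 + n_params / n_obs)


# Hypothetical inputs: 100 observations, total sum of squares 200.
n_obs, tss = 100, 200.0
rss_15, k_15 = 180.0, 4  # Eq. (15): intercept + 3 control variables
rss_16, k_16 = 170.0, 5  # Eq. (16): Eq. (15) plus the sentiment change

incremental_r2 = (adjusted_r2(rss_16, tss, n_obs, k_16)
                  - adjusted_r2(rss_15, tss, n_obs, k_15))
decremental_pc = amemiya_pc(rss_15, n_obs, k_15) - amemiya_pc(rss_16, n_obs, k_16)

print(incremental_r2, decremental_pc)
```

A positive value of either quantity indicates that adding the sentiment change improves the regression even after the penalty for the extra coefficient.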

6 Impact of structural breaks on the predictive power of sentiment changes

Since earnings changes and sentiment changes are most closely related on the basis of the results obtained from previous sections, we attempt to identify structural breaks in real earnings and examine whether the predictive power of sentiment changes increases or decreases in sub-periods divided by structural breaks of real earnings. Structural breaks often play an important role when we investigate time series variables because permanent structural changes in the long run may influence a specific relationship between variables.

To detect structural breaks in real earnings, we employ well-known methods: Cumulative Sum Chart and Chow test. The cumulative sum chart shows detailed movements in a time series and thus, we can identify candidates for structural breaks. Then, with the Chow test, we test whether or not those breaks really cause structural changes. We calculate the cumulative sum of real earnings using the following equation:

$$SUM_{t} = SUM_{t - 1} + \left( {E_{t} - \overline{E} } \right)\quad {\text{for}}\quad t = 1, 2, \ldots , n$$
(17)

where \(E_{t}\) is earnings at time t and \(\overline{E}\) is the historical average of earnings. Starting from \(SUM_{0} = 0\), the cumulative sum ends at zero because the deviations from the average sum to zero.

As we see in Fig. 5, there is clearly a single candidate for a structural break in the time series of real earnings: October 1994, when the cumulative sum reaches its minimum. In other words, we observe a clear decreasing trend in the cumulative sum before October 1994 and a clear increasing trend after October 1994. Since we focus on structural changes over the long run, we ignore small fluctuations over the short run.
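The cumulative sum of Eq. (17) and the choice of the break candidate at its minimum can be sketched as follows. The short earnings series is a toy example, not the Shiller data.

```python
def cumulative_sum(series):
    """Eq. (17): running sum of deviations from the sample mean, SUM_0 = 0."""
    mean = sum(series) / len(series)
    sums = []
    running = 0.0
    for value in series:
        running += value - mean
        sums.append(running)
    return sums


# Toy series (assumption): below-average values followed by above-average values.
earnings = [1.0, 1.0, 1.0, 5.0, 5.0, 5.0]
cusum = cumulative_sum(earnings)

# The break candidate is the date at which the cumulative sum is lowest.
break_index = min(range(len(cusum)), key=cusum.__getitem__)
print(cusum, break_index)
```

The cumulative sum falls while observations are below the average and rises once they are above it, so its minimum marks the last below-average observation before the shift.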

Fig. 5 Historical cumulative sum of real earnings. We calculate the cumulative sum using the following equation: \(SUM_{t} = SUM_{t - 1} + \left( {E_{t} - \overline{E} } \right)\) for t = 1, 2, …, n where \(E_{t}\) is earnings at time t and \(\overline{E}\) is the historical average of earnings. The cumulative sum ends at zero starting from a zero value of \(SUM_{0}\)

To statistically confirm whether real earnings have a structural break in October 1994, we use the Chow test. In Table 6, the test statistic, 66.93, is significant at the 1 % level. This means that the two sub-periods (before and after October 1994) are structurally different. The first sub-period is considered the low earnings period because earnings lie below the historical average, so the cumulative sum decreases, whereas the second sub-period is considered the high earnings period because the cumulative sum increases. Although it is not the purpose of our study to explain why earnings have structurally changed since the break date, the change may be attributed to significant developments in the corporate world (e.g., the fourth wave of mergers and acquisitions and remarkable progress in the IT industry) that have been under way since the mid-1980s.
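The mechanics of the Chow test can be sketched as below. The regression specification behind Table 6 is not stated in the text, so this sketch assumes a simple regression of the series on a linear time trend (k = 2 coefficients); the function names and the artificial series are hypothetical.

```python
def ols_rss(ts, ys):
    """Residual sum of squares from regressing ys on an intercept and ts."""
    n = len(ts)
    mean_t = sum(ts) / n
    mean_y = sum(ys) / n
    stt = sum((t - mean_t) ** 2 for t in ts)
    sty = sum((t - mean_t) * (y - mean_y) for t, y in zip(ts, ys))
    slope = sty / stt
    intercept = mean_y - slope * mean_t
    return sum((y - intercept - slope * t) ** 2 for t, y in zip(ts, ys))


def chow_f(ys, break_index, k=2):
    """Chow F-statistic for a break at break_index (trend regression assumed)."""
    ts = list(range(len(ys)))
    rss_pooled = ols_rss(ts, ys)
    rss_split = (ols_rss(ts[:break_index], ys[:break_index])
                 + ols_rss(ts[break_index:], ys[break_index:]))
    n = len(ys)
    return ((rss_pooled - rss_split) / k) / (rss_split / (n - 2 * k))


# Artificial series (assumption): a trend reversal at t = 10 versus no reversal,
# each with a small alternating perturbation so the residuals are not zero.
v_shape = [abs(t - 10) + 0.1 * (-1) ** t for t in range(20)]
straight = [t + 0.1 * (-1) ** t for t in range(20)]

f_break = chow_f(v_shape, 10)
f_no_break = chow_f(straight, 10)
print(f_break, f_no_break)
```

The statistic is large when separate regressions on the two sub-periods fit much better than one pooled regression, and it is compared with an F distribution with k and n − 2k degrees of freedom.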

Table 6 Structural break in real earnings

Using Eqs. (15) and (16), we re-calculate incremental adjusted \(R^{2}\)s and decremental PCs in both the low and high earnings periods. In Table 7, we find that the predictive power of sentiment changes increases considerably for subsequent medium-term (6-, 7-, 8-, 9-, and 10-month) earnings changes in the high earnings period. Those incremental adjusted \(R^{2}\)s range from 1.66 to 4.05 %. These values are sizable relative to the incremental adjusted \(R^{2}\)s reported in Spyrou (2012), which range from 0.2 to 2.41 %, and those reported in Schmeling (2009), which range from 1 to 4 %. The decremental PCs in the high earnings period range from 0.37 basis points to 2.43 basis points for subsequent medium-term earnings changes. This is consistent with the results shown in Panel A. In fact, positive incremental adjusted \(R^{2}\)s and decremental PCs in the high earnings period are overwhelmingly concentrated in subsequent medium-term earnings changes.

Table 7 Predictive power of sentiment changes—high and low earnings periods

Consequently, incremental adjusted \(R^{2}\)s and decremental PCs for medium-term earnings changes increase markedly in the high earnings period after the structural change. Thus, sentiment changes seem to be asymmetrically more sensitive to high earnings. If we are to use sentiment changes as a leading indicator for earnings changes, it may be critical to recognize whether earnings are in a high or low period.

7 Conclusion

Most previous studies employ various single proxies for investor sentiment and focus on investigating the relationship between stock returns and sentiment changes. To overcome the imperfection of single sentiment proxies, we use the composite sentiment index developed by Baker and Wurgler (2006). Our study is different from previous studies in several aspects.

First, we examine how sentiment is related to dividends and earnings as well as stock prices. These market variables are important because they form fundamental market ratios and provide critical information to market participants. In addition, we use real data rather than nominal data to draw the true relationship between market variables and sentiment. Second, we examine not only the causal relationship but also the co-integrating relationship. If two variables are co-integrated, their causality should be tested with an error correction term. Third, we show how sentiment changes move together with stock returns, dividend changes, and earnings changes under extreme market scenarios. This provides information about the relative dependence structure associated with extreme values. Fourth, we investigate the predictive power of sentiment changes for stock returns and earnings changes and examine whether that predictive power responds asymmetrically to structural changes.

We discover that sentiment is co-integrated with earnings rather than stock prices or dividends and the causal relationship is formed from sentiment changes to earnings changes. Thus, sentiment changes appear to lead earnings changes. Sentiment changes, however, tend to move more closely together with stock returns under extreme market scenarios. This means that the relationship between stock returns and sentiment changes can be better explained by the dependence structure of their extreme movements.

The predictive power of sentiment changes is larger for future earnings changes than future stock returns and tends to be concentrated in future medium-term earnings changes. Also, when the entire period is divided into the low earnings period and the high earnings period by the break date identified in real earnings time series, the predictive power of sentiment changes decreases in the low earnings period and increases in the high earnings period. Our study provides a new insight to stock market participants.