1 Introduction

Information is an essential part of the financial market and plays a prominent role in asset pricing (Dyck and Zingales 2003; Boudoukh et al. 2013; Engelberg, Reed, and Ringgenberg 2012). In informationally efficient financial markets, prices respond to new information as soon as it becomes available. However, before the stock price fully reflects public information, the information must pass through the whole transmission path, which consists of release, dissemination, and reception. First, firms need to release and disseminate all relevant information (information supply). Then, investors need to receive and digest this information (information demand) and incorporate it into their financial decision-making (Wang 2018; Schroff et al. 2016). Information supply and demand are thus the two key factors that drive stock price movements (Vlastakis et al. 2012; Hu et al. 2017; Chronopoulos et al. 2018).

According to the agency-theory framework of Jin and Myers (2006), managers tend to withhold bad news for various reasons, which creates information asymmetry between inside managers and outside investors. When the accumulated bad news reaches a threshold level, managers must release it, and all negative firm-specific news becomes public at once, leading to an immediate crash. Crash risk in security prices has attracted increasing attention in recent years. A large body of literature investigates the determinants of stock price crash risk and finds that it is associated with several factors, such as opaque financial reports (Hutton et al. 2009), corporate social responsibility (Kim et al. 2014), corporate tax avoidance (Kim et al. 2011a, b), ownership by institutional investors (An and Zhang 2013), CEO age (Andreou et al. 2017), stock liquidity (Chang et al. 2017), and media coverage (Aman 2013). More recently, researchers have also found that stock price crash risk is related to investor sentiment (Yin and Tian 2015) and investor attention (Wen et al. 2019). However, to the best of our knowledge, the question of how the diffusion of firm-specific information released by a firm affects intraday price movements remains unanswered.

This paper investigates the role of information supply and demand in intraday stock price crash risk. A growing, but still limited, number of studies have investigated the relationship between information supply, information demand, and stock pricing. The existing literature provides limited evidence of the effects of information demand on stock market volatility (Vlastakis et al. 2012), on stock returns (Wang 2018), and on return predictability (Chronopoulos et al. 2018). However, these studies all focus on normal stock returns and ignore extreme price changes. Aman (2013) examines the relationship between media coverage and price crashes and jumps using weekly data, but in an era of high-frequency trading, intraday crash risk deserves closer study. Our study therefore explores how daily firm-specific information supply and demand affect intraday price crashes and jumps.

We use data from China for our empirical study, for two reasons. First, compared with Western developed markets, the Chinese stock market is dominated by individual investors. According to Gao and Yang (2018), more than 60% of market participants in China are individual investors. On the one hand, individual investors rely more heavily on public information released by firms when making financial decisions. On the other hand, the information demand of individual investors is easier to measure. Second, although regulators have continually raised regulatory standards and tried to improve market transparency, investors in the Chinese financial market are still troubled by information asymmetry. Low information quality, failure to disclose firm-specific information, and information fraud persist in the Chinese financial market, and all of these affect stock price crashes. Thus, investigating the effect of information supply and demand on intraday price crash risk in the Chinese financial market can help us understand the information diffusion process and improve market efficiency.

Our study makes three contributions. First, in studying the determinants of stock price crashes at the firm level, we focus on the intraday effect. The prior literature mentioned above (Aman 2013; Hutton et al. 2009; Kim et al. 2014) bases its analysis of crashes on weekly returns. In the modern trading environment, security prices react much faster to news (Cui and Gozluklu 2016), and an increasing number of studies investigate price movements from an intraday, high-frequency perspective. Examples include examinations of intraday momentum (Elaut et al. 2018; Zhang et al. 2019; Chu et al. 2019) and intraday liquidity (Mazza 2015). We employ 1-min return data to examine extreme intraday price changes and further investigate the determinants of intraday price crashes and jumps. Our examination of intraday crashes and jumps can be valuable for the broader finance literature because extreme intraday downside risk has attracted growing attention in asset pricing research (Kirilenko et al. 2017; Brogaard et al. 2014).

Second, we jointly use attention measures from the information supply side and the information demand side to examine the effect of information diffusion on intraday price crashes and jumps. In this paper, we employ the number of articles published in the Baidu News channel as the proxy for information supply and use search data as the proxy for individual information demand. There are two reasons for using Internet data. First, the Internet has revolutionized the financial market (Antweiler and Frank 2004; Rubin and Rubin 2010; Moussa et al. 2017). Second, Baidu is the largest Chinese-language search engine in the world, and its information-collecting process is regular and credible.

Finally, we revisit the relationship among information supply, information demand, and extreme intraday price changes by examining whether it varies across bull and bear market phases. Gordon and St-Amour (2000) suggest that investors’ attitudes and preferences toward stock market risk are related to bull and bear markets. Several empirical studies provide evidence that the relationship between risk and return depends on whether the market is in a bull or bear period (Chen 1982; Wu and Lee 2015). Thus, it is interesting to examine whether market conditions affect the relationship between information flow and intraday crashes and jumps.

The remainder of this paper is organized as follows. Section 2 reviews the relevant literature and outlines the research background. In Sect. 3, we discuss the datasets for measuring information supply and demand and describe the measures of crash frequency. This section also provides a preliminary descriptive analysis. Section 4 presents our main empirical results. The final section concludes the paper.

2 Literature Review and Hypothesis Development

2.1 The Literature on Stock Price Crash Risk

A considerable body of literature has focused on firm-level stock price crash risk, which is defined as the negative skewness of return distributions. These studies show that corporate managers often possess more private information about firm operations, asset values, and development prospects than outside investors. In particular, managers tend to withhold or delay the disclosure of bad news for an extended period. This tendency arises mainly from managerial incentives such as protecting their own careers (Graham, Harvey, and Rajgopal 2005; Kothari et al. 2009), maintaining the esteem of peers (Ball 2009), and increasing the short-term value of their options (Kim et al. 2011a). If managers successfully prevent negative information from flowing into the market, uninformed investors overvalue the stock, and the distribution of stock returns becomes asymmetric (Hutton et al. 2009; Kothari et al. 2009). When the accumulated bad news reaches a threshold, all the negative firm-specific information becomes public at once, leading to a large drop in the stock price.

Regarding the determinants of crash risk, one stream of the existing literature focuses on the agency framework. Jin and Myers (2006) argue that information asymmetries between corporate managers and outside investors can cause crash risk. Using earnings management as a measure of opacity, Hutton et al. (2009) find that opaque firms are more prone to stock price crashes. Corporate tax avoidance is another technique for managing earnings and thereby withholding bad news. Kim et al. (2011a, b) find that corporate tax avoidance increases crash risk, which is consistent with the view that aggressive tax strategies allow managers to conceal negative information, thereby causing crashes. As a non-financial reporting activity, corporate social responsibility disclosure is a kind of voluntary disclosure tool. Kim et al. (2014) show that firms with better corporate social responsibility have lower crash risk.

Another stream of literature relates internal corporate governance mechanisms to crash risk. Better corporate governance plays an important role in financial disclosure and reporting quality (Bedard et al. 2004; Larcker et al. 2007) and hence in reducing crash risk (Andreou et al. 2016). Chen et al. (2017) focus on the strength of internal control in Chinese firms and find that high-quality internal control alleviates crash risk. Callen and Fang (2013) report a negative relationship between institutional investor stability and stock price crashes. Xu et al. (2017) find that firms with disproportionately more analysts herding in their coverage are associated with higher crash risk in China. Aman (2013) investigates the link between media coverage and stock price crashes and jumps using data from Japanese stock markets and newspaper articles. The evidence indicates that media coverage has no effect on price jumps but has a positive effect on price crashes. Other determinants examined in the literature include political connections (Lee and Wang 2017), religiosity (Callen and Fang 2015b), stock liquidity (Chang et al. 2017), and investors’ attention (Wen et al. 2019).

However, in the Chinese stock market, which is dominated by retail investors, the role of retail investors’ information demand and online information supply in intraday crash risk is not clear. This study employs high-frequency data to extend the literature on crash risk.

2.2 The Literature on Information Supply

Information flow plays an important role in the financial market. Many existing studies rest on the hypothesis that market activity, such as return volatility and trading volume, is directly related to the rate of information arrival in the market. The link between information flow and the financial market stems from the so-called “Mixture of Distribution Hypothesis” (MDH) (Clark 1973; Richardson and Smith 1994). The MDH explains the observed association between volatility and trading volume by positing a joint dependence of both volume and returns on a latent information process.

Since information flow cannot be directly observed, finding a suitable proxy for it is crucial when empirically studying its effect on financial markets (Vlastakis et al. 2012). Using the number of macroeconomic and firm-specific news announcements in the Wall Street Journal, Mitchell and Mulherin (1994) find that the pattern of information flow is consistent with the behavior of asset prices. They also find evidence of a statistically significant relationship between information and volume. Barber and Odean (2001) emphasize how technological development and growing Internet use affect financial markets. The Internet has become the main channel through which listed companies release information, and investors also rely more on the Internet to obtain information for making decisions (Rubin and Rubin 2010). Antweiler and Frank (2004) find that web information has a significant effect on stock returns.

The information supply literature most related to this study concerns online media reports and stock returns. Tetlock (2007) uses linguistic analysis to gauge the sentiment of news articles and finds that a pessimistic tone predicts downward pressure on prices followed by a reversal. Fang and Peress (2009) document a persistent no-media-coverage premium; that is, stocks without media coverage earn higher cross-sectional returns. Aman (2013) investigates the link between media coverage and stock price crashes and jumps in the Japanese stock market and finds that media coverage can induce a higher crash frequency. Moussa et al. (2017) evaluate the impact of information demand and supply on stock market returns and volatility and find that the effect of public information is conditioned by the nature of the disclosure. However, the relationship between online firm-specific information supply and extreme intraday stock price changes has not yet been studied empirically.

2.3 The Literature on Information Demand

Information demand plays a crucial role in ensuring that public information is obtained by investors and ultimately reflected in asset prices. Merton (1987) introduces the concept of investor recognition and provides a theoretical framework for studying the relationship between information demand and stock markets. He finds that an increase in investor information demand leads to positive price pressure in the short term and lower returns in the long term. Barber and Odean (2008) suggest that retail investors tend to search for more firm-related information when selecting stocks. Retail investors can act on positive expectations about any company by buying its stock; however, their negative expectations cannot be translated into trades because of short-selling constraints. This explains why investor information demand leads to temporary upward pressure on the stock price. Building on this evidence, several other theoretical studies highlight the importance of information demand in investors’ trading behavior and investigate the effect of information flow on market activity (Peng and Xiong 2006; Tetlock 2010; Latoeiro et al. 2013).

Empirical studies of information demand remain challenging because of the lack of a direct measure. Barber and Odean (2008) and Gervais (2001) use high trading volume and extreme returns as proxies for investor attention, because these two indirect measures are associated with the arrival of information. More recently, Internet search queries have been widely adopted in a growing body of literature. Da et al. (2011) use Google search volume to measure individual investors’ attention and find short-term upward price pressure in the stock market, which is consistent with Barber and Odean (2008). Vlastakis and Markellos (2012) find a positive relationship between information demand and both trading activity and volatility. Aouadi et al. (2013) find that Google search volume is a reliable proxy for investor attention and suggest that investor attention is strongly correlated with trading volume, stock market liquidity, and volatility. Chronopoulos et al. (2018) use the daily Internet search volume index from Google as the proxy for information demand and show that investor information demand can improve the volatility forecasts of GARCH models.

2.4 Hypothesis Development

There are two conflicting hypotheses about the effect of information supply on crash risk. The first is the crash-reducing hypothesis, which holds that information supply smooths the incorporation of firm-specific information into prices. This suggests that price crashes decrease with firm-specific information supply. Information asymmetries between insiders and outsiders of the firm, and between professional and retail investors, are mitigated when firm-specific information is released continuously (Raimondo 2019). Hutton et al. (2009) show that improved accounting transparency helps reduce price crashes. Indeed, not only accounting transparency but also other types of information supply are important for mitigating extreme falls in stock prices. Chan (2003) demonstrates that media reports are important for price movements but finds that prices tend to drift slowly after bad news. Fang and Peress (2009) show that press coverage improves the informational efficiency of the US stock market. Song et al. (2016) find that banks experience fewer crashes when their information environment is more transparent. Kim et al. (2014) show that better corporate social responsibility disclosure can reduce information asymmetry and crash frequency.

The second is the crash-inducing hypothesis, which holds that information disclosure exacerbates extreme falls in stock prices; that is, firm-specific information supply increases crash frequency. Several studies provide evidence that the media exacerbate investors' irrationality instead of limiting it. Media coverage may generate irrational attention to a certain firm, and attention-grabbing information may lead to trading based on misplaced expectations and cause extremely large market responses (Hong and Stein 2007; Raimondo 2019). Chan et al. (2001) find that daily volatility increases with salient public political news in the Hong Kong stock market. Tetlock (2007) shows that the content of newspaper articles affects investors’ behavior and induces downward pressure on the stock price. Xu et al. (2013a, b) find that stock price crash risk increases with a firm’s analyst coverage in the Chinese stock market, and that this effect is more pronounced when analysts are more optimistic. In addition, Aman (2013), using data from Japanese stock markets and newspaper articles, finds that intensive media reports on a firm cause extremely large market reactions to corporate news.

Based on the above discussion, we propose the following competing hypotheses:

Hypothesis 1a

Corporate information supply is positively associated with intraday stock price crash risk.

Hypothesis 1b

Corporate information supply is negatively associated with intraday stock price crash risk.

Information is the most important and valuable scarce resource in the financial market. The fact that information is publicly available does not imply that it is instantaneously received by all market participants. Investors must expend effort to obtain information through various channels (Drake et al. 2012). With the development of technology, more market participants use online brokerage firms and no longer need professional advice from traditional brokers. This leads investors to rely more on the Internet to acquire information for making decisions (Barber and Odean 2001). When investors become aware of pending bad news, they express their demand for public information and seek more information to help them make the correct decision. If investors acquire more useful information about a stock, the information asymmetry between the firm and investors is mitigated (Ding and Hou 2015; Gao et al. 2018). In that case, we can expect individual investors' information demand to reduce stock price crash risk. However, if investors cannot obtain valuable information from an Internet search, information asymmetry leads to more trading between informed and uninformed investors (Chen et al. 2001), which would increase crash risk. We therefore propose the following competing hypotheses:

Hypothesis 2a

Individual information demand is positively associated with intraday stock price crash risk.

Hypothesis 2b

Individual information demand is negatively associated with intraday stock price crash risk.

3 Sample and Variable Construction

3.1 The Sample

The CSI 300 index consists of the 300 largest and most liquid A-share stocks and reflects the overall performance of the China A-share market. The index is within the scope of the IOSCO Assurance Report of 30 September 2018 and is the most important index in the Chinese stock market. Our sample consists of all listed firms included in the CSI 300 index from January 2013 through April 2019, excluding delisted firms and firms with insufficient online media and Internet search data, because some firms have been listed for too short a time to be included promptly in the Baidu Search Index and the Baidu News channel. In addition, similar to earlier studies, we exclude financial services and utility firms because the financial characteristics of these industries differ from those of other industries (Kim, Li and Zhang 2011a, b; Andreou et al. 2016; An and Zhang 2013). As Table 1 shows, our final sample includes 417 firms with 571,419 firm-day observations, and the sample period runs from January 1, 2013 to April 1, 2019.

Table 1 Sample development, industry membership, and fiscal years of sample

3.2 Data and Variables

3.2.1 Crash and Jump Variables

Crash risk, defined as a remote, negative outlier in a firm’s residual stock return (Jin and Myers 2006), captures asymmetry in risk and is important for investment decisions and risk management. To construct our crash risk measures, we calculate 1-min logarithmic returns from high-frequency intraday stock price data acquired from the Wind Financial Database. Following the expanded index model regression of Hutton et al. (2009) and Chang et al. (2017), we calculate the firm-specific returns per minute (residual returns) for each firm on each day:

$${r}_{i,m}={\beta }_{0}+{\beta }_{1}{r}_{Mkt,m-1}+{\beta }_{2}{r}_{Ind,m-1}+{\beta }_{3}{r}_{Mkt,m}+{\beta }_{4}{r}_{Ind,m}+{\beta }_{5}{r}_{Mkt,m+1}+{\beta }_{6}{r}_{Ind,m+1}+{\varepsilon }_{i,m}$$
(1)

where \({r}_{i,m}\) is the return on stock \(i\) in minute \(m\), \({r}_{Mkt,m}\) is the return on the CSMAR value-weighted A-share market index, \({r}_{Ind,m}\) is the return on China Securities Index’s value-weighted industry index, and \({\varepsilon }_{i,m}\) is the error term. Compared with the basic CAPM market model, we add lead and lag market and industry index returns to account for nonsynchronous trading (Dimson 1979). This regression separates firm returns into two components: the returns due to market-wide and industry-wide movements, and the firm-specific returns captured by the regression residuals. In this study, we focus on firm-specific returns. We define the firm-specific minutely return for firm \(i\) in minute \(m\) (\({M}_{i,m}\)) as the natural logarithm of 1 plus the residual (\({M}_{i,m}=\mathrm{ln}(1+{\varepsilon }_{i,m})\)).
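As a minimal sketch of how Eq. (1) could be estimated for one firm-day, the following code fits the regression by ordinary least squares and converts the residuals into firm-specific minutely returns. The simulated return series, the choice of 240 minutes, and the helper names `solve_linear_system` and `firm_specific_returns` are illustrative assumptions and are not the data or code used in this study.

```python
import math
import random


def solve_linear_system(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    # Augmented matrix [a | b], copied so the inputs are not modified.
    aug = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        tail = sum(aug[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (aug[i][n] - tail) / aug[i][i]
    return x


def firm_specific_returns(r_firm, r_mkt, r_ind):
    """Return (betas, M) for one firm-day, following Eq. (1).

    Only minutes 1..T-2 are used so that every lag and lead exists.
    """
    rows, y = [], []
    for m in range(1, len(r_firm) - 1):
        rows.append([1.0,
                     r_mkt[m - 1], r_ind[m - 1],   # lagged terms
                     r_mkt[m], r_ind[m],           # contemporaneous terms
                     r_mkt[m + 1], r_ind[m + 1]])  # lead terms
        y.append(r_firm[m])
    k = len(rows[0])
    # Normal equations (X'X) beta = X'y.
    xtx = [[sum(row[i] * row[j] for row in rows) for j in range(k)]
           for i in range(k)]
    xty = [sum(row[i] * yi for row, yi in zip(rows, y)) for i in range(k)]
    betas = solve_linear_system(xtx, xty)
    resid = [yi - sum(b * x for b, x in zip(betas, row))
             for row, yi in zip(rows, y)]
    # Firm-specific minutely return: M = ln(1 + residual).
    return betas, [math.log(1.0 + e) for e in resid]


# Simulated 1-min returns (assumed data, for illustration only).
random.seed(0)
T = 240  # illustrative number of trading minutes in a day
r_mkt = [random.gauss(0.0, 0.0005) for _ in range(T)]
r_ind = [random.gauss(0.0, 0.0005) for _ in range(T)]
r_firm = [0.8 * a + 0.3 * b + random.gauss(0.0, 0.0003)
          for a, b in zip(r_mkt, r_ind)]

betas, M = firm_specific_returns(r_firm, r_mkt, r_ind)
```

In this sketch the first and last minutes of the day are dropped because their lag or lead is unavailable; how the boundary minutes are treated in practice is a design choice that the text does not specify.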

We measure the likelihood of crashes or jumps by the number of firm-specific minutely returns that fall more than 3.09 standard deviations below or above their mean value, respectively. The value 3.09 is chosen to generate a frequency of 0.1% in each tail of the normal distribution. The first crash measure, crash frequency (Crash Freq), is the number of firm-specific minutely returns falling 3.09 standard deviations below the mean firm-specific minutely return for that trading day. The alternative crash measure, the crash dummy (Crash), equals 1 if a firm experiences one or more firm-specific minutely returns that are 3.09 standard deviations below the mean firm-specific minutely return over the trading day; otherwise, Crash equals zero.

Jumps are widely used as the counterpart of crashes to capture extreme positive price movements. Analogously, Jump Freq is the number of jump minutes in a trading day, and Jump is an indicator variable equal to 1 if a firm experiences one or more jump minutes over the trading day and 0 otherwise. Specifically, jump minutes are those in which a firm experiences firm-specific minutely returns that are 3.09 standard deviations above the mean firm-specific minutely return over the trading day.
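The crash and jump measures defined above can be sketched as follows for a single firm-day. The function names, the use of the sample standard deviation, and the toy return series are assumptions made for illustration; the code also checks that 3.09 standard deviations correspond to a one-sided normal tail probability of roughly 0.1%.

```python
import math
import statistics

THRESHOLD = 3.09  # number of standard deviations used in the text


def normal_tail_probability(z):
    """One-sided tail probability P(Z > z) of a standard normal variable."""
    return 0.5 * math.erfc(z / math.sqrt(2.0))


def crash_jump_measures(m_returns, threshold=THRESHOLD):
    """Count crash and jump minutes for one firm-day."""
    mean = statistics.fmean(m_returns)
    sd = statistics.stdev(m_returns)  # sample standard deviation (assumed)
    crash_freq = sum(1 for r in m_returns if r < mean - threshold * sd)
    jump_freq = sum(1 for r in m_returns if r > mean + threshold * sd)
    return {
        "CrashFreq": crash_freq,
        "Crash": int(crash_freq >= 1),
        "JumpFreq": jump_freq,
        "Jump": int(jump_freq >= 1),
    }


# Toy firm-day: small alternating returns plus one sharp negative minute.
example = [0.0001, -0.0001] * 118 + [-0.005]
measures = crash_jump_measures(example)
tail = normal_tail_probability(THRESHOLD)
```

For this toy series the single sharp negative minute is the only observation beyond the lower threshold, so the firm-day counts as a crash day but not as a jump day.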

We follow the algorithm of Anas et al. (2008) to identify the different market cycle phases. The detailed procedure is as follows: (1) A first candidate set of turning points in the time series of the CSI 300 Index (\({y}_{t}\)) is identified with the Bry and Boschan (1971) algorithm: there is a peak at \(t\) when \(\left\{{y}_{t}>{y}_{t-k}, {y}_{t}>{y}_{t+k}, k=1,\dots ,K\right\}\) and a trough at \(t\) when \(\left\{{y}_{t}<{y}_{t-k}, {y}_{t}<{y}_{t+k}, k=1,\dots ,K\right\}\), where \(K=2\) for quarterly time series, \(K=5\) for monthly time series, and \(K=120\) for daily time series. (2) Turning points within six months of the beginning or end of the series are disregarded. (3) To ensure that peaks and troughs alternate, the lowest value is chosen if there is a double trough, and the highest value is chosen if there is a double peak. (4) A phase of the market cycle must last at least 6 months, and a complete market cycle must have a minimum duration of 15 months. (5) The deepness of each phase must be larger than 0.005. The deepness is calculated as \(Deepness=\frac{|{X}_{P}-{X}_{T}|}{{X}_{T}}\), where \({X}_{P}\) and \({X}_{T}\) are, respectively, the values of the series at the peak and trough of the market cycle under consideration. We run this algorithm to locate the bull and bear market phases in our sample period; the resulting phases are presented in Fig. 1 and reported in detail in Table 2.
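Steps (1), (3), and (5) of the procedure above can be sketched as follows. This is a simplified illustration rather than the full Anas et al. (2008) or Bry and Boschan (1971) algorithm: the duration rules in steps (2) and (4) are omitted, and the function names and the short toy series with \(K=2\) are assumptions.

```python
def candidate_turning_points(y, K):
    """Step (1): local peaks and troughs over K observations on each side."""
    points = []
    for t in range(K, len(y) - K):
        neighbours = ([y[t - k] for k in range(1, K + 1)]
                      + [y[t + k] for k in range(1, K + 1)])
        if all(y[t] > v for v in neighbours):
            points.append((t, "peak"))
        elif all(y[t] < v for v in neighbours):
            points.append((t, "trough"))
    return points


def enforce_alternation(points, y):
    """Step (3): keep the highest of a double peak, the lowest of a double trough."""
    kept = []
    for t, kind in points:
        if kept and kept[-1][1] == kind:
            prev_t = kept[-1][0]
            better = y[t] > y[prev_t] if kind == "peak" else y[t] < y[prev_t]
            if better:
                kept[-1] = (t, kind)
        else:
            kept.append((t, kind))
    return kept


def deepness(x_peak, x_trough):
    """Step (5): |X_P - X_T| / X_T."""
    return abs(x_peak - x_trough) / x_trough


# Toy index series (assumed values); K = 2 keeps the example short.
y = [10, 11, 12, 13, 12, 11, 10, 9, 10, 11, 12, 14, 12, 11, 10]
turning_points = enforce_alternation(candidate_turning_points(y, K=2), y)
first_phase_deepness = deepness(y[3], y[7])
```

In the toy series the peak at index 3 and the trough at index 7 delimit a bear phase with a deepness of 4/9, well above the 0.005 threshold in step (5).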

Fig. 1 Bull and bear market phases for each cycle from January 1, 2013 to April 1, 2019

Table 2 Bull and bear market phases for each cycle

In Table 3, Panel A reports that 57.6% of the firm-days in our sample experience at least one crash (329,165 firm-days) and 69.86% experience at least one jump (399,064 firm-days). Panel B compares the intraday crash and jump frequencies in the two market cycle phases, the bull market phase and the bear market phase. As Panel B shows, 56.12% of firm-days experience at least one crash (178,711 firm-days) in the bull market phase and 59.47% (150,454 firm-days) in the bear market phase. In addition, 70.04% of the firm-days experience at least one jump (223,008 firm-days) in the bull market phase, and 69.58% (176,038 firm-days) in the bear market phase.

Table 3 Crashes and jumps, frequency and the returns per minute

Panel C reports the mean raw minutely returns for two subsamples of firm-minutes: Crash minutes and Jump minutes. The third column reports the average raw minutely return for individual firms: the mean minutely return is -0.64% for Crash minutes and 0.69% for Jump minutes. The fourth column reports the corresponding statistics for the market index. In this panel, Crash (or Jump) minutes refer to any minute in which any firm in the sample crashes (or jumps). The last column reports the statistics averaged across industries. If any firm in an industry crashes (or jumps) in a given minute, that minute is defined as a crash (or jump) minute for the industry.

3.2.2 Measuring Information Supply

To capture the impact of information supply, we use the Baidu Media Index, the number of news items collected by the Baidu News channel whose headlines contain a specified keyword, as the proxy for information supply. The Baidu News channel features articles from around 400 online Chinese-language news sources, including the largest portals in China (e.g., 163.com, sohu.com, sina.com, chinanews.com, and yahoo.com), the main financial websites (e.g., China Securities Journal (www.cs.com), Shanghai Securities News (www.cnstock.com), Securities Times (www.stcn.com), Securities Daily (www.zqrb.com), and Weekly on Stock (www.hongzhoukan.com)), and the websites of popular daily newspapers and weekly magazines. It is important to note that the same headline from different news sources may be counted repeatedly in the Media Index. However, this reproduction of a news item across many sources reflects precisely the relative importance of that news in the Chinese market as a whole. Ticker symbols in the Chinese stock market are unique and consist of six digits. However, the firm name (Chinese abbreviation) matches Chinese individual investors’ search habits much better than the six-digit ticker symbol (Zhang et al. 2013; Shen et al. 2017). Thus, we use the stock name (Chinese abbreviation) rather than the stock ticker as the keyword and collect the daily Media Index from the Baidu Index website with a crawler program for the period from January 1, 2013 to April 1, 2019.

3.2.3 Measuring Information Demand

To measure information demand, we employ the volume of Internet search queries provided by the Baidu Search Index. Baidu Index is a keyword-search tool launched by Baidu, the largest search engine in China. Similar to Google Trends, Baidu Index is calculated from the frequency with which Internet users search for keywords through the Baidu search engine. There are two keyword methodologies for capturing investors’ information demand for individual stocks. One identifies search volume by company ticker symbol (Da, Engelberg, and Gao 2011; Drake et al. 2012); the other uses ordinary company names (Bank, Larch, and Peter 2011; Fan et al. 2017). For Chinese keyword analysis, the stock name is a better keyword than the stock ticker. The main reason is that Chinese retail investors are more likely to use a stock's Chinese name when searching for information on the Internet, and Baidu does not provide the search volume for stock tickers. For each stock in our sample, we manually collect the search volume (information demand). Over the period for which search volume is available, Baidu reports weekly and daily search volume for PC, mobile, and in total. In our analysis, we focus on the daily total search volume. Of the 608 listed firms included in the CSI 300 Index from 2013 to 2019, we obtain daily data from Baidu Index for 541 stocks in the sample period.

We note that the raw Baidu search volume index (BSVI) varies across the days of the week; specifically, Baidu search volume is considerably lower on weekends than on weekdays. To remove potential day-of-the-week effects, we model the expected level of BSVI. Following Drake et al. (2012), we measure the expected level of search volume separately for each day of the week as the average raw search volume for the same day of the week over the prior 10 weeks. We then calculate the abnormal search volume (\(Ab\_BSVI\)) for firm \(i\) on day \(t\) as the raw BSVI minus this average, scaled by the same average. The calculation is given as follows:

$${Ab\_BSVI}_{i,t}=\frac{{BSVI}_{i,t}}{{\stackrel{-}{BSVI}}_{i,t}}-1$$
(2)
$${\stackrel{-}{BSVI}}_{i,t}=\frac{1}{10}{\sum }_{k=1}^{10}{BSVI}_{i,t-7k}$$
(3)

where \({BSVI}_{i,t}\) is the raw BSVI for firm \(i\) on day \(t\).
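A minimal sketch of the abnormal search volume calculation is given below. It follows the verbal description, that is, the baseline is the average raw BSVI on the same day of the week over the prior 10 weeks (days \(t-7, t-14, \dots, t-70\)). The function name and the synthetic series, in which weekdays have a volume of 100 and weekends a volume of 40, are assumptions for illustration.

```python
def abnormal_search_volume(bsvi, t, weeks=10):
    """Ab_BSVI on day t: raw BSVI relative to the same-weekday average
    over the prior `weeks` weeks."""
    if t < 7 * weeks:
        raise ValueError("need at least 7 * weeks days of history before t")
    baseline = sum(bsvi[t - 7 * k] for k in range(1, weeks + 1)) / weeks
    return bsvi[t] / baseline - 1.0


# Synthetic daily series covering 11 weeks (assumed values).
# Positions 5 and 6 of each week stand for the weekend.
bsvi = [40.0 if day % 7 in (5, 6) else 100.0 for day in range(77)]
bsvi[70] = 150.0  # a weekday with unusually high search volume

weekday_spike = abnormal_search_volume(bsvi, 70)
ordinary_weekend = abnormal_search_volume(bsvi, 75)
```

Because each day is compared only with the same day of the week, the low weekend volume in the synthetic series produces an abnormal value of zero rather than a spurious negative reading, while the weekday spike yields 0.5.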

3.2.4 Control Variables

We include several control variables used in existing models. To account for crashes and jumps caused by large changes in liquidity, we include the turnover of trading volume (Turnover), which serves as the proxy for liquidity and is defined as the ratio of daily trading volume to the number of shares outstanding. Because lower trading costs help information to be reflected in prices, we expect higher liquidity to reduce the probability of crashes and jumps. We control for firm size (Size), measured as the natural logarithm of market capitalization, and the market-to-book ratio (MB), because these two variables are closely associated with crash risk (Hutton et al. 2009). Large firms usually have superior transparency because they continuously receive more attention from security analysts and other financial institutions, so we expect crash risk to decline with firm size. MB measures growth opportunities (Chang et al. 2017) and captures extreme price falls caused by adjustments to overvaluation (Aman 2013). We further control for leverage (Leverage), defined as the firm’s total liabilities scaled by total assets; return on assets (ROA), an indicator of how profitable a company is relative to its total assets; and the price-earnings ratio (PE), defined as the current market price of a company's share divided by its earnings per share. Highly leveraged and less profitable firms are expected to exhibit more price crashes. The detailed definitions of the variables are presented in Table 4.
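For concreteness, the control variables can be sketched as simple ratios. The text does not give explicit formulas for MB and ROA, so the versions below (market capitalization over book equity, and net income over total assets) are common textbook definitions adopted here as assumptions; the function name and the input figures are likewise hypothetical.

```python
import math


def control_variables(volume, shares_outstanding, market_cap, book_equity,
                      total_liabilities, total_assets, net_income,
                      price, earnings_per_share):
    """Compute the control variables for one firm-day."""
    return {
        "Turnover": volume / shares_outstanding,
        "Size": math.log(market_cap),
        "MB": market_cap / book_equity,          # assumed definition
        "Leverage": total_liabilities / total_assets,
        "ROA": net_income / total_assets,        # assumed definition
        "PE": price / earnings_per_share,
    }


# Hypothetical firm-day inputs, for illustration only.
controls = control_variables(
    volume=2e6, shares_outstanding=1e8, market_cap=1e9, book_equity=4e8,
    total_liabilities=6e8, total_assets=1e9, net_income=5e7,
    price=10.0, earnings_per_share=0.5,
)
```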

Table 4 The definitions of variables

3.3 Descriptive Statistics

Table 5 reports the descriptive statistics. The mean crash frequency (\(Crash Freq\)) is 0.8734, with a standard deviation of 0.9392. For the other key variable, the crash dummy (\(Crash\)), the mean is 0.5760 and the standard deviation is 0.4942. These figures indicate that a large-scale crash occurs in more than half of the firm-day observations. The mean jump frequency (\(Jump Freq\)) is 1.2424, and the mean of the jump dummy (\(Jump\)) is 0.6983. These results suggest that firms in our sample experience more intraday jumps than crashes over the sample period.

Table 5 Descriptive statistics

Turning to the statistics on information supply and demand in Table 5, the mean daily number of articles published on the Baidu News channel (\(News\)) is 4.9128, whereas the median is 1 and the maximum is 163. This large gap between the mean and the median indicates a right-skewed distribution that may influence the regression results, so we use the natural logarithm of News in our regressions. In addition, the mean daily abnormal search volume (\(Ab\_BSVI\)) is 0.07342, which indicates that investors’ daily information demand is on average about 7.3% above its normal level.

Table 6 reports the correlations among our key variables and the control variables. The correlation coefficient between online information supply (\(LnNews\)) and crash frequency (\(Crash Freq\)) is 0.0399, and that with the crash dummy (\(Crash\)) is 0.0257. Moreover, the correlation coefficient between investor information demand (\(Ab\_BSVI\)) and crash frequency (\(Crash Freq\)) is 0.0806, and that with the crash dummy (\(Crash\)) is 0.0552. These results indicate that crash risk (\(Crash Freq\) and \(Crash\)) is positively associated with information supply and demand, though the magnitude is small. For jumps, the frequency measure (\(Jump Freq\)) and the jump dummy (\(Jump\)) are both positively correlated with information supply and with information demand.

Table 6 Correlation matrix

4 Empirical Results

4.1 The Impact of Information Supply on Stock Price Crash and Jump Risk

In this section, we use regression analysis to examine the relation between intraday crash (or jump) risk and online information supply. To formalize this evidence in a multivariate setting, we employ a fixed-effects Poisson regression, because crash (or jump) frequency and the crash (or jump) dummy are count-type data taking discrete integer values, for which the OLS regression model is not suitable. Standard errors are clustered at the firm level in all regressions. The regression specifications are as follows:

$${Crash}_{i,t}={\beta }_{0}+{\beta }_{1}{LnNews}_{i,t}+{\beta }_{2}{MB}_{i,t}+{\beta }_{3}{Turnover}_{i,t}+{\beta }_{4}{PE}_{i,t}+{\beta }_{5}{ROA}_{i,t}+{\beta }_{6}{SIZE}_{i,t}+{\beta }_{7}{LEV}_{i,t}+{\varepsilon }_{i,t}$$
(4a)
$${Crash Freq}_{i,t}={\beta }_{0}+{\beta }_{1}{LnNews}_{i,t}+{\beta }_{2}{MB}_{i,t}+{\beta }_{3}{Turnover}_{i,t}+{\beta }_{4}{PE}_{i,t}+{\beta }_{5}{ROA}_{i,t}+{\beta }_{6}{SIZE}_{i,t}+{\beta }_{7}{LEV}_{i,t}+{\varepsilon }_{i,t}$$
(4b)
$${Jump}_{i,t}={\beta }_{0}+{\beta }_{1}{LnNews}_{i,t}+{\beta }_{2}{MB}_{i,t}+{\beta }_{3}{Turnover}_{i,t}+{\beta }_{4}{PE}_{i,t}+{\beta }_{5}{ROA}_{i,t}+{\beta }_{6}{SIZE}_{i,t}+{\beta }_{7}{LEV}_{i,t}+{\varepsilon }_{i,t}$$
(4c)
$${Jump Freq}_{i,t}={\beta }_{0}+{\beta }_{1}{LnNews}_{i,t}+{\beta }_{2}{MB}_{i,t}+{\beta }_{3}{Turnover}_{i,t}+{\beta }_{4}{PE}_{i,t}+{\beta }_{5}{ROA}_{i,t}+{\beta }_{6}{SIZE}_{i,t}+{\beta }_{7}{LEV}_{i,t}+{\varepsilon }_{i,t}$$
(4d)

where \(i\) denotes the firm, \(t\) denotes the day, and \({\varepsilon }_{i,t}\) is the error term.
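To illustrate how a Poisson specification is estimated, the sketch below fits a plain Poisson model with a log link, \(E[y|x]=\exp ({\beta }_{0}+{\beta }_{1}x)\), by Newton-Raphson. It is deliberately simplified: it has a single regressor, no firm fixed effects, and no clustered standard errors, so it is not the estimator behind Table 7. The function name `fit_poisson`, the binary "high news" indicator, and the data are all hypothetical.

```python
import math


def fit_poisson(x, y, max_iter=50, tol=1e-12):
    """Newton-Raphson MLE for E[y | x] = exp(b0 + b1 * x)."""
    b0 = math.log(sum(y) / len(y))  # start from the intercept-only fit
    b1 = 0.0
    for _ in range(max_iter):
        mu = [math.exp(b0 + b1 * xi) for xi in x]
        # Score vector (gradient of the log-likelihood)
        g0 = sum(yi - mi for yi, mi in zip(y, mu))
        g1 = sum((yi - mi) * xi for xi, yi, mi in zip(x, y, mu))
        # Information matrix X' diag(mu) X for the 2x2 case
        h00 = sum(mu)
        h01 = sum(mi * xi for xi, mi in zip(x, mu))
        h11 = sum(mi * xi * xi for xi, mi in zip(x, mu))
        det = h00 * h11 - h01 * h01
        # Newton step: solve H * d = g with the explicit 2x2 inverse
        d0 = (h11 * g0 - h01 * g1) / det
        d1 = (h00 * g1 - h01 * g0) / det
        b0, b1 = b0 + d0, b1 + d1
        if abs(d0) + abs(d1) < tol:
            break
    return b0, b1


# Hypothetical firm-days: indicator for high news supply, daily crash count
high_news = [0, 0, 0, 0, 1, 1, 1, 1]
crash_freq = [0, 1, 1, 2, 1, 2, 3, 2]
b0, b1 = fit_poisson(high_news, crash_freq)
print("b0 =", round(b0, 4), "b1 =", round(b1, 4))
print("rate ratio exp(b1) =", round(math.exp(b1), 4))
```

With a single indicator regressor, the fitted rate ratio \(\exp ({\beta }_{1})\) equals the ratio of the two group means of the count variable, which gives a simple check on the solver.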

Table 7 reports the main coefficient estimates; the t-statistics are calculated using standard errors adjusted for heteroskedasticity and clustered at the firm level. Model 1 shows the results for the crash dummy (\(Crash\)) (Eq. 4a). The coefficient of information supply (\(LnNews\)) is positive and statistically significant at the 1% level, suggesting that firms with high information supply are more likely to experience an intraday stock price crash. The marginal effect of information supply on the crash dummy is 0.0254, indicating that a one-standard-deviation rise in information supply (0.9468) leads to a 0.0240 (0.0254 \(\times\) 0.9468) increase in crash probability. Model 2 reports the estimation results for crash frequency (\(Crash Freq\)) (Eq. 4b). The coefficient of information supply (\(LnNews\)) is again positive and statistically significant at the 1% level, which suggests that more information supply leads, on average, to more frequent crashes. The effect of online information supply on crash risk is thus not only statistically significant but also economically meaningful. These positive coefficients imply that more information supply induces more frequent and extremely large intraday price reactions to firm-specific information. This finding contrasts with the evidence of Jin and Myers (2006) and Hutton et al. (2009), who find that more information disclosure decreases crash frequency. However, it is consistent with our Hypothesis 1a and with Aman (2013), who finds that more media reports lead to higher weekly crash frequency.

Table 7 The impact of online information supply on intraday crash and jump risk

Turning to jumps, Model 3 reports the estimated effect of online information supply on the jump dummy (\(Jump\)) (Eq. 4c). The coefficient of online information supply (\(LnNews\)) is positive and statistically significant at the 1% level, which indicates that firms with more online information supply are more likely to experience an intraday stock price jump. In terms of economic significance, a one-standard-deviation increase in online information supply (0.9468) raises the jump probability by 0.0158 (0.0167 \(\times\) 0.9468). Model 4 reports the estimation results for jump frequency (\(Jump Freq\)) (Eq. 4d). The coefficient of online information supply is positive and statistically significant at the 1% level, suggesting that more online information supply induces more frequent extreme upward movements in the intraday stock price. This result differs from Hutton et al. (2009) and Aman (2013), who find no evidence of an effect of disclosure transparency or media coverage on jumps. Overall, it appears that information supply plays an important role in disseminating both bad news and good news.
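The economic magnitudes quoted for information supply are the product of a marginal effect and the standard deviation of \(LnNews\). The short sketch below reproduces that arithmetic with the figures reported in the text (marginal effects 0.0254 and 0.0167, standard deviation 0.9468); the helper name `one_sd_effect` is hypothetical.

```python
def one_sd_effect(marginal_effect, std_dev):
    """Change in the outcome for a one-standard-deviation rise in the regressor."""
    return marginal_effect * std_dev


SD_LNNEWS = 0.9468  # standard deviation of LnNews reported in the text

# Marginal effects of LnNews reported for Models 1 and 3
marginal_effects = {"Crash": 0.0254, "Jump": 0.0167}

for outcome, effect in marginal_effects.items():
    print(outcome, round(one_sd_effect(effect, SD_LNNEWS), 4))
```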

The results for the control variables are summarized as follows. The coefficient of \(Turnover\) is positive and statistically significant for both crashes and jumps. This indicates that higher liquidity tends to raise crash and jump frequency, and is consistent with Chang et al. (2017), who find that stock liquidity increases stock price crash risk. \(MB\) is positively related to crashes and jumps, which indicates that a higher market-to-book ratio is associated with more frequent crashes and jumps. The estimated coefficient of \(Size\) indicates that larger firms tend to have lower crash frequency but higher jump frequency. Finally, higher \(Leverage\) is associated with less frequent crashes and jumps.

4.2 The Impact of Information Demand on Stock Price Crash and Jump Risk

In this section, we investigate the effect of individual investors’ information demand on intraday stock price crashes and jumps. Because crash (or jump) frequency and the crash (or jump) dummy take discrete integer values, we again fit a Poisson regression model to our data. The regression models are as follows:

$${Crash}_{i,t}={\beta }_{0}+{\beta }_{1}{Ab\_BSVI}_{i,t}+{\beta }_{2}{MB}_{i,t}+{\beta }_{3}{Turnover}_{i,t}+{\beta }_{4}{PE}_{i,t}+{\beta }_{5}{ROA}_{i,t}+{\beta }_{6}{SIZE}_{i,t}+{\beta }_{7}{LEV}_{i,t}+{\varepsilon }_{i,t}$$
(5a)
$${Crash Freq}_{i,t}={\beta }_{0}+{\beta }_{1}{Ab\_BSVI}_{i,t}+{\beta }_{2}{MB}_{i,t}+{\beta }_{3}{Turnover}_{i,t}+{\beta }_{4}{PE}_{i,t}+{\beta }_{5}{ROA}_{i,t}+{\beta }_{6}{SIZE}_{i,t}+{\beta }_{7}{LEV}_{i,t}+{\varepsilon }_{i,t}$$
(5b)
$${Jump}_{i,t}={\beta }_{0}+{\beta }_{1}{Ab\_BSVI}_{i,t}+{\beta }_{2}{MB}_{i,t}+{\beta }_{3}{Turnover}_{i,t}+{\beta }_{4}{PE}_{i,t}+{\beta }_{5}{ROA}_{i,t}+{\beta }_{6}{SIZE}_{i,t}+{\beta }_{7}{LEV}_{i,t}+{\varepsilon }_{i,t}$$
(5c)
$${Jump Freq}_{i,t}={\beta }_{0}+{\beta }_{1}{Ab\_BSVI}_{i,t}+{\beta }_{2}{MB}_{i,t}+{\beta }_{3}{Turnover}_{i,t}+{\beta }_{4}{PE}_{i,t}+{\beta }_{5}{ROA}_{i,t}+{\beta }_{6}{SIZE}_{i,t}+{\beta }_{7}{LEV}_{i,t}+{\varepsilon }_{i,t}$$
(5d)

where \(i\) denotes the firm, \(t\) denotes the day, and \({\varepsilon }_{i,t}\) is the error term.

Table 8 reports the estimation results. Model 1 gives the results for the crash dummy. The coefficient of investor information demand (\(Ab\_BSVI\)) is positive and statistically significant at the 1% level, indicating that firms with higher investor information demand are more likely to experience an intraday crash. The marginal effect of investor information demand on the crash dummy is 0.0851, indicating that a one-standard-deviation rise in investor information demand (0.3735) leads to a 0.0318 (0.0851 \(\times\) 0.3735) increase in crash probability. Model 2 reports the results for crash frequency. The coefficient of investor information demand (\(Ab\_BSVI\)) is positive and statistically significant at the 1% level, which suggests that more investor information demand induces more frequent extremely large intraday price falls.

Table 8 The impact of online information demand on intraday crash and jump risk

As for jumps, Model 3 shows the results for the jump dummy (\(Jump\)). The coefficient of information demand (\(Ab\_BSVI\)) is positive and statistically significant at the 1% level, which suggests that individual investors’ information demand increases the probability of a price jump. The estimate implies that a one-standard-deviation increase in information demand (0.3735) leads to a 0.0271 (0.0725 \(\times\) 0.3735) increase in the probability of an intraday jump. Model 4 shows the results for jump frequency (\(Jump Freq\)). The coefficient of information demand is positive and statistically significant at the 1% level, which indicates that individual investors’ information demand induces higher intraday jump frequency. These results are consistent with Schroff et al. (2015), who find that retail investor information demand induces upward price pressure on securities prices.

A stock price crash (or jump) occurs when accumulated bad (or good) news is released to the capital market (Habib et al. 2017). Higher abnormal information demand indicates that some investors are aware of pending bad (or good) news, and trading between informed and uninformed investors then produces price crashes (or jumps) (Chen et al. 2001). Our results are consistent with Hypothesis 2a, suggesting that individual investors do obtain valuable information from the Internet channel.

4.3 Additional Test

4.3.1 Subperiod Analysis

To investigate whether our findings differ between bull and bear market periods, we fit several additional models. Table 2 presents the bull and bear market phases over the period from January 1, 2013 to April 1, 2019. We define the phase dummy \(Period\) as equal to 1 if the market is in a bull phase and zero otherwise. Specifically, we estimate the following models:

$${Crash}_{i,t}\,({Jump}_{i,t})={\beta }_{0}+{\beta }_{1}{LnNews}_{i,t}+{\beta }_{2}{Period}_{i,t}+{\beta }_{3}({LnNews}_{i,t}\times {Period}_{i,t})+{\beta }_{4}{Controls}_{i,t}+{\varepsilon }_{i,t}$$
(6a)
$${Crash Freq}_{i,t}\,({Jump Freq}_{i,t})={\beta }_{0}+{\beta }_{1}{LnNews}_{i,t}+{\beta }_{2}{Period}_{i,t}+{\beta }_{3}({LnNews}_{i,t}\times {Period}_{i,t})+{\beta }_{4}{Controls}_{i,t}+{\varepsilon }_{i,t}$$
(6b)
$${Crash}_{i,t}\,({Jump}_{i,t})={\beta }_{0}+{\beta }_{1}{Ab\_BSVI}_{i,t}+{\beta }_{2}{Period}_{i,t}+{\beta }_{3}({Ab\_BSVI}_{i,t}\times {Period}_{i,t})+{\beta }_{4}{Controls}_{i,t}+{\varepsilon }_{i,t}$$
(6c)
$${Crash Freq}_{i,t}\,({Jump Freq}_{i,t})={\beta }_{0}+{\beta }_{1}{Ab\_BSVI}_{i,t}+{\beta }_{2}{Period}_{i,t}+{\beta }_{3}({Ab\_BSVI}_{i,t}\times {Period}_{i,t})+{\beta }_{4}{Controls}_{i,t}+{\varepsilon }_{i,t}$$
(6d)

where \(i\) denotes the firm, \(t\) denotes the day, and \({\varepsilon }_{i,t}\) is the error term. \(Controls\) includes \(MB\), \(Turnover\), \(PE\), \(ROA\), \(SIZE\), and \(LEV\).
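As a reading aid, the role of the interaction term can be made explicit. Writing \(X\) for either \(LnNews\) or \(Ab\_BSVI\) and taking the right-hand side of Eqs. (6a) to (6d) as written, the implied slope on \(X\) depends on the market phase:

$$\frac{\partial }{\partial {X}_{i,t}}\left[{\beta }_{0}+{\beta }_{1}{X}_{i,t}+{\beta }_{2}{Period}_{i,t}+{\beta }_{3}({X}_{i,t}\times {Period}_{i,t})\right]={\beta }_{1}+{\beta }_{3}{Period}_{i,t}=\begin{cases}{\beta }_{1}, & {Period}_{i,t}=0\ \text{(bear phase)}\\ {\beta }_{1}+{\beta }_{3}, & {Period}_{i,t}=1\ \text{(bull phase)}\end{cases}$$

Under this reading, a positive \({\beta }_{1}\) combined with a negative \({\beta }_{3}\) means that the association is weaker in the bull phase than in the bear phase, while a positive \({\beta }_{3}\) means that it is stronger in the bull phase.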

Table 9 reports the estimation results for regressions 6 (a–b), which show the impact of information supply on intraday crash and jump risk when the interaction with the phase dummy is included. Model 1 shows the results for the crash dummy. The coefficient of \(Period\) is -0.0368 and statistically significant at the 1% level, meaning that the probability of a crash is lower in the bull market phase. The coefficient of online information supply (\(LnNews\)) is positive and significant at the 1% level. This result is consistent with Hypothesis 1(a), meaning that online information supply increases the probability of a crash. The coefficient of the interaction term \((Period\times LnNews)\) is significantly negative, which means that online information supply explains more of the crash probability in the bear market phase. Model 2 shows the results for crash frequency. The coefficient of \(Period\) is -0.0598 and statistically significant at the 1% level, meaning that crash frequency is higher in the bear market phase. The coefficient of \(LnNews\) is significantly positive, but the coefficient of the interaction term is significantly negative, which means that online information supply explains more of the crash frequency in the bear market phase. Models 3 and 4 show the results for the jump dummy and jump frequency, respectively. The coefficients of the market cycle dummy (\(Period\)) are both significantly positive, meaning that the probability and frequency of price jumps are higher in the bull market phase. Meanwhile, the coefficients of \(LnNews\) are both significantly positive, but the coefficients of the interaction terms are both negative, which means that online information supply explains more of the probability and frequency of price jumps in the bear market phase. Overall, the evidence shows that stock prices are more likely to experience extreme falls in the bear market phase, whereas extreme price rises are more likely to occur in the bull market phase. Furthermore, consistent with Hypothesis 1(a), online information supply is positively related to extreme price changes, and it plays a more important role in explaining crash and jump risk in the bear market phase.

Table 9 The impact of online information supply on intraday crash and jump risk, considering the market period

Table 10 reports the estimation results for regressions 6 (c–d). Models 1 and 2 show the results for the crash dummy and crash frequency. We find that investors’ information demand has a significantly positive impact on both the probability and the frequency of crashes. The coefficients of the market cycle dummy (\(Period\)) are both significantly negative, but the coefficients of the interaction term (\(Period\times Ab\_BSVI\)) are both significantly positive, which means that price crashes are more likely to occur in the bear market phase and that investors’ information demand explains more of the probability and frequency of crashes in the bull market phase. Models 3 and 4 show the results for the jump dummy and jump frequency. We also find a significantly positive relationship between investors’ information demand and price jumps. The coefficients of the market cycle dummy are both significantly positive, meaning that stock prices are more likely to jump in the bull market phase. Meanwhile, the coefficients of the interaction term are both significantly positive, which means that investors’ information demand explains more of the price jumps in the bull market phase. Overall, we find that investors’ information demand has a significant impact on both price crashes and jumps, which is consistent with Hypothesis 2(a). The evidence also shows that stock prices are more likely to crash in the bear market phase, whereas price jumps are more likely to occur in the bull market phase. Finally, investors’ information demand plays a more important role in explaining extreme price changes in the bull market phase.

Table 10 The impact of online information demand on intraday crash and jump risk, considering the market period

4.3.2 An Alternative Measure of Crash Risk

To ensure that our baseline results are robust, we conduct further analysis using an alternative model specification and variable definition. We employ the negative coefficient of skewness (\(NCSKEW\)) as an alternative measure of crash risk. We define \(NCSKEW\) as the third moment of firm-specific minute-level returns for each firm in a trading day, divided by the standard deviation of firm-specific minute-level returns raised to the third power, and then multiplied by -1 (Chen et al. 2001; Andreou et al. 2017). For a given firm \(i\) on trading day \(t\), we compute \(NCSKEW\) as follows:

$${NCSKEW}_{i,t}=-1\times \frac{n{(n-1)}^{3/2}\sum {M}_{i,m}^{3}}{(n-1)(n-2){(\sum {M}_{i,m}^{2})}^{3/2}}$$
(7)

where \(n\) is the number of firm-specific minute-level returns during trading day \(t\). A higher value of \(NCSKEW\) indicates a more left-skewed return distribution and thus a firm that is more likely to experience crashes.
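A direct transcription of Eq. (7) into Python is sketched below. It takes the firm-specific minute returns \({M}_{i,m}\) for one firm-day as given, exactly as the formula does; the function name `ncskew` and the sample returns are hypothetical.

```python
def ncskew(returns):
    """Negative coefficient of skewness for one firm-day, as in Eq. (7).

    returns : list of firm-specific intraday returns M_{i,m}
    """
    n = len(returns)
    if n < 3:
        raise ValueError("need at least three returns")
    sum_sq = sum(r ** 2 for r in returns)
    sum_cu = sum(r ** 3 for r in returns)
    if sum_sq == 0:
        raise ValueError("all returns are zero; skewness undefined")
    numerator = n * (n - 1) ** 1.5 * sum_cu
    denominator = (n - 1) * (n - 2) * sum_sq ** 1.5
    return -numerator / denominator


# Hypothetical firm-days
symmetric = [-0.01, 0.01, -0.01, 0.01]          # no skew
crash_like = [0.001, 0.001, 0.001, -0.003]      # one large fall
jump_like = [-0.001, -0.001, -0.001, 0.003]     # one large rise

print(ncskew(symmetric), ncskew(crash_like), ncskew(jump_like))
```

The sign convention is visible in the toy inputs: a day dominated by one large negative return yields a positive \(NCSKEW\), and the mirror-image day yields a negative value of the same size.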

Table 11 reports the OLS regressions in which \(NCSKEW\) is the dependent variable. As with the crash dummy and crash frequency, information supply and demand are positively and significantly associated with \(NCSKEW\). The estimated coefficient of information supply (\(LnNews\)) implies that a one-standard-deviation increase in \(LnNews\) is associated with an increase of 0.01534 (0.0162 \(\times\) 0.9468) in \(NCSKEW\). The marginal effect of information demand (\(Ab\_BSVI\)) on \(NCSKEW\) is 0.0578, which suggests that a one-standard-deviation increase in information demand leads to a 0.0214 (0.0578 \(\times\) 0.3735) increase in \(NCSKEW\). That is, an increase in information supply (or demand) is likely to increase the probability of crashes. This result supports our earlier findings.

Table 11 The impact of information supply and demand on crash risk: Alternative measure

4.3.3 Exclude Special Observations

Our sample includes observations in which a firm experiences both crashes and jumps on the same day. In these special observations, the same level of information supply and demand is associated with both a crash and a jump. To test the robustness of our results, it is necessary to distinguish between cases in which we observe either a crash or a jump and cases in which we observe both. We therefore exclude the observations in which a firm’s price is classified as both crashing and jumping on the same day, and then re-examine the relationship between information supply (demand) and intraday price crashes.
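The sample restriction amounts to a simple filter on firm-day observations, sketched below. The record layout (dictionaries with `crash` and `jump` dummies) and the function name `exclude_mixed_days` are hypothetical and serve only to illustrate the rule.

```python
def exclude_mixed_days(observations):
    """Drop firm-days on which both a crash and a jump are recorded."""
    return [
        obs for obs in observations
        if not (obs["crash"] == 1 and obs["jump"] == 1)
    ]


# Hypothetical firm-day records
sample = [
    {"firm": "A", "day": 1, "crash": 1, "jump": 0},
    {"firm": "A", "day": 2, "crash": 1, "jump": 1},  # mixed day, dropped
    {"firm": "B", "day": 1, "crash": 0, "jump": 1},
    {"firm": "B", "day": 2, "crash": 0, "jump": 0},
]
restricted = exclude_mixed_days(sample)
print(len(sample), "->", len(restricted))
```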

Table 12 reports the effects of online information supply and demand on intraday crash risk for the sample that excludes observations with both a price crash and a price jump on the same day. We find that online information supply has a significantly positive impact on both intraday crash probability and crash frequency. Moreover, the coefficients of investors’ information demand are both significantly positive, which means that there is a positive relationship between investors’ information demand and intraday price crashes. These results are consistent with Hypothesis 1(a) and Hypothesis 2(a), and indicate that our findings are robust.

Table 12 The impact of information supply and demand on crash risk: Exclude special observations

5 Conclusion

The economic role of information in the financial market is highly important and has been widely examined. In this study, we investigate the impact of daily information supply and demand on intraday stock price behavior, focusing on intraday crashes and jumps. The analysis employs a novel proxy for information supply, based on the number of news items collected by the Baidu News channel that contain a specified keyword in their headlines. Internet search volume is used as the proxy for individual investors’ information demand. Relative to the previous literature, we first provide initial evidence of the existence of intraday crashes and jumps in the Chinese stock market. We then investigate how firm-specific information is incorporated into prices immediately after firms’ fundamentals have changed.

We find a significant positive effect of firm-specific information supply on intraday crashes and jumps. This implies that crash (or jump) frequency tends to increase with information supply. We argue that this finding arises because information supply prompts extremely large investor reactions to information. The result holds for both the crash (or jump) dummy and crash (or jump) frequency, indicating that firm-specific information supply contributes to intraday crashes and jumps.

Similarly, we find a positive and statistically significant effect of firm-specific information demand on intraday crashes and jumps. This suggests that information demand is likely to induce extreme price falls (or rises). Investors’ information demand increases when they become aware of pending bad (or good) news before the firm discloses it. Undisclosed information can therefore also cause extreme price changes.

Finally, we divide the whole sample period into two subperiods, a bull market period and a bear market period, and find that firms are more likely to experience crashes in the bear market phase, whereas intraday price jumps are more likely to occur in the bull market phase. In addition, the interaction term between the market cycle dummy and information supply provides evidence of interactive effects: information supply explains more of the intraday crashes in the bear market phase. As for information demand, its interactive effect on intraday price crashes is larger in the bull market phase.