Introduction

How financial reporting affects managers’ investment decisions is a fundamental question in accounting. Financial reporting mitigates information asymmetry between firms and investors, as well as among investors. It thereby alleviates external financial constraints and strengthens external monitoring. This, in turn, reduces the agency costs arising from adverse selection and moral hazard and encourages managers to make better investment decisions (Biddle and Hilary 2006; Biddle et al. 2009). Although the real effect of financial reporting is well recognized, most studies in this field focus on quantitative disclosure in the annual report, such as earnings management (McNichols and Stubben 2008), accounting conservatism (Balakrishnan et al. 2016), or disclosure transparency (Zhong 2018). Whether this real effect extends to qualitative disclosure remains unknown. Motivated by the claim that few papers examine the “real effects” of disclosure processing costs on corporate actions (Blankespoor et al. 2020), our study fills this gap by exploring the association between the textual readability of the annual report and research and development (R&D) investment.

A growing body of literature has examined the information content of annual report readability in recent years. Many studies find that poor readability of textual information reduces investors’ disclosure processing fluency and increases their disclosure processing costs (You and Zhang 2009; Lee 2012; Rennekamp 2012). Consequently, it induces adverse market outcomes, for example, fewer and less profitable trades among small investors (Miller 2010), less accurate analyst forecasts (Loughran and McDonald 2014), higher stock return volatility, stock price crashes (Bonsall et al. 2017; Kim et al. 2019), and a higher cost of equity capital (Rjiba et al. 2021). Some studies also focus on the effect of disclosure readability on executive actions. For example, Lo et al. (2017) find that managers have strong incentives to manage earnings when disclosures are more complex. Luo et al. (2018) point out that firms with more readable annual reports experience lower agency costs through reduced information asymmetry, which reflects an indirect effect of disclosure readability.

R&D investment is a critical managerial decision for enhancing a firm’s competitiveness. However, R&D under-investment is prevalent among listed companies in China. Because R&D activities involve high input costs, long durations, and highly uncertain innovation outputs, abundant cash reserves are necessary to support R&D investment (Hall 2002). Agency theory suggests that external financial constraints arising from information asymmetry induce managerial myopia and lead managers to cut R&D investment (Roychowdhury 2006). Moreover, cash is the most liquid asset of a firm and is easily misused or tunneled by managers or large shareholders, which reduces resource allocation efficiency and harms R&D investment (Chen et al. 2017). In the Chinese capital market, given the higher cost of equity capital and the strict requirements for seasoned equity offerings, listed companies find it difficult to obtain external financing (Ju et al. 2013). In particular, weak corporate governance and weak legal protection of small investors’ rights leave tunneling by large shareholders rampant, so cash holdings are more likely to be misused (Kuo et al. 2022). Accordingly, most Chinese listed companies lack sufficient cash or resources to support R&D investment.

Most research attributes R&D under-investment to managerial myopia and finds that high-quality disclosure can alleviate myopic R&D cuts (Shroff 2017; Bae et al. 2017; Heitzman and Huang 2019). We develop our argument from the perspective that increased annual report readability mitigates information asymmetry among investors and thereby raises R&D investment levels. Specifically, a more readable annual report reduces investors’ disclosure processing costs, which mitigates their information asymmetry. Lower information asymmetry implies fewer financial constraints and less tunneling by large shareholders (La Porta et al. 1999; O’Hara 1999; Ascioglu et al. 2008). Furthermore, because fewer financial constraints and less tunneling expand the cash or resources available for R&D activities, managers are encouraged to take on riskier R&D projects. We therefore argue that higher readability of the annual report enhances the level of R&D investment.

We use Chinese data to test our argument for the following reasons. First, most studies analyze the USA or other English-speaking countries, and few have explored the real economic effects of annual report readability in the Chinese context. Chinese differs from English in many aspects of linguistics and logical structure, which implies that prior results cannot simply be carried over to Chinese disclosure readability. Second, because small or non-professional investors are the dominant participants in China’s capital market, they usually face higher disclosure processing costs than investors in developed capital markets. In addition, China’s capital market still offers weak legal protection for investors (Ke and Zhang 2021), which leaves listed firms with lower disclosure quality and higher agency costs. In particular, listed firms find it hard to obtain financing at low capital costs, and tunneling by large shareholders is common. Accordingly, investors may be more sensitive to annual report readability, which is likely to have a significant impact on managers’ R&D investment decisions.

Using 26,359 firm-year observations for Chinese A-share listed firms from 2007 to 2019, we explore the association between annual report readability and corporate R&D investment. Following the prior literature, we use the linguistic complexity of hand-collected annual reports to measure readability. We find that higher readability of annual reports is associated with increased R&D investment, consistent with our argument. Moreover, our finding holds after several robustness tests, including the instrumental variable method, Heckman’s two-step procedure, sample matching, and other robustness checks.

Through mediation analysis, we find that a more readable annual report improves R&D investment by reducing financial constraints and restraining tunneling by large shareholders. Cross-sectional tests suggest that the real effect of annual report readability is stronger when management has weaker incentives to manipulate readability and when investors have lower information processing ability. These results help mitigate the endogeneity concern arising from omitted variables. Furthermore, we find a positive spillover effect of peer firms’ annual report readability on the R&D investment of target firms.

Overall, our results demonstrate that increased annual report readability is conducive to improving the level of R&D investment. Our study makes the following contributions to the literature. First, we contribute to the literature on how firm disclosure quality affects real investment decisions, which has examined earnings quality (Biddle et al. 2009), mandatory disclosure regulations (Chen et al. 2013; Albuquerque and Zhu 2019), and weaknesses in internal control over financial reporting (Cheng et al. 2013; Feng et al. 2015). We approach this question from the novel perspective of qualitative disclosure quality. Our results confirm that the real effect of disclosure quality not only exists in financial reporting but also extends to qualitative disclosure as manifested in readability.

Second, our study contributes to the literature on the economic consequences of annual report readability. On the one hand, we construct a readability index for the annual report that is tailored to the Chinese context, which differs significantly from the English context, and thereby extend the research on the readability of Chinese annual reports. On the other hand, we extend the evidence on annual report readability from investors’ valuation to management’s financial activities, which responds to the claim that few papers examine the “real effect” of disclosure processing costs on corporate actions (Blankespoor et al. 2020).

Finally, our study extends the literature on the channels through which annual report readability affects managerial R&D investment decisions. Our study confirms an association between annual report readability, agency costs, and R&D investment. Additionally, we show that information processing ability differs considerably between professional and non-professional investors in China’s capital market and that manipulated readability carries negative information content. These results clarify how annual report readability shapes managerial investment decisions and how its effect varies with the information environment.

The remainder of our paper is organized as follows. “Literature review and hypothesis development” section reviews the literature and develops our hypothesis. “Research design” section introduces our research design and sample. “Empirical results” section discusses the empirical results and robustness tests. “Additional analysis” section presents additional analyses, and “Conclusion” section draws our conclusion.

Literature review and hypothesis development

The effect of financial reporting on real investment

It is well documented that financial reporting affects corporate investment decisions. Roychowdhury et al. (2019) attribute the underlying mechanisms to information frictions, such as information asymmetry and information uncertainty. According to principal-agent theory, the agency conflict caused by incentive incompatibility can easily induce moral hazard on the part of managers (Jensen and Meckling 1976; Jensen 1986). High-quality financial reporting helps external shareholders understand and monitor, in a timely manner, how managers discharge their fiduciary responsibilities, which in turn alleviates information asymmetry and reduces managers’ overinvestment or opportunistic empire building (Biddle and Hilary 2006; Hope and Thomas 2008; Bushman et al. 2011; Delgado-Domonkos and Zeng 2023). Moreover, high-quality financial reporting better describes the asset value of existing investment opportunities and reduces the information asymmetry between external investors and internal managers; it thus helps reduce external financial constraints and expand capital expenditures (Biddle et al. 2009; Chen et al. 2011; Balakrishnan et al. 2016; Shroff 2020; Goldstein et al. 2023), although it crowds out private firms’ capital investment (Liu et al. 2023).

Managers may also face information uncertainty when making investment decisions because they do not fully understand all current and future investment opportunities (Roychowdhury et al. 2019). Because acquiring and processing information is costly, managers are motivated to learn incremental information from private investors, analyst reports, product market adjustments, and peer firms in the same industry. As a result, they can alleviate information uncertainty and optimize investment decisions (Durnev and Mangen 2009; Shroff et al. 2014; Goldstein and Yang 2019; Heitzman and Huang 2019).

There is ample evidence that financial reporting disclosures influence corporate R&D investment. Because accounting standards require firms to recognize R&D expenditures as expenses during the operating period, management can increase current income by reducing R&D activities to meet earnings targets, which results in insufficient R&D investment (Bushee 1998; Stein 2003; Roychowdhury 2006). However, some argue that high-quality financial reporting helps to improve R&D investment. For example, Park (2018) reports a negative association between earnings management and innovation efforts. Zhong (2018) finds that higher financial reporting transparency promotes management’s innovation activities mainly by reducing career concerns, alleviating the uncertainty of long-term R&D investment, and raising the R&D investment level. Laux and Ray (2020) find that more conservative accounting policies promote corporate innovation activities. Chircop et al. (2020) show that accounting comparability among peer firms enhances the ability to predict the future cash flows generated by R&D investment and improves the efficiency of R&D investment.

Other evidence suggests that qualitative disclosure in the annual report affects corporate investment. Dyreng et al. (2016) find that requiring companies to disclose the geographical location of subsidiaries reduces their motivation to invest in and set up subsidiaries in tax havens. Christensen et al. (2017) find that coal mine owners’ active disclosure of mine safety performance in their annual reports helps increase investment in coal mine safety. Cheng et al. (2012), using Chinese listed companies, suggest that non-financial disclosures can alleviate external financial constraints and reduce under-investment but increase over-investment by management. Huang et al. (2014) report that the level of tone management in the annual report is positively associated with merger and acquisition activities. Durnev and Mangen (2020) find that a positive MD&A tone among peers promotes the target firm’s real investment. De Simone and Olbert (2022) reveal that the mandatory disclosure of private country-by-country reporting improves corporate capital investment.

Linguistic properties and disclosure processing costs

In real-world, imperfect capital markets, investors need to spend time and effort to obtain and understand the information content of listed companies’ disclosures; these are disclosure processing costs. Blankespoor et al. (2020) divide disclosure processing costs into three types: awareness costs, acquisition costs, and integration costs. They further point out that the rapid development of information technology has made information convenient to access, significantly reducing awareness and acquisition costs. How investors integrate information more efficiently has therefore become an important research topic for disclosure processing costs. Some evidence shows that the linguistic properties of qualitative disclosure in the firm’s annual report significantly affect the cost of disclosure processing. For example, the tone of the annual report can reflect management’s sentiment and convey value-relevant information to external users, which lowers the integration costs of disclosure processing. However, disclosure tone lacks adequate external supervision, and management has self-interested incentives to manipulate it, which raises disclosure integration costs (Li 2010; Huang et al. 2014; Brochet et al. 2019).

Linguistic readability, or complexity, is another important disclosure property; it reflects the ease of comprehending textual information. Psychological research shows that an individual’s information processing fluency depends on the specific context and on the subjective ease that the individual subconsciously experiences (Schwarz 2004). Individuals tend to attribute higher credibility to, feel more positive about, and have greater confidence in disclosures that are more fluent to process (Alter and Oppenheimer 2009). The readability of written texts is thus closely related to processing fluency, which determines disclosure processing costs. Existing research finds that poorer annual report readability is associated with higher post-earnings announcement drift (You and Zhang 2009), lower trading activity and investment returns among retail investors (Miller 2010; Lawrence 2013), greater effort by analysts in writing reports (Loughran and McDonald 2014), greater equity mispricing (Chen et al. 2023), and lower stock return synchronicity (Gangadharan and Padmakumari 2023). Rennekamp (2012) uses an experiment to explain the mechanism linking readable disclosure to investors’ decision-making and finds that a readable annual report increases the fluency of investors’ disclosure processing, which enhances the perceived reliability of management and thus produces a positive market reaction. Tan et al. (2019) find that the impact of linguistic processing fluency on investor decision-making depends significantly on individual investors’ level of industry knowledge. Moreover, a less readable annual report helps predict the probability of a firm’s bankruptcy (Le Maux and Smaili 2021). However, Dalwai et al. (2021) find no significant relationship between annual report readability and firm performance.

To further explore the motives behind textual readability, researchers have tried to separate manipulated readability from non-manipulated readability in the annual report. Firms that uphold their ethical and legal responsibilities tend to disclose more readable and understandable annual reports (Bajaj et al. 2023; Xu et al. 2022). On the one hand, the complexity of business operations, changes in disclosure rules, and stricter supervision naturally reduce textual readability. On the other hand, qualitative disclosure consists primarily of descriptive text that is fuzzy and difficult to identify (Huang et al. 2014). Therefore, self-interested management prefers to conceal negative information by making qualitative disclosure more complex rather than by manipulating quantitative disclosure (Nadeem 2022; Sun et al. 2022a). Bonsall et al. (2017) find that some findings on disclosure readability weaken once non-discretionary complexity factors are controlled for. Bushee et al. (2018) report that discretionary complexity reduces stock liquidity. DeHaan et al. (2021) suggest that discretionary complexity raises investors’ disclosure processing costs and results in higher fund fees. However, there is still no consensus on how to distinguish clearly between manipulated and non-manipulated textual readability.

Overall, higher-quality disclosure promotes real corporate investment through the information mechanism. The readability of qualitative disclosure, as a critical information property, may reflect either high disclosure quality or discretionary behavior by management. However, few studies have focused on the effect of annual report readability on corporate investment decisions. We therefore investigate the effect of annual report readability on R&D investment in the context of China.

Hypothesis development

Given that R&D under-investment stems from a lack of financial support, we argue that increased readability of annual reports helps reduce financial constraints and tunneling and thereby frees up more funds for corporate R&D investment. According to information processing fluency theory, information becomes difficult to process if a text contains vague and obscure words or complex sentences (Alter and Oppenheimer 2009). Investors treat relatively fluent information as a subconscious, heuristic decision cue that improves their perception of management’s reliability, regardless of whether the disclosure is positive or negative (Hafner and Stapel 2010; Rennekamp 2012). Readable disclosure therefore helps investors contain information integration costs and obtain valuable information, which reduces information asymmetry between the firm and external investors. In particular, readable disclosure can help the many retail investors in China’s capital market, who are noise traders and non-professionals, narrow their information gap relative to professional investors.

Moreover, as information asymmetry decreases, investors evaluate management more favorably and require a lower cost of equity capital, which eases external financial constraints (O’Hara 1999; Ascioglu et al. 2008). From the perspective of resource dependency theory, corporate R&D investment is a costly, long-lasting, and risky activity that needs abundant cash reserves and financial support (Hall 2002). Therefore, a more readable annual report mitigates the under-investment in R&D activities by reducing financial constraints.

In addition, a readable annual report plays a governance role in corporate innovation activities. Tunneling by large shareholders, mainly through unfair related-party transactions or intra-group loans (Cheung et al. 2006; Jiang et al. 2010), is common in China. Because cash is the most liquid and transferable asset, severe tunneling depletes or misuses the cash required for innovation activities, which discourages management from investing more in R&D projects (Chen et al. 2017). However, more readable disclosure can provide value-relevant information and help reduce information asymmetry between large and small shareholders (Luo et al. 2018), which constrains large shareholders’ misuse of cash through tunneling and thus increases the funds available to support corporate R&D investment.

Some of China’s institutional features highlight the effects of annual report readability on R&D investment. First, the investor base in China is dispersed, in contrast to the US capital market. Because of their limited ability to acquire and integrate information, retail or individual investors respond more actively to more readable disclosure than large investors do (Miller 2010; Vieru et al. 2006; Lawrence 2013). Thus, annual report readability might significantly affect R&D investment decisions through information and governance effects. Second, China’s economy is in a transition period and needs large-scale innovation to drive high-quality economic development. Encouraged by favorable innovation policies, firms prefer to invest in R&D activities rather than other real investments in order to obtain government subsidies and achieve sustainable development. This institutional feature increases the sensitivity of corporate R&D investment to available resources, highlighting the value of a readable annual report in solving investment problems.

The above discussions imply that increased readability of annual reports reduces the information asymmetry of investors, relieving insufficient funds due to financial constraints and tunneling, and then encourages corporate R&D investment activities. Overall, we expect increased readability of annual reports to promote corporate R&D investment levels. This leads to the following hypothesis.

H1

Ceteris paribus, higher annual report readability is associated with increased corporate R&D investment.

Two counterarguments add tension to H1. First, to the extent that information users perceive qualitative disclosure as “cheap talk,” management lacks external pressure to align disclosure readability with the firm’s real situation. In this case, manipulated readability cannot provide valuable information to investors and may even mislead their decisions, which raises information asymmetry and fails to alleviate the financial constraints and tunneling that hinder corporate R&D investment. Managers are therefore more likely to cut R&D expenditures when manipulated readability increases information asymmetry. Second, disclosure processing ability depends on the information user’s educational background, industry knowledge, information environment, and so on, and investors differ widely in this ability. For example, professional investors, such as institutional investors, sell-side analysts, or short sellers, use disclosed information in a more sophisticated way than non-professional investors (Campbell et al. 2009; Battalio et al. 2012; Lee and Zhu 2022). In this view, the positive effect of a more readable annual report on corporate R&D investment might change once investors’ characteristics are considered.

Research design

Sample selection

Our initial sample consists of all A-share listed firms from 2007 to 2019. Our sample period begins in 2007 because the Ministry of Finance of the People’s Republic of China implemented the new accounting standards for business enterprises in that year; these standards converged with IFRS and significantly affected corporate information disclosure. We collect all firm-level financial data from the CSMAR databases, and we obtain the text of annual reports from the Shanghai Stock Exchange and the Shenzhen Stock Exchange, the officially designated disclosure platforms. In addition, we exclude observations (1) within the financial or banking industry; (2) labeled as ST or ST* status because they face severe operational risk, so their decisions may differ from those of ordinary firms; and (3) with missing data for estimating equations. After imposing these filtering criteria, our final sample consists of 26,359 firm-year observations; the number of observations varies across tests depending on the data each test requires. To mitigate the influence of outliers, we winsorize all continuous variables at the top and bottom 1%.
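The winsorization step in the last sentence can be sketched as follows. This is a minimal illustration only: it assumes that the cutoffs are the 1st and 99th percentiles of the pooled sample and that percentiles are linearly interpolated between order statistics. The paper does not state which percentile convention or software routine was used, and the function name `winsorize` is chosen here for illustration.

```python
def winsorize(values, lower=0.01, upper=0.99):
    """Clamp values below the `lower` quantile and above the `upper` quantile.

    Assumption (not from the paper): quantiles use linear interpolation
    between the two nearest order statistics of the pooled sample.
    """
    ordered = sorted(values)
    n = len(ordered)
    if n == 0:
        return []

    def quantile(q):
        pos = q * (n - 1)              # fractional index into the sorted data
        lo = int(pos)
        hi = min(lo + 1, n - 1)
        return ordered[lo] + (ordered[hi] - ordered[lo]) * (pos - lo)

    low_cut, high_cut = quantile(lower), quantile(upper)
    # Extreme observations are replaced by the cutoffs, not dropped.
    return [min(max(v, low_cut), high_cut) for v in values]
```

Unlike trimming, this keeps every firm-year in the sample, which is consistent with the observation count being reported before the winsorization step.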

Measure of annual report readability

Some studies use specific linguistic characteristics, including Fog and Bog score, Flesch Reading Ease, and jargon, to describe annual report readability (Bonsall et al. 2017; Lo et al. 2017; Tan et al. 2019). Other proxies focus on the overall processing costs of textual disclosure, such as the number of words, the average length of sentences, and the file size of annual reports (Miller 2010; Loughran and McDonald 2014). However, research on annual report readability has only just emerged in China, and there is no consensus about proxies to measure the readability of Chinese firms’ annual reports (Luo et al. 2018).

On the one hand, Chinese and English differ significantly in logic, linguistics, the structure of words and sentences, and so on. Therefore, the Fog Index, Flesch Reading Ease, complex-word counts, and similar indexes for measuring annual report readability are not suitable in the Chinese context. On the other hand, Bonsall et al. (2017) assert that a large file may include information unrelated to the underlying text of the disclosure, so file size is a rather noisy proxy for readability. For these reasons, we cannot directly use the prior methods to measure the readability of Chinese annual reports.

Following Wang et al. (2018), we construct annual report readability proxies from the perspective of how easily the disclosure can be understood (Footnote 1). Pretorius (2006) argues that individuals struggle to understand written texts with complex logical relationships. Because adversative sentences are hard to understand, our first proxy is the number of adversative words (disjunctions) per 100 words in the annual report. Also following Wang et al. (2018), “rare words” reduce reading fluency and increase reading difficulty, so an annual report that contains many rare words is less readable. We take the number of “rare words” per 100 words as the second proxy for readability. Additionally, jargon refers to management’s use of technical or obscure language in disclosures, which impedes the processing fluency of readers and the readability of management disclosures (Bonsall et al. 2017; Tan et al. 2019). Most small investors are non-professional: they are unfamiliar with the professional accounting terms disclosed in the annual report and usually regard them as jargon (Footnote 2). Thus, our third proxy is the number of accounting terms per 100 words of the annual report. Finally, we transform each of the three variables into one minus its range-standardized value, which yields Read_reverse, Read_rare, and Read_jargon. On this basis, we calculate a comprehensive variable, Read_comp, which equals the sum of Read_reverse, Read_rare, and Read_jargon. A greater value of each variable represents a more readable annual report. We use textual analysis techniques in the Python programming language to capture and process the qualitative information (Footnote 3).
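The construction described above can be sketched in a few lines of Python. This is a hedged illustration, not the authors’ code. It assumes that each report has already been segmented into a list of word tokens, that the three word lists (adversative words, rare words, accounting terms) are supplied by the user, and that “range standardization” means (x − min) / (max − min) taken over all reports in the sample. All function and parameter names are hypothetical.

```python
def density_per_100(tokens, lexicon):
    """Occurrences of lexicon words per 100 tokens of one segmented report."""
    if not tokens:
        raise ValueError("empty document")
    hits = sum(1 for tok in tokens if tok in lexicon)
    return 100.0 * hits / len(tokens)


def reversed_range_standardize(xs):
    """Return 1 - (x - min) / (max - min), so higher means more readable.

    Assumption: min and max are taken over the full list passed in.
    """
    lo, hi = min(xs), max(xs)
    if hi == lo:
        raise ValueError("no variation across reports")
    return [1.0 - (x - lo) / (hi - lo) for x in xs]


def readability_scores(reports, adversatives, rare_words, accounting_terms):
    """Compute Read_reverse, Read_rare, Read_jargon and Read_comp per report."""
    lexicons = {
        "Read_reverse": adversatives,
        "Read_rare": rare_words,
        "Read_jargon": accounting_terms,
    }
    columns = {
        name: reversed_range_standardize(
            [density_per_100(tokens, lexicon) for tokens in reports]
        )
        for name, lexicon in lexicons.items()
    }
    scores = []
    for i in range(len(reports)):
        row = {name: column[i] for name, column in columns.items()}
        row["Read_comp"] = sum(row.values())  # sum of the three components
        scores.append(row)
    return scores
```

Under these assumptions each component lies between 0 and 1, so Read_comp lies between 0 and 3, with larger values indicating a more readable report.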

Regression specification

To test H1 and evaluate the relative importance of annual report readability on R&D investment, we construct Eq. (1) as follows:

$$\begin{aligned} \text{R}\&\text{D}_{t+1} &= \beta_{0} + \beta_{1}\,\text{Readability}_{t} + \beta_{2}\,\text{Size}_{t} + \beta_{3}\,\text{Lev}_{t} + \beta_{4}\,\text{Cash}_{t} + \beta_{5}\,\text{Netfin}_{t} \\ &\quad + \beta_{6}\,\text{Soe}_{t} + \beta_{7}\,\text{Age}_{t} + \beta_{8}\,\text{AH}_{t} + \beta_{9}\,\text{Rev}_{t} + \beta_{10}\,\text{Employ}_{t} + \beta_{11}\,\text{KL}_{t} + \beta_{12}\,\text{BM}_{t} \\ &\quad + \beta_{13}\,\text{Roa}_{t} + \beta_{14}\,\text{DRoa}_{t} + \beta_{15}\,\text{Ret}_{t} + \beta_{16}\,\text{Tobinq}_{t} + \beta_{17}\,\text{HHI}_{t} + \beta_{18}\,\text{GDP}_{t} + \text{Fixed Effects} + \varepsilon \end{aligned}$$
(1)

In Eq. (1), R&D represents the next period’s research and development investment level, calculated as R&D expenditures divided by total assets (Zhong 2018). The explanatory variable Readability denotes annual report readability and is measured, in turn, by Read_reverse, Read_rare, Read_jargon, and Read_comp as defined above. In addition, we incorporate the following control variables that influence corporate R&D investment decisions. Size is the natural logarithm of total assets at the end of the period. Lev is financial leverage, defined as liabilities divided by total assets at the end of the period. Cash is cash and cash equivalents over total assets at the end of the year. Netfin is net financing, calculated as the difference between net cash received from equity and bonds issued and cash paid for purchasing financial assets, scaled by total assets. Soe is a dummy variable that equals one for state-owned enterprises and zero otherwise. Age is the natural logarithm of one plus the number of years the firm has been listed in the A-share market. AH is an indicator variable that equals one for firms cross-listed in A and H shares and zero otherwise. Rev is the natural logarithm of major revenue. Employ is the natural logarithm of one plus the number of staff. KL is net fixed assets divided by total employees at the end of the year. BM is the book value of equity divided by the market value of equity. Roa is the ratio of net income to average total assets. DRoa is the change in net income scaled by average total assets. Ret is the annual stock return. Tobinq is the market value of equity scaled by total assets. HHI is the industry Herfindahl index, calculated as the sum of the squared ratios of operating income to total operating income in the same industry, followed by taking the square root. GDP is the natural logarithm of the province’s per capita gross domestic product during the fiscal year. Fixed Effects represent the year and industry fixed effects. The standard errors of regression coefficients are two-way clustered by firm and year.
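Eq. (1) pairs readability and controls measured in year t with R&D measured in year t+1. The alignment step this implies can be sketched as below. This is an illustrative sketch only: the record layout (dictionaries with `firm`, `year`, and `RD` keys), the helper name `attach_next_period`, and the treatment of fiscal years as consecutive integers are assumptions, not details taken from the paper.

```python
def attach_next_period(panel, key="RD"):
    """Attach next year's value of `key` to each firm-year that has a successor.

    Rows without an observation for (firm, year + 1) are dropped, because
    the dependent variable of Eq. (1) would be missing for them.
    """
    by_firm_year = {(row["firm"], row["year"]): row for row in panel}
    aligned = []
    for (firm, year), row in sorted(by_firm_year.items()):
        successor = by_firm_year.get((firm, year + 1))
        if successor is None:
            continue  # no t+1 observation for this firm
        aligned.append({**row, key + "_lead": successor[key]})
    return aligned
```

One consequence of this design is that the final sample year of each firm contributes only as a dependent-variable observation, which is one reason the number of usable observations can differ across tests.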

We include Size to control for any size effects. We include Lev, Cash, and Netfin because financial resources are critical for innovation. We include Soe, Age, AH, Employ, and KL to control for the effects of firm characteristics. We include Rev, Roa, and DRoa because financial performance affects firm innovation activities. We include BM, Ret, and Tobinq as proxies for firm value because investors’ market valuations encourage managers to innovate. We also include HHI and GDP to capture the effects of product market competition and macroeconomic development on corporate R&D investment decisions. Under H1, the coefficient β1 in Eq. (1) should be significantly positive.

Table 1 presents the descriptive statistics. The mean R&D of 0.0153 implies that R&D investment accounts for 1.53% of total assets, close to the 1.50% reported for US listed firms (Zhong 2018). The means of Read_reverse, Read_rare, Read_jargon, and Read_comp are 0.5861, 0.8483, 0.5429, and 1.9773, respectively. These values indicate that most annual reports of China’s listed firms are readable, while readability varies widely across firms, which gives our tests statistical power.

Table 1 Descriptive statistics

Table 2 reports the matrix of Pearson correlation coefficients. Read_rare and Read_jargon are positively and significantly correlated with R&D. However, Read_reverse and Read_comp are negatively correlated with R&D, which is inconsistent with our hypothesis. Because pairwise correlations do not hold other factors constant, we rely on multiple regression analysis to test our hypothesis. Most of the correlation coefficients between R&D and the control variables are statistically significant, which supports the choice of controls in our model. Furthermore, the average variance inflation factor (VIF) is 2.13, and the highest VIF is 6.50, suggesting that multicollinearity is not a concern.
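As a reading aid for the VIF figures above, the following minimal sketch uses the standard definition VIF_j = 1/(1 - R_j^2), where R_j^2 is the R-squared from regressing regressor j on the remaining regressors. The function names are illustrative and are not taken from the paper, and the threshold of 10 mentioned in the comment is a common rule of thumb rather than a value stated by the authors.

```python
def vif_from_r2(r2: float) -> float:
    """Variance inflation factor implied by an auxiliary R-squared."""
    if not 0.0 <= r2 < 1.0:
        raise ValueError("r2 must lie in [0, 1)")
    return 1.0 / (1.0 - r2)


def r2_from_vif(vif: float) -> float:
    """Auxiliary R-squared implied by a reported VIF (inverse of the above)."""
    if vif < 1.0:
        raise ValueError("a VIF cannot be smaller than 1")
    return 1.0 - 1.0 / vif


# Highest VIF reported in the text. A frequently used rule of thumb
# (an assumption here, not the paper's statement) flags VIF values above 10.
highest_vif = 6.50
implied_r2 = r2_from_vif(highest_vif)
print(f"{implied_r2:.3f}")  # share of variance explained by the other regressors
```

Read this way, the most collinear regressor still has roughly 15% of its variance left unexplained by the other regressors.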

Table 2 Pearson correlation matrix

Empirical results

Main empirical results

Table 3 reports the OLS regression results on the relationship between annual report readability and corporate R&D investment, which tests Hypothesis 1. According to Column (1), the coefficient on Read_reverse is positive and significant at the 1% level (coeff. = 0.0024, t = 3.09), indicating that a lower density of adversative words in the annual report promotes future R&D investment. Column (2) reports that the coefficient on Read_rare is positive and significant at the 5% level (coeff. = 0.0025, t = 2.40), suggesting that management’s use of fewer obscure or rare words in the annual report facilitates investment in research and development activities. In Column (3), the coefficient on Read_jargon is also positive and significant at the 1% level (coeff. = 0.0043, t = 4.25), showing that a lower frequency of accounting terms in the annual report helps improve R&D investment in the next period. Moreover, the results in Column (4) show that the coefficient on Read_comp is positive and significant at the 1% level (coeff. = 0.0028, t = 5.21). In terms of economic significance, a one-unit increase in Read_comp is associated with an increase in future R&D investment equal to 27.18% of its sample median (0.0028/0.0103). In conclusion, these results support H1 that higher readability of the annual report enhances the level of R&D investment in the future.
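The arithmetic behind these statements can be checked with a short sketch. The coefficients, t-statistics, and sample median are those reported in the text. The critical values are the usual large-sample two-sided normal cutoffs, which is an assumption of this sketch, and the helper names are illustrative.

```python
# Coefficients and t-statistics as reported in the text for Table 3.
reported = {
    "Read_reverse": (0.0024, 3.09),
    "Read_rare": (0.0025, 2.40),
    "Read_jargon": (0.0043, 4.25),
    "Read_comp": (0.0028, 5.21),
}

# Two-sided critical values of the standard normal distribution
# (assumption: the sample is large enough for the normal approximation).
CRITICAL_VALUES = [(2.576, "1%"), (1.960, "5%"), (1.645, "10%")]


def significance_level(t_stat: float) -> str:
    """Return the strictest conventional level at which |t| is significant."""
    for cutoff, label in CRITICAL_VALUES:
        if abs(t_stat) >= cutoff:
            return label
    return "not significant"


def economic_significance(coef: float, benchmark: float) -> float:
    """Effect of a one-unit change, as a percentage of a benchmark value."""
    return coef / benchmark * 100.0


levels = {name: significance_level(t) for name, (_, t) in reported.items()}
effect_pct = economic_significance(0.0028, 0.0103)  # sample median of R&D

for name, level in levels.items():
    print(name, level)
print(round(effect_pct, 2))
```

Under these cutoffs, only Read_rare (t = 2.40) falls between the 5% and 1% thresholds, which matches the levels stated in the paragraph above.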

Table 3 Annual report readability and corporate R&D investment

Turning to our control variables, we find that firms with lower financial leverage, fewer listed years, a lower book-to-market ratio, and a lower industry Herfindahl index invest more in R&D, which is consistent with Zhong (2018). In addition, we find positive and statistically significant coefficients on Rev, Roa, Employ, KL, and GDP, confirming that higher profitability, abundant human capital, and a highly developed macro-economy encourage firms’ innovation inputs.

Robustness tests

Endogeneity concern

Factors that affect corporate innovation decisions are numerous and complex, so our study may suffer from omitted latent variables. Additionally, both annual report disclosure and R&D investment result from managerial decisions, which implies that managers may adjust the readability of the annual report according to current R&D expenditures. Thus, our findings may be challenged by reverse causality. We therefore conduct several tests to mitigate these endogeneity concerns, and the results are reported in Table 4. Results for the control variables are not presented for brevity since they are similar to those in Table 3. Because Read_comp is the sum of Read_reverse, Read_rare, and Read_jargon, it is more representative than any of the three component variables. Given this, we use Read_comp as the primary explanatory variable.

Table 4 Regression results of endogeneity concerns

We use the instrumental variable (IV) method with 2SLS regression to address the omitted variable problem. We take the educational background of the firm’s Chief Financial Officer (CFO) as an instrumental variable for annual report readability. First, innovation activities, as an essential strategy, are primarily decided by the chairman of the board and the CEO, so the CFO’s educational background is unlikely to affect R&D investment directly, which meets the exogeneity assumption of instrumental variables. Second, the CFO has major responsibility for the content and quality of the annual report. A stronger educational background implies richer professional and theoretical knowledge, more careful thinking, and a stronger ability to understand and distill critical information from complex situations, so that the CFO can communicate with outsiders more concisely. Thus, it meets the relevance assumption of instrumental variables. We use the highest educational degree obtained by the CFO and define the variable CFO_degree to represent the CFO’s educational background. CFO_degree equals 1 for technical secondary school or below, 2 for technical school, 3 for a bachelor’s degree, 4 for a master’s degree, and 5 for a doctoral degree. As shown in Column (1) of Table 4, the coefficient on CFO_degree is positive and significant, indicating that the selected IV explains annual report readability and does not suffer from the weak instrument problem. The second-stage regression of 2SLS in Column (2) of Table 4 shows that our inferences are unchanged.
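The logic of the two stages can be illustrated with a deliberately simplified sketch: one instrument, one endogenous regressor, no control variables, and made-up toy data. None of the numbers below come from the paper. The omitted factor is constructed to be uncorrelated with the instrument so that the contrast between OLS and 2SLS is exact.

```python
from statistics import fmean


def covariance(a, b):
    """Sample covariance of two equally long sequences."""
    mean_a, mean_b = fmean(a), fmean(b)
    return sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b)) / (len(a) - 1)


def ols_slope(x, y):
    """Slope from a simple regression of y on x (with intercept)."""
    return covariance(x, y) / covariance(x, x)


def two_stage_slope(z, x, y):
    """2SLS with a single instrument z for the endogenous regressor x."""
    first_stage = ols_slope(z, x)                       # stage 1: x on z
    mean_x, mean_z = fmean(x), fmean(z)
    x_hat = [mean_x + first_stage * (zi - mean_z) for zi in z]
    return ols_slope(x_hat, y)                          # stage 2: y on fitted x


# Hypothetical toy data (illustration only).
cfo_degree = [1, 2, 3, 4, 5]                # instrument, coded as in the text
omitted = [1, -2, 0, 2, -1]                 # unobserved factor, uncorrelated with z
readability = [0.5 * z + u for z, u in zip(cfo_degree, omitted)]
rd = [2.0 * r + 3.0 * u for r, u in zip(readability, omitted)]  # true slope is 2

ols_estimate = ols_slope(readability, rd)
iv_estimate = two_stage_slope(cfo_degree, readability, rd)
print(ols_estimate, iv_estimate)
```

In this toy setting, OLS absorbs the omitted factor and overstates the slope, while the 2SLS estimate recovers the slope used to generate the data. The paper’s specification additionally includes the controls and fixed effects of Eq. (1), which this sketch omits.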

The readability of the annual report may be the consequence of management characteristics. We use Heckman’s two-step procedure to mitigate the self-selection problem. We select the management’s male ratio (Gender), average age (Mage), average educational background (Degree), average receiving salary percentage (Paid), management shareholdings (Share), average overseas background (Oversea), and average financial background (Finback) as instrumental variables for annual report readability (see Footnote 4).

As shown in Column (3) of Table 4, the coefficients on the management characteristics are significant, confirming that individual characteristics affect annual report readability. Column (4) in Table 4 reports that after controlling for the inverse Mills ratio (IMR), the coefficient on Read_comp is still positive and significant, consistent with H1.
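For readers unfamiliar with the correction term, the inverse Mills ratio is conventionally computed from the fitted index of a first-stage probit as the ratio of the standard normal density to the standard normal distribution function. The sketch below shows only that computation. The fitted index values are hypothetical, and the exact first-stage specification used in the paper is not reproduced here.

```python
from statistics import NormalDist

_STD_NORMAL = NormalDist()  # mean 0, standard deviation 1


def inverse_mills_ratio(probit_index: float) -> float:
    """lambda(z) = pdf(z) / cdf(z), evaluated at a fitted probit index z."""
    return _STD_NORMAL.pdf(probit_index) / _STD_NORMAL.cdf(probit_index)


# Hypothetical fitted first-stage indexes for three firm-years.
fitted_index = [-1.0, 0.0, 1.0]
imr = [inverse_mills_ratio(z) for z in fitted_index]
print([round(value, 4) for value in imr])
```

The ratio falls as the fitted index rises, so observations that the first stage predicts poorly receive a larger correction when IMR enters the second-stage regression as an additional regressor.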

Moreover, we use propensity score matching (PSM) and entropy balancing (Hainmueller 2012) to relieve issues of omitted variables and sample selection bias. First, we define the indicator D_Readability, which equals one if a firm is in the top 20% of annual report readability and zero if it is in the bottom 20%, and use it as the dependent variable in a Probit regression on firm-level and other factors similar to the control variables in Eq. (1). Second, we calculate the propensity score and obtain the control group by one-to-one matching, or we calculate entropy weights for every covariate. After matching, the differences in covariates between the treatment group and the control group become insignificant. Finally, we re-estimate Eq. (1) using the matched sample or the weighted variables. As shown in Columns (5) and (6) of Table 4, these results do not change our conclusions.
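The grouping and matching steps can be sketched as follows. The Probit estimation is omitted, so the propensity scores are supplied directly as hypothetical inputs. The greedy nearest-neighbor rule without replacement is an assumption of this sketch, since the text does not state the matching algorithm, caliper, or replacement rule.

```python
def top_bottom_indicator(values, share=0.20):
    """Return 1 for the top `share` of values, 0 for the bottom, None otherwise."""
    n = len(values)
    k = int(n * share)
    order = sorted(range(n), key=lambda i: values[i])
    flags = [None] * n
    for i in order[:k]:
        flags[i] = 0
    for i in order[n - k:]:
        flags[i] = 1
    return flags


def match_one_to_one(treated, controls):
    """Greedy nearest-neighbor matching on propensity scores, without replacement.

    `treated` and `controls` map firm identifiers to propensity scores.
    Returns a list of (treated_id, control_id) pairs.
    """
    available = dict(controls)
    pairs = []
    for t_id, t_score in sorted(treated.items(), key=lambda item: item[1]):
        if not available:
            break
        c_id = min(available, key=lambda c: abs(available[c] - t_score))
        pairs.append((t_id, c_id))
        del available[c_id]
    return pairs


# Hypothetical readability scores for ten firms.
read_comp = [1.2, 2.9, 1.8, 2.4, 0.9, 2.1, 3.3, 1.5, 2.7, 2.0]
d_readability = top_bottom_indicator(read_comp)

# Hypothetical propensity scores from a first-stage Probit.
pairs = match_one_to_one(
    treated={"T1": 0.62, "T2": 0.35},
    controls={"C1": 0.30, "C2": 0.60, "C3": 0.90},
)
print(d_readability)
print(pairs)
```

Firms in the middle 60% receive no indicator value and drop out of the matched sample, which mirrors the top-versus-bottom comparison described above.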

We also conduct the following tests to mitigate endogeneity concerns. First, the China Securities Regulatory Commission issued announcement No. 24 in 2015, which modified the disclosure standard of “Management Discussion and Analysis” and significantly affected corporate qualitative disclosure. At the same time, there was no innovation policy shock, but the average R&D investment of listed firms increased dramatically after 2015 (t = 13.87). Therefore, we take announcement No. 24 as an external policy shock to test the causality between annual report readability and R&D investment. As shown in Columns (7) and (8) of Table 4, the coefficient on Read_comp after the policy shock is more positive and more significant than the coefficient before the policy shock (p-value for difference = 0.00). Second, disclosure standards require mandatory disclosure of R&D expenditures and allow voluntary disclosure of related innovation information in the annual report, so it is necessary to exclude R&D information from the annual report to mitigate reverse causality. We use the residual from regressing annual report readability on R&D investment as pure readability and re-estimate Eq. (1). Third, we estimate a firm fixed effects model to control for firm characteristics that are time-invariant or unobservable. As shown in Columns (9) and (10) of Table 4, the coefficient on Read_comp is positive and significant. The above tests do not change our conclusions.

Other robustness tests

We perform the following robustness tests, and the results are reported in Table 5. Koh and Reeb (2015) argue that missing R&D information in the annual report does not mean that there are no R&D activities, so replacing missing values with zero produces inaccurate results. Following Zhong (2018), we therefore either fill in missing data with the median value of R&D or retain only non-missing observations to re-examine Eq. (1). As shown in Columns (1) and (2) of Table 5, the coefficients on Read_comp are positive and significant at the 1% level, supporting our inferences. Then, we use alternative measures of Read_comp. First, the longer the sentences in the annual report, the more difficult it is for information users to read and understand. We construct Read_sent1 and Read_sent2 from the average sentence length and the ratio of long, difficult sentences to total sentences, respectively, each computed as one minus the standardized value of the underlying measure (see Footnote 5). Thus, annual report readability is measured by a new variable, Read_comp1, which is the sum of Read_comp, Read_sent1, and Read_sent2, where a greater value stands for higher readability. Second, the MD&A section of the annual report contains much qualitative information that reflects management’s sentiment and judgment. Accordingly, we use MD&A readability to measure annual report readability, calculated in the same way as Read_comp, and obtain a new variable, Read_comp2 (see Footnote 6). As shown in Columns (3) and (4) of Table 5, our conclusions do not change.
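The two treatments of missing R&D data can be sketched as below. Missing values are represented by None, the data are hypothetical, and the median is taken over the whole toy sample. Whether the paper computes the median over the full sample or within year or industry groups is not stated, so that choice is an assumption.

```python
from statistics import median


def fill_missing_with_median(values):
    """Replace missing entries (None) with the median of the observed entries."""
    observed = [v for v in values if v is not None]
    fill_value = median(observed)
    return [fill_value if v is None else v for v in values]


def drop_missing(values):
    """Keep only the observed entries."""
    return [v for v in values if v is not None]


# Hypothetical R&D-to-assets ratios; None marks a firm-year with no disclosure.
rd_ratio = [0.010, None, 0.020, 0.000, None, 0.030]

filled = fill_missing_with_median(rd_ratio)
retained = drop_missing(rd_ratio)
print(filled)
print(retained)
```

Note that a reported zero is kept as an observed value and only undisclosed firm-years are filled or dropped, which is the distinction Koh and Reeb (2015) emphasize.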

Table 5 Robustness tests

Furthermore, we add the current R&D investment level to Eq. (1) to control for the trend in innovation activities. We also take non-high-tech firms after 2012 as a subsample to avoid the shock of abnormal R&D investment and the 2012 disclosure policy on innovation information (see Footnote 7). Finally, considering the left truncation problem of R&D expenditures, we estimate Eq. (1) with a Tobit model. The results in Columns (5) to (7) of Table 5 suggest that these alternative specifications do not affect our conclusions.

Additional analysis

The channel through which readability impacts R&D investment

In this section, we explore the causal chain through which a more readable annual report enhances the level of R&D investment by improving the cash resources available for innovation. Our test is motivated by resource dependency theory: financial constraints and tunneling by large shareholders limit managers’ ability to invest in R&D activities, which forms the causal chain underlying our hypothesis.

According to the above analysis, higher readability of the annual report can enhance the information processing fluency of investors and reduce disclosure processing costs (Rennekamp 2012; Bonsall et al. 2017). Thus, it alleviates investors’ information asymmetry (Rjiba et al. 2021). On the one hand, as information asymmetry between the firm and investors decreases, investors’ evaluation of the firm improves, which further reduces financial constraints. On the other hand, with lower information asymmetry, external supervision from small investors becomes stronger and constrains the tunneling behaviors of large shareholders. Together, lower financial constraints and less tunneling provide a better resource base that encourages managers to spend more time and cash on risky activities (Li 2011; Chen et al. 2017), enhancing R&D investment.

We use the SA index (FC_SA) to measure financial constraints because it avoids endogeneity influences from cash flows, leverage, and other financial factors (Hadlock and Pierce 2010) (see Footnote 8). A higher value of FC_SA indicates lower financial constraints. Following Jiang et al. (2010), we proxy for the degree of tunneling with the ratio of other receivables to total assets (Tunneling), where a greater value represents a higher level of tunneling. We implement a mediation analysis to verify the channels in our argument. Specifically, we perform a two-stage analysis: the first stage explores whether increased annual report readability reduces financial constraints and tunneling behaviors, and the second stage tests whether lower financial constraints and tunneling promote R&D investment while controlling for the effect of readability.

In Columns (2) and (4) of Table 6, where the dependent variables are FC_SA and Tunneling, respectively, the coefficients on Read_comp are significantly positive and significantly negative at the 1% level, confirming that more readable annual reports mitigate financial constraints and tunneling behaviors by reducing investors’ information asymmetry. In Columns (3) and (5), we find that Read_comp and FC_SA are positively associated with R&D, and Tunneling is negatively associated with R&D, which suggests that lower financial constraints and weaker tunneling motivation improve the level of R&D investment. Sobel’s z-statistics are -1.86 and -1.99 for FC_SA and Tunneling, respectively, indicating that the mediation effects are statistically significant.
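As background on the test statistic, the standard Sobel formula combines the first-stage coefficient a (readability on the mediator) and the second-stage coefficient b (mediator on R&D) with their standard errors. The sketch below implements that formula and converts a z-statistic into a two-sided p-value under a normal approximation. The input coefficients in the example are hypothetical because the underlying Table 6 estimates are not reproduced in the text. Only the two z-statistics are taken from the paper.

```python
from math import sqrt
from statistics import NormalDist


def sobel_z(a: float, se_a: float, b: float, se_b: float) -> float:
    """Sobel z = a*b / sqrt(b^2 * se_a^2 + a^2 * se_b^2)."""
    return (a * b) / sqrt(b ** 2 * se_a ** 2 + a ** 2 * se_b ** 2)


def two_sided_p_value(z: float) -> float:
    """Two-sided p-value of a z-statistic under the standard normal."""
    return 2.0 * (1.0 - NormalDist().cdf(abs(z)))


# Hypothetical path coefficients and standard errors (illustration only).
z_example = sobel_z(a=0.5, se_a=0.1, b=0.4, se_b=0.1)

# z-statistics reported in the text for the two mediators.
p_fc_sa = two_sided_p_value(-1.86)
p_tunneling = two_sided_p_value(-1.99)
print(round(z_example, 3), round(p_fc_sa, 3), round(p_tunneling, 3))
```

Under this approximation, the reported statistics correspond to two-sided p-values of roughly 0.06 for FC_SA and just under 0.05 for Tunneling.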

Table 6 Mediation analysis

Collectively, the results in Table 6 confirm that annual report readability enhances the level of R&D investment through channels of relieving financial constraints and tunneling behaviors. Moreover, these results show the direct governance effect of disclosure readability on investors and managerial investment decisions.

Cross-sectional tests on the association between annual report readability and R&D investment

In this section, we conduct three cross-sectional tests to examine whether variation in investors’ information processing costs changes the effect of annual report readability on R&D investment. The results help assess whether the effect of annual report readability exhibits cross-sectional variation and whether our findings are reasonable, which mitigates the concern about omitted factors.

Managerial incentives to manipulate readability

We examine whether the effect of annual report readability on R&D investment is conditional on managerial incentives to manipulate disclosure readability. Specifically, because qualitative disclosure is “soft information” that is costly to monitor, management has incentives to manipulate textual readability to conceal negative information about the firm, and the resulting biased disclosure increases investors’ processing costs (Leuz and Wysocki 2016; Bushee et al. 2018; DeHaan et al. 2021). In contrast, firms with weak manipulating incentives display a normal level of annual report complexity that reflects their business transactions and regulatory policies, which helps investors better understand the firm. Therefore, we expect the association between annual report readability and corporate R&D investment to be stronger when management’s manipulating incentives are low.

First, we measure manipulating incentives with the level of managerial self-interest, represented by the ratio of free cash flow to total assets (FCF). Jensen (1986) argues that more free cash flow induces higher agency costs from managers, so a firm is more likely to manipulate disclosure if its free cash flow is high. Second, high internal control quality can detect opportunistic behaviors by management, which reduces the probability of disclosure manipulation. We proxy for low manipulating incentives with a high internal control index (ICQ) (see Footnote 9). Third, a firm with diversified businesses typically discloses more complex information, while its manipulating incentives are lower. Thus, we use the natural logarithm of one plus the number of major products (BUC) to measure business complexity. We re-estimate Eq. (1) after splitting the sample at the median of FCF, ICQ, and BUC, respectively.
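The sample split described above can be sketched as follows. The firm records are hypothetical, and assigning observations exactly at the median to the low group is an assumption because the text does not say how ties are handled.

```python
from statistics import median


def split_at_median(rows, key):
    """Split records into (low, high) groups at the sample median of `key`.

    Observations equal to the median go to the low group (an assumption).
    """
    cutoff = median(row[key] for row in rows)
    low = [row for row in rows if row[key] <= cutoff]
    high = [row for row in rows if row[key] > cutoff]
    return low, high


# Hypothetical firm-level records with free cash flow over total assets.
sample = [
    {"firm": "A", "FCF": 0.02},
    {"firm": "B", "FCF": 0.08},
    {"firm": "C", "FCF": -0.01},
    {"firm": "D", "FCF": 0.05},
]

low_fcf, high_fcf = split_at_median(sample, "FCF")
print([row["firm"] for row in low_fcf], [row["firm"] for row in high_fcf])
```

Eq. (1) would then be estimated separately on each group, and the same helper applies to ICQ and BUC by changing the key.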

As shown in Table 7, the effect of annual report readability on R&D investment is stronger for subsamples with low managerial manipulating incentives. The difference in this effect between the high and low subsamples is significant for internal control quality and business complexity. These results suggest that the influence of annual report readability on R&D investment depends on managerial manipulating incentives.

Table 7 Cross-sectional test: managerial manipulating incentives and the effect of annual report readability

The effect of readability on the information processing ability of investors

We then explore whether the relationship between annual report readability and R&D investment varies across investor types. Theoretically, sophisticated investors, such as institutional investors and financial analysts, have a stronger ability to process, analyze, and integrate information than non-sophisticated investors (Campbell et al. 2009). Consequently, small investors are more sensitive to the readability of the annual report because they face higher disclosure processing costs than large investors (Blankespoor et al. 2020). We thus expect that increased readability of the annual report will have a stronger effect on R&D investment when investors are less sophisticated, given their weaker ability to process disclosures.

We measure investor sophistication based on institutional investor ownership (Inshold), analyst following (Analyst), and media coverage (Media). Inshold equals the ratio of institutional investor shareholdings. Analyst is the natural logarithm of one plus the number of analyst reports for the firm over the fiscal year. Media is the natural logarithm of one plus the amount of media coverage for the firm over the fiscal year. Table 8 shows that the effect of annual report readability is more significant for the groups with low institutional ownership, low analyst following, and low media coverage (i.e., lower than the median). Consistent with our expectation, the lower processing ability of investors strengthens the governance effect of annual report readability, which in turn improves the information environment of investors.

Table 8 Cross-sectional test: information processing ability of investor and the effect of annual report readability

The spillover effects of peer annual report readability for R&D investment

We further explore the spillover effects of peer annual report readability for R&D investment. There is a significant “peer effect” phenomenon in firm behaviors. Roychowdhury et al. (2019) point out that management is motivated to learn incremental information related to investment decisions from peer firms in the same industry to reduce information uncertainty when internal information is limited. Some studies focus on how qualitative disclosures of peers affect the target firm’s real investment. Durnev and Mangen (2020) reveal that MD&A tone provides incremental information related to an investment decision for the target firm, and management can improve investment efficiency through an information learning mechanism.

R&D investment is riskier than other real investments, requiring management to collect valuable information to avoid innovation failure (Hall 2002). More readable texts reduce the target firm’s processing costs, making it easier for management to read and learn relevant information about innovation from peers’ annual reports. As a result, lower information uncertainty encourages managers to seize investment opportunities and improve R&D investment efficiency. In conclusion, we expect that higher readability of peers’ annual reports is positively associated with the R&D investment level of the target firm.

We extend Eq. (1) by introducing peer readability (Peer_comp) as the explanatory variable and adding new control variables that reflect the characteristics of peer firms. Following Durnev and Mangen (2020), Peer_comp is the average of Read_comp in the same industry, excluding the target firm’s readability. In Columns (1) and (2) of Table 9, the coefficients on Peer_comp are positive and significant. After controlling for the characteristics of peer firms and firm fixed effects, as shown in Columns (3) and (4), respectively, the results do not change. These results suggest that annual report readability has positive peer spillover effects on corporate R&D investment.
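The construction of Peer_comp as a leave-one-out industry average can be sketched as follows. Grouping by industry and year is an assumption of this sketch, since the text only says "in the same industry", and the records and field names are hypothetical.

```python
from collections import defaultdict


def peer_readability(records):
    """Leave-one-out mean of read_comp within each industry-year.

    Returns a dict keyed by (firm, year). Firms without any peer get None.
    """
    totals = defaultdict(float)
    counts = defaultdict(int)
    for rec in records:
        group = (rec["industry"], rec["year"])
        totals[group] += rec["read_comp"]
        counts[group] += 1

    peer_comp = {}
    for rec in records:
        group = (rec["industry"], rec["year"])
        n_peers = counts[group] - 1
        if n_peers > 0:
            # Remove the target firm's own value before averaging.
            value = (totals[group] - rec["read_comp"]) / n_peers
        else:
            value = None
        peer_comp[(rec["firm"], rec["year"])] = value
    return peer_comp


# Hypothetical firm-year observations.
observations = [
    {"firm": "A", "industry": "X", "year": 2016, "read_comp": 2.0},
    {"firm": "B", "industry": "X", "year": 2016, "read_comp": 1.0},
    {"firm": "C", "industry": "X", "year": 2016, "read_comp": 3.0},
    {"firm": "D", "industry": "Y", "year": 2016, "read_comp": 2.5},
]

peer_comp = peer_readability(observations)
print(peer_comp)
```

Excluding the target firm’s own value keeps Peer_comp from mechanically correlating with the firm’s own Read_comp, which is the purpose of the leave-one-out construction.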

Table 9 Spillover effects of annual report readability for R&D investment

Conclusion

Our study explores the association between annual report readability and corporate R&D investment. We argue that higher readability of the annual report helps investors process disclosures fluently and reduces information asymmetry, which relieves funding shortages and encourages management to invest in R&D activities. Using a sample of Chinese listed firms, we find that increased readability of the annual report enhances R&D investment levels. The mechanism test reveals that the effect of readability works through the channels of mitigating financial constraints and tunneling behaviors, which supports the causal chain in our argument. The impact of readability on R&D investment varies with the degree of managerial manipulating incentives and investors’ ability to process disclosures. Moreover, the spillover effects of annual report readability on R&D investment are significant.

Overall, our research suggests that annual report readability not only affects investors’ decisions directly but also influences managers’ real investment decisions indirectly, which responds to the call of Roychowdhury et al. (2019) and Blankespoor et al. (2020) to examine how firm information disclosures affect management behaviors. Moreover, our study provides novel insights into the growing literature on the real economic effects of qualitative disclosure properties. Hence, simplifying qualitative information in the annual report helps a firm improve its innovation efforts and quality.

We acknowledge that our study has limitations. First, the construction of indicators for annual report readability in the Chinese setting is relatively subjective and cannot include all the features that reflect disclosure readability. Future research can identify textual complexity using convolutional neural network algorithms in deep learning and Word2Vec-based language feature engineering. Second, in common with most studies (Biddle et al. 2009; Blankespoor et al. 2020), although we make great efforts to address reverse causality, omitted variables, and selection bias, we cannot wholly resolve all potential endogeneity concerns. Future research can focus on more rigorous approaches to mitigating endogeneity issues. Third, we only consider a single mechanism through which annual report readability affects R&D investment, namely the resource dependency perspective. As suggested by Roychowdhury et al. (2019), researchers can explore mechanisms based on information uncertainty in the future.