Introduction

Disclosure of corporate social responsibility (CSR) activities has increased significantly over the years. Across the globe, CSR reporting requirements have more than doubled since 2013, with governments, financial market regulators, and stock exchanges issuing most of the reporting guidelines (KPMG 2016). At the same time, both socially responsible investment funds and mainstream investors increasingly evaluate and incorporate firms’ CSR performance when making investment decisions (CDP 2019; PWC 2019). More and more firms now release stand-alone CSR reports (see Footnote 1), which provide comprehensive and in-depth information about firm-level performance in various social, environmental, and governance-related domains. According to the Governance and Accountability Institute (2018), 85% of S&P 500 companies published CSR reports in 2017, up from less than 20% in 2011.

CSR reporting is different from financial reporting in several ways. While financial reporting is mandatory, verifiable, and enforced through methods including external audit, litigation, and regulatory oversight, CSR reporting is voluntary in most countries, largely unregulated, and does not have a widely enforced reporting framework (Perrini 2006; Tschopp and Huefner 2015). In addition, while financial reporting targets the investor community and focuses primarily on financial data, the targeted audiences of CSR reporting consist of various stakeholders, such as customers, employees, business partners, advocacy groups, and investors (Perrini 2006). CSR reports primarily include textual, non-quantifiable information regarding firms’ policies, practices, and performance in social, environmental, and governance domains (Dhaliwal et al. 2011; Du et al. 2017). Given the descriptive and non-financial nature of CSR reports, the textual properties, such as readability and tone, naturally play a prominent role in determining the effectiveness of CSR communication and in shaping the information content of those reports.

Effective information disclosure and information transparency are key aspects of corporate ethical behavior that could build stakeholder trust and sustainable competitive advantage (Das Neves and Vaccaro 2013; Jones et al. 2018). Whether and how the textual characteristics of CSR reports convey useful information and improve transparency is a question of importance to investors, managers, regulators, and other stakeholders. The Securities and Exchange Commission (SEC) is evaluating the importance and effectiveness of CSR disclosure. In particular, the SEC states, “we seek feedback on which, if any, sustainability and public policy disclosures are important to an understanding of a registrant’s business and financial condition and whether there are other considerations that make these disclosures important to investment and voting decisions” (SEC 2016). Furthermore, the Sustainability Accounting Standards Board (SASB) argues that sustainability accounting should have both confirmatory and predictive value, so that it can be used for future planning and decision support (SASB 2017). While a few studies have examined how contemporaneous CSR performance affects CSR report readability (e.g., Nazari et al. 2017; Wang et al. 2018) and how CSR disclosure quality influences analyst forecast (Muslu et al. 2019), it remains unknown whether and how the textual attributes of CSR reports help predict future CSR performance and influence investors’ trading behavior. This study examines the implications of CSR report readability and tone for future CSR performance and the market reaction around the release of CSR reports, and in so doing, sheds light on how firms could effectively reduce information asymmetry through CSR reporting.

Using a hand-collected dataset of Fortune 500 companies that published stand-alone CSR reports from 2002 to 2014, we document a positive association between 1-year-ahead CSR performance and the changes in CSR report readability and tone. The change in CSR report readability is also predictive of 2-year-ahead CSR performance. These results suggest that increases in readability and tone in a firm’s CSR report are indicative of better future CSR performance. Furthermore, we find that the stock market reacts significantly to the changes in report readability and tone around the release of a CSR report. Specifically, the abnormal trading volume around report issuance dates is positively associated with the change in report readability, suggesting that more readable CSR reports spur trading by releasing more value relevant information to investors or increasing information precision. In line with the finding that enhanced readability of CSR reports is indicative of better future CSR performance, there is a positive association between the abnormal returns and the change in CSR report readability. With regard to CSR report tone, we find that the abnormal returns around the release of CSR reports are positively associated with tone change, but there is no trading volume reaction to tone change.

This paper contributes to the literature on discretionary information disclosure in general and CSR reporting in particular. Truthful and effective information disclosure is an integral part of corporate ethical behavior and stakeholder relationship management (Das Neves and Vaccaro 2013; Martinez-Ferrero et al. 2016; Cui et al. 2018; Jones et al. 2018). As Jones et al. (2018, p. 375) stated, ethical firms engage in behaviors of, among others, “refraining from taking advantage of power imbalances or information asymmetries, and willingly sharing relevant information.” Yet, how to truthfully and effectively communicate complicated and multi-dimensional information such as CSR performance is not straightforward. By analyzing how both readability and tone of CSR reports convey information about future CSR performance and demonstrating the value relevance of these two textual attributes, this study substantiates the important roles of CSR report readability and tone in increasing information transparency and imparting value relevant information to the market.

As importantly, this study advances current understanding of the information content of CSR reports. To the best of our knowledge, this is the first study that investigates stock market reaction to CSR report readability and tone. Prior studies have found that issuing stand-alone CSR reports reduces the cost of capital (Dhaliwal et al. 2011) and analyst forecast error (Dhaliwal et al. 2012), and moves stock prices (Du et al. 2017). Muslu et al. (2019) show that the overall disclosure score of CSR reports influences analyst forecast accuracy. Our study complements this line of research by shedding light on the possible channel through which CSR reports affect analyst forecast and stock prices: the readability and tone of CSR reports may help analysts and investors predict value relevant future CSR performance. The results also provide support for the SASB’s view that CSR disclosure should have predictive value to be useful for decision making.

Furthermore, the significant market reaction to CSR report readability and tone demonstrates not only investors’ demand for information contained in CSR reports, but also positive business returns, in the form of higher stock prices, for socially responsible firms that effectively communicate their superior CSR performance using a positive tone and readable text. More generally, in line with instrumental stakeholder theory (Freeman 1999; Jones et al. 2018), our results suggest that there is a strong business case for sustainability reporting and that managers could reduce information asymmetry by using textual properties (e.g., readability and tone) to communicate future CSR performance information.

Our analysis is subject to several caveats. First, because no word list is specifically tailored to CSR reports, we calculate report tone using the word lists of Loughran and McDonald (2011), which were developed from financial reports and thus may not accurately capture the positive and negative words of CSR reports. Given the differences between financial and CSR reports, future research should generate word lists focused on the CSR context. Second, we only study readability and tone of CSR reports in our analysis. Future research may examine the effects of other textual aspects of CSR reports, such as numerical and horizon content, boilerplate, and specificity, on the stock market.

The rest of the paper is organized as follows. Section 2 reviews prior literature and develops key arguments for our hypotheses. Section 3 outlines the research methodology. Section 4 describes the sample and descriptive statistics. Section 5 reports the main empirical results. Section 6 presents additional analyses. Section 7 provides concluding remarks and discussion.

Literature Review and Predictions

CSR Reporting

CSR reporting has been on the rise over the past decades (Cho et al. 2015). In their study on the evolution of CSR reporting, Tschopp and Huefner (2015, p. 565) state, “(CSR reporting) seems destined to become a key part of the overall accounting reporting framework, joining external financial reporting, income tax reporting, regulatory reporting, and internal reporting.” CSR reports, ranging from several dozen to several hundred pages in length, provide comprehensive and in-depth information about firms’ social and environmental performance. Compared with alternative CSR disclosure methods, such as communication on corporate websites, individual social/environmental data disclosure, or CSR information in financial reports, stand-alone CSR reports are unique in that they provide greater depth and breadth of information about corporate social performance in all key domains (e.g., employee welfare, diversity, community outreach, product safety, environment) and serve as a one-stop source of CSR performance information for stakeholders.

CSR performance information could be value relevant because CSR practices can enhance firm financial performance. Socially responsible firms enjoy higher brand equity (Torres et al. 2012) and greater customer satisfaction and loyalty (Luo and Bhattacharya 2006; Ailawadi et al. 2014). Firms with superior CSR performance have advantages in attracting, motivating, and retaining talented employees (Turban and Greening 1997; Surroca et al. 2010). CSR is an important means for boosting firm productivity (Hasan et al. 2018) and innovation (Luo and Du 2015). More generally, a positive record of CSR performance helps a firm attain legitimacy and the license to operate in local communities, as well as receive more favorable treatment from the media and the regulators (Fombrun et al. 2000). Furthermore, the goodwill derived from CSR can act as “an insurance policy” that minimizes firm risk (Klein and Dawar 2004; Godfrey et al. 2009). Overall, prior literature has documented positive effects of firm CSR performance on financial performance and market value (Khan et al. 2016; Margolis et al. 2007; Servaes and Tamayo 2013; Hasan et al. 2018), consistent with the value relevance of CSR performance information.

Since information about firm CSR performance is value relevant, publishing CSR reports could enable market participants to have a better understanding of firm performance and increase financial transparency. Prior studies have found that the issuance of CSR reports reduces the cost of equity capital (Dhaliwal et al. 2011), enhances analyst forecast accuracy (Dhaliwal et al. 2012), and triggers significant stock market reaction (Du et al. 2017). Nevertheless, it remains, to date, largely a black box as to what information in CSR reports has predictive value and influences investors’ trading decisions. We seek to shed light on these questions by examining whether the textual attributes of CSR reports—readability and tone—convey information about future CSR performance and influence market reaction to CSR reports.

Effects of CSR Report Readability on Future CSR Performance and Market Reaction

The strategic reporting literature suggests that management tends to be more forthcoming in disclosure when the firm is performing well, but has incentives to obfuscate information when firm performance is poor (Schrand and Walther 2000). Li (2008) finds that firms with less readable annual reports have lower subsequent earnings, suggesting that management tries to hide poor future performance from investors by increasing the complexity of annual reports.

In the case of CSR reports, the incentive to obfuscate CSR performance information is also likely to exist when managers anticipate inferior CSR performance in the future. Stakeholders react favorably to positive CSR performance and unfavorably to negative CSR performance (Fombrun et al. 2000). When learning about a firm’s poor CSR performance, stakeholders are likely to sanction the firm through negative word-of-mouth, boycotts or brand switching, and increased employee turnover (Klein and Dawar 2004; Godfrey et al. 2009; Surroca et al. 2010). To the extent that less readable CSR reports can hide the bad news of poor future CSR performance by increasing stakeholders’ information processing costs, firms with poor future CSR performance have incentives to issue less readable CSR reports. In contrast, firms with favorable future CSR performance are likely to issue CSR reports that are transparent and easier to read, in order to reap goodwill and the associated business benefits from their various stakeholders (Du et al. 2010). Thus, the management obfuscation hypothesis suggests that lower CSR report readability indicates less favorable future CSR performance.

On the other hand, because CSR reports are voluntary, unaudited, and do not follow a mandatory reporting framework (Perrini 2006), managers have significant discretion in deciding what CSR information to report, or in some cases, whether to release a CSR report or not. Instead of obfuscating information by using complex words/sentences, opportunistic managers may choose to omit areas of concern or disclose less when CSR performance is poor. Supporting this view of selective reporting, Clarkson et al. (2008) find that firms with higher environmental performance have higher levels of discretionary environmental disclosures (i.e., a greater number of disclosure items). Similarly, Nazari et al. (2017) find that CSR performance is positively related to the length of CSR reports, suggesting that firms with superior CSR performance disclose more, whereas those with inferior CSR performance disclose less. If managers resort to selective reporting, instead of information obfuscation, to hide poor future performance, then report readability is less likely to be predictive of future CSR performance.

In examining the stock market reaction to CSR report readability, we look at both abnormal trading volume and abnormal returns around the release of CSR reports. Prior literature suggests that investors are less likely to rely upon less readable financial reports due to higher information processing cost and lower information precision (Kim and Verrecchia 1991; Bloomfield 2002). Consistent with this argument, several studies (Miller 2010; Franco et al. 2015) have documented lower abnormal trading volume around the release of less readable 10-Ks or analyst reports. Given the large amount of descriptive and non-quantifiable information regarding various CSR domains (e.g., employee welfare, product safety, environment, and community relations) in CSR reports, it would be difficult for investors to assess CSR information in less readable CSR reports and its implications for future financial performance. In contrast, investors may make greater use of CSR performance information in their trading decisions if CSR reports are more readable and contain more transparent information. Therefore, we expect that the trading volume reaction to the release of CSR reports is likely to be stronger when the reports are more readable (see Footnote 2).

Furthermore, CSR report readability is likely to have a favorable impact on the abnormal returns around the release of CSR reports. First, the large body of research on the link between CSR performance and financial performance reveals an overall positive relationship (Margolis and Walsh 2003; Margolis et al. 2007; Servaes and Tamayo 2013; Hasan et al. 2018; see Footnote 3), pointing to the value relevance of CSR performance due to its positive effects on stakeholder satisfaction (Luo and Bhattacharya 2006; Surroca et al. 2010), moral capital (Godfrey et al. 2009), productivity (Hasan et al. 2018), innovation (Luo and Du 2015), and so on. If firms with more readable CSR reports tend to have better future CSR performance, and consequently, better future financial performance, CSR report readability should be positively associated with the abnormal returns. Second, empirical evidence has pointed to lower firm risk (Loughran and McDonald 2014) and cost of capital (Ertugrul et al. 2017) associated with more readable financial disclosure. Dhaliwal et al. (2011, 2012) suggest that CSR disclosure could reduce information asymmetry among investors or between managers and investors, and decrease the cost of capital. Firms with more readable CSR reports are likely to enjoy a greater reduction in information asymmetry and the cost of capital, suggesting an increase in stock prices around the release of such reports.

Taken together, if higher CSR report readability reduces information processing cost and increases information precision, then more readable CSR reports should release more digestible and actionable information to investors, leading to a positive association between the change in report readability and trading volume reaction to the release of CSR reports. With regard to price reaction to CSR report readability, if improved report readability is indicative of better future performance or leads to a lower cost of capital, one would expect a positive association between the abnormal returns around the release of CSR reports and the change in report readability.

Effects of CSR Report Tone on Future CSR Performance and Market Reaction

To convey information about CSR performance, managers can use both numerical/quantifiable indicators (e.g., key performance indicators, year over year comparisons) and textual, non-quantifiable indicators in the CSR reports. However, CSR performance information relies upon qualitative, textual description to a much greater extent as compared to financial performance information. For instance, in talking about employee wellbeing, a key aspect of CSR performance, 3M’s 2014 CSR report contains detailed textual description on how the company addresses issues such as employee benefits, education and career growth, health and wellness, employee engagement, and global safety. Even when numerical indicators are used to describe aspects of CSR performance (e.g., environmental performance), they are not as informative as the numerical indicators (e.g., sales, profit) in financial reports because companies can report on different quantitative indicators due to a lack of uniform reporting framework (Clarkson et al. 2008). Thus, the predominantly textual nature of CSR reports entails that managers are more likely to use tone to communicate hard-to-quantify information.

A priori, it is not straightforward whether CSR report tone is indicative of future CSR performance. The truthful disclosure hypothesis (e.g., Tetlock et al. 2008; Davis et al. 2012, 2015) suggests that managers use positive and negative words to convey private and hard-to-quantify information, and to signal their expectation about future financial performance. Supporting the view that CSR reports contain credible information about CSR performance and are relevant for assessing firm performance, Dhaliwal et al. (2011, 2012) find that the issuance of CSR reports reduces cost of equity capital and analyst forecast error; similarly, Muslu et al. (2019) document a positive association between CSR disclosure quality and analyst forecast accuracy. Under this view, an increase in CSR report tone would be indicative of higher future CSR performance.

On the other hand, the opportunistic disclosure motive is likely to play a role as well in CSR reporting. Prior literature suggests that greenwashing and impression management are common when it comes to CSR communication (Patten 1992; Cho et al. 2010; Du et al. 2010; Mahoney et al. 2013). Because CSR reporting remains unregulated and CSR performance information is not easily verifiable, opportunistic managers may manipulate tone to mislead investors about firms’ future CSR performance. Cho et al. (2010) find evidence that firms with low environmental performance use biased language and tone (i.e., more optimism and less certainty) to present a more favorable depiction of their performance. If the opportunistic disclosure motive prevails, then one would expect a negative or non-significant association between tone and future CSR performance.

While the truthful and opportunistic disclosure motives are likely to coexist in CSR reporting (Clarkson et al. 2008; Du et al. 2010), it is worth noting that there are mechanisms functioning to constrain managers’ opportunistic disclosure motive in CSR reporting. Stakeholders have access to not only company-issued CSR reports, but also a variety of CSR information from independent third parties such as CSR ratings by KLD, Newsweek Green rankings, environmental impact information by CDP, and the mass media. These third-party information intermediaries can deter managers from information distortion as stakeholders react negatively when they detect false information in CSR communication from the corporate sources. For example, Parguel et al. (2011) find that consumers rely upon independent, third-party sustainability ratings to evaluate a firm’s corporate-controlled CSR communication.

Turning to the effect of CSR report tone on the market reaction, previous studies have provided evidence that the market reacts positively to upward tone revision in financial reports. For example, Feldman et al. (2010) find that tone change in the MD&As is positively associated with abnormal returns surrounding the filing of the MD&As. Similarly, Davis et al. (2012) document a positive association between tone revision and abnormal returns around earnings press releases. If more positive CSR report tone indicates better future CSR performance, there should be a positive association between abnormal returns around the release of CSR reports and tone change. Given that trading volume reaction is not conditional on the direction of news released by CSR reports, we expect a positive association between the abnormal trading volume and the magnitude of tone revision in CSR reports.

In summary, if the overall sentiment of a CSR report conveys value relevant information about future CSR performance, tone change should be positively associated with the abnormal returns, and the magnitude of tone change should be positively linked with the abnormal trading volume. However, if managers manipulate tone opportunistically, the effect of CSR report tone on the market reaction would be weakened to the extent that investors may see through tone management. Thus, whether and how CSR report tone may affect future CSR performance and market trading activities remain open questions and demand further empirical investigation.

Research Methodology

Measuring Readability and Tone of CSR Reports

We use the Fog Index to measure CSR report readability in our main analyses (Li 2008). The Fog Index, developed by Robert Gunning, is a widely used formula for appraising readability. It captures text complexity as a function of two components: the average number of words per sentence and the percentage of complex words (i.e., words with more than two syllables). It is defined as follows.

$$\text{Fog Index} = 0.4 \times \left( \text{words per sentence} + \text{percentage of complex words} \right)$$
(1)

The Fog Index reflects the number of years of formal education that a reader of average intelligence would need to read and understand the text. In general, a value of the Fog Index above or equal to 18 indicates that the text is unreadable; 14–18 difficult to comprehend; 12–14 ideal; 10–12 acceptable; and 8–10 childish. For ease of presentation and interpretation of the readability coefficients in our empirical analyses, we scale the Fog Index by −100 (i.e., FOG is the Fog Index divided by −100). As a result, higher values of FOG indicate more readable CSR reports.
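As an illustration of Eq. (1) and the scaling by −100, the following is a minimal Python sketch. The tokenization rules and the vowel-group syllable counter are simplifying assumptions made for this example and are not the procedure used in the study; the function names are hypothetical.

```python
import re


def count_syllables(word: str) -> int:
    """Approximate syllables as the number of vowel groups (a rough heuristic)."""
    groups = re.findall(r"[aeiouy]+", word.lower())
    return max(1, len(groups))


def fog_index(text: str) -> float:
    """Fog Index = 0.4 * (words per sentence + percentage of complex words)."""
    # Assumption: sentences end at '.', '!' or '?'; abbreviations are not handled.
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = re.findall(r"[A-Za-z]+", text)
    if not sentences or not words:
        raise ValueError("text must contain at least one sentence and one word")
    # Complex words are those with more than two syllables.
    n_complex = sum(1 for w in words if count_syllables(w) > 2)
    words_per_sentence = len(words) / len(sentences)
    pct_complex = 100.0 * n_complex / len(words)
    return 0.4 * (words_per_sentence + pct_complex)


def fog_scaled(text: str) -> float:
    """FOG: the Fog Index divided by -100, so higher values mean more readable text."""
    return fog_index(text) / -100.0
```

Under this sketch, a text made of short sentences and short words yields a FOG value close to zero, while long sentences with many multi-syllable words push FOG further below zero.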

Loughran and McDonald (2014) argue that the Fog Index is a poor proxy for measuring readability of financial documents, because, among other reasons, many complex words may be quite easy to understand for investors and financial analysts. They recommend using file size as a readability measure for 10-K reports, since longer documents have higher information processing cost and are more difficult to read.

While acknowledging the limitation of the Fog Index, we choose to use the Fog Index as a measure for CSR report readability for two reasons. First, due to the discretionary nature of CSR reporting, file size is more likely to be correlated with the amount of CSR disclosure and thus is not a good proxy for report readability. Prior research suggests that disclosure length is a positive indicator of disclosure transparency and informativeness, particularly in the case of CSR disclosure (Lang and Stice-Lawrence 2015; Nazari et al. 2017; Muslu et al. 2019). Dhaliwal et al. (2011) argue that the length of a CSR report is a proxy for a firm’s efforts and commitment to better disclosure. Second, unlike financial reports, CSR reports have targeted audiences including not only investors and financial analysts, but also a variety of other key stakeholders, such as consumers, employees, business partners, community members, media, and so on. Complex words that could be easily understood by investors or analysts may not be comprehensible for employees and other stakeholder groups. In this regard, using the Fog Index as a proxy for readability raises less concern for CSR reports than for financial reports.

To measure CSR report tone, we use the list of positive and negative words compiled by Loughran and McDonald (2011) (LM list; see Footnote 4). Prior research (see Loughran and McDonald 2016 for a review) suggests that the LM list is more appropriate and relevant for business disclosure than alternative lists, such as Diction and Harvard General Inquirer word lists. For example, many negative words in the Harvard General Inquirer list are used to describe financial aspects (e.g., cost, tax, and foreign), corporate governance (e.g., board and vice), and industries (e.g., mine, tire, and crude), and are typically not negative in business disclosure (Loughran and McDonald 2011). In addition, the LM list also includes negative words that are used in the business context, such as restated, litigation, and restructuring. Following prior research, we define POS (NEG) as the number of positive (negative) words scaled by the number of total words in a CSR report. TONE is used to gauge the overall sentiment of a report and is defined as the difference in the proportions of positive and negative words (i.e., POS minus NEG).
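The definitions of POS, NEG, and TONE can be illustrated with a short Python sketch. The two word sets below are tiny placeholders chosen for this example only: the negative words are those mentioned above, the positive words are hypothetical, and neither set is the actual LM list. The tokenization is likewise an assumption.

```python
import re

# Placeholder word sets for illustration only; these are NOT the LM lists.
POSITIVE_WORDS = {"achieved", "improved", "strong"}          # hypothetical
NEGATIVE_WORDS = {"restated", "litigation", "restructuring"}  # examples named in the text


def tone_measures(text, positive=POSITIVE_WORDS, negative=NEGATIVE_WORDS):
    """Return POS, NEG and TONE = POS - NEG for one report."""
    # Assumption: words are lowercase alphabetic tokens.
    words = re.findall(r"[a-z]+", text.lower())
    if not words:
        raise ValueError("text contains no words")
    total = len(words)
    pos = sum(1 for w in words if w in positive) / total
    neg = sum(1 for w in words if w in negative) / total
    return {"POS": pos, "NEG": neg, "TONE": pos - neg}
```

For instance, `tone_measures("We achieved strong results despite litigation")` counts two positive words and one negative word among six words, so TONE is 2/6 − 1/6 = 1/6.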

Tetlock et al. (2008) and Davis et al. (2012) suggest that investors use the textual properties of financial reports in the past year as the benchmark and react only to changes in readability and tone. Furthermore, prior research (Feldman et al. 2010) suggests that relative to the changes in readability and tone, the levels of textual properties are more likely to be affected by the boilerplate usage of certain words in an industry or a firm as well as the choice of a particular word list. In addition, we recognize that the second component (percentage of complex words) of the Fog Index could be a potentially misleading factor in measuring readability (Loughran and McDonald 2014), given that some multi-syllable words, such as sustainability and environment, are common CSR terms and may be easily understood by stakeholders. If these multi-syllable words and the boilerplate usage of certain words are used in a similar way across years for a given company, using changes in readability and tone should mitigate concerns about measurement error associated with level measurements. Therefore, in our empirical analysis, we focus on changes in report readability and tone, and examine their effects on future CSR performance and the market reaction around the release of CSR reports.

Measuring CSR Performance

Following prior research (e.g., Dhaliwal et al. 2011, 2012; Kim et al. 2012; Servaes and Tamayo 2013), we use KLD ratings provided by MSCI ESG Research (formerly KLD Research and Analytics Inc.) as the proxy for overall firm CSR performance. KLD ratings are among the most influential and most widely accepted measures of CSR performance used by academics (Dhaliwal et al. 2011). Since 2003, KLD ratings cover the 3,000 largest U.S. companies and provide CSR performance ratings in key social and environmental domains.

KLD data include seven different CSR domains (environment, community, diversity, employee relations, product, human rights, and corporate governance) and provide the numbers of strengths and concerns for each domain. Since the numbers of strength and concern indicators have changed over the years in the KLD dataset, we scale the number of total strengths (concerns) for each firm-year by the maximum possible number of strengths (concerns) in each year to obtain the corresponding strength and concern indices that range from 0 to 1, and then subtract the concern index from the strength index to get the net CSR performance that ranges from −1 to +1 (for similar transformation of KLD data, see Waddock and Graves 1997; Servaes and Tamayo 2013).
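The scaling described above can be sketched in a few lines of Python. The function name and argument names are hypothetical, and the indicator counts in the usage note are made-up numbers rather than actual KLD values.

```python
def net_csr_performance(strengths, concerns, max_strengths, max_concerns):
    """Net CSR performance = strength index - concern index, in [-1, +1].

    strengths / concerns: total counts for one firm-year.
    max_strengths / max_concerns: maximum possible counts in that year.
    """
    if max_strengths <= 0 or max_concerns <= 0:
        raise ValueError("maximum possible counts must be positive")
    if not 0 <= strengths <= max_strengths:
        raise ValueError("strengths must lie between 0 and max_strengths")
    if not 0 <= concerns <= max_concerns:
        raise ValueError("concerns must lie between 0 and max_concerns")
    strength_index = strengths / max_strengths  # ranges from 0 to 1
    concern_index = concerns / max_concerns     # ranges from 0 to 1
    return strength_index - concern_index
```

As a usage example with illustrative numbers, a firm-year with 6 of 30 possible strengths and 3 of 30 possible concerns has a strength index of 0.2, a concern index of 0.1, and a net score of about 0.1.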

Empirical Models

We use the following model to test the effects of CSR report readability and tone on future CSR performance.

$$\text{CSRP}_{t+1} = \beta_{0} + \beta_{1}\text{CSRP}_{t} + \beta_{2}\text{CSRP}_{t-1} + \beta_{3}\text{SIZE}_{t} + \beta_{4}\text{ROA}_{t} + \beta_{5}\text{LEV}_{t} + \beta_{6}\text{FIN}_{t} + \beta_{7}\text{LIQUID}_{t} + \beta_{8}\text{RD}_{t} + \beta_{9}\text{AF}_{t} + \beta_{10}\text{READ\_10K}_{t} + \beta_{11}\Delta\text{READ}_{t} + \beta_{12}\Delta\text{TONE}_{t} + \text{Industry and Year Fixed Effects} + \varepsilon_{t+1}$$
(2)

The variables of interest are ΔREADt and ΔTONEt. ΔREADt is the change in CSR report readability, as measured by FOG, from year t − 1 to year t. ΔTONEt is the change in CSR report tone from year t − 1 to year t. CSRPt+1, CSRPt, and CSRPt−1 are firms’ net CSR performance for year t + 1, t, and t − 1, respectively.
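The change variables defined above can be illustrated with a brief Python sketch. It assumes that a change is computed only when a report exists in both year t − 1 and year t; that treatment of gaps, the function name, and the sample values are assumptions made for this example.

```python
def yearly_changes(levels):
    """Map year t -> level_t - level_{t-1} for years with a report in t-1.

    levels: dict mapping report year to a textual measure (e.g., FOG or TONE).
    """
    return {
        year: levels[year] - levels[year - 1]
        for year in sorted(levels)
        if year - 1 in levels  # assumption: skip years without a prior-year report
    }
```

For example, with illustrative FOG values `{2010: -0.18, 2011: -0.17, 2013: -0.16}`, only 2011 has a prior-year report, so the sketch returns a single ΔREAD of roughly 0.01 for 2011, indicating a more readable report than in 2010.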

The control variables are taken from prior studies examining factors that affect CSR performance. We include firm size (SIZE) as larger firms have greater visibility and face more intense stakeholder pressure to engage in CSR (Smith 2003). SIZEt is the natural logarithm of total assets at the end of year t. We include ROAt to control for the positive association between financial performance and CSR performance (Margolis and Walsh 2003). ROAt is the return on assets, calculated as income before extraordinary items divided by total assets at the end of year t. We control for financial leverage (LEV) because firms with constrained financial resources are less likely to engage in CSR (Waddock and Graves 1997; Surroca et al. 2010). LEVt is calculated as total debt (i.e., short-term debt plus long-term debt) divided by total assets at the end of year t.

Prior studies suggest that a firm’s financing needs, stock liquidity, research and development (R&D) investment, and firm information environment might influence its incentive to engage in CSR (McWilliams and Siegel 2000; Dhaliwal et al. 2011). We include corporate financing activities (FIN), stock liquidity (LIQUID), research and development intensity (RD), and analyst following (AF) as additional control variables. Specifically, FINt is calculated as the net amount of debt and equity capital raised by the firm (i.e., the net sale of common and preferred shares plus the net issuance of long-term debt) during the year scaled by total assets at the end of year t. LIQUIDt is the number of shares traded divided by the number of shares outstanding for year t. RDt is R&D expenses deflated by sales for year t. AFt is defined as the natural logarithm of one plus the number of analysts at the end of year t.

Li (2008) shows that 10-K readability is related to future financial performance, which in turn may be associated with future CSR performance. We thus control for the potential effect of 10-K readability on future CSR performance. Similar to the treatment of the CSR report readability measure, our proxy for 10-K readability (READ_10K) is defined as the natural logarithm of the file size in megabytes of the SEC EDGAR “complete submission text file” for the 10-K filing (Loughran and McDonald 2014) divided by − 100, so that a higher value of READ_10K indicates more readable 10-Ks. Finally, following Dhaliwal et al. (2011), we include fixed year and industry effects, with industry classifications based on Barth et al. (1998).
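The control variable definitions above can be collected into a short sketch. This is only an illustration of the stated formulas: the function name, argument names, and the example figures are hypothetical and do not correspond to specific Compustat, CRSP, or I/B/E/S items.

```python
import math


def control_variables(total_assets, income_before_extra, short_term_debt,
                      long_term_debt, net_equity_issued, net_ltd_issued,
                      shares_traded, shares_outstanding, rd_expense, sales,
                      n_analysts, tenk_file_size_mb):
    """Return the control variables for one firm-year as a dict.

    Argument names are illustrative placeholders, not database mnemonics.
    """
    return {
        "SIZE": math.log(total_assets),
        "ROA": income_before_extra / total_assets,
        "LEV": (short_term_debt + long_term_debt) / total_assets,
        "FIN": (net_equity_issued + net_ltd_issued) / total_assets,
        "LIQUID": shares_traded / shares_outstanding,
        "RD": rd_expense / sales,
        "AF": math.log(1 + n_analysts),
        # Log file size divided by -100, so larger filings score lower.
        "READ_10K": math.log(tenk_file_size_mb) / -100,
    }


# Hypothetical firm-year; every figure below is made up for illustration.
example = control_variables(
    total_assets=1000.0, income_before_extra=50.0, short_term_debt=100.0,
    long_term_debt=200.0, net_equity_issued=10.0, net_ltd_issued=30.0,
    shares_traded=500.0, shares_outstanding=250.0, rd_expense=20.0,
    sales=800.0, n_analysts=9, tenk_file_size_mb=2.0)
```

Dividing by − 100 flips the sign, so a larger 10-K file maps to a lower READ_10K value, in line with the convention used for the CSR report readability measure.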

To examine whether CSR report readability and tone affect the stock market reaction around the release of CSR reports, we look at both abnormal trading volume and abnormal stock returns. Prior research (Cready and Hurtt 2002) suggests that abnormal trading volume and abnormal returns capture different aspects of market reactions to information events. In particular, abnormal returns reflect the changes in the expectations of the market as a whole, while abnormal trading volume reflects the changes in the expectations of individual investors (Beaver 1968). An increase in trading volume around the release of a CSR report does not indicate good or bad news, but suggests that the information contained in the report changes individual investors’ expectations, leading to an altering of their optimal portfolio positions. On the other hand, an increase in stock price (i.e., positive abnormal returns) indicates that the CSR report releases good news to the market as a whole, resulting in a higher equilibrium price. Using abnormal trading volume and abnormal returns allows us to gauge the effects of new information due to the release of CSR reports on the market. The stock market would react to CSR reports to the extent that the new information as conveyed by CSR report readability and tone changes investors’ expectations of future firm performance.

We use the following model to examine the effects of CSR report readability and tone on the cumulative abnormal trading volume around the release of CSR reports.

$${\text{CABVOL}}_{t} = \beta _{0} + \beta _{1} \Delta {\text{CSRP}}_{t} + \beta _{2} {\text{SIZE}}_{t} + \beta _{3} {\text{ROA}}_{t} + \beta _{4} {\text{LEV}}_{t} + \beta _{5} {\text{FIN}}_{t} + \beta _{6} {\text{LIQUID}}_{t} + \beta _{7} {\text{RD}}_{t} + \beta _{8} {\text{AF}}_{t} + \beta _{9} {\text{READ}}\_10{\text{K}}_{t} + \beta _{{10}} \Delta {\text{READ}}_{t} + \beta _{{11}} {\text{ABS}}\Delta {\text{TONE}}_{t} + {\text{Industry}}\;{\text{and}}\;{\text{Year}}\;{\text{Fixed}}\;{\text{Effects}} + \varepsilon _{t}$$
(3)

CABVOLt is the cumulative abnormal trading volume during the window (− 1, 1) centered on the release date of CSR reports for year t, calculated as the logarithm of the cumulative trading volume during the three-day event window minus the logarithm of the firm-specific median cumulative trading volume for contiguous three-day periods over the estimation period from 100 trading days prior to the three-day event window (− 1, 1) to 21 trading days prior to this window (Franco et al. 2015). ∆CSRPt is the change in firm CSR performance from year t − 1 to year t. To the extent that KLD ratings proxy for the factual/quantitative information about CSR performance in a CSR report, including ∆CSRPt allows us to control for the market reaction associated with the factual information in the report. ABS∆TONEt is the absolute value of ∆TONEt, and captures the magnitude of the change in CSR report tone from year t − 1 to year t. Additionally, we include the same set of control variables as in model (2) to make sure that the market reaction to CSR report readability and tone is not driven by firm fundamentals correlated with CSR performance.
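As a minimal sketch of the CABVOL definition, the function below takes a firm's daily trading volumes and the position of the release date. Two details are assumptions made here for illustration, not taken from the text: the estimation period is counted back from the first day of the event window, and the "contiguous three-day periods" are treated as non-overlapping blocks. The function and argument names are hypothetical.

```python
import math
from statistics import median


def cabvol(volumes, event_idx, est_start=100, est_end=21):
    """Cumulative abnormal trading volume for the (-1, +1) event window.

    volumes   : daily trading volumes ordered by trading day
    event_idx : position of the CSR report release date in `volumes`
    """
    window_start = event_idx - 1
    if event_idx + 1 >= len(volumes):
        raise ValueError("event window runs past the end of the data")
    event_volume = sum(volumes[window_start:event_idx + 2])

    # Estimation period: 100 to 21 trading days before the window start
    # (assumed anchor), cut into non-overlapping three-day blocks (assumed).
    first = window_start - est_start
    last = window_start - est_end  # inclusive
    if first < 0:
        raise ValueError("not enough trading history before the event")
    blocks = [sum(volumes[i:i + 3]) for i in range(first, last - 1, 3)]

    return math.log(event_volume) - math.log(median(blocks))
```

With flat volume the measure is zero, and a doubling of volume on the three event days gives ln 2, which is the sense in which CABVOL captures volume in excess of the firm's own typical three-day level.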

The following model is used to examine the effects of CSR report readability and tone on the cumulative abnormal returns around the release of CSR reports.

$${\text{CAR}}_{t} = {\mkern 1mu} \beta _{0} + \beta _{1} \Delta {\text{CSRP}}_{t} + \beta _{2} {\text{SIZE}}_{t} + \beta _{3} {\text{ROA}}_{t} + \beta _{4} {\text{LEV}}_{t} + \beta _{5} {\text{FIN}}_{t} + \beta _{6} {\text{LIQUID}}_{t} + \beta _{7} {\text{RD}}_{t} + \beta _{8} {\text{AF}}_{t} + \beta _{9} {\text{READ}}\_10{\text{K}}_{t} + \beta _{{10}} \Delta {\text{READ}}_{t} + \beta _{{11}} \Delta {\text{TONE}}_{t} + {\text{Industry}}\;{\text{and}}\;{\text{Year}}\;{\text{Fixed}}\;{\text{Effects}} + \varepsilon _{t}$$
(4)

CARt is the cumulative market-adjusted abnormal returns during the window (− 1, 1) around the release of CSR reports for year t. Throughout our regression analyses, standard errors are adjusted based on two-way clustering at firm and year to address the concern that ordinary least squares (OLS) may underestimate standard errors (Petersen 2009).
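The CAR measure can be sketched in the same way. The version below assumes that "market-adjusted" means the firm's daily return minus the market return and that the three daily abnormal returns are summed rather than compounded; both choices, and the names used, are assumptions for illustration.

```python
def car(firm_returns, market_returns, event_idx):
    """Market-adjusted cumulative abnormal return over the (-1, +1) window.

    firm_returns, market_returns : daily returns aligned by trading day
    event_idx                    : position of the CSR report release date
    """
    days = range(event_idx - 1, event_idx + 2)
    # Sum of daily (firm minus market) returns; summation is an assumption.
    return sum(firm_returns[d] - market_returns[d] for d in days)
```

For example, daily abnormal returns of 0.5%, 1.0%, and − 0.5% around the release date cumulate to a CAR of 1.0%.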

The Sample

We begin with Fortune 500 companies that published stand-alone CSR reports during the period 2002 to 2015. CSR reports are collected from various internet sources, including CSRwire.com, CorporateRegister.com, GlobalReporting.org, SocialFunds.com, BusinessWire.com, and corporate websites. We then match each CSR report with its corresponding fiscal year. The initial sample includes 1,780 CSR reports for 340 firms for fiscal years 2002 to 2014. We exclude the first observation (i.e., the first CSR report) for each firm from our sample since the changes in readability and tone for the first CSR report cannot be calculated. We then merge readability and tone information for CSR reports with CSR performance from KLD, financial information from Compustat, analyst following from I/B/E/S, and 10-K readability from Loughran and McDonald 10X File Summaries.Footnote 5 Observations with missing data for required variables are deleted. The final full sample includes 1258 observations for 262 unique firms from 2002 to 2014. This full sample is used to assess how CSR report readability and tone are related to future CSR performance.

To examine the market response to CSR report readability and tone around the release of CSR reports, we start with the full sample and search for the release dates of the CSR reports, using various keywords (e.g., “CSR report,” “sustainability report,” “corporate citizenship report,” “release,” “announce,” “issue,” “today,” “becomes available,” and other similar terms) and various Internet sites, including CSRwire, CorporateRegister.com, Business Wire, Reuters, PRweb, and company websites (the newsroom or investor relations section). We verify the report release dates by reading the press releases. CSR reports with unidentifiable release dates are excluded. To control for confounding events, we check for other major news concerning the firms and eliminate firm-date observations from our sample if the CSR report release date falls within the 5-day window (− 2, 2) around other major corporate events, such as earnings announcements or merger and acquisition announcements. The final market reaction sample includes 574 observations with identifiable release dates of CSR reports for 175 unique firms. Observations in the bottom or top 1% of regression residuals are deleted to mitigate the effect of outliers in the regression analyses.
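The confounding-event screen described above amounts to a simple filter. The sketch below assumes that dates for a single firm are represented as integer trading-day indices and that the (− 2, 2) window is measured in trading days; the function name and inputs are hypothetical.

```python
def drop_confounded(release_days, other_event_days, half_window=2):
    """Keep release dates with no other major event within +/- half_window.

    release_days     : trading-day indices of CSR report releases (one firm)
    other_event_days : trading-day indices of earnings, M&A, or similar news
    """
    return [r for r in release_days
            if all(abs(r - e) > half_window for e in other_event_days)]
```

A release on day 10 with an earnings announcement on day 12 would be dropped, while releases on days 50 and 90 with no nearby events would be kept.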

Panel A of Table 1 reports the distribution of the full sample by year. Consistent with the trend that a growing number of firms engage in CSR reporting in recent years (Tschopp and Huefner 2015), the number of CSR reports among Fortune 500 firms increases from 7 to more than 200 in 2013 and 2014. Panel B of Table 1 reports the distribution of the full sample by industry based on Barth et al.’s (1998) industry classifications. The Durable Manufacturing industry contains the largest number of CSR reports (220) and firms (46), accounting for 17.5% of all the CSR reports and 17.6% of all the firms. The Retail industry has the second largest number of CSR reports (136) and firms (30). On the other hand, the Insurance and Real Estate industry and the Other industry have the smallest numbers of CSR reports (4 and 8, respectively) in our sample.

Table 1 Sample distribution

Table 2 reports the descriptive statistics of our main variables for the full sample. The mean (median) Fog index of CSR reports is 15.80 (16.08), which falls into the category of “difficult to read,” but is lower than the mean Fog index for 10-K reports (18.68) reported in Loughran and McDonald (2014). This indicates that although CSR reports are difficult to read, they are slightly easier to comprehend than 10-K reports. This is consistent with the view that, as compared to financial reports, CSR reports target a greater variety of stakeholders including less sophisticated audiences. The mean (median) of TONE is 1% (0.9%), indicating that CSR report tone is, in general, relatively positive. There is also a relatively large variation in the changes in CSR report readability and tone (ΔFOG: Q1 = − 0.6%, Q3 = 0.6%; ΔTONE: Q1 = − 0.3%, Q3 = 0.3%). In addition, ΔPOS (SD = 0.6%) exhibits much higher variation than ΔNEG (SD = 0.3%), suggesting that managers tend to use changes in positive words to convey information rather than negative words. The mean of CSRPt is 0.086, indicating an overall positive CSR performance for the sample firms.

Table 2 Descriptive statistics

Table 3 presents the Pearson correlations of the main variables for the full sample. Correlation coefficients that are significant at the 0.10 level or higher are in bold. The correlation between ΔFOG and ΔTONE (0.02) is insignificant, suggesting that CSR readability and tone tap into distinct aspects of textual properties. In addition, future CSR performance (CSRPt +1) is positively correlated with ΔTONE (correlation = 0.05), consistent with the argument that more positive tone revision is associated with better future CSR performance.

Table 3 Correlations of the main variables

Empirical Results

Effects of Readability and Tone on Future CSR Performance

Table 4, Panel A, reports the effects of CSR report readability (as measured by FOG) and tone on 1-year-ahead CSR performance. Recall that FOG is the Fog Index divided by − 100; thus higher values of FOG indicate more readable CSR reports. Column I presents the results based on model (2). The coefficient on ΔREAD is positive (coeff. = 0.044, t stat = 2.19), suggesting that less readable CSR reports are indicative of lower future CSR performance. This result is consistent with the management obfuscation hypothesis (Li 2008), suggesting that firms try to hide poor future CSR performance by decreasing CSR report readability. Also importantly, 1-year-ahead CSR performance is positively associated with ΔTONE (coeff. = 0.806, t stat = 4.17), consistent with the truthful disclosure hypothesis that managers use tone in CSR reports to convey credible information regarding future CSR performance.

Table 4 The effects of CSR report readability and tone on future CSR performance

In Columns II and III, we examine the separate effects of the proportions of positive and negative words by replacing ΔTONE with ΔPOS and ΔNEG, respectively. The coefficient on ΔPOS is positive (coeff. = 0.957; t stat = 3.63), while the coefficient on ΔNEG is negative (coeff. = − 0.934; t stat = − 2.20), suggesting that the proportions of positive and negative words in CSR reports are informative about future CSR performance. However, when both ΔPOS and ΔNEG are included as independent variables in Column IV, ΔPOS completely absorbs the explanatory power of ΔNEG; the coefficient on ΔPOS remains significantly positive, whereas the coefficient on ΔNEG becomes insignificant. While this result contrasts with prior findings for financial reports that positive words provide little incremental information compared to negative words (Loughran and McDonald 2011), it seems to support the view that, given the voluntary and unregulated features of CSR reports, negative words in CSR reports may be used only in boilerplate format, and managers are more likely to use changes in positive words to communicate information about future CSR performance. However, an alternative explanation is that there is not enough power to detect the association between future CSR performance and ΔNEG due to relatively low variation in ΔNEG in our sample.

Given the long-term orientation of CSR reports, we test whether CSR report readability and tone have implications for 2-year-ahead CSR performance in Table 4, Panel B. The additional data requirement for 2-year-ahead CSR performance reduces our sample size to 1235 observations. The results in Column I indicate that more readable CSR reports are also associated with higher 2-year-ahead CSR performance (coeff. on ΔREAD = 0.061, t stat = 2.75), although the coefficient on ΔTONE is insignificant. In Columns II and III, we replace ΔTONE with ΔPOS and ΔNEG, respectively. There is weak evidence that an increase in the proportion of positive words is indicative of more favorable 2-year-ahead CSR performance (Column II: coeff. on ΔPOS = 0.583, t stat = 1.68). However, when both ΔPOS and ΔNEG are added in the model in Column IV, the coefficient on ΔPOS becomes insignificant.

Market Reaction to Readability and Tone of CSR Reports

Panel A of Table 5 reports the effects of CSR report readability and tone on cumulative abnormal trading volume around the release of CSR reports. Column I reports the results based on model (3). The coefficient on ΔREAD is positive (coeff. = 0.334, t stat = 5.06), suggesting that firms with a larger increase in CSR report readability experience higher abnormal trading volume around the release of CSR reports. However, the coefficient on ABSΔTONE is not significant, indicating that the volume reaction is not associated with the magnitude of tone change in CSR reports. We further replace ABSΔTONE with ABSΔPOS and ABSΔNEG in Columns II and III to examine the volume reaction to the magnitudes of the changes in the proportions of positive and negative words. ABSΔPOS and ABSΔNEG are equal to the absolute value of ΔPOS and ΔNEG, respectively. The coefficients on ABSΔPOS and ABSΔNEG are not significant. The results consistently suggest that tone revision in CSR reports does not affect the abnormal trading volume.Footnote 6 Overall, the results reported in Panel A are consistent with the view that more readable CSR reports decrease information processing cost and increase information transparency, thus spurring trading activities around the release of these reports.

Table 5 The effects of CSR report readability and tone on the market reaction around the release of CSR reports

Panel B of Table 5 reports the effects of CSR report readability and tone on the cumulative abnormal returns around the release of CSR reports. Column I shows the results based on model (4). The coefficient on ΔREAD is positive (coeff. = 0.026, t stat = 2.84), suggesting that the market reacts positively to increases in report readability. The coefficient on ΔTONE is also positive (coeff. = 0.188, t stat = 2.07), indicating that the market reacts favorably to upward tone change. For a firm with the mean market value of $56.3 billion in our sample, moving from the bottom decile of ΔREAD to its top decile could increase firm value by $40 million.Footnote 7 Similarly, moving from the bottom decile of ΔTONE to its top decile could increase firm value by $127 million.Footnote 8 Overall, the results are consistent with the view that investors treat increases in the readability and tone of CSR reports as credible signals of higher future CSR performance, thus leading to more favorable market reactions to CSR reports with such textual characteristics.
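The dollar figures above can be related to the coefficients with a back-of-envelope calculation. The sketch below assumes that the dollar effect equals the coefficient times the top-minus-bottom decile gap times the mean market value, and backs out the gap this would imply. The authors' own calculation is in Footnotes 7 and 8 and may differ, so the implied gaps are illustrative only and are not reported statistics.

```python
MEAN_MARKET_VALUE = 56.3e9  # mean market value in dollars, from the text


def implied_spread(dollar_effect, coefficient,
                   market_value=MEAN_MARKET_VALUE):
    """Decile gap implied by: dollar_effect = coefficient * gap * value.

    The multiplicative form is an assumption used for illustration.
    """
    return dollar_effect / (coefficient * market_value)


read_gap = implied_spread(40e6, 0.026)    # ΔREAD coefficient from Column I
tone_gap = implied_spread(127e6, 0.188)   # ΔTONE coefficient from Column I
```

Under this assumption the implied gaps are roughly 0.027 for ΔREAD and 0.012 for ΔTONE, that is, an abnormal return of about 0.07% and 0.23% of market value, respectively.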

When we replace ΔTONE with ΔPOS and ΔNEG in Columns II and III, respectively, the coefficient on ΔPOS is positive (coeff. = 0.322, t stat = 3.08), but the coefficient on ΔNEG is not significant (t stat = 0.94). The results are consistent with those documented in Table 4 and suggest that the positive association between the cumulative abnormal returns and ΔTONE in Column I is likely to be driven by the change in the proportion of positive words.

Additional Analysis

Using Alternative Readability Measures

We use two alternative readability measures, FLESCH and ARI, based on the FLESCH Reading Ease Score and the Automated Readability Index, respectively, to check the robustness of our results. FLESCH Reading Ease Score and Automated Readability Index are calculated as follows:

$${\text{FLESCH Reading Ease Score }} = 206.835- \, \left(1.015 \times {\text{ words per sentence}} \right) \, - \, \left(84.6 \times {\text{syllables per word}} \right)$$
(5)
$${\text{Automated Readability Index}} = \, - 21.43 + \, \left(4.71 \times {\text{characters per word}} \right) + \, \left( 0.5 \times {\text{ words per sentence}} \right)$$
(6)

Similar to the construction of FOG, FLESCH is defined as the FLESCH Reading Ease Score divided by 100, and ARI is defined as the Automated Readability Index divided by − 100, so that higher values of FLESCH and ARI indicate more readable CSR reports. Panels A and B of Table 6 report the results using FLESCH and ARI, respectively, as the proxy for readability. The results are largely consistent with those based on FOG.
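Equations (5) and (6) and the scaling just described can be sketched as follows. The functions take document-level counts as inputs; how words, sentences, syllables, and characters are counted in a CSR report is not specified here and is left to the caller. Function names are hypothetical.

```python
def flesch_reading_ease(words, sentences, syllables):
    """Equation (5): higher scores mean easier text."""
    return 206.835 - 1.015 * (words / sentences) - 84.6 * (syllables / words)


def automated_readability_index(characters, words, sentences):
    """Equation (6): higher scores mean harder text."""
    return -21.43 + 4.71 * (characters / words) + 0.5 * (words / sentences)


def flesch_measure(words, sentences, syllables):
    """FLESCH: Reading Ease Score divided by 100 (sign kept)."""
    return flesch_reading_ease(words, sentences, syllables) / 100


def ari_measure(characters, words, sentences):
    """ARI: index divided by -100, so higher values mean more readable."""
    return automated_readability_index(characters, words, sentences) / -100
```

For a hypothetical text with 200 words, 10 sentences, 300 syllables, and 1,000 characters, the Reading Ease Score is 59.635 and the Automated Readability Index is 12.12, giving FLESCH = 0.596 and ARI = − 0.121. Only ARI needs the sign flip because a higher raw index already indicates harder text.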

Table 6 The effects of CSR report readability and tone on market reaction using alternative readability measures

Effect of CSR Report Tone on Market Reaction Conditional on Report Readability

Franco et al. (2015) find that analyst report readability and tone reinforce each other such that the effect of tone on the market reaction to analyst reports is stronger for more readable analyst reports. We examine whether CSR report readability moderates the association between the abnormal returns around the release of CSR reports and tone revision in Panel A of Table 7. More specifically, in Column I, we add the interaction between ΔREAD and ΔTONE (ΔREAD*ΔTONE) into model (4). Consistent with Franco et al. (2015), the results indicate that, ceteris paribus, improving CSR report readability could enhance the effect of tone revision on the abnormal returns (coeff. on ΔREAD* ΔTONE = 7.224; t stat = 1.89). We replace ΔTONE with ΔPOS in Column II. The results are even stronger (coeff. on ΔREAD* ΔPOS = 8.074; t stat = 3.88), suggesting that the market reaction to the change in the proportion of positive words is more pronounced for more readable CSR reports.
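To make the moderating effect concrete, the augmented version of model (4) implies that the sensitivity of abnormal returns to tone revision varies with the change in readability. Writing the coefficient on the interaction term as β12 (a label assumed here for illustration, since the augmented model is not written out), the marginal effect is:

$$\frac{\partial\, {\text{CAR}}_{t}}{\partial\, \Delta {\text{TONE}}_{t}} = \beta _{{11}} + \beta _{{12}} \Delta {\text{READ}}_{t}$$

A positive β12, as in the Column I estimate of 7.224, means that a given upward tone revision is associated with a larger abnormal return when readability has improved (ΔREADt > 0) than when it has deteriorated.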

Table 7 Conditional effects of CSR report readability and tone on the abnormal returns

Effect of CSR Report Readability on Market Reaction Conditional on Analyst Following and Financial Opacity

The literature (Ayers and Freeman 2003) has provided evidence consistent with the intermediary role of financial analysts in information generation and capitalization. To the extent that analysts acquire information in a CSR report from alternative sources before its release and accelerate information pricing, market reaction to CSR reports should be less pronounced for firms with more analysts following. Furthermore, Dhaliwal et al. (2012) find that issuance of CSR reports reduces analyst forecast error to a greater extent for firms with a higher level of financial opacity. This suggests that CSR reports play a complementary role in enhancing financial transparency and are more useful for firms with greater financial opacity. We thus posit that market reaction to CSR report readability should be more pronounced for firms with less analyst following and greater financial opacity. We examine this conjecture using the following model.

$${\text{CAR}}_{t} = \beta _{0} + \beta _{1} \Delta {\text{CSRP}}_{t} + \beta _{2} {\text{SIZE}}_{t} + \beta _{3} {\text{ROA}}_{t} + \beta _{4} {\text{LEV}}_{t} + \beta _{5} {\text{FIN}}_{t} + \beta _{6} {\text{LIQUID}}_{t} + \beta _{7} {\text{RD}}_{t} + \beta _{8} {\text{AF}}_{t} + \beta _{9} {\text{READ}}\_10{\text{K}}_{t} + \beta _{{10}} \Delta {\text{READ}}_{t} + \beta _{{11}} \Delta {\text{POS}}_{t} + \beta _{{12}} {\text{AF}}_{t} * \Delta {\text{READ}}_{t} + \beta _{{13}} {\text{FFIN}}_{t} + \beta _{{14}} {\text{FFIN}}_{t} * \Delta {\text{READ}}_{t} + {\text{Industry}}\;{\text{and}}\;{\text{Year}}\;{\text{Fixed}}\;{\text{Effects}} + \varepsilon _{t}$$
(7)

FFIN is financial opacity, equal to 1 if the absolute value of a firm’s scaled accruals averaged over the past three years is higher than the corresponding industry-year mean, and 0 otherwise (Dhaliwal et al. 2011; Muslu et al. 2019). Scaled accruals are computed as follows: (ΔCA – ΔCL – ΔCASH + ΔSTD – DEP + ΔTP)/LAGTA, where ΔCA (ΔCL) is the change in total current assets (liabilities); ΔCASH is the change in cash; ΔSTD is the change in the current portion of long-term debt; DEP is depreciation and amortization expense; ΔTP is the change in income taxes payable; and LAGTA is total assets at the end of the previous year.
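The accruals formula and the FFIN indicator can be sketched as below. The text leaves open whether the three-year average is taken over absolute values or the absolute value is taken of the average; the sketch assumes the former. Function and argument names are hypothetical.

```python
from statistics import mean


def scaled_accruals(d_ca, d_cl, d_cash, d_std, dep, d_tp, lag_ta):
    """(ΔCA - ΔCL - ΔCASH + ΔSTD - DEP + ΔTP) / LAGTA for one firm-year."""
    return (d_ca - d_cl - d_cash + d_std - dep + d_tp) / lag_ta


def ffin(past_three_accruals, industry_year_mean):
    """Return 1 if the firm is classified as financially opaque, else 0.

    past_three_accruals : scaled accruals for the past three years
    industry_year_mean  : benchmark for the firm's industry-year
    Averaging the absolute values (rather than taking the absolute value
    of the average) is an assumption.
    """
    firm_level = mean([abs(a) for a in past_three_accruals])
    return 1 if firm_level > industry_year_mean else 0
```

For instance, scaled accruals of 0.03, − 0.05, and 0.04 average to 0.04 in absolute value, so the firm would be coded FFIN = 1 against an industry-year mean of 0.035.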

The results are reported in Panel B of Table 7. In Column I, the abnormal returns around the release of CSR reports are positively associated with ΔREAD, but negatively associated with AF*ΔREAD (coeff. =  – 0.15; t stat =  – 2.11), suggesting that the market reaction to CSR report readability is less pronounced for firms with more analyst following. In Column II, the coefficient on FFIN*ΔREAD is positive (coeff. = 0.239; t stat = 2.38), consistent with the argument that the market relies upon CSR reports to a greater extent for firms with a higher level of financial opacity. Column III reports the results based on the full model (7). The results are qualitatively similar to those reported in Columns I and II.

Addressing Alternative Explanations

An alternative explanation for the positive association between future CSR performance and the change in CSR report tone is that KLD may assess CSR performance by fixating on the tone of prior-year CSR disclosure even if tone is manipulated upwards, leading to a mechanical positive relationship between CSR disclosure tone and future CSR performance. While we cannot fully rule out this possibility, we note that, to arrive at the KLD ratings, experienced research analysts apply the same set of criteria to related companies and use data gathered from a wide range of sources, both internal and external to the firm (Waddock and Graves 1997; Kim et al. 2012). In explaining its KLD rating methodology, MSCI ESG Research (2018) states that, in addition to corporate disclosure, KLD utilizes 100 + specialized datasets from governments and NGOs, and engages in daily monitoring of 1600 + media sources (global and local news sources, government, NGO, and other stakeholder sources); furthermore, it relies upon systematic communication with issuers to verify data accuracy and conducts in-depth quality review processes (e.g., specialized research, formal committee review) at all stages of rating.

To the extent that sophisticated KLD analysts could use other information sources to verify information disclosed in CSR reports and see through tone management, KLD ratings are less likely to be affected by tone management. Furthermore, if CSR reports are subject to substantial tone manipulation and KLD ratings are driven by the manipulated tone of CSR reports, then KLD ratings should not be informative of financial performance. However, prior studies have provided ample evidence that KLD ratings are positively associated with financial performance (e.g., Margolis and Walsh 2003), which does not support the view that KLD ratings are based on distorted tone information, if any, in CSR reports.

One may argue that the market overreacts to CSR report readability and tone due to functional fixation. As a result, the market reaction to CSR report readability and tone as documented in Table 5 may be attributable to market mispricing rather than the value relevance of readability and tone. To address this concern, we examine whether the changes in CSR report readability and tone are negatively predictive of future returns using the following model.

$${\text{RET}}_{t+1} = \beta _{0} + \beta _{1} {\text{CAPD}}_{t} + \beta _{2} {\text{BETAD}}_{t} + \beta _{3} {\text{BTM}}_{t} + \beta _{4} {\text{PE}}_{t} + \beta _{5} {\text{TACC}}_{t} + \beta _{6} {\text{NOA}}_{t} + \beta _{7} {\text{DTA}}_{t} + \beta _{8} \Delta {\text{READ}}_{t} + \beta _{9} \Delta {\text{TONE}}_{t} + {\text{Industry}}\;{\text{and}}\;{\text{Year}}\;{\text{Fixed}}\;{\text{Effects}} + \varepsilon _{t}$$
(8)

RETt+1 is the 1-year-ahead stock return following the release year t of CSR reports. We control for well-documented risk factors and market anomalies. More specifically, CAPDt and BETADt are size and beta deciles at the end of year t from the CRSP database. BTMt is the book-to-market ratio at the end of year t, calculated as the book value of equity divided by the market value of equity. PEt is the price-to-earnings ratio at the end of year t, calculated as the fiscal year-end stock price divided by EPS. TACCt is total accruals for year t, calculated as the difference between earnings before extraordinary items and cash flow before extraordinary items scaled by lagged total assets. NOAt is net operating assets at the end of year t, calculated as the difference between operating assets and operating liabilities scaled by lagged total assets. DTAt is the debt-to-assets ratio, defined as total liabilities divided by total assets at the end of year t.

The results are reported in Table 8. We find no evidence that the market overreacts to CSR report readability and tone. Instead, 1-year-ahead returns are positively associated with ΔREAD (coeff. = 0.223; t stat = 3.76), but not associated with ΔTONE or ΔPOS, suggesting that investors underreact to CSR report readability. The results seem consistent with the view that investors do not fully understand the long-term implications of CSR report readability for future CSR performance as documented in Panel B of Table 4.

Table 8 The predictability of CSR report readability and tone for 1-year-ahead returns

Addressing Sample Selection Bias

In this section, we further test the robustness of our results by addressing the potential sample selection bias. Since our sample only includes firms that issue stand-alone CSR reports, the OLS estimation may be subject to the potential sample selection bias. We perform the Heckman two-stage procedure (Heckman 1979) to account for the endogenous nature of firms’ decision to publish a CSR report or not. Specifically, in the first stage, we estimate the following Probit model.

$${\text{DISC}}_{t} = \beta _{0} + \beta _{1} {\text{CSRP}}_{t} + \beta _{2} {\text{SIZE}}_{t} + \beta _{3} {\text{ROA}}_{t} + \beta _{4} {\text{LEV}}_{t} + \beta _{5} {\text{FIN}}_{t} + \beta _{6} {\text{LIQUID}}_{t} + \beta _{7} {\text{RD}}_{t} + \beta _{8} {\text{AF}}_{t} + \beta _{9} {\text{READ}}\_10{\text{K}}_{t} + \beta _{{10}} {\text{MKTSHARE}}_{t} ~ + \beta _{{11}} {\text{AGE}}_{t} + \beta _{{12}} {\text{CAPX}}_{t} + \beta _{{13}} {\text{ROAVOL}}_{t} + \beta _{{14}} {\text{FFIN}}_{t} + {\text{Industry and Year Fixed Effects}} + \varepsilon_{t}$$
(9)

DISCt is a dummy variable, equal to one if the firm releases a CSR report for year t and zero otherwise. The independent variables are largely taken from prior literature on the determinants of CSR disclosure (Dhaliwal et al. 2011, 2012). In addition to the control variables specified in model (2), we include market share (MKTSHARE), firm age (AGE), capital expenditure (CAPX), earnings volatility (ROAVOL), and financial opacity (FFIN) in model (9). MKTSHAREt is the firm’s fraction of sales in its two-digit SIC industry. AGEt is the number of years since a firm’s first appearance in CRSP. CAPXt is capital expenditure scaled by total assets. ROAVOLt is computed as the standard deviation of the return on assets over the most recent 5 years; at least three non-missing observations are required to calculate ROAVOLt.

The results are reported in Panel A of Table 9. DISC is positively associated with CSRP, SIZE, AF, MKTSHARE, AGE, and CAPX, suggesting that larger and older firms as well as firms with better CSR performance, more analyst following, higher market share and capital expenditure are more likely to issue CSR reports. In addition, DISC is negatively associated with LEV, FIN, RD, and READ_10K, indicating that firms with higher financial leverage and constraints, higher R&D intensity, and more readable 10-K reports are less likely to issue CSR reports.

Table 9 Correcting for self-selection bias

In the second stage, we add the inverse Mills ratio (LAMBDAt) computed from model (9) into models (2), (3), and (4) as an additional control variable. Note that MKTSHARE, AGE, and CAPX are included in model (9) but excluded from the second stage models. These variables impose important exclusion restrictions on the second stage estimation. The results are presented in Panel B of Table 9 and are consistent with those documented in Tables 4 and 5.
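For reference, the inverse Mills ratio for a firm-year that issued a report is the standard normal density divided by the standard normal distribution function, both evaluated at the fitted Probit index from the first stage. The sketch below shows only this step; estimating the Probit itself is omitted, and the function name and the `probit_index` input are hypothetical.

```python
from statistics import NormalDist

_STD_NORMAL = NormalDist()  # mean 0, standard deviation 1


def inverse_mills(probit_index):
    """lambda = phi(z) / Phi(z) for an observation with DISC = 1.

    probit_index : fitted linear index z from the first-stage Probit
    """
    return _STD_NORMAL.pdf(probit_index) / _STD_NORMAL.cdf(probit_index)
```

The ratio falls as the index rises, so firm-years that were very likely to publish a CSR report receive a small correction term, while those that published despite a low predicted probability receive a large one.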

Summary and Discussion

We examine the information content of CSR report readability and tone. Using a hand-collected dataset of Fortune 500 companies that published stand-alone CSR reports for years 2002 to 2014, we find that future CSR performance is positively associated with changes in both readability and tone of CSR reports, suggesting that CSR reports with higher readability and more optimistic tone are indicative of better future CSR performance. In addition, the positive association between future CSR performance and tone change is primarily due to the change in the proportion of positive words, suggesting that managers tend to use positive, rather than negative, words to convey information about future CSR performance.

The stock market appears to treat CSR report readability and tone as credible signals of future CSR performance and reacts accordingly. Specifically, the change in report readability is positively associated with both abnormal trading volume and abnormal returns around the release of CSR reports, consistent with the argument that improved report readability not only reduces information ambiguity, thus spurring trading activities, but also indicates better future CSR performance, thus leading to higher abnormal returns. Similarly, tone change is positively associated with abnormal returns, and this positive association is primarily due to the effect of the change in the proportion of positive words, in line with the view that managers use positive, but not negative, words to communicate future CSR performance.

Our results are robust to using alternative readability measures and further controlling for sample selection bias. Additional analysis suggests that CSR report readability influences the effect of report tone such that the market reaction to CSR report tone is more pronounced for firms with more readable CSR reports. Furthermore, consistent with the view that CSR disclosure plays a complementary role in improving financial transparency, we find that the market reaction to CSR report readability is stronger for firms with less analyst following and higher financial opacity. In addition, we find no evidence that the market overreacts to CSR report readability and tone. Investors appear to underreact to CSR report readability, as evidenced by the positive association between future stock returns and the change in CSR report readability.

This study contributes to the literature on discretionary information disclosure, particularly CSR reporting (Martinez-Ferrero et al. 2016; Muslu et al. 2019). Our results provide direct evidence for the information content of CSR reports, suggesting that CSR reports play an important role in reducing information asymmetry by imparting value relevant information to investors. Our results highlight the importance of examining the textual properties (i.e., readability and tone) of CSR reports as they serve as credible signals of future CSR performance and affect the effectiveness of CSR disclosure. This study also contributes to the broad literature on business ethics, in particular, that of stakeholder management. Information asymmetry gives rise to moral hazard problems and is fertile ground for unethical corporate behavior (Kulkarni 2000; Jones et al. 2018). Stakeholders demand information transparency to make informed decisions and monitor corporate behaviors. In line with instrumental stakeholder theory (Freeman 1999), our findings suggest that there are tangible financial benefits associated with providing transparent CSR information to the market. More generally, we not only document the business returns to CSR reporting, but also highlight the means to effective CSR disclosure (i.e., communicating good future CSR performance through a positive tone and readable text).

This research provides important implications for companies, investors, and regulators. Our results highlight the important role of CSR disclosure in reducing information asymmetry between firms and investors, especially for firms with lower analyst following and higher financial opacity. Given the significant market reaction to CSR report readability and tone, firms with superior CSR performance can maximize the benefits of CSR disclosure by improving CSR report readability. Furthermore, regulators (e.g., SEC and SASB) interested in assessing and improving the effectiveness of CSR disclosure could look into the textual properties of CSR reports and provide guidelines and toolboxes on how firms could use simple language and appropriate tone to truthfully communicate value-relevant CSR information.

There are several avenues for future research. First, while we focus on CSR reports issued by Fortune 500 companies and the US stock market, future research can extend our inquiry by examining readability and tone of CSR reports issued by smaller companies and the implications of these textual characteristics for other stock markets. Second, despite the popular use of pre-existing word lists for tone analysis, such a dictionary-based approach does not consider the individual context of a negative or positive word (Li 2010; Loughran and McDonald 2016). Future research should employ alternative methods for tone analysis to corroborate our findings, such as a statistical approach (e.g., a Bayesian machine learning method) that could use algorithms to analyze the statistical correlations between key words and more precisely classify the sentiments of the text.