1 Introduction

Narrative information is becoming central to how corporations assess their creation procedures, yet more often than not it is excluded from Corporate Annual Reports (CARs). Consequently, it is argued that increasing the narrative information in annual reports, that is, information that is innovative and non-financial (Ware 2000; FASB 2001), should improve corporate reporting quality. Moreover, narrative reporting makes additional business aspects available for exploration and delineation. The narrative section of CARs has also been drawing regulatory attention, which culminated in 2005 when the IASB issued its own views on it.

European Community Regulation No. 1606/2002 states that, as of January 1, 2005, all EU-listed corporations must prepare their consolidated financial reports in compliance with the International Financial Reporting Standards (IFRS) (Pagletti 2009). The primary aim of the IFRS is to enhance the quality of corporate financial reporting and to improve its comparability. According to the International Accounting Standards Board (IASB 2005), corporate annual reporting falls into two categories: (a) narrative information, produced by a company’s management team, which provides qualitative data including commentary and analysis of key matters regarding the company’s philosophy, goals, financial aims and strategic planning; and (b) financial data, consisting of quantitative financial information usually reported in balance sheets, cash flow reports, and financial and standard ratios.

Researchers and practitioners of Management Commentary (MC) have already begun to realize the centrality of narrative information in CARs (Garefalakis et al. 2016; IASB 2010). In fact, enhancing both the amount and the quality of the narrative information that is produced and given to investors and other users is today considered vital. The narrative part of financial reports is called “Management Discussion and Analysis” (MD&A) in the US, whilst in the UK it is referred to as the “Business Review” or “Operating and Financial Review” (OFR). In most other countries it is commonly referred to as Management Commentary (MC). According to the IASB (2005, 2006, 2010), MC is information that supplements the financial reports and is considered an essential component of a corporation’s financial statements. MC elaborates on the significant developments and possible issues that might inform the development, performance and current position of a corporation, and it applies to the same period as the rest of the financial reporting. In addition, MC illustrates and examines significant trends and parameters that are expected to affect the future progress, performance and position of the corporation (Hou et al. 2017).

Management Discussion and Analysis, Operating and Financial Review and Business Review all refer to the same concept, Management Commentary, under names that differ from country to country. In the present paper we therefore use the term MC exclusively to refer to the narrative part of CARs.

In this paper we seek to explore the outcomes of the obligatory application of the IFRS in code-law EU countries and the results of the optional application of the IFRS in the US. The paper compares the narrative information in financial reports between the periods 2002–2004 and 2005–2007, that is, prior to (P-IFRS) and after (A-IFRS) the integration of the IFRS regulatory accounting guidelines. The study is divided into two portions. The first portion tests the efficiency of a new measurement tool for assessing narrative report quality, labeled G&T AIQAD, which uses the Management Commentary Index (Ma.Co.I) of 70 key performance indicators (KPIs) classified into five categories; the index was constructed in line with the IASB (2010) (Garefalakis et al. 2016). The second portion compares the narrative information available in financial reports. This comparison of P-IFRS and A-IFRS narratives enables us to draw conclusions about the changes in quality level and the type of progress brought about by IFRS implementation.

This research is organized into four parts. The first part gives a concise overview of the literature on financial report quality evaluation and of the measurement tools used to date for assessing financial report quality. The second part analyzes the research methodology and the construction of the Ma.Co.I and the G&T AIQAD tool. The third part presents the analysis per geographical region. Finally, the conclusions of the research are presented.

2 Literature overview

2.1 IFRS and narrative reporting quality

The International Financial Reporting Standards form a framework intended to replace the national accounting guidelines of all EU member states. As Armstrong et al. (2009) emphasize, implementing the IFRS agenda aims, among other things, at accomplishing wider capital market integration. According to Florou and Kosi (2009), implementing the IFRS in financial reporting also contributes to “higher comparability data, lower transaction costs and greater international investment”. The benefits of IFRS implementation are noted by Iatridis (2010) as well, who suggests that, in terms of investors’ decisions, the IFRS framework contributes to arriving at “informed financial decisions and predictions of firms’ future financial performance and gives signal of higher quality accounting and transparency”. Consequently, implementing the IFRS framework is likely to minimize profit manipulation and significantly enhance the efficiency of stock markets. In addition, IFRS implementation seems to influence firm stock returns and other stock-associated measures of financial progress positively (Garefalakis et al. 2015a, b).

At a global level, using additional narrative information in CARs, together with regulators’ emphasis on the importance of the MD&A section of annual reports, might substantially improve the financial reporting quality of businesses. Indeed, certain jurisdictions are already implementing or revising their financial reporting practices, while others are adopting a mandatory disclosure framework for financial reporting. After the Enron scandal, US companies saw their MD&A guidelines strengthened, one example being that of the SEC in 2003. The Canadian Institute of Chartered Accountants (CICA) similarly published more precise, in-depth MD&A guidelines. In addition, a disclosure agenda issued by the CICA (2002, 2009) refers to five key aspects of financial reporting information: key performance drivers, capabilities, results, risks and strategies. Lastly, the CICA also promoted a set of six disclosure principles for companies to follow. In Britain, the Accounting Standards Board (IASB 2005) reintroduced a revised OFR heavily based on the Jenkins framework. With regard to EU member states, the IASB (2006) framework introduced MC reporting into its agenda. The requirements for MC-type reports are outlined in several legal instruments that the EU is already implementing, specifically the Fourth and Seventh Company Law Directives, which have been revised in view of the Modernization Directive and improved on the basis of the Transparency Directive. In 2004, the EU agreed to have the Transparency Directive implemented by all member states within a two-year period. Under the Transparency Directive, issuers of securities are obliged to produce annual and mid-term financial statements and are also expected to include a management report that follows the Accounting Directives.

2.2 Prior research

Previous studies have assessed narrative reports with a range of different tools. Many practitioners employ a ‘disclosure index’ for their narrative report assessments. In a widely referenced study, Botosan (1997) proposes a comprehensive index for determining the level of voluntary disclosure of 122 companies in the machinery industry. The study was based largely on the examination of narrative disclosures. CICA published the Jenkins Report (CICA 2009), which outlines the principles for selecting the items to be used in the subsequent analysis and examination of annual reports. The types of information covered in the Report were: background information, summaries of historical results, key non-financial statistics, projected data and MD&A. This paper employed 35 key individual items from these five categories.

Robb et al. (2001) conducted a topic-based examination of non-financial disclosure in the US, Canada and Australia, following the recommendations laid out by the Jenkins Report. They used a disclosure score sheet whose categories addressed the non-financial items desired by users and included in the resource database compiled by the Jenkins Committee. These information items were subsequently classified into six categories, half of which were future-oriented and half of which covered historical topics. In total, the database comprised 65 items.

Apart from topic-based analysis along the lines of Robb et al. (2001), other studies have assessed narrative information through readability and linguistic analysis. Jones and Shoemaker (1994) reviewed 32 studies that used readability analysis, 26 of which referred directly to annual report narratives. Readability indexes such as the Flesch index aim to measure how cognitively demanding the discourse in such financial reports is. The Flesch index, for instance, combines the syllable count per word with sentence length. The difficulty level of a text is derived by comparing the calculated score with external points of reference, which gives a picture of the text’s readability level. More often than not, narratives in annual company reports are hard to extremely hard to read. Despite its objectivity and reliability, the Flesch index is not flawless: it raises issues about how readability scores should be used in evaluating accounting narratives.
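To make the Flesch measure concrete, the following minimal sketch computes the Flesch Reading Ease score, 206.835 − 1.015 × (words per sentence) − 84.6 × (syllables per word), where lower scores indicate harder text. The syllable counter is a crude vowel-group heuristic and the function names are hypothetical; both are assumptions made for illustration and are not taken from the studies reviewed here.

```python
import re


def count_syllables(word: str) -> int:
    """Approximate syllables as runs of vowels (illustrative heuristic only)."""
    groups = re.findall(r"[aeiouy]+", word.lower())
    return max(1, len(groups))


def flesch_reading_ease(text: str) -> float:
    """Flesch Reading Ease: higher scores mean easier text."""
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = re.findall(r"[A-Za-z']+", text)
    if not sentences or not words:
        raise ValueError("text must contain at least one sentence and one word")
    syllables = sum(count_syllables(w) for w in words)
    words_per_sentence = len(words) / len(sentences)
    syllables_per_word = syllables / len(words)
    return 206.835 - 1.015 * words_per_sentence - 84.6 * syllables_per_word


if __name__ == "__main__":
    easy = "The cat sat. The dog ran."
    hard = "Consolidated financial statements necessitate comprehensive interpretation."
    print(round(flesch_reading_ease(easy), 2))
    print(round(flesch_reading_ease(hard), 2))
```

A production readability tool would need a dictionary-based syllable counter; the heuristic above only serves to show how word length and sentence length jointly drive the score.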

Using linguistic analysis, Sydserff and Weetman (2002) conducted a piece of research that remains significant to this day, recommending and introducing a new evaluation method, the texture index. Unlike the readability formulas, this index is capable of identifying a denser set of information, and it is not related to the scores produced by readability formulas. Sydserff and Weetman (2002) drew on theoretical and applied linguistics to create a text-oriented strategy for measuring narrative information and, ultimately, to provide a tool that fellow researchers can use extensively for financial report analysis (Garefalakis et al. 2017; Card et al. 1999).

Since the IFRS framework was issued and adopted in 2002, and given today’s enhanced computational knowledge, more studies have focused on evaluation tools that assess the quality of specific items, covering both financial and non-financial information. Such measurement tools gauge the impact that the presence of information in annual reports might have on the decisions its users subsequently make. Castellano and D’Ecclesia (2013) prioritized the quality level of financial reports as well as the implementation of fair-value accounting practices. Gaeremynck and Willekens (2003), on the other hand, sought to understand how audit reports relate to, and are useful in, decision-making based on financial report information. Garefalakis et al. (2016) probe the relationship between the quality of financial reports and business management practices, while Beretta and Bozzolan (2004) examined internal control quality along with information pertaining to risk disclosure. However, these studies examine individual elements of annual reports, so their insights are limited by this partial examination. By contrast, a more detailed, in-depth analysis of financial report quality should yield more wide-ranging and applicable results (Garefalakis and Dimitras 2016; Gaeremynck and Willekens 2003).

To counteract this fragmentary approach to assessing narrative reporting quality, the present study has developed and tested a complex index that thoroughly evaluates the quality of the non-financial information in CARs, while covering all the factors of decision usefulness outlined in the Management Commentary (IASB 2010).

3 The Ma.Co.I and the AIQAD

3.1 Sample and data

This study developed a complex measurement tool for analyzing narrative report quality in the 129 largest companies worldwide for the periods before (2002–2004) and after (2005–2007) IFRS implementation (Appendix 1). We deliberately divided our sample into two parts in order to make the examination easier and more accurate. The first part contains the countries in which disclosure of narrative information (MD&A) is mandatory (e.g., the US and Canada), while the second part contains those in which disclosure of the narrative portion is voluntary (EU countries). The EU countries were then divided into Western European countries (Switzerland, France, Germany, the Netherlands and Belgium) and Northern European countries (Norway, Denmark, Sweden and the UK); Southern and Eastern European countries were not included (Fig. 1).

Fig. 1 Europe by regions (source: https://study.com/academy/lesson/european-countries-map-facts.html)

Northern Europe is shown in blue, Western Europe in light blue, Eastern Europe in red and Southern Europe in green (source: Ecom Ratings). The companies selected constitute a representative sample based on stock market significance and on geographical and industrial variety. Our sampling was determined by five criteria:

  A. Company position.
  B. Market, with its financial and commercial performance and issues considered.
  C. International presence.
  D. Peer groups.
  E. Previous report performance.

3.2 Management Commentary Index

In 2002 the IASB set up a project team staffed with standard-setting national representatives from the UK, New Zealand, Germany and the CICA. The team’s goal was to explore the possibility of developing a set of principles or guideline standards on management commentary practices. Three years later, in October 2005, the IASB published the project’s conclusions in a paper titled Management Commentary. In it, the project team members outline their views on what management commentary comprises, including the practitioners involved, its objectives and the qualitative features it should possess.

The Financial Accounting Standards Board (FASB) and the International Accounting Standards Board (IASB) together published a draft version of an improved International Financial Reporting Standards (IFRS) Practice Statement of Management Commentary (MC) (IASB 2009), which delineates the conceptual context on which accounting standards are to be based. The goal outlined in this 2009 draft was to set objectives and other criteria in order to achieve a high level of accounting standards. These standards would then be used to produce first-class financial report data, essential for subsequent corporate decision-making (FASB 1999; IASB 2009). The draft also suggests that the framework be employed in decision-making, especially for corporate practices or operations that are not currently covered by any accounting standard. MC treats decision-useful information as the most significant aspect of financial reporting and describes it as the information regarding the “reporting entity that is useful to present and potential equity investors, lenders and other creditors in making decisions in their capacity as capital providers” (IASB 2009, 2010).

The amount of narrative information revealed in an annual report determines the quality of an MC. The Ma.Co.I was developed by Garefalakis et al. (2016) to evaluate the amount of narrative information revealed in annual reports, and it complies with the “IFRS Practice Statement of Management Commentary” as recommended by the IASB (2010). The 70 indicators of the Ma.Co.I are classified into five categories (Garefalakis et al. 2016) (Table 1).

Table 1 Key elements of Ma.Co.I (Source: Garefalakis et al. 2016)

Each of the five categories explores a different aspect of a company. Category 1 deals with the nature or structure of the company, for instance competition issues, regulatory matters and the macro context that shapes its market position. Category 2 covers the company’s strategic plans and goals, prioritizing the structure and direction of its strategy. Category 3 comprises the company’s key resources, as well as the risks involved and its relationships with other institutions or entities; special focus is given to inter- and intra-company associations and to the management of crucial risk parameters. Category 4 emphasizes the company’s outcomes and prospects and deals with its financial and non-financial progress. Lastly, Category 5 provides insights into the company’s historical progress as derived from performance measures and other indicators.

3.3 An intelligent qualitative analysis database—(AIQAD)

Once we started evaluating financial report quality, we soon realized how time-consuming and complex the evaluation process was. In fact, the sheer amount of narrative information to be analyzed in each financial report led to a series of errors. By way of illustration, the volume of information to be analyzed averaged 240 pages per report for the years 2002–2007. The massive accumulation of narrative information may be promising, but having the information available does not by itself confer the capability to process it. Given the human brain’s limited cognitive capacity, it was virtually impossible to manage such a large amount of data efficiently for each financial report. To overcome this hindrance, we developed a keyword list per KPI category, for which we considered all possible alternative renderings of each question and its definition. Using a synonym generator, we gathered all the necessary information quickly and effectively.

The approach of the AIQAD model is based on the automatic separation of the word tokens found in management commentary reports into four basic categories, which can occasionally be further divided into many subcategories (Tsamis et al. 2016; Garefalakis et al. 2016). As shown in Fig. 2, these categories are as follows:

Fig. 2 Mapping of the AIQAD report reading process

(A) No Meaning Words: words that are repeated many times and across many documents. They contribute little to understanding the meaning of the text and are simply ignored by the application. (B) Simple Meaning Words: this category contains the largest number of words, which play a secondary role in understanding the content of documents. (C) Special Meaning Words: the most interesting category, initialized from a dedicated file of words with special and distinct meaning. This word list has been built up over years of research, and each word can significantly improve the evaluation and understanding of a document’s content. (D) Arithmetic Variables: unlike other text-processing programs, this application treats numbers not just as arithmetic values but as variables holding a numeric value. This design allows the system to make comparisons rather than just equality tests.
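The four-way separation described above can be sketched roughly as follows. This is only an illustration of the idea: the two word lists, the class name and the function name are hypothetical stand-ins (the actual lists are loaded from the tool’s own files), and numbers are stored as floats so that they can be compared rather than merely matched.

```python
import re
from dataclasses import dataclass, field

# Hypothetical stand-ins for the predefined category A and C word lists.
NO_MEANING = {"the", "a", "of", "and", "in", "to"}
SPECIAL_MEANING = {"risk", "growth", "loss", "strategy"}


@dataclass
class TokenBuckets:
    no_meaning: list = field(default_factory=list)   # category A
    simple: list = field(default_factory=list)       # category B
    special: list = field(default_factory=list)      # category C
    numeric: list = field(default_factory=list)      # category D


def classify(text: str) -> TokenBuckets:
    """Split a text into tokens and assign each one to a category."""
    buckets = TokenBuckets()
    for token in re.findall(r"\d+(?:\.\d+)?|[A-Za-z]+", text):
        if token[0].isdigit():
            # Keep the numeric value so that comparisons are possible.
            buckets.numeric.append(float(token))
            continue
        word = token.lower()
        if word in NO_MEANING:
            buckets.no_meaning.append(word)
        elif word in SPECIAL_MEANING:
            buckets.special.append(word)
        else:
            buckets.simple.append(word)
    return buckets


if __name__ == "__main__":
    result = classify("The risk of loss rose to 12.5 in 2006")
    print(result.special, result.numeric)
```

In this sketch categories A and C are fixed in advance, while categories B and D are filled only from the text being parsed.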

Categories A and C are initialized from dedicated XML files (Bray et al. 1997). This information remains fixed and can change only through the intervention of its creator. In contrast, categories B and D build up their contents in real time while the documents are parsed; their data depend only on the content of the parsed files. After this stage, the system builds a dictionary that includes all the distinct words found in all reports, together with an inverted file that points from each word to the documents, and the specific positions within them, in which it appears. This step is necessary both for fast searching and for having enough data to compute the statistics of the next step correctly. The statistical step involves a variety of techniques whose purpose is to interpret and evaluate the content of reports automatically. These techniques are based on the absolute and relative positions of word tokens inside the text, the occurrence or absence of their synonyms, their frequency of occurrence in a single document or across several files, and many other characteristics (Tsamis et al. 2016; Garefalakis et al. 2016).
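As an illustration of the dictionary and inverted file described above, the sketch below maps every word to the documents and token positions in which it occurs. The function name, the report identifiers and the sample texts are hypothetical, and the tokenization is deliberately simplistic; it is not the tool’s actual implementation.

```python
import re


def build_inverted_index(reports: dict[str, str]):
    """Return (dictionary, index), where index[word][doc_id] lists token positions."""
    index: dict[str, dict[str, list[int]]] = {}
    for doc_id, text in reports.items():
        tokens = re.findall(r"[a-z]+", text.lower())
        for position, token in enumerate(tokens):
            index.setdefault(token, {}).setdefault(doc_id, []).append(position)
    dictionary = sorted(index)  # all distinct words across all reports
    return dictionary, index


if __name__ == "__main__":
    # Hypothetical miniature "reports" used only to exercise the sketch.
    reports = {"r1": "Risk and growth. Risk again", "r2": "growth strategy"}
    dictionary, index = build_inverted_index(reports)
    print(dictionary)
    print(index["risk"])
```

Because every occurrence is stored with its position, the same structure supports both fast lookup and the frequency and position statistics used in the next step.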

Using information retrieval techniques, we treat documents as bags of word tokens that individually have no significant impact on the meaning of the text (except for those in category C), but whose position and co-occurrence can help considerably in extracting new information that assists content evaluation (Tsamis et al. 2016; Garefalakis et al. 2016).

To evaluate the content of documents properly, we use the Vector Space model and treat the reports as vectors in a multidimensional space (Manning et al. 2008; Raghavan and Wong 1986). The number of dimensions equals the number of distinct word tokens detected across all documents, and the weight of each vector component is based on the position of the word inside the report and its frequency of appearance. Special emphasis is given to the presence or absence of the word tokens in category C, “Special Meaning Words”, since we have observed that they significantly influence the meaning of management commentary reports, both at the points in the text where they occur and for the document as a whole.
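A simplified sketch of this representation is given below. It uses raw term frequencies only, leaving out the position-based weighting mentioned above, and it emphasizes special words with a constant multiplier. The word list, the multiplier value and all function names are assumptions introduced for illustration; cosine similarity is shown as one common way of comparing vectors in this model, not as the tool’s documented measure.

```python
import math
import re
from collections import Counter

SPECIAL_WORDS = {"risk", "strategy"}  # hypothetical category C entries
SPECIAL_BOOST = 2.0                   # hypothetical emphasis factor


def tokenize(text: str) -> list[str]:
    return re.findall(r"[a-z]+", text.lower())


def build_vocabulary(documents: list[str]) -> list[str]:
    """One dimension per distinct word token across all documents."""
    return sorted({token for doc in documents for token in tokenize(doc)})


def to_vector(text: str, vocabulary: list[str]) -> list[float]:
    """Term-frequency vector with extra weight on special words."""
    counts = Counter(tokenize(text))
    return [
        counts[term] * (SPECIAL_BOOST if term in SPECIAL_WORDS else 1.0)
        for term in vocabulary
    ]


def cosine(u: list[float], v: list[float]) -> float:
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


if __name__ == "__main__":
    docs = ["risk risk growth", "risk growth", "dividend policy"]
    vocab = build_vocabulary(docs)
    vectors = [to_vector(d, vocab) for d in docs]
    print(cosine(vectors[0], vectors[1]), cosine(vectors[0], vectors[2]))
```

Reports that share many weighted terms end up close together in the space, while reports with no terms in common have a similarity of zero.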

As shown in Fig. 3, the words in the Special Meaning Words category are divided, at a theoretical level, into three subcategories: Good Words, Bad Words and KPIs. This separation reflects the specific meaning that each word holds, which is why we cannot exclude the possibility that a word belongs to more than one set, depending on how it is used. With a synonym generator application we can discover all possible synonyms of the Good, Bad or KPI words. Naturally, the synonym sets are larger than the initial sets (Tsamis et al. 2016; Garefalakis et al. 2016).

Fig. 3 Words types analysis

With this capability, the application can discover all the possible ways in which authors might formulate their thoughts in their reports. To produce synonyms we use the MIT Java Wordnet Interface (JWI), a Java library for interfacing with the Wordnet electronic dictionary (Manning et al. 2008). It features API calls to retrieve index words, synsets and morphological exceptions from the Wordnet data files. It also has calls for following lexical and semantic pointers, and contains classes that can perform simple morphological processing. The library has no GUI elements and is freely available for all purposes, as long as proper acknowledgement is made.

By the end of the process illustrated in Figs. 2 and 3, the program allows the data to be extracted into a word file that summarizes the data acquired from one or several financial reports. Each piece of information is accompanied by the page and paragraph in which it was mentioned, to ensure the validity of the results and to give us instant access to the data for future reference. This largely contributes to the transition from a purely quantitative measurement analysis to an essentially qualitative one.

We then used the binary method to quantify the narrative information, following Cooke (1992) and Beattie et al. (2004): each question-indicator of the Ma.Co.I is rated 1 if it is included in the CAR and 0 if it is not. Consequently, the total MC score is measured with the following formula:

$$ {\text{MCscore}} = \frac{{{\text{sum }}\;{\text{of}}\;{\text{indicators}}\;{\text{rated}}\;{\text{as}}\;1 }}{{{\text{total}}\;{\text{indicators}}}} \times 100 $$
(1)

The MC score represents Ma.Co.I quality, measured as the total amount of narrative information included in CARs. The Ma.Co.I quality of each company therefore ranges from 0 to 100 when expressed as a percentage (Appendix 2), or from 0 to 70 when expressed as a count of KPIs.
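Equation (1) amounts to a simple proportion, as the following minimal sketch shows. The function name and the example ratings are hypothetical and are not taken from the sample; the only inputs are the 0/1 ratings of the indicators.

```python
def mc_score(ratings: list[int]) -> float:
    """Percentage of Ma.Co.I indicators rated 1, following Eq. (1)."""
    if not ratings:
        raise ValueError("at least one indicator rating is required")
    if any(r not in (0, 1) for r in ratings):
        raise ValueError("each indicator must be rated 0 or 1")
    return sum(ratings) / len(ratings) * 100


if __name__ == "__main__":
    # Hypothetical company disclosing 35 of the 70 indicators.
    example = [1] * 35 + [0] * 35
    print(mc_score(example))  # 50.0
```

Under this scoring every indicator carries the same weight, so the score reflects how many indicators are covered rather than how extensively each one is discussed.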

4 Analysis per geographical region

Tables 2 and 3 show the change in the quality of the Ma.Co.I, and thus in the quality of MC, following the adoption of the IFRS. They demonstrate that the implementation of the IFRS has had generally positive results for the quality of CARs, though perhaps not to the extent that some might have expected (IASB 2010). This is why many modifications are still being made today to find the golden mean between the quantity and the quality of the economic and narrative information that each company must disclose, regardless of country, size, region, applicable law and language. A second important finding concerns the geographical dimension of the firms in the sample. The mandatory disclosure of narrative information applied by US firms does not make much difference to the quality of their CARs compared with the voluntary disclosure applied to date by the EU to its own firms. Therefore, the problems that arise in CARs do not appear to derive to any significant extent from voluntary disclosure, as several studies in the literature review report (Garefalakis and Dimitras 2016; Hou et al. 2017; Lin et al. 2014). Whether a company chooses to disclose its information (financial or narrative) in order to provide better and more reliable information to its investors and shareholders does not appear to be affected by whether the implementation framework in its country is mandatory or optional.

Table 2 Ma.Co.I scores per geographical area
Table 3 Analysis per category for periods P-IFRS and A-IFRS

However, it should be noted that the attributable quality of the financial statements in the USA is 51%, just above the sample average, while in Northern Europe and Western Europe it is 48% and 47% respectively. This shows, among other things, that firms’ financial statements still have considerable room for improvement.

Across the five categories of the Ma.Co.I, an overall increase in the quality of information is observed for the companies in the sample (Table 3). Specifically, Category 1 recorded 68%, the highest percentage of attributable quality in the Ma.Co.I. This category includes low-risk disclosure information aimed at keeping investors informed, i.e., the business sector, the markets targeted, the company’s products and its structure in general.

Category 3 shows the largest increase in the percentage of disclosed information, rising to 65% after the application of the IFRS from 57% in the period before it. This increase in quality is significant because the category describes information about the risks assumed, the available resources and the business relationships. Categories 4 and 5 show a significant increase of 4–5%, although the information in these categories is lacking compared with the other categories, as their percentages are below 50%. Finally, in Category 2 the quality of the information remains stable across both periods.

5 Conclusion

A comprehensive assessment of the quality of financial reports is important, as it may provide appropriate information for improving the disclosure quality that supports economic decision-making and may in parallel enhance overall market efficiency (IASB 2006, 2009). The massive accumulation of narrative information may be promising, but having the information available does not by itself confer the capability to process it. Given the human brain’s limited cognitive capacity, it was virtually impossible to manage such a large amount of data efficiently for each financial report.

To overcome this hindrance, we developed the G&T AIQAD tool, which translates the qualitative information provided in the MC framework into a quantitative measure that can be used in a variety of experimental and other research contexts.

The G&T AIQAD contains a keyword list per KPI category, for which we considered all possible alternative renderings of each question and its definition. Using a synonym generator, we gathered all the necessary information quickly and effectively. The approach is based on the automatic separation of the word tokens found in management commentary reports into four basic categories, which can occasionally be further divided into many subcategories.

Likewise, the G&T AIQAD tool permits the evaluation of any publicly traded firm, unlike the size-biased and limited analysts’ ratings provided through the AIMR reports.

Additionally, the International Standards are a set of accounting principles, rules, methods and procedures whose introduction leads to uniformity in the preparation of financial statements and thus to accurate, true and uniform information for users (investors, shareholders, creditors, employees, tax authorities). The strengthening of the narrative information framework also reinforces the quality of CARs in every case. However, the mandatory disclosure of narrative information applied by US firms does not make a great difference to the quality of their CARs compared with the voluntary disclosure applied to date by the EU to its own firms. Finally, firms need to change their way of thinking and quickly adapt to the new conditions by reducing the past-oriented disclosure provided by the financial statements (without going to the other extreme) and increasing the future-oriented information provided by the narrative indicators. There is thus considerable room for improvement in the narrative indicators; what remains is to find the golden mean.