1 Introduction

Corporate valuation based on multiples is very popular among financial analysts (for equity analyst reports, see Asquith et al. 2005). Practitioners and researchers alike are therefore interested in the accuracy of the estimates provided by this valuation method. Research on the accuracy of multiple-based valuation covers processes to select appropriate groups of comparable firms (see, e.g., Herrmann and Richter 2003; Henschke and Homburg 2009), methods to aggregate the single multiples calculated for the comparable firms into one estimator to be applied to the value driver figure of the target firm (see, e.g., Kaplan and Ruback 1996; Dittmann and Maug 2008), and the use of estimated future value driver figures compared to realized ones (Kim and Ritter 1999; Schreiner and Spremann 2007). Additionally, researchers have examined which of the many different combinations of value drivers and market prices that constitute a multiple produces the most accurate valuation results (see, e.g., Alford 1992; Schreiner and Spremann 2007; Liu et al. 2002 and 2007).

Multiples used in valuation relate a figure reflecting corporate value (equity, enterprise or firm value) to a value driver figure measuring the surplus earned by the firm (e.g. sales, EBITDA, EBIT or EBT). Value and value driver definitions need to be aligned in their components with respect to the accounts included and the corresponding cost of capital on these. We use the term “consistency” to describe this requirement and analyze the impact of consistency upon the accuracy of value estimates derived by multiple-based valuation methods. Despite the huge variety of potential combinations of value driver figures and market prices, not all multiples make sense. Some inconsistent combinations can easily be detected by “borrowing” theory from related DCF valuation models: obvious cases confuse equity and entity models, e.g. by relating an entity (before-interest) value driver figure such as EBITDA to an equity value variable such as the market capitalization. Some authors directly address the consistency requirement: Schreiner and Spremann (2007) point out that the entity/equity consistency requirement is quite often violated by multiples used in practice (p. 7) and analyze in general the performance of entity and equity multiples. They also compare the accuracy of earnings-based against cash flow-based multiples (Schreiner and Spremann 2007, p. 16; see also Liu et al. 2007). Our analysis goes beyond the straightforward entity/equity consistency requirement covered by other studies: we investigate the consistency requirement and its impact for several other, less straightforward debt positions such as pension reserves, non-interest-bearing debt, finance leases, minority interests as well as investments in associates and joint ventures. Conversations with financial analysts support the view that there is a considerable degree of disagreement on the proper treatment of these positions in multiple-based valuations.
Borrowing some theory from DCF models, we apply the rule of thumb distinguishing entity- and equity-based multiples to the debt positions mentioned above: if the (explicit or implicit) cost of capital for a debt position has not yet been deducted from the value driver figure, then the corresponding debt position itself is still part of the (net) debt definition of this multiple. Applying this rule yields some surprising insights on well-known multiple definitions: the implicit cost of capital on accounts payable consists of higher prices for goods and services delivered by suppliers, reflected in higher cost of goods sold. Consequently, multiples based on revenues have to include accounts payable in their net debt definition if the consistency requirement is to be met. In contrast, for multiples based on EBITDA, cost of goods sold and thus the implicit cost of accounts payable have already been deducted, and therefore the value of accounts payable is not part of the debt/net debt definition. To the best of our knowledge, none of the empirical studies takes this requirement into account.

In order to analyze the impact of consistency upon valuation accuracy, we employ a standard hold-out routine to produce value estimates. Comparing these estimates against observable market values allows us to calculate the deviation between the two and to derive different error measures. Dittmann and Maug (2008) have pointed to potential biases stemming from inappropriate combinations of error measures and aggregation methods applied to the comparable firms’ multiples. Taking this potential bias into account, we combine four different multiple aggregation methods (arithmetic mean, harmonic mean, median and geometric mean) with two different error measures (log-scaled error and absolute log-scaled error).

Our main finding is that in most cases consistency increases the accuracy of multiple-based valuation. Moving from inconsistent multiple definitions that mismatch entity and equity figures to consistently defined multiples reduces the median absolute log-scaled valuation error by between 2 and 14 percentage points. For sales- and net profit-based multiples, the definitions meeting the highest consistency requirements display the highest valuation accuracy. However, for EBITDA- and EBIT-based multiples we find that two out of five consistency adjustments (adjusting net debt for accounts payable and for investments in associates/joint ventures) yield a lower valuation accuracy. Our results also support the findings of Dittmann and Maug (2008) and provide evidence on biases for different combinations of error and aggregation measures. Arithmetic mean aggregation, still the most common aggregation method in practice, produces significantly upwards biased results: mean log errors are above 50 % for many (consistent and inconsistent) multiples.Footnote 1 Combining log-scaled errors with geometric mean aggregation of peer multiples provides unbiased estimates. Additionally, we are interested in the accuracy of the multiples within the consistent group, especially in the comparison of enterprise value-based multiples against equity value-based multiples. We find that consistently defined EBITDA multiples have the highest valuation accuracy, followed by EBIT, net income and sales-based multiples.

By transferring market price relations from observable markets to other assets that are not regularly traded, multiple-based valuation in general relies on the assumption that there are no systematic deviations between market prices and the respective intrinsic values. A general critique of this valuation procedure is thus that potential over- and undervaluations on these markets are transferred to other segments when market multiples are applied to the earnings figures of non-traded assets in order to price them. As our analysis aims to reproduce market prices and measures accuracy by the deviation between value estimates and prices, it rests on the same assumption.

The rest of the paper is organized as follows: Sect. 2 analyzes consistency requirements and their impact upon multiple definitions. Section 3 gives an overview of other issues related to the accuracy of multiple-based valuation and Sect. 4 discusses some measurement problems. Our database is described in Sect. 5, whereas our results are presented in Sect. 6. Section 7 concludes.

2 Consistency in multiple-based valuation

2.1 General consistency requirements

General requirements with respect to consistency can be recognized by borrowing theory from the discounted cash flow valuation method. Accordingly, we define the enterprise value as the value of equity plus gross debt minus financial assets (thus reflecting the value of the operating assets) and the firm value as the value of equity plus gross debt (thus reflecting the value of all firm assets). The distinction between entity and equity models can be transferred to multiple definitions: value drivers reflecting the operating surplus of the firm should be related to enterprise values, i.e. the value of the operating assets. Entity values or firm values (gross debt and equity) should be combined with value driver figures reflecting the surplus of all (financial and operating) assets before interest expenses. Finally, equity values should be related to value driver figures after interest expenses. Table 1 gives an overview of consistent and inconsistent multiple definitions following these rules:

Table 1 Consistent and inconsistent multiple definitions

As can be seen, some quite popular multiples already display inconsistencies on this general level: equity value to sales multiples combine an operating surplus with equity values and are thus distorted either by additional financial assets being part of the equity value (whereas the corresponding interest income is not reflected in revenues) or by corporate debt generating interest expense that is not deducted from revenues (whereas equity reflects the shareholders’ wealth net of debt).
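For reference, the two value definitions of this section can be written compactly. The symbols E (market value of equity), D (gross debt) and FA (financial assets) are shorthand introduced here for illustration only:

$$EV = E + D - FA, \qquad FV = E + D$$

With purely hypothetical figures E = 100, D = 40 and FA = 10, this gives EV = 130 and FV = 140; the two values differ exactly by the financial assets.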

2.2 Consistent enterprise value definitions

Beyond the general consistency requirements there are additional requirements which relate to different net debt definitions in enterprise value calculations. The benchmark case is the simplest version, netting interest-bearing debt against holdings of cash and cash equivalents that provide interest income. In the following we analyze whether certain liabilities/financial assets should be included in the net debt definition (i.e. be added to debt or to cash and cash equivalents). The starting point for tackling this question is again the distinction between equity and entity discounted cash flow valuation models: if the cost of capital for a particular funding source has not yet been deducted from the cash flow or earnings figure, the market value of the liability, as the present value of the future cost and the redemption attached to it, has to be deducted from the enterprise/firm value to arrive at the equity value. Applying this rule of thumb to interest-bearing debt is straightforward: as the interest expense on debt has not yet been deducted from sales, EBITDA, EBIT etc., the debt value itself (as the present value of future interest and redemption) is still part of the enterprise value, and thus its market value has to be deducted from the enterprise value to calculate the value of equity. We will now apply this rule to different balance sheet items in order to analyze the consistency requirement.

2.2.1 Non-interest-bearing debt

Applying the rule of thumb from above to non-interest-bearing debt, the true “cost” of those categories of debt has to be determined. As the cost of capital of those liabilities is not directly reflected as interest expense in the profit and loss statement, one has to derive the implicit cost of the funds provided by these liabilities.

  • Advance payments: Customers make advance payments and by doing so provide the firm with credit; this credit is redeemed by netting the amount against the revenues when they are realized. Of course the customer credit is not costless, despite the fact that its costs are not directly reflected in the profit and loss statement: a customer required to make an advance payment will demand a discount on the price. Thus, the implicit costs of advance payments are foregone revenues: the firm would have realized higher prices and revenues without advance payments.

  • Accounts payable: A similar argument applies to suppliers’ credit; if the firm’s suppliers of goods and services have to wait for their money, they will charge higher prices. The implicit cost of accounts payable is reflected in higher expenses for raw materials and services and thus in higher cost of goods sold in the firm’s profit and loss statement.

Applying the rule of thumb to advance payments, we find that for any operating earnings measure to be related to enterprise values, advance payments should never be part of net debt. As the calculated multiples rely on the revenues the firm realizes under an advance payment regime, the cost of capital in the form of foregone revenues has already been deducted from the sales, EBITDA, EBIT, or cash flow figure. For accounts payable the answer is not that simple: as the cost of capital is part of the cost of goods sold in the profit and loss statement, the result of applying the rule of thumb depends on the operating earnings measure related to enterprise values:

  • For enterprise value/sales multiples the implicit cost of accounts payable, as part of the cost of goods sold, is not yet deducted from sales. Consequently, the debt and net debt definition to be applied has to include the value of accounts payable.

  • For enterprise value/EBITDA, enterprise value/EBIT and enterprise value/NOPLAT multiples, cost of goods sold and thus the implicit cost of accounts payable are already deducted. For these multiples the value of accounts payable is not part of the debt/net debt definition.

Therefore there is no unique definition of debt or net debt in enterprise value based valuation. The appropriate debt/net debt definition depends on the operating earnings measure and the corresponding enterprise value multiple chosen.Footnote 2
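The rule of thumb can be expressed as a simple comparison of positions in the profit and loss statement. The following is a minimal sketch only: the stage numbers, the dictionary names and the function `belongs_to_net_debt` are assumptions made for illustration and are not part of the original analysis.

```python
# Illustrative sketch of the rule of thumb; all names and stage numbers are assumptions.

# Position in the profit and loss statement at which each value driver is measured.
DRIVER_STAGE = {"sales": 0, "EBITDA": 1, "EBIT": 2, "net_income": 3}

# Stage from which the (explicit or implicit) cost of a funding source is deducted.
# A cost booked at stage s is already reflected in every driver with stage >= s.
COST_STAGE = {
    "advance_payments": 0,       # foregone revenues: already reflected in sales
    "accounts_payable": 1,       # part of cost of goods sold: deducted before EBITDA
    "interest_bearing_debt": 3,  # interest expense: deducted below EBIT
}


def belongs_to_net_debt(liability: str, driver: str) -> bool:
    """Return True if the liability's cost is not yet deducted from the value driver."""
    return COST_STAGE[liability] > DRIVER_STAGE[driver]


if __name__ == "__main__":
    for liability in COST_STAGE:
        for driver in DRIVER_STAGE:
            print(f"{liability:>22} / {driver:<10} -> {belongs_to_net_debt(liability, driver)}")
```

Under these assumed stages, accounts payable enter net debt only for sales multiples, advance payments never do, and interest-bearing debt drops out only for the after-interest (equity) driver.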

2.2.2 Pension reserves

As German corporate law does not require firms promising pension payments to separate pension assets and liabilities from their balance sheets, the firms are directly liable for the pensions and are allowed to keep the funds at the corporate level. Apart from the mandatory insurance against bankruptcy with the Pensions-Sicherungs-Verein Versicherungsverein auf Gegenseitigkeit, there is no restriction on how the firm may use the funds.Footnote 3 Therefore corporations in Germany can use pension reserves as a financing tool. As under this regime pension reserves are a liability of the firm, the question arises whether they are part of the debt/net debt definition for enterprise value-based valuation. Applying again the rule of thumb from above, the answer depends on whether the cost of capital on the pension reserves has already been deducted from the operating earnings measure used. This, in turn, requires a closer look at the cost of pension reserves and pension obligations of the firm: the profit and loss statement shows interest cost and service cost as expenses directly related to the pension obligation. The service cost in general represents the annualized amount of funds the firm has to set aside in order to cover future pension payments. Unfortunately, it is difficult to properly estimate the “true” costs of pension obligations (see on this problem, e.g., Drukarczyk 1990; Schwetzler 2003a). These costs depend on how much cash wage is substituted by the service cost as the current expense to cover future pension payments of the firm. The most convenient assumption for financial analysts is a 100 % cash wage substitution: in this case, the service cost is completely financed by the employee via substituted wage and thus the pension payment is not a “gift” from the firm. Under this assumption the pension reserve is a long-term credit fully financed by the employee; the cost of this credit equals the interest cost in the profit and loss statement.Footnote 4

Finally, the consistency requirement for pension reserves depends on where the interest cost is included in the firm’s profit and loss statement. Under US-GAAP, interest costs on pension obligations are seen as part of the labour costs and represented in the cost of goods sold of firms’ profit and loss statements. Under IFRS, interest costs on pension obligations are either part of the firm’s cost of goods sold or financial earnings. Under the current German GAAP, interest costs on pension obligations are part of the financial earnings. Before the introduction of the Bilanzrechtsmodernisierungsgesetz, companies could either report the interest cost on pension obligations as part of the cost of goods sold or financial earnings. Assuming the interest costs on pension obligations are reported as part of the financial earnings, the consistent treatment of pension reserves takes them as part of the debt definition for enterprise value-based multiples: the interest cost of pension reserves is not deducted from operating earnings figures that are commonly used for enterprise value-based multiples (revenues, EBITDA, EBIT or NOPLAT).Footnote 5

2.2.3 Finance lease

Financial theory suggests that lease contracts have cash flow and risk properties similar to those of a long-term debt contract. Thus IAS 17 requires finance lease contracts to be treated in the annual statement of the firm like a purchase of the corresponding asset fully financed by an amortizing loan. The lease payments are split into interest payments and the repayment of capital. The payment obligation is shown as a liability on the balance sheet. The leased assets are depreciated like other assets on the balance sheet. As the interest part of the lease is separated from operating profit figures such as sales, EBITDA and EBIT, the finance lease obligation on the balance sheet is to be considered as debt for the purpose of calculating consistent multiples. Around 48 % of our sample’s firm years apply IFRS as accounting standards. For the roughly 31 % of our data relating to German GAAP (HGB), the treatment of finance leases is less clear: commentaries on the standards also recommend separating the interest component of the lease payment and reporting it as interest expense. We assume this to be the case for the majority of our observations and proceed as in the IFRS case, adding finance leases to debt in our enterprise value definitions.

2.2.4 Minority interest

If a company has a direct and indirect shareholding of less than 100 % in another company but has control over decisions/voting rights, the parent company has to fully consolidate its subsidiary. As a consequence, the parent company accounts for 100 % of the subsidiary’s profit and loss statement financials as well as net debt, but is entitled only to the equity corresponding to its shareholding. Therefore, the parent company is required to account for the minority interest, which is the remaining equity value over which the company has control but not ownership. As multiple-based valuation usually rests on consolidated annual statements, all value driver figures used relate to the group’s total liabilities (and thus equity) figures, whereas the resulting equity value should reflect the majority’s equity position.

As a consequence, minority interest in subsidiaries has to be added to net debt for enterprise value multiples whenever consolidated profit and loss financials are used for the calculation of multiples.

2.2.5 Investments in associates and joint ventures

When calculating multiples, the usual starting point for calculating the enterprise value is the market value of equity. The commonly used market capitalization reflects the equity value of the entire company and therefore includes the equity value of all the company’s financial assets. If the firm holds a non-controlling stake in another firm’s equity as an associate and/or joint venture, this stake is not consolidated, but shown “at equity” as a financial asset on its consolidated balance sheet. The income from associates and joint ventures is treated as financial income and included in the profit and loss statement below the operating income. The firm’s equity stake in associates and joint ventures thus has to be treated as a financial asset and included in the net debt definition when deriving the firm’s enterprise value.
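Taken together, the adjustments of Sects. 2.2.1 to 2.2.5 describe a bridge from market capitalization to a consistent enterprise value. The sketch below is one possible reading of that bridge, not the authors' implementation: the function name, the parameter names and all figures are hypothetical, and pension reserves are included under the assumption that their interest cost is reported in financial earnings.

```python
def enterprise_value(market_cap, debt, cash, pension_reserves=0.0,
                     finance_leases=0.0, minority_interest=0.0,
                     associates_jv=0.0, accounts_payable=0.0,
                     driver="EBITDA"):
    """Enterprise value under the consistency adjustments of Sect. 2.2 (sketch)."""
    net_debt = debt - cash            # benchmark: interest-bearing debt less cash and equivalents
    net_debt += pension_reserves      # 2.2.2: assumes interest cost sits in financial earnings
    net_debt += finance_leases        # 2.2.3: lease obligation treated as debt
    net_debt += minority_interest     # 2.2.4: consolidated figures include minorities
    net_debt -= associates_jv         # 2.2.5: at-equity stakes are financial assets
    if driver == "sales":             # 2.2.1: accounts payable only for sales multiples
        net_debt += accounts_payable
    return market_cap + net_debt


if __name__ == "__main__":
    # Hypothetical firm, all figures in the same currency unit.
    firm = dict(market_cap=500.0, debt=200.0, cash=50.0, pension_reserves=80.0,
                finance_leases=30.0, minority_interest=20.0,
                associates_jv=40.0, accounts_payable=60.0)
    print("EV for EBITDA multiple:", enterprise_value(driver="EBITDA", **firm))
    print("EV for sales multiple: ", enterprise_value(driver="sales", **firm))
```

The same firm thus carries two different enterprise values depending on the value driver, which is the point made in Sect. 2.2.1: the sales-based figure exceeds the EBITDA-based one by the value of accounts payable.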

2.3 Consistent multiple definitions

Table 2 below gives an overview of consistent multiple definitions for different enterprise value and equity value-based multiples:

Table 2 Consistent multiple definitions

3 Impact factors on the accuracy of multiple-based valuation

A multiple relates a value driver figure assumed to be proportional to value, e.g. sales or EBIT, to a price figure, e.g. the market capitalization or the enterprise value. Selecting “comparable” firms that are as similar as possible to the firm to be valued and relating their value drivers to their observable market values allows multiples to be calculated. These multiples are transferred to other firms by applying them to their value driver figures. As this is a convenient and easily applicable valuation procedure, practitioners and researchers alike have been concerned with the accuracy of the value estimates it produces. Several empirical studies have analyzed ways to improve the quality of these estimates; research has been performed along several dimensions of the valuation procedure:

3.1 The selection of comparable firms

The basic idea of multiple-based valuation is that similar assets should trade at similar prices. Thus the degree of similarity should have an impact upon the accuracy of the valuation method. The standard procedure of financial analysts takes the industry affiliation of the target firm as a starting point: the “peers” that serve as comparables do business in the same industry, thus sharing a similar operating risk and similar growth prospects. Several research studies take the SIC or other industry affiliation codes as a classification measure for the comparable firms (see, e.g., Alford 1992; Schreiner and Spremann (2007) use a different system that also refers to an industry classification). Herrmann and Richter (2003) and Henschke and Homburg (2009) provide evidence on the relevance of additional firm-specific information when selecting the comparable firms: the accuracy of their value estimates increases significantly when certain financial ratios are incorporated in the selection procedure.

3.2 The timeliness of the value driver figures

When calculating multiples for the comparable firms, financial analysts have to rely on the available data for the value driver figures employed, e.g. sales, EBIT etc. In many cases only realized figures from the past, e.g. from the most recent annual statement, are available and allow for the calculation of trailing multiples. Bigger firms are covered by financial analysts who write research reports which often include forecasts of financials. Financial service providers such as I/B/E/S collect these forecasts and aggregate them into consensus data. These data allow the calculation of forward multiples. Kim and Ritter (1999) (for the case of IPOs) and Schreiner and Spremann (2007) provide evidence for the superiority of forward over trailing multiples, with forward multiples providing value estimates with a higher accuracy.Footnote 6

3.3 The value driver figure chosen for the multiple

There is still a large number of multiples that meet the above consistency requirements. Research has also addressed the question of which of the consistent multiples offers the highest valuation accuracy. Schreiner and Spremann (2007) and Liu et al. (2007) show that earnings-based multiples provide higher accuracy than multiples that are based on cash flow figures. Additionally, bottom-line value driver figures outperform top-line ones. Schreiner and Spremann (2007) also found “knowledge-based” value drivers to produce more accurate results than “traditional” ones and equity-related multiples to outperform entity-related ones.

3.4 The aggregation measure applied on the comparables’ multiples

In order to improve valuation accuracy, researchers and financial analysts usually collect several similar firms to serve as comparables and compile them into a peer group. As perfectly matching comparables do not exist, there will be a distribution of different realized values for the peers’ multiples that has to be aggregated into a single figure in order to serve as an estimator for the multiple to be applied to the value driver figure of the target firm. For this aggregation several statistical measures of the central tendency of the comparables’ multiple distribution are at hand.

3.4.1 Arithmetic mean

The formula for the aggregated multiple of a peer group of firms in this case is

$$\bar{X}^{A} = \mathop \sum \limits_{i = 1}^{n} \left[ {\frac{{P_{i} }}{{\gamma_{i} }}} \right]\frac{1}{n}$$
(1)

\(P_{i}\) denotes the value/price variable and \(\gamma_{i}\) the value driver of firm i; i = 1…n indexes the firms in the peer group. In practice analysts widely use the average of the peers’ multiples as the aggregated multiple to be applied to the firm’s value driver variable.

3.4.2 Harmonic mean

The harmonic mean of the peers’ multiple distribution is defined as

$$\bar{X}^{H} = \frac{1}{{\left[ {\mathop \sum \nolimits_{i = 1}^{n} \frac{1}{{\left[ {\frac{{P_{i} }}{{\gamma_{i} }}} \right]}}} \right]:n}}$$
(2)

In a first step the reciprocal value of each single multiple is calculated. Then the average of these reciprocals is computed; finally the reciprocal value of this average is taken. For strictly positive multiples the harmonic mean is never higher than the arithmetic mean, and it is strictly lower unless all peer multiples are identical. The reason for this result is the convexity of the inverse function.
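The convexity argument can be made explicit with Jensen's inequality. Writing \(m_{i} = P_{i}/\gamma_{i}\) for the peer multiples (a shorthand introduced here) and assuming all \(m_{i}\) are strictly positive, the convex function \(f(x) = 1/x\) gives

$$\frac{1}{n}\sum\limits_{i = 1}^{n} \frac{1}{m_{i}} \ge \frac{1}{\frac{1}{n}\sum\nolimits_{i = 1}^{n} m_{i}} \quad \Longrightarrow \quad \bar{X}^{H} = \frac{1}{\frac{1}{n}\sum\nolimits_{i = 1}^{n} \frac{1}{m_{i}}} \le \frac{1}{n}\sum\limits_{i = 1}^{n} m_{i} = \bar{X}^{A}$$

As a hypothetical two-firm illustration, multiples of 5 and 20 have an arithmetic mean of 12.5 but a harmonic mean of 2/(0.2 + 0.05) = 8.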

3.4.3 Median

The median of the peer group’s multiple distribution is the value separating the upper 50 % of observations from the lower 50 %. With F denoting the empirical distribution function of the peers’ multiples, the definition is

$$\bar{X}^{M} = { \inf }\left\{ {\frac{{P_{i} }}{{\gamma_{i} }}:F\left( {\frac{{P_{i} }}{{\gamma_{i} }}} \right) \ge \frac{1}{2}} \right\}$$
(3)

3.4.4 Geometric mean

The geometric mean is defined by

$$\bar{X}^{G} = \left[ {\mathop \prod \limits_{i = 1}^{n} \left[ {\frac{{P_{i} }}{{\gamma_{i} }}} \right]} \right]^{\frac{1}{n}}$$
(4)

The geometric mean of the multiples’ distribution is equal to the exponential of the arithmetic mean of the log-scaled multiples:

$$\bar{X}^{G} = { \exp }\left\{ {\frac{1}{n}\mathop \sum \limits_{i = 1}^{n} \ln \left[ {\frac{{P_{i} }}{{\gamma_{i} }}} \right]} \right\}$$
(5)

When judging the different aggregation measures, there is general agreement that the arithmetic mean is heavily affected by outliers. Thus research in general relies on the median or the harmonic mean as the aggregate multiple (see, e.g., Baker and Ruback 1999; Schreiner and Spremann 2007; Henschke and Homburg 2009; Liu et al. 2002); Herrmann and Richter (2003) additionally use the geometric mean.
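The four aggregation measures of Eqs. (1) to (5) can be sketched in a few lines of Python. The function names and the peer multiples below are hypothetical; the median is implemented as the lower median so that it matches the infimum definition in Eq. (3).

```python
import math


def arithmetic_mean(multiples):
    """Eq. (1): simple average of the peer multiples."""
    return sum(multiples) / len(multiples)


def harmonic_mean(multiples):
    """Eq. (2): reciprocal of the average reciprocal multiple."""
    return len(multiples) / sum(1.0 / m for m in multiples)


def median(multiples):
    """Eq. (3): smallest multiple whose empirical distribution value is >= 1/2."""
    return sorted(multiples)[(len(multiples) - 1) // 2]


def geometric_mean(multiples):
    """Eqs. (4)/(5): exponential of the average log multiple."""
    return math.exp(sum(math.log(m) for m in multiples) / len(multiples))


if __name__ == "__main__":
    # Hypothetical peer group with one outlier (40.0).
    peer_multiples = [4.0, 5.0, 8.0, 10.0, 40.0]
    for fn in (arithmetic_mean, harmonic_mean, median, geometric_mean):
        print(f"{fn.__name__:>16}: {fn(peer_multiples):.2f}")
```

In this hypothetical peer group the single outlier pulls the arithmetic mean well above the other three measures, which illustrates why the arithmetic mean is regarded as sensitive to outliers.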

4 Measuring the quality of multiple based value estimates

4.1 Producing value estimates

In order to calculate value estimates we rely on a standard holdout procedure used in research: starting from a sample of n comparable firms defined by an industry classification, one firm is excluded, while multiples for the remaining n − 1 firms are calculated and aggregated. The aggregate multiple is then applied to the value driver figure of the excluded firm, producing an estimate of its value. By comparing the estimate against the observable market value of equity of this firm (or, in the case of enterprise values, against the observable market value of equity plus net debt), the accuracy of the estimate can be observed. This holdout procedure is applied to all n firms in the peer group, providing an accuracy measure for all firm observations.
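The holdout procedure amounts to a leave-one-out loop over the peer group. The following minimal sketch assumes a single peer group and uses the median as aggregation measure; the function name `holdout_estimates` and all figures are hypothetical.

```python
import statistics


def holdout_estimates(prices, drivers, aggregate=statistics.median):
    """Leave-one-out value estimates for the firms of one peer group."""
    multiples = [p / g for p, g in zip(prices, drivers)]
    estimates = []
    for j in range(len(prices)):
        peers = multiples[:j] + multiples[j + 1:]        # exclude firm j
        estimates.append(aggregate(peers) * drivers[j])  # apply aggregate multiple to firm j
    return estimates


if __name__ == "__main__":
    # Hypothetical market values and value driver figures of four peers.
    prices = [100.0, 240.0, 90.0, 400.0]
    drivers = [10.0, 20.0, 15.0, 25.0]
    for p, v_hat in zip(prices, holdout_estimates(prices, drivers)):
        print(f"observed {p:7.1f}   estimated {v_hat:7.1f}")
```

Any other aggregation measure can be passed via the `aggregate` argument, so the same loop serves all combinations of aggregation and error measures.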

4.2 Error measures

When comparing the calculated value estimate \(\hat{V}\) against the observed market value P, four different error measures may serve as indicators for the accuracy of the estimate.

4.2.1 The percentage difference between estimated and market value (percentage error)

$$e_{\text{perc}} = \frac{{\hat{V} - P}}{P}$$
(6)

The percentage error is the scaled difference between the estimated and the market value. This error measure applies a different scale to positive and negative deviations: as negative deviations below −100 % are not possible, whereas positive ones above 100 % are, this error measure displays a systematic positive bias if it is combined with the mean aggregation of the peer group’s multiples. The positive skewness of the error measure itself makes it difficult to properly assess the accuracy of the estimate: the mean error is different from zero. Thus value estimates are systematically biased and dispersed at the same time, making it impossible to judge the accuracy by a measure of dispersion alone.

4.2.2 The logarithm of the ratio of the estimated value to the market value (log-scaled error)

$$e_{ \log } = \ln \left( {\frac{{\hat{V}}}{P}} \right)$$
(7)

Log-scaling the error by taking the log of the ratio between the estimated and the market value of the firm removes the skewness of the error measure by allowing deviations from −∞ to +∞. On the other hand, this measure puts different weights on absolute deviations of different size and thus should be interpreted carefully.

4.2.3 The absolute value of the percentage error (absolute percentage error)

$$e_{{{\text{abs,}}\,{\text{perc}}}} = \left| {\frac{{\hat{V} - P}}{P}} \right|$$
(8)

Taking the absolute value of the percentage error treats positive and negative values equally and prevents positive and negative deviations from netting out against each other. The benefit of this measure is that it yields a one-dimensional figure for judging the accuracy of the valuation method. On the other hand, as it still relies on the percentage error, this measure remains exposed to the bias stemming from that error measure. Additionally, the resulting figure cannot easily be interpreted as the average deviation from the observable market value produced by a certain multiple-based valuation.

4.2.4 The absolute value of the log-scaled error (absolute log-scaled error)

Using the absolute value of the log-scaled error avoids the upwards bias of the percentage error measure and the netting effect of positive and negative deviations.

$$e_{{{\text{abs,}}\,{ \log }}} = \left| {{ \ln }\left( {\frac{{\hat{V}}}{P}} \right)} \right|$$
(9)
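The different scaling of the four error measures in Eqs. (6) to (9) is easiest to see with a small numerical sketch. The function names and the figures are hypothetical: a firm with a market value of 100 is once overvalued by a factor of two and once undervalued by the same factor.

```python
import math


def percentage_error(v_hat, p):
    """Eq. (6): scaled difference between estimate and market value."""
    return (v_hat - p) / p


def log_error(v_hat, p):
    """Eq. (7): log of the ratio of estimate to market value."""
    return math.log(v_hat / p)


def abs_percentage_error(v_hat, p):
    """Eq. (8): absolute value of the percentage error."""
    return abs(percentage_error(v_hat, p))


def abs_log_error(v_hat, p):
    """Eq. (9): absolute value of the log-scaled error."""
    return abs(log_error(v_hat, p))


if __name__ == "__main__":
    p = 100.0
    for v_hat in (200.0, 50.0):  # estimate twice vs. half the market value
        print(f"V_hat={v_hat:6.1f}  perc={percentage_error(v_hat, p):+.3f}  "
              f"log={log_error(v_hat, p):+.3f}  "
              f"abs_perc={abs_percentage_error(v_hat, p):.3f}  "
              f"abs_log={abs_log_error(v_hat, p):.3f}")
```

The percentage errors of the two cases (+100 % and −50 %) do not offset each other, whereas the log-scaled errors are of equal size and opposite sign; this asymmetry is the source of the upward bias of percentage errors discussed above.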

In empirical studies researchers have used a variety of error measures: The majority of studies relies on percentage errors as accuracy measure.Footnote 7 Alford (1992) and Schreiner and Spremann (2007) use the absolute percentage error, whereas Herrmann and Richter (2003) rely on the absolute log-scaled error as an accuracy measure. Recognizing the ambiguity of a biased and at the same time dispersed error measure Henschke and Homburg (2009) use two different measures: “bias” is captured by the mean percentage error whereas “accuracy” is measured by the absolute percentage error. The authors propose this measure because positive and negative signed estimation errors do not cancel out when being aggregated over all observations.

Dittmann and Maug (2008) have pointed to the interaction between the aggregation measure and the error measure reflecting the accuracy of the value estimate: as some error measures are themselves skewed to the right,Footnote 8 their combination with right-skewed aggregated multiples amplifies the upwards bias, thus clouding the accuracy judgement of the analysis. Using percentage errors in general imposes an upwards bias on the results of all aggregation measures. The combination of geometric mean aggregation and log-scaled errors yields unbiased value estimates and thus a mean (log) error of zero. The same result holds for the combination of median aggregation and log errors if the number of observations is sufficiently high (see Dittmann and Maug 2008, p. 14). Another way to avoid the ambiguity of trading off bias against dispersion is to use absolute (unsigned) errors; on the other hand, this procedure produces results that cannot easily be interpreted as some (average) deviation from observed prices.
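Why geometric mean aggregation and log-scaled errors fit together can be illustrated with a short in-sample derivation for the holdout procedure of Sect. 4.1. This is a sketch under the simplifying assumption that all n firms belong to one peer group, not the argument of Dittmann and Maug (2008); \(m_{i} = P_{i}/\gamma_{i}\) is shorthand introduced here and \(\bar{X}^{G}_{ - j}\) denotes the geometric mean of all peers except firm j. The log-scaled error of firm j is

$$e_{\log ,j} = \ln \left( {\frac{{\gamma_{j} \bar{X}^{G}_{ - j} }}{{P_{j} }}} \right) = \frac{1}{n - 1}\sum\limits_{i \ne j} \ln m_{i} - \ln m_{j}$$

Summing over all firms, each \(\ln m_{i}\) appears in exactly n − 1 of the inner sums, so that

$$\sum\limits_{j = 1}^{n} e_{\log ,j} = \frac{1}{n - 1}\sum\limits_{j = 1}^{n} \sum\limits_{i \ne j} \ln m_{i} - \sum\limits_{j = 1}^{n} \ln m_{j} = \sum\limits_{i = 1}^{n} \ln m_{i} - \sum\limits_{j = 1}^{n} \ln m_{j} = 0$$

Within such a peer group the mean log-scaled error is therefore exactly zero. Since the arithmetic mean of positive multiples is never below their geometric mean, replacing \(\bar{X}^{G}_{ - j}\) by the arithmetic mean can only raise each individual log error and hence yields a non-negative mean log error.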

The above measures all have their benefits and shortcomings:

  • Percentage errors display systematically upwards biased valuation errors, but can easily be interpreted as the average deviation between estimated value and market value.

  • Log-scaled errors are not systematically biased, but (like percentage errors) net positive against negative deviations. Under certain aggregation methods mean log-scaled errors are equal to zero for all multiples, thus making it impossible to rank different multiples by their mean errors.

  • Absolute errors avoid the netting effect of positive and negative deviations, but cannot easily be interpreted; absolute percentage errors additionally carry the upwards bias of percentage errors, whereas absolute log-scaled errors do not suffer from this shortcoming.

For our study we rely on the following error measures:

  • Investigating the performance of the different aggregation methods we rely on log-scaled errors, as we are also interested in potential systematic over- and/or undervaluation of certain aggregation methods.

  • When analyzing consistency requirements within a certain value driver figure and when comparing the different consistent multiple definitions we concentrate on the absolute log-scaled error. By doing so, we avoid the upwards bias of the percentage error and the “netting” effect of the signed error measures.

Finally, we measure performance within the same multiplier category, i.e. by comparing estimated enterprise values against observed enterprise values and estimated equity values against observed equity values. Measuring enterprise value-based multiplier performance based on equity values would expose our findings to a potential bias by different leverages.

5 Descriptives

5.1 Sample

The sample collected for this study is based on German headquartered companies with primary listing in Germany between 1998 and 2011.Footnote 9 Firms in the financial sector were not considered in the sampleFootnote 10 resulting in an initial sample of 654 firms or 9156 firm-year observations. These data, extracted from DATASTREAM, include historical accounting figures and market values for each firm.Footnote 11 The criteria for selecting the firm-year observations for the analysis are: (1) firm market values and at least one earnings figure are available for each firm-year observation; (2) numerator and denominator of multiples are strictly positive; and (3) peer groups based on the three-digit Industry Classification Benchmark (ICB) categorization have a minimum of seven observations. The ICB classification, developed by FTSE International Limited and Dow Jones & Company, has four levels that increase in fineness: Industry, Supersector, Sector, and Subsector. A three-digit ICB classification, or “sector level”, allows us to analyze relatively homogeneous peer groups, while keeping a sensible number of observations per peer group.Footnote 12

This industry classification was chosen over the widely used Standard Industrial Classification (SIC) because the number of firm-year observations with available ICB codes was considerably higher than the number with SIC codes, given our relatively small sample.
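The three selection criteria can be sketched as a simple filter. The sketch below is an illustration only: it checks the criteria for a single multiple, the row keys ("icb3", "year") and the function name are hypothetical, and it assumes that a peer group consists of all firms of one three-digit ICB sector in one year, which the text does not state explicitly.

```python
from collections import Counter

MIN_PEER_OBS = 7  # minimum number of observations per peer group


def select_observations(rows, value_key, driver_key):
    """Return the firm-year rows that pass the three selection criteria.

    Each row is a dict with the hypothetical keys "icb3" (three-digit
    ICB code) and "year", plus the value and value driver figures.
    """
    # Criteria (1) and (2): both figures are available and strictly positive.
    valid = [
        r for r in rows
        if r.get(value_key) is not None
        and r.get(driver_key) is not None
        and r[value_key] > 0
        and r[driver_key] > 0
    ]
    # Criterion (3): keep only peer groups with at least MIN_PEER_OBS rows.
    # Assumption: a peer group is one ICB sector in one year.
    group_size = Counter((r["icb3"], r["year"]) for r in valid)
    return [
        r for r in valid
        if group_size[(r["icb3"], r["year"])] >= MIN_PEER_OBS
    ]
```

Applying the positivity filter before counting peers means that a sector-year can drop out of the sample entirely once its loss-making firms are removed; whether the original procedure counts peers before or after that step is not specified here.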

After subtracting the observations that do not meet the abovementioned criteria, the sample was reduced to 6030 firm-year observations. This resulting sample includes firm-year observations from 23 sectors (as defined by three-digit ICB codes), ranging from software to utility companies. Table 3 summarizes the number of resulting observations per sector, and the relative size of each sector to the total sample:

Table 3 Sample, observations per sector

Moreover, we collected balance sheet items for each firm-year, including short- and long-term debt, accounts payable, pension and healthcare reserves, minority interests, investments in associates and joint ventures, finance leases, as well as cash and cash equivalents. Values that were not available (N.A.) for these balance sheet items were assumed to be zero.

Table 4 summarizes the descriptive statistics of the sample with respect to these items. The median firm has annual revenues of €87.9 million, net income of €1.5 million, a market capitalization of €53.7 million and a level of total debt of €24.3 million.

Table 4 Sample characteristics and descriptive statistics

It is worth mentioning that the financial data in this sample are highly positively skewed, as observed in the significantly higher means in comparison with the medians of all financial figures.

5.2 Multiples

In order to analyze the impact of consistency upon valuation accuracy we calendarized the financials to the financial year end of 31st December,Footnote 13 set the valuation date 88 days after the financial year endFootnote 14 and employed different enterprise value definitions and different multiples. The different versions of enterprise value follow the discussion in chapter 2; we added different interest-bearing and non-interest-bearing financial positions stepwise to the market capitalization. Table 5 gives an overview of all enterprise value definitions employed in our analysis:

Table 5 Enterprise value definitions employed

Finally, combining the different enterprise value definitions and the market capitalization with the different value drivers (sales, EBITDA, EBIT and net income), we calculate 132 different multiples, of which only four meet our consistency requirement. Table 6 shows the multiple definitions meeting our consistency requirements:

Table 6 Consistent multiple definitions
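The count of 132 multiples can be reproduced with a short sketch. It rests on an assumption that is consistent with the reported figures but not spelled out in this section: the 32 enterprise value definitions arise as all include/exclude combinations of five optional positions added to a benchmark of market capitalization plus interest-bearing debt less cash. All identifiers below are illustrative, and the sign convention for associates and joint ventures (deducted as a financial asset) is likewise an assumption.

```python
from itertools import combinations

# Optional positions that may be added to the benchmark net debt
# (assumed set of adjustments; labels are illustrative).
ADJUSTMENTS = (
    "accounts_payable",
    "pension_reserves",
    "finance_leases",
    "minority_interest",
    "associates_and_jvs",
)
VALUE_DRIVERS = ("sales", "EBITDA", "EBIT", "net_income")


def enterprise_value_definitions():
    """All include/exclude combinations of the optional positions."""
    return [
        frozenset(c)
        for k in range(len(ADJUSTMENTS) + 1)
        for c in combinations(ADJUSTMENTS, k)
    ]


def enterprise_value(firm, included):
    """Benchmark EV (market cap + debt - cash) plus the included positions.

    Missing or None items count as zero. Investments in associates and
    joint ventures are treated as financial assets and subtracted
    (assumed sign convention).
    """
    ev = firm["market_cap"] + firm["debt"] - firm["cash"]
    for item in included:
        sign = -1.0 if item == "associates_and_jvs" else 1.0
        ev += sign * (firm.get(item) or 0.0)
    return ev


ev_defs = enterprise_value_definitions()   # 2**5 enterprise value definitions
value_defs = ["market_cap"] + ev_defs      # plus the equity value
multiples = [(v, d) for v in value_defs for d in VALUE_DRIVERS]

print(len(ev_defs), len(value_defs), len(multiples))
```

Under this assumption there are 32 enterprise value definitions and 33 value definitions in total, which combined with four value drivers give the 132 multiples analyzed.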

The remaining 128 inconsistent multiple definitions were established either by combining equity-based value driver figures with entity-based enterprise value definitions or vice versa, or by applying inappropriate enterprise value definitions to entity-based value drivers. Table 7 below displays the descriptives of the four consistently defined multiples:

Table 7 Descriptives of consistent multiples

The median equity value multiple is 17.2× for the net income multiple. The median enterprise value multiple is 0.8× for revenues multiples, 6.8× for EBITDA multiples, and 10.7× for EBIT multiples.

Note that a heavily positively skewed distribution is also observed in these multiples, as in the case of the underlying financial figures. In all cases the mean values of the multiples exceed the third quartile value.

6 Results

6.1 The setting

Our analysis works on three different layers:

  a. Aggregation: We analyze the impact of different aggregation methods on valuation accuracy.

  b. Consistency: We are interested in potential differences in accuracy between consistent and inconsistent multiple definitions.

  c. Value drivers: Within the group of consistently defined multiples we analyze the performance of different value drivers such as sales, EBITDA, EBIT and net income.

6.2 Aggregation methods

In the first step we apply our hold-out procedure to analyze the four different aggregation methods (arithmetic mean, harmonic mean, median and geometric mean) with respect to their valuation accuracy. For this procedure we rely on the consistently defined multiples from Sect. 5.2. As we are interested in the potential systematic over- and/or undervaluation caused by the aggregation methods, we use the log-scaled error as an accuracy measure, following our discussion in chapter 4. In the case of negative values for EBITDA, EBIT and net income we had to remove the firm-year observation from our analysis in order to get meaningful results. Beyond this adjustment we decided not to remove any outliers or to winsorize our sample, as we aimed to get a clear picture of the magnitude of the potential bias. Table 8 displays the results for the four different aggregation methods.

Table 8 Log-scaled errors for consistent multiple definitions based on different aggregation methods

We find the results to support our hypotheses from Sect. 3:

  • For all multiples arithmetic mean displays the lowest valuation accuracy, i.e. the highest mean and median log-scaled errors and the highest volatilities. Additionally this aggregation yields a significant over-estimation of the corporate values; mean errors are between plus 59 and 97 % and 74 % of all observations are overvalued.

  • Results for harmonic mean also support the theoretical findings by Dittmann and Maug (2008) and the hypothesis from Sect. 3: We find a significant and systematic undervaluation, with means between minus 45 % and minus 57 %. 67 % of all observations are undervalued. Valuation accuracy is higher compared to the arithmetic aggregation method, but lower than the accuracy reached in median and geometric mean aggregation methods.

  • Median and geometric mean aggregation display the highest valuation accuracy, i.e. the lowest mean and median log-scaled errors and the lowest volatilities. Mean log-scaled errors are equal to zero for all multiples when the geometric mean is used as aggregation method; as we observe median errors to be greater than zero for all multiple definitions, log-scaled errors are negatively skewed here. Median aggregation yields the second lowest mean valuation errors and the lowest median errors over all four aggregation methods. Looking at the volatilities of the log-scaled errors, we find that both aggregation methods yield similar valuation accuracy.

Thus our results support earlier findings by Dittmann and Maug (2008) and others: In order to avoid systematic over- and/or undervaluation financial analysts should rely on median or geometric mean as a method for aggregating peer group multipliers. Both methods produce log-scaled errors displaying a significantly lower mean and median error and a lower standard deviation than arithmetic mean and harmonic mean aggregation. We find these results to be supported by our findings based on absolute log-scaled errors.
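The direction of these biases can be reproduced with a minimal sketch of a hold-out procedure for a single peer group. It assumes a leave-one-out design in which each firm is valued with the aggregated multiple of all other peers; the peer group values are made up and deliberately right-skewed, and the function name is illustrative.

```python
import math
from statistics import geometric_mean, harmonic_mean, mean, median

AGGREGATORS = {
    "arithmetic mean": mean,
    "harmonic mean": harmonic_mean,
    "median": median,
    "geometric mean": geometric_mean,
}


def holdout_log_errors(peer_multiples, aggregate):
    """Leave-one-out log-scaled errors for one peer group.

    Estimated value = aggregated peer multiple * value driver and
    observed value = own multiple * value driver, so the value driver
    cancels and the error is ln(aggregated multiple / own multiple).
    """
    errors = []
    for i, own in enumerate(peer_multiples):
        others = peer_multiples[:i] + peer_multiples[i + 1:]
        errors.append(math.log(aggregate(others) / own))
    return errors


# Hypothetical, right-skewed peer group of seven EV/EBITDA multiples.
peer_group = [4.0, 5.5, 6.0, 6.8, 7.5, 9.0, 25.0]

mean_log_error = {
    name: mean(holdout_log_errors(peer_group, agg))
    for name, agg in AGGREGATORS.items()
}

for name, err in mean_log_error.items():
    print(f"{name:16s} {err:+.3f}")
```

Because the arithmetic mean of positive numbers is never below their geometric mean, which in turn is never below their harmonic mean, the mean log-scaled error is positive for arithmetic and negative for harmonic aggregation in any peer group with dispersed multiples. Under the leave-one-out design the mean log-scaled error of geometric mean aggregation is zero within each peer group by construction.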

6.3 Consistency

For the purpose of analyzing the consistency of multiple definitions and their impact on the accuracy, we ranked the performance of the different multiple definitions. Following our discussion in chapter 4 and the results from above, we apply the absolute log-scaled error as error measure in combination with the median as aggregation method and rank order the different multiple definitions by their median error. For each multiple category (sales, EBITDA, EBIT and net income) we analyze 33 different multiple definitions. Table 9 shows the most important findings:

Table 9 Ranking of absolute log-scaled errors

We start by looking at the most obvious consistency requirement, the appropriate match of entity and equity figures in the multiple definitions. The line “simple mismatch” in Table 9 displays the results of multiple definitions mismatching market capitalization (equity) with operating (entity) profit and loss figures. All “mismatched” definitions have lower accuracy ranks, higher median errors and higher error dispersions than the corresponding consistent enterprise value based multipliers: Market capitalization/sales multiples, market capitalization/EBITDA multiples, market capitalization/EBIT multiples are all ranked with worst ranking 33, whereas in the case of market capitalization/net income this consistent multiple is ranked best with one.

With respect to higher level consistency, we find for sales based enterprise values and for equity values our consistently defined multiples to have the highest valuation accuracy: EV04 and market capitalization/net income show the lowest median error over all 33 different multiple definitions.

However, we get a different picture for EBITDA- and EBIT- based enterprise value definitions. Here, our consistently defined multipliers do not show superior valuation accuracy as Table 9 displays: consistent multiples based on EV03 rank at 19 for EBITDA and EBIT. The highest accuracy is achieved for both cases by using an inconsistent multiple definition based on EV02.Footnote 15 Compared to the consistent definitions of EV03, EV02 has two differences:

  • It (inconsistently) does not include associates and JVs in net debt.

  • It includes accounts payable in net debt, which is inconsistent for EBITDA- and EBIT-based multiples.

For a deeper analysis of valuation accuracy across the different enterprise value definitions we follow our discussion in Sect. 2.2 and analyze the impact of the different balance sheet items on accuracy. In order to analyze the 32 different enterprise values we use a pairwise comparison, matching a particular enterprise value definition against the corresponding definition including/excluding the position under consideration. Counting the number of superior ranks over all pairs of definitions allows a judgment on the superiority of the proposed treatment.Footnote 16 As the benchmark multiple definition, which only includes interest-bearing debt and cash and cash equivalents, is part of this pairwise analysis, the results presented below contain the impact of the respective adjustment on the unadjusted multiple definition. In order to reduce complexity, we restrict our attention in this paper to the combination of the median aggregation method with the mean absolute log error. The analyses for the other combinations are available upon request from the authors.
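The pairwise comparison can be sketched as follows. The sketch assumes that each enterprise value definition is identified by the set of optional positions it includes and that a lower error statistic corresponds to a superior rank; the item labels, the function name and all error figures are made up for illustration and do not reproduce the results in the tables.

```python
from itertools import combinations

ITEMS = (
    "accounts_payable",
    "pension_reserves",
    "finance_leases",
    "minority_interest",
    "associates_and_jvs",
)


def pairwise_improvements(errors, item):
    """Count the pairs in which including `item` lowers the error.

    `errors` maps each definition (a frozenset of included positions)
    to its error statistic. Every definition without `item` is matched
    with the otherwise identical definition that includes it.
    Returns (pairs where inclusion is better, total number of pairs).
    """
    better = 0
    pairs = 0
    for included, err_without in errors.items():
        if item in included:
            continue
        with_item = included | {item}
        if with_item not in errors:
            continue
        pairs += 1
        if errors[with_item] < err_without:
            better += 1
    return better, pairs


# Made-up error statistics: each position shifts the error by a fixed amount.
EFFECT = {
    "accounts_payable": -0.04,
    "pension_reserves": -0.03,
    "finance_leases": -0.01,
    "minority_interest": -0.02,
    "associates_and_jvs": 0.02,
}
errors = {
    frozenset(c): 0.60 + sum(EFFECT[i] for i in c)
    for k in range(len(ITEMS) + 1)
    for c in combinations(ITEMS, k)
}

for item in ITEMS:
    print(item, pairwise_improvements(errors, item))
```

With five optional positions each position is tested on 16 pairs of definitions, and the benchmark definition without any adjustment appears in one pair for every position.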

6.3.1 Accounts payable

Accounts payable are required to be included in net debt in order to define consistent EV/sales multiples, but to be excluded from net debt for EV/EBITDA and EV/EBIT multiples. The analysis summarized in Table 10, however, shows mixed results with respect to valuation accuracy: including payables increases accuracy for all enterprise value multiple definitions. In the case of EV/EBITDA and EV/EBIT multiples, inconsistent enterprise value definitions thus yield a higher valuation performance than consistent ones. These results raise the question of whether capital markets are efficiently pricing the stocks of firms holding significant accounts payable. We additionally analyzed this question by exploring the impact of the (scaled) size of payables upon the (signed) log-scaled valuation error of our consistent multiple definition; using a simple regression model, we did not find evidence for a significant relationship.Footnote 17

Table 10 Impact of the consideration of accounts payable on the improvement of the absolute log-scaled error ranking
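A regression of the kind mentioned for payables can be sketched as a univariate ordinary least squares fit of the signed log-scaled error on the scaled size of the position. This is an illustration under stated assumptions: the scaling variable and the exact specification are not given in this section, the helper name is hypothetical and the data are made up.

```python
import math


def simple_ols(x, y):
    """Univariate OLS of y on x with an intercept.

    Returns (slope, intercept, t-statistic of the slope).
    """
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    # Residual variance with n - 2 degrees of freedom.
    residuals = [yi - (intercept + slope * xi) for xi, yi in zip(x, y)]
    s2 = sum(r * r for r in residuals) / (n - 2)
    t_stat = slope / math.sqrt(s2 / sxx)
    return slope, intercept, t_stat


# Made-up data: scaled payables and signed log-scaled valuation errors.
scaled_payables = [0.02, 0.05, 0.08, 0.10, 0.15, 0.20, 0.25, 0.30]
log_errors = [0.10, -0.20, 0.15, -0.05, 0.05, -0.15, 0.20, -0.10]

slope, intercept, t_stat = simple_ols(scaled_payables, log_errors)
print(f"slope={slope:.3f}, intercept={intercept:.3f}, t={t_stat:.2f}")
```

A t-statistic of small absolute size would indicate that the size of the position does not explain the signed valuation error, which is the kind of outcome reported in the text.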

6.3.2 Pension reserves

Pension reserves are required to be included in net debt in order to define consistent enterprise value multiples. The analysis summarized in Table 11 confirms that the valuation accuracy is in all cases higher if pension reserves are included.

Table 11 Impact of the consideration of pension reserves on the improvement of the absolute log-scaled error ranking

6.3.3 Finance leases

Finance leases are also required to be included in net debt in order to define consistent enterprise value multiples. The analysis summarized in Table 12 also confirms that the valuation accuracy is in all cases higher if finance leases are included.

Table 12 Impact of the consideration of financial leases on the improvement of the absolute log-scaled error ranking

6.3.4 Minority interest

Results for minority interest are summarized in Table 13. It shows that the inclusion of minority interest into net debt improves multiple performance in all cases.

Table 13 Impact of the consideration of minority interest on the improvement of the absolute log-scaled error ranking

6.3.5 Investment in associates and joint ventures

Results of our pairwise comparison with respect to associates and joint ventures are summarized in Table 14. They are not in line with our hypotheses: according to our consistency requirements, investments in associates and joint ventures are financial assets and are to be included in net debt when deriving the firm’s enterprise value. However, the pairwise comparisons show that higher valuation accuracy is achieved if enterprise value is not adjusted for investments in associates and joint ventures. As for payables, we again ran a regression analysis using the (signed) log-scaled valuation error of our consistent multiple definition as the dependent variable and the (scaled) size of associates and JVs as the independent variable; we did not find evidence for a positive relationship between the size of this position and the log-scaled valuation error.

Table 14 Impact of the consideration of investment in associates and JVs on the improvement of the absolute log-scaled error ranking

6.4 Value drivers

Beyond the general consistency requirements we are interested in the accuracy of consistently defined multiple definitions resting on different value driver figures. We compare sales-, EBITDA-, EBIT- and net income-based multiple definitions based on the valuation results measured by the absolute log-scaled errors combined with the median and geometric mean aggregation methods. As EBITDA, EBIT and net income, unlike sales, can take negative values, we additionally have to eliminate observations with negative figures from our analysis for these multiples. By doing so, we lose 1217, 1813 and 2130 observations from our sample, respectively. Estimations are calculated based on the remaining observations, and accuracy is also measured based on the reduced sample. Thus the interpretation of our results for EBITDA, EBIT and net income is limited to the case of positive realizations of these earnings figures. Table 15 displays our results:

Table 15 Absolute log-scaled errors for consistently defined multiple definitions based on different value drivers

We find that EV/EBITDA and EV/EBIT multiples display the smallest mean and median absolute log-scaled errors as well as the smallest standard deviations. EV/EBITDA multiples in most cases perform slightly better than EV/EBIT multiples for mean and median errors and have slightly lower standard deviations. The EV/EBITDA and EV/EBIT multiples are followed by the market cap/net income multiple in terms of valuation accuracy. The consistently defined EV/sales multiples have the lowest valuation accuracy, i.e. the highest mean and median errors as well as the highest standard deviations.

7 Conclusions

This paper analyzes the accuracy of multiple-based valuation approaches with respect to peer group aggregation and consistency requirements based on a large German sample. With respect to peer group aggregation we find the arithmetic mean to systematically overestimate and the harmonic mean to systematically underestimate corporate values. Thus, without eliminating outliers by hand, these aggregation methods will yield biased results when applied in practice. In order to avoid systematic over- and/or undervaluation financial analysts should rely on the geometric mean and the median as aggregation methods. The geometric mean as an aggregation method produces unbiased value estimates with a mean log-scaled error of zero. For a large number of observations this will also hold if the median is used as aggregation method.

The center of this study is the analysis of consistency requirements; we use absolute log-scaled errors to measure the impact on valuation accuracy. First, we find that consistent multiple definitions display in most cases higher valuation accuracy than inconsistent ones. The first-layer consistency requirement of properly matching entity figures and equity figures increases the valuation accuracy in all cases. In a deeper analysis with respect to consistent enterprise value definitions, consistent definitions still outperform inconsistent ones in the majority of cases. We find that consistent treatment of finance leases, pensions and minority interest increases the valuation accuracy. However, for accounts payable we find mixed evidence: the inclusion results in a higher valuation accuracy for all enterprise value multiples, whereas under our hypothesis it should only do so for the EV/sales multiple. For investments in associates and joint ventures, an inconsistent treatment also results in a higher valuation accuracy.

Finally this study is interested in the valuation accuracy within the class of consistently defined multiples. Here we observe that EBITDA multiples have the highest valuation performance followed by EBIT, net income and sales based multiples.