1 Introduction

Assessing mutual fund performance is the standard way to choose where to invest, and many papers report that the historical performance record is the main criterion for fund selection (e.g. Bergstresser and Poterba 2002; Deaves 2004; Busse and Irvine 2006). The question of whether mutual funds exhibit performance persistence has received considerable attention in the academic finance literature during the last two decades (e.g., Hendricks et al. 1993; Carhart 1997; Bollen and Busse 2005; Vidal-García and Vidal 2016). These studies examine whether it is possible to forecast future returns based on past performance. This is an important question for the mutual fund industry: if past performance has no forecasting power for future performance, investors would not benefit from data collection. Nevertheless, the mutual fund industry is growing at a fast pace and expanding to more markets. Leading fund data providers such as Lipper and Morningstar publish mutual fund rankings that receive coverage from the press around the world.

Many studies have examined mutual funds efficiency. The most common measures of portfolio performance are Jensen’s alpha, the Sharpe ratio and the Treynor ratio. However, there are several problems when using these measures that have been thoroughly explained in the literature. Treynor (1965), Sharpe (1966), Jensen (1968, 1969) create portfolio evaluation models that are derived from the capital asset pricing model (CAPM). Roll (1978), Reilly and Akhtar (1995), and Grinblatt and Titman (1994) explain that capital asset pricing models could be sensitive to the selection of the benchmark when examining fund efficiency. They point out that these performance measures are statistically biased against market timing ability. Lehmann and Modest (1987) explain that mutual fund performance evaluation can vary with the selection of benchmarks. They show statistically significant abnormal performance using several benchmarks. Matallín-Saez (2007) examines the difference between factors and benchmarks for portfolio performance evaluation. The author shows that there are similar biases regardless of using either factors or benchmarks and that the selection of an appropriate benchmark is more important than the multifactor model employed.

A number of papers examine mutual fund performance and its relation to fees. Mutual fund fees cover the service offered by funds. They should be influenced by fund performance, as the principal service offered by mutual funds is portfolio management. However, the mutual fund literature shows that high fees are the main reason for equity fund underperformance, as fund performance improves when before-fee returns are used (see Grinblatt and Titman 1989; Malkiel 1995; Gruber 1996; Droms and Walker 1996). Carhart (1997), for instance, shows that net returns are negatively correlated with fees, and that these fees are much larger for actively managed mutual funds than for passive ones. Gil-Bazo and Ruiz-Verdú (2009, hereafter GBRV) find that funds with worse before-fee performance charge higher fees. The authors focus on the link between before-fee performance and fees and assess whether variations in fund fees explain the variations in performance. They find a negative relation between fees and before-fee performance for US equity mutual funds. Their results confirm earlier findings of Gruber (1996), who showed that high fees are related to fund underperformance rather than to the ability to outperform the market. Supporting this evidence, Berkowitz and Kotowitz (2002) find a negative relation between fees and performance for low-quality managers. Vidal et al. (2015) show robust evidence of forecasting power for mutual fund fees. They explain that funds showing either a positive or a negative relation with their fees present significant evidence of negative return predictability when fees are used as the forecasting variable. Vidal-García and Vidal (2015) find that only 4% of US funds reduce their fees in case of poor performance. The authors define underperformance as negative alpha estimates in three consecutive years. Thus, the percentage of fees charged by mutual funds is a relevant aspect in performance valuation (Ramos 2009; Khorana et al. 2009).

Several additional studies examine the relation between performance-sensitive investors and mutual fund fees. Christoffersen and Musto (2002) explain that fees are determined by the elasticity of demand and that funds with less elastic demand charge higher fees. They argue that performance-sensitive investors sell their shares in a fund after bad performance, so funds with worse past performance attract fewer investors. Similarly, Gil-Bazo and Ruiz-Verdú (2008) explain that top-performing funds compete for sophisticated (performance-sensitive) investors, which reduces their fees. Underperforming funds, however, leave that portion of the market and instead attract unsophisticated investors who are willing to pay higher fees.

An interesting debate in the mutual fund industry concerns the differential effect of active and passive management on fund performance. Indexed equity funds have lower fees than active funds. Basak and Pavlova (2013) explain that active fund managers are concerned about their performance relative to benchmark indices and therefore increase their holdings of stocks within their benchmark indices to reduce the risk of underperformance. However, Cremers et al. (2016) explain that investors in active funds benefit from the presence of passive funds. They point out that active funds charge lower fees and have a higher active share in markets with more explicit indexing. In comparison, active funds increase their fees in markets with more closet indexing. Thus, they conclude that the presence of explicit indexing increases the level of competition in a fund market, whereas closet indexing indicates the opposite.

Management fees are the main component of mutual fund expenses, representing around 90% of total expenses. These fees are charged to investors for portfolio management. Management fees usually represent a fixed percentage of total assets under management. In this sense, managers are compensated for asset growth rather than for performance. Fund management companies do not seem to be willing to establish performance-based fee funds, as these fee structures are only suitable for qualified investors. Furthermore, the fund industry suffers from a significant lack of competition and is mainly dominated by banks, which allows fund companies to charge fees not linked to performance. Following economic efficiency theory, fund managers showing performance ability should be remunerated for the costs involved in information acquisition and the costs of trading (see Grossman and Stiglitz 1980). Otherwise there would be no incentive to become informed. Thus, funds offering better services to investors should charge higher fees to cover their information-gathering role, leading to a positive relationship between fund expenses and risk-adjusted fund returns before expenses (see Ippolito 1989; Díaz-Mendoza et al. 2014). The results of GBRV (2009) contrast with this argument and indicate that the negative link between fund expenses and fund returns does not exist for funds with better governance, which charge fees in relation to performance. In line with the result of GBRV (2009), Berkowitz and Kotowitz (2002) show a positive relation between fees and performance for the best managers.

In this study, we examine the relation between expenses and performance using parametric and non-parametric methodologies for all countries around the world with available data. Our paper aims to provide insightful results on the global efficiency of mutual fund performance, which has not been studied so far. We evaluate the market efficiency of the mutual fund industry around the world in performance using a sample of domestic mutual funds for 35 countries for the period 1990–2015 that includes daily returns of 16,085 actively managed equity mutual funds. Our objective is to test whether mutual funds are efficient, employing different measurement methodologies. We use the one-factor CAPM, the Fama and French (1993) three factor model and the Carhart’s (1997) four-factor model. We also use the data envelopment analysis (DEA) non-parametric methodology to examine the efficiency of mutual funds.

The paper provides new interesting results. Using the four-factor Carhart model, we show that premiums are positive, suggesting that more risky, small, value-focused, and previous-winner stocks achieve higher returns. We also examine fund performance by presenting the percentage of positive and negative values of performance measures and find that less than half of the funds exhibit positive values of performance. The estimation of the performance-expenses relation shows that the coefficients of expenses are always significantly negative for all countries and all risk-adjusted performance measures. Finally, we present the percentage of efficient funds for every variation of the DEA model using gross returns, and alphas from CAPM, Fama–French and Carhart models as output measures. The results show that most of the funds are efficient.

We make several contributions to the literature. First, we find a statistically significant negative relationship between expenses and risk-adjusted performance across countries, indicating that higher expenses are likely to lead to bad performance. Second, the use of DEA shows, in contrast to our previous result, strong evidence that equity mutual funds around the world are almost mean–variance efficient. Thus, DEA confirms the mean–variance efficiency theory. Third, we depict the areas of operational inefficiency whose correction would improve the performance of mutual funds. Finally, our research allows us to test agency theory and the potential agency conflict between managers and investors of mutual funds. As Jensen and Meckling (1976) explain, assuming that both mutual fund managers (the agents) and investors (the principals) focus on maximizing their individual utility, it is possible that fund managers will not take decisions in the best interest of fund investors. Our results provide evidence on whether the manager’s and the investor’s wealth increase at the same time or whether one increases at the expense of the other.

The rest of the paper is organized as follows. Section 2 describes the data, details the construction of the variables used in the analysis and presents descriptive statistics for the sample. Section 3 presents the parametric and non-parametric methodologies used to estimate the results. Section 4 provides the empirical results. Section 5 concludes.

2 Data and summary statistics

2.1 Data

Our sample of mutual funds includes daily returns of 16,085 actively managed equity mutual funds. The funds are domiciled in 35 countries around the world from Asia-Pacific (Australia, China, Hong Kong, India, Indonesia, Japan, Malaysia, New Zealand, Singapore, South Korea, Taiwan, and Thailand), Europe (Austria, Belgium, Denmark, Finland, France, Germany, Italy, Netherlands, Norway, Poland, Portugal, Spain, Sweden, Switzerland, United Kingdom, Ireland, and Luxembourg), North America (Canada and United States), and the rest of the world (Brazil, Chile, Israel, and South Africa). All returns are expressed in local currencies and include dividends. We only include the primary share, as some funds present multiple share classes and might have multiple observations. Fund returns incorporate management and distribution fees, but not sales loads (fee paid when shares are purchased or sold). Our time period is a 26-year interval from January 1st, 1990 to December 31st, 2015. Our final sample of global funds includes over 90% of the total net assets of equity funds around the world as of December 2015 (see, Investment Company Institute (ICI) 2015 aggregate statistics). We download data from Morningstar Direct database. Morningstar is a leading provider of mutual fund data, which includes a global coverage for a large number of fund variables. We consider time-series observations on total net assets, turnover, funds’ fees, loads, and fund age.

Our sample includes the most important countries in terms of world market capitalization. It is similar to other papers using global mutual funds data (see, for instance, Ferreira et al. 2012). We eliminate some types of funds from our database, namely, index funds, sector funds, bond and money market funds, funds investing internationally, and funds that invest in financial instruments such as convertible debt. We also apply some filters to the fund return data. First, we limit our sample to open-end domestic equity funds to consider only funds that invest in the same country. Second, we only include equity funds with 24 months of returns, as a sufficient return data period is necessary for multifactor regressions.

To the best of our knowledge, our sample is the largest sample currently available for daily mutual fund returns. We incorporate funds that come into existence at any point during the sample period, so that the data are not restricted to funds existing at the start of the period and the extent of selection bias is reduced. To address the problem of survivorship bias, which results from including only surviving funds in a sample (see Elton et al. 1996a, b; Carhart 1997), we keep all funds in our sample until they disappear.

Table 1 Summary statistics

2.2 Variable construction

In this section, we describe the methodology employed to design the regression models for each country. We create a national version of the multifactor models. For U.S. funds, we download the factors from the Fama and French (1993) website (Footnote 1). For the rest of the countries, we create daily factors, implementing the methodology of Fama and French (1993). For this purpose, we use all stocks from the Worldscope database (Thomson Financial Company) available in each country. This database covers more than 98% of total market capitalization at the country level. Following Fama and French, we create six value-weighted portfolios formed on size and book-to-market. The market excess return is estimated as the value-weighted return of all stocks in the Worldscope database per country minus the Treasury bill rate for that country. We obtain a daily Treasury bill rate by dividing the 1-month Treasury bill return by the number of days in the month. The Small minus Big (SMB) factor is the average return on the three small portfolios minus the average return on the three big portfolios. The High minus Low (HML) factor is the average return on the two value portfolios minus the average return on the two growth portfolios. We also create a daily version of the momentum factor of Carhart (1997). We use six value-weighted portfolios formed on size and prior returns. The momentum factor is estimated as the average return on the two high prior return portfolios minus the average return on the two low prior return portfolios.
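The factor definitions above can be illustrated with a minimal Python sketch. It takes one day of returns on the six size/book-to-market portfolios and the six size/prior-return portfolios and combines them as described in the text. The function names, the dictionary layout, and all numerical values are hypothetical and serve only as an illustration; they are not taken from our data.

```python
from statistics import mean

def daily_risk_free(monthly_tbill_return, days_in_month):
    """Daily rate: 1-month T-bill return divided by the number of days in the month."""
    return monthly_tbill_return / days_in_month

def smb(size_bm):
    """SMB: average of the three small portfolios minus average of the three big ones."""
    groups = ("value", "neutral", "growth")
    small = mean([size_bm[("small", g)] for g in groups])
    big = mean([size_bm[("big", g)] for g in groups])
    return small - big

def hml(size_bm):
    """HML: average of the two value portfolios minus average of the two growth ones."""
    value = mean([size_bm[(s, "value")] for s in ("small", "big")])
    growth = mean([size_bm[(s, "growth")] for s in ("small", "big")])
    return value - growth

def mom(size_prior):
    """MOM: average of the two high prior-return portfolios minus the two low ones."""
    high = mean([size_prior[(s, "high")] for s in ("small", "big")])
    low = mean([size_prior[(s, "low")] for s in ("small", "big")])
    return high - low

# Hypothetical one-day value-weighted portfolio returns (illustrative numbers only).
size_bm = {
    ("small", "value"): 0.012, ("small", "neutral"): 0.009, ("small", "growth"): 0.006,
    ("big", "value"): 0.008, ("big", "neutral"): 0.005, ("big", "growth"): 0.002,
}
size_prior = {
    ("small", "high"): 0.011, ("small", "medium"): 0.006, ("small", "low"): 0.003,
    ("big", "high"): 0.007, ("big", "medium"): 0.004, ("big", "low"): 0.001,
}

print(smb(size_bm), hml(size_bm), mom(size_prior), daily_risk_free(0.003, 30))
```

The sketch starts from portfolio returns that are already value-weighted; forming the portfolios from individual stocks (breakpoints, rebalancing dates) is outside its scope.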

2.3 Descriptive statistics

Table 1 shows the descriptive statistics of the sample on a country-by-country basis. As explained previously, we have selected the countries with the largest market capitalization. The table presents the sample of funds as obtained from the Morningstar database. The second column of Panel A (Table 1) shows the number of mutual funds from each country. The third column presents the mean raw returns, the fourth column reports the mean asset values, and the fifth column shows the average number of years since the fund was created. The US has the largest number of funds, with 6501, and Hong Kong has the smallest, with only 14 funds. Raw returns fluctuate significantly between countries, from \(-1.077\)% for Sweden to +1.007% for Brazil. Total net assets under management average 267 million dollars per country, and the average fund has returned \(-0.062\)% per day. Average asset values per country vary from 18 million dollars in Portugal to 2327 million dollars in the United States. The last column is the fund age variable, which indicates that, on average, a mutual fund has been in existence for 11 years.

Panel B (Table 1) shows summary statistics of the daily and monthly return distributions. Daily returns present larger excess kurtosis and higher negative skewness than monthly observations. The negative skewness might be the result of the stock market collapse in 2000 and other smaller crashes.
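As an illustration of the two distributional statistics discussed above, the following Python sketch computes skewness and excess kurtosis from a return series. It assumes the simple population-moment estimators (no small-sample correction); the estimator actually used for Table 1 is not specified in the text, and the sample series is hypothetical.

```python
def skewness_and_excess_kurtosis(returns):
    """Population-moment skewness and excess kurtosis of a return series."""
    n = len(returns)
    mu = sum(returns) / n
    m2 = sum((r - mu) ** 2 for r in returns) / n  # variance
    m3 = sum((r - mu) ** 3 for r in returns) / n  # third central moment
    m4 = sum((r - mu) ** 4 for r in returns) / n  # fourth central moment
    skew = m3 / m2 ** 1.5
    excess_kurt = m4 / m2 ** 2 - 3.0              # zero for a normal distribution
    return skew, excess_kurt

# Hypothetical series with one large loss: negative skewness, positive excess kurtosis.
sample = [-5.0, 1.0, 1.0, 1.0, 1.0, 1.0]
print(skewness_and_excess_kurtosis(sample))
```

A single large negative return is enough to produce both a negative skewness and a positive excess kurtosis, which is the pattern a market crash would leave in daily data.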

3 Methodology

3.1 Models of mutual fund performance

To estimate fund performance, we use the one-factor CAPM, the Fama and French (1993) three factor model and the Carhart’s (1997) four-factor model:

$$\begin{aligned}&\displaystyle \hbox {R}_{pt} = \mathop {\upalpha }\nolimits _{pt }+ {\beta }_{1,pt} (\hbox {RM}_{\mathrm{t}}-\hbox {RF}_{t})+\mathop {\upvarepsilon }\nolimits _{pt } \end{aligned}$$
(1)
$$\begin{aligned}&\displaystyle \hbox {R}_{pt} = \mathop {\upalpha }\nolimits _{pt }+ {\beta }_{1,pt} (\hbox {RM}_{\mathrm{t}}-\hbox {RF}_{t}) + {\beta }_{2,pt}\hbox {SMB}_{t} + { \beta }_{3,pt} \hbox {HML}_{t}+\mathop {\upvarepsilon }\nolimits _{pt } \end{aligned}$$
(2)
$$\begin{aligned}&\displaystyle \hbox {R}_{pt} = \mathop {\upalpha }\nolimits _{pt }+ {\beta }_{1,pt} (\hbox {RM}_{\mathrm{t}}-\hbox {RF}_{t}) + {\beta }_{2,pt} \hbox {SMB}_{t} + { \beta }_{3,pt} \hbox {HML}_{t} + {\beta }_{4,pt} \hbox {MOM}_{t}+\mathop {\upvarepsilon }\nolimits _{pt } \end{aligned}$$
(3)

where \(\hbox {R}_{pt}\) is the return on fund p in period t, \(\hbox {RF}_{t}\) is the risk-free rate, \(\hbox {RM}_{t}\) is the market return, \(\hbox {SMB}_{t}\) and \(\hbox {HML}_{t}\) are the Fama and French (1993) size and book-to-market factors, \(\hbox {MOM}_{t}\) is the period t value of the Carhart (1997) momentum factor, \(\upvarepsilon _{pt}\) is the residual from the regression, and \(\upalpha _{pt}\) is the average return above the benchmark. Regression (1) is the CAPM, regression (2) is the Fama–French three-factor model, and regression (3), which includes the \(\hbox {MOM}_{t}\) factor, is Carhart’s four-factor model. Following the recommendation of Dimson (1979), we add lagged values of the factors in the four-factor model to control for the influence of infrequent trading of stocks on daily fund returns. To estimate each fund’s t-statistic, we apply the Newey and West (1987) heteroskedasticity and autocorrelation consistent estimator of the standard errors.
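To make the estimation of alpha concrete, the following Python sketch fits regression (1) by ordinary least squares for a single fund. It is a simplified illustration under stated assumptions: it follows regression (1) as written, with the fund return on the left-hand side, and it omits the Dimson (1979) lagged terms and the Newey and West (1987) standard errors. The function name and the data are hypothetical.

```python
def capm_alpha_beta(fund_returns, market_excess):
    """OLS intercept (alpha) and slope (beta) of fund returns on RM - RF."""
    n = len(fund_returns)
    mean_x = sum(market_excess) / n
    mean_y = sum(fund_returns) / n
    sxx = sum((x - mean_x) ** 2 for x in market_excess)
    sxy = sum((x - mean_x) * (y - mean_y)
              for x, y in zip(market_excess, fund_returns))
    beta = sxy / sxx
    alpha = mean_y - beta * mean_x
    return alpha, beta

# Hypothetical daily data generated with alpha = 0.001 and beta = 1.2 and no noise.
market_excess = [0.010, -0.020, 0.005, 0.015, -0.010]
fund_returns = [0.001 + 1.2 * x for x in market_excess]
print(capm_alpha_beta(fund_returns, market_excess))
```

Regressions (2) and (3) add further regressors and therefore require a multivariate least-squares routine, but the interpretation of the intercept as alpha is unchanged.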

Carhart (1997) argues that the four-factor model is more suitable for explaining the differences in performance between past winners and past losers. The author points out that the momentum factor accounts for most of this difference. We use a set of control variables frequently employed in the relevant finance literature: (i) AGE: the number of years that the fund has been in existence. (ii) VOLAT: volatility, estimated as the standard deviation of returns over the previous 12 months. (iii) ASSETS: fund size, measured as total net assets. (iv) EXP: the total fund expenses. (v) FLOW: the net inflows into the fund as defined by Sirri and Tufano (1998), estimated as the net growth in assets in excess of returns: \(\hbox {FLOW}_{p,t}= [\hbox {ASSETS}_{p,t}- \hbox {ASSETS}_{p,t-1}(1+\hbox {NRET}_{p,t})]/\hbox {ASSETS}_{p,t-1}\), where \(\hbox {NRET}_{p,t}\) is the net return of fund p in period t.
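The FLOW definition can be checked with a short Python sketch. We assume here that NRET denotes the fund's net return over the period; the function name and the numbers are hypothetical.

```python
def net_flow(assets_t, assets_prev, net_return_t):
    """FLOW = [ASSETS_t - ASSETS_{t-1} * (1 + NRET_t)] / ASSETS_{t-1}."""
    return (assets_t - assets_prev * (1.0 + net_return_t)) / assets_prev

# Hypothetical fund: assets grow from 100 to 115 while the fund returns 5%.
# Returns alone explain growth to 105, so the remaining 10 is new money.
print(net_flow(115.0, 100.0, 0.05))
```

In this example the measure attributes a net inflow of about 10% of beginning-of-period assets to investors rather than to performance.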

3.2 Performance-expenses relation

From an efficiency perspective, higher expenses should be associated with better fund performance, which is in contrast with earlier studies such as GBRV (2009), Elton et al. (1993) and Carhart (1997). Otten and Bams (2002) and Vidal-Garcia (2013) find consistent results with these studies for the European countries whereas Ippolito (1989) shows that returns are not linked to expenses for US funds. The aim of this paper is to provide evidence regarding the relationship between expenses and performance of mutual funds using different measurement methodologies.

We use the following estimation model to empirically examine the expense-performance relation:

$$\begin{aligned} \hbox {PERF}_{pt}= & {} \lambda _{0} + \lambda _{1} \hbox {EXP}_{p,t} + \lambda _{2} \hbox {LOWPERF}_{p,t-1} \hbox {EXP}_{p,t} + \lambda _{3} \hbox {MIDPERF}_{p,t-1} \hbox {EXP}_{p,t}\nonumber \\&+\, \lambda _{4} \hbox {HIGHPERF}_{p,t-1} \hbox {EXP}_{p,t} + \delta _{2} \hbox {LOWPERF}_{p,t-1} \hbox {D}_{p,t-1} \hbox {EXP}_{p,t}\nonumber \\&+\, \delta _{3} \hbox {MIDPERF}_{p,t-1} \hbox {D}_{p,t-1} \hbox {EXP}_{p,t} + \delta _{4} \hbox {HIGHPERF}_{p,t-1} \hbox {D}_{p,t-1} \hbox {EXP}_{p,t}\nonumber \\&+\, \hbox {CV}_{pt} + \upsilon _{pt} \end{aligned}$$
(4)

where \(\hbox {PERF}_{pt}\) denotes the different performance measures: gross returns (before expenses are deducted), net returns (after expenses are deducted), and the risk-adjusted returns (Jensen’s alpha) from the CAPM, the Fama and French (1993) model, and the Carhart (1997) model; \(\hbox {EXP}_{pt}\) is total expenses; and \(\hbox {CV}_{pt}\) is a group of control variables including age, volatility, the natural logarithm of assets, and net inflows into the fund. \(\hbox {LOWPERF}_{p,t-1}\), \(\hbox {MIDPERF}_{p,t-1}\), and \(\hbox {HIGHPERF}_{p,t-1}\) are the fractional performance ranks of fund p in period t-1, and \(\hbox {D}_{p,t-1}\) is a dummy variable to account for the effect of small funds; \(\upsilon _{pt}\) is the error term.

We divide the excess returns into fractional ranks to control for the asymmetric flow-to-performance relation reported in the finance literature. Following Sirri and Tufano (1998), we rank funds from 0 (bottom) to 1 (top) according to their performance in the previous year. We set the following fractional classification: \(\hbox {LOWPERF}_{p,t-1}\) is the worst return quintile, determined as \(\hbox {Min}(\hbox {RANK}_{t-1}, 0.2)\); \(\hbox {MIDPERF}_{p,t-1}\) represents the middle three return quintiles, determined as \(\hbox {Min}(0.6, \hbox {RANK}_{t-1} - \hbox {LOWPERF}_{p,t-1})\); and \(\hbox {HIGHPERF}_{p,t-1}\) is the best return quintile. By separating fund returns into these ranks, we allow the coefficient on expenses to differ across performance ranks.
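The fractional classification can be sketched in Python as follows. The LOWPERF and MIDPERF formulas are those given in the text; for HIGHPERF we assume the usual Sirri and Tufano (1998) residual definition, RANK - MIDPERF - LOWPERF, which the text does not state explicitly. The function name is hypothetical.

```python
def fractional_ranks(rank):
    """Split a performance rank in [0, 1] into LOWPERF, MIDPERF and HIGHPERF."""
    low = min(rank, 0.2)          # bottom quintile
    mid = min(0.6, rank - low)    # middle three quintiles
    high = rank - mid - low       # top quintile (assumed residual definition)
    return low, mid, high

# The three components always add up to the original rank.
for rank in (0.1, 0.5, 0.9):
    print(rank, fractional_ranks(rank))
```

A fund ranked at 0.9, for instance, is assigned 0.2 to LOWPERF, 0.6 to MIDPERF and the remaining 0.1 to HIGHPERF, so each coefficient in regression (4) captures the effect within its own segment of the ranking.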

3.3 The data envelopment analysis model

We measure the efficiency of domestic equity funds using the Data Envelopment Analysis (DEA) non-parametric methodology employed in the resolution of production functions. This technique was initially developed by Charnes et al. (1978) to evaluate the performance of educational institutions and since then it has been widely used to evaluate the performance of decision-making units (DMUs) determined by several inputs-outputs structures. It is a useful methodology for examining performance as it is possible to include multiple inputs and outputs that can be measured in different units. The DEA evaluates the highest potential output for a certain number of inputs. It sets an efficiency measure for each decision-making unit related to the best operating unit within a given sample. The procedure examines how efficiently a decision-making unit uses the available resources to create the outputs. The performance of these decision-making units is examined in DEA applying the concept of efficiency described as a ratio of total weighted outputs to total weighted inputs. Castelli et al. (2010) explain that the targets of a DMU are the levels of outputs (inputs) that a given DMU should reach by increasing (decreasing) its yield (consumption) to become efficient.

Efficiency estimates obtained with DEA are relative to the top-performing DMUs. The most efficient DMUs are assigned an efficiency score of unity (100%), and the performance of the remaining DMUs ranges from 0 to 100% relative to the top performers. In this sense, Angulo-Meza and Lins (2002) note that the advantage of this model is that it gives a set of efficient solutions, whereas the linear programming model gives only one optimal solution.

The DEA methodology has been implemented to examine mutual fund performance in the U.S. by several authors, including Murthi et al. (1997), Morey and Morey (1999) and Basso and Funari (2001). To the best of our knowledge, our study is the first one to examine mutual funds’ performance around the world using DEA. The DEA procedure might be used to describe mutual fund indexes with various inputs as risk measures and fees.

If the efficiency is unity, then the DEA technique represents a Pareto efficiency measure (Footnote 2) and the efficient units are located on the efficiency frontier. As Chen and Zhu (2003) point out, DEA models have proven to be an effective methodology for estimating efficient frontiers.

As described by Charnes et al. (1994), to estimate the DEA efficiency model for a DMU, we should find the optimal solution to the following fractional linear programming problem:

$$\begin{aligned} \max _{u_r ,v_i} \quad h_{j_o}=\frac{\sum \nolimits _{r=1}^t u_r y_{rj_o}}{\sum \nolimits _{i=1}^m v_i x_{ij_o}} \end{aligned}$$
(5)
$$\begin{aligned}&\hbox {subject to }\quad \frac{\sum \nolimits _{r=1}^t u_r y_{rj}}{\sum \nolimits _{i=1}^m v_i x_{ij}} \le 1, \quad j=1,\ldots ,n\\&u_r\ge \upvarepsilon , \quad r=1,\ldots ,t\nonumber \\&v_i\ge \upvarepsilon , \quad i=1,\ldots ,m \nonumber \end{aligned}$$
(6)

where \(y_{rj}\) is the amount of output r produced by unit j, \(x_{ij}\) is the amount of input i used by unit j, \(u_r\) and \(v_i\) are the output and input weights, n is the number of units, t is the number of outputs, m is the number of inputs, and \(\upvarepsilon \) is a small positive number that keeps the weights strictly positive. Solving problem (5)–(6), we obtain the value of the optimal objective function, which is the efficiency measure for unit \(j_o\). We can obtain an equivalent linear programming problem by converting the fractional problem explained above (Charnes and Cooper 1962). We set \(\sum \nolimits _{i=1}^m v_i x_{ij_o} = 1\), obtaining the Charnes, Cooper and Rhodes (CCR) model:

$$\begin{aligned}&\hbox {max } \sum \nolimits _{r=1}^t u_r y_{rj_o} \end{aligned}$$
(7)
$$\begin{aligned}&\quad \hbox {subject to }\quad \sum \nolimits _{i=1}^m v_i x_{ij_o} = 1 \end{aligned}$$
(8)
$$\begin{aligned}&\quad \sum \nolimits _{r=1}^t u_r y_{rj} - \sum \nolimits _{i=1}^m v_i x_{ij} \le 0, \quad j=1,\ldots ,n\\&-u_r\le -\upvarepsilon , \quad r=1,\ldots ,t\nonumber \\&-v_i\le -\upvarepsilon , \quad i=1,\ldots ,m\nonumber \end{aligned}$$
(9)

The optimization problem can be solved by estimating the values of t+m variables, namely the weights \(\hbox {u}_{\mathrm{r}}\) and \(\hbox {v}_{\mathrm{i}}\), subject to n+t+m+1 constraints (n ratio constraints, t+m lower bounds on the weights, and one normalization).
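For intuition, the CCR problem has a closed-form solution in the special case of a single input and a single output: the efficiency of each unit is its output-to-input ratio divided by the best ratio in the sample. The Python sketch below illustrates only this special case, assuming strictly positive inputs and outputs and an \(\upvarepsilon \) small enough not to bind; the general multi-input, multi-output problem requires a linear programming solver. The function name and data are hypothetical.

```python
def ccr_efficiency_single(inputs, outputs):
    """CCR efficiency scores for DMUs with one input and one output.

    With a single input and output, the optimal score of each unit equals
    its output/input ratio relative to the best ratio in the sample.
    """
    ratios = [y / x for x, y in zip(inputs, outputs)]
    best = max(ratios)
    return [r / best for r in ratios]

# Hypothetical funds: input = expense ratio (%), output = gross return (%).
expenses = [1.0, 2.0, 1.5]
gross_returns = [8.0, 10.0, 6.0]
print(ccr_efficiency_single(expenses, gross_returns))
```

Here the first fund defines the frontier with a score of 1, and the other two are evaluated relative to it, which mirrors the interpretation of the efficiency scores described above.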

3.4 Data envelopment analysis model for mutual fund performance

The DEA methodology has been applied to examine mutual fund performance in several studies. The DEA technique allows determining mutual fund performance while considering various inputs (variables) like risk measures and expenses. The costs related to mutual fund investment are an important element when examining the fund performance. However, the popular ratios and multifactor models used do not always take into account fund expenses.

Murthi et al. (1997) were the first to apply the DEA approach in a mutual fund efficiency index called DPEI (DEA portfolio efficiency index). They used fund returns as output and expense ratio, turnover, load and standard deviation of returns as inputs. Basso and Funari (2001) develop the \(I_{\mathrm{DEA-1}}\) index, which is similar to the DPEI with some difference in the investment costs taken into account in the model. They only consider subscription and redemption costs without other expenses as they have already been subtracted from fund returns.

Using the DEA methodology allows the inefficient funds to know which other fund they could imitate to improve their efficiency. The efficient fund could be a target benchmark and the inefficient fund could obtain better performance imitating the behavior of the efficient one as both have the same input-output characteristics.

An important advantage of DEA is that it does not need any theoretical model as a measurement benchmark, as it is based on a non-parametric analysis. DEA is also useful to account for the endogeneity of transaction costs, as it includes the expense ratio, loads, turnover and returns simultaneously in the analysis. The model can examine many outputs and inputs at the same time, for example in the common case in which managers are responsible for both the returns and the size of the fund.

The DEA model has some advantages over traditional performance measures as explained by Basso and Funari (2001). The DEA model results are not sensitive to the selected investment period using logarithmic returns and assuming stationarity and independence of returns over time. In contrast, the traditional performance models are affected by the investment period and it is possible to obtain different results depending on the frequency of the observations used (daily, monthly, etc). Thus, there is a systematic bias using a time horizon which is not the same as the one considered by the investors.

Another useful characteristic of the DEA model is the possibility to improve the inefficient units with the evidence from their peers. The inefficient units could imitate a unit on the efficient frontier in order to be more efficient as the ones on the efficient frontier have the same input variables as the inefficient units. This efficient unit could be a benchmark for the inefficient funds. In this sense, Morey and Morey (1999) point out that this benchmark is a fund of funds that investors could buy.

4 Results

4.1 Performance evaluation

Table 2 shows the daily statistics for the four-factor Carhart (1997) model for the period from January 1990 to December 2015. Most premiums (alphas) are positive, suggesting that more risky, small, value-focused, and previous-winner stocks achieved higher returns. This fact indicates that the Carhart factors could explain most of the cross-sectional variation in average daily returns around the world over the period under consideration. The size (SMB) factor is always positive; the book-to-market (HML) factor presents mixed results, although it is significantly positive for most countries; and the momentum (MOM) factor is also significant for most countries, with mixed signs. The size factor (SMB) suggests that small stocks performed well and that fund returns were driven by smaller stocks during the sample period. The book-to-market factor (HML) suggests that funds follow a value-oriented style. Momentum strategies only add value in 14 out of 35 countries, while 9 countries present contrarian strategies. The average alpha is positive in 21 out of 35 countries. This contrasts with most of the mutual fund literature, which documents persistent underperformance (e.g. Carhart 1997).

Table 2 Summary statistics for the Carhart model

Table 3 reports performance results from the estimation of regressions (1) to (3). Most countries present a positive performance (alpha) irrespective of the multifactor model used in the analysis. The before-expenses measures of performance do not change the sign of the coefficients. As expected, the gross risk-adjusted returns are positive and larger for all models than the net risk-adjusted returns (see, Table 3). The best result is obtained when we use the Carhart (1997) model to measure fund risk-adjusted performance. The daily mean gross risk-adjusted return (for the Carhart model) ranges from 0.178% for Brazil to \(-0.041\)% for Sweden, while the daily mean net risk-adjusted return varies from 0.089% for Brazil to \(-0.082\)% for Sweden. In Table 3 we do not find significant differences when considering fund expenses across countries. Our results suggest similar behavior for mutual funds in terms of performance assessment.

Table 3 Risk exposure estimates
Table 4 Performance measure distribution

Next, we examine funds per country using risk-adjusted returns (CAPM, Fama–French and Carhart) and non-risk-adjusted performance evaluation (net and gross returns). In Table 4, we examine fund performance by presenting the percentage of positive and negative values of performance measures. Panel A reports the proportion of positive values, while Panels B and C show the proportion of statistically significant (at the 5% level) positive and negative values, respectively. Less than half of the funds exhibit positive values of performance, as shown in Panel A. We use a paired t-test to test the null hypothesis that the true mean difference is zero. Considering the gross risk-adjusted performance (Carhart (G)), the proportions range from 40.4% for Denmark to 61.6% for Indonesia. There is no significant difference when we compare the risk-adjusted values after expenses (Carhart (N)) with the results before expenses (Carhart (G)). Panel B of Table 4 confirms the previous results. Before-expenses measures of performance (gross returns (G)) obtain a larger percentage of positive alphas than results after expenses (net returns (N)). In Panel B of Table 4, the percentage of funds with significantly positive values of gross risk-adjusted performance varies from 4.9% for Indonesia to 15.4% for Austria (Panel B, last column). Panel C presents the percentage of significantly negative risk-adjusted estimations. The proportion of significantly positive alphas (Panel B) is larger than the proportion of significantly negative ones (Panel C) only before expenses (gross returns (G)). For instance, 10.2% of the performance values in the Carhart model (Carhart (G)) for the US are significantly positive, while only 2.2% are significantly negative. Panel C shows that the opposite holds when net risk-adjusted values are used (Carhart (N)).
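The three panels described above amount to simple counts over fund-level estimates. The sketch below shows one way to compute them, assuming each fund contributes an alpha and its t-statistic and that significance at the 5% level is judged against a two-sided normal critical value of 1.96. The function name, the critical value, and the sample values are assumptions for illustration only.

```python
def summarize_alphas(estimates, critical_value=1.96):
    """Summarize a list of (alpha, t_statistic) pairs, one pair per fund.

    The default critical value of 1.96 is an assumed large-sample,
    two-sided 5% threshold.
    """
    n = len(estimates)
    positive = sum(1 for alpha, _ in estimates if alpha > 0)
    sig_positive = sum(
        1 for alpha, t in estimates if alpha > 0 and t > critical_value
    )
    sig_negative = sum(
        1 for alpha, t in estimates if alpha < 0 and t < -critical_value
    )
    return {
        "panel_a_positive_pct": 100.0 * positive / n,
        "panel_b_sig_positive_pct": 100.0 * sig_positive / n,
        "panel_c_sig_negative_pct": 100.0 * sig_negative / n,
    }


# Made-up fund estimates: (alpha in percent, t-statistic).
funds = [(0.04, 2.3), (0.01, 0.6), (-0.02, -1.1), (-0.05, -2.4), (0.03, 1.2)]
summary = summarize_alphas(funds)
```

In this made-up sample three of five alphas are positive, but only one is significantly positive and one significantly negative, which mirrors the distinction between Panel A and Panels B and C.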

4.2 Performance-expenses relation

This section analyzes the economic efficiency of the mutual fund industry per country. We empirically test the relationship between performance and the fees paid by investors. Grossman and Stiglitz (1980) explain that there should be a positive link between fees and before-expenses performance. Thus, we expect the performance-expenses relation to be positive. From an efficiency perspective, higher expenses paid by investors should be related to greater performance. Efficiency means that fund services to investors would cover their costs, and thus net performance should be similar between funds after expenses. The mutual fund industry should show a robust link between expenses and gross performance.

Table 5 shows the results for the performance-expenses relation for the Carhart (1997) model using net returns. The coefficients of expenses are always significantly negative (at better than the 5% level) for all countries and risk-adjusted performance measures. Very similar results are found using other multifactor models.Footnote 3 We also find a negative relationship between returns and expenses for fractional ranks of past performance \((\hbox {LOWP}_{(\mathrm{t}-1)})\). This negative relation between returns and expenses from previous performance is linked to the bottom-performing funds \((\hbox {LOWP}_{(\mathrm{t}-1)})\), while it is not significant for the medium and high fractional ranks. Moreover, to examine whether our findings are influenced by the performance of small funds, we interact each fractional rank with a dummy variable that is equal to one if the fund size is in the lowest 10% of the fund size distribution, and zero otherwise (L \(\times \) Small, M \(\times \) Small, H \(\times \) Small). These interaction terms are not significant for our sample of countries.
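To make the construction of the fractional-rank and small-fund variables concrete, the sketch below splits a fund's fractional rank of past performance into low, medium and high pieces and interacts each piece with a small-fund dummy. The breakpoints (bottom 20%, middle 60%, top 20%) follow the piecewise convention common in the fund-flow literature and are an assumption here, since the text above does not state them; the function names and sample values are likewise hypothetical.

```python
def fractional_rank_pieces(rank, low_cut=0.2, mid_width=0.6):
    """Split a fractional rank in [0, 1] into low, medium and high pieces.

    The 0.2 / 0.6 / 0.2 breakpoints are an assumed convention.
    """
    low = min(rank, low_cut)
    mid = min(rank - low, mid_width)
    high = rank - low - mid
    return low, mid, high


def small_fund_flags(sizes, share=0.10):
    """Return 1 for funds in the lowest `share` of the size distribution."""
    n_small = int(share * len(sizes) + 1e-9)
    order = sorted(range(len(sizes)), key=lambda i: sizes[i])
    smallest = set(order[:n_small])
    return [1 if i in smallest else 0 for i in range(len(sizes))]


def interaction_terms(rank, is_small):
    """Return (L x Small, M x Small, H x Small) for one fund."""
    return tuple(piece * is_small for piece in fractional_rank_pieces(rank))


# Made-up fractional ranks of past performance for three funds.
pieces = [fractional_rank_pieces(r) for r in (0.10, 0.50, 0.90)]

# Made-up fund sizes; with ten funds, only the smallest one is flagged.
sizes = [5, 40, 12, 300, 75, 20, 150, 60, 33, 90]
flags = small_fund_flags(sizes)

# A small fund with a low past rank loads only on the L x Small term.
example_interaction = interaction_terms(0.10, 1)
```

The three pieces always sum to the original rank, so the regression can assign a separate slope to each performance range.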

As previously documented in the literature, we find that funds with poor risk-adjusted performance incur higher expenses (see Table 5, column \(\hbox {LOWP}_{(\mathrm{t}-1)}\)), which means that funds charging high expenses tend to underperform, contrary to what efficiency theory predicts.

Table 5 Fund performance-expenses relation

Table 5 allows us to examine the effect of mutual fund characteristics on risk-adjusted performance. In line with prior studies of fund performance, younger funds obtain higher performance than older ones. We show a negative link between fund volatility and fund performance, indicating that more volatile funds present worse performance. We also find a significant positive relation between fund assets and fund risk-adjusted performance, which indicates possible economies of scale for mutual fund markets around the world. Finally, we document a negative relation between net inflows and fund performance, which shows an improvement in performance when net inflows are negative.

4.3 Data envelopment analysis (DEA)

We use the DEA methodology, a technique used in operations research to evaluate relative measures of efficiency, to examine mutual fund performance. This methodology addresses some of the main problems in portfolio evaluation, since it does not require a benchmark and allows us to consider mutual fund expenses. DEA is a flexible methodology for performance evaluation, as it permits a model with many inputs and outputs. This technique allows us to examine whether the fund manager uses the available resources to achieve the maximum level of output (scale efficiency). It is a suitable methodology for portfolio evaluation, as investors are willing to invest in a fund that maximizes returns and minimizes expenses.

The DEA compares each fund to the best available funds in the same country. With known expenses and risk taken, we identify the most profitable fund for each country. Investors are interested in finding the fund return net of expenses at a given level of risk. Thus, we consider return as the only output, and four inputs, namely expense ratio, turnover, load and standard deviation of returns. DEA examines the efficiency of a fund compared to the group of funds that use the same inputs to achieve the same outputs. DEA separates the efficient funds from the inefficient ones depending on whether they are on the Pareto-efficient frontier or not. A fund's distance from the efficient frontier provides an estimate of its relative inefficiency. This is consistent with the mean–variance theory of Markowitz (1952), who explains that portfolios are efficient when they have the highest expected return for a given variance. Thus, the DEA shows whether a fund can improve its performance compared to a group of similar funds. When the efficiency is 100% and the slack variables (input factors which represent performance inefficiencies) are zero, the output of a fund cannot be expanded without increasing its inputs, and the inputs cannot be lowered at the current output level. Such a fund is Pareto-efficient with an output efficiency of 1, while a DEA measure below 1 shows that the fund is inefficient. A fund's inefficiency is estimated as the difference between 1 and its efficiency value.
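The idea of separating funds on the Pareto-efficient frontier from dominated ones can be illustrated with a simple dominance screen. Note that this is a free-disposal (dominance-only) check, not the DEA linear program itself: DEA additionally compares each fund against combinations of its peers, so a fund that passes this screen may still receive a DEA score below 1. Fund names and values are made up, with inputs ordered as expense ratio, turnover, load and standard deviation of returns.

```python
def dominates(a, b):
    """True if fund a uses no more of any input and earns no less return
    than fund b, with at least one strict improvement."""
    pairs = list(zip(a["inputs"], b["inputs"]))
    no_worse = all(x <= y for x, y in pairs) and a["ret"] >= b["ret"]
    strictly_better = any(x < y for x, y in pairs) or a["ret"] > b["ret"]
    return no_worse and strictly_better


def pareto_efficient(funds):
    """Return the names of funds that no other fund in the sample dominates."""
    return [
        name
        for name, fund in funds.items()
        if not any(
            dominates(other, fund)
            for other_name, other in funds.items()
            if other_name != name
        )
    ]


# Made-up funds: inputs = (expense ratio, turnover, load, std. deviation).
funds = {
    "A": {"inputs": (1.0, 50, 0.0, 15), "ret": 8.0},
    "B": {"inputs": (1.5, 80, 2.0, 18), "ret": 7.0},  # dominated by A
    "C": {"inputs": (0.5, 30, 0.0, 10), "ret": 5.0},
    "D": {"inputs": (2.0, 90, 1.0, 20), "ret": 12.0},
}
efficient = pareto_efficient(funds)
```

Fund B is screened out because fund A delivers a higher return with lower values of all four inputs; the remaining funds trade off inputs against return and cannot be ranked by dominance alone.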

Table 6 Percentage of efficient funds
Table 7 Efficiency scores

We examine the variation in average DEA across countries. For this purpose, we estimate the DEA score for each fund within each country. Table 6 presents the percentage of efficient funds across countries for our sample period 1990–2015 for every variation of the DEA model, using gross returns and alphas from the CAPM, Fama–French and Carhart models as output measures. It is clear that most funds are efficient across countries, whichever return measure is used as output. It is worth noting that our sample period is long enough to include extreme market events such as recessions and financial crises across countries, which increase volatility in stock markets. We address this issue in a subsequent analysis. Gross returns have higher efficiency rates, while the proportion of efficient funds tends to decrease when using multifactor models, although it increases with larger factor models. Using gross returns, the percentage of efficient funds ranges from 64% for Australia to 85.2% for Finland, while using the four-factor model of Carhart the proportion of efficient funds ranges from 58.8% for Australia to 79.1% for Spain. We do not find evidence of any relation between the number of funds per country and efficiency, as the country with the largest number of funds, the U.S., shows an efficiency percentage using gross returns of 70.5%, while the country with the lowest number of funds, Hong Kong, presents an efficiency percentage of 78.4%. Our results support the evidence shown by Ippolito (1989) that in an efficient market fund returns are sufficient to cover loads and expenses.

Table 7 shows the average DEA efficiency scores by country. Scores vary between 0 and 1, with perfectly efficient funds scoring 1. We find that funds from all countries present a degree of efficiency above 0.60 in their use of resources, although the average fund is not on the efficiency frontier, as the average score is below 1. The degree of inefficiency is measured as the difference between one and the score. To be on the efficiency frontier, funds would need to reduce some of their inputs. However, if the efficiency is 1 and the slack variables are zero, the input level would need to increase to expand the output. Although most funds are efficient (score of 1), as we have seen in Table 6, the average is below this value due to the inefficient ones in each country. The efficiency scores differ depending on the output measure. Gross return and CAPM give higher efficiency scores. The efficiency scores of funds tend to decrease when using multifactor models and when the performance measure is more sophisticated. The efficiency scores, using gross returns as output measure, range from 0.66 for France to 0.88 for Denmark, Canada and Poland. When considering the Carhart model, the efficiency scores range from 0.58 for France and Indonesia to 0.76 for Canada. As with previous results, we do not find evidence of any relation between the number of funds per country and the efficiency score. We interpret the high efficiency scores as evidence that the mutual fund industry is a competitive market in which investors have access to any fund around the world. Inefficient mutual funds can emulate their efficient peer group, which has achieved efficiency under the same conditions. A challenge for fund managers in increasing efficiency is to deal with the dilution effect (Greene and Hodges 2002), which results from the money inflows that funds receive due to their good performance. Large new inflows dilute the overall performance of the fund until fund managers can efficiently invest this new available cash to match the fund's performance record. Another potential obstacle to increasing efficiency scores is that regulatory obligations and the open-end structure of most funds require them to hold a substantial part of their assets in cash to meet investor redemptions in a timely manner.
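The distinction between the percentage of efficient funds (Table 6) and the average efficiency score (Table 7) can be seen in a small numerical sketch. The scores below are invented for a hypothetical country with five funds and are not taken from the tables.

```python
# Made-up DEA scores for five funds in a hypothetical country.
scores = [1.0, 1.0, 1.0, 0.55, 0.35]

# Share of funds on the frontier (score of 1), as reported in Table 6.
share_efficient = 100.0 * sum(1 for s in scores if s >= 1.0) / len(scores)

# Average efficiency score, as reported in Table 7.
average_score = sum(scores) / len(scores)

# Inefficiency of each fund: the difference between one and its score.
inefficiency = [1.0 - s for s in scores]
```

Here 60% of the funds are efficient, yet the average score is roughly 0.78 because the two inefficient funds pull the mean below 1.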

We can determine the sources of inefficiency by analyzing the slacks (performance inefficiencies) of the cost variables. The slacks show where funds use resources inefficiently and indicate the degree to which each input can be reduced to obtain an efficiency score of one. Panel A of Table 8 shows the average of the absolute slacks, while Panel B presents the relative average slacks (absolute average slack in an input divided by the average value of that input). Relative slacks are useful to compare the marginal impact of inputs on fund returns across countries. We report the results only for the Carhart output measure. The other output measures present qualitatively similar results and have been omitted for brevity. An interesting result is that fund risk (standard deviation of returns) presents only small slacks across countries, which supports the idea that most funds are mean–variance efficient. The risk variable shows that fund portfolios are properly diversified and that equity funds have eliminated the non-systematic part of their portfolio risk. Turnover and loads present larger slacks, suggesting that more funds are inefficient in these respects. This is consistent with the fund literature using the DEA model. For instance, Murthi and Choi (2001) show that loads and turnover are the main sources of inefficiencies across all fund categories. The slacks from fund loads suggest that investors should avoid funds that charge any load, as this reduces profitability. These slacks vary significantly across countries, from 0.075 for Australia to 1.556 for Indonesia. Taiwanese funds are much more inefficient in turnover, since their fund managers spend more than double on turnover activity than any other manager. Expense ratios do not show significant slacks, which indicates that funds that charge higher fees appear to earn enough returns to compensate for the expenses. This confirms earlier empirical findings of Ippolito (1989), who employs a CAPM model to explain that, in an efficient market, mutual funds are expected to obtain enough risk-adjusted return to cover expenses for information gathering.
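The absolute and relative slack measures defined above (Panels A and B of Table 8) reduce to two averages per input. The following sketch computes both, assuming the per-fund slacks have already been obtained from a DEA solution; the function name and all numbers are made up for illustration.

```python
def mean(values):
    """Arithmetic mean of a non-empty list."""
    return sum(values) / len(values)


def slack_summary(inputs, slacks):
    """Return absolute and relative average slacks per input.

    inputs, slacks: dicts mapping an input name to per-fund values.
    Relative slack = average absolute slack / average value of the input.
    """
    summary = {}
    for name in inputs:
        absolute = mean(slacks[name])
        summary[name] = {
            "absolute": absolute,
            "relative": absolute / mean(inputs[name]),
        }
    return summary


# Made-up input levels and DEA slacks for three funds.
inputs = {"turnover": [80, 120, 100], "std_dev": [15, 20, 25]}
slacks = {"turnover": [0, 30, 15], "std_dev": [0, 0, 0.6]}
table = slack_summary(inputs, slacks)
```

In this example the relative slack in turnover (about 0.15) is much larger than the relative slack in risk (about 0.01), the same qualitative pattern described in the text.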

Table 8 Mean slacks in inputs

DEA confirms the mean–variance efficiency theory, where efficiency is defined as the ability of a fund to obtain the maximum return for a given level of risk. Markowitz (1952) and Tobin (1958) developed the concept of optimal portfolio selection, which states that investors aim to maximize their utility by selecting among the possible mean–variance efficient portfolios according to their risk tolerance. We use a DEA model based on a mean–variance framework, in which the return is the output of the model and the variance of the fund is used as an input. Our results confirm previous evidence (see Murthi et al. 1997) of the mean–variance efficiency theory for mutual funds around the world, as most funds are on the efficient frontier and the standard deviation shows only small slacks.

The return distribution of some assets might show excess kurtosis (fat tails). The commonly used measure of risk (standard deviation) relies on a normal distribution of returns, and this distribution does not account for the fat tails of possible stock market returns. To solve this problem, the DEA model projects the efficient frontier from the sample through non-linear forms of data. This methodology allows us to examine mutual fund performance by measuring the distance of the optimal projection from the efficient frontier.

Some authors explain that investors might be more interested in extreme values (skewness and kurtosis) than in central tendencies (standard deviation), as they argue that an investor's expected utility depends positively on skewness and negatively on kurtosis (see Scott and Horvath 1980). However, we do not think that there is significant recent empirical evidence that most investors have these expected utility preferences across countries.

4.4 Robustness analysis

We examine the robustness of our results from the regression and DEA analysis using the methodology of Fama and MacBeth (1973). Coval and Stafford (2007) point out that the estimates from Fama-MacBeth regressions are more accurate than OLS results. We check whether our OLS estimates are similar to those from the Fama-MacBeth regression for both the performance-expense regression and the DEA approach.

We estimate the performance-expenses relation in equation (4) and the DEA model using the coefficients of the performance measures obtained from the Fama-MacBeth two-step approach, in order to avoid cross-fund correlation in the residuals resulting from systematic misspecification that affects performance estimates. First, we regress risk-adjusted returns against the factors from the different multifactor models; then we take a time-series average of the estimates. We find statistically and qualitatively similar results for the performance-expense regression (listed in Table 5) and the DEA model (Tables 6, 7).Footnote 4
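The averaging step of a Fama-MacBeth procedure can be sketched as follows. This is a generic illustration, not the exact specification used above: it assumes a single hypothetical regressor (the expense ratio), runs one cross-sectional regression per period, and then averages the slopes over time, with a t-statistic based on the time-series variation of the slopes. All names and data are made up.

```python
import math


def cross_sectional_slope(x, y):
    """OLS slope of y on x (with intercept) for one period's cross-section."""
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    return sxy / sxx


def fama_macbeth(periods):
    """periods: list of (expenses, performance) cross-sections, one per period.

    Returns the time-series mean of the period slopes and its t-statistic.
    """
    slopes = [cross_sectional_slope(x, y) for x, y in periods]
    count = len(slopes)
    mean_slope = sum(slopes) / count
    variance = sum((s - mean_slope) ** 2 for s in slopes) / (count - 1)
    t_stat = mean_slope / math.sqrt(variance / count)
    return mean_slope, t_stat


# Made-up data: four funds observed in three periods.
expenses = [0.5, 1.0, 1.5, 2.0]
periods = [
    (expenses, [0.80, 0.60, 0.40, 0.20]),   # period slope of -0.4
    (expenses, [0.70, 0.40, 0.10, -0.20]),  # period slope of -0.6
    (expenses, [0.75, 0.50, 0.25, 0.00]),   # period slope of -0.5
]
mean_slope, t_stat = fama_macbeth(periods)
```

Because inference relies on the variation of the period-by-period slopes rather than on pooled residuals, correlation across funds within a period does not understate the standard error.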

We also test whether recessions influence our results about the efficiency of mutual funds. For this purpose, we use a recession variable that is equal to one if the economy on a given day is in a recession as defined by the OECD, and zero otherwise. We include this recession indicator in equation (4) and as an input in the DEA model (equation (11)). We find very similar results to our previous ones across countries. The recession indicator is significantly positive for all countries using equation (4) but does not significantly change the coefficient estimates of the expense variable (Table 5). For the DEA model, the inclusion of the recession variable only produces slight variation in the percentage of efficient funds (Table 6) and the efficiency scores (Table 7). Thus, accounting for recessions does not lead to significant differences in mutual fund efficiency across countries.

4.5 Limitations

The results of this research, as in all studies that use the DEA methodology, should be interpreted with caution. In fact, DEA presents some limitations that might create biased results. The DEA model is highly sensitive to data errors and outliers and takes for granted that the inputs and outputs are free of errors. An issue to consider is that the sample should include a minimum number of funds to obtain reliable estimations. The main limitation of DEA, however, is that it can show how a unit is doing in comparison to its peers but not in comparison to a theoretical maximum. In this sense, Miller and Noulas (1996) point out that examining whether hedge funds are inefficient compared to others does not result in a maximum output. Another problem of DEA is the ranking of units based on efficiency scores from different DEA models. Ranking units is a difficult task when several markets are considered, as the inputs of the models are affected by different macroeconomic situations. Another nontrivial limitation of using DEA models to measure efficiency is that DEA does not differentiate among efficient units. Additionally, outside influences are not considered when estimating DEA models, as efficiency scores are estimated only in comparison to peers in the sample. Taylor and Harris (2004) state that a problem with DEA is that it measures relative efficiency instead of absolute efficiency, as the model assumes that each input and output variable is considered identical.

5 Conclusion

This paper examines the market efficiency of the mutual fund industry around the world in short-term mutual fund performance. Employing a unique database of worldwide domestic equity funds, the paper uses parametric and non-parametric evaluation approaches where a relation between cost (input variables) and benefit (output measure) is established.

The results show a negative and statistically significant relationship between expenses and risk-adjusted performance across countries after controlling for the effect of volatility, age, net inflow, and fund size. This result indicates that higher expenses are associated with poorer performance. The study reexamines this negative relation, which is widely documented in prior literature, with a non-parametric approach (DEA). The advantages of the DEA methodology are that it solves the benchmark problems in multifactor models and helps determine the causes of inefficiency. The DEA technique can consider different factors in the examination of fund performance. It can use different inputs and outputs and does not assume a functional form in the relationship between them. In contrast to our previous result, the use of DEA shows strong evidence that equity mutual funds around the world are approximately mean–variance efficient, which means that they are close to the mean–variance efficient frontier. Thus, DEA confirms the mean–variance efficiency theory. We also show the areas of operational inefficiency in which funds can improve their performance. Turnover and loads present larger slacks, suggesting that funds could improve their performance in these respects.

Our results have important implications. First, there is a strong incentive to increase fees among funds in order to improve performance. Thus, our results support the agency theory explanations. Second, our results have practical relevance for potential investors, as they might consider some of the fund’s characteristics in their investment decisions.