1 Introduction

The mean-variance analysis derived by Markowitz (1952) is a milestone in modern finance theory for optimal portfolio construction, asset allocation and investment diversification. According to his theory, the investor selects an optimal portfolio, depending on his level of risk aversion, on the Markowitz efficient frontier, i.e. the set of efficient portfolios with minimum risk for a given level of the average portfolio return. Two mean-variance efficient portfolios play an important role in asset allocation: the global minimum variance portfolio, i.e. the efficient portfolio that corresponds to the fully risk-averse investor, and the Sharpe ratio optimal portfolio, i.e. the portfolio that corresponds to the tangency point between the efficient frontier and a line drawn from the origin when there is no riskless asset or when the portfolio analysis is based on excess returns.

Despite the success of the efficient frontier as a conceptual framework, its practical implementation leads to unreliable optimal portfolios (e.g., Jobson and Korkie 1981; Michaud 1989) because of estimation errors and the uncertainty about the expected returns, variabilities and correlations among returns. In practice, the parameters of the efficient frontier are unknown, so researchers replace them with the sample mean and covariance matrix and solve the sample approximation problem instead. This first source of estimation errors is analyzed by Xu and Zhang (2012), Bai et al. (2008) and Leung et al. (2012). Xu and Zhang (2012) show that, under some mild conditions, the optimal solution of the sample mean-variance problem converges to its true counterpart at an exponential rate, while Bai et al. (2008) and Leung et al. (2012) measure the overprediction problem for plug-in allocation when the sample size ratio is large and propose a method to improve optimal portfolio estimation. The second and third sources of estimation errors are statistical and refer to the econometric analysis of the weights and characteristics of the optimal portfolios in-sample and out-of-sample respectively. The second, i.e. the finite sample distributional properties of the sample efficient frontier, is studied by Okhrin and Schmid (2006), Bodnar and Schmid (2009, 2011) and Knight and Satchell (2010), among others.
Okhrin and Schmid (2006), Bodnar and Schmid (2009, 2011) and Knight and Satchell (2010) prove several distributional properties of the global minimum variance portfolio weights, the estimated portfolio return and variance, and other summary measures of optimal investment for benchmarked portfolios respectively, assuming normally distributed returns, whereas Bodnar and Schmid (2008a) and Bodnar and Zabolotskyy (2010) derive a test for the weights, and asymptotic distributions of the weights and sample characteristics, of this optimal portfolio in elliptical and conditionally heteroscedastic elliptical models respectively. However, despite the important role of the Sharpe ratio optimal portfolio in asset allocation and asset pricing tests (e.g., MacKinley and Pastor 2000), the literature does not provide a formal procedure to construct tests and confidence intervals for the Sharpe ratio portfolio weights and characteristics. Britten-Jones (1999) presents a popular test for hypotheses about the weights of the Sharpe ratio portfolio, implemented using standard OLS regression procedures, but does not provide confidence intervals. Regarding the finite sample distributional properties of the Sharpe ratio weights and characteristics, Okhrin and Schmid (2006) show that the moments of order greater than or equal to one of the Sharpe ratio portfolio weights do not exist, concluding (page 247) that “the use of the common estimator leads to untreatable results”, and Bodnar and Schmid (2008b) prove that neither the first moment of the tangency portfolio return estimate nor that of the variance estimate exists. Finally, despite its key role in portfolio management, the third source of estimation errors has rarely been studied by academics.
Focusing on the out-of-sample performance characteristics of optimal portfolios, Michaud (1998) proposes, and Michaud and Michaud (2008) among others develop, a group of Resampled Efficient™ techniques to compute the optimal portfolio and define trading and monitoring rules. The pros and cons of Michaud’s approach have been studied in several papers, such as Scherer (2002), Markowitz and Usmen (2003), Harvey et al. (2010) and Becker et al. (2010).

In this paper we propose a bootstrap resampling methodology to estimate, without distributional assumptions on the asset returns, the densities and the required confidence intervals of weights and characteristics for any mean-variance efficient portfolio (e.g. the global minimum variance and Sharpe ratio portfolios), the sample efficient frontier, the confidence region of the efficient frontier in the mean-variance space and the prediction densities of the future optimal portfolio returns. Our proposal has several advantages over the classical mean-variance estimation and Resampled Efficient™ techniques. First, our procedure is based on a statistical model, so the optimal (in-sample) portfolios incorporate the investor’s knowledge about the most accurate financial model of returns. Moreover, it can be easily modified to focus on out-of-sample efficient portfolios and to construct predictive intervals for their weights and sample characteristics from the predictive series of returns. Therefore, the second and third sources of estimation errors can be measured and controlled. Second, the optimal portfolios are obtained by solving a sample mean-variance problem that can be subject, as in many cases of practical interest, to a budget constraint (sum of the proportions of invested wealth equal to one), no short-selling (non-negative proportions of invested wealth), trading costs and other linear constraints. Finally, the method is very easy to implement and it helps to describe multivariate features of optimal portfolio returns that would be missed by the standard marginal normal procedures. Moreover, it does not rely on distributional assumptions and incorporates the variability due to parameter estimation in the bootstrap predictive intervals.

To illustrate the properties and limits of this alternative approach, we carry out an extensive simulation study to compare the finite sample behavior of the bootstrap intervals with the empirical and the alternative classical ones. In addition, we apply the procedure to construct the confidence region for an international efficient frontier. It is shown that the proposed bootstrap methodology provides confidence intervals that perform as well as the classical ones (see Okhrin and Schmid 2006; Bodnar and Schmid 2009), and marginal intervals based on multivariate estimation that outperform these classical confidence intervals. Moreover, it also provides confidence intervals for mean-variance efficient portfolios other than the global minimum variance one (e.g. the Sharpe ratio portfolio), i.e. it succeeds where the classical distribution theory fails.

The rest of the paper is organized as follows. Section 2 outlines the proposed bootstrap resampling procedure. Its finite sample behavior is analyzed in Sect. 3, which reports the results of an extensive Monte Carlo simulation study. Section 4 illustrates the method with real financial data and Sect. 5 concludes the paper with a summary.

2 Bootstrap portfolio optimization

In this section, we describe the bootstrap procedure to obtain the densities and confidence intervals for the weights and characteristics of any mean-variance efficient portfolio, in particular the global minimum variance (GMV) and Sharpe ratio (SR) optimal portfolios, and for the slope coefficient of the sample efficient frontier. We consider an investor who holds a portfolio consisting of k assets and denote by \(\mathbf{x}_\mathrm{i}=(\mathrm{x}_{\mathrm{i}1},\ldots ,\mathrm{x}_\mathrm{ik})^\mathrm{t}\), i = 1, ..., n, a sample of the asset returns. To reflect the lack of modelling of returns in the current mean-variance literature, for comparative purposes and without loss of generality, let us consider the simplest model of returns

$$\begin{aligned} \mathbf{x }_\mathrm{i} ={\varvec{\upmu }} +\mathbf{u }_\mathrm{i} \end{aligned}$$
(1)

where \(\mathbf{u}_\mathrm{i}\) is a sequence of zero-mean independent k-vector variables with common distribution function \({\hbox {F}}_\mathbf{u}\). Let us also assume that the second moment of u, \({\varvec{\Sigma }}\), exists and that \({\varvec{\upmu }}\) and \({\varvec{\Sigma }} \) are unknown parameters. Freedman (1981) provided the theoretical support for bootstrap replication in regression models and Pascual et al. (2004) for the bootstrap strategy to obtain prediction intervals for autoregressive integrated moving average (ARIMA) processes; model (1) is thus one choice among other, more suitable, regression models or ARIMA processes on which the following mean-variance application could be built.

For the sample approximation of the expected quadratic utility criterion, the expected utility (EU) portfolio weights, \(\mathbf{w}_\mathrm{EU}\), are chosen to maximize \(\mathbf{w}^\mathrm{t}\hat{\varvec{\upmu }}-(\alpha /2){\mathbf {w}}^\mathrm{t}\hat{\varvec{\Sigma }}{\mathbf {w}}\) subject to \(\mathbf{1}^\mathrm{t}{\mathbf {w}}={1}\), where 1 denotes the k-vector of ones, \(\alpha >0\) describes the risk aversion of the investor, and \(\hat{\varvec{\upmu }}\) and \(\hat{\varvec{\Sigma }}\) are the usual sample estimates of \({\varvec{\upmu }}\) and \({\varvec{\Sigma }}\):

$$\begin{aligned} \hat{{\varvec{\upmu }}}=\frac{{1}}{\mathrm{n}}\sum \limits _{\mathrm{i}=\mathrm{1}}^\mathrm{n} {{\mathbf{x}}_\mathrm{i} } \quad \mathrm{and}\quad \hat{{\varvec{\Sigma }} }=\frac{\mathrm{1}}{\mathrm{n-1}}\sum \limits _{\mathrm{i}=\mathrm{1}}^\mathrm{n} {({{\mathbf{x}}}_\mathrm{i} -\hat{\varvec{\upmu }})(\mathbf{x}_\mathrm{i} -\hat{\varvec{\upmu }}} )^\mathrm{t} \end{aligned}$$
(2)

Thus, the EU and GMV portfolio weights \((\upalpha \rightarrow \infty )\) are given by

$$\begin{aligned} \mathbf{w}_{\mathrm{EU}} =\frac{\hat{{\varvec{\Sigma }}}^{-1}{\mathbf{1}}}{\mathbf{1}^\mathrm{t}\hat{{\varvec{\Sigma }}}^{-1}{} \mathbf{1}}+ {\upalpha }^{-1}{} \mathbf{R} {\hat{{\varvec{\upmu }}}}\quad \hbox {and} \quad \mathbf{w}_{\mathrm{GMV}} =\frac{\hat{{\varvec{\Sigma }} }^{-1}{} \mathbf{1}}{\mathbf{1}^\mathrm{t}\hat{{\varvec{\Sigma }}}^{-1}\mathbf{1}}, \end{aligned}$$
(3)

the sample mean \((\mathrm{R}_{\mathrm{GMV}})\) and variance \((\mathrm{V}_{\mathrm{GMV}} )\) of GMV portfolio are given by

$$\begin{aligned} \mathrm{R}_{\mathrm{GMV}} =\frac{\mathbf{1}^\mathrm{t}\hat{\varvec{\Sigma }}^{-1}\hat{{{\varvec{\mu }}}}}{\mathbf{1}^\mathrm{t}\hat{{\varvec{\Sigma }}}^{-1}{} \mathbf{1}}\quad \hbox {and}\quad \mathrm{V}_{\mathrm{GMV}} =\frac{\mathrm{1}}{\mathbf{1}^\mathrm{t}\hat{{\varvec{\Sigma }} }^{-1}{} \mathbf{1}}, \end{aligned}$$
(4)

and the sample efficient frontier in the mean-variance space (see Merton 1972) is the upper part of the parabola given by

$$\begin{aligned} \mathrm{(R-R}_{\mathrm{GMV}} )^2=\mathrm{S(V}-\mathrm{V}_{\mathrm{GMV}} ) \quad \mathrm{with} \quad {\mathrm{S}}=\hat{{\varvec{\upmu }}}^\mathrm{t}{} \mathbf{R} \hat{\varvec{\upmu }} \end{aligned}$$
(5)

where the quantity S denotes the slope coefficient and \(\mathbf{R}=\hat{\varvec{\Sigma }}^{-1}-\frac{\hat{\varvec{\Sigma }}^{-1}\mathbf{11}^\mathrm{t}\hat{{\varvec{\Sigma }}}^{-1}}{\mathbf{1}^\mathrm{t}\hat{\varvec{\Sigma }}^{-1}{} \mathbf{1}}\).
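Expressions (2) to (5) can be evaluated numerically along the following lines. This is only a minimal sketch in Python with NumPy: the function names `sample_moments`, `gmv_and_frontier` and `eu_weights`, the simulated returns and the risk aversion value of 5 are illustrative assumptions, not part of the procedure described in the paper.

```python
import numpy as np

def sample_moments(x):
    """Sample mean and covariance (divisor n - 1) of an (n, k) return matrix, as in (2)."""
    mu_hat = x.mean(axis=0)
    sigma_hat = np.cov(x, rowvar=False, ddof=1)
    return mu_hat, sigma_hat

def gmv_and_frontier(mu_hat, sigma_hat):
    """GMV weights, R_GMV, V_GMV, the matrix R and the slope S from (3) to (5)."""
    ones = np.ones(len(mu_hat))
    sigma_inv = np.linalg.inv(sigma_hat)
    sigma_inv_ones = sigma_inv @ ones          # Sigma^{-1} 1
    denom = ones @ sigma_inv_ones              # 1' Sigma^{-1} 1
    w_gmv = sigma_inv_ones / denom
    r_gmv = w_gmv @ mu_hat                     # equals 1' Sigma^{-1} mu / denom
    v_gmv = 1.0 / denom
    r_mat = sigma_inv - np.outer(sigma_inv_ones, sigma_inv_ones) / denom
    s = mu_hat @ r_mat @ mu_hat                # slope coefficient S
    return w_gmv, r_gmv, v_gmv, r_mat, s

def eu_weights(w_gmv, r_mat, mu_hat, alpha):
    """Expected utility weights from (3) for risk aversion alpha > 0."""
    return w_gmv + r_mat @ mu_hat / alpha

# Illustrative data only: 500 simulated returns on 4 hypothetical assets.
rng = np.random.default_rng(0)
x = rng.normal(loc=0.01, scale=0.05, size=(500, 4))
mu_hat, sigma_hat = sample_moments(x)
w_gmv, r_gmv, v_gmv, r_mat, s = gmv_and_frontier(mu_hat, sigma_hat)
w_eu = eu_weights(w_gmv, r_mat, mu_hat, alpha=5.0)
```

A useful sanity check is that any EU portfolio should lie on the parabola (5): its mean and variance satisfy \((\mathrm{R}-\mathrm{R}_{\mathrm{GMV}})^2=\mathrm{S}(\mathrm{V}-\mathrm{V}_{\mathrm{GMV}})\) up to rounding error.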

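When there is no riskless asset, or when the portfolio analysis is based on excess returns, the tangency portfolio (tangency point between the efficient frontier and a line drawn from the origin) coincides with the Sharpe ratio optimal portfolio, i.e. the solution of the problem of maximizing \({\mathbf{w}^\mathrm{t}\hat{\varvec{\upmu }}}/{\sqrt{\mathbf{w}^\mathrm{t}\hat{\varvec{\Sigma }}{} \mathbf{w}} }\) subject to \(\mathbf{1}^\mathrm{t}{} \mathbf{w}=\mathrm{1}\). Thus, the weights and characteristics of the SR portfolio are given by

$$\begin{aligned} \mathbf{w}_{\mathrm{SR}} =\frac{\hat{\varvec{\Sigma }}^{-1}\hat{\varvec{\upmu }}}{\mathbf{1}^\mathrm{t}\hat{\varvec{\Sigma }}^{-1}\hat{\varvec{\upmu }}},\quad \mathrm{R}_{\mathrm{SR}} =\frac{\hat{\varvec{\upmu }}^\mathrm{t}\hat{\varvec{\Sigma }}^{-1}\hat{\varvec{\upmu }}}{\mathbf{1}^\mathrm{t}\hat{\varvec{\Sigma }}^{-1}\hat{\varvec{\upmu }}}\quad \hbox {and}\quad \mathrm{V}_{\mathrm{SR}} =\frac{\hat{\varvec{\upmu }}^\mathrm{t}\hat{\varvec{\Sigma }}^{-1}\hat{\varvec{\upmu }}}{(\mathbf{1}^\mathrm{t}\hat{\varvec{\Sigma }}^{-1}\hat{\varvec{\upmu }})^{2}} \end{aligned}$$
(6)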
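The SR portfolio can be illustrated with a short Python sketch. It assumes the standard closed form of the maximization problem stated above, \(\mathbf{w}_{\mathrm{SR}}=\hat{\varvec{\Sigma }}^{-1}\hat{\varvec{\upmu }}/(\mathbf{1}^\mathrm{t}\hat{\varvec{\Sigma }}^{-1}\hat{\varvec{\upmu }})\), which attains the maximum Sharpe ratio only when \(\mathbf{1}^\mathrm{t}\hat{\varvec{\Sigma }}^{-1}\hat{\varvec{\upmu }}>0\). The function name `sr_portfolio` and the numerical values of the mean vector and covariance matrix are hypothetical.

```python
import numpy as np

def sr_portfolio(mu_hat, sigma_hat):
    """Weights, mean and variance of the tangency (Sharpe ratio) portfolio.

    Assumes 1' Sigma^{-1} mu > 0, so that the normalised vector
    Sigma^{-1} mu maximises the Sharpe ratio under the budget constraint.
    """
    sigma_inv_mu = np.linalg.solve(sigma_hat, mu_hat)   # Sigma^{-1} mu
    denom = np.ones(len(mu_hat)) @ sigma_inv_mu         # 1' Sigma^{-1} mu
    w_sr = sigma_inv_mu / denom
    r_sr = w_sr @ mu_hat
    v_sr = w_sr @ sigma_hat @ w_sr
    return w_sr, r_sr, v_sr

# Hypothetical inputs for three assets (illustration only).
mu = np.array([0.010, 0.020, 0.015])
sigma = np.array([[0.0400, 0.0100, 0.0000],
                  [0.0100, 0.0900, 0.0200],
                  [0.0000, 0.0200, 0.0625]])
w_sr, r_sr, v_sr = sr_portfolio(mu, sigma)
sharpe_sr = r_sr / np.sqrt(v_sr)
```

Under the stated assumption, `sharpe_sr` equals \(\sqrt{\hat{\varvec{\upmu }}^\mathrm{t}\hat{\varvec{\Sigma }}^{-1}\hat{\varvec{\upmu }}}\), and no other portfolio satisfying the budget constraint has a larger Sharpe ratio.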

Our proposal to obtain bootstrap replicates of the weights and characteristics of the GMV and SR portfolios and the slope coefficient is as follows.

  • Step 1. Compute the residuals of (1) \(\hat{\mathbf{u}}_\mathrm{i} =\mathbf{x}_\mathrm{i} -\hat{{\varvec{\upmu }}}\) and let \({\hat{\mathrm{F}}}_{\mathbf{u}} \) be the empirical distribution function of the residuals.

  • Step 2. Generate the bootstrap series by the following equation

    $$\begin{aligned} \mathbf{x}_\mathrm{i}^{{*}} =\hat{{\varvec{\upmu }}}+\hat{\mathbf{u}}_\mathrm{i}^\mathrm{{*}} \end{aligned}$$
    (7)

    where \(\hat{\mathbf{u}}_\mathrm{i}^{{*}} \), i = 1, ..., n are random draws from \({\hat{{\mathbf {F}}}}_{{\mathbf {u}}} \).

  • Step 3. Calculate \(\hat{{\varvec{\upmu }}}^{*} \) and \(\hat{{\varvec{\Sigma }}}^{*} \) for the bootstrap series (\(\mathbf{x}_\mathrm{i}^{{*}} )\), and compute the GMV and SR portfolio weights, \(\mathbf{w}_{\mathrm{GMV}}^{{*}} \) and \({\mathbf{w}_{\mathrm{SR}}}^{{*}} \), by expressions (3) and (6), and the characteristics of these optimal portfolios and the slope coefficient, \(\mathrm{R}_{\mathrm{GMV}}^{{*}}\), \(\mathrm{V}_{\mathrm{GMV}}^{*} \), \(\mathrm{R}_{\mathrm{SR}}^{{*}}\), \(\mathrm{V}_{\mathrm{SR}}^{*} \) and \(\mathrm{S}^{{*}}\), by expressions (4), (5) and (6).

  • Step 4. Obtain a bootstrap future value of asset returns, \(\mathbf{x}_{\mathbf{n+1}}^{*} \) by the equation

    $$\begin{aligned} \mathbf{x}_{\mathbf{n+1}}^{*} =\hat{{\varvec{\upmu }}}^{{*}} +\hat{\mathbf{u}}_{\mathrm{n+1}}^{*} \end{aligned}$$
    (8)

    where \(\hat{\mathbf{u}}_{\mathrm{n+1}}^{*} \) is a random draw from \({\hat{\mathbf{F}}}_{\mathbf{u}} \).

  • Step 5. Repeat the last four steps B times to obtain a set of B replicates for \(\mathbf{w}^{{*}} \), \(\mathrm{R}_{\mathrm{GMV}}^{{*}}\), \(\mathrm{V}_{\mathrm{GMV}}^{{*}}\), \(\mathrm{S}^{{*}}\) and \(\mathbf{x}_{\mathbf{n+1}}^{{*}}\).
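Steps 1 to 5 can be sketched in Python as follows. This is an illustrative reading of the procedure rather than a reference implementation: the function names `portfolio_stats` and `bootstrap_replicates`, the seed and the simulated data are assumptions, and the SR quantities use the standard closed form \(\hat{\varvec{\Sigma }}^{-1}\hat{\varvec{\upmu }}/(\mathbf{1}^\mathrm{t}\hat{\varvec{\Sigma }}^{-1}\hat{\varvec{\upmu }})\).

```python
import numpy as np

def portfolio_stats(x):
    """GMV and SR weights and characteristics plus the slope S for an (n, k) sample."""
    mu = x.mean(axis=0)
    sigma = np.cov(x, rowvar=False, ddof=1)
    ones = np.ones(x.shape[1])
    si_ones = np.linalg.solve(sigma, ones)   # Sigma^{-1} 1
    si_mu = np.linalg.solve(sigma, mu)       # Sigma^{-1} mu
    a = ones @ si_ones                       # 1' Sigma^{-1} 1
    b = ones @ si_mu                         # 1' Sigma^{-1} mu
    c = mu @ si_mu                           # mu' Sigma^{-1} mu
    return {
        "w_gmv": si_ones / a, "r_gmv": b / a, "v_gmv": 1.0 / a,
        "w_sr": si_mu / b, "r_sr": c / b, "v_sr": c / b**2,
        "s": c - b**2 / a,                   # slope: mu' R mu
    }

def bootstrap_replicates(x, n_boot=999, seed=0):
    """Residual bootstrap for the model x_i = mu + u_i (Steps 1 to 5)."""
    rng = np.random.default_rng(seed)
    n = x.shape[0]
    mu_hat = x.mean(axis=0)
    resid = x - mu_hat                                   # Step 1: residuals
    reps = []
    for _ in range(n_boot):
        idx = rng.integers(0, n, size=n)                 # Step 2: resample residuals
        x_star = mu_hat + resid[idx]
        stats = portfolio_stats(x_star)                  # Step 3: bootstrap statistics
        mu_star = x_star.mean(axis=0)
        stats["x_next"] = mu_star + resid[rng.integers(0, n)]   # Step 4: future value
        reps.append(stats)
    # Step 5: stack the B replicates of every quantity into arrays
    return {key: np.array([r[key] for r in reps]) for key in reps[0]}

# Illustrative data only: 200 simulated returns on 3 hypothetical assets.
rng = np.random.default_rng(0)
x = rng.normal(loc=0.01, scale=0.05, size=(200, 3))
reps = bootstrap_replicates(x, n_boot=199, seed=1)
```

Each entry of `reps` holds the B draws of one quantity, so `reps["w_gmv"]` has one row per replicate and one column per asset.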

From these B independent draws of the joint distribution of \({\mathbf {w}}_{\mathrm{GMV}}^\mathrm{t} \), \({\mathbf {w}}_{\mathrm{SR}}^\mathrm{t} \), \(\mathrm{R}_{\mathrm{GMV}}\), \(\mathrm{V}_{\mathrm{GMV}}\), \(\mathrm{R}_{\mathrm{SR}}\), \(\mathrm{V}_{\mathrm{SR}}\) and \(\mathrm{S}\) the 100(\(1-\upalpha \)) % marginal confidence interval of each variable is given by

$$\begin{aligned} (\mathrm{L}_\mathrm{B}^{{*}} ,\mathrm{U}_\mathrm{B}^{{*}} )=\left( {\mathrm{Q}_\mathrm{B}^{{*}} \left( {\frac{{\upalpha }}{\mathrm{2}}}\right) ,\mathrm{Q}_\mathrm{B}^{{*}} \left( {1-\frac{{\upalpha }}{\mathrm{2}}}\right) }\right) \end{aligned}$$
(9)

where \(\mathrm{Q}_\mathrm{B}^{{*}} =\mathrm{G}_\mathrm{B}^{{*^{ - 1}}}\) is the quantile function, \(\mathrm{G}_\mathrm{B}^{{*}} \mathrm{(h)}={{\# }( {\mathrm{z}^{{{*} (\mathrm{b})}}\le \mathrm{h}})} / \mathrm{B}\), with \(\mathrm{z}^{{{*} (\mathrm{b})}}\in \{ \mathrm{w}_{\mathrm{GMVj}}^{{*}}, \mathrm{w}_{\mathrm{SRj}}^{{*}} , \mathrm{j}=\mathrm{1,\ldots ,k}, \mathrm{R}_{\mathrm{GMV}}^{{*}} \mathrm{,V}_{\mathrm{GMV}}^{{*}} , \mathrm{S}^{{*}} \mathrm{, R}_{\mathrm{SR}}^{{*}}, \mathrm{V}_{\mathrm{SR}}^{{*}}\}\), is the Monte Carlo estimate of each marginal distribution function, and #() denotes the counting function, i.e. the number of replicates that satisfy the condition.
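The percentile interval (9) amounts to inverting the empirical distribution function of the B replicates. A minimal Python sketch, in which the function name `percentile_interval` and the simulated replicates are illustrative assumptions, is:

```python
import numpy as np

def percentile_interval(z_star, level=0.95):
    """Interval (9): empirical alpha/2 and 1 - alpha/2 quantiles of the replicates."""
    alpha = 1.0 - level
    z = np.sort(np.asarray(z_star, dtype=float))
    b = len(z)

    def quantile(p):
        # Inverse of the step function G(h) = #(z <= h) / B:
        # the smallest order statistic z with G(z) >= p.
        idx = int(np.ceil(p * b)) - 1
        return z[min(max(idx, 0), b - 1)]

    return quantile(alpha / 2), quantile(1 - alpha / 2)

# Illustrative replicates only: B = 999 draws of a hypothetical statistic.
rng = np.random.default_rng(0)
z_star = rng.normal(size=999)
lower, upper = percentile_interval(z_star, level=0.95)
```

By construction, about 95 % of the replicates fall between `lower` and `upper`.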

In addition, to obtain estimates of the efficient portfolios that are “resistant” to the most extreme portfolio observations (outliers), together with the 100(\(1-\upalpha \)) % confidence region of the efficient frontier, we suggest applying the minimum volume ellipsoid (MVE) method introduced by Rousseeuw (1985) to the vectors \(({{\mathrm{R}}_{\mathrm{GMV}}^{{*} (\mathrm{b})} ,{\sqrt{\mathrm{V}}}_{\mathrm{GMV}}^{{{*} (\mathrm{b})}} ,{\mathrm{R}}_{\mathrm{SR}}^{{{*} (\mathrm{b})}} ,{\sqrt{\mathrm{V}}} _{\mathrm{SR}}^{{*} \mathrm{(b)}}})^\mathrm{t}\), b = 1,..., B, of draws from the joint distribution of the mean-variance characteristics of the GMV and SR portfolios (i.e., the characteristics that define the efficient frontier). The MVE method examines subsamples of approximately 50 % of the observations to find, using the Mahalanobis distance, the subset that minimizes the volume of the ellipsoid associated with the covariance matrix of the subsample. The MVE estimate is the center and covariance of this minimum subsample, and it is commonly used to compute the Mahalanobis distances of the sample observations and thus identify outliers.
This multivariate approach provides an easy way of obtaining an estimate that is resistant to outliers and of monitoring the whole efficient frontier, as follows: (i) compute the MVE from \(({\mathrm{R}}_{\mathrm{GMV}}^{*(\mathrm{b})},\,\sqrt{{\mathrm{V}}} _{\mathrm{GMV}}^{*(\mathrm{b})},\,\mathrm{R}_{{\mathrm{SR}}}^{*(\mathrm{b})},\,\sqrt{{\mathrm{V}}} _\mathrm{SR}^{*(\mathrm{b})})^\mathrm{t}\), b = 1,..., B; (ii) calculate the Mahalanobis distances of the bootstrap sample; (iii) define the MVE estimate of the efficient portfolios as \((\mathbf{w}_{\mathrm{GMV}}^{*(\mathrm{b}_{\mathrm{m}})})^\mathrm{t}\), \((\mathbf{w}_{\mathrm{SR}}^{*(\mathrm{b}_{\mathrm{m}})})^\mathrm{t}\) and \((\mathrm{R}_{\mathrm{GMV}}^{*(\mathrm{b}_{\mathrm{m}})},\,\sqrt{\mathrm{V}}_{\mathrm{GMV}}^{*(\mathrm{b}_{\mathrm{m}})},\,\mathrm{S}^{*(\mathrm{b}_{\mathrm{m}})},\,\mathrm{R}_{\mathrm{SR}}^{*(\mathrm{b}_{\mathrm{m}})},\,\sqrt{\mathrm{V}}_{\mathrm{SR}}^{*(\mathrm{b}_{\mathrm{m}})})^{\mathrm{t}}\), with b\(_{\mathrm{m}}\) the bootstrap observation associated with the minimum of the Mahalanobis distances; and (iv) compute the 100(\(1-\upalpha \)) % MVE region of the efficient frontier and the marginal MVE intervals as the region/interval delimited by the minimum and maximum values of the observations whose Mahalanobis distances are smaller than the 100(\(1-\upalpha \)) % quantile of the Mahalanobis distances.
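Steps (i) to (iv) can be sketched in Python as below. The exact MVE is expensive to compute, so the sketch uses a random subsampling approximation in the spirit of Rousseeuw's resampling algorithm: ellipsoids built from subsets of p + 1 points are inflated until they cover about half of the data, and the one with the smallest volume is retained. This approximation, the function names `mve_resampling` and `mve_summary`, the number of trials and the simulated draws are all assumptions made for illustration.

```python
import numpy as np

def mve_resampling(y, n_trials=500, seed=0):
    """Approximate MVE center and shape matrix by random (p + 1)-point subsets."""
    rng = np.random.default_rng(seed)
    n, p = y.shape
    h = (n + p + 1) // 2                          # about half of the observations
    best_log_vol, best = np.inf, None
    for _ in range(n_trials):
        idx = rng.choice(n, size=p + 1, replace=False)
        center = y[idx].mean(axis=0)
        cov = np.cov(y[idx], rowvar=False)
        sign, logdet = np.linalg.slogdet(cov)
        if sign <= 0:
            continue                              # skip degenerate subsets
        diff = y - center
        d2 = np.einsum("ij,ij->i", diff @ np.linalg.inv(cov), diff)
        m2 = np.partition(d2, h - 1)[h - 1]       # inflation needed to cover h points
        log_vol = 0.5 * (logdet + p * np.log(m2)) # log volume up to a constant
        if log_vol < best_log_vol:
            best_log_vol, best = log_vol, (center, m2 * cov)
    return best

def mve_summary(y, level=0.95, n_trials=500, seed=0):
    """Index b_m of the most central draw and marginal MVE limits at the given level."""
    center, shape = mve_resampling(y, n_trials=n_trials, seed=seed)
    diff = y - center
    d2 = np.einsum("ij,ij->i", diff @ np.linalg.inv(shape), diff)
    b_m = int(np.argmin(d2))                      # draw with minimum distance
    inside = d2 < np.quantile(d2, level)          # draws inside the MVE region
    return b_m, inside, y[inside].min(axis=0), y[inside].max(axis=0)

# Illustrative draws only: 400 regular points plus 10 gross outliers in 4 dimensions.
rng = np.random.default_rng(0)
y = np.vstack([rng.normal(size=(400, 4)), 1000.0 + rng.normal(size=(10, 4))])
b_m, inside, lower, upper = mve_summary(y, level=0.95)
```

In the bootstrap setting, `y` would hold the B draws of the vector of GMV and SR means and standard deviations, `b_m` would select the replicate used as the MVE estimate, and the mask `inside` would be applied to every other replicated quantity (weights, slope) to obtain its marginal MVE interval.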

Finally, given the draws of \({\mathbf{x}_{{\mathbf{n+1}}}^{*}}\) and the bootstrap estimate of the efficient portfolio weights, \(\mathbf{w}^{{*}(\mathbf{b}_{\mathbf{m}})} \), predictive estimate (from the median), confidence interval and p-value of the predictive efficient portfolio returns could be similarly computed from the draws of \(\mathrm{p} =(\mathbf{w}^{{*} (\mathrm{b}_{\mathbf{m}})})^\mathrm{t} \mathbf{x}_{\mathrm{n}+\mathrm{1}}^{{*}} \).
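The predictive summaries can be computed directly from the draws of \(\mathrm{p}\). In the Python sketch below, the function name `predictive_summary`, the hypothetical weights and the simulated future returns are assumptions; in particular, the two-sided p-value for a zero predictive return is one possible definition, not necessarily the one used in the paper.

```python
import numpy as np

def predictive_summary(w, x_next, level=0.95):
    """Median, percentile interval and a two-sided p-value for p = w' x_{n+1}."""
    p = np.asarray(x_next) @ np.asarray(w)        # one predictive return per draw
    alpha = 1.0 - level
    lower, upper = np.quantile(p, [alpha / 2, 1 - alpha / 2])
    # Assumed definition: twice the smaller tail mass around zero, capped at one.
    p_value = min(1.0, 2.0 * min(np.mean(p <= 0), np.mean(p >= 0)))
    return np.median(p), (lower, upper), p_value

# Illustrative inputs only: fixed hypothetical weights and 999 simulated future returns.
rng = np.random.default_rng(0)
w_bm = np.array([0.5, 0.3, 0.2])
x_next = rng.normal(loc=0.01, scale=0.04, size=(999, 3))
estimate, (lower, upper), p_value = predictive_summary(w_bm, x_next)
```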

To conclude, it is worth noting that this methodology is derived from the simplest model of returns and can be easily extended, by changing equations (1), (7) and (8), to introduce a more elaborate econometric model for predicting returns. Furthermore, by bootstrapping a future sample of asset returns through equation (8), this in-sample procedure becomes the starting point of a statistical out-of-sample procedure that estimates and tests the efficient frontier from the predictive distribution of returns.

3 Finite sample properties

The finite sample performance of the confidence intervals built by the bootstrap and classical procedures is now analyzed by means of Monte Carlo experiments. We generate series from two base populations of k = 11 assets, with normal and bivariate mixture of normal return distributions. The first is chosen because it is the distribution assumed in the classical results; the second because a bivariate mixture of normal distributions is a flexible distributional assumption (under certain regularity conditions, any probability density can be estimated as a mixture of normal distributions; see e.g. Ghosal and van der Vaart 2001) that reflects the distribution of returns when the sample mixes data from contraction and expansion phases of the market cycle. The mean \({\varvec{\upmu }}\) and covariance matrix \({\varvec{\Sigma }}\) of the normal population and the parameters \(\left( \varvec{\mu } _{0},{\varvec{\Sigma }}_{0}, \mathrm{p}_{0}, {\varvec{\mu }_{1}}, {\varvec{\Sigma }_{1}}\right) ^{\mathrm{t}}\) of the bivariate mixture of normal population, where \(\mathrm{p}_{0}\) represents the probability of a contraction period, have been chosen to resemble the parameters estimated from a real series of monthly financial excess returns (we use monthly data from the Kenneth R. French Data Library for the excess total returns in dollar currency of 11 developed countries: Australia, Belgium, Canada, France, Germany, Italy, Japan, United Kingdom, United States (US), Spain and Switzerland, from January 1977 to December 2006). For the base bivariate mixture of normal population we choose \(\mathrm{p}_{0}=0.18\); additionally, we simulate one more bivariate mixture of normal population of k = 16 assets with the same \(\mathrm{p}_{0}=0.18\) and two more of k = 11 assets with a lower (\(\mathrm{p}_{0}=0.07\)) and a higher (\(\mathrm{p}_{0}=0.40\)) probability of recession data.
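A population of this kind can be simulated as a two-component mixture in which each observation comes from the contraction regime with probability \(\mathrm{p}_{0}\) and from the expansion regime otherwise. The Python sketch below illustrates the idea; the function name `simulate_mixture` and all parameter values are hypothetical and are not the values calibrated in the study.

```python
import numpy as np

def simulate_mixture(n, mu0, sigma0, mu1, sigma1, p0, seed=0):
    """Draw n return vectors from a two-component mixture of normals.

    With probability p0 an observation comes from the contraction regime
    N(mu0, sigma0); otherwise it comes from the expansion regime N(mu1, sigma1).
    """
    rng = np.random.default_rng(seed)
    contraction = rng.random(n) < p0              # regime indicator per observation
    x = rng.multivariate_normal(mu1, sigma1, size=n)
    n0 = int(contraction.sum())
    x[contraction] = rng.multivariate_normal(mu0, sigma0, size=n0)
    return x, contraction

# Hypothetical parameters for k = 3 assets (illustration only).
k = 3
mu0, mu1 = np.full(k, -0.02), np.full(k, 0.01)
sigma0 = 0.0016 * (0.5 * np.eye(k) + 0.5)         # higher variance and correlation
sigma1 = 0.0009 * (0.7 * np.eye(k) + 0.3)
x, contraction = simulate_mixture(2500, mu0, sigma0, mu1, sigma1, p0=0.18)
```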

An important point to consider is the sample size, because it is related to the first source of estimation errors and also to the proportion of recession data in the sample (the larger the sample, the more likely the data are heterogeneous). We use a sample size of n = 2500 observations to guarantee the convergence of the solution of the sample mean-variance problem to its true counterpart; with samples this long, a relatively high p\(_{0}\) can be expected in real-world data. To determine this sample size, we generate 1000 Monte Carlo samples of sizes 500, 1000, 2000, 3000, 5000 and 7500 for each population, calculate the maximum distance between the weights of the exact and sample GMV and SR portfolios, and plot in Fig. 1 the median and 95 % confidence interval for each sample size and population. The graph shows that the sample solution converges from a sample size of n = 2500 for all the populations, except for the SR portfolio in the mixture of normal population with p\(_{0}=0.40\) (convergence achieved at n \(\ge 5000\)). Moreover, the differences are clearly larger in magnitude for the SR portfolios even once convergence is achieved. Table 1 reports the probability that these differences are less than 10 % for GMV portfolios and 25 % for SR portfolios. Notice that for a sample size of n = 2500 this probability is greater than 99.2 % for every GMV portfolio.

Fig. 1 Maximum distance between the weights of exact and sample portfolios

Table 1 Convergence in probability of the solution of the sample mean-variance problem

We compute the marginal 99, 95 and 90 % confidence intervals. For each population, we generate 1000 Monte Carlo samples of size 2500, compute the statistics and obtain the empirical, classical and bootstrap confidence intervals and the marginal MVE interval at 99, 95 and 90 %. For each Monte Carlo sample, classical confidence intervals are computed following Okhrin and Schmid (2006) and Bodnar and Schmid (2009). Bootstrap confidence and MVE intervals are constructed following the proposed method based on B = 499, 999, 2999, 3999 and 4999 replicates. For each Monte Carlo sample, the coverage and the left and right tails of each classical and bootstrap interval are computed as the proportions of replicate observations of the statistic lying inside the interval, to its left and to its right. Then, for each interval, we compute the average and standard error of the interval length, the coverage, and the average proportion of observations lying to the left (below) and to the right (above) across all Monte Carlo samples. Finally, classical and bootstrap intervals are compared in terms of these summary statistics for each simulated population.
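The coverage and tail proportions of a single interval can be computed as in the following Python sketch. It reflects one reading of the description above, in which an interval is evaluated against a set of reference draws of the statistic; the function name `coverage_summary` and the toy numbers are assumptions.

```python
import numpy as np

def coverage_summary(lower, upper, draws):
    """Proportions of draws inside the interval, to its left and to its right."""
    draws = np.asarray(draws, dtype=float)
    left = np.mean(draws < lower)                 # proportion below the interval
    right = np.mean(draws > upper)                # proportion above the interval
    return 1.0 - left - right, left, right

# Toy example: 100 equally spaced reference draws and the interval [5, 94].
draws = np.arange(100)
coverage, left, right = coverage_summary(5, 94, draws)
```

Averaging `coverage`, `left` and `right` over the Monte Carlo samples would give summary statistics of the kind compared between the classical and bootstrap intervals; an interval with very unequal `left` and `right` proportions is biased to one side.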

The results about the classical and bootstrap confidence intervals and the marginal MVE intervals are summarized in three tables. Tables 2 and 3 report the summary statistics: average of coverage, standard error of the coverage and average of the proportion of observations below/above of the minimum, average and maximum of GMV and SR portfolio weights at the 95 % confidence intervals (classical and bootstrap with B = 499, 999 and 2499 replications) and marginal 95 % MVE intervals (B = 999 replications), and the same summary statistics plus the average and standard error of the interval lengths for the sample mean, standard deviations of returns and value-at-risk (VaR) of GMV and SR portfolios, the slope (S) and the Sharpe ratio (IR) at the marginal 95 % confidence intervals (classical and bootstrap with B = 499, 999 and 2499 replications) and marginal 95 % MVE intervals (B = 999 replications) when the simulated returns are normal (Table 2) and bivariate mixture of normal with p\(_{0}=0.18\) (Table 3). In addition, Table 4 shows the summary statistics of the minimum, average and maximum of GMV and SR portfolio weights at the 99 % confidence intervals (classical and bootstrap with B = 999 replications) and marginal 99 % MVE intervals (B = 999 replications), and the summary statistics for the sample mean, standard deviations of returns and value-at-risk (VaR) of GMV and SR portfolios, the slope (S) and the Sharpe ratio (IR) at the marginal 99 % confidence intervals (classical and bootstrap with B = 999 replications) and marginal 99 % MVE intervals (B = 999 replications) for all the simulated populations.

Table 2 Monte Carlo results for the 95 % confidence and marginal MVE intervals in the normal returns population
Table 3 Monte Carlo results for the 95 % confidence and marginal MVE intervals in the bivariate mixture of normal returns population (k = 11, p\(_{0}=0.18\))

The bootstrap and classical confidence intervals have a similar marginal performance for all the statistics of interest in the two base populations. The performance of the bootstrap confidence intervals is never worse than that of the classical approach, and it is achieved with a low number of replications (the coverage increases with the number of replications, but at a very slow rate, in both the normal and mixture of normal populations). In addition, our bootstrap procedure obtains 95 % confidence intervals for the weights and characteristics of both GMV and SR portfolios with a similar degree of accuracy. However, the simulation study also reveals that the degree of accuracy in the confidence interval estimation is not good enough, for either the classical or the bootstrap confidence intervals. Not surprisingly, both have similar, acceptable accuracy when the simulated returns are normal, with average coverages for a 95 % nominal coverage in the range from 79 to 85 %. This accuracy falls to a range from 51 to 82 % for the 95 % nominal coverage when the distribution of returns is the bivariate mixture of normal with p\(_{0}\) = 0.18. Moreover, the bootstrap and classical intervals of GMV and SR weights, and of the sample mean and standard deviation of GMV portfolio returns, are unbiased in the two base populations, whereas the bootstrap and classical intervals of the slope coefficient and the bootstrap intervals of the sample mean, standard deviation and Sharpe ratio of GMV portfolio returns are rather biased (mainly to the right) one-sided intervals (see the average proportion of observations lying to the left and to the right of the interval) in the two base populations.

Next, it is interesting to note that distributional features of the efficient frontier that do not line up with the axes when marginal distributions are taken might lie behind this poor accuracy of the classical and bootstrap confidence intervals, especially when the normality hypothesis fails. Figure 2 plots the empirical, classical and bootstrap (for a particular sample and replicate) densities of the US GMV portfolio weight, the empirical and bootstrap densities of the US SR portfolio weight, and the scatterplot with the 95 % confidence ellipse of the empirical and bootstrap bivariate distributions of GMV and SR portfolio mean returns. Notice that whereas the classical and bootstrap densities of the US GMV portfolio weight are quite close to the empirical density for the normal returns population, both densities fail to capture the bimodality of the empirical density of the GMV portfolio weight for the bivariate mixture of normal returns population. However, it is rather clear that the empirical and bootstrap bivariate densities of GMV and SR portfolio mean returns are closer to each other than their univariate US GMV and SR portfolio weight counterparts in the two base populations. Consequently, we can expect some improvement over the marginal bootstrap intervals from the multivariate approach to the estimation problem proposed in this paper. Indeed, the average coverage of the marginal MVE intervals is larger and closer to the nominal value than that of the classical and bootstrap confidence intervals in the two base populations. The marginal MVE intervals increase the accuracy of the confidence intervals, with average coverages for a 95 % nominal coverage in the range from 93.3 to >95 % when the simulated returns are normal and from 60.7 to >95 % when the distribution of returns is the bivariate mixture of normal with p\(_{0}\) = 0.18.

Fig. 2 Densities of the estimator for the US weights and the mean returns of the GMV and SR portfolios

Results reported in Table 4 do not change the general picture. In the normal population, classical and bootstrap intervals increase the degree of accuracy, with average coverages for a 99 % nominal coverage in the range from 90 to 94 %, and the marginal MVE intervals outperform them with coverages from 95.9 to 97.8 %. The degree of accuracy of the confidence intervals when the distribution of returns is the bivariate mixture of normal with p\(_{0}=0.18\) is slightly higher than when it is the bivariate mixture of normal with p\(_{0}=0.40\) and slightly lower than when it is the bivariate mixture of normal with p\(_{0}=0.07\). The moderate increase in the number of assets (from 11 to 16) when the distribution of returns is the bivariate mixture of normal with p\(_{0}=0.18\) does not change the performance results. The marginal MVE intervals outperform the confidence intervals in terms of average coverage for all the simulated populations. The performance of the bootstrap intervals (not all results are included here, but they are available from the author) was checked for B = 499, 999, 2999, 3999 and 4999 replicates for all the simulated populations, with the conclusion that the coverage of all intervals increases with the number of replicates at a negligible pace.

Table 4 Monte Carlo results for the 99 % confidence and marginal MVE intervals

4 Empirical illustration

We illustrate empirically the use of the suggested bootstrap procedure with a real data set of monthly excess total returns (2457 observations) in dollar currency derived from the daily Morgan Stanley Capital International (MSCI) price indices of eleven developed countries: Australia, Belgium, Canada, France, Germany, Italy, Japan, United Kingdom (UK), United States (US), Spain and Switzerland, from January 1997 to December 2006. Data were downloaded from Thompson Database and the excess returns were calculated by subtracting the one-month T-bill rate provided by International Financial Statistics (IFS).

We use the bootstrap procedures developed in Sect. 2 to (i) draw 999 bootstrap replicates of the joint distribution of \({{\mathbf {w}}}_{\mathrm{GMV}}^\mathrm{t}\), \({{{\mathbf {w}}}_{\mathrm{SR}}}^\mathrm{t} \), \(\mathrm{R}_{\mathrm{GMV}}\), \(\mathrm{V}_{\mathrm{GMV}}\), \(\mathrm{R}_{\mathrm{SR}}\), \(\mathrm{V}_{\mathrm{SR}}\), \(\mathrm{S}\), \({{{\mathbf {x}}^{\mathrm{t}*}}_{\mathrm{n+1}}}\); (ii) calculate the marginal 95 % confidence intervals for \({\mathrm{w}}_{\mathrm{GMVj}}\), \({\mathrm{w}}_{\mathrm{SRj}}\), j = 1,..., k, \({\mathrm{R}}_{\mathrm{GMV}}\), \({\sqrt{\mathrm{V}}} _{\mathrm{GMV}}\), \({\mathrm{R}}_{\mathrm{SR}}\), \({\sqrt{\mathrm{V}}} _{\mathrm{SR}}\) and \(\mathrm{S}\); (iii) obtain the MVE from the vector \(({\mathrm{R}}_{\mathrm{GMV}}^{*(\mathrm{b})},\,\sqrt{{\mathrm{V}}} _{\mathrm{GMV}}^{*(\mathrm{b})},\,\mathrm{R}_{{\mathrm{SR}}}^{*(\mathrm{b})},\,\sqrt{{\mathrm{V}}} _\mathrm{SR}^{*(\mathrm{b})})^\mathrm{t}\), b = 1,..., 999; (iv) get the MVE estimate of the efficient frontier; (v) compute the 95 % MVE region for the efficient frontier, the 95 % MVE region for \(({\mathrm{R}}_{\mathrm{SR}}, {\sqrt{\mathrm{V}}} _{\mathrm{SR}})^\mathrm{t}\) and the marginal 95 % MVE intervals for \(\mathrm{w}_{\mathrm{GMVj}}\), \(\mathrm{w}_{\mathrm{SRj}}\), j = 1,..., k, \(\mathrm{R}_{\mathrm{GMV}}\), \({\sqrt{\mathrm{V}}} _{\mathrm{GMV}}\), \(\mathrm{R}_{\mathrm{SR}}\), \({\sqrt{\mathrm{V}}} _{\mathrm{SR}}\) and \(\mathrm{S}\); and (vi) calculate the predictive estimate and 95 % confidence interval for the predictive GMV and SR portfolio returns. Table 5 shows all these estimates and 95 % confidence intervals and regions for a US investor; Fig. 3 displays the bootstrap efficient frontier shape, the 95 % MVE region, the 95 % MVE region for \(({{\mathrm{R}}_{\mathrm{SR}},{\sqrt{\mathrm{V}}} _{\mathrm{SR}}})^\mathrm{t}\) and the estimated composition map; and Fig. 4 draws the bootstrap predictive GMV and SR portfolio return distributions.
In addition, the estimates and 95 % marginal MVE intervals for the weights and characteristics of the GMV and SR portfolios are calculated over a ten-year rolling window, shifted forward one month at a time (81 shifts) up to September 2013, to analyse the stability of the portfolio weights and mean portfolio returns over time. Figure 5 plots the estimates and 95 % MVE regions of the major GMV and SR portfolio weights and the sample mean of the GMV and SR portfolio returns over time for the extended sample.
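
The rolling re-estimation can be sketched in the same spirit. The snippet below is a minimal illustration that recomputes only the plug-in GMV weights on each window; it assumes one observation per month (so a ten-year window has 120 rows), uses synthetic data, and the function name `rolling_gmv_weights` is hypothetical. The bootstrap intervals would be recomputed inside the loop in the same way.

```python
import numpy as np

def rolling_gmv_weights(returns, window=120, step=1):
    """GMV weights re-estimated on each rolling window of `window` rows."""
    t, k = returns.shape
    ones = np.ones(k)
    path = []
    for start in range(0, t - window + 1, step):
        block = returns[start:start + window]
        sigma = np.cov(block, rowvar=False)
        a = np.linalg.solve(sigma, ones)   # Sigma^{-1} 1 on this window
        path.append(a / (ones @ a))
    return np.asarray(path)

# Synthetic panel: 201 months (January 1997 to September 2013), 3 assets.
rng = np.random.default_rng(2)
panel = rng.normal(0.005, 0.04, size=(201, 3))
weights_path = rolling_gmv_weights(panel)
```

With 201 monthly rows and a 120-row window, the loop yields the initial window plus 81 one-month shifts, i.e. 82 rows in `weights_path`.
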

Table 5 Estimates of the efficient frontier

The estimated GMV portfolio weights contain several extreme positions, such as the long positions of 48.73 and 32.10 % in the UK and US markets, respectively, and the short position of 32.30 % in the German market. Nevertheless, the portfolio is diversified, with only three weights (the Canada, France and Switzerland indices) not significant at the 5 % level. The GMV portfolio has a positive estimated monthly return of 0.29 %, significant at the 5 % level, with a risk of 3.94 %, and a higher predictive return of 0.62 % that is not significant at the 5 % level. These optimal positions persist, with slight readjustments, for several months (up to June 2008). By and large, the UK and US GMV portfolio weights are positive and significant at the 5 % level, and the German GMV portfolio weight is negative and significant at the 5 % level, throughout the period. The estimated monthly GMV portfolio return also evolves over time, following the market cycle: it is significantly positive (at the 5 % level) up to August 2006, significantly negative from December 2008 to October 2010, and significantly positive again from June 2012.

Fig. 3 Bootstrap efficient frontier

Fig. 4 Densities of the predictive portfolio returns

Fig. 5 Optimal portfolios over the time horizon

At the tangency point between the efficient frontier and a line drawn from the origin (see Fig. 3), i.e. the SR portfolio, the optimal position is even more extreme. Some positions are significant at the 5 % level, such as the long positions of 200.44 and 132.86 % in the Switzerland and Canada markets, respectively, and the short positions of 286.4, 114.03 and 84.73 % in the UK, Germany and Japan markets. Others are not significant at the 5 % level, such as the long position of 119.93 % in the Spain market and, surprisingly, the short position of 37.9 % in the US market. These results are due to the large sampling error of the estimates, as can be seen from the length of the marginal 95 % MVE intervals. The average interval length for the weights is 382.77 %, which under the normal approximation corresponds to a standard error of more than 97.64 %, and six of the weights are not significant at the 5 % level: the Australia, Belgium, France, Italy, US and Spain indices. The SR portfolio has a positive estimated monthly return of 3.31 %, significant at the 5 % level, with a risk of 13.36 %, and a lower, non-significant predictive return of 3.2 %. Moreover, these market positions and returns change over time, with especially large sampling errors when uncertainty about the economic situation increases, i.e. from September 2006 to November 2008 and from November 2010 to May 2012.
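
The standard error quoted above follows from the usual normal-approximation relation between the length of a two-sided interval and its standard error, length = 2 z se. The short Python check below makes that arithmetic explicit; the helper name `se_from_interval_length` is hypothetical, and the calculation assumes a symmetric normal-based 95 % interval.

```python
from statistics import NormalDist

def se_from_interval_length(length, level=0.95):
    """Standard error implied by a symmetric normal interval of the given length."""
    z = NormalDist().inv_cdf(0.5 + level / 2)   # about 1.96 for a 95 % level
    return length / (2 * z)

# Average length of the marginal 95 % intervals for the SR weights, in percent.
se = se_from_interval_length(382.77)
print(round(se, 2))
```
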

Interestingly, the marginal 95 % MVE interval, which is applied to reduce the influence of outliers, corrects the marginal 95 % confidence interval (included in Table 5 for comparative purposes) along the lines suggested by the simulation study, i.e. by increasing the length of the interval and correcting its bias.

It is worth noting that our bootstrap procedure produces the 95 % bootstrap predictive interval and density of returns, so it gives a model-based statistical measure of the out-of-sample performance of the GMV and SR portfolios and of how risky the investment could be. In this case, the probability of a predictive return less than or equal to zero is 43.44 % for the GMV portfolio and 38.44 % for the SR portfolio.
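
Given a vector of bootstrap draws of a predictive portfolio return, both the predictive interval and the probability of a non-positive return are simple empirical summaries. The Python sketch below shows one way to compute them; the function name `predictive_summary` is hypothetical, the interval is a plain percentile interval (an assumption, since other constructions are possible), and the input values are made up for illustration.

```python
import numpy as np

def predictive_summary(pred_draws, level=0.95):
    """Percentile predictive interval and empirical P(return <= 0)."""
    pred_draws = np.asarray(pred_draws, dtype=float)
    alpha = 1.0 - level
    lower, upper = np.quantile(pred_draws, [alpha / 2, 1 - alpha / 2])
    p_nonpositive = np.mean(pred_draws <= 0.0)   # share of draws at or below zero
    return lower, upper, p_nonpositive

# Made-up draws (in percent); three of the eight values are at or below zero.
example = np.array([-2.0, -1.0, 0.0, 1.0, 2.0, 3.0, 4.0, 5.0])
lower, upper, p_nonpositive = predictive_summary(example)
```
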

Other interesting features of our procedure are that the estimated frontier lies inside the 95 % confidence region; the composition map includes all assets, with a smooth transition from one risk level to another; and the estimated SR portfolio is a mean-variance efficient portfolio, that is, it lies on the estimated frontier.

5 Conclusions

In summary, the bootstrap resampling methodology proposed in this paper provides a new and easy-to-implement procedure that allows a more appropriate multivariate approach to the estimation problem of the sample mean-variance efficient frontier, with interesting statistical features. Applying this methodology, we can calculate marginal confidence intervals for the weights and characteristics of any efficient portfolio on the sample efficient frontier, compute the confidence region of the efficient frontier in the mean-variance space, and obtain the predictive densities of the future returns of any mean-variance efficient portfolio without distributional assumptions on returns. Moreover, the finite-sample performance of the marginal bootstrap intervals (i.e., when the parameters are treated separately) equals or exceeds that of the classical ones derived under the normality assumption. Furthermore, this methodology is based on a statistical model, and it can be extended to obtain the predictive optimal portfolio and make inferences about it.