1 Introduction

1.1 Multiobjective portfolio construction

The multiobjective portfolio optimization problem discussed in this paper is the construction of a portfolio of stocks, selected from a large pool, that provides maximum return at minimum risk. Modern portfolio theory provides a well-developed paradigm for forming such a portfolio; however, constructing an optimum portfolio with limited capital is a challenge when a large pool of stocks is taken into account. The capital market contains thousands of equities, and the risk and return of an investment differ with the financial characteristics of each equity. To construct an optimum and profitable portfolio of stocks, risk and return must be considered simultaneously; portfolio optimization is therefore a complex multiobjective problem that requires multistage decisions. In this paper, a multistage decision-based genetic algorithm is proposed for the multiobjective portfolio optimization problem. The effectiveness of the proposed algorithm is validated by applying it to the S&P 500 index (US). Section 1.2 explains the Markowitz theory of portfolio optimization, considered the cornerstone of portfolio theory.

1.2 Markowitz theory of portfolio optimization

Multiobjective portfolio optimization has been an area of research since 1952. Markowitz (1952), the creator of modern portfolio theory, originally formulated the mean–variance portfolio framework, which explains the trade-off between mean and variance, representing the expected return and the risk of a portfolio, respectively. The mean–variance approach proposed by Markowitz (1952) addresses the portfolio selection problem by determining the weight to be allotted to each equity on the basis of a minimum required rate of return.

Mean variance theorem:

The formulation of the mean–variance method can be described as follows:

$$\min \sigma_{p} = \min \sum\nolimits_{i = 1}^{n} \sum\nolimits_{j = 1}^{n} \sigma_{ij} x_{i} x_{j}$$
(1)
$$\text{s.t.}\;\sum\nolimits_{i = 1}^{n} \mu_{i} x_{i} \ge E,\quad \sum\nolimits_{i = 1}^{n} x_{i} = 1,\quad x_{i} \ge 0,$$
$$i = 1, 2, \ldots, n$$

where \(\sigma_{p}\) denotes the portfolio risk, \(\sigma_{ij}\) denotes the covariance between the returns of the ith and jth securities, \(\mu_{i}\) denotes the expected rate of return of the ith security, E denotes the least acceptable expected rate of return, and \(x_{i}\) denotes the proportion of the investment allocated to the ith security.
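As an illustration of how the quantities in Eq. 1 can be evaluated for one candidate allocation, the following minimal Python sketch computes the double sum, the expected return, and the three constraints. The function names and the two-asset data are hypothetical and are not taken from the paper's MATLAB code.

```python
def portfolio_variance(weights, cov):
    """Double sum of sigma_ij * x_i * x_j over all pairs of securities."""
    n = len(weights)
    return sum(cov[i][j] * weights[i] * weights[j]
               for i in range(n) for j in range(n))


def portfolio_return(weights, mu):
    """Weighted sum of expected returns, sum of mu_i * x_i."""
    return sum(m * x for m, x in zip(mu, weights))


def is_feasible(weights, mu, min_return, tol=1e-9):
    """Check the three constraints of Eq. 1 for a candidate allocation."""
    return (portfolio_return(weights, mu) >= min_return - tol
            and abs(sum(weights) - 1.0) <= tol
            and all(x >= 0 for x in weights))


# Hypothetical two-asset data, for illustration only.
cov = [[0.04, 0.01],
       [0.01, 0.09]]
mu = [0.08, 0.12]
x = [0.5, 0.5]

variance = portfolio_variance(x, cov)    # 0.0375 for this data
expected = portfolio_return(x, mu)       # 0.10 for this data
```

A solver for Eq. 1 would search over allocations that pass `is_feasible` for the one with the smallest `portfolio_variance`.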

For a given rate of return, one can derive the minimum investment risk by minimizing the variance of the portfolio; alternatively, for a given level of risk that the investor can tolerate, one can derive the maximum return by maximizing the expected return of the portfolio. The main input data of the Markowitz mean–variance model are the expected returns of the securities and the variances of those returns, as presented by Ehrgott et al. (2004). Although Markowitz’s theory uses only mean and variance to describe the characteristics of return, his theory about the structure of a portfolio became a cornerstone of modern portfolio theory. However, as financial markets have grown more complex, the model has shown some practical limitations, as explained in Sect. 1.3.

1.3 Limitations of mean–variance model

The mean–variance model requires as input either the least acceptable rate of return or the portfolio variance for which the expected return is to be computed. In this paper, we develop a genetic approach that eliminates the need to estimate either of these two parameters.

The mean–variance model is not practical for a large number of securities because its computational complexity increases the solving time, a key parameter in algorithmic trading. Hence we propose a genetic algorithm based solution to the construction of a multiobjective optimum portfolio, as explained in Sect. 1.4.

The mean–variance model, which involves minimizing risk as shown in Eq. 1, is based on the assumption that returns are normally distributed. It considers simple linear correlations of returns, which is equivalent to assuming that the joint distribution of asset returns is multivariate normal. However, the returns of many assets in today’s financial markets are not normally distributed, owing to the influence of political factors, market regulations, economic policies, and nonsystematic risk, and this violates the covariance conditions of the mean–variance model. Hence the solution produced by the mean–variance theorem is skewed, creating the need for an alternative approach. The paper proposes a genetic algorithm as the solution, as it does not rely on the assumption of normal returns.

1.4 Decision to use genetic algorithm

The work presented in this paper is motivated by the need to develop an optimization algorithm which is efficient in managing portfolios consisting of a large number of stocks.

The algorithms used earlier require user input of either the expected return or the acceptable risk, which is difficult to estimate for large portfolios.

This limits the solution to the value of either return or risk supplied by the user. To remove this limitation, this paper presents a multistage decision-based genetic algorithm for the multiobjective portfolio optimization problem. First, a short list of stocks is selected using the priority index (described later in the paper); the genetic algorithm is then applied to decide the investment weights of the selected stocks.

The approach reduces the computational complexity and uses genetic evolution to generate an optimum portfolio.

The genetic algorithm uses an objective function, explained in Sect. 1.7.1, to calculate the fitness value. The stock weights are optimized by a genetic evolution process driven by these fitness values. This eliminates the need to solve for minimum risk under an expected-return constraint. Because no assumption about the return distribution is required, the genetic algorithm is applicable to non-normal return data as well.

1.5 Advantages of GA

Evolutionary algorithms offer a number of advantages over traditional optimization methods. They can be applied to problems with a non-differentiable or discontinuous objective function, to which gradient-based methods such as Gauss–Newton are not applicable. They are also useful when the objective function has several local optima. The basic feature of genetic algorithms is a multi-directional, global search in which a population of potential solutions is maintained from generation to generation, as discussed by Gen et al. (1997).

The so-called schema theorem shows that a genetic algorithm automatically allocates an exponentially increasing number of trials to the best observed schemata (see Lee (2011)).

The population-to-population approach is beneficial when exploring candidate solutions to the optimal portfolio selection problem. It is this property that allows the algorithm to search for a global solution without the constraint of a minimum acceptable return.

Another useful feature of the GA is its ability to handle multiobjective function optimization. The multistage decision-based process is explained in Sect. 1.6.

1.6 The proposed solution

Genetic evolution, as observed in nature, is an intricate optimization process, and genetic algorithms borrow its mechanics. The key elements of a genetic algorithm are: creation of chromosomes, initiation of the parent species, creation of the initial population, formation of a reproduction pool by fitness-based selection, and the genetic operators of crossover and mutation.

This study applies these concepts of genetic evolution to construct an optimum portfolio of stocks selected from a large pool of stocks listed in a single market index. Our contribution is an analysis of the application of a genetic algorithm to the optimization of the weights of the stocks selected for the portfolio. The algorithm for portfolio construction involves two stages: selection of stocks using a priority index function, and optimization of the weights of the selected stocks. The process first selects stocks on the basis of company fundamentals using the priority index function; it then optimizes the weights of the selected stocks by a genetic approach, in which candidate weightings evolve towards the fittest population, taking into account both the risk and the return of the stocks. A systematic, computational model of Darwinian evolution is applied. The technique is explained in Sect. 1.7. The weights of the stocks are adjusted as the chromosomes undergo genetic evolution through reproduction and mating, using crossover and mutation.

The formulation of the genetic algorithm is represented in Fig. 1. A detailed explanation is presented in Sect. 1.7.

Fig. 1 Genetic evolution mechanism

1.7 Genetic algorithm

Holland (1975) provides the theoretical foundation of genetic algorithms. The entire process of optimization by genetic algorithm is explained in the following steps:

The process is initiated with a random set of solutions called the ‘population’, which consists of chromosomes. Each chromosome represents a solution of the problem, with genes representing the weight given to each selected stock. The genes of a chromosome form a string of symbols (not necessarily binary); here they represent the randomly assigned weights of the selected stocks. These chromosomes are then made to evolve genetically through iterations, giving rise to newer populations. After each iteration, we compare the fitness values of the evolved chromosomes with those of the existing chromosomes. The objective function used to calculate fitness is described in Sect. 1.7.1. In this study we evaluate the objective function over a large population so that it attains the maximum possible value.

After all the iterations are complete, we calculate the fitness value of each chromosome and select the chromosome that attains the maximum fitness value, in accordance with Darwin’s (1859) theory of survival of the fittest.

The selected chromosome gives the optimized weights of the stocks and hence the return and risk of the portfolio.

This genetic procedure is designed to maximize the fitness value under the constraints that the weights sum to 1 and that each weight lies strictly between 0 and 1. Genetic algorithms are typically implemented as follows:

  • Step 1: Randomly generate an initial population of chromosomes (solutions).

  • Step 2: Evaluate each chromosome in the population.

  • Step 3: Create new chromosomes by mating current chromosomes and apply mutation and recombination when the parent chromosomes mate.

  • Step 4: Delete members of the population to make room for the new chromosomes.

  • Step 5: Evaluate the new chromosomes, and insert them into the population.

  • Step 6: If a stopping criterion is satisfied, then stop and output the best chromosome (solution); otherwise, go to step 3.
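The six steps above can be sketched as a generic loop. The Python below is a minimal illustration under stated assumptions, not the paper's MATLAB implementation: the function `run_ga`, its parameters, the uniform choice of parents, the fixed generation count used as the stopping criterion, and the one-gene toy problem are all hypothetical.

```python
import random


def run_ga(fitness, new_chromosome, mate, pop_size=10, generations=100, seed=0):
    """Generic loop following Steps 1-6; returns (best_fitness, best_chromosome)."""
    rng = random.Random(seed)
    # Step 1: random initial population.
    population = [new_chromosome(rng) for _ in range(pop_size)]
    # Step 2: evaluate every chromosome.
    scored = [(fitness(c), c) for c in population]
    # Step 6: the stopping criterion assumed here is a fixed generation count.
    for generation in range(generations):
        # Step 3: mate two current chromosomes (parents picked uniformly here).
        pair1, pair2 = rng.sample(scored, 2)
        children = mate(pair1[1], pair2[1], rng)
        # Step 4: delete the weakest members to make room.
        scored.sort(key=lambda pair: pair[0], reverse=True)
        del scored[pop_size - len(children):]
        # Step 5: evaluate the new chromosomes and insert them.
        scored.extend((fitness(c), c) for c in children)
    return max(scored, key=lambda pair: pair[0])


# Toy problem (hypothetical): maximise -(x - 3)^2 over a single real gene.
def toy_fitness(x):
    return -(x - 3.0) ** 2


def toy_new(rng):
    return rng.uniform(-10.0, 10.0)


def toy_mate(parent1, parent2, rng):
    # One child: the midpoint of the parents plus a small random perturbation.
    return [(parent1 + parent2) / 2.0 + rng.gauss(0.0, 0.5)]


best_fitness, best_x = run_ga(toy_fitness, toy_new, toy_mate, generations=200)
```

Because Step 4 removes only the weakest members, the best chromosome found so far is never discarded in this sketch, so the best fitness cannot decrease from one generation to the next.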

Figure 1 illustrates the above genetic evolution mechanism.

The following functions are used in the algorithm:

1.7.1 Objective function

The objective function is used to calculate the fitness value of the chromosomes which is used to find the fittest species.

The objective function is:

Maximize return and simultaneously minimize risk:

$$\min \sigma_{p} = \min \sum\nolimits_{i = 1}^{n} \sum\nolimits_{j = 1}^{n} \sigma_{ij} x_{i} x_{j}$$
$$\text{s.t.}\;\sum\nolimits_{i = 1}^{n} \mu_{i} x_{i} \ge E,\quad \sum\nolimits_{i = 1}^{n} x_{i} = 1,\quad x_{i} \ge 0,$$
$$i = 1, 2, \ldots, n$$

where \(x_{i}\) is the weight of the ith stock and \(\mu_{i}\) is the average daily return of the ith stock. The average daily return is \(\{(1 + r_{1}/100)(1 + r_{2}/100)(1 + r_{3}/100)\cdots(1 + r_{n}/100)\}^{1/252} - 1\), where \(r_{1}\) is the daily return on day 1 of the period, \(r_{2}\) is the daily return on day 2, and \(r_{n}\) is the daily return on day n (read from Table 3).

$${\rm f}({\rm chromosome})= {\rm portfolio\, return}/{\rm portfolio\, standard\,deviation},$$

where f(chromosome) is the fitness value of the chromosome and \(\sigma_{ij}\) is the covariance of returns between the ith and jth stocks, from which the portfolio standard deviation is computed.

The fitness value is then used to assess the fitness of the population; the process is explained in Sect. 1.7.2.
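The fitness calculation described above can be sketched in Python as follows. This is a minimal illustration, assuming that the portfolio standard deviation is the square root of the double sum in the objective function; the function names and the two-stock data are hypothetical.

```python
import math


def average_daily_return(daily_returns_pct, periods=252):
    """Compound the daily percentage returns, then take the 1/periods root."""
    growth = 1.0
    for r in daily_returns_pct:
        growth *= 1.0 + r / 100.0
    return growth ** (1.0 / periods) - 1.0


def fitness(weights, mean_returns, cov):
    """f(chromosome) = portfolio return / portfolio standard deviation."""
    n = len(weights)
    port_return = sum(mean_returns[i] * weights[i] for i in range(n))
    port_variance = sum(cov[i][j] * weights[i] * weights[j]
                        for i in range(n) for j in range(n))
    return port_return / math.sqrt(port_variance)


# Hypothetical two-stock data, for illustration only.
cov = [[0.04, 0.01],
       [0.01, 0.09]]
mean_returns = [0.08, 0.12]
f_value = fitness([0.5, 0.5], mean_returns, cov)
```

A chromosome with a higher `fitness` value offers more return per unit of risk, which is the quantity the genetic procedure maximizes.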

1.7.2 Evaluation function

This function is used to create the reproduction pool from the parent population. The evaluation function acts as a selection probability that reflects the fitness of each chromosome: it is the ratio of the fitness value of the ith chromosome to the sum of the fitness values of all the chromosomes.

$$f(\text{evaluation}_{i}) = \frac{f(\text{chromosome}_{i})}{\sum\nolimits_{k} f(\text{chromosome}_{k})}$$

The value of the evaluation function is then used to determine the cumulative probability of each chromosome. The cumulative probabilities are used to designate the fittest chromosomes, as shown in the algorithm in Sect. 1.7.5; the process is demonstrated in Table 1. Section 1.7.3 discusses the initiation of a population of chromosomes in order to begin the genetic evolution.

Table 1 Generation of reproduction pool
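The evaluation values and cumulative probabilities can be computed as in the following Python sketch. It assumes that all fitness values are positive; the function names and the three fitness values are hypothetical.

```python
def evaluation_values(fitness_values):
    """Ratio of each chromosome's fitness to the total fitness."""
    total = sum(fitness_values)
    return [f / total for f in fitness_values]


def cumulative_probabilities(fitness_values):
    """Running sum of the evaluation values; the last entry is 1."""
    cumulative = []
    running = 0.0
    for p in evaluation_values(fitness_values):
        running += p
        cumulative.append(running)
    return cumulative


# Hypothetical fitness values for three chromosomes.
fitness_values = [1.0, 3.0, 4.0]
probabilities = evaluation_values(fitness_values)       # [0.125, 0.375, 0.5]
cumulative = cumulative_probabilities(fitness_values)   # [0.125, 0.5, 1.0]
```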

1.7.3 Population initiation

A population consists of chromosomes, which are made up of genes. Each gene holds a value from which the weight of one selected stock is derived. The genes are constructed by random allocation of values using the normal probability distribution.

These random values (\(V_{i}\)) are used to calculate the weights (\(w_{i}\)) of the selected stocks as \(w_{i} = V_{i} / \sum V_{i}\). The weights are the stock-wise proportions of the total investment which the portfolio manager recommends to the user. A parent population of 10 random chromosomes, representing 10 random initial solutions to the objective function, is created. A reproduction pool is then created from the initial population.
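The initiation of the parent population can be sketched in Python as below. One assumption differs from the text and is made only to keep the sketch simple: the random values are drawn uniformly from (0, 1], which guarantees positive weights, rather than from the normal distribution. The function names are hypothetical.

```python
import random


def random_chromosome(n_stocks, rng):
    """Draw one random value per stock and normalise the values to weights."""
    # Assumption: uniform values in (0, 1] so that every weight is positive.
    values = [1.0 - rng.random() for _ in range(n_stocks)]
    total = sum(values)
    return [v / total for v in values]


def initial_population(n_stocks, size=10, seed=None):
    """Parent population of `size` random chromosomes (10 in the text)."""
    rng = random.Random(seed)
    return [random_chromosome(n_stocks, rng) for _ in range(size)]


population = initial_population(n_stocks=25, seed=1)
```

Each chromosome produced this way already satisfies the constraints that the weights sum to 1 and lie strictly between 0 and 1.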

1.7.4 Reproduction pool: the method of selection

This method decides the reproduction pool, that is, the chromosomes which undergo mating. Several techniques are available, such as tournament selection, roulette wheel selection, stochastic selection and reward-based selection. In this paper we use roulette wheel selection (also known as fitness proportionate selection), which selects the species mainly on the basis of their fitness values.

1.7.5 Roulette selection

Each chromosome is evaluated on the basis of the fitness function (as mentioned above). A random number (\(r_{i}\)) is generated from normal distribution and is compared with the cumulative probabilities of the chromosomes. The chromosome i whose cumulative probability satisfies \(p_{(i-1)} < r_{i} < p_{i}\) is selected into the reproduction pool (Table 1). The reproduction pool consists of the species selected by the roulette selection method defined above. The species in the reproduction pool then undergo genetic mating via the processes of crossover and mutation, as explained in Sect. 1.7.6.
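A minimal Python sketch of roulette wheel selection is given below. It assumes positive fitness values and draws the random number uniformly from [0, 1), the usual choice for fitness proportionate selection; this is an assumption of the sketch and differs from the distribution named in the text. The function names and the sample population are hypothetical.

```python
import bisect
import random


def roulette_select(cumulative, r):
    """Index i of the first cumulative probability with p_(i-1) < r <= p_i."""
    index = bisect.bisect_left(cumulative, r)
    # Guard against rounding that leaves the last cumulative value below 1.
    return min(index, len(cumulative) - 1)


def reproduction_pool(population, fitness_values, rng, pool_size=None):
    """Fill the pool by repeated roulette draws; duplicates are allowed."""
    total = sum(fitness_values)
    cumulative = []
    running = 0.0
    for f in fitness_values:
        running += f / total
        cumulative.append(running)
    size = len(population) if pool_size is None else pool_size
    return [population[roulette_select(cumulative, rng.random())]
            for _ in range(size)]


# Hypothetical population of three labelled chromosomes.
pool = reproduction_pool(["CH1", "CH2", "CH3"], [1.0, 3.0, 4.0],
                         random.Random(0), pool_size=4)
```

A fitter chromosome covers a wider slice of the cumulative scale, so it is more likely to be drawn, and it may enter the pool more than once.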

1.7.6 Genetic mating

1.7.6.1 Crossover

Crossover can be performed by a variety of methods:

  (a) Single point crossover

  (b) Two point crossover

  (c) Uniform crossover

  (d) Heuristic crossover

In our research we use arithmetic crossover because of its reported accuracy.

$$\text{Offspring 1} = a*\text{CH1} + (1 - a)*\text{CH2}$$
$$\text{Offspring 2} = (1 - a)*\text{CH1} + a*\text{CH2}$$

where a is a random number belonging to (0, 1). Gene by gene, for a value v from the first parent and the corresponding value v′ from the second parent:

$$v^{\prime\prime} = a*v + (1 - a)*v^{\prime}$$
$$v^{\prime\prime\prime} = (1 - a)*v + a*v^{\prime}$$

The new values (v″ and v″′) are calculated by the formulas given above. Figure 2 represents arithmetic crossover and describes how two parent chromosomes mate by crossover to give two offspring. Each of these values gives different weights to the stocks 1, 2, 3, …, n, hence allowing genetic evolution by crossover.

Fig. 2 Arithmetic crossover
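Arithmetic crossover can be sketched in Python as follows. The function name and the two parent chromosomes are hypothetical, and the mixing coefficient a is passed in explicitly so that the example is reproducible; in the algorithm it would be drawn at random from (0, 1).

```python
def arithmetic_crossover(parent1, parent2, a):
    """Blend two parents gene by gene with mixing coefficient a in (0, 1)."""
    child1 = [a * v + (1.0 - a) * w for v, w in zip(parent1, parent2)]
    child2 = [(1.0 - a) * v + a * w for v, w in zip(parent1, parent2)]
    return child1, child2


# Hypothetical parents holding the weights of two stocks.
ch1 = [0.2, 0.8]
ch2 = [0.6, 0.4]
offspring1, offspring2 = arithmetic_crossover(ch1, ch2, a=0.25)
# offspring1 is approximately [0.5, 0.5]; offspring2 is approximately [0.3, 0.7]
```

When both parents hold weights that sum to 1, each offspring is a convex combination of them and therefore also sums to 1.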

Mutation is illustrated in Fig. 3. As shown, the gene at mutation point 1 moves to mutation point 2, while the intervening genes each shift one position towards the initial position of mutation point 1.

Fig. 3 Mutation
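The mutation described above moves one gene and shifts the genes in between. A minimal Python sketch, with a hypothetical function name and zero-based mutation points, is:

```python
def displacement_mutation(chromosome, point1, point2):
    """Move the gene at point1 to point2; the genes in between shift by one."""
    genes = list(chromosome)      # copy, so the parent is left unchanged
    gene = genes.pop(point1)      # remove the gene at mutation point 1
    genes.insert(point2, gene)    # re-insert it at mutation point 2
    return genes


# Hypothetical five-gene chromosome with labelled genes.
parent = ["a", "b", "c", "d", "e"]
forward = displacement_mutation(parent, 1, 3)    # ['a', 'c', 'd', 'b', 'e']
backward = displacement_mutation(parent, 3, 1)   # ['a', 'd', 'b', 'c', 'e']
```

Because the operator only reorders genes, a chromosome of weights still sums to 1 after mutation; only the assignment of weights to stocks changes.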

1.7.6.2 Decision of crossover versus mutation

A question seldom addressed when dealing with a large data set is how to decide which species undergo crossover and which undergo mutation.

We considered the following points before deciding on an approach to this issue:

The solution depends to a large extent on the nature of the problem. It should be oriented so that the genetic operations contribute to the evolution of the species towards the fittest, as decided by the fitness function.

Crossover is primarily an explorative procedure: it combines the features of both parents and creates a chromosome somewhere in between them.

Mutation is exploitative: it creates only a slight deviation from the parent and hence alters the features locally. Only crossover can combine information from two parents, whereas mutation introduces new information into the offspring. A favourable mutation is therefore needed from time to time for the population to improve.

On the basis of a probabilistic approach, we found that a mutation probability of Pm = 0.4 and a crossover probability of Pc = 0.6 yield the optimum solution. If identical chromosomes are selected into the reproduction pool, they are mutated in order to obtain a genetic advantage for the next population.
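One way to read this decision rule is as a single random draw per pair of parents. The Python sketch below follows that reading, which is an assumption; the paper's MATLAB code may apply the two probabilities differently. The function name is hypothetical.

```python
import random

P_CROSSOVER = 0.6   # Pc from the text
P_MUTATION = 0.4    # Pm from the text; Pc + Pm = 1 in this reading


def choose_operator(parent1, parent2, rng):
    """Return 'crossover' or 'mutation' for one pair from the reproduction pool."""
    if parent1 == parent2:
        # Identical chromosomes cannot gain anything from crossover.
        return "mutation"
    return "crossover" if rng.random() < P_CROSSOVER else "mutation"


rng = random.Random(42)
choices = [choose_operator([0.2, 0.8], [0.6, 0.4], rng) for _ in range(10000)]
crossover_share = choices.count("crossover") / len(choices)   # close to 0.6
```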

2 Literature review

Markowitz (1952), the father of modern portfolio theory, established the role of combining different assets to minimize the risk of the portfolio. We have incorporated his conclusions in our research by making diversity a factor in portfolio construction. We achieved this by dividing the initial pool of stocks into sectors and allowing each sector to be represented in the portfolio, thus reducing the risk.

Melanie (1998) and Gen and Cheng (1997) explain the process of the genetic algorithm in detail. They present various mathematical models for applying the evolutionary genetic technique.

Lin and Gen (2007) stress the importance of taking risk as well as return into consideration in portfolio selection, and suggest that a multistage genetic algorithm can be used for portfolio optimization. Pereira (2000) suggests that genetic algorithms are a valid approach to optimization problems in finance. Yang (2006) argues that genetic algorithms can be used to improve the efficiency of the portfolio. Bakhtyar et al. (2012) discuss how the choice of crossover and mutation influences genetic algorithm performance. Seflane and Benbouziane (2012) show that arithmetic crossover performs better than single point and two point crossover, using an example of five stocks and considering both return and risk in the objective function.

Sinha and Goyal (2012) developed an algorithm, implemented in MATLAB, for portfolio construction from a large pool of stocks listed in a single market index, the SP CNX 500.

2.1 Motivation

A review of the earlier research in this area shows that portfolio optimization has not been carried out on an exhaustive genetic scale. This study demonstrates the application of a detailed genetic evolutionary algorithm for constructing an optimum portfolio of stocks selected from a large pool. A priority index function is designed to select the stocks on the basis of company fundamentals, and a genetic algorithm is then used to optimize the weights of the selected stocks. We also test the effectiveness of the genetic algorithm for portfolio construction, and address the practical issues faced during its detailed implementation. MATLAB code has been developed for this purpose. The motivation of the work was to apply genetic algorithms to portfolio optimization for a large number of stocks. The paper presents a novel multistage solution, in which an algorithm is created and applied to solve the portfolio optimization problem with a genetic algorithm.

3 Portfolio construction: algorithm

The two-stage decision-based algorithm for constructing the optimum portfolio is explained in this section. The section consists of two parts: the explanation of the priority index in Sect. 3.1 and the optimization of the selected stocks in Sect. 3.2. The MATLAB code for the entire process of portfolio construction, as described below, is given in the appendix of this paper.

3.1 Stock selection procedure

The stocks are selected on the basis of several factors, which are explained in this section:

3.1.1 Diversification of portfolio

The most vital logic behind our portfolio selection is the inclusion of diversity in the portfolio. We achieve this by dividing the pool of stocks into sectors such as IT, Telecom and Automobile. The stocks are then evaluated on the parameters mentioned below, using the maximum and minimum of each parameter within the sector. This implies that a stock is selected if its performance score is high within its sector; thus only the best performers from each sector make up the portfolio, resulting in a diversified portfolio. The parameters for the selection of stocks into the portfolio are specified below:

3.1.2 Parameters for selection

The fundamental factors allow the user to construct a portfolio based on business performance rather than merely on market sentiment. The following factors are essential for determining the stocks in the final portfolio:

3.1.3 Price/earnings (P/E)

The P/E ratio is an important parameter for understanding the earnings per unit of money invested. A P/E ratio of x implies an investment of x units of money for one unit of profit. Generally, an investor prefers stocks with a lower P/E ratio.

3.1.4 Earnings/share (EPS ratio)

Earnings/Share is defined as the portion of company’s profit allotted to each outstanding share of common stock. From an investor’s perspective, a higher EPS is desirable.

3.1.5 Wealth creation

Wealth creation is defined as the difference between the return on invested capital and the weighted average cost of capital. An investor prefers the stocks of companies with higher wealth creation.

3.1.6 Undervaluation

Undervaluation is defined as the situation when the stocks of the company are priced such that the market price is lower than the fair price. An investor shall always look to ‘pick up’ undervalued stocks. This is measured by the market capitalization to revenue ratio. If the value of this ratio is less than 1, the stock is considered to be undervalued.

3.1.7 (Price/earnings)/growth (PEG ratio)

We use this parameter to enable a meaningful comparison between companies that have different price/earnings ratios and growth percentages.

An investor prefers to invest in businesses with lower PEG values.

The stocks are selected on the basis of their performance on all these parameters, computed from historical data. The priority function is designed such that the stocks with the highest scores are selected into the pool of stocks comprising the portfolio.

3.1.8 Calculation of priority index

The priority index function uses the factors mentioned above as the benchmark, and the stocks are ranked on the basis of their cumulative scores as generated by the priority index function. Equities are ranked and allotted a score as per linear interpolation which can be expressed as follows for values whose maximum is desired:

$$S_{ij} = 100\,\frac{X_{ij} - \text{Min}}{\text{Max} - \text{Min}}$$

where Sij is the score of the ith stock on the jth parameter, Xij is the value of the jth parameter for the ith stock, and Min and Max are the minimum and maximum values of the jth parameter among the stocks of the same sector.

The formula for values whose minimum is desired is:

$$S_{ij} = 100\,\frac{X_{ij} - \text{Max}}{\text{Min} - \text{Max}}$$

The priority index (PIi) is calculated by summing the scores that each stock has earned on all the parameters.

$$\text{PI}_{i} = \sum\nolimits_{j} S_{ij}$$

Stock selection is based on PIi: the algorithm creates the portfolio from only those stocks that have a priority index greater than 3.8 on a 5 point scale.
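The scoring and selection steps can be sketched in Python as below. Two points are assumptions of the sketch rather than statements from the paper: a parameter with no spread within a sector is given a score of 0, and the summed 0 to 100 scores are rescaled linearly to the 5 point scale before the threshold of 3.8 is applied. The function names, tickers and values are hypothetical.

```python
def score(value, lo, hi, higher_is_better=True):
    """Linear interpolation to a 0-100 score, following the two formulas above."""
    if hi == lo:
        return 0.0  # assumption: no spread within the sector, no score awarded
    if higher_is_better:
        return 100.0 * (value - lo) / (hi - lo)
    return 100.0 * (value - hi) / (lo - hi)


def priority_indices(sector, higher_is_better):
    """sector maps ticker -> list of parameter values; returns ticker -> PI."""
    result = {ticker: 0.0 for ticker in sector}
    for j, better in enumerate(higher_is_better):
        column = [values[j] for values in sector.values()]
        lo, hi = min(column), max(column)
        for ticker, values in sector.items():
            result[ticker] += score(values[j], lo, hi, better)
    return result


def select(pi, n_params, threshold=3.8):
    """Keep tickers whose PI, rescaled to a 5 point scale, exceeds the threshold."""
    # Assumption: n_params scores of 0-100 each map linearly onto 0-5.
    return [t for t, v in pi.items() if 5.0 * v / (100.0 * n_params) > threshold]


# Hypothetical sector with two parameters: EPS (higher is better), P/E (lower is better).
sector = {"AAA": [2.0, 10.0], "BBB": [4.0, 20.0], "CCC": [3.5, 12.0]}
pi = priority_indices(sector, higher_is_better=[True, False])
chosen = select(pi, n_params=2)
```

In this example CCC scores 75 on EPS and 80 on P/E, which rescales to 3.875 and passes the threshold, while AAA and BBB each rescale to 2.5.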

3.2 Portfolio optimization

The stocks selected on the basis of PIi are added to the portfolio, and their weights are optimized genetically. The selected stocks constitute the genes of a chromosome: each gene corresponds to one stock and is allotted a random value (Fig. 4). The weights are calculated by the formula \(v_{i} / \sum v_{i}\), which gives the proportion of the investment to be placed in stock i. An initial generation of ten such chromosomes is created by the random function. The next step is the creation of the reproduction pool, as explained in Sect. 1.7.4. A tabular representation is shown in Table 1.

Fig. 4 A chromosome with gene representation

The reproduction pool shall now consist of the chromosomes: CH1, CH2, CH3, and CH3 as shown in Fig. 5.

Fig. 5 Example to show reproduction pool

After the reproduction pool has been selected, the chromosomes are operated upon by the crossover and mutation operators. This gives rise to the second generation, and each generation has the same number of genetically modified chromosomes. For example, after running 100 generations of ten chromosomes each, a total of 1,000 chromosomes has been produced. The chromosome with the best fitness value is selected, and its composition is the solution to our portfolio optimization problem.

4 Applied example of genetic algorithm on SP 500 Index: US

In this section we illustrate the application of the multistage portfolio construction algorithm to the SP 500 Index: US. The algorithm is illustrated in two steps: the input required from the user and the output generated. We analyze the output in Sect. 5.

4.1 Input

The input consists of the stocks of the SP 500 Index: US, with their daily closing prices, EPS ratios, PEG values, weighted average cost of capital, market capitalization/revenue and return on invested capital in the US market for the period December 2011–December 2012. The input was taken from Bloomberg. The data should be divided sector-wise, as shown in the column ‘Sectors’ of Table 2, which shows a sample input data file obtained from Bloomberg.

Table 2 Input data file for MATLAB code as obtained from BLOOMBERG

The data set continues in the same order for all the input stocks (SP500 INDEX: US in our study).

A second input file is a column of the daily market returns of the SP 500 Index: US for the same period. It is taken as the return of the market portfolio for the calculation of the raw Beta of each stock of the index. The adjusted Beta is calculated using the following relation:

$${\text{Adjusted beta}} = 0.67*{\text{raw beta }} + \, 0.33*1$$
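The Beta calculation can be sketched in Python as below. The adjustment follows the relation above; the raw Beta estimator, the covariance of stock and market returns divided by the variance of market returns, is the standard definition and is an assumption here, since the text does not state which estimator was used. The function names and return series are hypothetical.

```python
def raw_beta(stock_returns, market_returns):
    """Covariance of stock and market returns divided by market variance."""
    n = len(market_returns)
    mean_stock = sum(stock_returns) / n
    mean_market = sum(market_returns) / n
    covariance = sum((s - mean_stock) * (m - mean_market)
                     for s, m in zip(stock_returns, market_returns))
    variance = sum((m - mean_market) ** 2 for m in market_returns)
    return covariance / variance


def adjusted_beta(raw):
    """Adjusted beta = 0.67 * raw beta + 0.33 * 1."""
    return 0.67 * raw + 0.33 * 1.0


# Hypothetical daily returns (in percent); the stock moves twice as much as the market.
market = [1.0, -1.0, 2.0, 0.0]
stock = [2.0, -2.0, 4.0, 0.0]
beta_raw = raw_beta(stock, market)       # 2.0 for this data
beta_adj = adjusted_beta(beta_raw)       # approximately 1.67
```

The adjustment pulls every raw Beta towards 1, the Beta of the market portfolio.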

A third input file (Table 3) containing the daily returns of all the stocks listed in the single market index (SP500 INDEX: US) for the period December 2011–December 2012 is shown below:

Table 3 Input data file for MATLAB code as obtained from BLOOMBERG for daily returns of all the stocks for the period December 2011 to December 2012

The data set continues in the same order for all the input stocks (SP500 INDEX: US in our study).

4.2 Output

The portfolio constructed for a threshold of 3.8 on a 5 point scale is shown in Table 4, along with the performance analytics. This threshold value gives a portfolio of 25 stocks, selected by the priority index function and optimized by the genetic algorithm. The lower the threshold, the higher the number of stocks in the portfolio, and vice versa.

Table 4 Output file showing weights of selected stocks in the optimum portfolio

5 Analysis of portfolio

We shall now analyze the constructed portfolio on the basis of the following parameters (Table 5):

Table 5 Performance parameters of the optimum portfolio constructed for the period (Dec’ 2011–Dec’ 2012)

5.1 Average annual return

The average annual return is calculated from the average daily return as per the following formula:

$$\text{AAR} = (1 + \text{ADR})^{252} - 1$$

This gives the average annual return of the portfolio, allowing the investor to make a decision. The AAR of the portfolio is 26.01 % on the basis of the historical data (Dec’ 2011–Dec’ 2012). The constructed portfolio beats the market return by a significant margin.
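The annualization formula can be written as a short Python function. This is an illustrative sketch with hypothetical names; the daily return used in the example is back-computed from the reported AAR of 26.01 % purely to show the round trip and is not a figure from the paper.

```python
def average_annual_return(adr, trading_days=252):
    """AAR = (1 + ADR)^252 - 1, compounding the average daily return."""
    return (1.0 + adr) ** trading_days - 1.0


# Daily return implied by an annual return of 26.01 %, for illustration only.
implied_adr = 1.2601 ** (1.0 / 252) - 1.0
aar = average_annual_return(implied_adr)   # recovers approximately 0.2601
```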

5.2 Beta of portfolio (βp)

The Beta of the portfolio indicates the volatility of the portfolio relative to the volatility of the index against which the stocks are benchmarked. The Beta of the portfolio is 0.87, which is less than the Beta of the market portfolio (1 by definition).

5.3 Treynor’s ratio

Treynor’s ratio measures the return earned in excess of the risk-free return per unit of systematic risk (Beta). It is analyzed here for the above portfolio. The mathematical expression for Treynor’s ratio is:

$${\text{Treynor's ratio}} = {\text{excess return}}/{\text{beta}}$$

The Treynor’s ratio is 0.2976. This means that the portfolio provides a 29.76 % excess return per unit of systematic risk. The Beta of 0.87 indicates that the portfolio is only 87 % as volatile as the SP 500, thus establishing that the portfolio has not only maximized return but also managed risk.

5.4 Jensen’s alpha

Jensen’s alpha is used as an indicator of the abnormal return on the portfolio relative to the return predicted from the market, and so indicates the effectiveness of the selection and optimization steps of the algorithm. Jensen’s alpha of the constructed portfolio is 14.56 %, which indicates that the portfolio constructed by the algorithm is very effective.

$$\text{Jensen's alpha} = (R_{p} - R_{f}) - \beta_{p}(R_{m} - R_{f})$$

The risk free return in the US market is taken as the yield on a 1 year US Treasury Bill, 0.13 %.
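The expression for Jensen’s alpha translates directly into code. In the sketch below the portfolio return, Beta and risk free return are the figures reported above, whereas the market return `r_m` is a hypothetical placeholder, since the market return for the historical period is not stated in this section. The function name `jensens_alpha` is likewise illustrative.

```python
def jensens_alpha(r_p: float, r_f: float, beta_p: float, r_m: float) -> float:
    """(R_p - R_f) - beta_p * (R_m - R_f), with all returns as fractions."""
    return (r_p - r_f) - beta_p * (r_m - r_f)


r_p = 0.2601   # portfolio AAR reported in the paper
r_f = 0.0013   # risk free return reported in the paper
beta_p = 0.87  # portfolio Beta reported in the paper
r_m = 0.10     # HYPOTHETICAL market return, for illustration only

alpha = jensens_alpha(r_p, r_f, beta_p, r_m)
print(f"Jensen's alpha = {alpha:.2%}")
```

A portfolio with Beta equal to 1 that earns exactly the market return has an alpha of zero, so a positive alpha reflects return that Beta exposure alone does not explain.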

5.5 Graphical analysis

5.5.1 Graph 1: the relation between stock weights assigned and %AAR (Fig. 6)

The stock weights assigned show an interesting trend: the weights allotted by the genetic algorithm initially decrease with %AAR and then begin to increase beyond a certain value. This trend gives insight into how the genetic algorithm optimizes the portfolio.

Fig. 6 Plot of weights of selected stocks against AAR

5.5.2 Inference

The reader can infer that the algorithm developed is a multi-decision, multi-stage algorithm in which the weights assigned are a function of both return and risk. Hence we do not see a direct correlation between weights and AAR.

5.5.3 Graph 2: variation of %AAR with Beta value (Fig. 7)

Fig. 7 Plot of AAR against Beta value of each selected stock

The annual return initially increases with Beta and then begins to decrease. The stocks have been chosen in such a manner that we obtain higher AAR values at lower Beta values.

5.5.4 Inference

The reader can infer that the algorithm developed is a multi-decision, multi-stage algorithm in which the weights assigned are a function of both return and risk. Hence we do not see a direct correlation between weights and AAR.

5.5.5 Graph 3: weights versus beta (Fig. 8)

Fig. 8 Plot of weight of selected stock against Beta value

The weights increase with Beta value for lower Beta values but then begin to decrease with higher Beta values. This shows the optimization technique of the genetic algorithm.

5.5.6 Inference

It can be inferred that the returns are low for stocks with a low Beta value. Equities with higher Beta values have higher returns, indicating that the higher the risk, the higher the return. However, the returns decrease for very high Beta values. The genetic algorithm accounts for these variations to yield the optimum portfolio.

6 Performance of the portfolio over a 6 month holding period

We have also analyzed the constructed portfolio on out-of-sample data. Note that the portfolio was constructed using historical data (December 2011–December 2012); we now evaluate it on data from the subsequent period (January’13–June’13). The analysis shows that the portfolio performed well, with a %AAR of 15.98 %, calculated from the 6 month yield of 7.69 %. During this period the market shows a %AAR of 10.14 %; hence the portfolio constructed above using the genetic algorithm beats the market. This is presented in Table 6.

Table 6 The performance of portfolio on 6 month holding period

6.1 Performance parameters

The success of the proposed solution is gauged by comparing the portfolio performance against the benchmark, the SP 500. The performance over a 6 month holding period is analyzed as follows:

6.2 Average annual return

The analysis of the portfolio shows a %AAR of 26.01 % on the historical data (December’ 2011 to December’ 2012). When tested over a 6 month holding period (Jan’ 13–June’ 13), the portfolio performed reasonably well, giving a %AAR of 15.98 %, whereas the market return for the same period was 10.14 %. This supports the view that the above two stage technique involving the genetic algorithm is effective for portfolio construction and optimization.
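How the 6 month yield of 7.69 % reported in Sect. 6 is annualized is not stated explicitly. One plausible reading, assumed in the sketch below, is compounding over two half-year periods. The function name `annualize` is illustrative.

```python
def annualize(period_return: float, periods_per_year: float) -> float:
    """Compound a per-period return (as a fraction) into an annual return."""
    return (1.0 + period_return) ** periods_per_year - 1.0


six_month_yield = 0.0769  # 6 month yield reported in the paper
market_aar = 0.1014       # market %AAR reported for the same period

# ASSUMPTION: two compounding periods of six months per year.
annualized = annualize(six_month_yield, periods_per_year=2)
print(f"Annualized portfolio return = {annualized:.2%}")
print(f"Beats market: {annualized > market_aar}")
```

Under this assumption the annualized return is about 15.97 %, close to the reported 15.98 % and above the market figure of 10.14 %; the small gap is presumably due to rounding of the 6 month yield.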

6.3 Beta of portfolio (βp)

The Beta of the portfolio indicates how volatile the portfolio is relative to the index against which the stocks are benchmarked. The Beta of the portfolio is 0.87, which is less than the Beta of the market portfolio (1 by definition).

6.4 Treynor’s ratio

The Treynor’s ratio is 0.1819. This means that the portfolio provides an 18.19 % excess return per unit of systematic risk. The Beta of 0.87 indicates that the portfolio is only 87 % as volatile as the SP 500, establishing that the portfolio has not only maximized return but has also managed its risk.

7 Conclusion

The study has shown the wide implications of the above two stage process used in portfolio construction:

Selecting stocks on the basis of business fundamentals rather than market sentiment is an important lesson for acquiring long term benefits at lower risk. Our portfolio is constructed by using the genetic algorithm to balance return and risk at the same time. We have reduced the risk through a selection procedure based upon company fundamentals. Thus the portfolio can be considered optimum, giving a considerable return for a calculated risk.

The priority index allows users to select the threshold above which stocks enter the portfolio. The higher the threshold, the fewer the stocks, and vice versa. As the threshold goes up, the selection becomes stricter and we choose stocks which are stronger in business fundamentals. This might compromise on the return but gives long term stability.

The research has demonstrated the application of a genetic algorithm to the optimization of a portfolio. This is a novel example of the application of genetic algorithms in the field of portfolio optimization in finance.

The portfolio for a threshold of 3.8 on a five point scale gives an annual average return of 15.98 % with respect to a risk free return of 0.13 %.