<?xml version="1.0" encoding="UTF-8"?><!DOCTYPE article  PUBLIC "-//NLM//DTD Journal Publishing DTD v3.0 20080202//EN" "http://dtd.nlm.nih.gov/publishing/3.0/journalpublishing3.dtd"><article xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:xlink="http://www.w3.org/1999/xlink" dtd-version="3.0" xml:lang="en" article-type="research article"><front><journal-meta><journal-id journal-id-type="publisher-id">JDAIP</journal-id><journal-title-group><journal-title>Journal of Data Analysis and Information Processing</journal-title></journal-title-group><issn pub-type="epub">2327-7211</issn><publisher><publisher-name>Scientific Research Publishing</publisher-name></publisher></journal-meta><article-meta><article-id pub-id-type="doi">10.4236/jdaip.2016.41002</article-id><article-id pub-id-type="publisher-id">JDAIP-63407</article-id><article-categories><subj-group subj-group-type="heading"><subject>Articles</subject></subj-group><subj-group subj-group-type="Discipline-v2"><subject>Computer Science&amp;Communications</subject><subject> Physics&amp;Mathematics</subject></subj-group></article-categories><title-group><article-title>
 
 
  High Dimensionality Effects on the Efficient Frontier: A Tri-Nation Study
 
</article-title></title-group><contrib-group><contrib contrib-type="author" xlink:type="simple"><name name-style="western"><surname>Sen</surname><given-names>Rituparna</given-names></name><xref ref-type="aff" rid="aff1"><sup>1</sup></xref><xref ref-type="corresp" rid="cor1"><sup>*</sup></xref></contrib><contrib contrib-type="author" xlink:type="simple"><name name-style="western"><surname>Gupta</surname><given-names>Pulkit</given-names></name><xref ref-type="aff" rid="aff2"><sup>2</sup></xref></contrib><contrib contrib-type="author" xlink:type="simple"><name name-style="western"><surname>Dey</surname><given-names>Debanjana</given-names></name><xref ref-type="aff" rid="aff3"><sup>3</sup></xref></contrib></contrib-group><aff id="aff2"><addr-line>Indian Institute of Technology, Kharagpur, India</addr-line></aff><aff id="aff1"><addr-line>Indian Statistical Institute, Chennai, India</addr-line></aff><aff id="aff3"><addr-line>Indian Institute of Management, Calcutta, India</addr-line></aff><author-notes><corresp id="cor1">* E-mail:<email>rsen@isichennai.res.in(IS)</email>;</corresp></author-notes><pub-date pub-type="epub"><day>02</day><month>02</month><year>2016</year></pub-date><volume>04</volume><issue>01</issue><fpage>13</fpage><lpage>20</lpage><history><date date-type="received"><day>5</day>	<month>December</month>	<year>2015</year></date><date date-type="rev-recd"><day>accepted</day>	<month>12</month>	<year>February</year>	</date><date date-type="accepted"><day>15</day>	<month>February</month>	<year>2016</year></date></history><permissions><copyright-statement>&#169; Copyright  2014 by authors and Scientific Research Publishing Inc. </copyright-statement><copyright-year>2014</copyright-year><license><license-p>This work is licensed under the Creative Commons Attribution International License (CC BY). http://creativecommons.org/licenses/by/4.0/</license-p></license></permissions><abstract><p>
 
 
  Markowitz portfolio theory underestimates the risk associated with the return of a portfolio when the data are high dimensional. El Karoui proved this mathematically in [1] and suggested improved estimators for unbiased estimation of this risk under specific model assumptions. Norm-constrained portfolios have recently been studied as a way to keep the effective dimension low. In this paper we consider three sets of high dimensional data: the stock market prices for three countries, namely the US, the UK and India. We compare the Markowitz efficient frontier to those obtained by unbiasedness corrections and by imposing norm constraints in these real data scenarios. We also study the out-of-sample performance of the different procedures. We find that the 2-norm constrained portfolio has the best overall performance.
 
</p></abstract><kwd-group><kwd>High Dimensional Covariance Matrix Estimation</kwd><kwd> Minimum-Variance Portfolio</kwd><kwd> Norm Constrained Portfolio</kwd></kwd-group></article-meta></front><body><sec id="s1"><title>1. Introduction</title><p>The need for solutions to optimization problems in a high dimensional setting is increasing in the finance industry, with huge amounts of data being generated every day. Many empirical studies indicate that minimum variance portfolios in general lead to better out-of-sample performance than stock index portfolios [<xref ref-type="bibr" rid="scirp.63407-ref2">2</xref>] [<xref ref-type="bibr" rid="scirp.63407-ref3">3</xref>]. Markowitz portfolio theory, the most popular method for portfolio optimization, suffers from a serious drawback in this setting, namely risk underestimation. When implementing portfolio optimization according to [<xref ref-type="bibr" rid="scirp.63407-ref4">4</xref>], one needs to estimate the expected asset returns as well as the corresponding variances and covariances. El Karoui studied the Markowitz problem as a special case of quadratic programs with linear constraints in [<xref ref-type="bibr" rid="scirp.63407-ref1">1</xref>] and [<xref ref-type="bibr" rid="scirp.63407-ref5">5</xref>] to establish a relationship between the two types of solution, viz. one computed using population quantities and another estimated from sample data. This relationship is important and particularly relevant for high dimensional data, where one suspects that the difference between the two may be considerable.</p><p>There is a broad literature which addresses the question of how to reduce estimation risk in portfolio optimization. De Miguel et al. compare portfolio strategies which differ in the treatment of estimation risk in [<xref ref-type="bibr" rid="scirp.63407-ref6">6</xref>] and confirm that the considered strategies perform better than the traditional plug-in implementation of Markowitz optimization. 
Constrained minimum-variance portfolios have been frequently advocated in the literature (see [<xref ref-type="bibr" rid="scirp.63407-ref7">7</xref>] - [<xref ref-type="bibr" rid="scirp.63407-ref10">10</xref>]).</p><p>The main aim of this paper is to compare the efficient frontiers for real data based on the corrected estimators of [<xref ref-type="bibr" rid="scirp.63407-ref5">5</xref>] and on norm-constrained portfolios. One natural advantage of norm-constrained optimization is that it leads to sparse solutions, in which many of the portfolio weights are zero. Such a portfolio is preferable in terms of transaction costs. On the other hand, if Gaussian assumptions are valid, then the corrected frontier is indeed the most efficient. Another advantage of the corrected frontier is that one can obtain a confidence interval for the variance at each value of return.</p><p>We carry out our analysis for three scenarios, namely the Indian stock market, the London stock market and the U.S. stock market, to facilitate a comparative study and to assess the uniformity of our results. We use the constituent stocks of NSE CNX 100, FTSE 100 and S&amp;P 100 respectively for the three scenarios as our database, taking daily data over the time span from 1st Jan 2013 to 1st Jan 2014. The daily returns data are publicly available from NSE India and Yahoo Finance. Thus we have at our disposal 100 stocks for each country with 250 observations per stock. In other words, considering p to be the number of assets and n to be the number of observations per asset, we arrive at a large p, large n setting, which in modern statistical parlance can be considered a high dimensional setting.</p><p>The rest of the paper is organized as follows. Section 2 explains modern portfolio theory. Section 3 deals with identifying the underestimation factors and the bias inherent in the plug-in estimators and subsequently eliminating them from the empirical optimized portfolio, to arrive at the final error-free optimized weights. 
Section 4 deals with norm-constrained models. In Section 5, we present the empirical results of comparing the efficient frontiers obtained from the Markowitz portfolio to the error-free efficient frontier and the norm-constrained portfolio efficient frontiers. We present our conclusions in Section 6.</p></sec><sec id="s2"><title>2. Markowitz Portfolio Theory</title><p>Markowitz portfolio theory [<xref ref-type="bibr" rid="scirp.63407-ref4">4</xref>] addresses a classic portfolio optimization problem in finance where investors choose to invest according to the following framework: one picks the assets in such a way that the portfolio guarantees a certain level of expected returns but minimizes the “risk” associated with it. In the standard framework this risk is measured by the variance of the portfolio return, and the expected return by its mean. The set-up is as follows:</p><p> There is an opportunity to invest in p assets <inline-formula><inline-graphic xlink:href="http://html.scirp.org/file/2-2870101x6.png" xlink:type="simple"/></inline-formula>.</p><p> The mean returns are represented by a p-dimensional vector <inline-formula><inline-graphic xlink:href="http://html.scirp.org/file/2-2870101x7.png" xlink:type="simple"/></inline-formula>.</p><p> The covariance matrix of the returns is denoted by <inline-formula><inline-graphic xlink:href="http://html.scirp.org/file/2-2870101x8.png" xlink:type="simple"/></inline-formula>.</p><p> The aim is to create a portfolio with guaranteed mean return <inline-formula><inline-graphic xlink:href="http://html.scirp.org/file/2-2870101x9.png" xlink:type="simple"/></inline-formula> and minimize the risk as measured by the variance.</p><p> The problem is to find the weights, or amounts allocated to the various assets of the portfolio.</p><p>Note that <inline-formula><inline-graphic xlink:href="http://html.scirp.org/file/2-2870101x10.png" xlink:type="simple"/></inline-formula> is positive semidefinite and symmetric. 
In the ideal situation the means, variances and covariances are known and the problem is the following quadratic programming problem:</p><disp-formula id="scirp.63407-formula100"><label>(1)</label><graphic position="anchor" xlink:href="http://html.scirp.org/file/2-2870101x11.png"  xlink:type="simple"/></disp-formula><p>Here 1<sub>p</sub> is a p-dimensional vector with one in every entry.</p><p>In practice, Σ and &#181; are unknown. The most common procedure, known as plug-in implementation, replaces them with their sample estimators as follows to obtain the optimal weights.</p><disp-formula id="scirp.63407-formula101"><label>(2)</label><graphic position="anchor" xlink:href="http://html.scirp.org/file/2-2870101x12.png"  xlink:type="simple"/></disp-formula><p>Here <inline-formula><inline-graphic xlink:href="http://html.scirp.org/file/2-2870101x13.png" xlink:type="simple"/></inline-formula> is a p &#215; n matrix of the returns of the assets. It is assumed that the columns of X are independent multivariate Normal vectors.</p><p>If <inline-formula><inline-graphic xlink:href="http://html.scirp.org/file/2-2870101x14.png" xlink:type="simple"/></inline-formula> is invertible and <inline-formula><inline-graphic xlink:href="http://html.scirp.org/file/2-2870101x15.png" xlink:type="simple"/></inline-formula> denotes the solution of the above quadratic problem, then</p><disp-formula id="scirp.63407-formula102"><label>(3)</label><graphic position="anchor" xlink:href="http://html.scirp.org/file/2-2870101x16.png"  xlink:type="simple"/></disp-formula><p>where <inline-formula><inline-graphic xlink:href="http://html.scirp.org/file/2-2870101x17.png" xlink:type="simple"/></inline-formula> is a p &#215; 2 matrix whose first column is all ones and whose second column contains the estimated means. 
Also, U is the 2-dimensional vector with first entry 1 and second entry <inline-formula><inline-graphic xlink:href="http://html.scirp.org/file/2-2870101x18.png" xlink:type="simple"/></inline-formula>.</p><p>The curve <inline-formula><inline-graphic xlink:href="http://html.scirp.org/file/2-2870101x19.png" xlink:type="simple"/></inline-formula>, seen as a function of <inline-formula><inline-graphic xlink:href="http://html.scirp.org/file/2-2870101x20.png" xlink:type="simple"/></inline-formula>, is called the efficient frontier.</p></sec><sec id="s3"><title>3. Corrected Frontier Using Gaussian Assumption</title><p>In the Markowitz setting, let us assume that the returns have a normal distribution. We shall assume that n and p both go to infinity and that the X<sub>i</sub> ~ N<sub>p</sub>(&#181;, Σ) are independent and identically distributed. The parameters of the distribution are estimated using the sample estimators defined in (2).</p><p>We have from Corollary 3.3 of [<xref ref-type="bibr" rid="scirp.63407-ref1">1</xref>],</p><disp-formula id="scirp.63407-formula103"><label>(4)</label><graphic position="anchor" xlink:href="http://html.scirp.org/file/2-2870101x21.png"  xlink:type="simple"/></disp-formula><p>where <inline-formula><inline-graphic xlink:href="http://html.scirp.org/file/2-2870101x22.png" xlink:type="simple"/></inline-formula> is the population quantity, k is the number of constraints in the quadratic problem we are solving, which in our case equals 2, and <inline-formula><inline-graphic xlink:href="http://html.scirp.org/file/2-2870101x23.png" xlink:type="simple"/></inline-formula> represents the weights obtained from the empirical data at hand, while <inline-formula><inline-graphic xlink:href="http://html.scirp.org/file/2-2870101x24.png" xlink:type="simple"/></inline-formula> is its population counterpart. 
<inline-formula><inline-graphic xlink:href="http://html.scirp.org/file/2-2870101x25.png" xlink:type="simple"/></inline-formula> denotes the canonical basis vectors in <inline-formula><inline-graphic xlink:href="http://html.scirp.org/file/2-2870101x26.png" xlink:type="simple"/></inline-formula>. The corollary shows that the effects of both covariance and mean estimation are to underestimate the risk and the empirical frontier is asymptotically deterministic. The cost of not knowing the covariance matrix and estimating it is captured in the factor <inline-formula><inline-graphic xlink:href="http://html.scirp.org/file/2-2870101x27.png" xlink:type="simple"/></inline-formula>. 
In other words, using plug-in procedures leads to over-optimistic conclusions in this situation.</p><p>Also, when <inline-formula><inline-graphic xlink:href="http://html.scirp.org/file/2-2870101x28.png" xlink:type="simple"/></inline-formula> and <inline-formula><inline-graphic xlink:href="http://html.scirp.org/file/2-2870101x29.png" xlink:type="simple"/></inline-formula>, and we denote <inline-formula><inline-graphic xlink:href="http://html.scirp.org/file/2-2870101x30.png" xlink:type="simple"/></inline-formula>, the impact of the estimation of &#181; by <inline-formula><inline-graphic xlink:href="http://html.scirp.org/file/2-2870101x31.png" xlink:type="simple"/></inline-formula> will be risk underestimation by the amount <inline-formula><inline-graphic xlink:href="http://html.scirp.org/file/2-2870101x32.png" xlink:type="simple"/></inline-formula>. Hence, rearranging (4) and subtracting the bias associated with mean and covariance estimation from the variances obtained from sample data, we get the error-free quantities of interest. In other words,</p><disp-formula id="scirp.63407-formula104"><label>(5)</label><graphic position="anchor" xlink:href="http://html.scirp.org/file/2-2870101x33.png"  xlink:type="simple"/></disp-formula><p>The estimator <inline-formula><inline-graphic xlink:href="http://html.scirp.org/file/2-2870101x34.png" xlink:type="simple"/></inline-formula> proposed in [<xref ref-type="bibr" rid="scirp.63407-ref1">1</xref>] is a modified version of the optimal solution in equation (3). The modification is to replace M by <inline-formula><inline-graphic xlink:href="http://html.scirp.org/file/2-2870101x35.png" xlink:type="simple"/></inline-formula>.</p><p>It is also shown in Theorem 5.1 of [<xref ref-type="bibr" rid="scirp.63407-ref1">1</xref>] that the risk is indeed underestimated by the empirical frontier. 
Specifically,</p><disp-formula id="scirp.63407-formula105"><graphic  xlink:href="http://html.scirp.org/file/2-2870101x36.png"  xlink:type="simple"/></disp-formula><p>where <inline-formula><inline-graphic xlink:href="http://html.scirp.org/file/2-2870101x37.png" xlink:type="simple"/></inline-formula> and <inline-formula><inline-graphic xlink:href="http://html.scirp.org/file/2-2870101x38.png" xlink:type="simple"/></inline-formula> are respectively the empirical frontier with Gaussian distributed data and the theoretical efficient frontier.</p><p>We use the 95% confidence intervals for the variance of a single Normal variable with unknown mean &#181; and standard deviation σ given by:</p><disp-formula id="scirp.63407-formula106"><graphic  xlink:href="http://html.scirp.org/file/2-2870101x39.png"  xlink:type="simple"/></disp-formula><p>where <inline-formula><inline-graphic xlink:href="http://html.scirp.org/file/2-2870101x40.png" xlink:type="simple"/></inline-formula> is the sample variance and <inline-formula><inline-graphic xlink:href="http://html.scirp.org/file/2-2870101x41.png" xlink:type="simple"/></inline-formula> follows a <inline-formula><inline-graphic xlink:href="http://html.scirp.org/file/2-2870101x42.png" xlink:type="simple"/></inline-formula> distribution with <inline-formula><inline-graphic xlink:href="http://html.scirp.org/file/2-2870101x43.png" xlink:type="simple"/></inline-formula> degrees of freedom, the significance level being equal to 0.05.</p></sec><sec id="s4"><title>4. Constraining the Portfolio</title><p>The short-sale-constrained minimum-variance portfolio, <inline-formula><inline-graphic xlink:href="http://html.scirp.org/file/2-2870101x44.png" xlink:type="simple"/></inline-formula>, is introduced in [<xref ref-type="bibr" rid="scirp.63407-ref7">7</xref>]. This is the solution to problem (1) with the additional constraint that the portfolio weights be nonnegative.</p><sec id="s4_1"><title>4.1. 1-Norm Constrained Portfolio</title><p>The 1-norm-constrained portfolio, <inline-formula><inline-graphic xlink:href="http://html.scirp.org/file/2-2870101x45.png" xlink:type="simple"/></inline-formula>, is the solution to the traditional minimum-variance portfolio problem (1) subject to the additional constraint that the L<sub>1</sub>-norm of the portfolio-weight vector be smaller than or equal to a certain threshold c; that is,</p><disp-formula id="scirp.63407-formula107"><label>(6)</label><graphic position="anchor" xlink:href="http://html.scirp.org/file/2-2870101x46.png"  xlink:type="simple"/></disp-formula><p>The 1-norm constrained portfolio problem can be summarized as</p><disp-formula id="scirp.63407-formula108"><label>(7)</label><graphic position="anchor" xlink:href="http://html.scirp.org/file/2-2870101x47.png"  xlink:type="simple"/></disp-formula><p>The Markowitz risk minimization problem can be recast as a regression problem.</p><disp-formula id="scirp.63407-formula109"><label>(8)</label><graphic position="anchor" xlink:href="http://html.scirp.org/file/2-2870101x48.png"  xlink:type="simple"/></disp-formula><p>By using the fact that the weights sum to one, we have</p><disp-formula id="scirp.63407-formula110"><label>(9)</label><graphic position="anchor" xlink:href="http://html.scirp.org/file/2-2870101x49.png"  xlink:type="simple"/></disp-formula><p>where R is the return vector, <inline-formula><inline-graphic xlink:href="http://html.scirp.org/file/2-2870101x50.png" xlink:type="simple"/></inline-formula> and <inline-formula><inline-graphic xlink:href="http://html.scirp.org/file/2-2870101x51.png" xlink:type="simple"/></inline-formula>, where <inline-formula><inline-graphic xlink:href="http://html.scirp.org/file/2-2870101x52.png" xlink:type="simple"/></inline-formula>.</p><p>Finding the optimal weight w is the same as finding the regression coefficient <inline-formula><inline-graphic xlink:href="http://html.scirp.org/file/2-2870101x53.png" xlink:type="simple"/></inline-formula>. The gross-exposure constraint <inline-formula><inline-graphic xlink:href="http://html.scirp.org/file/2-2870101x54.png" xlink:type="simple"/></inline-formula> can now be expressed as <inline-formula><inline-graphic xlink:href="http://html.scirp.org/file/2-2870101x55.png" xlink:type="simple"/></inline-formula>. 
Thus the problem (7) is similar to</p><disp-formula id="scirp.63407-formula111"><label>(10)</label><graphic position="anchor" xlink:href="http://html.scirp.org/file/2-2870101x56.png"  xlink:type="simple"/></disp-formula><p>where <inline-formula><inline-graphic xlink:href="http://html.scirp.org/file/2-2870101x57.png" xlink:type="simple"/></inline-formula>, but they are not equivalent. The latter depends on the choice of Y, while the former does not.</p><p>Efron et al. developed an efficient algorithm in [<xref ref-type="bibr" rid="scirp.63407-ref11">11</xref>], based on least-angle regression (LARS) and called the LARS-LASSO algorithm, to find the whole solution path <inline-formula><inline-graphic xlink:href="http://html.scirp.org/file/2-2870101x58.png" xlink:type="simple"/></inline-formula>, for all <inline-formula><inline-graphic xlink:href="http://html.scirp.org/file/2-2870101x59.png" xlink:type="simple"/></inline-formula>, to (10). The number of non-vanishing weights varies as c ranges from 0 to ∞. It recruits successively more assets and gradually all assets. The algorithm works iteratively as follows:</p><disp-formula id="scirp.63407-formula112"><label>(11)</label><graphic position="anchor" xlink:href="http://html.scirp.org/file/2-2870101x60.png"  xlink:type="simple"/></disp-formula><p>Here our objective is to minimize the out-of-sample portfolio variance. To choose c we use leave-one-out cross validation (see [<xref ref-type="bibr" rid="scirp.63407-ref12">12</xref>]).</p></sec><sec id="s4_2"><title>4.2. 
2-Norm Constrained Portfolio</title><p>The 2-norm-constrained portfolio, <inline-formula><inline-graphic xlink:href="http://html.scirp.org/file/2-2870101x61.png" xlink:type="simple"/></inline-formula>, is the solution to the traditional minimum-variance portfolio problem (1) subject to the additional constraint that the L<sub>2</sub>-norm of the portfolio-weight vector be smaller than or equal to a certain threshold c; that is,</p><disp-formula id="scirp.63407-formula113"><label>(12)</label><graphic position="anchor" xlink:href="http://html.scirp.org/file/2-2870101x62.png"  xlink:type="simple"/></disp-formula><p>The 2-norm constrained portfolio problem can be summarized as</p><disp-formula id="scirp.63407-formula114"><label>(13)</label><graphic position="anchor" xlink:href="http://html.scirp.org/file/2-2870101x63.png"  xlink:type="simple"/></disp-formula><p>As with the 1-norm constrained portfolio, finding the optimal weight w in this case is the same as finding the regression coefficient <inline-formula><inline-graphic xlink:href="http://html.scirp.org/file/2-2870101x64.png" xlink:type="simple"/></inline-formula>.</p><p>The gross-exposure constraint <inline-formula><inline-graphic xlink:href="http://html.scirp.org/file/2-2870101x65.png" xlink:type="simple"/></inline-formula> can now be expressed as <inline-formula><inline-graphic xlink:href="http://html.scirp.org/file/2-2870101x66.png" xlink:type="simple"/></inline-formula>. Thus the problem (13) is similar to</p><disp-formula id="scirp.63407-formula115"><label>(14)</label><graphic position="anchor" xlink:href="http://html.scirp.org/file/2-2870101x67.png"  xlink:type="simple"/></disp-formula><p>where <inline-formula><inline-graphic xlink:href="http://html.scirp.org/file/2-2870101x68.png" xlink:type="simple"/></inline-formula>. But they are not equivalent. 
The latter depends on the choice of asset Y, while the former does not.</p><p>The whole solution path <inline-formula><inline-graphic xlink:href="http://html.scirp.org/file/2-2870101x69.png" xlink:type="simple"/></inline-formula> to (14), for all c ≥ 0, can be efficiently obtained by the regularization algorithm of ridge regression (see [<xref ref-type="bibr" rid="scirp.63407-ref13">13</xref>]). The number of non-vanishing weights varies as c ranges from 0 to ∞. It recruits successively more assets and gradually all assets. The algorithm works iteratively as follows:</p><disp-formula id="scirp.63407-formula116"><label>(15)</label><graphic position="anchor" xlink:href="http://html.scirp.org/file/2-2870101x70.png"  xlink:type="simple"/></disp-formula><p>To choose c we use cross validation, as in the case of the 1-norm constrained portfolio.</p></sec></sec><sec id="s5"><title>5. Practical Results</title><p>Below we provide an overview of our results for the Markowitz efficient frontier, the corrected frontier using the Gaussian assumption, and the 1-norm and 2-norm constrained efficient frontiers for the 3 countries.</p><p>In Figures 1-3, we present the efficient frontiers using the different methods. The dashed lines represent the empirical 95% confidence intervals computed for a fixed expected return. The x-axis is variance and the y-axis is expected returns.</p><fig id="fig1"  position="float"><label><xref ref-type="fig" rid="fig1">Figure 1</xref></label><caption><title> Efficient frontier of US data</title></caption><graphic mimetype="image"   position="float"  xlink:type="simple"  xlink:href="http://html.scirp.org/file/2-2870101x71.png"/></fig><fig id="fig2"  position="float"><label><xref ref-type="fig" rid="fig2">Figure 2</xref></label><caption><title> Efficient frontier of UK data</title></caption><graphic mimetype="image"   position="float"  xlink:type="simple"  xlink:href="http://html.scirp.org/file/2-2870101x72.png"/></fig><fig id="fig3"  position="float"><label><xref ref-type="fig" rid="fig3">Figure 3</xref></label><caption><title> Efficient frontier of Indian data</title></caption><graphic mimetype="image"   position="float"  xlink:type="simple"  xlink:href="http://html.scirp.org/file/2-2870101x73.png"/></fig><p>We have considered the same set of &#181;’s and Σ’s for each individual country to keep the results comparable. It can be concluded from the relative positions of the corrected and uncorrected efficient frontiers that the risk is indeed underestimated in the case of high dimensional data. But comparing to 2-norm and 1-norm constrained portfolios as they outperform the corrected frontiers. The constrained portfolios are, in general, less efficient than the corrected portfolio, in the sense that they have higher variance for each fixed level of return. Of course, constrained portfolios have their own advantages due to sparsity that might outweigh the loss in efficiency. For the 1-norm and 2-norm portfolios, the choice of the asset Y is important. We have chosen Y to be the no-short-sale portfolio in all our computations. 
For each country, the 2-norm portfolio is the most efficient among the constrained portfolios, and the 1-norm frontier is not monotone.</p><p>The amount of shrinkage or regularization is directly related to the number of stocks included in the optimal portfolio. In <xref ref-type="fig" rid="fig4">Figure 4</xref> we present this for the 1-norm constrained portfolio. As expected, this is an increasing function of c, the bound on the L<sub>1</sub> norm. For almost all values of c, the number of stocks in the portfolio is highest for the Indian market and lowest for the US market. Results for the L<sub>2</sub> norm are similar.</p><p>For out-of-sample performance we first created portfolios for all three datasets using the return data for the first 230 trading days. These portfolios are then held for one month and rebalanced at the end of the next month. The summary statistics of these portfolios are presented for the three datasets as box-plots in Figures 5-7. 1-norm constrained portfolios were created for c = 2 and c = 3 for all three nations. 2-norm constrained portfolios were created for the optimal c chosen by cross validation, as mentioned in Section 4. This value equals 1.2544, 1.14 and 1.0739 respectively for the US, UK and Indian data.</p><fig id="fig4"  position="float"><label><xref ref-type="fig" rid="fig4">Figure 4</xref></label><caption><title> Number of stocks with respect to c with Y = “No short sale”</title></caption><graphic mimetype="image"   position="float"  xlink:type="simple"  xlink:href="http://html.scirp.org/file/2-2870101x74.png"/></fig><fig id="fig5"  position="float"><label><xref ref-type="fig" rid="fig5">Figure 5</xref></label><caption><title> Out of sample performance of different portfolios for US data</title></caption><graphic mimetype="image"   position="float"  xlink:type="simple"  xlink:href="http://html.scirp.org/file/2-2870101x75.png"/></fig><fig id="fig6"  position="float"><label><xref ref-type="fig" rid="fig6">Figure 6</xref></label><caption><title> Out of sample performance of different portfolios for UK data</title></caption><graphic mimetype="image"   position="float"  xlink:type="simple"  xlink:href="http://html.scirp.org/file/2-2870101x76.png"/></fig><fig id="fig7"  position="float"><label><xref ref-type="fig" rid="fig7">Figure 7</xref></label><caption><title> Out of sample performance of different portfolios for Indian data</title></caption><graphic mimetype="image"   position="float"  xlink:type="simple"  xlink:href="http://html.scirp.org/file/2-2870101x77.png"/></fig><p>The out-of-sample performance is very different for the three markets. For the US data, the 2-norm constrained, corrected and no-short-sale portfolios have close to zero average returns while the other methods yield negative average returns. The variances are almost the same for all methods except the Markowitz, which has a lower variance. For the UK data, the 1-norm with c = 3 and corrected portfolios have significantly negative average returns while the others have small positive or zero average returns. The variances are almost all the same. For the Indian data, all portfolios except the Markowitz have high positive average returns. In particular, the corrected portfolio has very high average returns, but the variance is also quite high. 
Overall, the out-of-sample results show that the 2-norm constrained portfolio has a higher average return than the Markowitz portfolio, with comparable variance, in all three markets.</p></sec><sec id="s6"><title>6. Conclusion</title><p>In this paper we study the effect of high dimension on the efficient frontier with real data on three markets. In particular, we study how the recently suggested corrected frontier, which is based on normality assumptions, and the norm-constrained methods perform relative to Markowitz portfolio optimization. We observe that the Markowitz solution indeed leads to biased estimates of risk, which can be improved with the corrected estimates. The norm-constrained methods are comparable and need fewer model assumptions. Alternative methods of improving the covariance matrix estimation include the Bayesian shrinkage approach [<xref ref-type="bibr" rid="scirp.63407-ref8">8</xref>] and random matrix theory with principal component analysis [<xref ref-type="bibr" rid="scirp.63407-ref14">14</xref>]. We have ignored the time component of the data and treated the observations as i.i.d. A further improvement would be to take this aspect into account and model the high dimensional time series as in [<xref ref-type="bibr" rid="scirp.63407-ref15">15</xref>].</p></sec><sec id="s7"><title>Cite this paper</title><p>Rituparna Sen, Pulkit Gupta, Debanjana Dey (2016) High Dimensionality Effects on the Efficient Frontier: A Tri-Nation Study. Journal of Data Analysis and Information Processing, 4, 13-20. doi: 10.4236/jdaip.2016.41002</p></sec></body><back><ref-list><title>References</title><ref id="scirp.63407-ref1"><label>1</label><mixed-citation publication-type="other" xlink:type="simple">El Karoui, N. (2010) High-Dimensionality Effects in the Markowitz Problem and Other Quadratic Programs with Linear Constraints: Risk Underestimation. The Annals of Statistics, 38, 3487-3566. 
http://dx.doi.org/10.1214/10-AOS795</mixed-citation></ref><ref id="scirp.63407-ref2"><label>2</label><mixed-citation publication-type="other" xlink:type="simple">Winston, K. (1993) The Efficient Index and Prediction of Portfolio Variance. Journal of Portfolio Management, 19, 27-34. http://dx.doi.org/10.3905/jpm.1993.409446</mixed-citation></ref><ref id="scirp.63407-ref3"><label>3</label><mixed-citation publication-type="other" xlink:type="simple">Haugen, R. and Baker, N. (1991) The Efficient Market Inefficiency of Capitalization-Weighted Stock Portfolios. Journal of Portfolio Management, 17, 35-40. http://dx.doi.org/10.3905/jpm.1991.409335</mixed-citation></ref><ref id="scirp.63407-ref4"><label>4</label><mixed-citation publication-type="other" xlink:type="simple">Markowitz, H. (1952) Portfolio Selection. The Journal of Finance, 7, 77-91. 
http://dx.doi.org/10.1111/j.1540-6261.1952.tb01525.x</mixed-citation></ref><ref id="scirp.63407-ref5"><label>5</label><mixed-citation publication-type="other" xlink:type="simple">El Karoui, N. (2013) On the Realized Risk of High-Dimensional Markowitz Portfolios. SIAM Journal on Financial Mathematics, 4, 737-783. http://dx.doi.org/10.1137/090774926</mixed-citation></ref><ref id="scirp.63407-ref6"><label>6</label><mixed-citation publication-type="other" xlink:type="simple">DeMiguel, V., Garlappi, L. and Uppal, R. (2009) Optimal versus Naive Diversification: How Inefficient Is the 1/n Portfolio Strategy? Review of Financial Studies, 22, 1915-1953. http://dx.doi.org/10.1093/rfs/hhm075</mixed-citation></ref><ref id="scirp.63407-ref7"><label>7</label><mixed-citation publication-type="other" xlink:type="simple">Jagannathan, R. and Ma, T. (2003) Risk Reduction in Large Portfolios: Why Imposing the Wrong Constraints Helps. Journal of Finance, 58, 1651-1684. http://dx.doi.org/10.1111/1540-6261.00580</mixed-citation></ref><ref id="scirp.63407-ref8"><label>8</label><mixed-citation publication-type="other" xlink:type="simple">Ledoit, O. and Wolf, M. (2004) A Well-Conditioned Estimator for Large-Dimensional Covariance Matrices. Journal of Multivariate Analysis, 88, 365-411. http://dx.doi.org/10.1016/S0047-259X(03)00096-4</mixed-citation></ref><ref id="scirp.63407-ref9"><label>9</label><mixed-citation publication-type="other" xlink:type="simple">Fan, J., Zhang, J. and Yu, K. (2012) Vast Portfolio Selection with Gross-Exposure Constraints. Journal of the American Statistical Association, 107, 592-606. http://dx.doi.org/10.1080/01621459.2012.682825</mixed-citation></ref><ref id="scirp.63407-ref10"><label>10</label><mixed-citation publication-type="other" xlink:type="simple">DeMiguel, V., Garlappi, L., Nogales, F.J. and Uppal, R. (2009) A Generalized Approach to Portfolio Optimization: Improving Performance by Constraining Portfolio Norms. 
Management Science, 55, 798-812.</mixed-citation></ref><ref id="scirp.63407-ref11"><label>11</label><mixed-citation publication-type="other" xlink:type="simple">Efron, B., Hastie, T., Johnstone, I. and Tibshirani, R. (2004) Least Angle Regression (with Discussions). The Annals of Statistics, 32, 407-499.</mixed-citation></ref><ref id="scirp.63407-ref12"><label>12</label><mixed-citation publication-type="other" xlink:type="simple">Efron, B. and Gong, G. (1983) A Leisurely Look at the Bootstrap, the Jackknife, and Cross-Validation. The American Statistician, 37, 36-48.</mixed-citation></ref><ref id="scirp.63407-ref13"><label>13</label><mixed-citation publication-type="other" xlink:type="simple">Hoerl, A.E. and Kennard, R.W. (1970) Ridge Regression: Biased Estimation for Nonorthogonal Problems. Technometrics, 12, 55-67. http://dx.doi.org/10.1080/00401706.1970.10488634</mixed-citation></ref><ref id="scirp.63407-ref14"><label>14</label><mixed-citation publication-type="other" xlink:type="simple">Bai, Z., Liu, H. and Wong, W. (2009) Enhancement of the Applicability of Markowitz's Portfolio Optimization by Utilizing Random Matrix Theory. Mathematical Finance, 19, 639-667. 
http://dx.doi.org/10.1111/j.1467-9965.2009.00383.x</mixed-citation></ref><ref id="scirp.63407-ref15"><label>15</label><mixed-citation publication-type="other" xlink:type="simple">Liu, H., Aue, A. and Paul, D. (2015) On the Marcenko-Pastur Law for Linear Time Series. The Annals of Statistics, 43, 675-712. http://dx.doi.org/10.1214/14-AOS1294</mixed-citation></ref></ref-list></back></article>