Plotting Financial Data With Python: Efficient Frontier (N assets)

We already have the efficient frontier script that we created in the previous post, but it has one major limitation: it cannot plot more than two assets. Two assets are enough to see diversification in action, but a portfolio that consists of only two assets is not practical. In this post we’re going to extend the previous script to support an arbitrary number of assets.

This is the fifth part of the “Plotting Financial Data With Python” series and it’s best to read the parts in order:

1. Part 1 - History
2. Part 2 - Variance
3. Part 3 - Comparing Returns
4. Part 4 - Efficient Frontier (2 Assets)
5. Part 5 - Efficient Frontier (N Assets) (you are here)

Source Code

You can get the full source code here: https://github.com/bubelov/market-plots

Why Diversify?

Diversification helps to reduce portfolio volatility, but to what extent? It depends on the correlations between the assets, but we can safely assume that the number of assets should be greater than two. If you decide to add another asset, the fewer assets you already have in your portfolio, the larger the effect of diversification.

[Figure: portfolio risk as a function of the number of assets]

One thing is clear: having two assets does not allow us to get all of the benefits of diversification. There are many opinions on what number of assets is “right”, but almost everyone agrees that two is far too low.
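To get a feel for the numbers, here is a minimal sketch based on a simplified model that is not part of the original script. It assumes an equally weighted portfolio in which every asset has the same standard deviation and every pair of assets has the same correlation. Under those assumptions the portfolio variance reduces to σ²/N + (1 − 1/N)ρσ². The 20% deviation and the 0.3 correlation are made-up values chosen only for illustration.

```python
import numpy as np


def equal_weight_portfolio_std(n_assets, asset_std, correlation):
    """Standard deviation of an equally weighted portfolio.

    Simplifying assumptions (hypothetical): all assets share the same
    standard deviation and every pair has the same correlation.
    """
    asset_variance = asset_std ** 2
    variance = (asset_variance / n_assets
                + (1 - 1 / n_assets) * correlation * asset_variance)
    return np.sqrt(variance)


# Made-up inputs: 20% standard deviation per asset, 0.3 pairwise correlation
for n in (1, 2, 5, 10, 50, 200):
    std = equal_weight_portfolio_std(n, asset_std=0.20, correlation=0.3)
    print(f'{n:>3} assets: {std:.2%}')
```

In this toy model the risk drops quickly over the first few assets and then flattens out: it can never fall below σ·√ρ, which is about 11% for the values above.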

Goal

We already calculated the efficient frontier for a portfolio that consists of the IBM and DIS stocks. Let’s add one more stock to it. You can pick any stock or an index but I’ll go with Coca-Cola (KO).

So, how do we calculate our risks and rewards?

Expected Return

Here is how we can calculate the expected return on a portfolio:

$$E(R_p) = \sum_{i=1}^N w_i E(R_i)$$

Where:

$$E(R_p)$$ = expected return on a portfolio

$$N$$ = number of assets in a portfolio

$$w_i$$ = weight of an asset i in a portfolio

$$E(R_i)$$ = expected return on asset i

All of this is pretty simple: we just need to find the weighted average of the expected returns of every asset in a portfolio.
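Here is a small sketch of that weighted average with NumPy. The returns and the weights are made-up numbers, not real market data.

```python
import numpy as np

# Hypothetical mean monthly returns for three assets (made-up numbers)
mean_returns = np.array([0.010, 0.008, 0.006])

# Portfolio weights, they have to sum to 1
weights = np.array([0.5, 0.3, 0.2])

# Weighted average of the asset returns
expected_return = np.dot(weights, mean_returns)

print(f'Expected portfolio return: {expected_return:.2%}')
```

With these inputs the portfolio return works out to 0.5 × 1.0% + 0.3 × 0.8% + 0.2 × 0.6% = 0.86%.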

Variance

Variance is a bit more tricky to calculate because we have to include the correlations between each pair of assets:

$$σ_p^2 = \sum_{i=1}^N w_i^2 σ_i^2 + \sum_{i=1}^N \sum_{j=1, j \neq i}^N w_i w_j σ_i σ_j ρ_{ij}$$

Where:

$$σ_p^2$$ = portfolio variance

$$w_i$$ = weight of an asset i in a portfolio

$$σ_i$$ = standard deviation of an asset i

$$ρ_{ij}$$ = correlation of returns between the assets i and j

Note that the double sum runs over ordered pairs, so each pair of assets is counted twice: once as (i, j) and once as (j, i).
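The formula can be checked with a short sketch. It uses synthetic, randomly generated returns instead of real price data, and the weights are arbitrary. It evaluates the double sum literally and then compares the result with the equivalent covariance-matrix form, wᵀ·Cov·w, as a sanity check.

```python
import numpy as np

# Synthetic monthly returns: 120 periods (rows) for 3 assets (columns)
rng = np.random.default_rng(seed=42)
returns = rng.normal(loc=0.01, scale=0.05, size=(120, 3))

# Arbitrary example weights that sum to 1
weights = np.array([0.5, 0.3, 0.2])

# Population standard deviations (ddof=0), the same convention as np.var
deviations = returns.std(axis=0)
correlations = np.corrcoef(returns, rowvar=False)

# First term: weighted individual variances
variance = np.sum(weights ** 2 * deviations ** 2)

# Second term: every ordered pair (i, j) with i != j
n_assets = len(weights)
for i in range(n_assets):
    for j in range(n_assets):
        if i != j:
            variance += (weights[i] * weights[j]
                         * deviations[i] * deviations[j]
                         * correlations[i, j])

# Cross-check: the same quantity from the covariance matrix
covariance = np.cov(returns, rowvar=False, ddof=0)
variance_from_covariance = weights @ covariance @ weights

print(f'Variance (double sum):      {variance:.8f}')
print(f'Variance (covariance form): {variance_from_covariance:.8f}')
print(f'Standard deviation:         {np.sqrt(variance):.4%}')
```

Both numbers should agree up to floating point error, because the covariance of two assets equals the product of their standard deviations and their correlation.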

Standard Deviation

Standard deviation of a portfolio is just the square root of its variance:

$$σ_p = (σ_p^2)^{1 \over 2}$$

It is expressed in the same units as the returns, which makes it a convenient measure of the portfolio riskiness.

Implementation

Let’s create a new file and call it frontier.py:

    import matplotlib.pyplot as plt
    import sys
    import pathlib
    import numpy as np
    import alpha_vantage


    def show_frontier(symbols, interval='MONTHLY'):
        #print(f'Symbols: {symbols}')

        # Fetch the return history of every symbol and track the shortest one
        returns_history = dict()
        min_length = None

        for symbol in symbols:
            history = alpha_vantage.get_stock_returns_history(symbol, interval)
            #print(f'Fetched {len(history)} records for symbol {symbol}')

            if min_length is None or len(history) < min_length:
                min_length = len(history)

            returns_history[symbol] = history

        #print(f'Min history length = {min_length}')

        # Trim every history to the same length so the series line up
        for symbol in symbols:
            returns_history[symbol] = returns_history[symbol][-min_length:]

        #for symbol in symbols:
        #    print(
        #        f'History for symbol {symbol} has {len(returns_history[symbol])} records')

        # Per-asset statistics
        mean_returns = dict()
        variances = dict()
        standard_deviations = dict()

        for symbol in symbols:
            history = returns_history[symbol]
            history_length = len(history)
            #print(f'Return history for symbol {symbol} has {history_length} records')
            mean_returns[symbol] = np.mean(history)
            variances[symbol] = np.var(history)
            standard_deviations[symbol] = np.sqrt(variances[symbol])

        portfolio_returns = []
        portfolio_deviations = []

        # Generate 1,000 portfolios with random weights
        for _ in range(0, 1_000):
            # Random weights, normalized so that they sum to 1
            randoms = np.random.random_sample((len(symbols),))
            weights = [random / sum(randoms) for random in randoms]

            expected_return = sum([weights[i] * mean_returns[symbol]
                                   for i, symbol in enumerate(symbols)])

            # First term of the variance formula: weighted individual variances
            weights_times_deviations = [
                weights[i]**2 * standard_deviations[symbol]**2
                for i, symbol in enumerate(symbols)]

            variance = sum(weights_times_deviations)

            # Second term: one contribution for every ordered pair of assets
            for i in range(0, len(symbols)):
                for j in range(0, len(symbols)):
                    if (i != j):
                        symbol1 = symbols[i]
                        symbol2 = symbols[j]
                        #print('Pair = %s %s' % (symbol1, symbol2))

                        weight1 = weights[i]
                        weight2 = weights[j]
                        #print('Weights = %s %s' % (weight1, weight2))

                        deviation1 = standard_deviations[symbol1]
                        deviation2 = standard_deviations[symbol2]
                        #print('Deviations = %s %s' % (deviation1, deviation2))

                        # np.corrcoef returns a 2x2 matrix, the pairwise
                        # correlation is the off-diagonal element
                        correlation = np.corrcoef(
                            returns_history[symbol1],
                            returns_history[symbol2])[0, 1]
                        #print('Correlation = %f' % correlation)

                        additional_variance = weight1 * weight2 * deviation1 * deviation2 * correlation
                        #print('Additional variance = %f' % additional_variance)

                        variance += additional_variance

            standard_deviation = np.sqrt(variance)
            #print('Portfolio expected return = %f' % expected_return)
            #print('Portfolio standard deviation = %f' % standard_deviation)

            plt.scatter(standard_deviation, expected_return, color='#007bff')

            portfolio_returns.append(expected_return)
            portfolio_deviations.append(standard_deviation)

        # Add a small margin around the plotted points
        x_padding = np.average(portfolio_deviations) / 25
        plt.xlim(min(portfolio_deviations) - x_padding,
                 max(portfolio_deviations) + x_padding)

        y_padding = np.average(portfolio_returns) / 25
        plt.ylim(min(portfolio_returns) - y_padding,
                 max(portfolio_returns) + y_padding)

        # Show the axis ticks as percentages
        plt.gca().set_xticklabels(['{:.2f}%'.format(x*100)
                                   for x in plt.gca().get_xticks()])
        plt.gca().set_yticklabels(['{:.2f}%'.format(y*100)
                                   for y in plt.gca().get_yticks()])

        plt.title(f'Efficient Frontier {symbols}')
        plt.xlabel('Risk')
        plt.ylabel('Return')

        pathlib.Path('img/frontier').mkdir(parents=True, exist_ok=True)
        plt.savefig('img/frontier/frontier.png')
        plt.close()


    show_frontier(sys.argv[1:])
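The nested loops in the script recompute every pairwise correlation for each of the 1,000 portfolios. One possible alternative, which is not taken from the repository, is to compute the covariance matrix once and let NumPy evaluate all portfolios at the same time. The sketch below makes a few assumptions: the random_portfolios function is hypothetical, and the synthetic returns stand in for the histories that the script downloads.

```python
import numpy as np


def random_portfolios(returns, n_portfolios=1_000, seed=None):
    """Return (deviations, expected_returns) for random long-only portfolios.

    `returns` is a 2-D array: one row per period, one column per asset.
    This helper is a hypothetical illustration, not part of the script.
    """
    rng = np.random.default_rng(seed)
    n_assets = returns.shape[1]

    # Random positive weights, each row normalized to sum to 1
    randoms = rng.random((n_portfolios, n_assets))
    weights = randoms / randoms.sum(axis=1, keepdims=True)

    mean_returns = returns.mean(axis=0)
    covariance = np.cov(returns, rowvar=False, ddof=0)

    expected_returns = weights @ mean_returns
    # Row-wise w^T * Cov * w, one variance per portfolio
    variances = np.einsum('pi,ij,pj->p', weights, covariance, weights)

    return np.sqrt(variances), expected_returns


# Synthetic data standing in for the downloaded return histories
data_rng = np.random.default_rng(seed=0)
synthetic_returns = data_rng.normal(loc=0.01, scale=0.05, size=(120, 3))

deviations, expected = random_portfolios(synthetic_returns, seed=1)
print(deviations.shape, expected.shape)
```

The two returned arrays can be passed to a single plt.scatter call instead of calling it once per portfolio.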

Testing

Now, let’s run our new script in order to see the efficient frontier:

python frontier.py IBM DIS KO

The script saves the resulting plot to img/frontier/frontier.png.

Conclusion

Now we are able to plot the efficient frontier based on an arbitrary number of assets. Please note that nothing is “for sure” in the world of investing and this model has a lot of limitations, although it’s probably the best model that is currently available. Our expected return is based purely on past performance, which might not be an accurate guide to the future.

Another thing to consider is the limit of diversification. The benefits of having more assets tend to wear off with each new asset added to your portfolio. There is a huge difference between the 2-asset and 10-asset portfolios but there might be no gain in having 200 assets, especially if you take into account all of the transaction costs of rebalancing your portfolio.
