Financial investments are based on future expectations regarding returns and risks. Historical data provides guidelines for estimating both. While expected returns are often adjusted up or down based on contemporaneous information not captured by past data, measures of risk invariably rely on historical data and vary primarily with the time horizon used in the calculation. The risk of a single asset is measured by its variance, or by the square root of the variance (the standard deviation). The risk of a portfolio is measured by the standard deviations of its assets together with their covariances, and the correlation between any two assets is the ratio of their covariance to the product of their standard deviations. When moving from two assets to a portfolio of multiple assets, the corresponding formulae are applied by expressing these statistical terms as vectors and matrices and applying linear algebra. Neat, tidy, and easy to program a computer to do. Alas, this occurs so routinely that it is done without thought.
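For reference, the standard textbook relations described in this paragraph can be written compactly. The notation is an assumption introduced here for illustration: $\sigma_i$ is the standard deviation of asset $i$, $\sigma_{ij}$ the covariance between assets $i$ and $j$, $\rho_{ij}$ their Pearson correlation, $\mathbf{w}$ the vector of portfolio weights, and $\Sigma$ the covariance matrix.

$$
\rho_{ij} = \frac{\sigma_{ij}}{\sigma_i \, \sigma_j},
\qquad
\sigma_p^2 = \mathbf{w}^\top \Sigma \, \mathbf{w} = \sum_i \sum_j w_i \, w_j \, \sigma_i \, \sigma_j \, \rho_{ij}
$$

Every off-diagonal term of the portfolio variance carries a Pearson correlation, so any weakness in that measure flows directly into the portfolio risk estimate.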

However, a standard deviation is a proper measure of variability only when the underlying variable can be assumed to follow a symmetric distribution, and correlation as invariably measured using the classic Pearson formula is a proper measure of statistical association only when the underlying association is linear. While there is theory to support return (or log-return) symmetry, there is no theory to support linearity.

Pearson correlation is the measure of association taught in every introductory course. Sometime later one might be introduced to the Spearman rank-order correlation, although within investment circles even respected websites like *Investopedia* (https://www.investopedia.com/terms/c/correlationcoefficient.asp) do not suggest alternatives to classic Pearson, and claim: “*By adding a low or negatively correlated mutual fund to an existing portfolio, the investor gains **diversification** benefits*”.

But do they? Can classic Pearson estimates be trusted without considering alternatives, and without at least visually inspecting the pairwise association over time of a pair of asset returns? There are several alternatives, each with the potential to better bring out the true underlying association between and among variables that deviate from a linear pattern. The R software has functions to measure several alternatives. “R” is a freely-available open-source statistical computing and visualization environment that you can download or access from the cloud, and it is accompanied by about 60,000 “packages” that implement ideas across disciplines. To access “R” or learn more about it, click the “Computing Resources” link within the “Resource Links” tab at our website (https://retirementfinance.org/resources/). Below is a list of five correlation measures that are implemented in “R” and well-documented for ease of use.

| Five Measures of Correlation | R Package | Package Function |
|---|---|---|
| Pearson Correlation | stats | cor() |
| Spearman Correlation | pspearman | pspearman() |
| Hoeffding Correlation | Hmisc | hoeffd() |
| Maximal Correlation | acepack | ace() |
| Distance Correlation | energy | dcor() |

Following are examples where reliance on classic Pearson correlation may mislead. For each, three correlation measures are provided: classic Pearson (with which you likely are familiar); Spearman correlation (which substitutes rank-orders for the raw data, thereby tempering the impact of outliers); and Maximal correlation (which considers various transformations separately for each variable and then selects the two transformations that provide the highest correlation between the two transformed variables). Maximal correlation equals 0.00 when the variables are independent, and equals 1.00 when the Pearson correlation between the optimally transformed pair of variables equals 1.00. Maximal correlation measures strength of relationship, not direction (since patterns can be multidirectional). So, for any pair of asset returns measured over time, Pearson correlation captures linear trends, Spearman correlation captures monotone trends, and Maximal correlation captures complex trends.
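The difference between a linear and a monotone trend can be sketched in a few lines. The example below is a minimal illustration in plain Python rather than R; the helper names `pearson`, `average_ranks`, and `spearman` and the exponential test data are assumptions made for this sketch, not part of any package.

```python
import math

def pearson(x, y):
    """Classic Pearson correlation of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def average_ranks(values):
    """Rank the values from 1..n, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # average of the 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks

def spearman(x, y):
    """Spearman correlation: Pearson correlation applied to the ranks."""
    return pearson(average_ranks(x), average_ranks(y))

# Illustrative (made-up) data: y rises with x, but exponentially, not linearly.
x = [i / 10 for i in range(1, 51)]
y = [math.exp(v) for v in x]

r_pearson = pearson(x, y)
r_spearman = spearman(x, y)
print(f"Pearson:  {r_pearson:.3f}")
print(f"Spearman: {r_spearman:.3f}")
```

Because the relationship is perfectly monotone, the rank-based measure reaches 1 while Pearson falls short of it. Neither measure would detect a pattern that rises and then falls, which is where maximal correlation becomes relevant.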

**Example A: The Anscombe Quartet**

Anscombe (https://garstats.files.wordpress.com/2016/08/anscombe-as-1973.pdf) used simulated data to demonstrate the power of visual observation. He created four variants of data pairings, each with an outcome (y) and a predictor (x). All sets have 11 observations, the same mean of x (9) and y (7.5), the same fitted regression line (y = 3 + 0.5 x), the same regression and residual sum of squares and therefore the same r-squared (0.67) and same classic Pearson correlation (0.816). The summary measures are the same across the four pairings, but the situations vary greatly. Pairing #1 conforms to a pure linear relationship. #2 is nonlinear, #3 has an outlier, and #4 has an influential point. The four displays below make clear these are four different scenarios.
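The matching summary statistics can be checked directly. The sketch below is a plain-Python illustration using the quartet values as they are commonly reproduced (for example in R's built-in `anscombe` dataset); the helper name `summarize` is an assumption made for this example.

```python
import math

def summarize(x, y):
    """Return (Pearson r, slope, intercept) for a least-squares line of y on x."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    slope = sxy / sxx
    return sxy / math.sqrt(sxx * syy), slope, my - slope * mx

# Pairings #1 to #3 share the same x values; pairing #4 has its own.
x123 = [10, 8, 13, 9, 11, 14, 6, 4, 12, 7, 5]
x4 = [8, 8, 8, 8, 8, 8, 8, 19, 8, 8, 8]
quartet = {
    1: (x123, [8.04, 6.95, 7.58, 8.81, 8.33, 9.96, 7.24, 4.26, 10.84, 4.82, 5.68]),
    2: (x123, [9.14, 8.14, 8.74, 8.77, 9.26, 8.10, 6.13, 3.10, 9.13, 7.26, 4.74]),
    3: (x123, [7.46, 6.77, 12.74, 7.11, 7.81, 8.84, 6.08, 5.39, 8.15, 6.42, 5.73]),
    4: (x4, [6.58, 5.76, 7.71, 8.84, 8.47, 7.04, 5.25, 12.50, 5.56, 7.91, 6.89]),
}

results = {k: summarize(x, y) for k, (x, y) in quartet.items()}
for k, (r, slope, intercept) in results.items():
    print(f"Pairing #{k}: r = {r:.3f}, fitted line y = {intercept:.2f} + {slope:.2f} x")
```

The printed figures agree across the four pairings to the precision quoted above, which is exactly why the numbers alone cannot distinguish the four scenarios.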

**Example B: A Messy Arc**

An example of a perfect downward bending arc can be formed by letting x range from -100 to +100, and setting y equal to the square root of 100 squared minus x squared.

In this example, both the classic Pearson and the Spearman rank correlation measures equal 0.00, whereas the maximal correlation coefficient equals 1.00 (more precisely, 0.999817).
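These values can be reproduced approximately without specialized software. The sketch below is plain Python and does not run the ACE search behind maximal correlation. Instead, as an assumption made for illustration, it hand-picks one pair of transformations (squaring both variables) to show that some transformation yields a correlation of magnitude 1. Integer steps for x and the helper names are also assumptions.

```python
import math

def pearson(x, y):
    """Classic Pearson correlation of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def average_ranks(values):
    """Rank the values from 1..n, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        for k in range(i, j + 1):
            ranks[order[k]] = (i + j) / 2 + 1
        i = j + 1
    return ranks

# The arc: x from -100 to +100, y = sqrt(100^2 - x^2).
x = list(range(-100, 101))
y = [math.sqrt(100 ** 2 - v ** 2) for v in x]

r_pearson = pearson(x, y)
r_spearman = pearson(average_ranks(x), average_ranks(y))
# Hand-picked transformations (not an ACE fit): since y^2 = 100^2 - x^2,
# the squared variables are exactly linearly related.
r_transformed = pearson([v ** 2 for v in x], [v ** 2 for v in y])

print(f"Pearson:            {r_pearson:.3f}")
print(f"Spearman:           {r_spearman:.3f}")
print(f"|r| after squaring: {abs(r_transformed):.3f}")
```

The symmetry of the arc cancels the linear and rank-based measures, while the transformed pair is perfectly (negatively) correlated. This is consistent with a maximal correlation near 1.00, which reports strength rather than sign.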

A more plausible relationship between asset returns might add noise to y, adding an unbiased random normal element to each pairing with standard deviation set to 10. In this modified example, shown below, it is clear that the relationship between returns for Asset x and returns for Asset y is nonlinear, better captured by the arc than by a straight line.
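One way to quantify "better captured by the arc" is to compare residual sums of squares. The sketch below is a plain-Python illustration under stated assumptions: integer steps for x, an arbitrary random seed, and a comparison between the least-squares straight line and the noise-free arc itself (not a fitted curve).

```python
import math
import random

rng = random.Random(2020)  # arbitrary seed, chosen only for reproducibility

x = list(range(-100, 101))
arc = [math.sqrt(100 ** 2 - v ** 2) for v in x]
# Add unbiased normal noise with standard deviation 10 to each y value.
y = [a + rng.gauss(0.0, 10.0) for a in arc]

# Least-squares straight line of y on x.
n = len(x)
mx, my = sum(x) / n, sum(y) / n
slope = sum((a - mx) * (b - my) for a, b in zip(x, y)) / sum((a - mx) ** 2 for a in x)
intercept = my - slope * mx

# Residual sum of squares around the straight line and around the arc.
rss_line = sum((b - (intercept + slope * a)) ** 2 for a, b in zip(x, y))
rss_arc = sum((b - a) ** 2 for a, b in zip(arc, y))

print(f"Residual sum of squares, straight line: {rss_line:,.0f}")
print(f"Residual sum of squares, arc:           {rss_arc:,.0f}")
```

The straight line leaves most of the variation in y unexplained, whereas the residuals around the arc reflect little more than the added noise.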

Mechanical application of modern portfolio theory will include classic Pearson correlation as a key ingredient in estimating portfolio risk, clearly understating the degree of association.

Pearson (and Spearman) correlation is measured in the [-1, +1] range, with the sign indicating the direction of the linear (or monotonic) association. In contrast, maximal correlation is expressed within the [0, +1] range, expressing strength of association without regard for direction. As seen in this example, the direction is positive for low x and negative for high x.

**Example C: Pennington Fund v. S&P500**

The Pennington Fund (https://fund.pennington-trading.com/) features a managed trading platform (MTP) with, in their words, returns that “*have historically had a near-zero relation to the S&P500*”. Their portfolio is a mix of blue chip stocks, initial public offerings (IPOs), options, real estate investment trusts (REITs), a commodities fund, and a client investment margin account. Over a recent 55-month period both the classic Pearson and the Spearman rank correlation measures indeed were near zero — but the maximal correlation coefficient (0.352) indicates potentially far less diversification than presumed.

**Example D: Asset and Sector ETFs**

Clearly, mechanical or robotic application of classic Pearson correlation measures to assess degree of diversification among assets or asset classes can be misleading. It nonetheless is done routinely by financial advisors and professionals, and reported without recognition of caveats such as the potential for curvatures, outliers, influential points or other empirical possibilities that are assumed away by Pearson correlations.

The *Portfolio Visualizer* website (https://www.portfoliovisualizer.com/asset-class-correlations) provides a correlation matrix for 14 common exchange-traded funds (ETFs), representing typical asset classes and subclasses. Asset class Pearson correlations are for the time period 05/01/2009 – 08/31/2020, based on monthly returns and assuming linear relations.

| Ticker | Name |
|---|---|
| VTI | Vanguard Total Stock Market ETF |
| VO | Vanguard Mid-Cap ETF |
| VB | Vanguard Small-Cap ETF |
| SHY | iShares 1-3 Year Treasury Bond ETF |
| BND | Vanguard Total Bond Market ETF |
| TLT | iShares 20+ Year Treasury Bond ETF |
| TIP | iShares TIPS Bond ETF |
| MUB | iShares National Muni Bond ETF |
| VEU | Vanguard FTSE All-Wld ex-US ETF |
| VSS | Vanguard FTSE All-Wld ex-US SmCp ETF |
| VWO | Vanguard FTSE Emerging Markets ETF |
| VNQ | Vanguard Real Estate ETF |
| DBC | Invesco DB Commodity Tracking |
| GLD | SPDR Gold Shares |

**Conclusion**

Before investing on the basis of this information, compute maximal correlations; while the truth may well lie somewhere between the two correlation estimates, collecting and plotting data pairs likely will be informative, and may suggest less diversification value.

Especially for those nearing or in retirement, a conservative approach to investing in risky portfolios likely is prudent, as there may not be opportunities to recover losses. In such cases a conservative approach would be to measure risk using Maximal correlations.