In what year do you think the first U.S. Social Security check was issued?
Two people are asked this question. Neither knows the answer with certainty, so each is separately guided through a process to represent both a best guess and a degree of uncertainty, resulting in the two probability distribution curves below, which extend back to 1776. Which person did “better”?
That depends on how we define “better”. Clearly, the flat-liner expressed much less confidence, but perhaps the confidence exhibited by the narrow-bander is misplaced. Let’s look at how each responded and performed.
Neither knew the answer offhand, so each was asked to pick two extreme values, the minimum and maximum plausible values, and a single best guess, the 50th percentile, such that in that person’s view the correct answer was equally likely to be above or below the estimate. Next, each was asked to express a degree of uncertainty by specifying a 25th percentile and a 75th percentile that properly reflect that uncertainty in the subjective distribution of possible answers. One way to arrive at these values is to count upward. Surely each would agree there is a 100% likelihood the true answer is greater than 1 A.D., greater than 2 A.D., and so on, eventually reaching values believed highly implausible but nonetheless possible. Continue until reaching a value such that the respondent believes there is about a 25% chance the true value is less than it and a 75% chance it is larger. That is Q1, a personal first quartile. Continue on until reaching Q2, a number representing an equal 50% chance the true value is lower or higher, and finally to Q3, signifying the personal view that there is a 75% chance the number is lower and a 25% chance it is higher. Notice that the respondent should be equally confident that the true number is inside the range between Q1 and Q3 as that it is outside that range. That middle 50% of the distribution is a subjective interquartile range (IQR). Overconfidence would be exhibited by a general tendency for subjective IQRs to exclude the true value more frequently than they include it, so that the true value lies somewhere below Q1 or above Q3. For each person, we now have a five-number summary, which serves as the parameters of a distribution.
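The calibration idea above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the names FiveNumberSummary, iqr_contains and iqr_hit_rate are hypothetical and are not part of any tool described in this article. The sketch stores one respondent's five-number summary, checks whether a known true value falls inside the subjective IQR, and reports the share of questions on which it did.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FiveNumberSummary:
    """Hypothetical container for one subjective estimate."""
    minimum: float
    q1: float
    q2: float
    q3: float
    maximum: float

    def __post_init__(self):
        # The five values must be ordered from smallest to largest.
        values = [self.minimum, self.q1, self.q2, self.q3, self.maximum]
        if values != sorted(values):
            raise ValueError("five-number summary must be non-decreasing")

    def iqr_contains(self, truth):
        """True if the true value lies between Q1 and Q3 (inclusive)."""
        return self.q1 <= truth <= self.q3


def iqr_hit_rate(pairs):
    """Share of (summary, truth) pairs whose IQR contains the truth.

    A well-calibrated respondent should land near 0.5 over many questions;
    a rate well below 0.5 is the pattern described above as overconfidence.
    """
    hits = sum(summary.iqr_contains(truth) for summary, truth in pairs)
    return hits / len(pairs)
```

A hit rate only becomes informative over many questions, since on any single question a calibrated respondent misses the IQR half the time.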
Ms. Bias recalls with assuredness that the Social Security program was a creation of Franklin Delano Roosevelt during his presidency, which she states ran from 1933 to 1945. She believes the program likely started after his first year in office but before the Pearl Harbor bombing, and was equally likely to have occurred inside or outside a two-year window. Her five-number summary is:
The probability density function (pdf) and its cumulative values (cdf):
Mr. Variance confesses to having no idea. Pressed to offer something, he realizes that the answer must fall between 1776 and 2020, since it is a program of the United States government. That’s a start. Pressed further, he suggests the program likely was initiated following a significant historical event, lists five candidate events, and deems each event equally likely to have resulted in the initiation of Social Security. He maps out his thoughts, and decides those five dates correspond to the five-number summary that reflects his subjective beliefs:
So, Mr. Variance believes the actual date is equally likely to be before or after 1918, with a 50% chance it happened between 1865 and 1945, and a 25% probability each for the 1776-1865 and 1945-1965 intervals. Applying these parameters to the general lognormal distribution results in the following five-number summary, subjective probability density, and cumulative probability distribution. Note that the initial estimates are slightly adjusted. That is because the five-number summary is a set of parameters that are fed into a program to simulate the distribution, and the simulated values do not reproduce the inputs exactly.
In fact, as part of his “New Deal”, it indeed was FDR who signed the Social Security Act of 1935. Although the first SSA office opened in 1936 and the first Social Security taxes were collected in 1937, the first check was not issued until January 31, 1940. The recipient of that first check was Ida May Fuller, a legal secretary who paid a total of $24.75 in Social Security taxes during 1937-38-39, then received that first check for $22.54 upon turning 65. In an ironic precursor of financial strains on the program, she lived to age 100, collecting $22,888.92 from SSA over 35 years.
Truly unbiased subjective curves would be centered at 1940, reflecting a belief that the actual year is as likely to be later as earlier than 1940. Based on their respective belief curves, for Mr. Variance 70.0% of the area under the simulated curve fell earlier than 1940, whereas 95.7% did so for Ms. Bias. Both are biased toward an earlier year, but Ms. Bias far more so.
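The share of belief falling before 1940 can be roughly checked for Mr. Variance without the lognormal simulation. The sketch below is not the method used for the figures above. It assumes, purely for illustration, a piecewise-linear cumulative distribution drawn through his five stated dates (1776, 1865, 1918, 1945, 1965), and the function names cdf and sample are hypothetical. It computes the share before 1940 directly and then again from 10,000 simulated draws.

```python
import random

# Mr. Variance's five dates as stated in the text, paired with the
# cumulative probabilities of a five-number summary.
YEARS = [1776, 1865, 1918, 1945, 1965]
PROBS = [0.0, 0.25, 0.50, 0.75, 1.0]


def cdf(year, xs=YEARS, ps=PROBS):
    """P(answer <= year), assuming a straight line between adjacent points."""
    if year <= xs[0]:
        return 0.0
    if year >= xs[-1]:
        return 1.0
    for x0, x1, p0, p1 in zip(xs, xs[1:], ps, ps[1:]):
        if x0 <= year <= x1:
            return p0 + (p1 - p0) * (year - x0) / (x1 - x0)


def sample(n, seed=0, xs=YEARS, ps=PROBS):
    """Inverse-transform sampling: draw u in [0, 1) and map it to a year."""
    rng = random.Random(seed)
    draws = []
    for _ in range(n):
        u = rng.random()
        for x0, x1, p0, p1 in zip(xs, xs[1:], ps, ps[1:]):
            if p0 <= u <= p1:
                draws.append(x0 + (x1 - x0) * (u - p0) / (p1 - p0))
                break
    return draws


share_exact = cdf(1940)
draws = sample(10_000)
share_sim = sum(d < 1940 for d in draws) / len(draws)
print(f"before 1940: exact {share_exact:.3f}, simulated {share_sim:.3f}")
```

Under this simpler assumption the share before 1940 comes out at roughly 70%, in line with the 70.0% reported from the lognormal simulation, although the two approaches need not agree in general.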
Overconfidence is characterized by subjective confidence in judgements or beliefs that exceeds the demonstrably objective accuracy of those judgements or beliefs. Psychology literature suggests it is far more common than being properly calibrated, and evidence of underconfidence is relatively sparse.
More Complex Example
Let’s work through a more complex example. You are asked to estimate the dollar value of all distributions of U.S. Social Security payments to beneficiaries during May, 2020. Assume you had responded Q1 = $1 billion, Q2 = $5 billion, and Q3 = $50 billion. I can conclude from this that your IQR does not include the true answer, but cannot from this conclude you are overconfident. After all, if you are properly calibrated, being inside/outside the IQR is a 50/50 proposition.
Given sufficient opportunity, time and access to resources, one could construct an informed estimate of the dollar value of all distributions of U.S. Social Security payments to beneficiaries during May, 2020. There are eleven types of recipients, and for each there is a record of the number of beneficiaries and the actual payout to each in May 2020. To multiply the number of beneficiaries times the average payout for each category of recipients requires 22 input values, and the sum of the 11 products equals a total payout by Social Security of about $90 billion ($89,939,234,989.04) during the month of May, 2020.
There is a compromise strategy, more elegant than the loosely-anchored initial estimates of Q1, Q2 and Q3, but far less computationally demanding than the explicit enumeration above. A “fast and frugal” compromise strategy is to provide subjective estimates for a small and manageable number of components that have the potential to reasonably substitute for a complete listing of categories, and that might more reasonably be estimated by an outsider than the required 22 input parameters above.
For example, one could devise a four-parameter estimation process:
1. World population
2. U.S. population as percent of world
3. U.S. Social Security recipients as percent of population
4. Average monthly payment per Social Security recipient.
* Total Outflow of Social Security Benefits ~ Product of the four parameters.
If preferred, the problem can be simplified further, replacing the first two parameters by directly asking for the U.S. population, thereby reducing to a three-parameter estimation problem (but at a cost: loss of information):
1. U.S. population
2. U.S. Social Security recipients as percent of population
3. Average monthly payment per Social Security recipient.
* Total Outflow of Social Security Benefits ~ Product of three parameters.
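The three-parameter product is simple enough to write down as a short Python sketch. The function name total_outflow is hypothetical. The example inputs are the best guesses (Q2) from the worked example in this article (a U.S. population of 320 million, 15% of whom are recipients, and $4,000 per recipient per month), and the comparison figure is the May 2020 total quoted earlier.

```python
def total_outflow(us_population, recipient_share, avg_monthly_payment):
    """Estimated monthly benefit outflow as the product of three guesses."""
    return us_population * recipient_share * avg_monthly_payment


# Actual May 2020 payout quoted in the text, in dollars.
ACTUAL = 89_939_234_989.04

# Best-guess (Q2) inputs from the worked example in this article.
estimate = total_outflow(320_000_000, 0.15, 4_000)

print(f"estimate: ${estimate / 1e9:.0f} billion")
print(f"ratio to actual: {estimate / ACTUAL:.2f}x")
```

Because the estimate is a product, a proportional error in any one input carries straight through to the total, which is why one poor component guess can dominate the result.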
Applying the latter three-parameter estimation process, you now are asked to provide Q1, Q2 and Q3 subjective estimates for each of the three parameters, as shown below. Taking the best guesses (Q2), your new estimate equals 15% of a population of 320,000 thousand (320 million), times $4,000 per recipient, equaling $192 billion, quite a leap from your original estimate of $5 billion. Error rates for each component are as follows:
A single poor estimate greatly impaired your analysis, as your average payout per Social Security recipient is almost triple the actual value. With only three parameters to estimate, a single poor choice can greatly distort results. Nobody expects highly accurate estimates, but with more parameters to estimate, the opportunity for errors to offset and provide a decent final estimate increases greatly. For example, another person might complete the same three-parameter estimation process, with these results:
In her case, the two larger errors are in opposite directions, and essentially cancel their distortions. The three component estimates each have much more error than the remarkably accurate final estimate, due to error cancellation.
Regarding the matter of assessment overconfidence, let’s return to your respective component quartile estimates and their implications. The table below displays your estimates (with actual correct answers in bold and to the far right):
We now also have four independent estimates of Q1 and Q3 for independent questions, and note that the true values are outside the respective IQRs in all four cases, with three above Q3 and one below Q1. This provides preliminary evidence of overconfidence, although it must be noted that two estimates are relatively close to the Q3 bound and the sample size of n=4 can provide no more than tentative support for overconfidence, as opposed to a collection of several such three-parameter estimation processes.
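To see why four misses out of four is suggestive but not conclusive, one can ask how often a perfectly calibrated respondent would produce the same result. The sketch below is a minimal illustration using the binomial distribution; the function name prob_at_most_k_hits is hypothetical, and it takes the four questions to be independent, as described above.

```python
from math import comb


def prob_at_most_k_hits(n, k, p=0.5):
    """Probability of k or fewer IQR hits in n independent questions,
    when each IQR contains the truth with probability p."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k + 1))


# Zero hits in four questions for a calibrated respondent (p = 0.5).
print(prob_at_most_k_hits(4, 0))  # 0.0625
```

A calibrated respondent would miss all four IQRs about 6% of the time, so the result is unusual but not rare enough to settle the matter with n=4.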
Finally, note from the table above that you also were asked to provide estimates of the far tails of your original estimate of the dollar value of all distributions of U.S. Social Security payments to beneficiaries during May, 2020. In particular, while your best guess of the value was $5 billion, you also estimated your 2nd percentile at $1 million and your 98th percentile at $50 billion. Even this range does not capture the true value, as your 98th percentile is little more than halfway to the actual value. This provides another indication of overconfidence. Also, your Q1 and Q3 estimates for the three parameters provide a means of inferring your 2nd-percentile and 98th-percentile beliefs regarding the total payout value. Since you believe there is only a 25% chance that a given true parameter value is below its Q1 estimate, and there are three parameters, then, treating the three as independent, you must believe there is a 25% * 25% * 25% (= 0.25^3 ≈ 0.016, or about 2%) chance that all three parameter values are below your respective Q1 estimates. The same applies to the other tail, at the 98th percentile. The visual display below provides an intuitive view of the calculations. Each quartile of a given parameter has an equal number of values, represented here by a 4×4 box with 16 cells per quartile. As we combine two parameters, we are left with 25% of the original cells below Q1 and above Q3, or four cells. Combining three parameters reduces the cells to 25% again, from four down to one (1/16th of 16; 1/64th of the original 64 cells).
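The tail arithmetic can be verified by brute-force counting. The sketch below assumes, as the reasoning above implicitly does, that the three parameters are independent and that each is equally likely to fall in any of its four quartiles. It enumerates every combination of quartiles and counts how many put all three parameters in the lowest one.

```python
from itertools import product

# Chance that one parameter's true value is below its Q1 estimate.
p_single = 0.25

# Chance that all three are below Q1, assuming independence.
p_all_three = p_single ** 3

# Enumerate the 4 x 4 x 4 quartile combinations (0 = below Q1, 3 = above Q3).
combos = list(product(range(4), repeat=3))
all_lowest = sum(1 for c in combos if all(q == 0 for q in c))
all_highest = sum(1 for c in combos if all(q == 3 for q in c))

print(p_all_three)                    # 0.015625
print(all_lowest, "of", len(combos))  # 1 of 64
```

One combination in 64 is about 1.6%, which the text rounds to 2%. If the parameter errors were positively correlated, the joint tail probability would be larger than this.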
Applying this logic to your estimates, the implied percentiles are shown below. Your Q1, Q2 and Q3 estimates for each of the three parameter questions imply that your best guess estimate of the dollar value of all distributions of U.S. Social Security payments to beneficiaries during May, 2020 is $192 billion — quite the contrast to the direct estimate of $5 billion when initially asked. Further, the implication is that you believe there is about a 2% chance the true value exceeds $292.5 billion, and about a 2% chance it falls below $94.5 billion. Since the actual value is below that lower limit, there is evidence that you exhibit overconfidence in expressing subjective beliefs regarding objective values.
A Standalone App
We have developed a standalone app to estimate the pdf and cdf of your subjective probability estimates to any problem by applying 10,000 iterations of a computer simulation of the general lognormal distribution, based on the five-number summary you enter.
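For readers curious what such a simulation might look like, here is a minimal sketch. It is not the app's implementation, which is not shown in this article. It makes a simplifying assumption: a plain two-parameter lognormal is fitted to Q1 and Q3 only, so the minimum, Q2 and maximum of the five-number summary are ignored and the implied median is the geometric mean of Q1 and Q3. The names fit_lognormal and simulate are hypothetical. The example inputs are the direct estimates from the Social Security payout question, Q1 = $1 billion and Q3 = $50 billion, expressed in billions.

```python
import math
import random
from statistics import NormalDist

# z-score of the 75th percentile of a standard normal, about 0.674.
Z75 = NormalDist().inv_cdf(0.75)


def fit_lognormal(q1, q3):
    """Return (mu, sigma) of a lognormal whose quartiles are q1 and q3."""
    mu = (math.log(q1) + math.log(q3)) / 2
    sigma = (math.log(q3) - math.log(q1)) / (2 * Z75)
    return mu, sigma


def simulate(q1, q3, n=10_000, seed=0):
    """Draw n values from the fitted lognormal and return them sorted."""
    mu, sigma = fit_lognormal(q1, q3)
    rng = random.Random(seed)
    return sorted(rng.lognormvariate(mu, sigma) for _ in range(n))


# Q1 = $1 billion and Q3 = $50 billion, in billions of dollars.
draws = simulate(1.0, 50.0)
q1_sim = draws[len(draws) // 4]
q3_sim = draws[3 * len(draws) // 4]
share_above_actual = sum(d > 89.94 for d in draws) / len(draws)

print(f"simulated Q1 {q1_sim:.2f}, simulated Q3 {q3_sim:.2f}")
print(f"share of draws above $89.94 billion: {share_above_actual:.3f}")
```

The simulated quartiles will differ slightly from the inputs because they come from a finite number of random draws.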
— Jerry Platt, Ph.D., Emeritus Professor of Finance, San Francisco State U.