# Who Mentioned Black Swans? Difficulties in estimating the probability of low probability events

The recent credit crisis has brought into focus some of the difficulties in estimating and calibrating risk analysis models that involve low-probability events.  For example, suppose a AAA-rated security is deemed to have a 1% chance of default in a particular year.  How good is that 1% estimate?

More generally, suppose historic data contains 100 trials in which an event has occurred once, or perhaps 1000 trials in which an event has occurred 10 times.  The “maximum likelihood estimator” (corresponding to human intuition) assigns a value of 1% to the “true but unknown” probability of such an event.  Intuitively, however, there is a range of possible true probabilities that could have produced the observed data.  For example, taking the different case of 10 occurrences from 100 trials, standard probability theory shows that with a true probability of 10% this outcome would be observed about 13.2% of the time, whereas with a true probability of 9% it would be observed about 12.4% of the time (so that the 10% estimate is indeed slightly more likely than the 9% estimate, although 9% remains quite plausible).
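
The two likelihood figures quoted above follow from the binomial probability formula. As a minimal sketch (the function name `binomial_pmf` is chosen here for illustration and is not from the original article), the calculation can be checked in Python:

```python
from math import comb


def binomial_pmf(k: int, n: int, p: float) -> float:
    """Probability of exactly k occurrences in n independent trials,
    each with occurrence probability p."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


# Likelihood of observing 10 occurrences in 100 trials
# under two candidate values of the true probability.
for p in (0.10, 0.09):
    print(f"true p = {p:.0%}: P(10 of 100) = {binomial_pmf(10, 100, p):.1%}")
```

Running this should give values close to the 13.2% and 12.4% mentioned in the text.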

In fact, it is well known that, for a binomial process, the uncertainty about the true probability given the observed number of occurrences in a certain number of trials is represented by the Beta distribution (assuming a binomial process, rather than say a Poisson process, will be adequate for practical purposes when dealing with low-probability events).  A more detailed explanation of the Beta distribution and a spreadsheet example is in Chapter 4 of my book Financial Modelling in Practice, with many examples of financial risk analysis applications.
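
To make this concrete, the sketch below summarises the Beta distribution for a given number of occurrences and trials. It assumes the common convention of a uniform prior, under which s occurrences in n trials give Beta(s + 1, n - s + 1); the article does not state which parameterisation it uses, so treat this choice (and the function name `beta_posterior_summary`) as an assumption made for illustration.

```python
from math import sqrt


def beta_posterior_summary(occurrences: int, trials: int) -> dict:
    """Mean, mode and standard deviation of Beta(a, b), where
    a = occurrences + 1 and b = trials - occurrences + 1
    (ASSUMPTION: uniform prior on the true probability)."""
    a = occurrences + 1
    b = trials - occurrences + 1
    mean = a / (a + b)
    mode = (a - 1) / (a + b - 2)  # equals occurrences / trials
    std_dev = sqrt(a * b / ((a + b) ** 2 * (a + b + 1)))
    return {"mean": mean, "mode": mode, "std_dev": std_dev}


# One occurrence in 100 trials versus ten occurrences in 1000 trials
for s, n in ((1, 100), (10, 1000)):
    print(s, n, beta_posterior_summary(s, n))
```

Under this convention the mode coincides with the maximum likelihood estimate (occurrences divided by trials), while the mean sits above it, which reflects the skew of the distribution when occurrences are few.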

The screenshot shows a table of the most likely estimate (1%), as well as the frequency with which the true probability (using a Beta distribution) is above that estimate.  The table entries correspond to an increasing number of trials, with the number of occurrences always equal to 1% of the number of trials. The graph shows the Beta distribution for various cases.
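
A table of this kind can be approximated with the sketch below. It is not a reproduction of the screenshot: the trial counts are illustrative, and the Beta(s + 1, n - s + 1) convention (uniform prior) is an assumption. For integer parameters, the Beta tail probability can be computed exactly from a binomial sum, which avoids any external library.

```python
from math import exp, lgamma, log, log1p


def beta_prob_above(x: float, a: int, b: int) -> float:
    """P(p > x) for p ~ Beta(a, b) with integer a, b >= 1.
    Uses the identity P(p > x) = P(Binomial(a + b - 1, x) <= a - 1)."""
    n = a + b - 1
    total = 0.0
    for k in range(a):
        # Binomial probability of k successes, computed in log space
        log_term = (lgamma(n + 1) - lgamma(k + 1) - lgamma(n - k + 1)
                    + k * log(x) + (n - k) * log1p(-x))
        total += exp(log_term)
    return total


def prob_true_above_estimate(occurrences: int, trials: int) -> float:
    """Chance that the true probability exceeds occurrences / trials,
    ASSUMING a Beta(occurrences + 1, trials - occurrences + 1) distribution."""
    estimate = occurrences / trials
    return beta_prob_above(estimate, occurrences + 1, trials - occurrences + 1)


# Illustrative trial counts, with occurrences always 1% of trials
for trials in (100, 400, 1600, 10000):
    occurrences = trials // 100
    share = prob_true_above_estimate(occurrences, trials)
    print(f"{trials:>6} trials: P(true p > 1%) = {share:.1%}")
```

Under these assumptions the share starts well above 50% for small samples and moves towards 50% as the number of trials grows, consistent with the distribution becoming more symmetric.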

The key points about these results are:

1. The Beta distribution is skewed, but becomes more symmetric as the number of trials increases.
2. The spread (standard deviation) of the Beta distribution narrows as the number of trials increases, so that we become more confident that the estimate is close to the true figure.
3. The total number of trials needs to be around 1600 (16 occurrences) for the true probability to lie, in about 95% of cases, within a +/-50% band around the most likely estimate (i.e. in the range 0.5% to 1.5%).

Note also that the RiskTheo statistics functions in @RISK enable these calculations to be performed directly in Excel (e.g. RiskTheoStdDev to calculate the standard deviation of the Beta distribution, and so on).
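
The +/-50% band in the third point can also be explored by simulation outside Excel. The following is a rough Monte Carlo sketch using Python's standard `random.betavariate`; the Beta(s + 1, n - s + 1) convention, the sample size, the seed and the function name `band_probability` are all assumptions made for illustration, so the result is an approximation rather than the article's exact figure.

```python
import random


def band_probability(occurrences: int, trials: int,
                     rel_width: float = 0.5,
                     samples: int = 50_000,
                     seed: int = 0) -> float:
    """Estimate the chance that the true probability lies within
    +/- rel_width of the most likely estimate (occurrences / trials),
    sampling from Beta(occurrences + 1, trials - occurrences + 1)
    (ASSUMPTION: uniform prior)."""
    rng = random.Random(seed)  # fixed seed so runs are repeatable
    estimate = occurrences / trials
    low, high = estimate * (1 - rel_width), estimate * (1 + rel_width)
    a, b = occurrences + 1, trials - occurrences + 1
    hits = sum(low <= rng.betavariate(a, b) <= high for _ in range(samples))
    return hits / samples


# Compare a small and a large sample, both with a 1% observed frequency
for s, n in ((1, 100), (16, 1600)):
    print(f"{n} trials: P(0.5% <= true p <= 1.5%) ~ {band_probability(s, n):.1%}")
```

With only 100 trials the band captures well under half of the distribution, whereas with 1600 trials the simulated share should be in the region of the 95% quoted above.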

Dr. Michael Rees
Director of Training and Consulting