Month: January 2009

Enough to Make a Developer Blush

As most Palisade customers know, computer processing speed is important in both Monte Carlo software and packages for genetic algorithm optimization.  So when I came across an InfoWorld article on the acceleration capabilities of GPU-based processing, I clicked right through to find out more about the gizmos called GPUs. 

A GPU (Graphics Processing Unit) is used to speed up graphics rendering and, much more interesting to me since I’m not a gamer, to process huge data sets via complex algorithms.  The reason the gizmo is good at both of these tasks, I learned, is that it has a very, very parallel structure.  And this structure makes it ideal for "embarrassingly parallel" problems.  I’m not making this up; this, embarrassingly, is standard developer lingo.

What does it take to be embarrassingly parallel? It takes many independent, easily separated tasks, each of which can be processed on a different computational unit.  Massive amounts of data, many separate algorithms, many machines.  Think risk analysis, decision evaluation, face recognition, and weather prediction.  So what’s not to be embarrassed about?
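As an illustrative sketch only (no GPU involved; a thread pool stands in for the many computational units, and the task itself is a toy), here is what "many independent, easily separated tasks" looks like in code:

```python
import random
from concurrent.futures import ThreadPoolExecutor

def simulate_scenario(seed):
    """One independent task: a toy estimate that needs no data from
    any other task -- the hallmark of an embarrassingly parallel problem."""
    rng = random.Random(seed)  # each task gets its own generator
    return sum(rng.random() for _ in range(1000)) / 1000

# Each scenario can run on a separate worker because no task waits on,
# or talks to, any other task.
with ThreadPoolExecutor(max_workers=4) as pool:
    results = list(pool.map(simulate_scenario, range(8)))

print(len(results))  # 8 independent estimates, computed side by side
```

On real embarrassingly parallel workloads the workers would be GPU cores or separate machines rather than threads, but the shape of the problem is the same: a plain map over independent inputs.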

I read on and was tickled to discover that in the early days an embarrassingly parallel problem might have been solved with a gizmo called a "blitter," which is... oh, never mind, it’s enough to make even the most senior developer blush!

Is Beta Better?

From a theoretical perspective, the Beta distribution is intimately related to the Binomial distribution.  In this context it represents the distribution of the uncertainty of the probability parameter of a Binomial process, based on a certain number of observations. As a distribution of probability, its range is clearly between zero and one, and as it is related to noting successful outcomes within a number of trials, it is a two-parameter distribution. The BetaGeneral is directly derived from the Beta distribution by scaling the range of the Beta distribution with the use of minimum and maximum values, and is hence a four-parameter distribution (min, max, alpha1, alpha2).

Special cases of the BetaGeneral distribution are, however, perhaps most frequently used as a way to build models where a pragmatic modelling approach is desired. For example, the PERT distribution is a special case of the BetaGeneral distribution, using the three parameters min, most likely, and max. The parameters alpha1 and alpha2 of the BetaGeneral are derived from those of the PERT distribution by setting alpha1 equal to 6*(µ-min)/(max-min) and alpha2 equal to 6*(max-µ)/(max-min), where µ (the mean of the PERT) is equal to (min+4*ML+max)/6. The BetaSubjective distribution is another variation, and requires the four parameters min, most likely, mean, and max.  In a sense it is quite an unusual distribution, having both the mode and the mean as parameters, and in some contexts it can therefore be quite useful when one has knowledge of both of these and requires a distribution that captures them.
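As a minimal sketch of the PERT-to-BetaGeneral relationship just described (the function names here are my own for illustration, not @RISK syntax), the shape parameters and a rescaled Beta sample can be computed directly:

```python
import random

def pert_parameters(mn, ml, mx):
    """Derive the BetaGeneral shape parameters alpha1, alpha2 from
    PERT's three parameters (min, most likely, max)."""
    mu = (mn + 4.0 * ml + mx) / 6.0           # the mean of the PERT
    alpha1 = 6.0 * (mu - mn) / (mx - mn)
    alpha2 = 6.0 * (mx - mu) / (mx - mn)
    return alpha1, alpha2

def sample_pert(mn, ml, mx, rng=random):
    """Sample a PERT variate: a Beta draw rescaled to [min, max],
    which is exactly the BetaGeneral construction."""
    a1, a2 = pert_parameters(mn, ml, mx)
    return mn + (mx - mn) * rng.betavariate(a1, a2)

# A symmetric case: min=0, most likely=5, max=10 gives alpha1 = alpha2 = 3.
print(pert_parameters(0.0, 5.0, 10.0))  # -> (3.0, 3.0)
```

Note how a symmetric most-likely value yields equal shape parameters, while skewing the most-likely value toward either end makes alpha1 and alpha2 unequal.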

Dr. Michael Rees
Director of Training and Consulting

Six Sigma’s Thirst for Information and Analytics

About six months ago, Palisade started its Six Sigma Webinar series. We have had many experts in Six Sigma, Lean, and Design for Six Sigma give presentations on topics ranging from DFSS design optimization and using Monte Carlo simulation in Lean Six Sigma to creating a high-performance culture. All of these webinars were hugely successful and highly attended. If you would like to view any of the past archived webcasts, please do so; we have them posted for your convenience.

I am excited about the upcoming scheduled free webinars, as well as the others that are currently being developed. Knowing that travel and training budgets have been slashed, I wanted to reach out to the community to get a sense of what you’d like to see or learn about in the world of Six Sigma and analytics over the next year. Armed with the VOC (Voice of the Customer; see the blog posting of Jan 8th for more Six Sigma terminology) and a host of experts, I will work to bring you the topics and discussions that are most important to you. Please send your thoughts to shunt at or comment on this posting.

How Far Can You Throw an Economist?

"Remember," a good friend of mine who is an investment adviser said the other day, "that economics is a behavioral science."  This observation was borne out the very next day by all the blogging activity triggered by Princeton economist Uwe Reinhardt’s New York Times blog, "Can Economists Be Trusted?"  Many of the points made in these responses clustered around the various motivations for tinkering with variables and probabilities, but I found particularly interesting an exchange between the University of Oregon’s Mark Thoma and Columbia statistician and political scientist Andrew Gelman.

Thoma and Gelman are concerned with the pathways of reasoning in economic analysis.  They highlight the crucial point that this reasoning can, and often does, start from either end of the analytical process and, accordingly, run forward or (policymaker beware!) backward.  The same is true of mathematical modeling, whether it be for decision evaluation, risk analysis, or neural network prediction, when it is deployed to support policy recommendations.

The behavioral factor in any kind of quantitative analysis is assumption.  Analysts face all-too-human temptations to alter the inputs in their Microsoft statistics worksheets.  But because, as Gelman points out, policy decisions are always made under uncertainty with a particular set of assumptions, the findings of analysts with different political goals can still be useful when considered together.

So the answer to Thoma’s question, "How far can you throw an economist?" is "How many of them are you going to throw?"

Who Mentioned Black Swans? Difficulties in estimating the probability of low probability events

The recent credit crisis has brought into focus some of the difficulties in estimating and calibrating risk analysis models in which events of low probability are used.  For example, suppose a AAA-rated security is deemed to have a 1% chance of default in a particular year.  How good is that 1% estimate?

More generally, suppose historic data shows 100 trials in which an event has occurred once, or perhaps 1000 trials in which an event has occurred 10 times.  The “maximum likelihood estimator” (corresponding to human intuition) assigns a 1% chance to the “true but unknown” probability of such an event. Intuitively, however, there is a range of possible probabilities.  For example, for the case of 10 occurrences in 100 trials, standard probability theory shows that with a true probability of 10% this outcome would be observed about 13.2% of the time, whereas with a 9% probability it would be observed about 12.4% of the time (so the 10% estimate is indeed slightly more likely than the 9% estimate).
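Those two frequencies are just binomial probabilities, and can be checked with a few lines of plain Python (no simulation package required):

```python
from math import comb

def binom_pmf(k, n, p):
    """Probability of exactly k occurrences in n independent trials,
    each with probability p."""
    return comb(n, k) * p**k * (1.0 - p)**(n - k)

# 10 occurrences in 100 trials: compare two candidate "true" probabilities.
p10 = binom_pmf(10, 100, 0.10)   # about 0.132
p09 = binom_pmf(10, 100, 0.09)   # about 0.124
print(round(p10, 3), round(p09, 3))
```

The 10% hypothesis is indeed the more likely of the two, but only slightly, which is exactly why a single point estimate understates the uncertainty.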

In fact it is well known that the uncertainty distribution for the probability of a binomial process, given some observations from a certain number of trials, is represented by the Beta distribution (the assumption of a binomial process, rather than say a Poisson, will be adequate for practical purposes when dealing with low probability events).  A more detailed explanation of the Beta distribution and a spreadsheet example is in Chapter 4 of my book Financial Modelling in Practice, with many examples of financial risk analysis applications.

The screenshot shows a table of the most likely estimate (1%), as well as the frequency in which the true probability (using a beta distribution) is above that estimate.  The various table entries show an increasing number of trials, with the number of observations always equal to 1% of that. The graph shows the Beta distribution for various cases.

The key points about these results are:
1. The beta distribution is skewed, but becomes more symmetric as the number of trials increases.
2. The range (standard deviation) of the beta distribution narrows as the number of trials increases, so that we become more confident that the estimate is closer to the true figure.
3. The total number of trials needs to be around 1600 (16 occurrences) for the true probability to lie, in about 95% of cases, within a +/-50% band around the most likely estimate (i.e. in the range 0.5% to 1.5%).
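Points 2 and 3 can be sketched with Python’s standard library rather than the RiskTheo functions. Note that the Beta(s+1, n-s+1) parameterization below corresponds to a uniform prior; the exact convention varies, so the coverage figure is indicative rather than exact:

```python
import math
import random

def beta_stats(successes, trials):
    """Mean and standard deviation of the Beta uncertainty distribution
    Beta(s + 1, n - s + 1) for the probability of a binomial process,
    given s occurrences in n trials (uniform prior assumed)."""
    a, b = successes + 1, trials - successes + 1
    mean = a / (a + b)
    sd = math.sqrt(a * b / ((a + b) ** 2 * (a + b + 1)))
    return mean, sd

# The range narrows as the number of trials grows (occurrences fixed at 1%).
for n in (100, 400, 1600):
    print(n, beta_stats(n // 100, n))

# Checking the ~95% claim for 16 occurrences in 1600 trials by sampling:
rng = random.Random(42)
draws = [rng.betavariate(17, 1585) for _ in range(100_000)]
inside = sum(0.005 <= p <= 0.015 for p in draws) / len(draws)
print(round(inside, 2))  # close to the 95% coverage quoted above
```

The shrinking standard deviation from 100 to 1600 trials is the quantitative version of point 2, and the sampled coverage of the 0.5%-1.5% band illustrates point 3.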

Note also that the RiskTheo statistics functions in @RISK enable these calculations to be performed directly in Excel (e.g. RiskTheoStdDev to calculate the standard deviation of the beta distribution, and so on).

Dr. Michael Rees
Director of Training and Consulting

The Art of Siffing

In the days just ahead of the Obama inauguration last week, there was a good deal of speculation in the press about the kind of recommendations the new administration’s economic team would be likely to make.  In the midst of this, Princeton economist Uwe Reinhardt posted an essay that offered an unvarnished view of how political goals can infuse the statistical analysis and mathematical modeling of economists grappling with decision making under uncertainty, which, Reinhardt points out, is nearly always the case.

"Can Economists Be Trusted?" discusses two analyses of the effects of different forms of economic stimulus: tax cuts and increases in government spending.  The analyses, which came to two different results, were the work of the same economist. This is, he declares, the "flexibility economists enjoy when they apply their professional skills to affairs of state."

This leads Reinhardt to "the art of siffing" (to "sif" is to Structure Information Felicitously) and leads me to recommend his paper on this topic if you are inclined to look farther into the rationale behind any of the proposals competing for the affections of economic policy makers.

Reinhardt’s post prompted a number of other academic economists to bare their mathematical souls and try to home in on the real sources of bias in any economist’s decision evaluation.  More on economists and trustworthiness up next. 

The Scientific Method for Management

The scientific method for management is growing in popularity because it allows for organizational decisions—whether by business or government—to be formulated under more rigorous considerations. The quantitative approach to risk and decision making, with tools such as Palisade’s DecisionTools Suite, is one method for making management decisions with the aid of data and science.

In contrast, traditional ad hoc methods, whether for day-to-day operations management or  monumental management decisions, are being exposed as amateur approaches. We’ve seen so many companies, governments, and economies in trouble as artificial pillars of value have crumbled. Risk assessment was devalued, ignored, or applied haphazardly. 

The challenge for accurate decision making remains. In the presence of uncertainty and unknowns (lack of information), the scientific method for management decision-making allows for more defensible decisions. Sensitivity and scenario analyses in the context of probabilistically defined Monte Carlo models can be applied to paint a picture of where an organization sits in an apparent asteroid field of risk.

The practice of quantitative risk management and decision making has been widely adopted in the U.S., the U.K. and many parts of the western world, and now the Chinese government has planted the seed. We’re seeing examples of the “Scientific Method for Management” in leading companies such as CNPC and Sinopec as a new wave of protocol in China. Statistical software that is able to help with decision making under uncertainty will play a large part in helping to implement better decision making in the future.

Mark Meurisse
Managing Director, Palisade Asia-Pacific Pty Ltd.

The Shape of Things to Come

It’s Inauguration Day, when everyone is looking to the future, which is always a brighter spot than the particular one we happen to be in.  But there are of course people who look to the future every day.  They make a profession of anticipating it. These folks, from meteorologists to economists to financiers to farmers, all have a common stock in trade: probability.   They are interested in pinpointing the moments when randomness and a particular event can meet. 

For their work professional future-watchers use mathematical expressions that trace the path of likelihood through chance to happening: probability functions.  These functions can be plotted graphically, and they come in many shapes and carry many names–Wikipedia lists at least a hundred different kinds of probability functions.  So, depending on whether you’re trying to calculate value-at-risk, doing statistical analysis for production forecasting, or helping a client with retirement planning,  it’s probable that there is a function formulated for that purpose.  Choosing the correct probability function is crucial to credible forecasting.

Whether they are standard-issue or designer-created, probability functions have work to do. Introduced into mathematical models (such as those spun out by Monte Carlo software), they mediate the force of chance to specify the future outcomes in, say... population trends? Widgets? Income? Depending on which future you’re watching, the curve of a probability function is the shape of things to come.

It Goes on All the Time in Your Head

Let’s depart for the moment from probabilities and the details of distributions to consider the latest development in neural network computing.   While much of current neural network technology is based on simulating the interaction of simple computational “neurons,” the British magazine The Engineer reports that engineers at Manchester, Sheffield, Southampton and Cambridge Universities are engaged in an effort to simulate the much more complex interactivity of the nerves in a living brain.  The first can be accomplished with a desktop computer and is used for solving non-linear problems such as production forecasting, price-point decisions, and reserve estimation.  The second will require million-processor parallel computing and will be used to gain insight on the electrical activity in a biological brain.
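For the curious, the kind of simple computational “neuron” mentioned above can be sketched in a few lines. This toy example (all numbers invented for illustration) is just a weighted sum of inputs passed through a nonlinear activation:

```python
import math

def neuron(inputs, weights, bias):
    """A simple computational 'neuron': a weighted sum of its inputs
    passed through a nonlinear activation (here, the logistic sigmoid)."""
    total = sum(x * w for x, w in zip(inputs, weights)) + bias
    return 1.0 / (1.0 + math.exp(-total))

# A strongly positive net input pushes the output toward 1,
# a strongly negative one toward 0.
print(round(neuron([1.0, 0.5], [2.0, 2.0], 0.0), 3))
```

Networks of these units, with weights adjusted by training, are what a desktop computer can simulate; SpiNNaker’s biologically realistic spiking neurons are a different and far heavier proposition.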

The project, SpiNNaker, focuses on the so-called spikes in the brain, the electrical impulses that pass between axons (the structures that connect biological neurons). There are billions of these connections (yes, billions, as in buyout amounts) and they are constantly transmitting electrical “information.”

What seems remarkable to me is that the actual brain, say the one in my skull, is so much more sophisticated, so much more compact than the massive machine architecture envisioned for SpiNNaker.  So many neurons, so many connectors, oh so many electrical impulses.  It’s enough to give anyone a headache, but it goes on all the time in each human head in the world population.

Getting the Full Picture – Combining Monte Carlo Simulation with Decision Tree Analysis Part II

In Part I, the combination of simulation and decision tree techniques was introduced. But what does that actually give you? What meaningful results are created to justify the work? Obviously there are good things to come, or I wouldn’t be bringing it up!

A regular spreadsheet model can produce a distribution of outcomes like this:


But as we know there may be fundamental decisions in a project that can’t be easily included in the static structure of  a spreadsheet model. A decision tree analysis can generate an optimal policy like this:


But this only shows a single estimate of the expected return for that particular decision path.

Now we look at our hypothetical decision tree (an oil project, say) and decide there should actually be an uncertain value for one node. Perhaps a lognormal distribution for the amount of oil produced by a well in its first year. A RiskLognormal() function is placed into the spreadsheet to replace the constant return value. Make the return in the root node an @RISK Output, then go ahead and run a simulation!

The optimal policy is still in effect, but now the actual value of the project will change based on the sampled value of this distribution during the simulation. This can be analysed in two ways: either by creating a distribution of the expected value of the optimal policy, or a distribution of the actual values the optimal policy generates. The distinction is due to the treatment of chance nodes. In a deterministic model, a chance node returns the expected value of that node, based on the likelihood of following each path and each path’s expected value. When the distributions are sampled during a simulation, the expected value of the affected chance nodes changes. These expected values are then carried through the tree, resulting in a distribution of the expected return based on the optimal policy.

Alternatively, @RISK can step through the individual chance node branches (with or without probability distributions in the model; the chance nodes have essentially already added discrete distributions to the model) to ultimately generate the specific values that the optimal path can produce. In this case the decision tree is followed along a particular path until reaching an end node. The value at that specific end node is recorded by @RISK.
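The two modes can be sketched in miniature. This is not @RISK’s actual mechanics, and the probability, cost, and lognormal parameters below are invented for illustration; the point is the contrast between carrying a chance node’s expected value through the tree and stepping down one actual branch per iteration:

```python
import random

rng = random.Random(7)

# Toy chance node: 30% chance of a dry well (value 0), 70% chance of a
# producing well whose value is uncertain (lognormal), less a fixed cost.
P_SUCCESS, COST = 0.7, 50.0

def sample_producing_value():
    return rng.lognormvariate(5.0, 0.5)   # uncertain production value

ev_mode, actual_mode = [], []
for _ in range(10_000):
    v = sample_producing_value()
    # Mode 1: carry the *expected value* of the chance node through the
    # tree -- the node blends both branches by probability each iteration.
    ev_mode.append(P_SUCCESS * v + (1 - P_SUCCESS) * 0.0 - COST)
    # Mode 2: step down one *actual branch* per iteration, so dry-well
    # and producing-well outcomes appear as separate clusters.
    branch = v if rng.random() < P_SUCCESS else 0.0
    actual_mode.append(branch - COST)

print(min(actual_mode))  # the dry-well cluster sits at exactly -COST
```

Both collections have the same long-run mean, but only the second reproduces the distinct clusters of outcomes (here, a spike at the dry-well loss) that matter for understanding exposure.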

The distribution of the expected value of the recommended decisions might look something like this:


While the distribution of actual values, arguably more useful from a risk analysis point of view, might look like this:


Notice how this graph gives very useful information to a decision-maker regarding their exposure to different or adverse outcomes. By combining the brute power of Monte Carlo simulation with the flexibility of decision tree structures, a rich picture of possible outcomes ensues. In this instance the three (simplistic) outcomes of an empty well, reasonable returns, and excellent returns are represented in the three clusters.

Good, hey?! Enjoy!

Rishi Prabhakar