Month: September 2009

A Neural Network by Any Other Name

You say toe-MAY-toe, and I say toe-MAH-toe.  But I still know exactly what you mean: a big, round, juicy red veggie that slices up nice for the burgers. But how I know the difference between toe-MAH-toe and toe-MAY-toe is a mystery that has eluded neuroscientists.  Until now.  
 
In the many press releases I see on scientific topics, I’ve noticed a trend toward using neural network technology to analyze its biological namesakes, the neural networks in the human brain. One of the latest examples of this is research from a team of scientists at Hebrew University who have used computational neural networks to analyze the cellular processes by which sensory neurons  adjust to differences in speech for the same word.  
 
The differences in the way I say tomato and you say tomato are largely a matter of timing and duration, and these sounds are received by single nerve cells.  The neural net algorithms devised by Dr. Robert Gutig and Dr. Haim Sompolinsky identify these differences by classifying the way the single nerve cells respond.  This innovation will not only be useful in speech decoding applications such as telephone voice dialing, but it also holds promise for treating auditory problems.
 
Toe-MAY-toe?  Toe-MAH-toe? Let’s call the whole thing….Naw, the two brain scientists aren’t calling anything off.  Their neural network is just getting started.

The Discouraging But Enticing Scenario of Clinical Trials

One of the most expensive passages in the long road that a new drug must take to reach the marketplace is the series of mandatory clinical trials.  This past summer a "life-sciences advisory company,"  Value of Insight Consulting, based in Fort Lauderdale, Florida, provided a close look at the factors that make clinical trials so expensive–and so risky.

"Optimizing Global Clinical Trials," by Todd Clark, reports on the details of a complex model built with Monte Carlo software that was intended to help a pharmaceutical developer working out product strategy for clinical trials.  The company’s goal was to choose a country from which to launch trials for a specific drug for a specific kind of cancer.  Because the primary factor in locating clinical trials is probable patient enrollment, the report provides country-by-country risk assessments for 54 factors ranging from epidemiological data to satisfaction with existing cancer therapies.

 
For someone like me, with no idea how clinical trials are organized, Clark’s report is eye-opening.  It gives a very clear picture of the constraints under which pharmaceutical development takes place and of the huge budgets behind the process–which helps to justify the high costs of drugs.  Risk analysis should have a very happy home in this industry because the value-at-risk is very high and the probabilities are pretty sorry.  As Clark reports, “On average, drug sponsors can spend over 13 years studying the benefits and risks of a new compound, and several hundred millions of dollars completing these studies before seeking FDA’s approval. About 1 out of every 10,000 chemical compounds initially tested for their potential as new medicines is found safe and effective. . . ."
 
Amazingly enough in light of all this, Clark reports that the number of clinical trials is growing. It doesn’t take any statistical analysis to derive from this last fact that when a drug makes it to market and makes it  big there, the return on investment is a whopper.  

Have confidence in your analysis!

Confidence intervals are the most valuable statistical tools available to decision makers, and according to a recent Six Sigma IQ article written by Dr. Andrew Sleeper of Successful Statistics, they are not being used as often as they should be. Sleeper’s article, Have Confidence in Your Statistical Analysis!: Learning How to Use Confidence Intervals, does an excellent job of illustrating why point estimates are useless for making decisions and how to determine which confidence level is best to use. Is it 90%, 95%, or some other value?

The article does not discuss how to calculate confidence intervals, since widely available software (for example, Palisade’s @RISK and StatTools) automates this task. Formulas and calculation methods are well documented in many books.

One example that Dr. Sleeper uses to illustrate his point: Suppose the CEO has decreed that we need CPK to exceed 1.50 for all critical characteristics. If I measure a sample of parts and announce “CPK is 1.63,” this sounds like good news. But then you ask a really good question: “How large is the sample size?” If you discover the sample size was only three, should you be worried? What if you discover the sample size was 300?
We have to make a decision about the capability of the population, but once again, the point estimate is not enough information by itself to make this decision. It is another useless number.

Instead, suppose I said “I am 95 percent confident that CPK is at least 1.52.” Or I could say “I am 97 percent confident that CPK is at least 1.50.” Either of these would be a true statement. And since sample size is used to make these calculations, they provide all information necessary to make the business decision.
These one-sided confidence intervals are often called lower confidence bounds, because the upper limit of each confidence interval is infinity. In the case of CPK, we usually don’t care how large it is, so a lower confidence bound is more appropriate than a two-sided confidence interval.
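Sleeper’s article leaves the arithmetic to software such as @RISK and StatTools, but for readers who want to see how sample size drives a lower confidence bound, here is a minimal sketch. It uses a common normal-approximation formula for a one-sided lower bound on CPK; this is an approximation of my own choosing, not necessarily the method used in the article, and the inputs simply echo the 1.63 example above:

```python
# Approximate one-sided lower confidence bound for Cpk (normal approximation).
from math import sqrt
from scipy.stats import norm

def cpk_lower_bound(cpk_hat, n, confidence=0.95):
    """Approximate 100*confidence% lower confidence bound for an estimated Cpk."""
    z = norm.ppf(confidence)
    return cpk_hat - z * sqrt(cpk_hat**2 / (2 * (n - 1)) + 1 / (9 * n))

for n in (3, 300):
    print(f"n = {n:3d}: point estimate 1.63, "
          f"95% lower bound = {cpk_lower_bound(1.63, n):.2f}")
```

With only three parts, the 95 percent lower bound falls far below the 1.50 requirement even though the point estimate looks comfortable; with 300 parts, the bound sits just above 1.50, which is exactly the kind of statement a decision maker can act on.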

Because they are single numbers, point estimates are almost always above or below the parameters they are supposed to estimate. Without additional information, point estimates are useless for making decisions. But confidence interval estimates are very likely to be true, and the confidence level specifies and controls the probability that the interval estimates are true. Since properly applied confidence intervals incorporate sample size and other tested assumptions, these are reliable tools to make business decisions.

In addition to this article, you can find a lot of great information at Six Sigma IQ.

 “A point estimate by itself is just another useless number.” – Andy Sleeper, 2009

The Analysis of Breathtaking Risk

With the frequent press reports of the probability of an epidemic of the so-called Swine Flu (the H1N1 virus), I’ve been surprised that so little has been published about how the well-publicized predictions are made.   Last month, however, specialists from the University of California, Davis, the Washington (D.C.) Hospital Center, and a private consulting group published a research note about their risk analysis model that predicts the incidence of acute respiratory failure caused by the new flu.  If the disease is literally breathtaking, the predictions are figuratively breathtaking as well.
 
Before we get to those predictions, let’s put this model into context.  It was essentially an operations management study for the benefit of hospital ICU directors–that is, what do ICUs need to brace for in terms of the number of patients and the severity of their illnesses?  Although one commentary called the model a kind of "back of the envelope calculation," and this may be true, the model seems like a very necessary starting point.  Whatever its flaws, this research note should be an effective heads-up that will prod other epidemiologists to fire up the Monte Carlo software to refine the assumptions and the data selection.
 
Now to those numbers.  Although offering only a few details of their risk analysis, the researchers predicted that  
• 15 percent of the U.S. population will be infected with H1N1.
• 6 percent of those infected will require hospitalization.
• 12 percent of those hospitalized will progress to acute respiratory failure.
• 58 percent of those patients who go into acute respiratory failure will not survive it.
 
The nod to the grim reaper in the last item amounts to total fatalities of nearly 200,000.  While this estimate doesn’t approach the 25 million fatalities in the flu pandemic of 1918, it’s still enough to take your breath away.
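For anyone who wants to check that figure, the arithmetic is just a chain of multiplications. A quick sketch (the 2009 U.S. population of roughly 307 million is my assumption, not a number from the research note):

```python
# Back-of-the-envelope check of the fatality estimate above.
us_population = 307_000_000           # assumed 2009 U.S. population

infected     = 0.15 * us_population   # 15% of the population infected with H1N1
hospitalized = 0.06 * infected        # 6% of those infected require hospitalization
resp_failure = 0.12 * hospitalized    # 12% of those hospitalized progress to ARF
fatalities   = 0.58 * resp_failure    # 58% of ARF patients do not survive

print(f"Estimated fatalities: {fatalities:,.0f}")   # roughly 192,000
```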
 

The DNA of Cement

Last week a team of MIT scientists calling themselves Liquid Stone made a breakthrough (as it were) discovery about cement.  The Romans used cement to build their remarkable aqueducts, and the stuff is still in use.  In fact, it’s one of the most widely used building materials on the planet.  It has a chemical name, calcium-silica-hydrate.  But until last week, its molecular structure was unknown.
 
Scientists have been operating under the assumption that cement is a crystal, but the Liquid Stone group discovered this is not the case. It’s a hybrid structure in which the crystal form is interrupted by "messy areas" in which small voids allow water to attach.  
 
By now, you are probably wondering what the composition of cement has to do with risk analysis. The link is Monte Carlo simulation: Liquid Stone used Monte Carlo software harnessed together with an atomistic modeling program to test various scenarios for how water attaches to the cement molecule in the messy areas.  
 
Why is this discovery important?  Because the manufacture of cement accounts for about 5 percent of worldwide carbon emissions.   The new knowledge of the composition of cement will enable engineers to tinker with its manufacture to reduce these emissions.  Now that Liquid Stone has what it calls the DNA of cement, the team can progress to genetic engineering of the messy areas, and predictive statistical analysis will allow them to test various product strategies for replacing various atoms in the cement molecule.
 

What I love about all this is that, apparently, Liquid Stone isn’t using risk analysis to get the messy areas better organized; the purpose is to figure out how to fit new stuff into the mess.  

Dr. Hans Rosling’s “Let my dataset change your mindset”

For Friday, here’s some inspiration for great presentations of datasets. In his address to the U.S. Department of State, Dr. Hans Rosling of Sweden’s Karolinska Institute lets his dataset paint a picture, making a strong argument to dispel outdated notions about the Developing vs. Developed world. Years of data gathering, experiments, and statistical analysis are presented in a concise, compelling — and entertaining — manner. There’s something to be learned from him not only about public health, but also about effective presentations, which any of us could apply when making our case in the fields of risk analysis and decision making under uncertainty.

Enjoy!

DMUU Training Team

Simulating the U.S. Economy: Where will we be in 100 years?

William Strauss is the President and founder of FutureMetrics. He brings more than thirty years of strategic planning, project management, data analysis, and modeling experience into the company’s stock of knowledge capital. Bill’s professional history includes executive positions as director, president, and senior vice president, as well as positions as senior analyst and field coordinator. He has an MBA (specializing in Finance) and a PhD (Economics).

Dr. Strauss will present a case study at the 2009 Palisade Conference: Risk Analysis, Applications, & Training. The conference is set to take place on 21 – 22 October at the Hyatt Regency in Jersey City, 10 minutes by PATH from Manhattan’s Financial District.

See the abstract for his case study below, and see the full schedule for the Conference here.

Simulating the U.S. Economy:
Where will we be in 100 years?

There is an assumption that drives all of our expectations for how our economy will be in the future. That assumption is one of endless economic growth. Clearly endless exponential growth is impossible. Yet that is what we base all of our expectations upon. We all agree that zero or negative economic growth is bad (just look around now at the effects of the Great Recession). But we also know logically that 2% or 4% annual growth every year leads to an exponential growth outcome that is unsustainable. 

To see where this growth imperative will take us, we first have to see how we got to where we are today. This work first models the 20th century. The model is both complex and simple. The basic schematic of the model’s relationships is easy to understand. Furthermore, the core of the model is a simple production function that combines capital, labor, and the useful work derived from energy to generate the output of the economy. Complexity is contained in the solutions to the internal workings of the model. What is unique is that there are no exogenous economic variables. Once the equations’ parameters are calibrated, setting the key outputs to "one" in 1900 results in their time paths very closely predicting the U.S. GDP and its key components from 1900 to 2006. 

The experiment in this work is about the future. If the model can very closely replicate the last 100 years, what does it have to say about the next 100 years? From 1900 to 2006 there are periods in which there was parameter switching. (The optimal parameters and the years for the switching were found using a constrained optimization technique.) That suggests that in the future there will also be changes. The experiment uses @RISK’s features to generate new combinations of parameters for each of tens of thousands of runs of the simulation. Changes in the parameters represent potential exogenous policy choices.
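Strauss’s own model and its @RISK experiment are far richer than anything that fits here, but the Monte Carlo-over-parameters idea can be illustrated with a toy production function of capital, labor, and useful work. Everything in the sketch below (growth rates, parameter ranges, and the functional form) is an assumption of mine for illustration only:

```python
# A toy sketch: draw random parameter combinations for a simple production
# function Y = K**a * L**b * U**c and see where output lands after 100 years.
import random

def simulate_path(a, b, c, years=100, k_growth=0.03, l_growth=0.01, u_growth=0.02):
    """Output index after `years`, with capital K, labor L, useful work U starting at 1.0."""
    K = L = U = 1.0
    for _ in range(years):
        K *= 1 + k_growth
        L *= 1 + l_growth
        U *= 1 + u_growth
    return K**a * L**b * U**c

random.seed(1)
outcomes = []
for _ in range(10_000):              # one parameter draw per simulation run
    a = random.uniform(0.2, 0.4)     # capital share
    b = random.uniform(0.2, 0.4)     # labor share
    c = 1 - a - b                    # useful-work share (constant returns to scale)
    outcomes.append(simulate_path(a, b, c))

outcomes.sort()
print("Median output index after 100 years:", round(outcomes[len(outcomes) // 2], 2))
```

In the real case study the parameters are calibrated to a century of U.S. data and the parameter switches represent policy choices; the point of the sketch is only the mechanism: thousands of runs, each with a different parameter combination, producing a distribution of futures instead of a single forecast.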

The "doing what you did gets you what you got" scenario leads to a surprising and unsettling outcome. The experiments using @RISK do find a path that works. Obviously if it is not "business-as-usual" that leads to a stable outcome, it is some other way. The policy choices that lead to a stable outcome suggest that the future of capitalism is not going to be what we expect it to be.

Please join us in October in New York for software training in best practices in quantitative risk analysis and decision making under uncertainty, real-world case studies from risk services consultants and experts, and networking with practitioners from many different fields including oil and gas, pharmaceuticals, academia, finance, Six Sigma, and more.

October 2009 – Worldwide Training Schedule

Palisade Training services show you how to apply @RISK and the DecisionTools Suite to real-life problems, maximizing your software investment. All seminars include free step-by-step books and multimedia training CDs that include dozens of example models.

Europe

North America     

Bausch & Lomb’s Global Director of DFSS Gets Our Focus

As part of Palisade’s membership in the ISSSP, we get to participate in what are called Focused Sessions. For these webcast-like sessions, we are sponsors and exert no editorial control over their content . . . but we decide who the speaker is.

So we’ve decided to put the attendees in good hands! Jeff Slutsky, Global Director of Design for Six Sigma for Bausch & Lomb, will be giving a presentation on September 17th called Probabilistic Project Estimation Using Monte Carlo Simulation.

Registration for the event through the ISSSP is free. This presentation will feature @RISK for MS Project. If you ever wanted to find out more about @RISK for Project in Six Sigma and project estimation, this would be a good venue.

Last summer Jeff presented an excellent free live webcast: DFSS-based Design Optimization using Design of Experiments and @RISK. This is also something that can be viewed for free.

As for recommended future reading, Jeff is also the coauthor of Design for Six Sigma in Technology and Product Development. I’d highly recommend it; it is an excellent resource that is often used as the cornerstone of many DFSS and Critical Parameter Management courses.
 

Consulting With Impact, Webcast with StatTools

Ed Biernat’s Consulting With Impact recently used Palisade’s StatTools in a two-week training session for Six Sigma Green Belt candidates as part of their certification.  Ed said the response from the candidates was very positive, so he decided the software tool would be a good addition to all his Green Belt training. Also, he thought it would be a good subject of a Palisade Free Live Webcast.

So on September 3rd, please join Ed for his webcast, StatTools and Six Sigma. This session will cover the features in StatTools that apply to Six Sigma, with examples from Ed’s recent training session for his Green Belts. Registration for the webcast is free. Also, the webcast will be archived and viewable by anyone at any time for free.

But why StatTools? "It’s a flexible, powerful statistics tool," Ed says. "The Green Belt students were able to jump right in with it!" Of course, the Green Belt candidates had a body of knowledge that made them ready users of the tool; perhaps we’ll have more of that knowledge too after Ed’s webcast. And even though I already know a lot about StatTools, I’ll be there!