R books
How to Recover the Missing X(1) for the USL Scalability Model

In Section 5.7.3 of my GCaP book, I explain how to use Excel to estimate X(1) from a fit to Amdahl's law. In that case, you only have to fit a single modeling parameter, whereas the USL requires fitting two modeling parameters. In the upcoming GDAT class, we will show you how to do this more accurately using nonlinear regression analysis based on iteratively minimizing the residuals (shown above) in both R and Mathematica. You really need to use these more sophisticated techniques to avoid numerical accuracy problems that might be lurking in Excel.
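For reference, the two models being compared can be written out as follows. This is a sketch in the older sigma/lambda notation mentioned later in this post; the symbols C_A(N) and C(N) for relative capacity are labels chosen here for illustration, not notation taken from the post.

```latex
\begin{align*}
  % Amdahl's law: a single fitting parameter, sigma
  C_A(N) &= \frac{N}{1 + \sigma\,(N - 1)} \\
  % USL (older notation): two fitting parameters, sigma and lambda
  C(N)   &= \frac{N}{1 + \sigma\,\bigl[(N - 1) + \lambda\, N (N - 1)\bigr]} \\
  % Relative capacity is measured throughput normalized by the N = 1 value
  C(N)   &= \frac{X(N)}{X(1)}
\end{align*}
```

The last line is why a missing X(1) matters: without it, the measured throughputs X(N) cannot be normalized before fitting.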
In setting up this discussion for the GDAT class, I ran into a frustrating problem with Mathematica. Here's the code from Jim's R script ...

R has a function called optimize() which successively calls a user-defined function, function(x), with new trial values of X(1) until it determines the minimum value of sse/sst (the ratio of the error sum of squares to the total sum of squares), i.e., the bottom of the curve shown above. Those sums of squares are calculated using the nls (nonlinear least-squares) function in R with the USL model as its argument (the thing with the 'sigma' and 'lambda' in it; using my old notation). David Lilja will explain the role of SSE and SST in the first part of the class.
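The nested procedure described above might be sketched roughly as follows. This is not Jim's script: the data are synthetic (generated from the model with X(1) = 100 and a small fixed perturbation), the names usl, sse_over_sst, n, and x are hypothetical, and the sigma/lambda form of the USL is assumed from the older notation. Only optimize(), nls(), and resid() are standard R functions.

```r
# Sketch only: synthetic data and hypothetical names, not Jim's script.

# USL relative capacity in the older sigma/lambda notation (assumed form).
usl <- function(n, sigma, lambda) {
  n / (1 + sigma * ((n - 1) + lambda * n * (n - 1)))
}

# Made-up throughput measurements X(N); the N = 1 point is deliberately missing.
# Generated with X(1) = 100, sigma = 0.05, lambda = 0.02, plus a small fixed
# perturbation so the residuals are not exactly zero.
n <- c(2, 4, 8, 16, 24, 32, 48, 64)
x <- 100 * usl(n, 0.05, 0.02) + c(0.4, -0.6, 0.8, -0.5, 0.3, -0.7, 0.5, -0.2)

# Objective handed to optimize(): normalize by the trial X(1), fit the USL
# with nls(), and return SSE/SST for that fit.
sse_over_sst <- function(trial_x1) {
  d <- data.frame(n = n, cap = x / trial_x1)
  fit <- tryCatch(
    nls(cap ~ usl(n, sigma, lambda), data = d,
        start = list(sigma = 0.1, lambda = 0.01)),
    error = function(e) NULL
  )
  if (is.null(fit)) return(Inf)  # treat a failed fit as a very poor trial value
  sse <- sum(resid(fit)^2)
  sst <- sum((d$cap - mean(d$cap))^2)
  sse / sst
}

# Search a bracketing interval for the trial X(1) that minimizes SSE/SST.
best <- optimize(sse_over_sst, interval = c(50, 200))
best$minimum    # estimated X(1)
best$objective  # SSE/SST at that estimate
```

The search interval and the nls() starting values are guesses suited to this made-up data set; with real measurements both would need to be chosen to bracket a plausible X(1).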
You can do exactly the same thing in Mathematica and here is my equivalent user-defined function:

Slight problem. The call-back didn't work! The argument trialX1 was not being evaluated, so the NonlinearRegress function (the Mathematica equivalent of "nls") barfed. After a lot of debugging, I finally discovered the problem. Long story short: unlike R, Mathematica can also do symbolic computations. In fact, that's its primary use (don't you wish you had it to evaluate those nasty integrals in your Calculus class?). Because of this "bias", it first tries to evaluate the argument trialX1 as a symbol, rather than a number. This behavior is correct but the justification is technical, so I won't go into that here. To dissuade Mathematica from doing that, I needed to tell it explicitly to apply my function only when the argument is an actual number, and that's what the incantation trialX1_?NumericQ does. Simple, when you know!
Once again, this proves what I've said on many occasions: All modeling is programming and all programming is debugging.
Which is faster, WinBUGS or OpenBUGS?
Guerrilla Data Analysis Class – Seats Still Available
In this class, computer engineering and statistics expert Prof. David Lilja presents an easy introduction to statistical methods and finally leads us into the topic of Design of Experiment (DOE) methods applied to performance and capacity planning data.
Having established the foundational theory, R expert Jim Holtman will show you how to apply DOE and other statistical techniques using example case studies.
You can register for the class, and book your hotel room, online. Book early, book often! We look forward to seeing you in August.
Get your R on
useR! CONFERENCE, AUGUST 12-14 2008, DORTMUND GERMANY
Impressive statistical computing types like Andrew Gelman, Gary King, and others will be presenting at this year’s useR! conference. Decision Science News might just have to hop over and check it out. The program looks great. Those interested in learning R might be interested in our Decision Science News R tutorials one and two.
About the Conference
useR! 2008, the R user conference, takes place at the Fakultät Statistik, Technische Universität Dortmund, Germany from 2008-08-12 to 2008-08-14. Pre-conference tutorials will take place on August 11.
The conference is organized by the Fakultät Statistik, Technische Universität Dortmund and the Austrian Association for Statistical Computing (AASC). It is funded by the R Foundation for Statistical Computing.
Following the successful useR! 2004, useR! 2006, and useR! 2007 conferences, the conference is focused on
1. R as the “lingua franca” of data analysis and statistical computing,
2. providing a platform for R users to discuss and exchange ideas on how R can be used to do statistical computations, data analysis, visualization and exciting applications in various fields,
3. giving an overview of the new features of the rapidly evolving R project.
As with the predecessor conference, the program consists of two parts:
1. invited lectures discussing new R developments and exciting applications of R,
2. user-contributed presentations reflecting the wide range of fields in which R is used to analyze data.
A major goal of the useR! conference is to bring users from various fields together and provide a platform for discussion and exchange of ideas: both in the formal framework of presentations as well as in the informal part of the conference in Dortmund’s famous beer pubs and restaurants.
Prior to the conference, on 2008-08-11, there are tutorials offered at the conference site. Each tutorial has a length of 3 hours and takes place either in the morning or afternoon.


