Although I use Excel all the time, and I strongly encourage my students to use it for performance analysis and CaP, I was forced to include a warranty disclaimer in my GCaP book because I discovered a serious numerical error while writing Appendix B. There, my intention was just to show that Excel gives essentially the same results as Mathematica when using the USL scalability model. It didn’t!
This topic came up in the recent Guerrilla Data Analysis Techniques class. Here, I’m repeating the exercise but comparing USLcalc.xls with R instead of Mathematica—the latter two tools being in sufficient agreement. Since the input data set is small, I’ll use R’s textConnection function, which fools read.table into thinking it’s digesting an external file.
<br />input <- read.table(textConnection("p X_p<br />1 20<br />4 78<br />8 130<br />12 170<br />16 190<br />20 200<br />24 210<br />28 230<br />32 260<br />48 280<br />64 310"), header=TRUE)<br />closeAllConnections()<br />The processor configuration is denoted by p and the corresponding throughput (as described in Chapter 5) is denoted by X_p. Fig. 1 shows the regression results in R. The R script for applying the USL model is available here.
The excitement really begins when we use the results of the regression fit to make projections about larger configurations. Fig. 2 shows the results of projecting out to p = 500 processors. In particular, it shows the predictions for the location of the maximum in each scalability curve. Excel predicts pmax = 435, whereas R predicts pmax = 288 (we round down): a difference of 50% more hardware according to Excel.
In this hardware series (SGI Origin NUMA) the backplane did not support more than 64 processors (or boards at that time), and it’s clear that scalability is all over, bar the shouting, by p = 100 or 128 processors; all other things being equal. But it doesn’t take much imagination to see what happens with potential expansion to p = 256 or 512 processors or cores or GPUs, in future generations. And that’s what CaP is about. Similarly, the x-axis could be a representation of N software users rather than hardware processors.
The precision problem arises in the computation of pmax, which is proportional to κ-1/2. Excel produces a value of κ ~ 10-6, while R and Mathematica yield κ ~ 10-5; different by an order of magnitude. Both Mathematica and R use infinite precision in software, so they are less prone to this kind of error. But see the Comments below for more discussion on this point [–njg: Sun, Aug 29, 2010].
Prof. Walter Gander of ETH in Switzerland reveals how similar precision problems can arise in Matlab, although it’s not a Matlab problem but an IEEE precision problem arising from the bit-width format used to represent FP numbers in hardware. What’s a person to do? I have to agree with the Prof. who says:
In any case I think it is important to teach our students that computing with real numbers deserves special care and attention. It is obviously not enough to write mathematically correct programs – they can also exhibit weird behavior.
In other words, just like any performance analysis, you can never be too careful.